The Google Books Mess

There were a couple of tell-tale signs last week that Google may be having some pain and problems with its vastly ambitious Google Books project. First was the news that Google was pulling the plug on its corresponding, open-ended plan to scan and database masses of historic newspaper archives. Second, a report that Google was diverting all its programmers from its eBookstore and perhaps not vigorously pursuing its plans to sell eBooks.

The problem that Google has is that there was huge momentum within the company towards its grandiose plan for a comprehensive universal digital library, and this vision, with its accompanying class action settlement [ASA or amended settlement agreement], was decisively stopped in March by the opinion of Judge Chin (USDC SDNY):

While the digitization of books and the creation of a
universal digital library would benefit many, the ASA would
simply go too far. It would permit this class action – – which
was brought against defendant Google Inc. (“Google”) to challenge
its scanning of books and display of “snippets” for on-line
searching – – to implement a forward-looking business arrangement
that would grant Google significant rights to exploit entire
books, without permission of the copyright owners. Indeed, the
ASA would give Google a significant advantage over competitors,
rewarding it for engaging in wholesale copying of copyrighted
works without permission, while releasing claims well beyond
those presented in the case. (Opinion 22 March 2011)

Chin’s decision is styled an opinion, and it might yet be appealed or revised, but most observers would tell you that it has pretty well stopped the Google project in its tracks.

Google has got a lot of figuring out to do:

  1. Google is not out of its legal woes, although such a rich and powerful company can probably stall or out-manoeuvre the authors and publishers who are parties to the original suit in the USA. Yet Google will need some resolution to the case or it risks enormous damages for breach of copyright ($3.6 trillion according to one scholar).
  2. Google will not find it straightforward to avoid legal actions in other jurisdictions. It has ongoing legal woes in France, and if some French publishers win substantial damages, many others will charge through these same gates.
  3. Google is continuing to scan without permission millions of works which are not out of copyright on behalf of its library partners. So the liabilities grow.
  4. Google will be required to deliver digital library services to some of its core collaborating libraries, the libraries of Michigan and Stanford in particular. To the extent that these services depend on copyright works digitized without permission, Google remains at significant risk.
  5. There will be increasing concern about advantages that may accrue to Google from the works that it has already scanned and databased, and which it may use in ways impervious and invisible to external actors. Perhaps Google will gain enormous advantage in the fields of search, automated translation and semantic technologies through private access to vast amounts of unregistered, unlicensed, copyright material. That putative advantage creates legal risks for Google from competitors and regulators.
  6. Without a recognized and legitimized settlement Google cannot deliver services of general public benefit, and at some point Google loses good will. Without a settlement Google cannot even be generous.
  7. Google has plenty of agreements with publishers and authors for the distribution, display and potential licensing of millions of copyright works. So it could be an active participant in the eBooks market, but it has been strangely hesitant and stuttering in recent years about its commercial activities, almost certainly because Google’s lawyers are anxious about the way such commercial exploitation may play against the unresolved matters in dispute. If Google carries on havering it will lose its opportunity in the digital books market, much as it appears to be losing its opportunity in the market for digital music.

I am not sure that Google has an easy way of stepping out of this mess. But it needs to find, or create through disruptive action, some solution.

The original goal of a universal library designed, built and maintained by a single technical player was hubristic and naive, driven by the enthusiasm and commitment of the founders (Page in particular, who felt that he owed a debt to his alma mater, the University of Michigan). Google’s best hope now would be to distance its involvement from the prospect of private gain, to place all works not in the public domain, and not explicitly licensed to Google, in the sole care and control of the public academic institutions from which the original works were taken, and to renounce any commercial advantage through its involvement in converting ‘orphan’ works. Google will have to pay the authors and publishers something (if only to cover some of the legal bills, which will otherwise be pursued to the bitter end on a contingency basis by the other side). It can afford to finance the first blocks of a Rights Registry, but the registry should be more open and more public, more consultative, and in part foundation funded, than the original design. Google does not need and should not look for special advantages on rights and forward-looking business models. If Google were to do that it could help to promote the cause of orphan works legislation in a disinterested manner. Google needs to get legitimate, beyond all shadow of doubt, fast.

Google often likes to play the ‘open’ card, but it has been far too closed and ‘private’ over its books project. It needs to rethink the game-plan and its style of involvement. That way it will retain the good will of the library community and the reading public. By being highly generous and public spirited it looks after the interests of its shareholders also. Page is now CEO and he may need to bite on the books bullet and own up to a change of course. Only by being much more open and generous can Google hope to make something like the Google Books project a reality.


Google goes into Culture Commerce

The rumour mill has it that Google will launch a cloud-based Chrome netbook before Christmas or early in the New Year. When you put this rumour alongside the others coming from the Googleplex you get an interesting picture:

  1. Is Google going to buy a big package of movie rights? Is that why it has hired the former Netflix executive Robert Kyncl?
  2. Google is possibly quite close to signing a deal with the major music labels for its cloud-based music-streaming service.
  3. For a couple of years, Google has apparently been on the brink of releasing a digital books service in collaboration with book publishers. Most recently Dan Clancy told us that Google Editions will be launching very soon (“très bientôt”).

The rumours about the Chrome netbook suggest that it’s really all about the web, cloud-based productivity and web browsing, but if its launch is accompanied by, or closely followed by, a Google distribution and e-commerce solution for books, films and music, the market place for publishers and entertainment companies may change very fast. Google will be a formidable competitor if it becomes an information publisher and an e-commerce platform for film, music and books. Though primarily a competitor for Apple and Amazon, Google may well be seen as more of a ‘friend’ by the big incumbent publishers and media groups, because it is more collaborative and more open than either Amazon or Apple. Knowing, as we do, the way Google works (quiet launches, ‘beta’ services, and something of a scatter-gun approach), I think it’s unlikely that Google will launch a fully fledged, cloud-based Chrome machine, with a multi-channel, multi-media dashboard in place, in the first quarter of next year. It is surely more likely that this hardware platform and this constellation of media services will each emerge in their own good time. But if the plans come off and Google has these publishing partnerships in good order, it is highly likely that Google will be selling a lot of consumer products next Christmas. And I do not mean via Groupon; a commercial solution that can stream all kinds of media stuff from your locker in the cloud to Android and Chrome platforms will be a dazzling consumer attraction.

In Praise of Not-Reading

Reading is, in these days, an over-rated activity.

Most of what is most important about books is now about not-reading them. I was reminded of this deep but counter-intuitive truth by a blogger (An American Editor) recently complaining that his To Be Read pile (TBR Pile) was getting unmanageable because it was full of ebooks that often cost nothing and were without physical presence.

Which brings us to the special problem of ebooks. Yes, ebooks are a special problem because they take up virtually no space — just a bunch of bits and bytes, digits if you will, on a disk that can store gigabytes of digits. And so that TBR pool steadily grows. I looked this morning and I have more than 300 TBR ebooks, and that pile keeps growing. Acquiring Books for the TBR Pile: The Special Case of eBooks

American Editor is here admitting to a very old-fashioned mistake. He has not caught up with the twenty-first century. Books are now not really for reading or, to be more accurate, they are only occasionally, under the most special circumstances, for reading. Publishers are partly to blame for this (culpable, since all publishers, especially of newspapers and magazines, know that their profits are entirely dependent on selling stuff that the customers do not read) and digital book experts would be much more on the button if they spent less time fretting about ‘reading’. And part of the problem is that the digital experts operate with a vastly over-simple model of what reading is. The conventional wisdom is that proper reading (sometimes called ‘long form reading’, a ghastly phrase for a dubious concept) is the measurable phase in which you open all the pages of the book and look at them, the hours and minutes through which a book, conveyor-like, passes between the moment that you bought it and the moment that you shelve it in your personal library, never to be looked at again. Incidentally, ‘re-reading’ is a much more interesting concept than mere ‘reading’, but we note that in passing and may return to the topic on another occasion (you will have observed that you can do that with writing as with reading). This Taylorist model of conveyor-like reading presupposes that in serious reading our eyes scan more or less consecutively the whole book from page 1 to page umpteen. Efficiently and quickly. The time and motion expert holding a stop-watch, just as Google analytics calibrates our use of the Google library. As though reading a book might not actually comprise understanding it, or failing to enjoy it, or realising pretty instantly that it was not worth reading at all. Under any circumstances.

Of course, reading has always been, but is becoming steadily more, episodic; very little of our reading is like this conventional model of continuous reading, and most of us who now work in intellectual or bureaucratic activities which involve web-based reading, spend a lot of time, yes reading, in ways which are not at all like the way you first read and enjoyed Babar, P.G. Wodehouse or Jane Austen. You see, we spend a lot of our time and energy deciding what not to read. And these decisions matter. Possibly even as much as enjoying Babar, or re-reading Jane Austen.

Our understanding of digital books would be much better if we spent less time wondering about how we might read them, and a lot more time thinking about the ways in which we may use them without necessarily, or even at all, reading them. For certainly, and beyond all doubt, when there are 20 million books in Google Books Search we will not seriously, continuously, read more than the tiniest fraction of them. There are a lot of things that we need to do with books and it is not at all clear to me that we have a framework in which these activities can take place with digital books, half as effectively as with print books. For example, we need to be able to:

  • search them (that activity appears to be brilliantly covered by the already mentioned Google Books Search)
  • provide access to them (possibly well covered, in the USA, by the afore-mentioned)
  • buy them
  • listen to them
  • lend them to a friend or a colleague
  • translate them (well)
  • quote from them
  • (ideally) cite them when we quote them
  • non-consumptively compute them (we none of us know quite what that might involve)

These are all important points, but I will admit to playing a rhetorical trick with this list, my bullet points, and the bold face. The key point about the list is the recurrent ‘them’. There are so many things that we need to do with books aside from, and apart from reading them. The key thing about digital books is that we need them. We need digital books to be the ‘object’ of all these newly digital verbs and activities. Digital books need to be as versatile and as ‘real’ as physical books in all these ways, even though they are now becoming entirely virtual and insubstantial. The big challenge that Google, Apple and Amazon have yet to project is that books themselves are becoming networked. And the Google, Apple and Amazon models of network usage will inevitably fail if they are not truly book-centric.

May I recommend (unreservedly, though I have forgotten most of it, and disagreed with much of it) Pierre Bayard’s How to Talk About Books You Have Not Read. Which, in case you mistakenly decide not to read it, has many reviews here.

Google Books Search over the Summer

Judge Chin is still considering his decision in the case of Google ….. His ruling may come this week, next week, or in a few months. Only he and his team have a good idea of that. Meanwhile Professor Pam Samuelson has produced a very thorough, balanced, somewhat critical review of the proposed Settlement and of Google’s efforts in a 60-page paper for the Minnesota Law Review. If you haven’t been following GBS too closely, this is an excellent place to get an insightful review and summary of what has been going on. If, like several hundred lawyers and digital library experts, you have been following GBS too closely for years, you will already have read her piece and it will have reminded you of stuff that you had forgotten. Her conclusion:

The future of public access to the cultural heritage of humankind embodied in books is too important to leave in the hands of one company and one registry that will have a de facto monopoly over a huge corpus of digital books and rights in them.
Google has yet to accept that its creation of this substantial public good brings with it public trust responsibilities that go well beyond its corporate slogan about not being evil. Google Books Search and the Future of Books in Cyberspace

I have been a ‘qualified’ supporter of Google Books Search from the beginning. The qualifications are coming more to the fore. Whatever Judge Chin decides, we can be sure that Google Books Search is going to be mired in legal complexities for years to come. The international ramifications of the venture are hopeless and will sap energy and innovation. Google Books Search, if it is approved, will work badly and too patchily for European literature and libraries, and it will be especially rough and unsatisfactory for British literature, libraries and universities. It will be a mess of conflicting and irresolvable copyright regimes for years. Google itself seems to find it hard to innovate or roll out new services. A clean and direct implementation of Google Editions has been ‘promised’ for this summer, or this year, but it has been promised before. Several times. No doubt part of the reason that it is being held up is that its roll out may have unpredictable or unwelcome legal consequences (or unwelcome splash-back from the court of public opinion). Google Editions when it comes should be a very useful and popular service, but Google have to get it out of the door before it can properly grow and bed itself into the array of digital books that is now mushrooming.

Pamela Samuelson points to the lack of substance in Google’s mantra ‘we will not be evil’; but it’s arguable that Google has failed in a more fundamental and troubling way. It has failed to sacrifice the idols of its founders; it has failed in corporate governance. Page and Brin met and worked together in a project for digital libraries. The Google Books Search proposition was clearly motivated in part by Page’s promise to digitize the libraries of his alma mater, the University of Michigan. The two big leaps in the Google Books enterprise were first to dream of digitizing millions of books in one universal searchable index (the original project, defended by an appeal to ‘fair use’ and the transformative effect of a large database of books) and then secondly to aim for a commercial settlement to the ‘class action’ suit, through which Google, the authors and publishers would effectively enclose, exploit and privatize millions of copyrights for which they cannot claim ownership. I suspect that the Google Books project, and especially the Library component, has always been too close to the goals and aspirations of Google’s young founders. The big and aggressive steps that the company has taken to stake out its claims have been part of the founding DNA, the dreams that brought Brin and Page together. The third ‘founder’, Eric Schmidt, joined the company in 2001. At about that time the initial steps for the Google Books enterprise may have been taken; perhaps Schmidt was too much the ‘new boy’ to question the goals of the original founders. Schmidt should have spotted that there were copyright problems; he should have noticed that there were at least issues of politesse involved in digitizing and then using for profit stuff that did not belong to Google or to the universities with which it worked. I bet that he has since then wished that the aims of the Google Books undertaking were more clearly understood within and without the company, and more cautiously and generously drawn. At some point Google has to take a much more humble view of its role, and at that point things might start to work in its favour.

Nominalism, Realism and Digital Books

There is a quasi-philosophical disagreement underlying the steady digitisation of literature. A radical disagreement about what digital books really are. In a strange manner this dispute parallels the controversy between nominalists and realists in medieval scholastic philosophy about the status of universals (properties, numbers, virtues etc). Texts in the twenty-first century take the place of properties in the fourteenth. Are books more than texts, are texts more than digital file formats? Are these abstract concepts, “red”, “thirteen”, “chastity”, real entities, or are they simply instances and constructs based on our experience of coloured objects, groups of cakes and the people we meet? The nominalists denied the reality of these abstractions and the realists retaliated. Blood was shed. Now we find the digerati divided over the question whether a book is really more than a text. The ebook nominalists, finding meaning in sentences and texts and not much else, would be happy with books digitised as texts (preferably in the ePub standard). The realists say that a book is much more than its text and that the pagination matters, the layout matters, the entirety of the book matters, the references and the citations to the book matter, and of course the illustrations matter; therefore, in pursuit of realism, digital systems should virtualise the whole book, not just its text. While Project Gutenberg is at one end of the spectrum (nominalists carefully proof-reading and hunkered down in ASCII or XML), Google with its Book Search digitisation project is at the realist edge, some would say ‘hyper-realist’ in its acceptance of blank endpapers and leather bindings, all part of the ‘real book’ as represented in a Google database. Google probably would, if it could, encode the sensory aura of historic books, the vinegar that comes from cholera-touched books.

But the modern predicament over books as texts, or books as virtual objects, is complicated by a dimension of uncertainty over the appropriateness of treating books as a collective whole, as parts of a library and a literature, or of digitizing them one at a time as individual atoms; perhaps, in some cases, with unique and unusual bibliographic or structural properties. Digital nominalists are governed by a standard of simplicity and hold that a text is a text, is a text. But some nominalists are atomic, whilst others favour a more holistic and uniform approach, in the interests of creating a library or a reading platform, in which all books can be searched and individual books isolated as readable downloads. Correspondingly, on the ‘realist’ side sits Google with its holistic and scalable method; Google’s whole strategy for digitizing books has been based on an assumption that all books should be accessed, searched and distributed through a single canonical library. Amazon, which has in most respects taken a ‘nominalist’ approach to the distribution of eBooks (it doesn’t do ePub, but its proprietary format ‘AZW’ is just another ASCII encoding standard), has also embraced a ‘holistic’ attitude. Amazon offers its customers global searching of the Amazon archive and encourages users to build up a collection, a mini-library of eBooks on their Kindle. Amazon, just as much as Google, would like to have a scalable and totalitarian solution to the whole of published literature. All Kindle titles are atoms in the same collective, distinguished by the fact that they can be sampled or acquired from Amazon, one book at a time as the consumer dictates and purchases.

Perhaps a diagram will aid the explanation of this digital predicament for computerized books:

What approach to digital books heads to the top left quadrant of this matrix? Why, apps of course. If we think of books as apps, they do have a reality and concreteness which exceeds the flatness of the mere ASCII text, but the book as app is also highly individual. Perhaps a paradigm of this approach is the Atomic Antelope Alice App which has caused such a stir. The inventors of the Alice book found an intriguing way of applying ‘physics’, acceleration and gravity, to the Tenniel illustrations in the Alice book. This is obviously a very special and un-generalisable treatment of a classic work, but as an app it is a brilliant proposition. Apps can afford to be sui generis since they stand on their own, and if this gives cataloguers and librarians a headache, too bad. The Exact Editions book and magazine apps are also in this segment of the diagram. It is, I would suggest, the potential inventiveness and the unpredictable future of the book as an application that has the most intriguing potential for the future of digital books (and libraries). If digital books do something completely novel and free-standing, something unprecedented in the world of print books, it will be because they are also software applications and can in that way assume a digital reality which exceeds our expectations of the traditional text.