Archive for June 6th, 2008
Harvard University librarian Robert Darnton has a winding argument in the New York Review of Books about the role of research libraries in the age of Google. In it, he offers the reader a fantastic, if slightly discontinuous, contrarian view of the nature of information and news.
Information systems, he contends, have been reinvented repeatedly throughout history, and each iteration tends to shape its era in unique ways. Pages, instead of scrolls, divided information into concise, bounded units. Paper and the movable-type press revolutionized the scale of printing and the spread of ideas. Search engines have decreased the time and cost of research. Professor Darnton’s claim, though, is that “every age was an age of information, each in its own way, and that information has always been unstable.”
A former journalist, Darnton says, “I now distrust newspapers as a source of information, and I am often surprised by historians who take them as primary sources for knowing what really happened. I think newspapers should be read for information about how contemporaries construed events, rather than for reliable knowledge of events themselves.” To him, information is constantly in flux and biased in any one snapshot of it.
This isn’t detrimental, though. People tend to be able to learn to sort out truth from fiction. During the Revolutionary War, false reports from the battlefields were discounted before national policy was enacted. In Soviet Russia, state-controlled media was known to be biased and untrustworthy. With this weight of history, it is difficult to see the current explosion of information in the form of online publications as dangerous. Blogs may spread disinformation at times, but newspapers, radio and TV are no better.
After making this point, Darnton takes a look at Google Book Search and, though a supporter of the program, provides 8 points for caution. Though it is important to have a healthy debate about the future of books and information, I think many of Darnton’s points are premature or over-hyped worries. Darnton’s 8 points and my thoughts follow:
- Google cannot possibly scan every book in existence. The ignored books today could be valuable in the future, and if Google Book Search is the sole source of research, they will be effectively invisible.
However, by Darnton’s own admission, libraries, even the best, cannot contain every book. “Contrary to what one might expect, there is little redundancy in the holdings of the five libraries: 60 percent of the books being digitized by Google exist in only one of them.” So in our current system of geographically dispersed research libraries, it is very difficult to see the books Google might miss.
- Google’s index is missing specialized collections which have valuable books. They, too, might be invisible in a Google-centric future.
This is true… for now. While it is a real concern that Google might not have the financial incentive to scan every rare book, they have shown no sign of stopping their mission to “organize all the world’s information.”
- Even if Google can get the permission of publishers to display segments of previously published, copyrighted books, they have to scan all future books, as well. Even during this Internet age, more books are published every year in the USA (nearly 300,000 in 2006).
This assumes Google will not prevail in court (which seems highly probable). But even if they do settle, why would the publishers who are suing them not allow future books? If they see value in publishing current books, why not future ones?
- High-tech companies rise and fall quickly. If Google is lost, and it very well could happen, there go the digitized books.
This is a very important point. After Microsoft shuttered their scanning effort last month, it became apparent that Google’s digitized versions could disappear through bankruptcy or a change in business plans. How about a “Free Our Books” campaign to put public domain books in the digital commons?
- Google will make mistakes in the scans.
So? The number of errors is infinitesimally small (in my experience). Besides, this isn’t anything new: books have printing errors and typos, and when I visited the Library of Congress, I saw that many precious books have had pages stolen by collectors.
- The digital copies may not last. Just like old movies which have been lost to time, Google’s index may suffer failure and degradation. “The best preservation system ever invented was the old-fashioned, pre-modern book.”
What about fire? It sure doesn’t go nicely with paper. Anyways, Google’s engineers (whom he critiques in the next point) are experts at protecting digital data and providing redundant systems.
- Google’s algorithm is a secret and they will have the ability to shape research through what is displayed at the top of results. Even further, their algorithms might not be best suited to books; after all, they are great engineers, not bibliographers.
This is important and strikes at a larger question: do we want the decisions of one company to so grandly shape research in the future? With Google holding such a large share of the search market, “the best answer” is increasingly decided by one firm.
- There is real value in the touch and smell of a book. The size conveys something that is also lost on a computer screen.
I agree. I love books and still voraciously consume dead-tree information.
Finally, although it isn’t explicitly one of Darnton’s points, he spends considerable time expounding on the unstable nature of information. Diderot’s Encyclopedia was published in myriad different forms, frustrating buyers and sellers (and no doubt researchers). Isn’t Google better suited to deal with this than a research library? No matter how many stacks Harvard has, they cannot contain as many books as Google’s hard drives. Google can scan all the copies of Diderot’s Encyclopedia (and they are working on it) and make them accessible from any computer. Harvard can, at most, have a couple of copies. And with this in mind, Darnton’s conclusion seems a little hazy:
“Meanwhile, I say: shore up the library. Stock it with printed matter. Reinforce its reading rooms… I also say: long live Google, but don’t count on it living long enough to replace that venerable building with the Corinthian columns. As a citadel of learning and as a platform for adventure on the Internet, the research library still deserves to stand at the center of the campus, preserving the past and accumulating energy for the future.”
I say, reinvent the library. Find business models to make use of the mass of scanned books. Find ways to scan even the oldest, most brittle books. Find ways to organize, link and elucidate books formerly consigned to disparate individual locations due to their physicality.
[Photo: Flickr]