May 11, 2010 - Please note that this page is entirely out of date.
The Open Library project is an enormous undertaking. We can't do it without your help.
If you can help with anything, do not hesitate to contact us!
Here are some things we can use your help on, categorized by skill:
Making phone calls
A site like this requires massive amounts of data; we'll take whatever we can get our hands on. Most of the data in the book world is locked up behind contracts or access restrictions. By contrast, we put everything we can on the Internet for free. But there's more data than we can collect by ourselves. That's where you come in: you can help by calling various data providers and trying to get us their data.
Libraries
Lots of libraries have large catalog collections stored as MARC data. If you can get your favorite friendly library to send us a dump of their data, we'll be forever indebted. There are a few requirements for data submitted to the Open Library. Status:
Imported
Records from these libraries have been imported.
- Library of Congress: imported (8M records)
- University of North Carolina: imported (4162599 good records, 12 bad)
- Oregon State University: imported
- Washington State University: imported
- Lewis and Clark: imported
- Oregon Health Science University: imported
- National College of Natural Medicine: imported
- Western States Chiropractic Community Library: imported
- Portland Community College: imported
- University of Toronto: imported (6M records)
- Miami University of Ohio: imported (2028337 records, parses okay)
- Western Washington University: imported
- Boston College: imported (2.1M records)
- Laurentian: imported
- Boston Public Library: imported (2165372 records)
- Talis: imported
- Buffalo State College: imported
- Collingswood Public Library: imported
- Harvard's HOLLIS catalog: imported
- Ithaca College Library: imported
- San Francisco Public Library: imported
- Binghamton University: imported
Not yet imported
- Montana State: received
- Drexel: received
- Wayne State: unknown
- indcat: uploaded
- Lacrosse University: received
To pursue
- University of California: pursuit in progress (12M records)
- British Library: to pursue
- Bibliotheque Nationale de France: to pursue
- Library and Archives Canada: to pursue
- national libraries of other countries: to pursue
- Libon online books (Italy): translate from UNIMARC
Having trouble convincing your local library? See if the Open Library Information Sheet will answer their questions.
Worried about OCLC licensing? Read more about their agreement.
Publishers
Most publishers provide "ONIX feeds": XML exports of all their information about the books they publish. Unfortunately, you generally need to call them and make a deal to get these feeds. If you can pick a favorite publisher and try to work out a deal for us, you'll be doing a service for the cause of freedom. Lots of presses are owned by larger companies, so you might want to follow the ownership ladder to make sure you end up with as many records as possible. Status:
- HarperCollins: imported
- Simon and Schuster: imported
- Random House: imported
- Cambridge University Press: imported
- Penguin: downloaded
- Wiley: downloaded
- Thomas Nelson: downloaded
- O'Reilly: downloaded
- Elsevier: received
- Penn State University Press: received
- other publishers...
- sample letter to publisher
Book data sources
We have a list of all the ISBNs we know about (text, html) -- if you can collect information about these ISBNs from other data sources, we'd love it.
- Powells: downloaded
- Amazon: crawl donated
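When matching the ISBN list against other data sources, the same book may show up as a hyphenated ISBN-10 in one place and a bare ISBN-13 in another. The following is a minimal sketch of normalizing both forms to a 13-digit key before comparing. The function names are hypothetical and not part of any Open Library code, and the sketch does not verify the check digit of its input.

```python
from typing import Optional


def _all_digits(text: str) -> bool:
    """True only for non-empty strings made of ASCII digits 0-9."""
    return text.isascii() and text.isdigit()


def isbn13_check_digit(first12: str) -> str:
    """Compute the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw: str) -> Optional[str]:
    """Return a bare 13-digit ISBN, or None if the input has the wrong shape.

    Assumption: the check digit of the input is NOT validated here; a real
    importer would probably want to reject records with bad check digits.
    """
    cleaned = raw.replace("-", "").replace(" ", "").upper()
    if len(cleaned) == 10 and _all_digits(cleaned[:9]):
        # ISBN-10 -> ISBN-13: prefix "978", drop the old check digit,
        # and compute a new one.
        core = "978" + cleaned[:9]
        return core + isbn13_check_digit(core)
    if len(cleaned) == 13 and _all_digits(cleaned):
        return cleaned
    return None


if __name__ == "__main__":
    print(normalize_isbn("0-306-40615-2"))      # ISBN-10 form
    print(normalize_isbn("978-0-306-40615-7"))  # ISBN-13 form of the same book
```

Both calls in the usage example yield the same 13-digit key, so records from two sources can be joined on it.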
Book scans
Obviously, we'd also like scans of books:
- Google: in October 2008, Open Library integrated 400-500K books into our system, retaining the original watermark. These books were not scanned by the Internet Archive, and we are not responsible for their quality.
- OCA: downloaded
- Million Books Project: negotiations in progress
- BNF: downloaded
Copyright status
It's important to know whether a book has fallen into the public domain or not. First, this requires data about the book's copyright registration and renewal:
- Canada: We're working with Creative Commons, which has an agreement with Access Copyright Canada that provides us with data on 300K works from 1742 to the present.
- US:
  - We've received a copy of the Stanford copyright renewal database, which has 250K records on US copyright renewals from 1950 to 1995.
  - public.resource.org has been crawling the copyright.gov database, which gets us copyright records since 1978.
  - Our own Sebastian Hammer has parsed Project Gutenberg scans from 1950 to 1977, resulting in 150K records. (Aaron Swartz has the details.)
  - Our own Sebastian Hammer has crawled the copyright.gov database from 1978 to 1995, resulting in 600K records.
It also requires the development of algorithms for analyzing this information and calculating a current copyright status.
- Canada: David Strauss has been working on an algorithm for us.
- US: We have some legal advice from Creative Commons on what the algorithm is. There are also the famed Hirtle rules (more...)
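To give a feel for what such a status calculation might look like, here is a heavily simplified sketch for US works. It is not the project's algorithm and not legal advice. It assumes the work was first published in the US with a valid copyright notice, and it covers only the commonly summarized cases: published before 1923, published 1923-1963 (where renewal matters), and published 1964-1977 (a 95-year term). Everything else is reported as unknown. The function name and status strings are hypothetical.

```python
from typing import Optional

PUBLIC_DOMAIN = "public domain"
IN_COPYRIGHT = "in copyright"
UNKNOWN = "unknown"


def us_copyright_status(pub_year: int, renewed: Optional[bool], as_of_year: int) -> str:
    """Rough US status for a work first published in the US with notice.

    renewed: True/False if a renewal record was (not) found, None if we
    have not been able to check. Only consulted for 1923-1963.
    """
    if pub_year < 1923:
        return PUBLIC_DOMAIN

    # Protection lasts through the end of the 95th year after publication.
    term_expired = as_of_year > pub_year + 95

    if pub_year <= 1963:
        if renewed is None:
            return UNKNOWN          # need renewal data to decide
        if not renewed:
            return PUBLIC_DOMAIN    # copyright lapsed without renewal
        return PUBLIC_DOMAIN if term_expired else IN_COPYRIGHT

    if pub_year <= 1977:
        return PUBLIC_DOMAIN if term_expired else IN_COPYRIGHT

    # 1978 and later depends on facts this sketch does not model
    # (for example the author's death date), so we decline to guess.
    return UNKNOWN


if __name__ == "__main__":
    print(us_copyright_status(1950, False, 2010))
    print(us_copyright_status(1950, True, 2010))
    print(us_copyright_status(1985, None, 2010))
```

The sketch shows why the renewal databases listed above matter: for works from 1923 to 1963, the answer cannot be computed without knowing whether a renewal was filed.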
Swap sites
Popularity data
In a sea of books, it's nice to have some ways of seeing which ones are more "important" than others. You want your search engine to bring up The Da Vinci Code before The Da Vinci Method, if only because the numbers say that's more likely what people want. And when you're importing a book that says it's by David Eggers, you want to guess it's the author of A Heartbreaking Work of Staggering Genius and not Physical Chemistry: A Textbook. Similarly, good popularity data can keep you from recommending things like Harry Potter to people.
- Amazon.com salesrank: proprietary
- Bookscan sales numbers: proprietary
- Library circulation data: to pursue
- LibraryThing: received
- Web mentions: to pursue
- Store/library availability: to pursue (see price checker below)
- our page view data: once we get more popular
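The author-guessing idea above can be sketched in a few lines: when an imported record names an author that matches several known authors, pick the one whose works are most popular. This is only an illustration. The `AuthorCandidate` class, the `guess_author` function, and the popularity numbers are all made up for the example and would in practice come from sources like those listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AuthorCandidate:
    name: str
    known_work: str
    popularity: int  # hypothetical score; higher means more popular


def guess_author(name: str, candidates: List[AuthorCandidate]) -> Optional[AuthorCandidate]:
    """Return the most popular candidate whose name matches, or None."""
    wanted = name.strip().lower()
    matches = [c for c in candidates if c.name.strip().lower() == wanted]
    if not matches:
        return None
    return max(matches, key=lambda c: c.popularity)


if __name__ == "__main__":
    # Scores below are invented purely for illustration.
    candidates = [
        AuthorCandidate("David Eggers", "A Heartbreaking Work of Staggering Genius", 9000),
        AuthorCandidate("David Eggers", "Physical Chemistry: A Textbook", 40),
    ]
    best = guess_author("David Eggers", candidates)
    print(best.known_work if best else "no match")
```

A real importer would likely combine popularity with other evidence (subject, publication date, publisher) rather than rely on it alone.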
Respect data
Strict popularity isn't the only thing that matters. We also want to know whether the book is respected, even if it's not a bestseller.
For example, if you happen to have a copy of the Book Review Digest CD-ROM, we'd be massively indebted. The following libraries supposedly have a copy:
- Los Angeles Public Library (we called them and they can't find their copy; it might be worth bothering them more)
- Mohave County Library District (Kingman, AZ)
- Southern Illinois University (Carbondale, IL)
- World Book (Chicago, IL)
- Indianapolis Public Library
- Library of Congress
- Pentagon Library
- Saint Augustine's College (Raleigh, NC)
- Nova Southeastern University (Ft. Lauderdale, FL)
- US Military Academy (West Point, NY)
- Université de Montréal (Montreal, QC)
- Univ. of Nottingham (Nottingham, UK)
- University of Hong Kong (Pokfulam, HK)
- North West Univ. Potchefstroom Campus (NW, South Africa)
- Indonesian national library
- Williams College (Western Massachusetts) Stetson library
- National University of Singapore Library
Other collections of reviews include:
- Book Review Index
- major journals (LJ, PW, Booklist)
- intra-book citations
- awards
Is there an online review aggregation site for books? Metacritic only does a handful.
Inter-book relations
- ThingISBN: available as full dumps
- xISBN: costs money, uses only published algorithms
- FRBRizing algorithms: first draft of a merge algorithm
Copyright information
- Registration and renewal records
- Orphaned works data (from users and other archivists)
- Our own registration service
Other
If you know other people with library data, we'd love that too.
Library science
See our page on librarianship for things you can help with.
Design
We want our site to look as good as possible, which means we're always interested in new designs. Luckily, Infogami has a powerful templating system that allows you to create your own look and feel for the site. For more information, read the guide to our wiki language. Let us know if you have questions or have developed a nifty new template.
Programming
Lots of our work here requires programming. If you have a lot of time, you can hop right in and become a serious developer on Infogami or Open Library. (Check out our bug list.) But if you have less time, perhaps you can pick up a smaller project:
Price check
We want Open Library to be a hub for all the book information on the Internet. As part of that, we're developing plugins to grab data from other sites. For example, we'd like code that checks prices at various stores and lists them, code that sees which libraries a book is available at near a zip code, code which checks to see if a book is available in a Borders near a bookstore, etc. We'd also like code that sees whether a book is available at particular libraries or on particular book trading sites (like bookmooch).
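One possible shape for such plugins is a small registry in which each store contributes a function that takes an ISBN and returns a price, or nothing if the book is not stocked. The sketch below is an assumption about how this could be organized, not the actual Open Library plugin interface. The store names are placeholders, and the lookups are hard-coded stubs standing in for real requests to a store's site.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A price checker takes an ISBN and returns a price, or None if unavailable.
PriceChecker = Callable[[str], Optional[float]]

_checkers: Dict[str, PriceChecker] = {}


def register(store: str) -> Callable[[PriceChecker], PriceChecker]:
    """Decorator that adds a checker function to the registry under a store name."""
    def decorator(fn: PriceChecker) -> PriceChecker:
        _checkers[store] = fn
        return fn
    return decorator


@register("example-store-a")  # placeholder store, stubbed data
def check_store_a(isbn: str) -> Optional[float]:
    return {"9780000000002": 12.50}.get(isbn)


@register("example-store-b")  # placeholder store, stubbed data
def check_store_b(isbn: str) -> Optional[float]:
    return {"9780000000002": 11.00, "9780000000019": 8.25}.get(isbn)


def compare_prices(isbn: str) -> List[Tuple[str, float]]:
    """Ask every registered store and return (store, price) pairs, cheapest first."""
    results = []
    for store, checker in _checkers.items():
        price = checker(isbn)
        if price is not None:
            results.append((store, price))
    return sorted(results, key=lambda pair: pair[1])


if __name__ == "__main__":
    for store, price in compare_prices("9780000000002"):
        print(f"{store}: {price:.2f}")
```

Keeping each store behind the same function signature means a volunteer can add one store at a time without touching the comparison logic; library availability checks could follow the same pattern.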
Export
We'd love to have our data exported as RDF/XML, database dumps, OAI, microformats, and Z39.50, and exposed through a cover repository API.
History
- Created March 4, 2009
March 4, 2009 | Created by webchick | creating .en /about/help page |