Last edited by jachamp

August 2, 2024 | History

How You Can Help

May 11, 2010 - Please note that this page is entirely out of date.

The Open Library project is an enormous undertaking. We can't do it without your help.

If you can help with anything, do not hesitate to contact us!

Here are some things we can use your help on, categorized by skill:

Making phone calls

A site like this requires massive amounts of data; we'll take whatever we can get our hands on. Most of the data in the book world is locked up behind contracts or access restrictions. By contrast, we put everything we can on the Internet for free. But there's more data than we can collect by ourselves. That's where you come in: you can help us by calling various data providers to try to get us their data.

Libraries

Lots of libraries have large catalog collections stored as MARC data. If you can get your favorite friendly library to send us a dump of their data, we'll be forever indebted. There are a few requirements for data submitted to the Open Library. Status:

Imported

Records from these libraries have been imported.

Library of Congress: imported (8M records)
University of North Carolina: imported (4162599 good records, 12 bad)
Oregon State University: imported
Washington State University: imported
Lewis and Clark: imported
Oregon Health Science University: imported
National College of Natural Medicine: imported
Western States Chiropractic Community Library: imported
Portland Community College: imported
University of Toronto: imported (6M records)
Miami University of Ohio: imported (2028337 records, parses okay)
Western Washington University: imported
Boston College: imported (2.1M records)
Laurentian: imported
Boston Public Library: imported (2165372 records)
Talis: imported
Buffalo State College: imported
Collingswood Public Library: imported
Harvard's HOLLIS catalog: imported
Ithaca College Library: imported
San Francisco Public Library: imported
Binghamton University: imported

Not yet imported

Montana State: received
Drexel: received
Wayne State: unknown
indcat: uploaded
Lacrosse University: received

To pursue

University of California: pursuit in progress (12M records)
British Library: to pursue
Bibliotheque Nationale de France: to pursue
Library and Archives Canada: to pursue
national libraries of other countries: to pursue
Libon online books (Italy): to translate from UNIMARC

Having trouble convincing your local library? See if the Open Library Information Sheet will answer their questions.
Worried about OCLC licensing? Read more about their agreement.

Publishers

Most publishers provide "ONIX feeds" -- XML export of all their information about the books they publish. Unfortunately, you generally need to call them and make a deal to get these feeds. If you can pick a favorite publisher and try to work out a deal for us, you'll be doing a service for the cause of freedom. Lots of presses are owned by larger companies, so you might want to follow the ownership ladder to make sure you end up with as many records as possible. Status:

HarperCollins: imported
Simon and Schuster: imported
Random House: imported
Cambridge University Press: imported
Penguin: downloaded
Wiley: downloaded
Thomas Nelson: downloaded
O'Reilly: downloaded
Elsevier: received
Penn State University Press: received
other publishers...
sample letter to publisher

Book data sources

We have a list of all the ISBNs we know about (text, html) -- if you can collect information about these ISBNs from other data sources, we'd love it.

Powells: downloaded
Amazon: crawl donated

Book scans

Obviously, we'd also like scans of books:

Google: in October 2008, Open Library integrated 400-500K books into our system, retaining the original watermark. These books were not scanned by the Internet Archive, and we are not responsible for their quality.
OCA: downloaded
Million Books Project: negotiations in progress
BNF: downloaded

Copyright status

It's important to know whether a book has fallen into the public domain or not. First, this requires data about the book's copyright registration and renewal:

Canada: We're working with Creative Commons which has an agreement with Access Copyright Canada providing us with data on 300K works from 1742 to the present.
US:
- We've received a copy of the Stanford copyright renewal database, which has 250K records on US copyright renewal from 1950 to 1995.
- public.resource.org has been crawling the copyright.gov database; this gets us copyright records since 1978.
- Our own Sebastian Hammer has parsed Project Gutenberg scans from 1950 to 1977, resulting in 150K records. (Aaron Swartz has the details.)
- Our own Sebastian Hammer has crawled the copyright.gov database from 1978 to 1995, resulting in 600K records.

It also requires the development of algorithms for analyzing this information and calculating a current copyright status.

Canada: David Strauss has been working on an algorithm for us.
US: We have some legal advice from Creative Commons on what the algorithm is. There are also the famed Hirtle rules (more...)
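To give a feel for what a status calculation involves, here is a heavily simplified sketch for US works. It is not the algorithm mentioned above. It assumes the work was first published in the US with a valid copyright notice, and it encodes only two rules: copyright in a published work lasts at most 95 years after publication, and works published before 1964 had to be renewed to stay protected. The function name and return strings are made up for illustration.

```python
def us_copyright_status(pub_year, renewed, current_year):
    """Rough status guess for a work first published in the US with notice.

    pub_year     -- year of first publication
    renewed      -- True if a renewal record was found for the work
    current_year -- year in which the status is being evaluated
    """
    # A published work's term ends 95 years after publication at the latest.
    if current_year - pub_year > 95:
        return "public domain"
    # Before 1964, copyright lapsed unless it was renewed.
    if pub_year < 1964 and not renewed:
        return "public domain"
    # Everything else needs a closer look (assumption: treat as protected).
    return "possibly in copyright"


print(us_copyright_status(1955, renewed=False, current_year=2009))
```

A real implementation would also have to handle missing notices, foreign publication, unpublished works, and incomplete renewal data, which is why the registration and renewal records listed above matter.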

Swap sites

BookMooch: inventory, wish list
PaperBackSwap: inventory, wish list
More coming...

Popularity data

In a sea of books, it's nice to have some ways of seeing which ones are more "important" than others. You want your search engine to bring up The Da Vinci Code before The Da Vinci Method, if only because the numbers say that's more likely what people want. And when you're importing a book that says it's by David Eggers, you want to guess it's the author of A Heartbreaking Work of Staggering Genius and not Physical Chemistry: A Textbook. Similarly, good popularity data can keep you from recommending things like Harry Potter to people.

Amazon.com salesrank: proprietary
Bookscan sales numbers: proprietary
Library circulation data: to pursue
LibraryThing: received
Web mentions: to pursue
Store/library availability: to pursue (see price checker below)
our page view data: once we get more popular
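As a minimal sketch of how such data could be used, the snippet below orders candidates by a single popularity score, highest first. The Candidate class, the function name, and the scores are all invented for illustration; how a real score would be combined from the sources above is left open.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str         # a book title or an author name
    popularity: float  # hypothetical score; higher means more popular


def most_popular_first(candidates):
    """Return the candidates sorted so the most popular comes first."""
    return sorted(candidates, key=lambda c: c.popularity, reverse=True)


# Made-up scores, only to show the ordering.
matches = [
    Candidate("The Da Vinci Method", 0.1),
    Candidate("The Da Vinci Code", 0.9),
]
ranked = most_popular_first(matches)
print([c.label for c in ranked])
```

The same ordering could be used to pick the most likely author when several share a name.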

Respect data

Strict popularity isn't the only thing that matters. We also want to know whether the book is respected, even if it's not a bestseller.

For example, if you happen to have a copy of the Book Review Digest CD-ROM, we'd be massively indebted. The following libraries supposedly have a copy:

Los Angeles Public Library (we called them and they can't find their copy; it might be worth bothering them more)
Mohave County Library District (Kingman, AZ)
Southern Illinois University (Carbondale, IL)
World Book (Chicago, IL)
Indianapolis Public Library
Library of Congress
Pentagon Library
Saint Augustine's College (Raleigh, NC)
Nova Southeastern University (Ft. Lauderdale, FL)
US Military Academy (West Point, NY)
Université de Montréal (Montreal, QC)
Univ. of Nottingham (Nottingham, UK)
University of Hong Kong (Pokfulam, HK)
North West Univ. Potchefstroom Campus (NW, South Africa)
Indonesian national library
Williams College (Western Massachusetts) Stetson library
National University of Singapore Library

Other collections of reviews include:

Book Review Index
major journals (LJ, PW, Booklist)
intra-book citations
awards

Is there an online review aggregation site for books? Metacritic only does a handful.

Inter-book relations

ThingISBN: available as full dumps
xISBN: costs money, uses only published algorithms
FRBRizing algorithms: first draft of a merge algorithm
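One very simple way to group editions into works is to merge records whose normalized title and author match. The sketch below shows that idea only; it is not the draft merge algorithm mentioned above, and the record fields and ISBN placeholders are assumptions made for the example.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def group_editions(editions):
    """Group edition records that share a normalized (title, author) key."""
    works = defaultdict(list)
    for edition in editions:
        key = (normalize(edition["title"]), normalize(edition["author"]))
        works[key].append(edition["isbn"])
    return dict(works)


# Placeholder records; the ISBN values are not real identifiers.
editions = [
    {"title": "Moby-Dick", "author": "Herman Melville", "isbn": "isbn-a"},
    {"title": "Moby Dick", "author": "Herman Melville.", "isbn": "isbn-b"},
    {"title": "Typee", "author": "Herman Melville", "isbn": "isbn-c"},
]
works = group_editions(editions)
print(works)
```

Exact matching on normalized strings misses translations, subtitles, and name variants, which is where data such as ThingISBN helps.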

Copyright information

Registration and renewal records
Orphaned works data (from users and other archivists)
Our own registration service

Other

If you know other people with library data, we'd love that too.

Library science

See our page on librarianship for things you can help with.

Design

We want our site to look as good as possible, which means we're always interested in new designs. Luckily, Infogami has a powerful templating system that allows you to create your own look and feel for the site. For more information, read the guide to our wiki language. Let us know if you have questions or have developed a nifty new template.

Programming

Lots of our work here requires programming. If you have a lot of time, you can hop right in and become a serious developer on Infogami or Open Library. (Check out our bug list.) But if you have less time, perhaps you can pick up a smaller project:

Price check

We want Open Library to be a hub for all the book information on the Internet. As part of that, we're developing plugins to grab data from other sites. For example, we'd like code that checks prices at various stores and lists them, code that sees which libraries a book is available at near a zip code, code that checks whether a book is available in a Borders near a bookstore, etc. We'd also like code that sees whether a book is available at particular libraries or on particular book trading sites (like BookMooch).
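A plugin for this could be as small as one class per site, each answering the same question for an ISBN. The sketch below is one possible shape, not an existing Open Library interface: PriceSource, StaticSource, and check_prices are hypothetical names, and the in-memory source stands in for code that would actually query a store.

```python
from abc import ABC, abstractmethod
from typing import Optional


class PriceSource(ABC):
    """Hypothetical plugin interface: one subclass per store or site."""

    name: str

    @abstractmethod
    def lookup(self, isbn: str) -> Optional[float]:
        """Return the price for this ISBN, or None if the site lacks it."""


class StaticSource(PriceSource):
    """Stand-in source backed by a dict instead of a real site."""

    def __init__(self, name, prices):
        self.name = name
        self._prices = prices

    def lookup(self, isbn):
        return self._prices.get(isbn)


def check_prices(isbn, sources):
    """Ask every source and list (name, price) pairs, cheapest first."""
    found = []
    for source in sources:
        price = source.lookup(isbn)
        if price is not None:
            found.append((source.name, price))
    return sorted(found, key=lambda pair: pair[1])


# Placeholder ISBN and prices, for illustration only.
sources = [
    StaticSource("store-a", {"isbn-x": 12.5}),
    StaticSource("store-b", {"isbn-x": 9.0}),
    StaticSource("store-c", {}),
]
prices = check_prices("isbn-x", sources)
print(prices)
```

Library availability and book trading sites could follow the same pattern, with lookup returning availability instead of a price.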

Export

We'd love to have our data exported as RDF/XML, database dumps, and microformats, and made available through OAI, Z39.50, and a cover repository API.
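As an illustration of the RDF/XML case, the sketch below serializes one record using Dublin Core title and creator elements with Python's standard xml.etree.ElementTree module. The function name, the choice of Dublin Core, and the example URI are assumptions; the vocabulary a real export would use is not decided here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes instead of auto-generated ns0, ns1.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def book_to_rdf(uri, title, creator):
    """Serialize one book record as an RDF/XML string."""
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": uri}
    )
    ET.SubElement(description, f"{{{DC}}}title").text = title
    ET.SubElement(description, f"{{{DC}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")


# The URI is a placeholder, not a real Open Library address.
rdf_xml = book_to_rdf(
    "https://example.org/books/1",
    "A Heartbreaking Work of Staggering Genius",
    "Dave Eggers",
)
print(rdf_xml)
```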

History

Created March 4, 2009
68 revisions

August 2, 2024	Edited by jachamp	Replace character references with the characters that they represent
April 25, 2022	Edited by DriniBot	Fix permission open pages
April 12, 2022	Edited by Daniel Beatty	Edited without comment.
November 18, 2021	Edited by 207.241.232.201	1
March 4, 2009	Created by webchick	creating .en /about/help page