Open Library provides dumps of all its data, generated every month. All of the dumps are formatted as tab-separated files with the following columns:
- type - type of record (/type/edition, /type/work etc.)
- key - unique key of the record (/books/OL1M etc.)
- revision - revision number of the record
- last_modified - last modified timestamp
- JSON - the complete record in JSON format
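As a rough illustration of the column layout above, the following Python sketch splits one dump line into its five fields and decodes the JSON column. The names `DumpRecord` and `parse_dump_line` are hypothetical helpers, and the sample line (including its timestamp format and revision number) is invented for the example rather than taken from a real dump.

```python
import json
from typing import NamedTuple


class DumpRecord(NamedTuple):
    """One row of a dump file (hypothetical helper type, not part of Open Library)."""
    record_type: str
    key: str
    revision: int
    last_modified: str
    data: dict


def parse_dump_line(line: str) -> DumpRecord:
    """Split a tab-separated dump line into its five columns."""
    # maxsplit=4 guarantees at most five fields, so the JSON column stays whole.
    record_type, key, revision, last_modified, raw_json = (
        line.rstrip("\n").split("\t", 4)
    )
    return DumpRecord(record_type, key, int(revision), last_modified, json.loads(raw_json))


# Invented sample line; the timestamp format is an assumption.
sample = "\t".join([
    "/type/edition",
    "/books/OL1M",
    "3",
    "2011-12-14T00:00:00",
    json.dumps({"key": "/books/OL1M", "title": "Example"}),
])

record = parse_dump_line(sample)
print(record.record_type, record.key, record.revision, record.data["title"])
```

Because the dumps are large, it is usually better to apply a function like this line by line while streaming the file than to read the whole file into memory.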
Dumps
- editions dump (~ 9.2G)
- works dump (~ 2.9G)
- authors dump (~ 0.5G)
- all types dump (~ 12.4G): includes editions, works, authors, redirects, etc.
- complete dump (~ 29.6G): also includes past revisions of all the records in Open Library
- ratings dump (~ 5M): with columns "Work Key, Edition Key (optional), Rating, Date"
- reading log dump (~ 65M): with columns "Work Key, Edition Key (optional), Shelf, Date"
- redirects dump (~ 50M)
- deletes dump (~ ?M)
- lists dump (~ ?M)
- other dump (~ ?M)
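The ratings dump uses its own four columns rather than the five listed earlier. A minimal sketch of reading it, assuming the file is tab-separated like the other dumps, that a missing edition key appears as an empty field, and that ratings are integers; the function name `average_ratings` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def average_ratings(lines):
    """Return the mean rating per work key from ratings-dump rows."""
    totals = defaultdict(lambda: [0, 0])  # work key -> [sum of ratings, count]
    # QUOTE_NONE treats any quote characters as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for work_key, edition_key, rating, date in reader:
        totals[work_key][0] += int(rating)
        totals[work_key][1] += 1
    return {work: total / count for work, (total, count) in totals.items()}


# Invented rows: Work Key, Edition Key (optional), Rating, Date
sample = io.StringIO(
    "/works/OL1W\t/books/OL1M\t4\t2024-01-01\n"
    "/works/OL1W\t\t2\t2024-01-02\n"
    "/works/OL2W\t\t5\t2024-01-03\n"
)

averages = average_ratings(sample)
print(averages)
```

The reading log dump has the same shape with a Shelf column in place of Rating, so the same reader setup should apply with a different aggregation.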
For past dumps, see: https://archive.org/details/ol_exports?sort=-publicdate
Downloading the dumps takes too long? Check out the link above and download via torrent for higher speeds!
Format of JSON records
A JSON schema for the various types is located at https://github.com/internetarchive/openlibrary-client/tree/master/olclient/schemata
- Author Records: JSON serialization of a type/author
- Edition Records: JSON serialization of a type/edition
- Work Records: JSON serialization of a type/work
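Since the JSON column differs by record type, code that reads a mixed dump typically branches on the type column. The sketch below assumes author records carry a `name` field and edition and work records carry a `title` field; check the schemata linked above before relying on these names. `LABEL_FIELD` and `record_label` are hypothetical names.

```python
import json

# Assumed field holding a human-readable label for each record type.
LABEL_FIELD = {
    "/type/author": "name",
    "/type/edition": "title",
    "/type/work": "title",
}


def record_label(record_type: str, raw_json: str):
    """Return a display label for a record, or None if no label field is known."""
    field = LABEL_FIELD.get(record_type)
    if field is None:
        return None
    # .get() tolerates records that lack the assumed field.
    return json.loads(raw_json).get(field)


# Invented sample records; the redirect type name is an assumption.
author_label = record_label("/type/author", '{"key": "/authors/OL1A", "name": "Example Author"}')
work_label = record_label("/type/work", '{"key": "/works/OL1W", "title": "Example Work"}')
redirect_label = record_label("/type/redirect", '{"key": "/works/OL2W"}')
print(author_label, work_label, redirect_label)
```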
Using Open Library Data Dumps
Please see this great guide by a contributor on the LibrariesHacked GitHub about how to load Open Library's data dumps into PostgreSQL to make them more useful and queryable:
https://github.com/LibrariesHacked/openlibrary-search
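For a quick local experiment before setting up PostgreSQL, the same idea can be sketched with Python's built-in `sqlite3` module. This is a stand-in for illustration, not the approach used in the guide above: the table name, column names, `load_dump` function, and sample lines are all hypothetical.

```python
import sqlite3


def load_dump(conn, lines):
    """Insert five-column dump lines into a table called records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "record_type TEXT, record_key TEXT, revision INTEGER, "
        "last_modified TEXT, record_json TEXT, "
        "PRIMARY KEY (record_key, revision))"
    )
    # Each line splits into exactly five fields; the JSON is stored as text.
    rows = (line.rstrip("\n").split("\t", 4) for line in lines)
    conn.executemany("INSERT OR REPLACE INTO records VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_dump(conn, [
    '/type/work\t/works/OL1W\t1\t2024-01-01T00:00:00\t{"title": "Example Work"}\n',
    '/type/author\t/authors/OL1A\t2\t2024-01-02T00:00:00\t{"name": "Example Author"}\n',
])

work_count = conn.execute(
    "SELECT COUNT(*) FROM records WHERE record_type = ?", ("/type/work",)
).fetchone()[0]
print(work_count)
```

The composite primary key on key and revision is a design choice made so that the complete dump, which includes past revisions, would not collide on the record key alone.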
GraphQL
DiFronzo on GitHub has produced a GraphQL proxy for searching books by work, edition, and ISBN using the Open Library API. It is deployed with Deno and GraphQL:
https://github.com/DiFronzo/OpenLibrary-GraphQL
OL Covers Dump
We do not yet have rolling monthly dumps of our book covers, despite a shared desire for them. Some historical cover dumps may be explored here:
https://archive.org/details/ol_data?tab=collection&query=identifier%3Acovers&sort=-addeddate
History
- Created December 14, 2011
- 28 revisions
May 22, 2024 | Edited by raybb | Edited without comment. |
May 22, 2024 | Edited by raybb | covers url sorted by date added |
May 22, 2024 | Edited by raybb | Edited without comment. |
May 22, 2024 | Edited by raybb | placeolder for redirects and other |
December 14, 2011 | Created by Anand Chitipothu | Documented Open Library Data Dumps |