Last edited by mangtronix
March 3, 2009 | History

Books URL - Proposal at 2009-01-29

Note: This document is in progress. If you are interested, please send feedback to mang at archive org

Goals

Bookreader URLs have the following goals:

Key-Value

Bookreader URLs are composed of key-value pairs. Keys and values are separated by '/'. The key-value pairs can occur in any order (open question: is this really what we want?), but we will specify a canonical order. A user-supplied URL will be remapped to this canonical order when given back to the user (e.g. by redirecting, so that the canonical form appears in the address bar). The purpose of the remapping is to reduce the number of URLs "out there" on the net that point to the same resource.

If a reader implementation does not understand a given key-value pair, it should ignore that pair.
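The key-value rule above can be sketched as follows. This is a minimal illustration in Python, not the reader's actual code: the function name parse_pairs and the KNOWN_KEYS set are hypothetical, and the proposal does not fix the list of keys a reader must understand.

```python
# Keys this hypothetical reader understands (an assumption for illustration).
KNOWN_KEYS = {"page", "pindex", "zoom", "mode", "search"}


def parse_pairs(path):
    """Split 'key/value/key/value' into a dict, skipping unknown keys."""
    segments = [s for s in path.split("/") if s]
    pairs = {}
    # Walk the segments two at a time; a trailing key with no value is dropped.
    for key, value in zip(segments[0::2], segments[1::2]):
        if key in KNOWN_KEYS:
            pairs[key] = value
    return pairs


print(parse_pairs("page/23/colour/blue/zoom/50"))
```

Here the pair colour/blue is not understood by the sketch, so it is skipped while page and zoom are kept.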

We decided to lose the distinction between "display options" and other options since there could be confusion over which options are "display" options versus "location" options.

Functionality

Bookreader URLs support the following functionality:

Example URLs

http://www.archive.org/stream/aliceinwonderlan00carriala/
http://www.archive.org/stream/aliceinwonderlan00carriala/page/23
http://www.archive.org/stream/aliceinwonderlan00carriala/page/23/zoom/50

These two are equivalent and would be remapped to a single URL:

http://www.archive.org/stream/aliceinwonderlan00carriala/page/23/zoom/50/mode/2up
http://www.archive.org/stream/aliceinwonderlan00carriala/zoom/50/mode/2up/page/23
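One way the remapping to a single URL could work is sketched below in Python. The canonical order used here (CANONICAL_ORDER) is an assumption, since this proposal has not yet fixed one, and the function name canonicalize is hypothetical. The sketch also assumes the path always begins with stream/{identifier}, and it keeps keys it does not recognise after the known ones rather than dropping them.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical canonical key order; the proposal does not specify one yet.
CANONICAL_ORDER = ["pindex", "page", "zoom", "mode", "search"]


def canonicalize(url):
    """Reorder the key-value pairs of a bookreader URL into canonical order."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the first two segments are 'stream' and the book identifier.
    prefix, rest = segments[:2], segments[2:]
    pairs = dict(zip(rest[0::2], rest[1::2]))
    # Known keys first, in canonical order; unknown keys keep their original order.
    keys = [k for k in CANONICAL_ORDER if k in pairs]
    keys += [k for k in pairs if k not in CANONICAL_ORDER]
    ordered = []
    for key in keys:
        ordered += [key, pairs[key]]
    path = "/" + "/".join(prefix + ordered)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


a = "http://www.archive.org/stream/aliceinwonderlan00carriala/page/23/zoom/50/mode/2up"
b = "http://www.archive.org/stream/aliceinwonderlan00carriala/zoom/50/mode/2up/page/23"
print(canonicalize(a) == canonicalize(b))
```

A server could compare the requested URL with the result of canonicalize and redirect when they differ.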

Referring to pages, leafs, indices

For a book with a set of numbered camera images we do not always have a mapping between these images and the page numbers (as printed in the book). In addition, certain pages are not numbered at all (e.g. a completely blank page may face a figure page, both of which are inserted between consecutively numbered pages). The image stack can also contain images which should not be considered for access (e.g. colour calibration cards).

When the page numbers are available they may be referenced with:

page/{page number}

Page numbers may be either numeric or a string (e.g. 'iii'). Books scanned with our earlier Scribe 1 software may have Roman numeral pages marked; books scanned with Scribe 2 do not. String-based page numbers should be compared in lowercase. Named pages (such as the title page) may be referred to using the page name.

Examples:

page/2
page/iv
page/title
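The lowercase comparison rule for page values might be implemented along these lines. This is a sketch in Python with hypothetical function names (normalize_page, same_page). Treating purely numeric values as integers, so that '02' and '2' compare equal, is an assumption that the proposal does not state.

```python
def normalize_page(value):
    """Return an int for numeric page values, else a lowercased string."""
    value = value.strip()
    # Only plain ASCII digits are treated as numeric (assumption).
    if value.isascii() and value.isdigit():
        return int(value)
    return value.lower()


def same_page(a, b):
    """Compare two page values from URLs after normalization."""
    return normalize_page(a) == normalize_page(b)


print(same_page("IV", "iv"), same_page("Title", "title"), same_page("2", "iv"))
```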

Question: There exist books (e.g. compilations of articles) which may have more than one page with the same number. How do we handle these?
Question: What named pages should we support?

An external site or embedding should not assume that page numbers are available or monotonically increasing. Foldouts, missing (e.g. damaged) pages, or other causes may make the page numbers discontinuous.

"Leaf" is a concept from the Archive's Scribe scanning software. It corresponds to the image sequence taken during the scanning process. The Archive.org scandata.xml refers to leafs. At the level of the bookreader and user-visible URLs the underlying leaf numbers should not be exposed unless necessary.

"Accessible page index" (pindex). Each page that should be included in the access formats (bookreader, PDF, etc.) is given a monotonically increasing number starting from 0. For the Archive this corresponds to pages with addToAccessFormat true in the scandata.xml.

Examples:

pindex/0
pindex/23
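The pindex numbering can be illustrated with a small Python sketch. It does not parse scandata.xml; instead it assumes the leaf records have already been loaded into dictionaries. The function name build_pindex and the 'leafNum' key are hypothetical; only addToAccessFormat comes from the text above.

```python
def build_pindex(leaves):
    """Map each pindex (starting at 0) to the leaf number it refers to.

    'leaves' is a list of dicts in scan order; the dict layout is assumed
    for illustration and is not the scandata.xml schema.
    """
    accessible = [leaf["leafNum"] for leaf in leaves if leaf["addToAccessFormat"]]
    return dict(enumerate(accessible))


leaves = [
    {"leafNum": 0, "addToAccessFormat": False},  # e.g. a colour calibration card
    {"leafNum": 1, "addToAccessFormat": True},
    {"leafNum": 2, "addToAccessFormat": True},
    {"leafNum": 3, "addToAccessFormat": False},
    {"leafNum": 4, "addToAccessFormat": True},
]
print(build_pindex(leaves))
```

Leaves excluded from the access formats consume no pindex, so pindex values stay contiguous even when leaf numbers do not.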

For books where multiple leafs have the same page number, both the pindex and the page number should be specified.

Example:
pindex/23/page/4
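How a reader resolves such a URL is not specified here. One possible policy, sketched in Python with a hypothetical resolve function, is to let the pindex decide when it is present and to fall back to the page number only when it identifies exactly one page. Both choices are assumptions, as is the input format (a list of page labels indexed by pindex, with None for unnumbered pages).

```python
def resolve(labels, pindex=None, page=None):
    """Return the pindex to display, or None if it cannot be determined."""
    if pindex is not None:
        # Assumption: an explicit pindex takes precedence over the page number.
        return pindex if 0 <= pindex < len(labels) else None
    if page is None:
        return None
    wanted = page.lower()
    matches = [i for i, label in enumerate(labels)
               if label is not None and label.lower() == wanted]
    # A page number shared by several pages is ambiguous without a pindex.
    return matches[0] if len(matches) == 1 else None


labels = ["title", None, "1", "2", "1"]
print(resolve(labels, page="2"), resolve(labels, page="1"), resolve(labels, pindex=4, page="1"))
```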

Display options

Display options inform the bookreader how the book should be displayed to the user. The GnuBook reader supports the following options:

Searching

Search terms can be highlighted by using search/ followed by the search string. The search string should be URL-escaped, and the slash character ('/') is not allowed.

Examples:

Searching for "cats":
search/cats

Searching for "cheshire cat":
search/cheshire%20cat
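Building the search component could look like the following Python sketch. The function name search_fragment is hypothetical, and rejecting a slash with an exception is one possible reading of "not allowed"; a reader could instead strip or ignore such terms.

```python
from urllib.parse import quote


def search_fragment(term):
    """Return the 'search/...' URL component for a search string."""
    if "/" in term:
        raise ValueError("the slash character is not allowed in search strings")
    # safe="" percent-encodes every reserved character, including spaces as %20.
    return "search/" + quote(term, safe="")


print(search_fragment("cats"))
print(search_fragment("cheshire cat"))
```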

More background reading

Image tiling

Transclusion

Ideas from meeting 2009-02-29

Concatenating multiple books (or sections of books):

stream/alice/pageRange/22-25/id/tomsawyer/pageRange/15-20
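As a rough illustration of how such a path might be read, the Python sketch below treats the segment after stream/ as the first book and each id/ key as the start of another. This interpretation is an assumption drawn only from the single example above, and the function name parse_concatenation is hypothetical. Range endpoints are kept as strings because page numbers are not always numeric.

```python
def parse_concatenation(path):
    """Split a concatenation path into a list of per-book sections."""
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] != "stream":
        raise ValueError("expected a path starting with stream/{identifier}")
    sections = [{"id": segments[1]}]
    rest = segments[2:]
    for key, value in zip(rest[0::2], rest[1::2]):
        if key == "id":
            # Assumption: each 'id' key begins a new book section.
            sections.append({"id": value})
        elif key == "pageRange":
            start, _, end = value.partition("-")
            sections[-1]["pageRange"] = (start, end)
        # Other keys are ignored in this sketch.
    return sections


print(parse_concatenation("stream/alice/pageRange/22-25/id/tomsawyer/pageRange/15-20"))
```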

History

March 3, 2009 Edited by mangtronix Edited without comment.
February 25, 2009 Edited by mangtronix Edited without comment.
January 22, 2009 Created by mangtronix Edited without comment.