Last edited by Drini

August 9, 2023

The State of the UI

"Public libraries are beloved institutions. When one is being built, everyone wants it to be special and everyone wants to feel a part of the project." - via librarian.net

The software powering Open Library is centered around a flexible templating system that enables anyone to have control over the organization of their information display. It is designed to be simple to use, to encourage participation, yet powerfully modular through the application of macros and principles of networked interface. In this document, we will review existing technologies, explain concepts, present a framework for discussion and outline a plan to meet those goals for a full site launch in October.

To discuss this document, join the General Discussion mailing list.

Table of Contents

Summary
Introduction
Becoming A Patron
1.	User Registration
1.1	The User Profile (Creating User Identities)
1.2	Expert Users
Finding Books
2.	Search
2.1	External Search Engines
2.2	Internal Search
3.	Tags (Faceted Folksonomy)
4.	Collections
4.1	Booklists
4.2	Groups
4.3	Curated Collections
Viewing Books
5.	The Book Viewer
5.1	OCR Text / Distributed Proofreading
5.2	Notable Quotations / Transclusion
5.3	Bookmarks
6.	Print Production
Building The Library
7.	Editing The Catalog
7.1	Editing Metadata
7.2	Reviewing Books
7.3	Rating Books
7.4	Adding Comments
7.5	Uploading Books
7.6	Conducting Research
7.7	Contributing Records

Developing The Site
8.	WikiLanguage
9.	Templates
9.1	Templates and Types
9.2	Macros

User Scenarios
10.0	Administrator
10.1	Commenter
10.2	Book Author
10.3	Writer (Reviewer / Metadata Contributor)
10.4	Promoter
10.5	Book Hunter
10.6	Freeloader
10.7	Researcher / Scholar
10.8	Proofreader
10.9	Curators / Contextualizers
10.10	Book Lovers (clubs, groups)
10.11	Libraries
10.12	Publishers
10.13	API Users / Developers
Acknowledgements
Authors
References
Design Mockups
Rollout Plan
Special Considerations
A.	Small Library Outreach
B.	Ensuring Data Integrity
C.	User Base Fragmentation
D.	Internationalization

Summary

Open Library will have a graduated accounts system in which users automatically get an account by taking an action on the site, without having to explicitly register. In addition to drawing substantial traffic from Google, it will have a powerful search engine with faceting. It will support tags and user-defined collections, and enable the many different types of users who love books to communicate, share knowledge and work effectively together.

It will have an advanced book viewer that will handle OCRed text with distributed proofreading as well as transclusion and bookmarks. Books will also be available as print on demand. Anyone can edit the catalog or review or rate or comment on a book. Out-of-copyright books can be uploaded or, through a donation to Open Library, users can fund the scanning of them.

The site uses an advanced wiki language to allow users to modify its templates and add new types of data.

Introduction

Open Library is built using what is known as a structured wiki, "a system that combines the benefits of - as it seems like - contradicting worlds of plain wikis and database systems. This gives you a collaborative database environment where knowledge can be shared freely, and where structure can be added as needed. In a structured wiki, users can create wiki applications that are very specific to their needs [...]" [1].

Open Library's architecture goes a step further than other structured wikis: not only is the content fully editable, but so are the underlying template sets. Whereas a normal wiki page is just a blob of text [2], a semi-structured wiki allows more granular storage of data. In our structured wiki, users can create new "types" of pages, with schemas that allow editing forms to be generated, and templates that determine how the information gets displayed. Advanced modes will allow users to break out of the traditional templating system to give them full control over the display of the page, while still having the data be stored as data [3].
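To make the idea of a type with a schema and a template more concrete, here is a minimal sketch in Python. It is not the Open Library implementation: `BOOK_SCHEMA`, `BOOK_TEMPLATE`, `edit_form` and `render` are names invented for this illustration, and the schema format is an assumption. It only shows how one stored record could feed both a generated editing form and a display template.

```python
from html import escape
from string import Template

# Hypothetical schema for a "book" type: field name -> kind of form control.
BOOK_SCHEMA = {"title": "text", "author": "text", "description": "textarea"}

# Hypothetical display template; $name placeholders are filled from stored data.
BOOK_TEMPLATE = Template("<h1>$title</h1><p>by $author</p><p>$description</p>")


def edit_form(schema, data):
    """Generate one form control per schema field, pre-filled with stored values."""
    rows = []
    for name, kind in schema.items():
        value = escape(str(data.get(name, "")), quote=True)
        if kind == "textarea":
            rows.append(f'<textarea name="{name}">{value}</textarea>')
        else:
            rows.append(f'<input type="text" name="{name}" value="{value}">')
    return "<form>" + "".join(rows) + "</form>"


def render(template, schema, data):
    """Display the same stored data through a template, escaping every value."""
    safe = {name: escape(str(data.get(name, ""))) for name in schema}
    return template.substitute(safe)


book = {"title": "Moby-Dick", "author": "Herman Melville", "description": "A whale & a captain."}
print(edit_form(BOOK_SCHEMA, book))
print(render(BOOK_TEMPLATE, BOOK_SCHEMA, book))
```

Because the record stays a plain dictionary of fields, swapping in a different template changes the display without touching the data.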

This approach to flexibility creates an architectural environment that is not monolithic [*] and can adapt to the needs of any individual or group that wants to use it. This goes beyond the notion of designing site skins or branding - it encourages ongoing construction and development of the site content as well as the site's tools by the users of the system.

The beauty of this is in its simplicity: anyone willing to learn a simple formatting language can edit the catalog (see Editing the Catalog), and anyone with a knowledge of CSS and HTML can build their own templates (see WikiLanguage Documentation). By opening up the system for this type of involvement, we are creating a "low-barrier enrichment system" that encourages participation from a broad base of users, and will make the content more meaningful over time [4].

As technologies and APIs are getting more complex, it is exciting to see a system that puts the tools back in the hands of humans, as the original WWW was designed to do, thus facilitating easy and meaningful access to our collective history [5]. Open Library is purposefully designed to appeal to people familiar with these fundamental technologies, and to capitalize on more recently adopted standards, such as the formatting language broadly used by Wikipedia and other wiki software.

Becoming A Patron

1. User Registration

User accounts will be created in Open Library by what is known as a graduated accounts system. An account is automatically created as soon as a user begins interacting with the site, and is remembered in the browser by a cookie. Initially, the user will be identified by an anonymous handle, such as "Welcome user8585, (Fix your name!)". As a user continues to interact with the site, they can provide more information (username, password, email, etc.) as they see fit, so they can slowly grow into an account. Support for OpenID will be provided as well [6].
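The graduated flow described above might look roughly like the following sketch. The `Account` class, its fields and the two helper functions are hypothetical names chosen for illustration, not the actual Open Library account model.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    """Hypothetical account record: anonymous at first, enriched later."""
    handle: str                      # system-assigned, e.g. "user8585"
    cookie_token: str                # value that would be stored in the browser cookie
    username: Optional[str] = None
    email: Optional[str] = None

    @property
    def display_name(self) -> str:
        return self.username or self.handle

    @property
    def is_anonymous(self) -> bool:
        return self.username is None


def create_on_first_action(sequence_number: int) -> Account:
    """Create an account the first time a visitor interacts with the site."""
    return Account(handle=f"user{sequence_number}", cookie_token=secrets.token_hex(16))


def add_details(account: Account, username: Optional[str] = None,
                email: Optional[str] = None) -> Account:
    """Let the visitor fill in more of the account whenever they choose."""
    if username:
        account.username = username
    if email:
        account.email = email
    return account


visitor = create_on_first_action(8585)
print(f"Welcome {visitor.display_name}, (Fix your name!)")
add_details(visitor, username="bookworm")
print(f"Welcome {visitor.display_name}")
```

The point of the design is that no step is mandatory: the same record works whether or not the optional fields are ever filled in.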

All user contributions will be available as RSS Feeds.

1.1 The User Profile

In Open Library, user pages will be protected pages that only that user can edit, and they will contain system-generated data that reflects their role on the system to other users. (On Open Library, each page, book, author, and user is a thing in the database [7].) For instance, the pages of commenters will show their comments, the pages of curators will show their collections, and developers' pages will display their latest contributions to Open Library. Comments about users will be realized in the form of user ratings and testimonials.

In addition to this, a user can answer a series of questions so their profile page will best reflect the set of tools (macros) to suit their needs when viewed by them. The basic pairing of interface to user tasks will follow the profiles set forth in our user scenarios. In essence, the user profile page becomes an independent workspace for the individual. All of this relates to task-oriented navigation that reflects the way that different individuals use a traditional library. Some people are primarily focused on research. Others are much more social. And some would just like to rearrange all of the books on the shelves so that they are easier to find. All suggested views are user-modifiable as the macros and templates are interchangeable and editable through the wiki (once a user masters the skills of an Open Library developer).

This work-oriented approach to a user's view of their own profile page is very similar to an intranet architecture.

1.2 Expert Users

For many users, expert credentials will be important. Registered librarians and other certified information professionals will be able to link to their credentials online to help establish credibility in a particular domain (library associations, for example). Special "certifications" will be visible to other users when they view an expert's profile page, and their contributions will also be clearly delineated when the history of any page is viewed. Alternatively, a user can flag a page that requires the attention of an information expert so that it will appear in a queue on their user page.

This approach has raised some concerns regarding fragmenting the user base (See User Base Fragmentation).

Finding Books

2. Search

One goal of our interface is to make the discovery of books easy and, at times, serendipitous. Through a variety of methods tailored to meet what comes naturally to a broad variety of users, we plan to enable users to discover books in unexpected ways.

At the core of every discovery effort is the search query and the result. This may be a very precise search where a user has a clear idea what they are looking for (and may even have hard data, such as an ISBN). In this scenario, a user may prefer that the actual object be delivered to them with minimal intermediary information (i.e., the card in the library catalog). Alternatively, users may have some idea what they are looking for but either (a) do not have hard data to perform a precise search, or (b) are open to alternative suggestions. This is where the card catalog from a traditional standpoint can be completely reinvented through folksonomies and social networking:

The role of the cataloging rules in enabling or disabling these goals is not just a matter of insuring that library systems can accept metadata from non-library sources, and that library metadata can leak out into the public network environment. We must set the stage, with our standards and our use of technology, for library bibliographic services that serve today's users. These users are increasingly ones who have never known a world without computers, much less a world without the Internet. The new generation of users begins each information quest with a few typed keywords into an online query box. When seeking a book whose title they only partly remember, many of them turn to Amazon. There they not only get the bibliographic information that they sought but also find themselves in a reassuring online community that reviews, recommends, and encourages them to take part [8].

The Open Library strategy for helping users locate a book is multi-faceted, and involves capitalizing on both external and internal methods.

2.1 External Search Engines

External search engines (such as Google) will provide us with significant traffic to our individual book pages. With this in mind, it is important to view every book page as a landing page, and extend that context into our own environment. Referrer tracking should pass variables such as the keywords used in a search, which can be used to guide users to other book selections. This would enable us to reverse engineer concepts such as Google AdSense, which extracts the key themes of a page - except that instead of showing our site visitors ads, we will present them with book recommendations.
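As a rough sketch of the referrer idea, the code below pulls search words out of a referrer URL and ranks a toy catalog by overlap. It assumes the referrer carries the query in a `q` parameter, which not every search engine passes along. The `recommend` function and the catalog structure are hypothetical.

```python
from typing import Dict, List, Set
from urllib.parse import parse_qs, urlparse


def search_terms_from_referrer(referrer: str, param: str = "q") -> List[str]:
    """Return the lower-cased search words found in a referrer URL, if any."""
    query = parse_qs(urlparse(referrer).query)
    phrase = query.get(param, [""])[0]
    return phrase.lower().split()


def recommend(terms: List[str], catalog: Dict[str, Set[str]]) -> List[str]:
    """Rank titles by how many of the visitor's search words their subjects share."""
    scored = [(len(set(terms) & subjects), title) for title, subjects in catalog.items()]
    return [title for score, title in sorted(scored, reverse=True) if score > 0]


# Hypothetical catalog: title -> subject keywords.
CATALOG = {
    "Moby-Dick": {"whaling", "sea", "novel"},
    "Two Years Before the Mast": {"sea", "memoir"},
    "Walden": {"nature", "essay"},
}

terms = search_terms_from_referrer("https://www.google.com/search?q=whaling+sea+stories")
print(terms)
print(recommend(terms, CATALOG))
```

A visitor arriving without a usable referrer simply gets an empty term list, so the landing page can fall back to its default recommendations.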

Constructing the book landing pages so that they reflect a large library of content with a healthy readership will improve the individual page rank in search engines like Google. It is important here to create an inviting entry point that will immerse the user into our method of search and retrieval immediately.

2.2 Internal Search (Predetermined Facets)

Open Library's search engine, which is powered by Solr, uses a familiar library classification technique known as faceted navigation. Facets help users find the book they are searching for because they break down the rigidity of more structured taxonomic hierarchies. Internal search and browse macros will be available to view search results as facets or clouds.

"We're starting to talk about faceted browsing of the catalog, which would give users the sort of "looking at books nearby on the shelf" experience, but along many more axes than the single one provided by a physical arrangement of books. What "facets" of books might our users be interested in? Might librarians like to browse through a list of subject headings provided by the Library of Congress, and then find a list of all books under a particular heading? How might a user browse through subjects taken from different vocabularies? How ought geographical regions and historical periods to be represented, in order that they can be browsed effectively by users? Imagine being able to pan around a map of the world and click on a place to find books about it [9]."

Some examples of useful subject-based classification systems [10] include date, language, place (as topic), genre, time period (topical), topic area, subject, author, full text, and perhaps even age group - so that parents and young readers can easily target age-appropriate materials. [*]
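The mechanics of faceted navigation can be sketched without any search engine at all. The example below is plain Python, not Solr, and the records and field names are made up. It shows the two operations a faceted interface needs: counting the values available under a facet, and narrowing the result set once the user picks some.

```python
from collections import Counter

# Hypothetical catalog records; the field names are illustrative only.
RECORDS = [
    {"title": "Moby-Dick", "language": "eng", "genre": "fiction", "place": "Atlantic Ocean"},
    {"title": "Walden", "language": "eng", "genre": "essay", "place": "Massachusetts"},
    {"title": "Candide", "language": "fre", "genre": "fiction", "place": "Europe"},
]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet."""
    return Counter(r[field] for r in records if field in r)


def narrow(records, **selected):
    """Keep only the records matching every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]


print(facet_counts(RECORDS, "genre"))
english_fiction = narrow(RECORDS, language="eng", genre="fiction")
print([r["title"] for r in english_fiction])
```

Unlike a single shelf order, any combination of facets can be applied in any sequence, which is what gives the "many more axes" browsing experience described in the quotation above.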

3. Tags (Folksonomy)

In addition to a predetermined categorization, Open Library will enable users to tag books so they can easily categorize their collections in a framework that is comfortable to them. As Casey Bisson notes, "People don't tag to help the community, they tag because it helps the tagger" [11] - however, this organic method of organizing can lead to serendipitous book discovery when the tags are exposed to the entire community. Tagging is one widely adopted way for users to begin linking books together in Open Library.

4. Collections

4.1. Booklists

Tags provide users with a low-hassle, but very general, mechanism to classify books. Booklists provide users with the next level of control, enabling them to group books at another order of classification.

These booklists are pages created by users (or groups of users) that display books that are deemed related by some criteria specified by the user (or group). The most basic classification at this level would be a list of favorite books from the catalog, made readily accessible through an "Add to My Favorites" button or link. This list would be displayed on a user's page, and the book pages would link back to users who have added this book to their "Favorites" collection.

However, these booklists can become much more sophisticated by allowing some form of discretionary access control. Then these booklists become a useful organizational tool by creating an environment where users can build sophisticated maps of their collections, publishing each object, and perhaps even attribute, on an individual basis. Representations of media objects can be added or removed from these collections, and be classified as 'public' or 'private' as a user deems appropriate. It is important to note that what is being made 'private' is the representation of the media object within a user's collection and not the global media object itself (which is still visible to all other users of the system).

The popular photo-sharing site Flickr offers one model: it divides content channels known as "photostreams" into three types - "my photos," "my contacts' photos," and "everyone's photos" - so a user can control their view of the world. Conversely, the H2O project, by the Berkman Center for Internet & Society, does not have a 'private/public' feature on an item-by-item basis - only an entire list can be made public or private [12].
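The item-by-item model can be sketched as follows. The `Entry` and `Booklist` classes are hypothetical. The key point they illustrate is that the public/private flag lives on the user's representation of a book within a list, never on the book record itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """A user's representation of a book inside one of their lists."""
    book_key: str
    public: bool = True


@dataclass
class Booklist:
    owner: str
    name: str
    entries: List[Entry] = field(default_factory=list)

    def add(self, book_key: str, public: bool = True) -> None:
        self.entries.append(Entry(book_key, public))

    def visible_to(self, viewer: str) -> List[str]:
        """The owner sees every entry; everyone else sees only the public ones."""
        return [e.book_key for e in self.entries if e.public or viewer == self.owner]


favorites = Booklist(owner="bookworm", name="My Favorites")
favorites.add("moby-dick")
favorites.add("thesis-sources-vol-1", public=False)

print(favorites.visible_to("bookworm"))
print(favorites.visible_to("visitor"))
```

A list-level switch of the kind H2O uses would be a single boolean on `Booklist` instead. The per-entry flag is what allows a scholar to keep part of a working list private until the research is ready to publish.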

Aside from showing that Open Library is in tune with the needs facing today's libraries in an information economy [13] [14], enabling this level of control provides scholars with an environment to conduct original research using Open Library tools. Once that research has concluded, the data can then be published to the system when the author deems appropriate, thus enriching the data at Open Library.

4.2 Groups

Informal groups are created by establishing the equivalent of watchlists of other users' contributions to Open Library and exposing these relationships on the user's page (once again, this is similar to contact lists in the Flickr model).

Explicit groups can be established by creating a group and a corresponding group wiki page that is editable only by members of that group. Group membership can be closed, by invitation only or public. Group affiliations will be automatically listed on user pages.

Implicit groups (as outlined in our user scenarios) will generally form through organic use of the system.

4.3. Curated Collections / Independent Library OPAC

Open Library will provide tools for any group or library to host their curated collection. In the case of libraries, a small organization could submit a list of ISBNs (and/or other identifying info) and then download a complete set of catalog records for everything in their library. They could, additionally, choose to automatically create an online OPAC (open to the public) and an online ILS (private to the library) hosted at Open Library, but branded with their information. As long as they have an internet connection, their tech problem is solved.
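Before a submitted list of ISBNs is matched against the catalog, it would be reasonable to check it for mistyped numbers. The sketch below uses the standard ISBN-10 and ISBN-13 check-digit rules. The `split_submission` helper is a hypothetical name, and this validation step is our suggestion rather than a documented part of the import process.

```python
DIGITS = set("0123456789")


def is_valid_isbn(raw: str) -> bool:
    """Check the ISBN-10 or ISBN-13 check digit, ignoring hyphens and spaces."""
    isbn = raw.replace("-", "").replace(" ", "").upper()
    if len(isbn) == 10 and set(isbn[:9]) <= DIGITS and (isbn[9] in DIGITS or isbn[9] == "X"):
        # ISBN-10: weights 10 down to 1, "X" stands for 10, total divisible by 11.
        values = [int(c) for c in isbn[:9]] + [10 if isbn[9] == "X" else int(isbn[9])]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    if len(isbn) == 13 and set(isbn) <= DIGITS:
        # ISBN-13: weights alternate 1, 3, 1, 3, ...; total divisible by 10.
        return sum((3 if i % 2 else 1) * int(c) for i, c in enumerate(isbn)) % 10 == 0
    return False


def split_submission(isbns):
    """Separate a submitted list into usable ISBNs and ones to send back for review."""
    good = [i for i in isbns if is_valid_isbn(i)]
    bad = [i for i in isbns if not is_valid_isbn(i)]
    return good, bad


good, bad = split_submission(["0-306-40615-2", "978-0-306-40615-7", "978-0-306-40615-8"])
print("accepted:", good)
print("needs review:", bad)
```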

More advanced users may opt to download Open Library software and data to create their own systems instead of using the public Open Library site. We will support this special class of users at our developer hub and foster a community environment that encourages continued participation between the public and independent efforts.

One notable related effort is NINES (Networked Infrastructure for Nineteenth-century Electronic Scholarship) [15] [16] [17], which offers an excellent example in The Rossetti Archive, a curated collection running on open source software developed by Bethany Nowviskie. The software will soon support the creation of syllabi, annotated bibliographies, illustrated essays, and timelines [*].

Viewing Books

When the full text of a work is available, books will be downloadable in a variety of formats, including PDF, DjVu, XML and full text. All of these formats will be readily accessible from every book page in the wiki, and repurposed in the book viewer UI, currently under development.

5. The Book Viewer

The book viewer, which is currently based on a flip-book model inspired by a British Library kiosk [18], is a key area where Open Library can innovate from a user interface perspective. At present, it inserts tabs on pages with search terms so they are easy to navigate to, and it highlights those search terms in context on those pages. Unfortunately, it does not support OCR or bookmarking within pages, nor the viewing of shorter academic papers such as the one used in our current guided tour. Major cross-platform incompatibilities also exist, especially on the Macintosh platform.

Fortunately, the flexibility of the wiki template architecture will support multiple book viewers if desired. These alternative book viewers can take advantage of the DjVu=>OCR efforts by enabling a true full text viewing of books where users can contribute to the proofreading process by correcting the OCR, highlighting notable quotations, inserting those notations into other pages (or external blogs - transclusion), annotating, and bookmarking within other books.

5.1. OCR Text / Distributed Proofreading

All book content will be made available as OCR so users can help correct the data. The interface will be easily accessible so that users can make corrections to the text with minimal effort (one early prototype of the book viewer supported this by letting users click on the scan; another solved the problem by displaying the OCR text next to the scan). Books in need of proofing can be flagged so they appear on the page of any user who has assumed a proofreader's role in Open Library. Any revisions will be tracked by the versioning system like any other edit to the system.

5.2. Notable Quotations / Transclusion

The ability to mark notable quotations in books and embed them into external documents yet remain linked in context opens up the possibility to truly achieve transclusion. Markup language (or an AJAX driven annotation interface where markers can be dragged and inserted into the full text view) will be used to highlight text, and then once the snippet has been contained, a user action can be applied to that selection.

Actions include, but are not limited to, export, discussion, annotation, and citation. Once an action has been initiated, other users can mouseover a selection of text and participate in the associated activity. Or, alternatively, the actions could be docked in the page margin anchored to the corresponding page line number. Either way, a user should be able to easily turn on all discussion highlights, or footnote highlights, etc., or turn them off so that a user can get a clear view of the text.

The interface for these marks should also support filtering so a user can view "all the marks I made in this book" or "all the marks [this person - this group] made in this book".

With export, a code identifier will be generated so that a representation of the highlighted text can be embedded into another book or into an external document (web page, blog, PDF). One possible model to support this is the social bookmarking site Ma.gnolia, which enables users to embed system-generated tag clouds or lists of bookmarks on their site by generating code for insertion into an external page.

5.3. Bookmarks

Persistent bookmarking and retaining descriptive page information that can be extracted from the DjVu scan files will be important, so that edits to the text can remain anchored to line numbers or pages. For instance, it would be useful if we could know that a mark was made in this book on this page (or even more meaningful: in this book in this chapter and in this paragraph).
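One way to picture anchored marks, together with the filtering described in section 5.2, is the sketch below. The `Mark` record and its fields are assumptions made for illustration. In practice the anchoring data would have to come from whatever page and paragraph information can be extracted from the DjVu files.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mark:
    """Hypothetical annotation anchored to a position inside a book."""
    book_key: str
    page: int
    paragraph: Optional[int]   # None when only the page is known
    user: str
    kind: str                  # e.g. "bookmark", "quotation", "discussion"


def marks_in_book(marks: List[Mark], book_key: str, users=None, kind=None) -> List[Mark]:
    """Filter marks by book, optionally by a set of users (one person or a group) and by kind."""
    found = [
        m for m in marks
        if m.book_key == book_key
        and (users is None or m.user in users)
        and (kind is None or m.kind == kind)
    ]
    # Present marks in reading order.
    return sorted(found, key=lambda m: (m.page, m.paragraph or 0))


MARKS = [
    Mark("moby-dick", 42, 3, "bookworm", "bookmark"),
    Mark("moby-dick", 7, None, "scholar", "quotation"),
    Mark("walden", 1, 1, "bookworm", "bookmark"),
]

# "All the marks I made in this book"
print(marks_in_book(MARKS, "moby-dick", users={"bookworm"}))
# "All the marks this group made in this book"
print(marks_in_book(MARKS, "moby-dick", users={"bookworm", "scholar"}))
```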

6. Print Production

Having access to the OCR of books opens up greater possibilities when it comes to the output of these books by users as PDF files. Using an HTML-to-PDF converter such as PrinceXML (is there an open source alternative?), a user can control the appearance of all the books they print from Open Library. These special CSS files can be stored in the wiki like any other template and shared with other users (like a virtual bookmobile). In essence, anyone can design their own library from card catalog to printed product. This feature would work in conjunction with the distributed proofreading efforts planned for Internet Archive's DjVu OCR.

An interesting scenario is that a small rural library could possibly raise funds by designing books and selling them online to their patrons, with proceeds from the sale being divided between the library and Open Library.

Building The Library

7. Editing The Catalog

Anyone can help build Open Library. A user doesn't have to have advanced technical knowledge in computer or informational sciences to dive in - all anyone needs to begin contributing is a knowledge of books.

7.1. Editing Metadata

The content of every page in the wiki is editable by any user. The site uses a simple text-to-HTML conversion tool for web writers known as Markdown [19], which presents a low barrier to entry to begin editing the catalog.

When an edit has been made, it is stored in the revision history so it is easy to see exactly what changes have been made and by whom. Each revision in the history can be annotated by a comment to explain exactly why a user made a particular change.
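A revision history of this kind can be sketched in a few lines. The `PageHistory` class is hypothetical and keeps everything in memory. It uses Python's standard `difflib` module to show what changed between two revisions, alongside who made the change and why.

```python
import difflib
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    author: str
    comment: str
    text: str


class PageHistory:
    """Hypothetical append-only history for a single wiki page."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = []

    def save(self, author: str, text: str, comment: str = "") -> int:
        """Store a new revision and return its number (starting at 0)."""
        self.revisions.append(Revision(author, comment, text))
        return len(self.revisions) - 1

    def diff(self, older: int, newer: int) -> List[str]:
        """Line-by-line changes between two revisions, in unified diff format."""
        before = self.revisions[older].text.splitlines()
        after = self.revisions[newer].text.splitlines()
        return list(difflib.unified_diff(before, after, lineterm=""))


history = PageHistory()
first = history.save("user8585", "title: Moby Dik\nauthor: Herman Melville")
second = history.save("bookworm", "title: Moby Dick\nauthor: Herman Melville",
                      comment="Fix typo in title")

for line in history.diff(first, second):
    print(line)
print(history.revisions[second].author, "-", history.revisions[second].comment)
```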

This opens up a world of possibility, and raises a few red flags (see Ensuring Data Integrity). In addition to the core bibliographic record, a user can enhance the metadata with additional information such as links to other books, table of contents, ratings, book reviews, and comments. Other types of edits that a user will need to make in the system are design modifications and proofreading edits. It will be important to denote these edits in the revision history so they are easy to distinguish from one another.

Users will also be encouraged to provide copyright information on any content they glean from external sources. A style guide for copyright and citation is currently under development.

7.2. Reviewing Books

A simple review system supporting professional and user-submitted reviews is currently under development. Reviews are more informal critiques, a la Amazon, that can be associated with the corresponding book, timestamped, and listed on the contributing user's page. A more in-depth analysis of a work falls under Conducting Research.

7.3. Rating Books

In addition to using ILL data to determine ratings, internal popularity data about a book can be collected by providing users with a simple rating mechanism [20] and keeping track of books that a user has downloaded. When they return to the site, they would be prompted by a macro that asks them to rate the book they downloaded on their last visit. Inbound links from user pages can also help develop a book ranking algorithm. As the system reaches a critical mass, users can take a ratings survey (a la Netflix), so that Open Library can make more informed recommendations.
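The download-then-prompt loop could be modeled along these lines. The `RatingTracker` class and the 1-to-5 star scale are assumptions for the sake of the example. The text above does not specify a scale or a storage model.

```python
from collections import defaultdict
from statistics import mean


class RatingTracker:
    """Hypothetical in-memory record of downloads and star ratings."""

    def __init__(self):
        self.downloads = defaultdict(list)   # user -> book keys, in download order
        self.ratings = defaultdict(dict)     # book key -> {user: stars}

    def record_download(self, user, book_key):
        if book_key not in self.downloads[user]:
            self.downloads[user].append(book_key)

    def rate(self, user, book_key, stars):
        if not 1 <= stars <= 5:              # assumed 1-5 scale
            raise ValueError("stars must be between 1 and 5")
        self.ratings[book_key][user] = stars

    def books_to_prompt(self, user):
        """Downloaded books the user has not rated yet, oldest first."""
        return [b for b in self.downloads[user] if user not in self.ratings[b]]

    def average(self, book_key):
        """Mean rating for a book, or None when nobody has rated it."""
        votes = list(self.ratings[book_key].values())
        return mean(votes) if votes else None


tracker = RatingTracker()
tracker.record_download("bookworm", "moby-dick")
tracker.record_download("bookworm", "walden")
tracker.rate("bookworm", "walden", 4)

print(tracker.books_to_prompt("bookworm"))
print(tracker.average("walden"))
```

A rating macro on the user's page would call something like `books_to_prompt` on each return visit and stop asking once a rating has been recorded.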

7.4. Adding Comments

Comments are fairly personal opinions that don't fit into the standard model of wiki editing, so a separate commenting system will be developed so that people can have their say [21].

One possible solution is to build a system similar to reddit's, with simple nesting and voting. However, it might be interesting to explore something akin to an IM environment (or a less real-time environment, such as the Swhack IRC Log). That way, we can build an environment that is more conversational, and similar to meeting up with someone at a local library or bookstore. This could extend into the bookmarking scenario [see 5.3], so we can encourage conversation and linking at any point in a book. Creating a mashup with popular microblogging tools (such as Twitter or Jaiku) could provide a clever solution to enable users to microblog their journey through a book. [*]

7.5. Uploading / Adopting Books

Open Library will solicit uploads of out-of-copyright books. If a user has a book in their collection that they believe is eligible for upload, they may digitize the book and upload it to the site to share with others. The Internet Archive already has technology that can process and derive many different file formats.

Adopting books is another way Open Library will facilitate the digitization of books. A "Scan This Book" button will be made available so a user can sponsor the addition of a book to the digital archive. Users who adopt books will be credited as scan sponsors on the site and within the generated PDF files. This will encourage growth of the digital collection at Open Library.

7.6. Conducting Research

In addition to book reviews, Open Library will build tools to help users conduct research. The goal is to combine the features of the book viewer and the wiki to create a rich inter-linking of metacontent to content. A template is currently under development to support some of these features, which ultimately include, but are not limited to: (1) citation of works, (2) bookmarking within books, (3) quotations / transclusion, (4) contextual discussion and (5) saved searches.

7.7. Contributing Records

Users may want to contribute individual records for books we do not have information for, or libraries may want to contribute larger numbers of records in order to help build Open Library (and/or create their own OPAC/collection on our site - see Curated Collections / Independent Library OPAC).

Developing The Site

8. WikiLanguage

The key component in template construction is learning the underlying WikiLanguage and the process of template set creation. This functionality is outside of the scope of this document, and has its own documentation effort at http://demo.openlibrary.org/dev/docs/wikilanguage

9. Templates

9.1 Templates and Types

All templates are stored in the wiki so they can be freely edited by users. Templates can call other templates, as in MediaWiki, and have access to a limited set of programming functions. Mini-embedded queries should also be possible to allow for interesting answers. As a result, the wiki software can be used for all sorts of different sites, each of which will need its own template. By incorporating template editing in the web interface, each template becomes a wiki page with a revision history, creating a casual and much less restrictive CVS system.

This type of architecture can also facilitate the sharing and cloning of templates, so a user can easily copy another user's template root and continue to improve the system by building on existing knowledge. See the infogami docs for more info. Template sets can be made readily accessible by opting to make them public. Once a template set has been published, it will be made available at a developer hub so other users can review available template sets.

9.2 Macros

Macros are self-contained modules that can be embedded in any template by creating a macro file in the macros directory, and then referring to the name of that macro file using a simple WikiLanguage command:


    {{BookSwap()}}

In this environment, more advanced programming techniques are encouraged as our architecture can support anything that can be embedded into a web page. A macro can be a sophisticated AJAX widget or a simple HTML snippet (such as the databar macro at the top of our current demo site that is repeated on every page). This open architecture will encourage innovation by independent developers to design macros to suit their particular book-finding needs.
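To illustrate how a template engine might expand a call such as `{{BookSwap()}}`, here is a minimal Python sketch. It is not the WikiLanguage implementation: the `MACROS` registry, the `macro` decorator and `render` are invented for this example, and only argument-less calls are handled.

```python
import re

MACROS = {}  # hypothetical registry: macro name -> function returning HTML


def macro(func):
    """Register a function so templates can call it by name."""
    MACROS[func.__name__] = func
    return func


@macro
def BookSwap():
    return '<div class="bookswap">Offer this book for a swap</div>'


# Matches argument-less calls of the form {{Name()}}.
MACRO_CALL = re.compile(r"\{\{(\w+)\(\)\}\}")


def render(template: str) -> str:
    """Replace each {{Name()}} call with the macro's current output."""
    def expand(match):
        func = MACROS.get(match.group(1))
        return func() if func else match.group(0)  # leave unknown macros untouched
    return MACRO_CALL.sub(expand, template)


page = "<h1>Moby-Dick</h1>{{BookSwap()}}{{NotInstalled()}}"
print(render(page))
```

Because the macro is looked up by name at render time, replacing the function behind `BookSwap` changes every page that embeds it, with no edits to the templates themselves.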

The popular social networking site, Facebook, has a similar architecture where they enable users to embed applications into their Facebook page. Recently, UIUC released an application that allows a simple query into their catalog. This nascent implementation points to how powerful a fully realized system around books can be [22].

As the development community advances, we will reach a critical mass of macros to support the intelligent user scenario outlined in the User Profiles section. Again, these macros can be collected at the developer hub, and classified (tagged) by user area of interest.

As with Freebase, a developer community will be supported with materials and encouraged to discover new ways to interact with and access Open Library data. Examples of macros might include:

A price comparison widget to find the best price for any book online (see isbn.nu).
A tag cloud list.
A contextual listing of the 10 most recent bookmarks.
A bubble chart data visualization to illustrate the interconnectedness of books by author, title, subject, keyword (see Martin Wattenberg's Copernica).
A FRBR widget.

Once a widget has been embedded into a template, any subsequent changes to that widget will automatically update on the page it is connected to.

User Scenarios

10.0 Administrator

WHAT DO I DO: Manage metadata, modify xml, generate supporting meta-files.

FRIENDS ARE: Users of Open Library.

Most likely a member of Internet Archive, but could be some other uber-metadata-geek.
A FileMaker Pro-like interface, based on a system currently in use at Internet Archive, is under development to support this segment of users.

10.1 Commenter

WHAT DO I DO: Read books. Buy books. Talk about books. Hyperlink.

FRIENDS ARE: Bloggers, Book Enthusiasts, External Mavens.

Most likely to come to a book page via a hot topic (link aggregators, blogs, conversation) or Open Library (RSS).
A social reviewer (conversational) - interested in connecting to other people.
May not have a blog, and may instead build an identity as an Open Library user.
May have a blog and want to make comments there (Trackback).
Makes external connections - most likely to connect to content outside of Open Library.

10.2 Book Author

WHAT DO I DO: Write books.

FRIENDS ARE: Open Library Users.

Most likely to come to Open Library by learning of it through search on their name since authors automatically have a page on our system.
Interested in promoting books and connecting to people who read their books.
Can use Open Library site to build community.
Can adopt their page in the system and identify themselves as author of book (credentialed user).

10.3 Writer (Reviewer / Metadata Contributor)

WHAT DO I DO: Read books. Buy books. Write about books.

FRIENDS ARE: Book Enthusiasts.

Most likely to come to Open Library via personal referral by expert user or promoter.
Most likely to add original content to Open Library through comments and reviews on existing books.
An individual reviewer (self-aggrandizement) - most interested in establishing themselves as an expert on the topic.
May not have a blog, and instead build an identity as an Open Library user.
May have a blog and want to link to external content.
Librarians, bloggers.

10.4 Promoter

WHAT DO I DO: Promote books, authors or metadata about books. Promote myself. Advertise.

FRIENDS ARE: Authors, Publishers.

Looking for a means to promote books.
Most likely to create new pages within Open Library.
Will find Open Library through search, word of mouth, industry referral.
Publishers, vanity press, authors, individuals.
Least likely to object to promotional emails (outreach).
Will be interested in advertising opportunities within OL.

10.5 Book Hunter

WHAT DO I DO: Buy Books.

FRIENDS ARE: Bloggers, Book Enthusiasts.

Looking for a specific edition of a book or price comparison.
Will most likely find Open Library through search.
Not likely to add any original content.

10.6 Freeloader

WHAT DO I DO: Download Books.

FRIENDS ARE: Bloggers, Book Enthusiasts.

Reaches Open Library via Google.
Specifically looking for public domain PDF files for free download.
Not likely to add any original content.

10.7 Researcher / Scholar

WHAT DO I DO: Read and annotate books.

FRIENDS ARE: Other scholars, researchers, curators, librarians.

Looking for specific information about a subject.
Most likely to find Open Library through search, link from an external site by a maven on a particular topic, or other expert referral.
Interested in annotation by other expert users, citation, references, especially in their topic of interest.
Adds value by making connections and annotating within books, footnoting.
Likely to add valued original information once the site reaches a critical mass of expert commentary.
Likely to add valued original information if tools are powerful and easy-to-use.

10.8 Proofreader

WHAT DO I DO: Correct information. Write markup.

FRIENDS ARE: Open Library Users and developers.

Most likely to find the site via previous knowledge of Internet Archive efforts or referral by similar efforts.

10.9 Curators / Contextualizers

WHAT DO I DO: Collect books. Organize information. Tag. Build taxonomies.

FRIENDS ARE: Open Library Users.

Most likely to find the site via previous knowledge of Internet Archive efforts or referral by Open Library developers.
Desire to build a respected collection and become known as an expert on a particular subject (curator).
Adds knowledge through tagging information and creating collections.
Makes internal connections - most likely to connect to content on Open Library.

10.10 Book Lovers

EXAMPLE USERS: Book Clubs, Open Library group members.

WHAT DO I DO: Talk, write, buy, collect, tag, organize and download books as a group.

FRIENDS ARE: Other book lovers, especially members of my group. Open Library API users.

Enthusiastic about books and proud of their group.
Most likely to discover Open Library based on invitation, expert user or professional referral.
Desire to add value to their collections and have clear ownership (attribution) over that collection.
Clueful members are likely to use Open Library development tools.
Highly likely to add new content and contribute in all forms (reviews, comments, metadata, book upload).
Most likely to convert new users through word-of-mouth.

10.11 Libraries

Please refer to Small Library Outreach.

10.12 Publishers

WHAT DO I DO: Publish Books.

FRIENDS ARE: Open Library users, authors.

10.13 API Users / Developers

EXAMPLE USERS: Downloaders, mash-up developers, OPAC builders.

WHAT DO I DO: Create software, enhance data, design templates.

FRIENDS ARE: Open Library users, other Open Library developers and Librarians.

Acknowledgements

This document was reviewed by Aaron Swartz and Alexis Rossi. Open Library Mailing List discussions greatly influenced this document. Additional input from Steve Sisney and Tom Gruber helped shape the book viewer and user profile sections, respectively. Many thanks to Bill Gaetjens for proofreading and general commentary.

Authors

Rebecca Hargrave Malamud <webchick@invisible.net>.

References

[1]	Wikipedia, "Structured Wiki", last edited June 2007. http://en.wikipedia.org/wiki/Structured_Wiki
[2]	Open Library, "About The Technology", last edited June 23, 2007. http://demo.openlibrary.org/about/tech
[3]	Swartz, Aaron, "The Pharos Project", ("Flexible Data Types"), last edited April 2007. http://pharos.infogami.com/spec/
[4]	Blyberg, John, Taking advantage of Web and Library 2.0, December 2006. http://www.blyberg.net/2006/02/09/taking-advantage-of-web-and-library-20/
[5]	Internet Archive, About The Internet Archive, ("Why the Archive is Building an 'Internet Library'"). http://www.archive.org/about/about.php
[6]	Swartz, Aaron, "The Pharos Project", ("User Accounts"), last edited April 2007. http://pharos.infogami.com/spec/
[7]	Open Library Documentation, "About the technology", last edited June 2007. http://en.wikipedia.org/wiki/Structured_Wiki
[8]	Coyle, Karen and Hillmann, Diane, "D-Lib Magazine: Resource Description and Access (RDA)", January/February 2007. http://dlib.org/dlib/january07/coyle/01coyle.html
[9]	Daniel Giffin, "Open Library Mailing List". http://mail.archive.org/cgi-bin/mailman/listinfo/openlibrary
[10]	Garshol, Lars Marius. "Metadata? Thesauri? Taxonomies? Topic Maps!", October 26, 2004 http://www.ontopia.net/topicmaps/materials/tm-vs-thesauri.html#N412
[11]	Bisson, Casey, "Tags, Folksonomies, And Whose Library Is It Anyway", July 27, 2006. http://maisonbisson.com/blog/post/11392/
[12]	The Berkman Center For Internet & Society, "H2O Playlist". http://h2obeta.law.harvard.edu/help.do#public
[13]	ALA (American Library Association), "The USA PATRIOT Act". http://www.ala.org/ala/washoff/woissues/civilliberties/theusapatriotact/usapatriotact.cfm
[14]	Electronic Frontier Foundation, "The USA PATRIOT Act". http://www.eff.org/patriot/
[15]	Nowviskie, Bethany, "Networked Infrastructure for Nineteenth-century Electronic Scholarship". http://www.nines.org/collex/
[16]	Nowviskie, Bethany, "The Rossetti Archive". http://www.rossettiarchive.org/nines.html
[17]	Nowviskie, Bethany, "NINES: a federated model for integrating digital scholarship", April 2007. http://www.electronicbookreview.com/thread/enfolded
[18]	British Library, "Turning The Pages". http://www.electronicbookreview.com/thread/enfolded
[19]	Gruber, John. "Markdown". http://daringfireball.net/projects/markdown/
[20]	Zeldman, Jeff. "“Maybe” is one option too many". http://www.zeldman.com/2007/06/20/remove-maybe-from-invitation-systems/
[21]	Swartz, Aaron, "The Pharos Project". http://pharos.infogami.com/spec
[22]	West, Jessamyn, "librarian.net: but once libraries get to facebook, what do they do there?". http://www.librarian.net/stax/2062/but-once-libraries-get-to-facebook-what-do-they-do-there/
[23]	Citizendium, "About The Citizen's Compendium". http://www.citizendium.org/about.html
[24]	Scholarpedia, "The Free Peer-Reviewed Encyclopedia". http://scholarpedia.org/
[25]	Corante, "Larry Sanger, Citizendium, and the Problem of Expertise". http://many.corante.com/archives/2006/09/18/larry_sanger_citizendium_and_the_problem_of_expertise.php
[26]	H2O Playlist, "Philosophy". http://many.corante.com/archives/2006/09/18/larry_sanger_citizendium_and_the_problem_of_expertise.php
[27]	Swartz, Aaron, "The Pharos Project", ("Revision History"), last edited April 2007. http://pharos.infogami.com/spec/
[28]	Dempsey, Lorcan, "Libraries and The Long Tail", ("Some Thoughts about Libraries in a Network Age"), April 2006. http://www.dlib.org/dlib/april06/dempsey/04dempsey.html

Design Mockups

The architecture of Open Library supports multiple views of the same information and enables system-wide sharing of templates and site components. Rather than have one methodology rule the UI, the community can help build the site to make it the best that it can be.

Book Viewer by Aaron Swartz
Book Viewer by Rebecca Malamud [ Detail ]
Distributed Proofreading Book Viewer by Rebecca Malamud
Faceted Search by Paul Rubin
Meta-manager by Steve Sisney

Rollout Plan

July 16, 2007 - Soft Launch
The goal of this phase is to prepare the site for an influx of new contributors who can help with outreach, coding, design, etc. (see the about page for more information). This will free up the core Open Library team to continue guiding the site towards its October 17 launch date.

October 17, 2007 - Phase 1 Launch.
Open Content Alliance meeting.

Post-October, 2007 - Phase 2 Development.

Special Considerations

A. Small Library Outreach

One important role that Open Library can assume with regard to small libraries is education about new technologies. A library user (or even a librarian) may not be familiar with available tools or may not be aware that other materials are available [28]. Many small libraries, especially rural libraries, also have limited ability to network and take part in conferences that would introduce them to new concepts such as "Library 2.0." Open Library will be a leader in providing information on how new technologies can help all libraries, and will provide tutorials to make these technologies less formidable to Digital Immigrants.

KEY ISSUES AND CHALLENGES:

Financial support

Limited staffing

Staff training

Technology support or availability

Limited physical space

Remoteness / Isolation from some of their patrons

POSSIBLE ROADBLOCKS:

Reluctance to adopt new technologies (by patrons and staff)

HOW OPEN LIBRARY CAN HELP

Empower them by providing an outlet for creative fundraising activities

Become a valued learning resource for these communities (Library 2.0 [*])

Lower the technological barrier to entry (build custom OPACs).

Provide a personalized environment for their OPAC (papl.openlibrary.org)

Solve their shelf problem by extending their local collection.

Connect them more closely with their remote patrons.

Give patrons a sense of involvement by letting them help build the online catalog and interact with their library

The current strategy is to create a wiki page that libraries can maintain when they discover it - so they can immediately become a part of Open Library. This page would offer information about the library itself, and provide news feeds, links to useful resources, and expert commentary so the libraries that find it can use it as an educational tool. At any time, they can create an account and take over the "page" (similar to the strategy for Book Authors) and enhance the public view with their own information. They will also gain access to an interface view to support information experts who need more direct interaction with the bibliographic record.

B. Ensuring Data Integrity (Data Reconciliation)

Because the possible user base for Open Library is so broad, there is some concern for the integrity of data. It has been expressed that certain types of "hard" data (such as a precise bibliographic record in a format like MARC) should be editable only by experts (certified librarians), leaving the rest of the "soft" data (reviews, comments, table-of-contents, booklists, etc.) to "normal" users. This practice is evident in the library world, where some libraries won't even accept records from other libraries. Academic libraries in particular will only accept records from other pre-determined librarians that they trust.

Scholarpedia and Citizendium are examples of wikis where information experts and information consumers need to interact. In both scenarios, some users are distinguished as domain experts, and given special privileges. The Citizendium is an experimental new wiki project started by a co-founder of Wikipedia, and aims to improve on that model by adding "gentle expert oversight" and requiring contributors to use their real names [23]. Scholarpedia also follows the Wikipedia model, but differs in some very important ways:

Each article is written by an expert (invited or elected by the public).
Each article is anonymously peer reviewed to ensure accurate and reliable information.
Each article has a curator - typically its author - who is responsible for its content.
Any modification of the article needs to be approved by the curator before it appears in the final, approved version [24].

The problem with this process is that it is not flexible enough. Every group may have its own expert, as may every user for that matter. As Clay Shirky notes:

The problem Citizendium faces is that experts are social facts - society typically recognizes experts through some process of credentialling, such as the granting of degrees, professional certifications, or institutional engagement. We have a sense of what it means that someone is a doctor, a judge, an architect, or a priest, but these facts are only facts because we agree they are.

and

Sanger's view seems to be that expertise is a quality like height - some people are obviously taller than others, and the rest of us have no problem recognizing who the tall people are. But expertise isn't like that at all; it is in fact highly subject to shifts in context [25].

And as stated on the H2O Playlist site by The Berkman Center for Internet and Society:

"The collective wisdom of academics is valuable ... People should learn from their peers and role models. With H2O Playlists, everyone can turn to broad communities of expertise for educational recommendations. As Surowiecki's "Wisdom of Crowds" tells us, these communities produce better results over time than any one subject expert [26]."

Filtering user contributions on a case-by-case basis is one solution to this problem. Users can "block" other individuals whose contributions they consider "noise". This will be editable on each wiki history page. Additionally, a user can form a group of contacts that they deem "expert" in knowledge and select an option that only shows the contributions of people in that group. Members of such a group could be certified librarians or other information mavens. This type of architecture will also support filtering by geographical location (so that a user can choose to see only comments from other users in their area). This would be desirable for any small library that might want to develop their own OPAC within Open Library's web site environment, as opposed to downloading the software and customizing it themselves.
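The three filters described above (a block list, an "experts only" option, and a geographic restriction) can be combined into a single per-viewer setting. The following is a minimal sketch, assuming each contribution records its author and a region; the `Contribution` and `ViewerFilter` names and their fields are hypothetical and do not describe an existing Open Library data model.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Contribution:
    author: str
    region: str
    text: str

@dataclass
class ViewerFilter:
    """Per-viewer settings (hypothetical): who to hide and who to trust."""
    blocked: set = field(default_factory=set)   # authors treated as "noise"
    experts: set = field(default_factory=set)   # the viewer's chosen experts
    experts_only: bool = False                  # show only the expert group
    region: Optional[str] = None                # None means any region

    def allows(self, contribution):
        if contribution.author in self.blocked:
            return False
        if self.experts_only and contribution.author not in self.experts:
            return False
        if self.region is not None and contribution.region != self.region:
            return False
        return True

def visible(contributions, viewer_filter):
    """Return the contributions this viewer has chosen to see, in order."""
    return [c for c in contributions if viewer_filter.allows(c)]

contributions = [
    Contribution("alice", "Petaluma", "Corrected the publication date."),
    Contribution("bob", "Boston", "Buy cheap watches!"),
    Contribution("carol", "Petaluma", "Added a table of contents."),
]

my_view = ViewerFilter(blocked={"bob"}, experts={"carol"})
for c in visible(contributions, my_view):
    print(c.author, "-", c.text)
```

Nothing is deleted in this design: the filter changes only what one viewer sees, so two users with different settings can read the same page history differently.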

A user can also define classes of users from the top-down (a la Flickr), breaking the global community into distinct classes such as "private," "friends," "family," "expert," etc. This is a little more restrictive since the groups need to be pre-determined instead of being allowed to form in an organic way.

The primary users we are concerned with here are (a) people who find it important to have credentialed experts modify the data or (b) people who are geographically-centric. If this approach proves too controversial (i.e. the online equivalent of "kicking someone out of the library"), we should explore other methods that might appeal to closed groups or ignore their needs entirely.

Revision history could be a fundamental approach to data reconciliation. The infogami core underlying Open Library will provide full revision history, time travel, inter-page diffs, and conflict resolution (with merging) [27]. Referrer tracking can also provide a source of popularity data, by tracking what pages, both inside and outside the wiki, are linking to a page (with a sorted view with excerpts). This is useful for determining page rank, but not entirely relevant when trying to determine the authenticity of data.
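As a rough illustration of how revision history supports reconciliation, the sketch below keeps every saved version of a record, shows the difference between two revisions, and rolls back by saving an old version as a new revision. It uses Python's standard `difflib` module and is not the infogami implementation; the `PageHistory` class and its method names are hypothetical, and conflict resolution with merging is left out.

```python
import difflib

class PageHistory:
    """Append-only list of revisions for one page (hypothetical sketch)."""

    def __init__(self):
        self.revisions = []  # (author, text) tuples, oldest first

    def save(self, author, text):
        """Store a new revision and return its number."""
        self.revisions.append((author, text))
        return len(self.revisions) - 1

    def at(self, number):
        """'Time travel': return the page text as of a given revision."""
        return self.revisions[number][1]

    def diff(self, old, new):
        """Return a unified diff between two revisions as a list of lines."""
        return list(difflib.unified_diff(
            self.at(old).splitlines(),
            self.at(new).splitlines(),
            fromfile=f"revision {old}",
            tofile=f"revision {new}",
            lineterm="",
        ))

    def revert_to(self, number, author):
        """Roll back by saving an old text again; no history is lost."""
        return self.save(author, self.at(number))

history = PageHistory()
first = history.save("librarian", "title: Tom Sawyer\nauthor: Mark Twain")
second = history.save("anonymous", "title: Tom Sawer\nauthor: Mark Twain")
print("\n".join(history.diff(first, second)))
history.revert_to(first, "librarian")
```

Because a rollback is itself a revision, a disputed edit remains visible in the history even after it has been undone.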

On a final note, all data used by Open Library will be made available for bulk download so anyone can build a service of their own design.

C. User Base Fragmentation

There have been concerns expressed regarding fragmenting the user base of a site that is essentially built by a collective whole. As John Blyberg states on his weblog [4]:

This is an interesting one, because on one hand, libraries have been given the responsibility to provide authoritative information to the public. If you allow the public to start changing and editing content, you run into some very practical problems. It's important to have a very distinct line between authoritative and non-authoritative content. Commercial web 2.0 websites do not have that problem--their business models depend on the users to create almost 100% of their content and the users themselves decide what is mutable.

As outlined in the section on User Profiles, we will solve part of this problem by clearly identifying librarians and other information experts in the wiki environment, so a user can easily roll back to any change by that person (or filter out other individuals) if they so desire. In addition, macros will be designed specifically for defined user groups that will make the information that is most relevant to them readily accessible.

Blyberg continues:

Our job here is tricky because we want users to have a seamless experience. Finding those areas or "hot-zones" where content can be opened up to the public is something you'll need to spend a lot of time talking about. You don't want your site to come off as being condescending, ie, "Here's your little play area, have fun!". User involvement needs to be integrated into the entire website and OPAC experience.

Categorizing people can definitely be a dangerous practice. Users will essentially categorize themselves by organizing their own view of the system (again, based on Open Library User Profiles), and then other users can label them using a positive controlled vocabulary that will indicate their rank in the system. An example of this vocabulary would be similar to what is often used in development environments (apprentice / journeyman / master), but geared more towards the literary world. So, in addition to the official designation of "Certified Librarian," we would have:

"Maven" (reads a lot of books)
"Scholar" (conducts original research on our system)
"Guide" (links books together)

By doing this, we will be able to support the landing page of books mentioned in External Search Engines. Users designated as "expert" in "Tom Sawyer" will appear in a search along with a book. A link to that person leads to other people and books.
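A controlled vocabulary of this kind can be enforced by accepting only a fixed set of labels. The sketch below is one possible shape, assuming labels are attached per user and per topic (for example, a book title); the `Label` and `LabelIndex` names are hypothetical, and the official "Certified Librarian" designation is deliberately left out because it is not assigned by peers.

```python
from collections import defaultdict
from enum import Enum

class Label(Enum):
    """The positive, peer-assigned vocabulary listed above."""
    MAVEN = "Maven"
    SCHOLAR = "Scholar"
    GUIDE = "Guide"

class LabelIndex:
    """Maps a topic to the users labeled for it (hypothetical sketch)."""

    def __init__(self):
        self._by_topic = defaultdict(dict)  # topic -> user -> set of labels

    def label(self, user, topic, label):
        # Free-text labels are rejected, which keeps the vocabulary positive.
        if not isinstance(label, Label):
            raise ValueError("label must come from the controlled vocabulary")
        self._by_topic[topic].setdefault(user, set()).add(label)

    def people_for(self, topic):
        """Return {user: sorted label names} to show beside a search result."""
        users = self._by_topic.get(topic, {})
        return {user: sorted(l.value for l in labels)
                for user, labels in users.items()}

index = LabelIndex()
index.label("alice", "Tom Sawyer", Label.MAVEN)
index.label("alice", "Tom Sawyer", Label.GUIDE)
index.label("bob", "Tom Sawyer", Label.SCHOLAR)
print(index.people_for("Tom Sawyer"))
```

A search for "Tom Sawyer" could then list the book alongside the people returned by `people_for`, each of whom links onward to other people and books.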

D. Internationalization

Internationalization is a high priority for Open Library. In order for the UI to be viewed in different languages, there are two parts to be translated: the program code and the user-generated content. Technology is currently under development to handle both types of conversions. Users will be able to use the wiki interface to translate the UI elements into multiple languages and also create parallel page elements translating the site content into different languages. HTTP language negotiation will be used to automatically serve pages to the user in the language specified by their browser preferences.
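HTTP language negotiation relies on the `Accept-Language` request header, in which the browser lists language tags with optional `q` weights. The following is a minimal sketch of choosing a page language from that header; the function names, the default of `"en"`, and the fallback from a regional tag such as `de-AT` to its primary language `de` are assumptions made for illustration, not a description of how Open Library serves pages.

```python
def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, best first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter has full weight
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            # Sort by descending weight, then by order of appearance.
            prefs.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(prefs)]

def choose_language(header, available, default="en"):
    """Pick the best available language; 'available' holds lowercase tags."""
    for tag in parse_accept_language(header):
        if tag == "*":
            return default
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # assumed fallback: "de-AT" -> "de"
        if primary in available:
            return primary
    return default

translations = {"en", "de", "es"}
print(choose_language("de-AT, de;q=0.9, en;q=0.5", translations))
```

A browser that sends no header, or only languages the site has not yet been translated into, receives the default language.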

History

Created April 11, 2008