Professional Development

In the 'metadata' Category...

NCLA RTSS Spring Workshop

Monday, May 26, 2008 3:56 pm

RTSS 2008 - The Future of Bibliographic Control

At NCLA’s Resources & Technical Services Section’s Spring workshop, held this year on May 22 in Raleigh, the keynote speaker was Jose-Marie Griffiths, Dean of the Library School at Chapel Hill, and also a member of a working group charged by the Library of Congress to:

(1) Explore how bibliographic control (formerly known as cataloging, also including related activities) can support access to library materials in the web environment;

(2) Advise the Library of Congress on its future roles and priorities.

The group published its report, titled “The Future of Bibliographic Control”, in January of this year. It’s available on LC’s website: http://www.loc.gov/bibliographic-future/

Concerning the web environment, Griffiths began by noting several trends. Many users nowadays turn first to Google or some other web search engine for their information needs. Despite the number of web-based library catalogs, there are still many separate library databases that are not accessible by a web search. Due to the web’s worldwide reach, our users are increasingly diverse, using multiple venues (vendors, databases, social networking, etc.). Bibliographic data now comes from increasingly diverse sources via the web. As a result, bibliographic control must be thought of as “dynamic, not static”, and the “bibliographic universe,” traditionally controlled by libraries, will in future involve “a vast field of players” (including vendors, publishers, users, even authors/creators themselves).

As for LC’s role, the report reminds us that LC’s official mandate is to support the work of Congress. It has never been given any official mandate — and most importantly, the funding — to be a national library, providing the kinds of services (cataloging, authority control, standards) for the nation’s other libraries that national libraries typically do. Of course, over the years LC has become a de facto national library, providing all the above services, upon which not only American libraries but libraries worldwide rely heavily. As this unfunded mandate is rapidly becoming unsustainable, pressures are building to “identify areas where LC is no longer the sole provider” and create partnerships to distribute the responsibility for creating and maintaining bibliographic data more widely (among other libraries, vendors, publishers, etc.); also, to review current LC services to other libraries with an eye to economic viability, or “return on investment.”

To achieve these aims (exploiting the web environment, and sharing responsibility), the working group offers 5 recommendations:

(1) Increase efficiency in producing and maintaining bibliographic data. Griffiths noted that duplicated effort persists not so much in creating bib records nowadays (thanks to OCLC and other shared databases), but in the subsequent editing and maintaining of these records: many libraries do these tasks individually offline. Proposed solutions: recruit more libraries into the PCC (Program for Cooperative Cataloging, those other large research libraries that contribute LC-quality records to OCLC). Convince OCLC to authorize more libraries to upgrade master records (the ones we see when we search) in the OCLC database. Also, exploit data from further upstream: publishers and vendors create bib data before libraries do. Find more ways to import vendor data directly into library systems, without library catalogers having to re-transcribe it all. (This may cause some of us who’ve seen certain vendor records in OCLC to blanch; however, the Working Group’s report adds: “Demonstrate to publishers the business advantages of supplying complete and accurate metadata”[!].) Similarly, recruit authors, publishers, abstracting-and-indexing services, and other communities that have an interest in more precisely identifying the people, places, and things in their files, to collaborate in authority control. Team up with other national libraries to internationalize authority records.

(2/3) Position our technology, and the library community, for the (web-based) future. We need to “integrate library standards into the web environment.” Proposed solutions: Ditch the 40-year-old MARC format (only libraries use it), and develop a “more flexible, extensible metadata carrier [format]”, featuring “standard” “non-language-specific” “data identifiers” (tags, etc.) which would allow libraries’ bib data to happily roam the World Wide Web, and in turn enable libraries to import data from other web-based sources. Relax standards like ISBD (the punctuation traditionally used in library bib records) to further sharing of data from diverse sources. “Consistency of description within any single environment, such as the library catalog, is becoming less significant than the ability to make connections between environments, from Amazon to WorldCat to Google to PubMed to Wikipedia, with library holdings serving as but one node in this web of connectivity.” Incorporate user-contributed data (like we see in Amazon, LibraryThing, etc.) that helps users evaluate library resources. Take all those lists buried in library-standards documentation - language codes, geographical codes, format designators (GMDs), etc. - and put those out on the web for the rest of the world to use. Break up those long strings of carefully-coordinated subdivisions in LC subject headings (“Work -- Social aspects -- United States -- History -- 19th century”) so they’ll work in faceted systems (like NC State’s Endeca) that allow users to mix-and-match subdivisions on their own. (This is already generating howls of protest from the cataloging community, with counter-arguments that the pre-coordinated strings provide a logical overview of the topic, including those aspects the user didn’t think of on their own.)

The Working Group supports development of FRBR (Functional Requirements for Bibliographic Records, a proposed digital-friendly standard), but like many in the library community, remains skeptical of RDA (Resource Description and Access, another proposed standard meant to bring the Anglo-American Cataloging Rules into the digital age) until a better business case can be made for it: “The financial implications … of RDA adoption … may prove considerable. Meanwhile, the promised benefits of RDA — such as better accommodation of electronic materials, easier navigation, and more straightforward application — have not been discernible in the drafts seen to date…. Indeed, many of the arguments received by the Working Group for continuing RDA development unabated took the form of ‘We’ve gone too far to stop’ or ‘That horse has already left the barn,’ while very few asserted either improvements that RDA may bring or our need for it.”
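
To make the faceting idea in recommendation (2/3) concrete, here is a minimal sketch in Python of what "breaking up" a pre-coordinated subject string could look like. It is only an illustration, not how Endeca or any library system actually does it. The function names are made up, the double-hyphen separator is an assumption about how the headings are displayed, and the second heading is a hypothetical example added for contrast.

```python
def split_heading(heading):
    """Split a pre-coordinated heading into its individual subdivisions."""
    return [part.strip() for part in heading.split("--") if part.strip()]


def matches(heading, wanted):
    """True if every wanted facet appears in the heading, in any order."""
    return set(wanted) <= set(split_heading(heading))


# First heading is the one quoted in the post; the second is hypothetical.
first = "Work -- Social aspects -- United States -- History -- 19th century"
second = "Work -- Social aspects -- Great Britain"

print(split_heading(first))
print(matches(first, ["History", "Work"]))
print(matches(second, ["History"]))
```

Note what gets lost: once the string is split, the order of the subdivisions no longer matters, which is exactly the "logical overview" the catalogers in the counter-argument want to keep.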

(4) Strengthen the profession. Griffiths noted that in many areas we lack the comprehensive data we need for decision-making and for cost-benefit analysis. We need to build an evidence base, and “work to develop a stronger and more rigorous culture of formal evaluation, critique, and validation.”

(5) Finally, with the efficiencies gained from the above steps, LC and other libraries will be able to devote more resources to cataloging and digitizing their rare and unique materials. The Working Group feels that enhancing access to more of these “hidden materials” should be a priority.

Griffiths shared with us LC’s immediate reactions to the Working Group’s report. The concepts of shared responsibility, and of accepting data from multiple sources, were “expected.” More controversial were the shifting of priorities to rare materials; the relinquishing of the MARC format; and the focus on return on investment in assessing standards, such as RDA.

LC’s final decisions regarding the Working Group’s recommendations are expected to be announced this summer.

Carolyn at NISO Forum on Next Generation Discovery: New Tools, Aging Standards

Monday, March 31, 2008 10:15 am

On March 27-28, 2008, I attended NISO’s 2-day forum on Next Generation Discovery: New Tools, Aging Standards in Chapel Hill. Todd Carpenter, NISO’s Managing Director, opened the conference by noting that discovery is one of the primary reasons people visit libraries, whether in person or virtually, and that the standards and systems currently in use at many libraries are beginning to fray. Libraries are not keeping up with advancing technologies. He hopes that ideas will emerge from this meeting about the areas of standards and development that NISO needs to address.

I took notes fast and furious so as not to miss anything. Here are some of my interpretations of highlights from Day 1 talks. I hope that they are accurate reflections of what was said. Any misinterpretation is this writer’s fault.

The keynote speaker, Richard Akerman, Technology Architect and Information Systems Security Officer of NRC CISTI, began his speech with the example of SkyNet, a term from science fiction used in the Terminator movies. Terminator fans will remember that the machine (i.e. Terminator) was cold and heartless and employed a hostile user interface. Akerman went on to say that exploring ways of getting machines to function the way users want is vital. Machines are not meeting all users’ expectations, and Google’s crawlers have shaped the discovery expectations of users today.

How can we as humans better serve the machines our users utilize? Because machines don’t speak our language or have a deep contextual knowledge, humans need to be knowledge translators for the machines so as to enable machines to bring greater discovery to users. Some suggestions he offered included:

  1. Produce information in formats that machines can easily understand, and in parallel formats that are human readable.
  2. For every web resource and its machine reader, the set of formats should be kept simple so that interchange is easy.
  3. Bibliographic metadata should be a first class citizen by using OpenURL and COinS. Embedding metadata in webpages can provide bibliographic services around that metadata. Functionality to users can be added by using embedded knowledge.
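
For readers who have not seen COinS (ContextObjects in Spans), the third suggestion can be sketched as follows. This is a rough illustration in Python, not something Akerman showed: it packs a book's metadata into an OpenURL key/value string and hides it in an empty HTML span with the class Z3988, which is the convention COinS-aware tools look for. The function name and the book details (title, author, ISBN) are made up for the example.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author_last, author_first, isbn, year):
    """Return an empty HTML span carrying book metadata as an OpenURL ContextObject."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),  # metadata format: book
        ("rft.btitle", title),
        ("rft.aulast", author_last),
        ("rft.aufirst", author_first),
        ("rft.isbn", isbn),
        ("rft.date", year),
    ]
    query = urlencode(fields)  # percent-encodes each key/value pair
    # The ampersands must be HTML-escaped inside the title attribute.
    return '<span class="Z3988" title="{}"></span>'.format(escape(query))


# Hypothetical book record, for illustration only.
example = coins_span("An Example Book", "Doe", "Jane", "9780000000002", "2008")
print(example)
```

A human reader sees nothing on the page, while a browser plug-in or citation manager can read the span and offer services around it, such as a link to the user's own library.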

Humans are seeking rich information experiences, and the general OPAC is not a discovery interface. A discovery layer needs to be built over the catalog’s metadata using APIs, and the catalog should work in ways that the Google generation understands. It should go to wherever your user is (example: when a Wake Forest student is searching Amazon for a book while drinking coffee at Starbucks, a box pops up and alerts the user that the book is available at the library) and be able to work at web speed. Embedded knowledge can be enriched by using XML, RDF, RSS, GeoRSS, microformats, aggregators, and recommender APIs. An interesting example of a discovery tool developed by MIT’s SIMILE project is its Timeline component. Timeline is described by MIT’s SIMILE website as a “widget for visualizing time-based events.”

Akerman stated that instead of having too much information, he feels there is too much information poverty. We need to continuously search for and find ways to provide information to users everywhere. There is much information that is not getting indexed and is therefore inaccessible to people. We must tap the knowledge of people all over the world and provide information access to all.

In another talk, Mike Teets, VP of OCLC Global Product Architecture, demonstrated new discovery tools that OCLC is currently providing and those that are in development for users. Three tools that I found most interesting were xISBN, xISSN and Identities. xISBN is a service that consolidates the ISBNs of a specific title into a list. It is driven by FRBR algorithms. OCLC is still testing its xISSN service, which will bring together a graphical representation of the history and relationships of specific serial titles’ ISSNs. Identities provides information about authors and utilizes publication timelines (books by and about an author), audience level indicators (this number is computed from which institutions hold a specific author’s work(s)), and relationships to other authors and/or organizations. You can try Identities by searching for a title in WorldCat, clicking on the details tab and then clicking on the author’s name, or you can go directly to worldcat.org/identities.
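
To give a feel for what "consolidating the ISBNs of a specific title" involves, here is a toy sketch in Python. It is not OCLC's algorithm and does not call the xISBN service. It simply assumes that editions of the same work can be matched on a normalized author/title key, which is a much cruder test than real FRBR work-set matching. The function names, records and placeholder ISBNs are all made up.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Build a crude match key: lowercase, drop punctuation, collapse spaces."""
    def norm(text):
        text = re.sub(r"[^a-z0-9 ]", "", text.lower())
        return " ".join(text.split())
    return norm(author) + "/" + norm(title)


def group_isbns(records):
    """Map each work key to the list of ISBNs of its editions."""
    works = defaultdict(list)
    for author, title, isbn in records:
        works[work_key(author, title)].append(isbn)
    return dict(works)


# Hypothetical edition records (author, title, ISBN); the ISBNs are placeholders.
records = [
    ("Doe, Jane", "An Example Book", "ISBN-A"),
    ("Doe, Jane.", "An example book", "ISBN-B"),
    ("Roe, Richard", "Another Title", "ISBN-C"),
]
grouped = group_isbns(records)
print(grouped["doe jane/an example book"])
```

The two Doe records differ in punctuation and capitalization but land in the same group, so a search on either ISBN could surface the other edition.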

Other interesting discovery tools presented were 2collab and Scitopia.org. 2collab is a free collaboration tool for researchers and scientists, produced by Elsevier. Information can be shared with peers by creating groups. Users can add tags, bookmarks, ratings, and comments; display their current research activity, interests, and group memberships; and highlight their scientific record of publications. Privacy is of utmost importance to scientific researchers. Only members within a private group can share and access each other’s information. Group owners can accept or decline membership into a group. ScienceDirect has an “add to 2collab” button that allows users to transfer metadata about pertinent articles to their profiles, where they can share this information with their groups. IEEE has developed a web service, Scitopia.org, which is a free federated search service of 18 not-for-profit science and technical libraries. It is open to the general public, but is designed primarily for researchers. Partners pay a contribution fee to help fund the service. Subscribers to the partner libraries and members of partner societies are able to view full text included in their subscriptions or memberships; other users have a pay-per-view option.

All conference talks were recorded and the presentation slides are to be posted shortly to the NISO website on the Discovery Tools agenda webpage. For more in-depth information, check out NISO’s website. Day two reflections will appear later this week.

