Professional Development

In the 'metadata' Category...

Saturday at ALA with Carolyn

Wednesday, July 15, 2009 11:13 pm

On Saturday, I attended “Workflow Tools for Automating Metadata Creation and Maintenance,” a panel discussion composed of individuals who work on digital projects at their institutions.

Much of the talk was highly technical and I didn’t quite understand everything, but one of the most interesting projects was presented by Brown University’s Ann Caldwell, Metadata Coordinator for the Center for Digital Initiatives, who spoke about the Center’s recent project assisting the Engineering Department with its upcoming accreditation. Engineering professors wanted to digitize materials such as syllabi and assignments so that the accreditation team could have them in advance of its visit. The Center created an easy way for professors to put stuff into the repository: a very simple MODS (Metadata Object Description Schema) record form with required fields to fill in (e.g. date, title, genre), plus an easy way for individuals to upload files (i.e. digital objects). Faculty decide how they want to set up folders for their stuff; they can dump everything in one folder or create multiple folders down to the micro-level. Faculty also determine who can see what. Because of the enormous amount of material being brought in to be digitized, the Center developed a tracking system. Due to the success of this project, the Engineering Department will continue digitizing its materials for future accreditations, and Ms. Caldwell indicated that other departments were interested in doing the same.
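
For readers curious what such a form might produce behind the scenes, here is a minimal sketch in Python. It is not Brown's actual system: the function name, the choice of `dateCreated` for the date field, and the sample values are all assumptions made for illustration. Only the three required fields mentioned in the talk (date, title, genre) and the standard MODS namespace are taken as given.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"  # standard MODS version 3 namespace
REQUIRED_FIELDS = ("title", "date", "genre")  # the examples given in the talk


def build_mods_record(form):
    """Turn a dict of form values into a small MODS XML string.

    Hypothetical helper: raises ValueError if a required field is blank.
    """
    missing = [f for f in REQUIRED_FIELDS if not (form.get(f) or "").strip()]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")

    # <titleInfo><title> holds the title of the digital object.
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = form["title"].strip()

    # Assumption: the form's "date" is stored as <originInfo><dateCreated>.
    origin = ET.SubElement(mods, f"{{{MODS_NS}}}originInfo")
    ET.SubElement(origin, f"{{{MODS_NS}}}dateCreated").text = form["date"].strip()

    ET.SubElement(mods, f"{{{MODS_NS}}}genre").text = form["genre"].strip()
    return ET.tostring(mods, encoding="unicode")


if __name__ == "__main__":
    # Sample values are made up for the example.
    print(build_mods_record(
        {"title": "Sample course syllabus", "date": "2009", "genre": "syllabus"}
    ))
```

Rejecting the record when a required field is blank is what keeps the form "very simple" for faculty while still giving the repository consistent metadata.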

With regard to metadata creation workflow, consistency, automation, streamlining, and true interoperability between systems are of utmost importance. With the help of metadata tools, librarians can do their jobs better and more efficiently. Smart systems are possible and necessary. We need to pay attention to user interface design for cataloging tools because it is critical to the success of our data.

Next, I attended a four-hour panel discussion titled “Look Before You Leap: Taking RDA for a Test-Drive.” Again, a highly technical presentation. RDA stands for “Resource Description and Access” and is a new cataloging standard to be used for the description of all types of resources and content. It is compatible with established principles, models, and standards and is adaptable to the needs of a wide range of resource description communities (e.g. museums and libraries). Tom Delsey began the session by comparing and contrasting AACR2 and RDA. Nanette Naught followed by previewing the RDA Toolkit, which is currently in the alpha testing stage. Sally McCallum of the Library of Congress spoke on new fields developed for the MARC record in conjunction with RDA. John Espley, Director of Design at VTLS, gave attendees a preview of what an RDA record would look like in the ILS he represents. His presentation finally shed some light for me on how an RDA cataloging record would appear in an online catalog. National Library of Medicine’s Barbara Bushman described the upcoming testing of RDA at 23 select institutions. The testing will occur in OCLC Connexion as well as in various ILSs, Voyager being one. Once the RDA Online software is released sometime in November or December 2009, a preparation period that includes training for the testing institutions will run from January to March 2010. Formal testing will take place from April to June, followed by a formal assessment from July to September. In October 2010, a final report will be shared with the U.S. library community.

If and when RDA is approved for use, training for catalogers will be the next step. Knowledge and training about RDA for all library staff will need to take place as well. People on the front lines working with patrons in catalog instruction will need to know the differences between a specific work and its possible multiple manifestations (work and manifestation being FRBR terminology).

For more information, one can visit the RDA web site.

Needless to say, after this session ended, I was ready to head back to my hotel for some rest. I will post more information on the rest of my conference experiences on Friday.

Leslie at MLA 2009

Monday, March 16, 2009 7:59 pm

I’m back from this year’s annual conference of the Music Library Association, held in Chicago (during a snowstorm) Feb. 17-21. This year I also attended the pre-conference hosted by MOUG (Music OCLC Users Group). Some highlights:

Sound Recordings and Copyright

Tim Brooks of the Association of Recorded Sound Collections described the ARSC’s work lobbying Congress to reform US copyright law on pre-1972 sound recordings. These recordings are not covered by federal law, but are often governed by state law, which tends to give copyright holders, in Tim’s words, “absolute control.” Tim cited some startling statistics: of all recordings made in the 1940s-70s, only 30% have been made available by the copyright holders; of recordings made in the 1920s-30s, only 10% are available; and of the enormous corpus of ethnic and traditional music from all over the world that was recorded by Columbia and Victor in the early years of the 20th century, only 1% is available. Because US copyright law for sound recordings is the most restrictive in the world, early recordings of American artists are currently legally available in other countries but not in the US — which means that American libraries and archives are unable to preserve this portion of our own heritage.

In response, the ARSC has made the following recommendations:

  • Place pre-1972 recordings under a single federal law.
  • Harmonize US copyright law with that of other countries.
  • Legalize use of “orphaned” works (whose copyright holders cannot be identified).
  • Permit use of “abandoned” works, with compensation to the copyright holders.
  • Permit “best practices” digitization for preservation. Libraries and archives are the most likely to preserve early recordings (they have a better track record on this than the recording companies themselves) and the least likely to re-issue recordings (so they’re no financial threat to copyright holders).

Of ARSC’s experiences lobbying Congress members, Tim reports that many were simply unaware of the situation, but were sympathetic when informed; that libraries are seen as non-partisan and a public good, “the guys in the white hats”; and that there is now much “soft” support in Congress. Other ARSC activities include a “white paper” for the Obama administration, and the establishment of an organization called the Historical Recording Coalition for Access and Preservation (HRCAP) to further lobbying efforts.

In another copyright session, attendees and speakers offered some good tips for approaching your legal counsel re digitization projects:

  • Present your own credentials (copyright workshops you’ve attended, etc.) pertaining to libraries and copyright.
  • Cite specific passages of the law (sections 108, 110, etc.).
  • Show you’ve done due diligence (e.g., you’ve replaced LPs with CD re-issues where available; you’ve determined other LPs are in deteriorating condition, etc.)
  • Try to persuade counsel to adopt a “risk assessment” approach (i.e., just how likely is it that a copyright holder will challenge you in this case) versus the more typical “most conservative” approach.
  • File a “contemporaneous writing” — a memo or other document, written at the outset of a digitization project, in which you explain why you believe that you are acting in good faith. This will go a long way towards protecting you if you are in fact challenged by a copyright holder.

Is the Compact Disc Dead?

This was the question addressed by a very interesting panel of speakers, including a VP of Digital Product Strategy at Universal Music Group; the CEO of the Cedille recording label; a concert violinist (Rachel Barton Pine); a former president of the American Symphony Orchestra League; and a music librarian at Northwestern U.

The panel quickly cited a number of reasons to believe that the CD remains a viable format, among them the universal human desire to own a physical artifact “to give and to show” and the ability to listen on room speakers, not just earbuds. Violinist Pine noted that she sells and autographs some 40-70 of her CDs after each performance, that people enjoy the personal contact with the artist, and that they relish being able to take home a souvenir of the concert. Flaws of downloadable releases were cited in comparison: garbled indexing, which makes identifying and retrieving classical works difficult; frequent lack of program notes to provide historical context; and the inferior audio quality of compressed files. Changes in student behavior were also noted: in online databases, students tend to retrieve only selected works, or excerpts of works; there doesn’t seem to be the inherent incentive to browse like that offered by physical albums, with the result that students don’t develop as much in-depth knowledge of a composer’s works. On the other hand, the reduced cost of digital distribution has enabled smaller orchestras and other groups to reach a larger audience.

Concern was expressed over an increasing trend among major labels to release performances only in the form of downloadable files, often with a license restricted to “end user only” — preventing libraries from purchasing and making available these performances to their users. The panel proposed that performers and IAML (the International Association of Music Libraries) put pressure on the record companies. Alternative approaches? CDs-on-demand: Cedille’s boss sees this as a growing trend. Also, consortial deals with individual record companies: OhioLink has recently done one with Naxos.

Finally, a concern was expressed about the aggregator model of audio-streaming databases: that these hamper libraries’ responsiveness to local user needs, and the building of the unique collections important for research. The music library community needs to negotiate for distribution models that enable individual selection for traditional collection development.

How Music Libraries are Using New Technologies

  • Videos demonstrating specific resources, such as composers’ thematic catalogs (similar to Lauren’s Research Toolkits).
  • “Un-associations,” in informal online forums like Yahoo or Google groups. There are currently groups for orchestra libraries, flutists, etc.
  • Use of Delicious to create user guides.
  • Meebo for virtual ref.
  • Twitter for virtual ref and for announcements/updates.
  • Widgets and gadgets to embed customized searches, other libraries’ searchboxes, and other web content into LibGuides, etc.
  • ChaCha (a cellphone question-answering service) for virtual ref. Indiana U is partnering with ChaCha in a beta test.

JSTOR

A JSTOR rep presented plans to add 20 more music journals to the database, including more area-studies and foreign-language titles. Attendees pointed out that popular music serials (Downbeat, Rolling Stone, etc.) are becoming primary source material for scholarly research; would JSTOR consider including them? The rep replied that JSTOR originally required that journals be peer-reviewed, but had recently begun to relax this rule. A debate ensued among attendees as to whether the pop publications were sufficiently relevant to JSTOR’s mission: some believed that JSTOR should stick to its original focus on scholarly literature, and that others could preserve the pop stuff.

Bibliographic Control and the LC Working Group (or: Music Catalogers Freak Out)

The MOUG plenary session gave catalogers a forum to discuss ramifications of the LC Working Group’s recommendations on bibliographic control (see my blog posting for RTSS 08). Concerns expressed:

If collaboration is properly defined as “doing something together for a purpose,” then the disparate (and sometimes opposing) purposes of publishers, vendors, and libraries mean that LC’s vision of collective responsibility for metadata and bibliographic control will not constitute true collaboration, but merely exploitation.

The Working Group appears to some to harbor a naive faith in digital architecture to meet all discovery and retrieval needs (it reminded one attendee of predictions that microform would solve all our problems). This is perceived to cultivate a global, generalist, one-size-fits-all outlook divorced from existing patterns of scholarly communication and “communities of practice” (e.g., the subject specialist and the community of practitioners that he/she serves). Bibliographic control should be “a network of communication between communities of practice.” An MLA liaison to ALA’s RDA committee noted that the RDA folks expected local catalogers to help fill in the gaps in the currently-vague RDA code, but when specialist communities actually propose details (such as a list of genre terms for music), they’re “dissed.”

Others fear that if LC backs away from its historical role as national library, relying on the larger community of publishers, vendors, and libraries to collaborate in bibliographic control, the actual effect will be that library administrators will think: “If LC isn’t doing this work, then we don’t have to either” — and collaboration will disappear.

Yet others fear the “commodification of cataloging.” With the increasing availability of MARC records and other metadata from third-party sources, there seems to be a growing perception that all metadata is the same, and a concomitant decline in willingness to investigate its source and quality. Administrators increasingly speak of metadata as a commodity.

Remember Katrina?

I’ll close with an item from the business meeting of SEMLA (the Southeast chapter) which was a cause of great celebration: our colleagues from Tulane University in New Orleans, whose music collection was flooded in Hurricane Katrina, announced that 70% of their collection has successfully been restored, and the last portion of it recently returned to them. They brought along a few representative items for show and tell, including a score dyed pink by its red paper covers. Recalling photos of the original damage, a 70% recovery rate seems a miracle!

NCLA RTSS Spring Workshop

Monday, May 26, 2008 3:56 pm

RTSS 2008 - The Future of Bibliographic Control

At NCLA’s Resources & Technical Services Section’s Spring workshop, held this year on May 22 in Raleigh, the keynote speaker was Jose-Marie Griffiths, Dean of the Library School at Chapel Hill, and also a member of a working group charged by the Library of Congress to:

(1) Explore how bibliographic control (formerly known as cataloging, also including related activities) can support access to library materials in the web environment;

(2) Advise the Library of Congress on its future roles and priorities.

The group published its report, titled “The Future of Bibliographic Control”, in January of this year. It’s available on LC’s website: http://www.loc.gov/bibliographic-future/

Concerning the web environment, Griffiths began by noting that many users nowadays turn first to Google or some other search engine for their information needs; that despite the number of web-based library catalogs, there are still many separate library databases that are not accessible by a web search; that, due to the web’s worldwide reach, our users are increasingly diverse, using multiple venues (vendors, databases, social networking, etc); also, that bibliographic data now comes from increasingly diverse sources via the web; and that, as a result, bibliographic control must be thought of as “dynamic, not static”, and that the “bibliographic universe,” traditionally controlled by libraries, will in future involve “a vast field of players” (including vendors, publishers, users, even authors/creators themselves).

As for LC’s role, the report reminds us that LC’s official mandate is to support the work of Congress. It has never been given any official mandate — and most importantly, the funding — to be a national library, providing the kinds of services (cataloging, authority control, standards) for the nation’s other libraries that national libraries typically do. Of course, over the years LC has become a de facto national library, providing all the above services, upon which not only American libraries but libraries worldwide rely heavily. As this unfunded mandate is rapidly becoming unsustainable, pressures are building to “identify areas where LC is no longer the sole provider” and create partnerships to distribute the responsibility for creating and maintaining bibliographic data more widely (among other libraries, vendors, publishers, etc.); also, to review current LC services to other libraries with an eye to economic viability, or “return on investment.”

To achieve these aims (exploiting the web environment, and sharing responsibility), the working group offers 5 recommendations:

(1) Increase efficiency in producing and maintaining bibliographic data. Griffiths noted that duplicated effort persists not so much in creating bib records nowadays (thanks to OCLC and other shared databases), but in the subsequent editing and maintaining of these records: many libraries do these tasks individually offline. Proposed solutions: recruit more libraries into the PCC (Program for Cooperative Cataloging, those other large research libraries that contribute LC-quality records to OCLC). Convince OCLC to authorize more libraries to upgrade master records (the ones we see when we search) in the OCLC database. Also, exploit data from further upstream: Publishers and vendors create bib data before libraries do. Find more ways to import vendor data directly into library systems, without library catalogers having to re-transcribe it all. (This may cause some of us who’ve seen certain vendor records in OCLC to blanch; however, the Working Group’s report adds: “Demonstrate to publishers the business advantages of supplying complete and accurate metadata”[!]). Similarly, recruit authors, publishers, abstracting-and-indexing services, and other communities that have an interest in more precisely identifying the people, places, and things in their files, to collaborate in authority control. Team up with other national libraries to internationalize authority records.

(2/3) Position our technology, and the library community, for the (web-based) future. We need to “integrate library standards into the web environment.” Proposed solutions: Ditch the 40-year-old MARC format (only libraries use it), and develop a “more flexible, extensible metadata carrier [format],” featuring “standard,” “non-language-specific” “data identifiers” (tags, etc.) which would allow libraries’ bib data to happily roam the World Wide Web, and in turn enable libraries to import data from other web-based sources. Relax standards like ISBD (the punctuation traditionally used in library bib records) to further sharing of data from diverse sources. “Consistency of description within any single environment, such as the library catalog, is becoming less significant than the ability to make connections between environments, from Amazon to WorldCat to Google to PubMed to Wikipedia, with library holdings serving as but one node in this web of connectivity.” Incorporate user-contributed data (like we see in Amazon, LibraryThing, etc.) that helps users evaluate library resources. Take all those lists buried in library-standards documentation, such as language codes, geographical codes, and format designators (GMDs), and put them out on the web for the rest of the world to use. Break up those long strings of carefully-coordinated subdivisions in LC subject headings (“Work -- Social aspects -- United States -- History -- 19th century”) so they’ll work in faceted systems (like NC State’s Endeca) that allow users to mix-and-match subdivisions on their own. (This is already generating howls of protest from the cataloging community, with counter-arguments that the pre-coordinated strings provide a logical overview of the topic, including those aspects the user didn’t think of on their own.)
The Working Group supports development of FRBR (Functional Requirements for Bibliographic Records, a proposed digital-friendly standard), but like many in the library community, remains skeptical of RDA (Resource Description and Access, another proposed standard meant to bring the Anglo-American Cataloging Rules into the digital age) until a better business case can be made for it: “The financial implications … of RDA adoption … may prove considerable. Meanwhile, the promised benefits of RDA — such as better accommodation of electronic materials, easier navigation, and more straightforward application — have not been discernible in the drafts seen to date…. Indeed, many of the arguments received by the Working Group for continuing RDA development unabated took the form of ‘We’ve gone too far to stop’ or ‘That horse has already left the barn,’ while very few asserted either improvements that RDA may bring or our need for it.”
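
The faceting idea in recommendation (2/3) can be made concrete with a small Python sketch. This is only an illustration of the general technique, not how Endeca or any particular catalog does it; the function names and record IDs are hypothetical. It splits each pre-coordinated heading into its subdivisions, indexes records under every individual term, and lets a search combine terms in any order.

```python
import re
from collections import defaultdict

# Subdivisions are separated by "--" (the usual LCSH notation) or by a
# spaced dash as sometimes seen in running text.
_SEPARATOR = re.compile(r"\s*--\s*|\s+[—–]\s+")


def split_heading(heading):
    """Break one pre-coordinated heading into its individual subdivisions."""
    return [part for part in _SEPARATOR.split(heading.strip()) if part]


def build_facet_index(records):
    """Map each subdivision term to the set of record IDs that carry it.

    `records` is a dict of record ID -> list of subject heading strings.
    """
    index = defaultdict(set)
    for record_id, headings in records.items():
        for heading in headings:
            for term in split_heading(heading):
                index[term].add(record_id)
    return index


def search(index, *terms):
    """Return the record IDs that match every requested facet term."""
    matches = [index.get(term, set()) for term in terms]
    return set.intersection(*matches) if matches else set()


if __name__ == "__main__":
    # Record IDs and the second heading are made up for the example.
    catalog = {
        "rec1": ["Work -- Social aspects -- United States -- History -- 19th century"],
        "rec2": ["Work -- History -- 20th century"],
    }
    facets = build_facet_index(catalog)
    print(search(facets, "History", "Work"))
```

Note what the sketch throws away: once the string is split, the order of the subdivisions (and the overview it gave) is gone, which is exactly the catalogers' objection.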

(4) Strengthen the profession. Griffiths noted that in many areas we lack the comprehensive data we need for decision-making and for cost-benefit analysis. We need to build an evidence base, and “work to develop a stronger and more rigorous culture of formal evaluation, critique, and validation.”

(5) Finally, with the efficiencies gained from the above steps, LC and other libraries will be able to devote more resources to cataloging and digitizing their rare and unique materials. The Working Group feels that enhancing access to more of these “hidden materials” should be a priority.

Griffiths shared with us LC’s immediate reactions to the Working Group’s report. The concepts of shared responsibility, and of accepting data from multiple sources, were “expected.” More controversial were the shifting of priorities to rare materials; the relinquishing of the MARC format; and the focus on return on investment in assessing standards, such as RDA.

LC’s final decisions regarding the Working Group’s recommendations are expected to be announced this summer.

Carolyn at NISO Forum on Next Generation Discovery: New Tools, Aging Standards

Monday, March 31, 2008 10:15 am

On March 27-28, 2008, I attended NISO’s 2-day forum on Next Generation Discovery: New Tools, Aging Standards in Chapel Hill. Todd Carpenter, NISO’s Managing Director, began the conference by describing discovery as one of the primary reasons people visit libraries, either in person or virtually, and by noting that the standards and systems currently in use at many libraries are beginning to fray. Libraries are not keeping up with advancing technologies. He hopes that this meeting will bring to the forefront ideas in the areas of standards and development that NISO needs to address.

I took notes fast and furious so as not to miss anything. Here are some of my interpretations of highlights from Day 1 talks. I hope that they are accurate reflections of what was said. Any misinterpretation is this writer’s fault.

The keynote speaker, Richard Akerman, Technology Architect and Information Systems Security Officer of NRC CISTI, began his speech with the example of SkyNet, a term from science fiction used in the Terminator movies. Terminator fans will remember that the machine (i.e. the Terminator) was cold and heartless and employed a hostile user interface. Akerman went on to say that exploring ways of getting machines to function in the manner users want is vital. Machines are not meeting all users’ expectations, and Google’s crawlers have shaped the discovery expectations of users today.

How can we as humans better serve the machines our users utilize? Because machines don’t speak our language or have a deep contextual knowledge, humans need to be knowledge translators for the machines so as to enable machines to bring greater discovery to users. Some suggestions he offered included:

  1. Produce information in formats that machines can easily understand, and in parallel formats that are human readable.
  2. For every web resource and its machine reader, the number of formats should be kept small so as to enable easy interchange.
  3. Bibliographic metadata should be a first-class citizen by using OpenURL and COinS. Embedding metadata in webpages makes it possible to provide bibliographic services around that metadata. Functionality can be added for users by drawing on this embedded knowledge.
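
To show what embedding bibliographic metadata with COinS can look like, here is a short Python sketch. It was not part of the talk; the function name and the sample book are made up. COinS wraps an OpenURL ContextObject (key/value pairs such as `rft.btitle` and `rft.au`) in the `title` attribute of an empty `span` with class `Z3988`, so that browser tools can detect the citation while human readers see the page as usual.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, isbn=None, year=None):
    """Build a COinS span describing a book (hypothetical helper)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL 1.0 version tag
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),  # "this referent is a book"
        ("rft.btitle", title),
        ("rft.au", author),
    ]
    if isbn:
        fields.append(("rft.isbn", isbn))
    if year:
        fields.append(("rft.date", year))

    # URL-encode the key/value pairs, then HTML-escape the result so that
    # "&" becomes "&amp;" inside the attribute value.
    context_object = urlencode(fields)
    return f'<span class="Z3988" title="{escape(context_object, quote=True)}"></span>'


if __name__ == "__main__":
    # Title and author are placeholders, not a real citation.
    print(coins_span("An Example Book", "Doe, Jane", year="2008"))
```

The same key/value string, appended to a library's link resolver address, is what an OpenURL link carries; COinS simply parks it in the page for a tool to pick up.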

Humans are seeking rich information experiences, and the general OPAC is not a discovery interface. A discovery layer needs to be built over the catalog’s metadata using APIs, and the catalog should work in ways that the Google generation understands. It should go to wherever your user is (for example, when a Wake Forest student is searching Amazon for a book while drinking coffee at Starbucks, a box pops up and alerts the student that the book is available at the library) and be able to work at web speed. Embedded knowledge can be enriched by using XML, RDF, RSS, GeoRSS, microformats, aggregators, and recommender APIs. An interesting example of a discovery tool developed by MIT’s SIMILE project is its Timeline component. Timeline is described by MIT’s SIMILE website as a “widget for visualizing time-based events.”

Akerman stated that instead of having too much information, he feels there is too much information poverty. We need to continuously search for and find ways to provide information to users everywhere. There is much information that is not getting indexed and is therefore inaccessible to people. We must tap the knowledge of people all over the world and provide information access to all.

In another talk, Mike Teets, VP of OCLC Global Product Architecture, demonstrated new discovery tools that OCLC is currently providing and those that are in development for users. Three tools that I found most interesting were xISBN, xISSN and Identities. xISBN is a service that consolidates ISBNs of a specific title into a list. It is driven by FRBR algorithms. OCLC is still testing its xISSN service, which will provide a graphical representation of the history and relationships of specific serial titles’ ISSNs. Identities provides information about authors and utilizes publication timelines (books by and about an author), audience level indicators (this number is computed from which institutions hold a specific author’s work(s)), and relationships to other authors and/or organizations. You can try Identities by searching for a title in WorldCat, clicking on the details tab and then clicking on the author’s name, or you can go directly to worldcat.org/identities.
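
The idea behind consolidating ISBNs can be illustrated with a toy Python sketch. To be clear, this is not OCLC's algorithm or the xISBN API: the matching rule (same normalized author and title means same work), the function names, and the sample identifiers are all assumptions made for the example. It only shows the FRBR-style notion of gathering many editions under one work.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Crude stand-in for work matching: lowercase, strip punctuation."""
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(author), norm(title))


def group_isbns_by_work(editions):
    """Collect the ISBNs of all editions that share the same work key."""
    groups = defaultdict(list)
    for edition in editions:
        key = work_key(edition["author"], edition["title"])
        if edition["isbn"] not in groups[key]:
            groups[key].append(edition["isbn"])
    return dict(groups)


def related_isbns(editions, isbn):
    """Given one ISBN, list every ISBN grouped under the same work."""
    for isbns in group_isbns_by_work(editions).values():
        if isbn in isbns:
            return isbns
    return []


if __name__ == "__main__":
    # Placeholder identifiers, not real ISBNs.
    sample = [
        {"author": "Austen, Jane", "title": "Pride and Prejudice", "isbn": "ISBN-A"},
        {"author": "Austen, Jane", "title": "Pride and prejudice.", "isbn": "ISBN-B"},
        {"author": "Austen, Jane", "title": "Emma", "isbn": "ISBN-C"},
    ]
    print(related_isbns(sample, "ISBN-A"))
```

A real service has to cope with translations, variant author forms, and messy records, which is where authority data and more careful matching come in.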

Other interesting discovery tools presented were 2collab and Scitopia.org. 2collab is a free collaboration tool for researchers and scientists, produced by Elsevier. Information can be shared with peers by creating groups. Users can add tags, bookmarks, ratings, and comments, as well as display their current research activity and interests and the groups in which they are members, and highlight their scientific record of publications. Privacy is of utmost importance to scientific researchers. Only members within a private group can share and access each other’s information. Group owners can accept or decline membership into a group. ScienceDirect has an “add to 2collab” button that allows users to transfer metadata about pertinent articles to their profiles and share this information with their groups. IEEE has developed a web service, Scitopia.org, which is a free federated search service of 18 not-for-profit science and technical libraries. It is open to the general public, but is designed primarily for researchers. Partners pay a contribution fee to help fund the service. Subscribers to the partner libraries and members of partner societies are able to view full text included in their subscriptions or memberships; other users have a pay-per-view option.

All conference talks were recorded and the presentation slides are to be posted shortly to the NISO website on the Discovery Tools agenda webpage. For more in-depth information, check out NISO’s website. Day two reflections will appear later this week.

