BIBFRAME and RDF vocabulary reuse

The conversations surrounding BIBFRAME can be dizzying with unfamiliar terminology, questions posed in areas the library community has little experience with, and our community thinking aloud and learning as we go. These discussions and issues are deep, with many competing perspectives. An announcement this week by the National Library of Medicine (NLM) regarding their BIBFRAME testing highlights one of these issues: whether BIBFRAME should define all of the properties and classes (i.e., the whole metadata structure) needed for resource description in libraries in its own namespace, or define a core set in which libraries have particular expertise but rely on other specialist communities for other parts of the vocabulary.

Back in early 2013 this issue was first discussed on the BIBFRAME list. As part of that thread, Eric Miller from Zepheira, the company contracted to design the RDF model for BIBFRAME, stated that the initiative intended to take the former approach and define everything itself: “While the recommendation of a singular namespace is counter to several current Linked Data bibliographic efforts, it is crucial to clarify responsibility and authority behind the schematic framework of BIBFRAME in order to minimize confusion and reduce the complexity of the resulting data formats.” With this approach, connections to identical or similar concepts in other RDF vocabularies can be made through mechanisms such as OWL’s sameAs, equivalentClass, and equivalentProperty properties. A vocabulary designed this way is slightly easier to implement on its own, but it is more difficult for machines to process and perform inferences on, because a consumer must first resolve those mappings before it can recognize the shared concepts, and it is to some degree less likely to be used by other communities.
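The difference between the two approaches can be sketched in a few lines of plain Python, with triples represented as tuples. This is a minimal illustration rather than a description of how BIBFRAME or any RDF toolkit actually works: the BIBFRAME `title` property, the example resource URI, and the helper functions are assumptions made for the example, and the mapping uses owl:equivalentProperty, the OWL construct for relating two properties.

```python
# Namespace IRIs. The DCTERMS and OWL IRIs are the published ones; the
# BIBFRAME namespace and its `title` property are used for illustration only.
BF = "http://bibframe.org/vocab/"
DCTERMS = "http://purl.org/dc/terms/"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical resource being described.
BOOK = "http://example.org/work/1"

# Approach 1: everything in a single namespace, plus a separate mapping.
single_namespace_data = {
    (BOOK, BF + "title", "Moby Dick"),
}
mappings = {
    (BF + "title", OWL + "equivalentProperty", DCTERMS + "title"),
}

# Approach 2: reuse the external property directly.
reuse_data = {
    (BOOK, DCTERMS + "title", "Moby Dick"),
}


def values(triples, subject, predicate):
    """Return the objects of all triples matching subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def apply_equivalences(triples, mappings):
    """Add the triples implied by owl:equivalentProperty, in both directions.

    This is a single pass, not a full reasoner: chains of equivalences
    are not followed.
    """
    inferred = set(triples)
    for a, rel, b in mappings:
        if rel != OWL + "equivalentProperty":
            continue
        for s, p, o in triples:
            if p == a:
                inferred.add((s, b, o))
            elif p == b:
                inferred.add((s, a, o))
    return inferred


# A consumer that only understands dcterms:title:
raw = values(single_namespace_data, BOOK, DCTERMS + "title")
mapped = values(
    apply_equivalences(single_namespace_data, mappings), BOOK, DCTERMS + "title"
)
reused = values(reuse_data, BOOK, DCTERMS + "title")
```

In this sketch the consumer finds no title in the raw single-namespace data; it has to obtain the mapping and apply it first. With direct reuse the same lookup succeeds with no extra step, which is the practical sense in which reused vocabularies are easier for outside communities to consume.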

The community has raised questions about this decision and what it means for the direction of BIBFRAME. One particularly cogent analysis of a number of related BIBFRAME issues comes from Rob Sanderson in a discussion document first released in the summer of 2014. While the issue of vocabulary reuse is not in and of itself the focus of Sanderson’s analysis, it underlies many of his points. For example, when describing what he sees as unnecessary complexity in the BIBFRAME model, which he calls “predicate proliferation,” Sanderson states “[t]he proliferation is made worse by not reusing predicates that could be reused from other ontologies.” In Sanderson’s analysis, not reusing vocabularies from other sources seems to be a symptom of what he sees as other problematic modeling practices within BIBFRAME.

Most recently, the issue of a single BIBFRAME namespace versus relying on specialist communities for parts of the vocabulary has been raised again through NLM’s November 21 announcement regarding their future direction for BIBFRAME testing. In their post to the BIBFRAME list with this announcement, NLM expresses unease about aspects of BIBFRAME modeling: “…as NLM has experimented with BIBFRAME over the past several years, we are increasingly concerned that the vocabulary development, in attempting to become sufficiently aligned with traditional bibliographic cataloging, may hinder meeting all of BIBFRAME’s goals, particularly those of flexibility and extensibility.” The lack of reuse of outside vocabularies and the resulting complexity of the BIBFRAME model are a particular area of concern for NLM. Attempting to reduce this complexity is the core of NLM’s proposed way forward: “We intend to draft a core BIBFRAME vocabulary for experimentation (we fully understand that a workable core vocabulary will require collaboration from many communities, but we need to start with something) and extend it with RDA (using the RDA Registry Elements) and an NLM vocabulary for local data.”

Vocabulary reuse is one of many issues central to the design of BIBFRAME, and like many of these issues, it is still a topic of debate. It is an example of the tension between following traditional library models (including ease of moving legacy data forward) and adopting information models and practices from outside of libraries with the goal of leveraging work designed for the web and integrating library data into the broader information space. NLM’s testing announcement reminds us that the optimum place on this continuum for BIBFRAME, and for library metadata in general, is far from resolved within our community.

