Searching is Not the Answer
The following post was submitted by students enrolled in LIS2407 – Metadata at the University of Pittsburgh School of Information Sciences. For more information on the series, see the introductory post.
By Rose Chiango and Katelyn Quirin
The relationship between metadata and cataloging is growing more indistinct in the digital era. Metadata, being data about data, does not presuppose that the data has to be an object, but it does still need distinct boundaries. Cataloging presumes that the item has boundaries and can be described using its different characteristics in such a way that would be helpful to a user searching for the object.
Metadata and cataloging standards regarding digital objects, particularly those which do not have distinct boundaries and are part of a larger network, present challenges in description. A Google search, for example, does not necessarily search for a description of a desired object (in this case, a webpage). The search focuses on the content of the page to determine its relevance to the search.
Does this mean the end of cataloging, and more specifically, subject headings? This is an abstract argument, leading to a perhaps predictable response. What does a metadata record have that is value-added, and what makes it more than just having the resource entirely searchable? Are resources that are entirely searchable self-describing? So far, technology has developed only to the point where text, not still images or video, is readily searchable. Unless the metadata tells you what is in those types of objects, you will be lost. Google is trying to change that by allowing searches via image instead of keyword.
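To make the contrast concrete, here is a minimal sketch in Python of a full-text search next to a search over metadata records. The two-item collection, its identifiers, titles, and subject headings are all hypothetical and exist only for illustration; this is not how Google or any library system is actually implemented.

```python
# Hypothetical mini-collection: one text page and one photograph.
# All identifiers, titles, and subject headings are made up for illustration.
COLLECTION = [
    {
        "id": "page-1",
        "content": "A short history of university libraries and their card catalogs.",
        "metadata": {"title": "Library history", "subjects": ["Libraries", "Catalogs"]},
    },
    {
        "id": "photo-1",
        "content": None,  # an image has no text for a keyword search to match
        "metadata": {"title": "Reading room, 1950", "subjects": ["Libraries", "Photographs"]},
    },
]


def full_text_search(term, collection=COLLECTION):
    """Return ids of items whose own content contains the term."""
    term = term.lower()
    return [
        item["id"]
        for item in collection
        if item["content"] and term in item["content"].lower()
    ]


def metadata_search(term, collection=COLLECTION):
    """Return ids of items whose title or subject headings contain the term."""
    term = term.lower()
    hits = []
    for item in collection:
        fields = [item["metadata"]["title"], *item["metadata"]["subjects"]]
        if any(term in field.lower() for field in fields):
            hits.append(item["id"])
    return hits


print(full_text_search("libraries"))
print(metadata_search("libraries"))
```

In this sketch the photograph is invisible to the full-text search and is found only because a record describes it, which is the value a metadata record adds for non-text objects.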
This ability to be linked to the entire content of the item, instead of just the record of the content of the item, changes how we search and browse. Item-level cataloging provided access to print resources available in institutions, and the interchange of records among institutions changed our profession’s relationship with metadata. Standardization became the best way to make this interchange functional.
Selecting a standardized method to describe networked internet resources is a challenge, since any new method will have to be backwards compatible with many older ways of describing non-electronic records. What makes describing these types of items different from describing static items? Keep in mind that we are not talking about static versions of the web, such as screenshots of a webpage.
OCLC was a leader in adapting cataloging for electronic and web-based materials. In 1999, it established guidelines for cataloging these items in MARC 21 and AACR2. The Library of Congress also released guidelines for cataloging electronic resources during the same period. The 2005 update to AACR2 defines an electronic resource as “Material (data and/or program(s)) encoded for manipulation by a computerized device. This material may require the use of a peripheral directly connected to a computerized device (e.g., CD-ROM drive) or a connection to a computer network (e.g., the Internet). This definition does not include electronic resources that do not require the use of a computer, for example, music compact discs and videodiscs.” (OCLC 2005). The OCLC guidelines were last updated in 2006 and involve some changes in the “Type of Record” coding and the 006 field.
There are other methods of representing web content that are not tied to MARC. The World Wide Web Consortium (W3C) has developed the Ontology for Media Resources, last updated in 2012. It describes resources available on the web that do not strictly live in repositories. It has some similarities to Dublin Core but is mostly used to map between different types of metadata standards for the web.
What about the Resource Description Framework (RDF)? RDF can be used to describe objects using N-Triples, Turtle, or XML. It is a framework for thinking about how to describe web resources as objects and how they relate to other objects. Basically, RDF makes relationships between items computer readable in a way that ‘flatter’ systems cannot. It draws the relationships between things within a record, which echoes the architecture of the internet.
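As a rough illustration of what such relationships look like, the sketch below models a few RDF statements about this post as subject-predicate-object triples in plain Python and writes them out in N-Triples form. The `example.org` URIs are hypothetical placeholders, the helper names are made up for this example, and the literal escaping is deliberately simplified; the Dublin Core terms namespace is real. A production system would more likely use a dedicated RDF library.

```python
# Real namespace for Dublin Core terms.
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical URIs, used only for illustration.
POST = "http://example.org/posts/searching-is-not-the-answer"
SERIES = "http://example.org/series/lis2407-metadata"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal (simplified: escapes only \\ and ")."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Each statement is one (subject, predicate, object) triple.
TRIPLES = [
    (uri(POST), uri(DCTERMS + "title"), literal("Searching is Not the Answer")),
    (uri(POST), uri(DCTERMS + "creator"), literal("Rose Chiango")),
    (uri(POST), uri(DCTERMS + "creator"), literal("Katelyn Quirin")),
    # The object here is another resource, not a string: a link between things.
    (uri(POST), uri(DCTERMS + "isPartOf"), uri(SERIES)),
]


def to_ntriples(triples):
    """Serialize triples as N-Triples: one statement per line, ending in ' .'"""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def objects(triples, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(to_ntriples(TRIPLES))
```

The last triple is the part a flat record cannot express as directly: its object is itself a resource that can carry its own statements, so a program can follow the link from the post to the series.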
As greater numbers of resources are produced in digital formats, traditional markers (e.g. pagination) will have to be replaced by other signifiers. These changing formats and accompanying discovery methods should prompt changes within the metadata community to reflect the nature of networked web pages and resources.
Qin, Jian. “Representation and Organization of Information in the Web Space: From MARC to XML” Vol. 3, No. 2, 2000.
Weitz, Jay. “Cataloguing Electronic Resources: OCLC-MARC Coding Guidelines”. Revised 2006 July 11. http://www.oclc.org/support/services/worldcat/documentation/cataloging/electronicresources.en.html
W3C. Ontology for Media Resources 1.0. Revised 2012 February 09.
LC21: A Digital Strategy for the Library of Congress (2000)
Library of Congress. “Guidelines for Coding Electronic Resources in Leader /06”. Revised December 2007.