Metadata for IRs: Fast, Easy, and Useless?
I recently met with a faculty member at my institution, the University of Cincinnati, to discuss submitting content to our digital repository, Scholar@UC. He completed the metadata input form quickly and was not interested in describing his resources. When I asked him what was important to him in terms of the discoverability of his content, he replied: “Nothing. I just need a link to put in my journal article.” To metadata specialists, this is a disheartening (but not uncommon) response.
Institutional repositories (IRs) often rely on self-submission models in which users create descriptive metadata for their content. As metadata specialists, we understand the importance of consistent, high-quality metadata for good indexing within an application and discoverability by search engines such as Google Scholar. However, our users may not understand the significance of their metadata, or creating metadata may not be important for their immediate needs. It is necessary for metadata specialists to understand and acknowledge that high-quality metadata requires an investment of time and resources, and our users may have little of either to devote to description. For these reasons, it is challenging to write metadata guidelines for IRs and even more difficult to enforce those guidelines.
There should be a balance between fast and easy input and useful descriptive data. Faculty and students want simple submission forms, but their metadata is often not effective for discovery and reuse of data without some form of mediation. For example, a dataset with a title of “q.txt” that does not have a readme file would be very difficult for outside researchers to understand or use. If a repository contains thousands of files with inscrutable names, the repository loses its usefulness. Similarly, if a repository contains files without subjects, keywords, or other descriptive information, the relationships between those files cannot be made explicit for use by an application. In short, the repository becomes university storage space rather than a rich source of institutional knowledge.
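As a rough illustration of what lightweight mediation could look like, here is a minimal Python sketch that flags records whose title looks like a bare filename or that carry no descriptive information. The field names (title, subjects, keywords, description), the filename pattern, and the sample records are all assumptions made up for this example; they are not taken from Scholar@UC or any particular metadata schema.

```python
import re

# Hypothetical field names; not taken from Scholar@UC or any real schema.
DESCRIPTIVE_FIELDS = ("subjects", "keywords", "description")

# Assumed heuristic: a title that is only a short token plus a file
# extension, such as "q.txt", is treated as an uninformative filename.
FILENAME_LIKE = re.compile(r"^\w{1,8}\.\w{1,5}$")


def review_record(record):
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    title = (record.get("title") or "").strip()
    if not title:
        problems.append("missing title")
    elif FILENAME_LIKE.match(title):
        problems.append("title looks like a bare filename: " + title)
    # Flag the record only if every descriptive field is absent or empty.
    if not any(record.get(field) for field in DESCRIPTIVE_FIELDS):
        problems.append("no subjects, keywords, or description")
    return problems


# Made-up sample records for illustration only.
records = [
    {"title": "q.txt"},
    {"title": "Stream temperature readings, 2014", "keywords": ["hydrology"]},
]
for record in records:
    print(record["title"], "->", review_record(record) or "ok")
```

A check like this would only surface candidates for a conversation with the submitter. The pattern is a crude heuristic that will misjudge some titles, and it cannot tell whether a well-formed title is actually meaningful.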
This is where the mission of an institution’s repository becomes central to the discussion. If the mission is primarily preservation-based, an argument could be made that metadata is irrelevant. As long as the content is being preserved, the repository is fulfilling its mission. Yet this argument presupposes a repository full of simple, self-contained files. It is relatively easy to understand a single text file of a student’s thesis or dissertation. But when we move into interconnected files such as datasets, metadata is essential for understanding the structure and content of the data. Without metadata, these items may be unintelligible to someone outside of the lab that created them, and they become difficult to manage over the long term. Preserving something that cannot be understood is a poor investment of resources, regardless of whether content preservation is consistent with the repository’s mission.
So what then is the role of the metadata specialist in regard to IRs? Are we evangelists for users creating granular metadata? Do we take autonomy away from our users and enhance their metadata to fit established guidelines and standards? Are we simultaneously teachers and enforcers?
I don’t know the answers to these questions. But I think that raising them is important because while creating metadata for its own sake is not useful, neither is a repository full of unintelligible resources. I want to respect the autonomy and needs of my repository’s contributors, but I also think it is appropriate for those contributors to share in the responsibility of creating a meaningful IR application.
This raises the question: How are we handling this issue at UC? The answer is complicated because Scholar@UC is in a transitional stage. Since we are only working with early adopters, faculty buy-in is valued more than high-quality metadata. We do not have enforced guidelines, only general recommendations that submitters can choose to follow or ignore. Once Scholar@UC is available to the campus at large, my hope is that our metadata guidelines can be strengthened as faculty begin to see the value of their metadata in terms of discoverability. I hope that librarians and submitters can work together to create metadata that is fast, easy, and useful.