Page 184 - Building Digital Libraries
Sharing Data—Harvesting, Linking, and Distribution
entity’s type or relevance to the work, any mapping into MARC21 would be
prone to tagging errors or be overly generalized. Would a crosswalk of this
nature be useful? It would depend on the application. Within a federated
search tool, where metadata needs to be interpreted broadly, this mapping
would likely be good enough. Within a more formalized metadata manage-
ment system that utilizes the tagged granularity to index data, this mapping
would be of minimal use.
Dealing with “Spare Parts”
Because metadata crosswalking is rarely a lossless process, decisions often
have to be made regarding what information is “lost” during the crosswalk-
ing process. Moreover, data loss isn’t limited strictly to the loss of descriptive
metadata, since it can include the loss of contextual metadata as well. Going
back to our example in figure 7.4, the metadata being crosswalked from
MARC21 to Dublin Core could be transferred in a lossless manner, since all
data could be placed into the creator element. However, while bibliographic
data would not be lost, the contextual metadata relating to the entity type of
the creator (whether it’s a personal or corporate author), as well as informa-
tion relating to the entity tagged as the main entry, could be lost. So in this
case, the data loss would be primarily contextual.
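A minimal sketch may make this kind of contextual loss concrete. It assumes a deliberately simplified record, a list of (tag, value) pairs, and a hypothetical mapping table that sends every name field to the Dublin Core creator element. The MARC21 tags are real (100 and 110 are main entries, 700 and 710 are added entries), but the sample names, the MARC_TO_DC table, and the crosswalk_creators function are illustrative assumptions rather than part of any particular tool.

```python
# Assumed mapping: every MARC21 name field collapses into dc:creator.
# Other crosswalks might send added entries to dc:contributor instead.
MARC_TO_DC = {
    "100": "creator",  # main entry, personal name
    "110": "creator",  # main entry, corporate name
    "700": "creator",  # added entry, personal name
    "710": "creator",  # added entry, corporate name
}


def crosswalk_creators(fields):
    """Return the Dublin Core creator values for simplified MARC fields."""
    return [value for tag, value in fields if MARC_TO_DC.get(tag) == "creator"]


# Hypothetical sample record: one personal main entry, one corporate added entry.
record = [
    ("100", "Doe, Jane"),
    ("710", "Example Historical Society"),
]

creators = crosswalk_creators(record)
print(creators)
```

Both names survive the crosswalk, so no bibliographic data is lost. What does not survive is the tag itself: nothing in the resulting list says which name was the main entry, or which is a person and which is a corporate body.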
One of the primary tasks in creating a metadata crosswalk is deciding how
to deal with the “spare parts”: the unmappable data that cannot be carried
through the crosswalk. EAD and FGDC, for example, are highly hierarchical
metadata schemas that contain bibliographic and administrative data at both
the collection and item levels. This type of hierarchical structure is very
difficult to crosswalk between metadata schemas, and in most cases it is
simply dropped. Metadata experts therefore need to decide what information
must be preserved, and then try to work within the confines of the
crosswalking parameters.
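One way to keep these decisions visible is to have the crosswalk return its leftovers instead of dropping them silently. The sketch below assumes a source record that has already been flattened into path-like keys. The element names are drawn from EAD, but the flattening, the EAD_TO_DC mapping, the sample values, and the crosswalk function are all hypothetical and shown only for illustration.

```python
def crosswalk(source, mapping):
    """Split a flattened record into mapped data and unmapped spare parts."""
    mapped = {}
    spare_parts = {}
    for path, value in source.items():
        target = mapping.get(path)
        if target is None:
            # No home in the target schema: keep it for review.
            spare_parts[path] = value
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, spare_parts


# Assumed mapping: only collection-level description has a target element.
EAD_TO_DC = {
    "archdesc/did/unittitle": "title",
    "archdesc/did/origination": "creator",
}

# Hypothetical flattened finding aid with one item-level component.
finding_aid = {
    "archdesc/did/unittitle": "Example Family Papers",
    "archdesc/did/origination": "Example family",
    "archdesc/dsc/c01/did/unittitle": "Correspondence",
    "archdesc/dsc/c01/did/unitdate": "1901-1910",
}

mapped, spare_parts = crosswalk(finding_aid, EAD_TO_DC)
print(mapped)
print(spare_parts)
```

Here the collection-level title and creator are carried across, while the item-level component lands in spare_parts. Reviewing that second dictionary is where a metadata expert would decide what must be preserved and what can be let go.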
Dealing with Localisms
Lastly, anyone crosswalking metadata must stay alert to what I like to call
“localisms”: data added to the metadata so that it sorts or displays in a
specific way within a local system. Many of these localisms exist within
digital repository software. At the Ohio State University
Libraries (OSUL), a number of these localisms can be found within the
library’s digital collections system. When OSUL first started adding content
to its digital repository, a great deal of care was put into defining how the
metadata should be displayed to the user. In order to normalize the metadata
displayed to the user, local, complex metadata elements were created to store
and display measurement data. These data elements fell outside of the norms
of the digital repository software being used at the time, but they represented
the libraries’ best solution for dealing with a complex issue, given the wide
range of measurements that could be made on objects. Within the local
content system, these localisms provide users with a normalized experience.
However, harvesting this metadata for indexing outside of the local system