Page 182 - Building Digital Libraries
Sharing Data—Harvesting, Linking, and Distribution
create metadata and metadata profiles to best serve themselves and their
users, it is unlikely that data, even in a shared metadata format, will be
usable without the need for some transformation or data reconciliation.
When working with digital library data, it is important to remember that
in most cases, an organization’s decisions related to format, standards, and
best practices will be unapologetically local, so captured remote metadata
must be crosswalked into a format that the local system can understand.
Today, metadata crosswalking remains the primary mechanism that is used
to allow different systems to interoperate with each other. The crosswalking
process removes data transfer barriers, allowing heterogeneous systems to
successfully share data. Within the library community, this is manifested
in federated search systems, which ingest metadata in various formats and
provide a standardized search syntax across resources.
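As a rough illustration of the idea, a crosswalk can be thought of as a set of rules that map each element of a remote schema onto a field the local system understands. The following minimal sketch assumes harvested records arrive as simple dictionaries keyed by Dublin Core-style element names; the local field names, the `CROSSWALK` table, and the sample record are all hypothetical and stand in for whatever a real system would use.

```python
# Hypothetical mapping from remote (Dublin Core-style) element names
# to the field names a local system expects. Both sides are assumptions.
CROSSWALK = {
    "title": "local_title",
    "creator": "local_author",
    "date": "local_date",
    "subject": "local_topics",
}


def crosswalk_record(remote_record, mapping=CROSSWALK):
    """Return (local_record, unmapped) for one harvested record.

    remote_record maps element names to lists of string values.
    Elements with no rule in the mapping are returned separately
    rather than silently dropped.
    """
    local_record = {}
    unmapped = {}
    for element, values in remote_record.items():
        target = mapping.get(element)
        if target is None:
            unmapped[element] = list(values)
        else:
            local_record.setdefault(target, []).extend(values)
    return local_record, unmapped


# Made-up sample record, for illustration only.
harvested = {
    "title": ["Oregon Trail diary"],
    "creator": ["Smith, Jane"],
    "rights": ["Public domain"],
}
local, leftover = crosswalk_record(harvested)
```

Returning the unmapped elements separately is a design choice in this sketch: it makes visible the data that has no home in the local schema instead of discarding it.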
In addition to supporting system interoperability, metadata crosswalking can be
used to move data out of an obsolete metadata schema. This type of data
crosswalking has been done for decades when dealing with binary formats.
Organizations routinely need to migrate image or document data from
obsolete file formats. Like binary formats, metadata formats are gradually
changed or replaced, and so they become obsolete with the passage of time.
As formats are phased out, crosswalks can be created to provide an upgrade
path for obsolete metadata schemas.
Crosswalking Challenges
Unfortunately, crosswalking metadata is hard work. In many ways, moving
bibliographic data between various metadata schemas is like trying to fit a
square peg into a round hole. In the end, a crosswalk is simply a process of
trying to round the square peg, so that it makes for an easier fit. Fortunately,
crosswalking challenges can generally be broken down into four categories:
1. metadata consistency
2. schema granularity
3. the “spare parts”
4. dealing with localisms
Metadata Consistency
When crosswalking metadata, consistency is the Holy Grail. The crosswalking
process must assume that metadata in one format has been consistently
applied if rules are to be developed for how that information should be
represented in other metadata formats. Given the algorithmic nature of
the crosswalking process and of digital interoperability efforts in general,
data consistency remains the key to these efforts. Without data consistency,
crosswalking processes would need to be overly complex to deal with
inconsistent data and would very likely require human intervention during or
after the process. Ideally, interoperability efforts should be fully automatic,
requiring few exceptions for variations in the data. However, when dealing with
interoperability, the issue of data consistency is often a large hidden cost.
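To make the cost of inconsistency concrete, the sketch below checks whether a set of date values follows a single assumed convention before any crosswalk rule is applied. The pattern (a four-digit year, optionally followed by a month and a day) and the sample values are illustrative assumptions, not rules taken from any particular schema. Values that fail the check are the ones that would need extra rules or human review.

```python
import re

# Assumed convention: YYYY, YYYY-MM, or YYYY-MM-DD.
# This pattern is illustrative only; a real schema defines its own rules.
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def split_by_consistency(date_values):
    """Separate values an automated rule can handle from those needing review."""
    consistent = []
    needs_review = []
    for value in date_values:
        cleaned = value.strip()
        if DATE_PATTERN.match(cleaned):
            consistent.append(cleaned)
        else:
            needs_review.append(value)
    return consistent, needs_review


# Made-up sample values showing the kind of variation found in practice.
sample_dates = ["1998", "2004-05-17", "circa 1920", "May 2001", "1875-03"]
ok, review = split_by_consistency(sample_dates)
```

In this sample, two of the five values fall outside the assumed convention, so a fully automatic crosswalk would either need additional rules for each variant or have to route those records to a person.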