Page 182 - Building Digital Libraries

Sharing Data—Harvesting, Linking, and Distribution


                 create metadata and metadata profiles to best serve themselves and their
                 users, it is unlikely that data, even in a shared metadata format, will be
                 usable without the need for some transformation or data reconciliation.
                 When working with digital library data, it is important to remember that
                 in most cases, an organization’s decisions related to format, standards, and
                 best practices will be unapologetically local, so captured remote metadata
                 must be crosswalked into a format that the local system can understand.
                 Today, metadata crosswalking remains the primary mechanism that is used
                 to allow different systems to interoperate with each other. The crosswalking
                 process removes data transfer barriers, allowing heterogeneous systems to
                 successfully share data. Within the library community, this is manifested
                 in federated search systems, which ingest metadata in various formats and
                 provide a standardized search syntax across resources.
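
                 As a rough illustration of the idea, a crosswalk can be sketched as a
                 field-to-field mapping applied to each harvested record. The remote
                 element names below follow Dublin Core style, while the local field
                 names (local_title, local_author, and so on) and the sample record are
                 invented for this sketch and do not describe any particular system.

```python
# Hypothetical mapping from remote (Dublin Core-style) element names
# to the field names an imagined local system expects.
CROSSWALK = {
    "title": "local_title",
    "creator": "local_author",
    "date": "local_date",
    "identifier": "local_id",
}


def crosswalk(record, mapping=CROSSWALK):
    """Translate one remote record into the local schema.

    Returns a pair: the translated record, and any remote fields that
    have no local equivalent, so nothing is silently discarded.
    """
    local, leftovers = {}, {}
    for field, values in record.items():
        target = mapping.get(field)
        if target is None:
            # No rule for this field: set it aside for a later decision.
            leftovers[field] = list(values)
        else:
            local.setdefault(target, []).extend(values)
    return local, leftovers


# Made-up sample record; each field holds a list of values.
remote = {
    "title": ["Oregon Trail diary"],
    "creator": ["Unknown"],
    "rights": ["Public domain"],
}
local, leftovers = crosswalk(remote)
```

                 Returning the unmapped fields separately, rather than dropping them,
                 keeps the decision about leftover data visible to whoever maintains
                 the crosswalk.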
                     In addition to system interoperability, metadata crosswalking can be
                 used to move data from an obsolete metadata schema. This type of data
                 crosswalking has been done for decades when dealing with binary formats.
                 Organizations routinely need to migrate image or document data from
                 obsolete file formats. Like binary formats, metadata formats are gradually
                 changed or replaced, and so they become obsolete with the passage of time.
                 As formats are phased out, crosswalks can be created to provide an upgrade
                 path for obsolete metadata schemas.

                 Crosswalking Challenges
                 Unfortunately, crosswalking metadata is hard work. In many ways, moving
                 bibliographic data between various metadata schemas is like trying to fit a
                 square peg into a round hole. In the end, a crosswalk is simply a process of
                 rounding off the square peg so that it fits more easily. Fortunately,
                 crosswalking challenges can generally be broken down into four categories:

                        1.  metadata consistency
                        2.  schema granularity
                        3.  the “spare parts”
                        4.  dealing with localisms


                 Metadata Consistency
                 When crosswalking metadata, consistency is the Holy Grail. The crosswalking
                 process must assume that metadata in one format has been applied
                 consistently if rules are to be developed for how that information should be
                 represented in other metadata formats. Given the algorithmic nature of
                 the crosswalking process and of digital interoperability efforts in general,
                 data consistency remains the key to these efforts.[8] Without data consistency,
                 crosswalking processes would need to be overly complex to deal with
                 inconsistent data and would very likely require human intervention during or
                 after the process. Ideally, interoperability efforts should be fully automatic,
                 requiring few exceptions for variations in the data. However, when dealing with
                 interoperability, the issue of data consistency is often a large hidden cost.
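
                 One way to surface that hidden cost early is to test a field against the
                 convention the crosswalk rule assumes before running the crosswalk. The
                 sketch below assumes, purely for illustration, that dates are expected as
                 YYYY or YYYY-MM-DD. The function name, the pattern, and the sample records
                 are all hypothetical.

```python
import re

# Assumed convention for this sketch: dates are YYYY or YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}(-\d{2}-\d{2})?")


def split_by_consistency(records, field="date", pattern=DATE_PATTERN):
    """Separate records that follow the expected convention from the rest.

    Records in the first list can be crosswalked automatically; records
    in the second list need a person (or a more complex rule) to look at.
    """
    consistent, needs_review = [], []
    for record in records:
        value = record.get(field, "")
        if pattern.fullmatch(value):
            consistent.append(record)
        else:
            needs_review.append(record)
    return consistent, needs_review


# Made-up sample records showing typical variation in a date field.
records = [
    {"id": "a1", "date": "1923"},
    {"id": "a2", "date": "1923-05-14"},
    {"id": "a3", "date": "circa 1920s"},
    {"id": "a4", "date": "May 14, 1923"},
]
ok, review = split_by_consistency(records)
```

                 The size of the review list gives a rough measure of how much manual
                 work, or how many extra rules, the inconsistent data will demand.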
