Page 181 - Building Digital Libraries

CHAPTER 7


                                                      While XSLT was designed as a stylesheet language for XML (much
                                                   like CSS was designed for HTML), XQuery was created as a full-featured
                                                   programming language and was designed for the search, retrieval, and
                                                   manipulation of large sets of XML data. Functionally, XSLT and XQuery
                                                   have a number of overlapping features, and this isn’t by accident. Although
                                                   the XSLT and XQuery specifications are developed by different groups,
                                                   those groups work cooperatively and share oversight of common technologies
                                                   such as XPath. This means that most operations that can be accomplished using one
                                                   processing technology can likely be accomplished using the other.
                                                      Given that XQuery and XSLT have many overlapping features, one
                                                   might assume that these technologies share a similar design, but that
                                                   assumption would be mistaken. XQuery is an expression-based language that
                                                   shares many of its design concepts with SQL. Its purpose is to provide a
                                                   query-based language that can process large XML datasets and databases,
                                                   overcoming one of the major weaknesses of the XSLT specification.
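
To make the "expression-based, SQL-like" description more concrete, the following is a minimal sketch in Python rather than XQuery itself. It mimics the shape of an XQuery FLWOR expression (for, where, order by, return) with a single comprehension over a small XML document. The sample data, the element names, and the function name `titles_after` are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element names are illustrative only.
SAMPLE = """
<library>
  <book><title>Metadata Basics</title><year>1998</year></book>
  <book><title>XML in Libraries</title><year>2004</year></book>
  <book><title>Digital Collections</title><year>2011</year></book>
</library>
"""


def titles_after(xml_text, year):
    """Return titles of books published after `year`, sorted alphabetically.

    Roughly analogous to an XQuery FLWOR expression such as:
        for $b in /library/book
        where $b/year > 2000
        order by $b/title
        return $b/title/text()
    """
    root = ET.fromstring(xml_text)
    return sorted(
        book.findtext("title")              # "return" clause
        for book in root.findall("book")    # "for" clause
        if int(book.findtext("year")) > year  # "where" clause
    )


print(titles_after(SAMPLE, 2000))
```

The whole query is one expression that yields a value, much as a SQL SELECT does. An XSLT stylesheet would typically express the same selection as a set of template rules.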
                                                      Within the context of digital libraries, however, XQuery has found only
                                                   limited use. Among the most popular library metadata formats, none
                                                   currently provides XQuery-based processing documents. All transformations
                                                   are currently provided as XSLT documents using version 1.0 or 2.0 of
                                                   the specification. In fact, as of this writing, probably the highest-profile
                                                   use of XQuery within the digital library community was the Library of
                                                   Congress's demonstration of converting legacy MARC data to BIBFRAME 1.0.[7]
                                                   However, the XQuery approach was abandoned in favor of XSLT when the
                                                   Library of Congress released the 2.0
                                                   version of the BIBFRAME specification.



                                                   Metadata Crosswalking

                                                   So what is metadata crosswalking? The crosswalking of metadata is a process
                                                   in which an XML document is transformed from one schema to another.
                                                   The crosswalking process utilizes XSLT or XQuery documents to facilitate
                                                   the movement of metadata between multiple formats. This process requires
                                                   a number of decisions about how metadata elements in one schema relate to
                                                   those in another. Metadata crosswalks are developed by examining the
                                                   similarities and differences between schemas. Are there one-to-one
                                                   relationships, that is, do elements share the same meanings, or will
                                                   the data need to be interpreted? Will the conversions be lossless (unlikely),
                                                   and if not, what level of data loss will be acceptable? These decisions are
                                                   among the most important in the crosswalking process, since they
                                                   will ultimately affect the quality of the final product.
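
The mapping decisions described above can be sketched as a small lookup table. This is a minimal illustration in Python, used here as a stand-in for the XSLT or XQuery documents a real crosswalk would use. The source schema, the target element names, and the `crosswalk` function are hypothetical and do not correspond to any published crosswalk.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk table: source element name -> target element name.
CROSSWALK = {
    "mainTitle": "title",          # one-to-one: same meaning in both schemas
    "author": "creator",           # one-to-one
    "editor": "contributor",       # many-to-one: the specific role is lost
    "illustrator": "contributor",  # many-to-one: the specific role is lost
    "issued": "date",
}


def crosswalk(source_xml, mapping=CROSSWALK):
    """Map a flat source record onto a flat target record.

    Returns (target_element, dropped), where `dropped` lists the source
    element names that have no mapping and were therefore discarded.
    """
    source = ET.fromstring(source_xml)
    target = ET.Element("record")
    dropped = []
    for child in source:
        target_name = mapping.get(child.tag)
        if target_name is None:
            # No equivalent in the target schema: this data is lost.
            dropped.append(child.tag)
            continue
        ET.SubElement(target, target_name).text = child.text
    return target, dropped


# Hypothetical source record with one unmapped element (shelfMark).
SOURCE = """
<item>
  <mainTitle>Building a Collection</mainTitle>
  <author>Doe, Jane</author>
  <illustrator>Roe, Richard</illustrator>
  <shelfMark>QA76.9</shelfMark>
</item>
"""

record, dropped = crosswalk(SOURCE)
print(ET.tostring(record, encoding="unicode"))
print("dropped:", dropped)
```

The sketch shows both kinds of loss the text raises. The illustrator becomes a generic contributor, so granularity is reduced. The shelf mark has no target element and is dropped entirely. Returning the list of dropped elements is one way to make that loss visible when judging whether it is acceptable.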
                                                      So why build metadata crosswalks? If the crosswalking of metadata
                                                   will result in the loss of metadata granularity, why not just create all meta-
                                                   data in the desired format to begin with? Well, metadata crosswalking is
                                                   done for a variety of reasons, though few are as important as remote data
                                                   interoperability. Information systems today require the ability to ingest
                                                   various types of metadata from remote sources. Since most organizations
