Page 160 - Building Digital Libraries

Metadata Formats


                 uncommon for a library to maintain a document repository like DSpace
                 or BePress, and a separate repository for other digital content, like images
                 and videos, since the metadata decisions made to support one type of content
                 often didn’t translate well to the others. But this is changing and changing
                 quickly. Tools like Fedora, and communities like the Samvera community,
                 are shifting the bibliographic data model from one where users must select
                 a single metadata framework, to one where we can utilize semantic web
                 principles and make use of multiple metadata namespaces to provide the
                 best support for our digital objects. This flexibility allows libraries to
                 think more holistically about the types of metadata frameworks they
                 use, and to choose elements from a wider range of communities that best
                 support the data model for their content. In addition, libraries may find
                 that digital repositories that support semantic web principles offer easier
                 paths to discovery, data interoperability, and migration.
                 But can we see this today?
                     The answer, at least as it relates to data interoperability, is that we largely
                 can’t. Data interoperability between formats and communities continues
                 to be governed primarily through the use of data crosswalks to normalize
                 the metadata from one community into a format that can be understood
                 by another. With that said, semantic principles and formats like
                 schema.org are moving quickly to provide a set of “common language”
                 elements that can be used to allow communities to cross barriers. Will these
                 common languages be as robust as older data crosswalks? Likely not. Most
                 data crosswalks provide one-to-one translations from one system to another,
                 but in many cases data interoperability doesn’t require strict data mapping.
                 It requires mapping that is good enough to provide the context needed to
                 support search and discovery, creating a framework that allows machines to
                 understand the relationships between interconnected data.
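
The idea of a "good enough" crosswalk can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `DC_TO_SCHEMA` table, the `crosswalk` function, and the sample record are hypothetical, and the pairings of Dublin Core elements with schema.org properties are not an official or complete mapping.

```python
# Hypothetical, deliberately lossy crosswalk from Dublin Core elements to
# schema.org properties. The pairings are illustrative assumptions only.
DC_TO_SCHEMA = {
    "dc:title": "name",
    "dc:creator": "creator",
    "dc:date": "dateCreated",      # lossy: dc:date is broader than dateCreated
    "dc:description": "description",
    "dc:subject": "keywords",
}


def crosswalk(record, mapping=DC_TO_SCHEMA):
    """Translate a record of {element: [values]} into the target vocabulary.

    Returns a (mapped, unmapped) pair so that elements with no equivalent
    are reported rather than silently dropped.
    """
    mapped, unmapped = {}, {}
    for element, values in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped[element] = list(values)
        else:
            # Several source elements may feed the same target property.
            mapped.setdefault(target, []).extend(values)
    return mapped, unmapped


# Placeholder record, not real catalog data.
sample = {
    "dc:title": ["Sample title"],
    "dc:creator": ["Author One", "Author Two"],
    "dc:rights": ["Rights statement"],
}
mapped, unmapped = crosswalk(sample)
```

Here `dc:rights` has no entry in the table, so it lands in `unmapped`. A strict crosswalk would need a rule for every element, while a discovery-oriented mapping can accept that loss as long as the title, creator, and subject context survive.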
                     Browsing the Web has become second nature for most individuals, but
                 even new users with very little experience working on the Web are able to
                 quickly view and make decisions regarding the content found there. When
                 browsing web content, human beings are easily able to understand the
                 difference between advertisements and content, which lets them
                 unconsciously filter the advertisements out of their mind’s eye. Likewise,
                 when one considers library metadata, a cataloger with any experience can
                 quickly determine the primary control number found within a MARC
                 record, allowing the cataloger to interpret not only the metadata record, but
                 the rules necessary to place that metadata into alternative formats. Machines
                 simply do not have this ability at this point in time. Automated
                 processes require rules and schemas that identify for the software
                 application the relationships that exist between data. Considering these
                 two examples, a machine would have a very difficult time distinguishing an
                 advertisement from content simply by examining the content. In part, this is
                 why the pop-up blockers and advertisement scrubbers that can be found in
                 web browsers today work primarily through the use of blacklists and known
                 advertising content providers to determine how the elements of a document
