Page 185 - Building Digital Libraries

CHAPTER 7


                                                   can prove to be a challenge, since much of the context and granularity is
                                                   lost through the process.





                                                   OAI-PMH

                                                   Once an item has made it into a digital repository, how is it to be shared?
                                                   Contributors likely want their work to reach the broadest audience, while
                                                   digital repository administrators want to make data available in a way that
                                                   maximizes its exposure at a relatively low cost. Can the repository be crawled
                                                   by search engines, and can the metadata be accessed by remote systems?
                                                   In today’s shared information climate, digital repository software must
                                                   provide a straightforward method for sharing metadata about the
                                                   items that it houses.
                                                      Fortunately, such a method exists in all major digital repository services.
                                                   OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a
                                                   simple HTTP-based protocol that can be used to make a digital repository’s
                                                   metadata available for harvest. The protocol works over a normal HTTP GET
                                                   request, allowing metadata to be harvested by the construction of a simple
                                                   URL. For example, the URL
                                                   http://kb.osu.edu/oai/request?verb=ListRecords&set=hdl_1811_29375&metadataPrefix=oai_dc
                                                   will harvest all metadata items from OSUL’s 2006–07 Mershon Center
                                                   Research Projects (Use of Force and Diplomacy) collection in the libraries’
                                                   institutional repository. The protocol uses a small set of verbs, which
                                                   restricts its functionality
                                                   primarily to metadata harvesting and the querying of information about a
                                                   specific collection or collections on the server. To simplify the OAI-PMH
                                                   harvesting process, the protocol requires support for Unqualified Dublin
                                                   Core. This is known as the compatibility schema: no matter which
                                                   OAI-PMH repository one harvests from, the metadata is guaranteed to be
                                                   available in Dublin Core. This does not prevent an OAI-PMH repository
                                                   from supporting other metadata formats, however. On the contrary,
                                                   OAI-PMH implementers are encouraged to support multiple metadata
                                                   formats so that a repository’s metadata can be provided at various
                                                   levels of granularity. In the OSUL institutional repository,
                                                   for example, two metadata formats are supported for harvest: Unqualified
                                                   Dublin Core and RDF.
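
As a minimal sketch of how such a request URL can be assembled, the Python below rebuilds the example URL given above from its parts: a base URL, a verb, and the set and metadataPrefix arguments. The helper name build_oai_request is hypothetical and not part of any library. The base URL and argument values are taken from the example in the text, and the endpoint may no longer respond at that address.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL suitable for an HTTP GET.

    Hypothetical helper for illustration: `verb` is the OAI-PMH verb
    (for example "ListRecords") and `arguments` are its arguments.
    """
    params = {"verb": verb}
    # Skip arguments passed as None so optional ones can be left out.
    params.update({k: v for k, v in arguments.items() if v is not None})
    return base_url + "?" + urlencode(params)


# Values copied from the example URL in the text.
url = build_oai_request(
    "http://kb.osu.edu/oai/request",
    "ListRecords",
    set="hdl_1811_29375",
    metadataPrefix="oai_dc",
)
print(url)
```

Swapping metadataPrefix for another format the repository supports would request the same records in that schema. The sketch only builds the URL; fetching and parsing the XML response is left out.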
                                                      OAI-PMH recognizes six actions, or requests, that can
                                                   be made to an OAI-PMH server. Attached to these actions is a limited set
                                                   of arguments that can be used to limit the range of data to be harvested by
                                                   date or set, as well as to request the harvested metadata in a specific schema.
                                                   Harvesting limits are set primarily by identifying a range of dates using the
                                                   “from” and “until” OAI-PMH arguments. Within the OAI-PMH server, date
                                                   ranges limit the OAI-PMH response to items whose metadata time stamp
                                                   falls within the specified date range. The “from” and “until”
                                                   arguments can be used as a pair or separately to selectively harvest metadata
                                                   from an OAI-PMH repository. Additional arguments that can be found in
