Page 51 - Building Digital Libraries

CHAPTER 3


                                                   the library need only create a formal agreement with the entity that main-
                                                   tains the data, and this agreement establishes that the data are important and
                                                   must be maintained indefinitely. In certain cases, relying on organizations
                                                   with an inherent interest in the resource to be responsible for preserving that
                                                   access is appropriate as well as cost-effective. Just as it is perfectly reasonable
                                                   to trust the Patent Office to provide perpetual access to patents and the
                                                   appropriate departments to maintain court proceedings, entities vested in
                                                   the ongoing provision of an information service, such as specialized scien-
                                                   tific communities, are often in a better position than the library to provide
                                                   access to that service. Having said this, other organizations have different
                                                   priorities than libraries, and providing access to such materials from other
                                                   access mechanisms can be difficult unless those organizations' systems are
                                                   designed to be used in this way.





                                                   Organizing Content and Assigning Metadata

                                                   An object’s value is ultimately defined by use, and people must find objects
                                                   before they can use them. For this reason, metadata must be assigned to
                                                   or extracted from acquired objects so they may be searched and relation-
                                                   ships between them understood. The first step in developing a workflow
                                                   for adding or extracting appropriate access points is to consider what types
                                                   of materials will be added, who will use them and how, how they will be
                                                   searched, and what the expected size of the repository will be. If the reposi-
                                                   tory consists of Portable Document Format (PDF) documents or simple
                                                   image files that are uploaded to a server, the processing required will be
                                                   substantially different from processing web pages or an eclectic collection of
                                                   multimedia resources. Likewise, completeness and consistency of metadata
                                                   are far more important for repositories that contain millions of items than
                                                   for collections containing only a few thousand items.
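
As a minimal sketch of what checking completeness might look like in practice, the Python below counts how many records in a batch lack a usable value for each required field. The field names (title, creator, date) and the sample records are illustrative assumptions, not drawn from any particular repository or metadata schema.

```python
from collections import Counter

# Hypothetical required access points; a real repository would define its own.
REQUIRED_FIELDS = ("title", "creator", "date")


def missing_field_counts(records, required=REQUIRED_FIELDS):
    """Count, per required field, how many records have no usable value."""
    counts = Counter()
    for record in records:
        for field in required:
            value = record.get(field)
            # Treat absent keys, None, and blank strings as missing.
            if value is None or not str(value).strip():
                counts[field] += 1
    return dict(counts)


# Illustrative sample records (invented for this example).
sample_records = [
    {"title": "Harbor at dusk", "creator": "Unknown", "date": "1921"},
    {"title": "Survey map", "creator": "", "date": "1898"},
    {"title": "Oral history interview", "date": None},
]

report = missing_field_counts(sample_records)
```

A report like this can be reviewed by hand in a collection of a few thousand items. In a repository of millions, the same gaps would more plausibly need to be caught automatically at ingest.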
                                                      A repository’s usefulness depends heavily on how well it associates
                                                   similar items and allows users to find what they need. There are many ways
                                                   to organize electronic resources, but one of the time-tested ways to provide
                                                   access is to create or extract metadata that is embedded in the resource or
                                                   stored in a separate record. All repository software relies on metadata to help
                                                   users find things—in fact, the traditional library catalog card is metadata
                                                   printed on paper.
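
The idea of a separate record can be sketched in a few lines of Python: a descriptive record points to a file and carries the access points that users search. The class name, field names, identifiers, and file paths here are hypothetical, chosen only for illustration, and do not reflect the data model of any specific repository software.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """A descriptive record kept separately from the file it describes."""
    identifier: str   # hypothetical local identifier
    file_path: str    # where the resource itself lives
    title: str
    subjects: list = field(default_factory=list)


def find_by_term(records, term):
    """Return identifiers of records whose title or subjects contain the term."""
    needle = term.casefold()
    matches = []
    for rec in records:
        haystack = [rec.title] + list(rec.subjects)
        if any(needle in text.casefold() for text in haystack):
            matches.append(rec.identifier)
    return matches


# Illustrative catalog (invented for this example).
catalog = [
    MetadataRecord("img-001", "images/0001.tif", "Harbor at dusk", ["Harbors", "Ships"]),
    MetadataRecord("doc-002", "docs/0002.pdf", "Annual report", ["Budgets"]),
]

harbor_hits = find_by_term(catalog, "harbor")
```

The image file contains no searchable text, yet it can still be found, because the search runs against the record rather than the file.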
                                                      The quality and completeness of metadata are critical to its useful-
                                                   ness. Although it’s administratively convenient to store information in its
                                                   original form and rely on automated means to identify access points, this
                                                   approach rarely leads to adequate long-term access. Resources—especially
                                                   nontextual ones such as those consisting of images, video, or sound—lend
                                                   themselves poorly to keyword searching. File formats become obsolete or
                                                   require users to install software which may be expensive or difficult to find,
