Page 44 - Building Digital Libraries

Acquiring, Processing, Classifying, and Describing Digital Content


                 the metadata provided by domain experts as well as the metadata extracted
                 via automation. Workflows for ingesting and processing materials need to
                 be based on realistic scenarios if the repository is to succeed.






                 Collection Development

                 Once a rough framework has been established for how materials will be
                 incorporated into the repository, repository planners must establish how
                 content will be selected and acquired. The selection process is critical, but it
                 is frequently neglected because many planners fail to appreciate how differ-
                 ent the process for selecting digital content can be from the familiar process
                 for selecting physical content.
                     For paper resources, selection and acquisitions are heavily influenced by
                 a publishing model that has been slowly evolving for over 500 years. Publish-
                 ing and distributing paper resources is a complex and expensive process.
                 Because publishers and distributors lose money if no one is interested in a
                 book or journal, mechanisms are built into the publication process to ensure
                 that an item in question is of value to enough customers to be marketable.
                 By its nature, the publication process imposes a minimal level of quality
                 and directly contributes to the traditional association of good libraries with
                 large collections. It also has led to a variety of mechanisms such as catalogs,
                 approval plans, and other ways to help librarians learn about and obtain
                 materials that might be of interest to users.
                     The Internet and rapidly advancing technology have lowered the barriers
                 to distributing works to the point that anyone can completely bypass the
                 publication process and distribute virtually anything to a worldwide audience
                 at negligible cost. This has dramatically increased the number of authors,
                 and it also means that materials need not be edited, marketed, or of interest
                 to any particular audience. Distribution is so decentralized that it is not reasonable
                 to believe that librarians can rely heavily upon marketing literature, catalogs,
                 approval plans, or other mechanisms used with print resources for purposes
                 of identifying materials for inclusion in the collection. Consequently, they
                 must find other means of learning about and acquiring resources that should
                 be added to the collection.
                     Adding materials to the collection simply because they are available,
                 requiring content providers to submit resources, or expecting librarians to
                 discover resources through serendipity are three common methods that
                 usually prove unsatisfactory unless used in combination with other tech-
                 niques. Acquiring resources because they are available electronically makes
                 no more sense than acquiring resources simply because they are available in
                 paper form. Content providers cannot be expected to provide resources
                 consistently because many of them either are unaware of the library’s need
                 to archive materials or simply do not care. It is unrealistic to depend
                 primarily on individuals encountering useful resources by means of
                 serendipity. For these reasons, the workflow itself needs to include a reliable