Page 44 - Building Digital Libraries
Acquiring, Processing, Classifying, and Describing Digital Content
the metadata provided by domain experts as well as the metadata extracted
via automation. Workflows for ingesting and processing materials need to
be based on realistic scenarios if the repository is to succeed.
Collection Development
Once a rough framework has been established for how materials will be
incorporated into the repository, repository planners must establish how
content will be selected and acquired. The selection process is critical, but it
is frequently neglected because many planners fail to appreciate how differ-
ent the process for selecting digital content can be from the familiar process
for selecting physical content.
For paper resources, selection and acquisitions are heavily influenced by
a publishing model that has been slowly evolving for over 500 years. Publish-
ing and distributing paper resources is a complex and expensive process.
Because publishers and distributors lose money if no one is interested in a
book or journal, mechanisms are built into the publication process to ensure
that the item in question is of value to enough customers to be marketable.
By its nature, the publication process imposes a minimal level of quality
and directly contributes to the traditional association of good libraries with
large collections. It also has led to a variety of mechanisms such as catalogs,
approval plans, and other ways to help librarians learn about and obtain
materials that might be of interest to users.
The Internet and rapidly advancing technology have lowered the barriers
to distributing works to the point that anyone can completely bypass the
publication process and distribute virtually anything to a worldwide
audience at negligible cost. Besides dramatically increasing the number of
authors, this shift means that materials need not be edited, marketed, or of
interest to any particular audience. Distribution is so decentralized that it is not reasonable
to believe that librarians can rely heavily upon marketing literature, catalogs,
approval plans, or other mechanisms used with print resources for purposes
of identifying materials for inclusion in the collection. Consequently, they
must find other means of learning about and acquiring resources that should
be added to the collection.
Adding materials to the collection simply because they are available,
requiring content providers to submit resources, or expecting librarians to
discover resources through serendipity are three common methods that
usually prove unsatisfactory unless used in combination with other tech-
niques. Acquiring resources because they are available electronically makes
no more sense than acquiring resources simply because they are available in
paper form. Content providers cannot be expected to provide resources
consistently because many of them either will be unaware of the library’s need
to archive materials or simply will not care. It is also unrealistic to depend
primarily on individuals encountering useful resources by means of
serendipity. For these reasons, the workflow itself needs to include a reliable