Page 46 - Building Digital Libraries
Acquiring, Processing, Classifying, and Describing Digital Content
includes resources that might be changed, which version(s) will
be kept? The decision to retain or not retain different versions
of resources has enormous implications for workflow, staffing,
systems resources, and access.
If every version of a resource is retained, procedures need
to be developed so that the repository staff know when a new
version is available. Time will need to be spent processing the
document, and systems resources will need to be allocated
accordingly. Even if changed resources can be easily identified,
and staff and systems resources are plentiful, how will
users search for and use these materials? If seven versions of a
resource are available, how do users and staff choose which one
to interact with?
Retaining only a single version of a resource has its own
problems. If only the most recent version is retained, a means
to replace previous versions and to make appropriate changes
to metadata is necessary. Similarly, a mechanism for inform-
ing the staff of updates must exist. Although confusion with
duplication in the repository is reduced when only one ver-
sion is kept, problems could emerge when a version that a user
cited as an authoritative source is replaced. A similar problem
will occur when the library only retains the first version that it
encounters, but a user cites a later version.
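The trade-offs among these retention choices can be made concrete with a
small sketch. The following Python is only an illustration under stated
assumptions, not a description of any particular repository system: the
VersionStore class, its three policy names ("all", "latest", "first"), and
the use of a SHA-256 checksum to decide whether a resource has changed are
all hypothetical choices made for this example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class VersionStore:
    """Hypothetical in-memory store illustrating three retention policies."""

    policy: str = "all"  # assumed names: "all", "latest", or "first"
    # Maps a resource id to the list of (checksum, content) pairs kept.
    versions: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.policy not in ("all", "latest", "first"):
            raise ValueError(f"unknown policy: {self.policy}")

    def ingest(self, resource_id: str, content: bytes) -> str:
        """Apply the retention policy and report what happened."""
        digest = hashlib.sha256(content).hexdigest()
        kept = self.versions.setdefault(resource_id, [])

        # A matching checksum means this exact version is already held.
        if any(d == digest for d, _ in kept):
            return "unchanged"

        # The first version encountered is kept under every policy.
        if not kept or self.policy == "all":
            kept.append((digest, content))
            return "added"

        if self.policy == "latest":
            # Replace the previous version; metadata updates would go here.
            kept[:] = [(digest, content)]
            return "replaced"

        # policy == "first": later versions are detected but not stored.
        return "ignored"


store = VersionStore(policy="latest")
print(store.ingest("report-1", b"first draft"))
print(store.ingest("report-1", b"second draft"))
```

The returned status is the point at which staff could be notified of an
update. Note that under the "latest" and "first" policies the sketch silently
discards content a user may have cited, which is exactly the problem
described above.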
Who should participate in the selection process?
Will specialists locate and identify works to include in the
repository, and if so, how will they accomplish this? Will an
automated or semiautomated process identify resources of
interest? Will content providers be expected to submit resources
via a web page? The process of determining which resources
are desired, where they are, and what tools exist to detect them
should help identify who should be involved in the selection
process.
Just as it is extremely useful in the physical world to have
people with subject expertise help select materials in those top-
ics, it is also very useful to have people who can be considered
experts help select digital resources. We tend to focus on the
technical aspects of digital collections, which often means that
a web developer, programmer, or systems administrator is put
in the role of curator. This is akin to having a printer or book-
binder select the books in a collection. In other words, make
sure the staff involved are playing to their strengths.
What tools exist to help automatically detect resources?
Can systematic ingestion tools (e.g., web spiders, data
extractions from a CMS or other system, scripts, or other tools) be
used to identify desirable materials? Identifying digital resources