Page 46 - Building Digital Libraries

Acquiring, Processing, Classifying, and Describing Digital Content


                           includes resources that might be changed, which version(s) will
                           be kept? The decision to retain or not retain different versions
                           of resources has enormous implications for workflow, staffing,
                           systems resources, and access.
                               If every version of a resource is retained, procedures need
                           to be developed so that the repository staff know when a new
                           version is available. Time will need to be spent processing the
                           document, and systems resources will need to be allocated
                           accordingly. Even if changed resources can be easily identified
                           and staff and systems resources are plentiful, how will
                           users search for and use these materials? If seven versions of a
                           resource are available, how do users and staff choose which one
                           to interact with?
                               Retaining only a single version of a resource has its own
                           problems. If only the most recent version is retained, a means
                           to replace previous versions and to make appropriate changes
                           to metadata is necessary. Similarly, a mechanism for inform-
                           ing the staff of updates must exist. Although confusion with
                           duplication in the repository is reduced when only one ver-
                           sion is kept, problems could emerge when a version that a user
                           cited as an authoritative source is replaced. A similar problem
                           will occur when the library only retains the first version that it
                           encounters, but a user cites a later version.
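
                               The trade-offs among these retention choices can be made
                           concrete with a minimal sketch. It assumes that a changed
                           resource can be recognized by comparing a SHA-256 digest of
                           its content; the class name VersionStore, the policy names
                           "all", "latest", and "first", and the returned status strings
                           are hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class VersionStore:
    """Hypothetical in-memory record of the versions kept for each resource."""

    policy: str = "all"  # illustrative names: "all", "latest", or "first"
    versions: dict = field(default_factory=dict)  # resource id -> list of digests

    def __post_init__(self):
        if self.policy not in ("all", "latest", "first"):
            raise ValueError(f"unknown retention policy: {self.policy}")

    def ingest(self, resource_id: str, content: bytes) -> str:
        """Record content according to the policy and report what happened."""
        digest = hashlib.sha256(content).hexdigest()
        kept = self.versions.setdefault(resource_id, [])
        if not kept:
            kept.append(digest)
            return "added"
        if digest in kept:
            # Content matches a version the repository already holds.
            return "already held"
        # A changed version has appeared; staff would be notified here.
        if self.policy == "all":
            kept.append(digest)      # every version is retained
            return "new version kept"
        if self.policy == "latest":
            kept[:] = [digest]       # the previous version is replaced
            return "replaced"
        return "ignored"             # "first": only the first version is kept
```

                               Under the "latest" policy a user who cited the replaced
                           version can no longer retrieve it, and under the "first"
                           policy a user who cites a later version will not find it,
                           which mirrors the citation problems described above.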

                        Who should participate in the selection process?
                           Will specialists locate and identify works to include in the
                           repository, and if so, how will they accomplish this? Will an
                           automated or semiautomated process identify resources of
                           interest? Will content providers be expected to submit resources
                           via a web page? The process of determining which resources
                           are desired, where they are, and what tools exist to detect them
                           should help identify who should be involved in the selection
                           process.
                               Just as it is extremely useful in the physical world to have
                           people with subject expertise help select materials in those top-
                           ics, it is also very useful to have people who can be considered
                           experts help select digital resources. We tend to focus on the
                           technical aspects of digital collections, which often means that
                           a web developer, programmer, or systems administrator is put
                           in the role of curator. This is akin to having a printer or book-
                           binder select the books in a collection. In other words, make
                           sure the staff involved are playing to their strengths.

                        What tools exist to help automatically detect resources?
                           Can systematic ingestion tools (e.g., web spiders, data extractions
                           from a CMS or other system, scripts, or other tools) be
                           used to identify desirable materials? Identifying digital resources
