Page 30 - Building Digital Libraries
Choosing a Repository Architecture
to centrally manage an image archive or a peer-reviewed journal. As a
practical matter, most libraries can only support a few systems. However,
adapting systems that were designed for intrinsically different use cases is
difficult, expensive, and often leads to unsatisfactory solutions that are
hard to migrate away from later.
What Types of Collections Will It Contain?
Part of understanding who your users are and what they need is identifying
what types of collections your repository will contain. Is access needed for
individual documents, images, and videos, objects consisting of multiple
files, or something else? What are the relative size and quantity of these
resources, what formats are they in, and what metadata are needed to manage
and describe them? What formats will they need to be in for reuse by
the users of the system? Do they contain sensitive information that must
be tightly controlled?
The answers to these questions impact the underlying architecture as
well as software options. If the repository will contain huge video or data
files, neither hosted services that require uploads over the Internet nor
locally hosted services that rely on web browser-based upload forms will be
practical, because the uploads would take much longer than most users are
willing to wait. With files this large, users will likely need to stream,
manipulate, or otherwise interact with objects without downloading them.
Proprietary formats, as well as large files in universally supported formats,
will require viewers, derivative copies, or other arrangements to be usable.
Materials associated with any kind of specialized workflow may affect both
repository software and architecture choices. Sensitive
materials may require sophisticated access controls and audit logs of when
items are accessed or modified.
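To make the point about upload times concrete, the short Python sketch below gives a rough estimate of how long a file takes to transfer at a given uplink speed. The function name, the 80 percent efficiency factor, and the sample file size and link speeds are illustrative assumptions, not figures from any particular repository platform or network.

```python
def upload_hours(size_gb: float, uplink_mbps: float, efficiency: float = 0.8) -> float:
    """Roughly estimate the hours needed to upload a file.

    size_gb     -- file size in decimal gigabytes (10**9 bytes)
    uplink_mbps -- nominal upload speed in megabits per second
    efficiency  -- assumed fraction of the nominal speed actually achieved
                   (0.8 is a guess; real throughput varies widely)
    """
    if size_gb < 0 or uplink_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    bits = size_gb * 8 * 1000**3                      # gigabytes -> bits
    seconds = bits / (uplink_mbps * 1_000_000 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: a 200 GB video master over several uplink speeds.
    for mbps in (10, 50, 100, 1000):
        print(f"{mbps:>5} Mbps: {upload_hours(200, mbps):.1f} hours")
```

Running an estimate like this against your own typical file sizes and network speeds is a quick way to judge whether browser-based or Internet uploads are realistic, or whether another ingest path is needed.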
How Are Assets Acquired?
All repositories were initially designed to meet the needs associated with
specific resource types, which are in turn associated with specific methods
for ingesting materials and metadata. For example, some common platforms
were initially designed to support needs such as sharing research, serving
as a journal publication platform, disseminating images, supporting music
education, or performing a number of other tasks. The repository project
will be much more successful if the process needed for ingesting objects and
metadata is compatible with what the system is designed to do.
• Are objects added one at a time or in batches?
• Is there technical, descriptive, or administrative metadata
that must be added, or is this metadata already provided
upon ingestion?