Page 120 - Building Digital Libraries
General-Purpose Technologies Useful for Digital Repositories
primarily show up in the form of syntax, object structure, and language
performance.
Ruby, Python, PHP, Perl: For the sake of brevity, these four
languages are grouped together in the same category. Although
each is different, all are interpreted languages; that is, they
do not require a separate compilation step before a program can
run. Within the library community, PHP and Ruby are currently the
most widely used languages. For the purposes of data manipulation,
Python and Perl are the two most widely used languages, and
both have significant communities outside the cultural heritage
community that can provide support and feedback.
Java, C#, C++, Objective-C, Swift: These are compiler-dependent
programming languages. Unlike interpreted languages, these
languages are compiled before they are run, either into native
binaries that run as stand-alone applications (C++, Objective-C,
Swift) or into bytecode that runs on a managed runtime (Java, C#).
These languages are often more difficult to learn because they
require a mastery of concepts like garbage collection and
advanced memory management, but they also provide significant
performance and functional advantages over current interpreted
languages. These languages are often used in the development of
core digital library components. For example, Fedora is written in
Java, as is DSpace, while many XML processing tools, such as
libxml2 and the native extension of Nokogiri (for Ruby) that
builds on it, are written in C, a compiled language closely
related to C++. For the purposes of data manipulation, these
languages are best used when large-scale data processing is
required.
Programming Tools
A sizeable number of programming tools have been created to aid the
manipulation and creation of the data that are commonly found in cultural
heritage institutions. This section will highlight a few of them.
MARC4J (https://github.com/marc4j/marc4j): MARC4J is a Java
library developed to enable the creation, manipulation, and
processing of MARC and MARCXML data. The tool is widely
used and actively supported.
MARC::Record (http://search.cpan.org/perldoc?MARC%3A%3ARecord):
MARC::Record is a popular Perl module developed to enable the
creation, manipulation, and processing of MARC and MARCXML data.
The module is used in many popular library applications, like
Koha (https://koha-community.org/), and has an active development
and support community.
pymarc (https://github.com/edsu/pymarc): pymarc is a popular
Python library that was developed to enable the creation,
manipulation, and processing of MARC21 records.
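
To make concrete what "processing" MARCXML data involves, the following is a minimal sketch that uses only the Python standard library rather than any of the tools listed above. The sample record, its control number, and the helper name subfields are all invented for illustration; only the MARCXML namespace and the element and attribute names (record, controlfield, datafield, subfield, tag, code) come from the MARCXML schema itself. A dedicated library such as pymarc or MARC4J would normally handle these details, along with binary MARC21, character encodings, and record validation.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML documents (the "MARC21 slim" schema).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">demo-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example title :</subfield>
    <subfield code="b">an invented subtitle.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Digital libraries.</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` found in datafields `tag`.

    Hypothetical helper written for this sketch; not part of any library.
    """
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE)

# Control fields carry their value directly as element text.
control_number = record.find("marc:controlfield[@tag='001']", MARC_NS).text

# Field 245 holds the title statement: $a is the title, $b the remainder.
title = " ".join(subfields(record, "245", "a") + subfields(record, "245", "b"))

# Field 650 holds topical subject headings; a record may repeat it.
subjects = subfields(record, "650", "a")

print(control_number)
print(title)
print(subjects)
```

The sketch reads one record held in a string. Real workflows usually iterate over files containing many records, which is where the purpose-built libraries described in this section save the most effort.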