Page 220 - Using MIS
188 Chapter 5 Database Processing
their strategies, as you’ll learn when you study business intelligence in Chapter 9. Furthermore,
as databases become bigger and bigger, they’re more attractive as targets for theft or mischief, a
subject you’ll consider in Chapter 10.
Setting these ideas aside, what else can we imagine for database technology by 2025? We
can get a glimpse into that future by recognizing that the major principles of the relational
model—the fixed-sized tables, the relationships among tables via foreign keys, and the theory of
normalization—all came about because of limited storage space and limited processing speeds
back in the 1960s and early 1970s.⁴ At some point, maybe the mid-1990s, these limitations were
removed by improved storage and processing technology, and today they do not exist. Today the
relational model is not needed.
Furthermore, the relational model was never a natural fit with business documents. For example,
users want to store sales orders; they do not want to break up sales orders via normalization
and store the data in separate tables. It’s like taking your car into a parking garage and having
the attendant break it up into pieces, store the pieces in separate piles, and then reassemble
it from the pieces when you come back to get it. And why? For the efficiency and convenience of
the management of the parking garage.
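
To make the parking garage analogy concrete, here is a minimal sketch in Python. It assumes a made-up sales order with hypothetical table and field names (`orders`, `order_lines`, `part`, `qty`, `price`); none of these come from a real system. The first half breaks the order into rows in two tables linked by a foreign key and then reassembles it. The second half keeps the same order as one document.

```python
import json
import sqlite3

# Hypothetical sales order as the user thinks of it: one document.
sales_order = {
    "order_id": 1001,
    "customer": "Example Customer",
    "lines": [
        {"part": "Brake pad", "qty": 2, "price": 19.50},
        {"part": "Chain", "qty": 1, "price": 24.00},
    ],
}


def store_normalized(conn, order):
    """Break the order into rows in two tables linked by a foreign key."""
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)"
    )
    conn.execute(
        "CREATE TABLE order_lines ("
        " order_id INTEGER REFERENCES orders(order_id),"
        " line_no INTEGER, part TEXT, qty INTEGER, price REAL)"
    )
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)",
        (order["order_id"], order["customer"]),
    )
    for line_no, line in enumerate(order["lines"], start=1):
        conn.execute(
            "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?)",
            (order["order_id"], line_no, line["part"], line["qty"], line["price"]),
        )


def load_normalized(conn, order_id):
    """Reassemble the order from its pieces."""
    customer = conn.execute(
        "SELECT customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT part, qty, price FROM order_lines"
        " WHERE order_id = ? ORDER BY line_no",
        (order_id,),
    ).fetchall()
    return {
        "order_id": order_id,
        "customer": customer,
        "lines": [{"part": p, "qty": q, "price": pr} for p, q, pr in rows],
    }


# Relational style: take the order apart, then put it back together.
conn = sqlite3.connect(":memory:")
store_normalized(conn, sales_order)
rebuilt = load_normalized(conn, 1001)

# Document style: the whole order is stored and retrieved as one unit.
as_document = json.dumps(sales_order)
restored = json.loads(as_document)
```

Both approaches return the same order. The difference is how much taking apart and reassembling happens along the way.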
This is not to say that relational databases will be replaced anytime soon. Organizations have
created thousands of relational databases with millions of lines of application code that process
SQL statements against relational data structures. There is also a strong social trend among older
technologists to hang on to the relational model. But the primary reason for the relational model’s
existence is gone, and breaking documents into pieces via normalization is no longer necessary.
Also, organizations today want to store new types of data such as images, audio, and
video. Those files are large collections of bits, and they don’t fit into relational structures.
Collections of such files still need metadata; we need such data to record when, where, how,
and for what purpose the files exist, but we don’t need to put the files into relational databases
just to obtain that metadata. AllRoad Parts’ desire to store images for customers’ image queries
provides an excellent example.
MongoDB is an open source document-oriented DBMS that AllRoad Parts could use to
store its nonstructured data. MongoDB does not require normalized data; instead, it manages
collections of documents where those documents can have a variety of structures, including
large bit files for image, audio, and video data. MongoDB can also store documents like sales
orders without requiring that they be normalized. It is used by companies like Craigslist and
foursquare; the name MongoDB is a play on the adjective humongous.
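
The idea of a collection whose documents have different structures can be sketched in a few lines of Python. This is an illustration only: the list of dictionaries and the `find` helper below are hypothetical stand-ins and are not MongoDB's API. The field names, file name, date, and image bytes are all made up.

```python
# A "collection" here is just a Python list of dicts. It mimics the idea
# of a document collection and is NOT MongoDB's API.
collection = []

# A sales order stored whole, without normalization.
collection.append({
    "type": "sales_order",
    "order_id": 1001,
    "lines": [{"part": "Brake pad", "qty": 2}],
})

# An image stored with its metadata; its structure differs from the order's.
collection.append({
    "type": "image",
    "file_name": "brake_pad.jpg",   # hypothetical file name
    "content": b"\xff\xd8\xff",     # placeholder bytes, not a real image
    "metadata": {"taken_on": "2014-01-15", "purpose": "customer image query"},
})


def find(docs, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


images = find(collection, type="image")
orders = find(collection, type="sales_order")
```

Notice that the two documents share only the `type` field. Nothing forces them into a common set of columns.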
But MongoDB is not alone. A few years ago, Amazon.com determined that relational database
technology wouldn’t meet its needs, and it developed a nonrelational data store called
Dynamo.⁵ Meanwhile, for many of the same reasons, Google developed a nonrelational data
store called Bigtable.⁶ Facebook took concepts from both of these systems and developed a
third nonrelational data store called Cassandra.⁷ In 2008, Facebook turned Cassandra over to
the open source community, and now Apache has dubbed it a Top Level Project (TLP), which is
the height of respectability among open source projects.
Such nonrelational databases have come to be called NoSQL databases, where NoSQL
means nonrelational databases that support very high transaction rates processing relatively
4 For a summary of this early history and an amplification of these ideas, see David Kroenke, “Beyond the
Relational Model,” IEEE Computer, June 2005.
5 Werner Vogels, “Amazon’s Dynamo,” All Things Distributed blog, last modified October 2, 2007,
www.allthingsdistributed.com/2007/10/amazons_dynamo.html.
6 Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar
Chandra, Andrew Fikes, and Robert E. Gruber, “Bigtable: A Distributed Storage System for Structured Data,”
OSDI 2006, Seventh Symposium on Operating System Design and Implementation, Seattle, WA, last modified
November 2006, http://labs.google.com/papers/bigtable.html.
7 Jonathan Ellis, “Cassandra: Open Source Bigtable + Dynamo,” accessed June 2011,
www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo.