Page 385 - Beginning PHP 5.3
P. 385
Chapter 12: Introducing Databases and SQL
So generally speaking, VARCHAR character fields save you disk space. Don ’ t be tempted to use VARCHAR
fields for storing every string, though, because that has drawbacks, too. The MySQL server processes
CHAR type fields much faster than VARCHAR type, for one thing, because their length is predetermined.
If your strings don ’ t vary in length much, or at all, you ’ re better off using CHAR type fields. In fact, when
your strings are all the same length, VARCHAR takes up more disk space, because it has to store the length
of each string in one or two additional bytes.
With the character types — CHAR , VARCHAR , TEXT , and so on — the amount you can store may be less
than the maximum shown, depending on the character set used. For example, the UTF - 8 (Unicode)
character set commonly uses up to 3 bytes per character, so a VARCHAR field may only be able to store up
to 21,844 UTF - 8 characters.
Using Indexes and Keys
Inexperienced database designers sometimes complain about their database engines being slow — a
problem that ’ s often explained by the lack of an index . An index is a separate sorted list of the values in a
particular column (or columns) in a table. Indexes are also often called keys ; the two words are largely
interchangeable. You can optionally add indexes for one or more columns at the time you create the
table, or at any time after the table is created.
To explain why indexing a table has a dramatic effect on database performance, first consider a table
without indexes. Such a table is similar to a plain text file in that the database engine must search it
sequentially. Rows in a relational database are not inserted in any particular order; the server inserts
them in an arbitrary manner. To make sure it finds all entries matching the information you want, the
engine must scan the whole table, which is slow and inefficient, particularly if there are only a few
matches.
Now consider an indexed table. Instead of moving straight to the table, the engine can scan the index for
items that match your requirements. Because the index is a sorted list, this scan can be performed very
quickly. The index guides the engine to the relevant matches in the database table, and a full table scan is
not necessary.
So why not just sort the table itself? This might be practical if you knew that there was only one field on
which you might want to search. However, this is rarely the case. Because it ’ s not possible to sort a table by
several fields at once, the best option is to use one or more indexes, which are separate from the table.
A primary key is a special index that, as you saw earlier, is used to ID records and to relate tables to one
another, providing the relational database model. Each related table should have one (and only one)
primary key.
You can also create an index or primary key based on combinations of fields, rather than just a single
field. For a key to be formed in this way, the combination of values across the indexed fields must be
unique.
Because an index brings about a significant boost in performance, you could create as many indexes as
possible for maximum performance gain, right? Not always. An index is a sure - fire way to increase the
speed of searching and retrieving data from a table, but it makes updating records slower, and also
increases the size of the table. This is because, when you insert a record into an indexed table, the
database engine also has to record its position in the corresponding index or indexes. The more indexes,
the slower the updating process and the larger the table.
347
9/21/09 9:11:10 AM
c12.indd 347
c12.indd 347 9/21/09 9:11:10 AM