Page 584 - Beginning PHP 5.3
Part III: Using PHP in Practice
Say you wanted to search a string for a date in the format mmm/dd/yy or mmm/dd/yyyy (for example,
jul/15/06 or jul/15/2006). That’s three lowercase letters, followed by a slash, followed by one or two
digits, followed by a slash, followed by between two and four digits. This regular expression will do
the job:
echo preg_match( "/[a-z]{3}\/\d{1,2}\/\d{2,4}/", "jul/15/2006" ); // Displays "1"
(This expression isn’t perfect: for example, it will also match three-digit “years,” and because it is not anchored it will match a date embedded in a longer string. But you get the idea.)
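One way the pattern could be tightened, as a minimal sketch: require the year to be exactly two or exactly four digits, and add word boundaries so that the match cannot stop partway through a longer run of digits. The variable names below are illustrative and are not part of the original example.

```php
<?php
// Sketch only: a stricter variant of the date pattern.
// (?:\d{4}|\d{2}) accepts a four-digit or a two-digit year, and the \b word
// boundaries stop the match from starting or ending inside a longer token.
$pattern = '/\b[a-z]{3}\/\d{1,2}\/(?:\d{4}|\d{2})\b/';

$twoDigitYear   = preg_match( $pattern, 'jul/15/06' );   // 1
$fourDigitYear  = preg_match( $pattern, 'jul/15/2006' ); // 1
$threeDigitYear = preg_match( $pattern, 'jul/15/200' );  // 0

echo "$twoDigitYear $fourDigitYear $threeDigitYear\n"; // Displays "1 1 0"
```

The pattern still accepts impossible dates such as abc/99/99; validating the month and day values would need further checks.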
Greedy and Non-Greedy Matching
When you use quantifiers to match multiple characters, the quantifiers are greedy by default. This means
that they will try to match the largest number of characters possible. Consider the following code:
preg_match( "/P.*r/", "Peter Piper", $matches );
echo $matches[0]; // Displays "Peter Piper"
The regular expression reads, “Match the letter ‘P’ followed by zero or more characters of any type,
followed by the letter ‘r’.” Because quantifiers are, by nature, greedy, the regular expression engine
matches as many characters as it can between the first “P” and the last “r”; in other words, it matches
the entire string.
You can change a quantifier to be non-greedy. This causes it to match the smallest number of characters
possible. To make a quantifier non-greedy, place a question mark (?) after the quantifier. For example, to
match the smallest possible number of digits use:
/\d+?/
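To see the difference on digits, here is a small sketch that runs the greedy and non-greedy forms against the same string. The subject string and variable names are made up for illustration.

```php
<?php
// Sketch: the same subject matched greedily and non-greedily.
$subject = 'Order 12345';

preg_match( '/\d+/',  $subject, $greedy ); // greedy: takes every digit it can
preg_match( '/\d+?/', $subject, $lazy );   // non-greedy: stops after one digit

echo $greedy[0] . "\n"; // Displays "12345"
echo $lazy[0] . "\n";   // Displays "1"
```

Because nothing follows the non-greedy quantifier in the pattern, the engine is satisfied with the minimum of one digit.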
Rewriting the Peter Piper example using a non - greedy quantifier gives the following result:
preg_match( "/P.*?r/", "Peter Piper", $matches );
echo $matches[0]; // Displays "Peter"
Here, the expression matches the first letter “P” followed by the smallest number of characters possible
(“ete”), followed by the first letter “r”.
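The effect is easier to see when every match in the string is collected. The following sketch uses preg_match_all(), which does not appear in the passage above, to compare the two patterns; the variable names are illustrative.

```php
<?php
// Sketch: collect every match for the greedy and non-greedy patterns.
$subject = 'Peter Piper';

preg_match_all( '/P.*r/',  $subject, $greedyMatches );
preg_match_all( '/P.*?r/', $subject, $lazyMatches );

// With the default flags, index 0 holds the list of full-pattern matches.
echo count( $greedyMatches[0] ) . "\n";      // Displays "1"
echo implode( ', ', $lazyMatches[0] ) . "\n"; // Displays "Peter, Piper"
```

The greedy pattern consumes the whole string in a single match, whereas the non-greedy pattern stops at each “r” and so finds both words separately.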
Using Subpatterns to Group Patterns
By placing a portion of your regular expression’s rules in parentheses, you can group those rules into a
subpattern. A major benefit of doing this is that you can use quantifiers (such as * and ?) to match the
whole subpattern a certain number of times. For example:
// Displays "1"
echo preg_match( "/(row,? )+your boat/", "row, row, row your boat" );
The subpattern in this regular expression is (row,? ). It means: “The letters ‘r’, ‘o’, and ‘w’,
followed by either zero or one comma, followed by a space character.” This subpattern is then matched
at least one time thanks to the following + quantifier, resulting in the “row, row, row ” portion of the
target string being matched. Finally, the remaining characters in the pattern match the “your boat” part
of the string. The end result is that the entire string is matched.
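As a rough illustration of what the parentheses contribute, the sketch below runs the grouped pattern against a few strings and then drops the parentheses, so that the + applies only to the space character. The extra test strings and variable names are assumptions made for this example.

```php
<?php
// Sketch: how grouping changes what the + quantifier repeats.
$grouped   = '/(row,? )+your boat/'; // + repeats the whole subpattern
$ungrouped = '/row,? +your boat/';   // + repeats only the space

$threeRows = preg_match( $grouped, 'row, row, row your boat', $groupedMatch ); // 1
$oneRow    = preg_match( $grouped, 'row your boat' );                          // 1
$noRows    = preg_match( $grouped, 'your boat' );                              // 0

preg_match( $ungrouped, 'row, row, row your boat', $ungroupedMatch );

echo $groupedMatch[0] . "\n";   // Displays "row, row, row your boat"
echo $ungroupedMatch[0] . "\n"; // Displays "row your boat"
```

Without the parentheses the pattern can match only a single “row” before “your boat”, so only the tail of the string is matched.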