Page 584 - Beginning PHP 5.3

Part III: Using PHP in Practice
Say you wanted to search a string for a date in the format mmm/dd/yy or mmm/dd/yyyy (for example,
jul/15/06 or jul/15/2006). That’s three lowercase letters, followed by a slash, followed by one or two
digits, followed by a slash, followed by between two and four digits. This regular expression will do
the job:

    echo preg_match( "/[a-z]{3}\/\d{1,2}\/\d{2,4}/", "jul/15/2006" );  // Displays "1"

(This expression isn’t perfect; for example, it will also match three-digit “years,” but you get the idea.)
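
To make that caveat concrete, here is a minimal sketch that runs the original pattern against a three-digit year and then tries one possible tightening. The stricter pattern and the variable names $loose and $strict are assumptions for illustration, not part of the original example. The anchors ^ and $ require the whole string to be the date, and (\d{4}|\d{2}) reads as “four digits or two digits.”

```php
<?php
// Original pattern from the text: the year may be two, three, or four digits.
// Single quotes are used so PHP leaves the backslashes and $ untouched.
$loose = '/[a-z]{3}\/\d{1,2}\/\d{2,4}/';

// Hypothetical stricter variant (an assumption, not from the text):
// anchored with ^ and $, and the year must be exactly two or four digits.
$strict = '/^[a-z]{3}\/\d{1,2}\/(\d{4}|\d{2})$/';

echo preg_match( $loose, 'jul/15/206' ), "\n";    // Displays "1" (three-digit year accepted)
echo preg_match( $strict, 'jul/15/206' ), "\n";   // Displays "0"
echo preg_match( $strict, 'jul/15/06' ), "\n";    // Displays "1"
echo preg_match( $strict, 'jul/15/2006' ), "\n";  // Displays "1"
```

Note that the stricter version only checks the shape of the string; it still accepts impossible dates such as a day of 99.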

Greedy and Non-Greedy Matching

When you use quantifiers to match multiple characters, the quantifiers are greedy by default. This means
that they will try to match the largest number of characters possible. Consider the following code:

    preg_match( "/P.*r/", "Peter Piper", $matches );
    echo $matches[0];  // Displays "Peter Piper"

The regular expression reads, “Match the letter ‘P’ followed by zero or more characters of any type,
followed by the letter ‘r’.” Because quantifiers are, by nature, greedy, the regular expression engine
matches as many characters as it can between the first “P” and the last “r”; in other words, it matches
the entire string.

You can change a quantifier to be non-greedy. This causes it to match the smallest number of characters
possible. To make a quantifier non-greedy, place a question mark (?) after the quantifier. For example, to
match the smallest possible number of digits use:

                    /\d+?/

Rewriting the Peter Piper example using a non-greedy quantifier gives the following result:

    preg_match( "/P.*?r/", "Peter Piper", $matches );
    echo $matches[0];  // Displays "Peter"

Here, the expression matches the first letter “P” followed by the smallest number of characters possible
(“ete”), followed by the first letter “r”.
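
The difference is easiest to see side by side. The following is a small sketch that runs the greedy and non-greedy forms against the same input. The sample string and the variable names ($text, $greedy, $lazy, $a, $b) are made up for illustration.

```php
<?php
$text = 'Order 12345 shipped';

// Greedy: \d+ consumes as many digits as possible.
preg_match( '/\d+/', $text, $greedy );
echo $greedy[0], "\n";  // Displays "12345"

// Non-greedy: \d+? stops after the minimum of one digit.
preg_match( '/\d+?/', $text, $lazy );
echo $lazy[0], "\n";    // Displays "1"

// The same contrast with the "Peter Piper" example from the text.
preg_match( '/P.*r/', 'Peter Piper', $a );   // greedy
preg_match( '/P.*?r/', 'Peter Piper', $b );  // non-greedy
echo $a[0], ' | ', $b[0], "\n";  // Displays "Peter Piper | Peter"
```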

Using Subpatterns to Group Patterns

By placing a portion of your regular expression’s rules in parentheses, you can group those rules into a
subpattern. A major benefit of doing this is that you can use quantifiers (such as * and ?) to match the
whole subpattern a certain number of times. For example:

    // Displays "1"
    echo preg_match( "/(row,? )+your boat/", "row, row, row your boat" );

The subpattern in this regular expression is “(row,? )”. It means: “The letters ‘r’, ‘o’, and ‘w’,
followed by either zero or one comma, followed by a space character.” This subpattern is then matched
at least one time thanks to the following + quantifier, resulting in the “row, row, row ” portion of the
target string being matched. Finally, the remaining characters in the pattern match the “your boat” part
of the string. The end result is that the entire string is matched.
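
To check that the + quantifier really applies to the whole subpattern, you can pass a third argument to preg_match() and inspect the matched text. This is a minimal sketch; the variable names ($pattern, $subject, $matches, $once) and the comparison pattern without the + are illustrative additions rather than part of the original example.

```php
<?php
$pattern = '/(row,? )+your boat/';
$subject = 'row, row, row your boat';

// With the + quantifier, the subpattern repeats and the whole string matches.
echo preg_match( $pattern, $subject, $matches ), "\n";  // Displays "1"
echo $matches[0], "\n";  // Displays "row, row, row your boat"

// Without the +, the subpattern may appear only once,
// so the match can only begin at the final "row ".
preg_match( '/(row,? )your boat/', $subject, $once );
echo $once[0], "\n";  // Displays "row your boat"
```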