Page 85 - Building Digital Libraries

CHAPTER 5


XPath

XPath is a language for addressing parts of an XML document. It was
designed to be used with XSLT and XPointer, and it defines a syntax by
which XML data can be extracted and acted upon.³ Conceptually, an XML
document is a tree in which each element is a node, and XPath defines a
method for accessing the individual nodes on that tree. For example,
consider the following XML snippet:


    <?xml version="1.0" encoding="utf-8" ?>
    <book>
      <item>
        <title>Pride and Prejudice</title>
        <author>Jane Austen</author>
        <publication_date>1813</publication_date>
        <language>eng</language>
        <format>text</format>
      </item>
      <item>
        <title>Pride and Prejudice</title>
        <author>Jane Austen</author>
        <author type="screenwriter">Deborah Moggach</author>
        <publication_date>2015</publication_date>
        <language>eng</language>
        <format>film</format>
      </item>
      <item>
        <title>Pride and Prejudice</title>
        <author>Jane Austen</author>
        <publication_date>2017</publication_date>
        <language>eng</language>
        <format>text</format>
      </item>
    </book>

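As a rough illustration of addressing nodes in a snippet like this one, the sketch below uses Python's standard-library xml.etree.ElementTree module. Python is an assumption here, since the text names no implementation language, and ElementTree supports only a limited subset of XPath. The XML string is a trimmed copy of the snippet (the declaration and some child elements are left out), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the snippet: three <item> groups under one <book> root.
XML = """
<book>
  <item>
    <title>Pride and Prejudice</title>
    <publication_date>1813</publication_date>
    <format>text</format>
  </item>
  <item>
    <title>Pride and Prejudice</title>
    <publication_date>2015</publication_date>
    <format>film</format>
  </item>
  <item>
    <title>Pride and Prejudice</title>
    <publication_date>2017</publication_date>
    <format>text</format>
  </item>
</book>
"""

root = ET.fromstring(XML)  # root is the <book> element

# ElementTree paths are relative to the element they are called on,
# so "item[2]" here corresponds to /book/item[2] in full XPath.
# XPath positions count from 1, so [2] selects the second <item>.
second_date = root.find("item[2]/publication_date").text
second_format = root.find("item[2]/format").text

# Python lists count from 0, so the same node is index 1 in findall().
same_date = root.findall("item")[1].find("publication_date").text

print(second_date, second_format)  # 2015 film
```

The comparison between item[2] in the path and index 1 in the Python list is the point of the sketch: the positional predicate and the list index refer to the same second item node.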
XPath statements provide a way to access an individual node within an XML
file by naming its location relative to the root element. In this case, a
process looking to extract the publication_date and format from the second
item tag group would create an XPath statement that navigates the document
nodes. In this example, however, the node item is not unique: it appears
multiple times at the same level within the XML document. XPath
accommodates this by allowing access to the item group as if its members
were elements of an array. XPath positions, however, differ from
traditional array indexes in that XPath counts from 1, while an array in
Perl, C, or C# starts at zero. Accessing the second node from the example
above would use the following statement: /book/item[2]/publication_date,
which illustrates how the data in the second item node would be addressed.
When coupled with XSLT, XPath gives an individual or a process the ability
to loop or extract



