DataStream

LANGUAGES: XML | XPath

ASP.NET VERSIONS: 1.x | 2.0

 

XPath Basics

Identifying Nodes and Selections in XML Data

 

By Brian Noyes

 

XML data is becoming more prevalent as more businesses and systems become integrated over the Web. Happily, XML itself becomes easier to work with in every release of the .NET Framework and Microsoft SQL Server. The upcoming release of .NET 2.0 and SQL Server 2005 (code-named Yukon) next year is no exception.

 

To work effectively with XML, however, you need to get comfortable with a number of related XML technologies, including XML Schema, XPath, XSLT, and - eventually - XQuery. This article focuses on the basic syntax of XPath. XPath is particularly important because many of the other XML technologies (including XSLT, XPointer, and XQuery) use XPath as the means to identify nodes and selections within XML data.

 

Location, Location, Location

XPath is all about specifying a location or set of locations (in the form of nodes) within a set of XML data or an XML document. Keep in mind that data can be presented as XML even though it may have no relation to a physical file or document; the term XML document is still used to describe that set of data. XPath treats XML data as a hierarchical set of nodes through which you can navigate or identify nodes based on their names and relative location (or path) to one another.

 

The easiest analogy for understanding the syntax of basic XPath is to think about your file system and the paths that you use to specify the location of files or folders within it. In that case, the files and folders are the node types, and folder nodes can contain other folder or file nodes, forming a hierarchical system of nodes that you can navigate. You specify the path to a file or folder either by fully qualifying it from its root (e.g., the C:\ drive), or by using a relative path from a current location (e.g., ..\include\myfile.xml).

 

You separate the steps in a file system path with backslashes. These same concepts, and even a similar syntax, apply to specifying the location of nodes within an XML document using XPath.

 

XPath goes beyond specifying locations of nodes, however. XPath expressions can be composed of complex conditions, or predicates, that are evaluated against a node or set of nodes to determine the result of the XPath expression. The XPath language also includes a number of built-in functions that allow you to modify the result of the expression in many ways.

 

XPath expressions can evaluate to a node set, a number, a string, or a Boolean. They evaluate to a node set when the expression represents one or more nodes within the XML that match the criteria specified by the expression. They can evaluate to a number, string, or Boolean when they are used to extract the value of a node, or the result of a built-in function.

 

Location steps. The trick to understanding or constructing an XPath expression is to break it down into its constituent parts. An XPath expression is composed of one or more location steps. Each location step evaluates to either a node set or a value. If the location step evaluates to a node set, it can be followed by subsequent location steps that are evaluated relative to the previous location step. In this way, you can think of the node set results of a location step as a cursor or set of cursors into the XML document. These cursors set the current context for the evaluation of subsequent location steps.

 

The location steps within an XPath expression are separated by forward slashes. You can view everything between the forward slashes as individual location steps:

 

location-step-1/location-step-2/location-step-3...

 

In this example, location-step-2 would be evaluated relative to the nodes that were the result of location-step-1, and location-step-3 would be evaluated relative to the resulting nodes from location-step-2.
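The relative evaluation described above can be sketched in Python using the standard library's xml.etree.ElementTree module, which implements only a limited subset of XPath. The sample document, element names, and variable names below are invented for illustration; this is a minimal sketch, not the way any particular XPath engine works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
XML = """
<Collection>
  <Album artistname="Artist A">
    <Track number="1">Forever</Track>
    <Track number="2">Always</Track>
  </Album>
  <Album artistname="Artist B">
    <Track number="1">Intro</Track>
  </Album>
</Collection>
"""

root = ET.fromstring(XML)  # the context node for the first location step

# Location step 1: Album elements that are children of the context node.
albums = root.findall("Album")

# Location step 2: evaluated once for each node produced by step 1.
tracks_by_step = [t for album in albums for t in album.findall("Track")]

# The same two steps written as a single path expression.
tracks_by_path = root.findall("Album/Track")

step_titles = [t.text for t in tracks_by_step]
path_titles = [t.text for t in tracks_by_path]
print(step_titles)
print(path_titles)
```

Both lists should hold the same Track elements in document order, because the single path is simply the two steps applied one after the other.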

 

Each location step can itself be composed of three parts: an axis, a node test, and a predicate. The axis specifies a search direction and depth from the current node. The node test is the heart of the location step, and is the part that will be evaluated against nodes along that axis to determine whether they match the criteria. The predicate is optional, and allows you to specify assertions that will be evaluated against any nodes matching the node test to further filter the results.

 

Axes. The axes include child, parent, descendant, ancestor, and a number of others. Child means to only look at nodes that are directly underneath the current node, and parent means to only look at the node directly above the current node. Descendant means to look at all nodes at any depth beneath the current node, and ancestor means to look at all nodes that are any depth above the current node - as long as there is a direct path from that node to the current node. Axes are specified with their name, followed by the :: operator. For example:

 

parent::Element1

 

The location step above will return a single node reference if the current node has a parent element named Element1, and will return no node reference otherwise. The node reference returned will be to the Element1 element itself. The Element1 part of the location step above is the node test. It tests the node found at the specified location for a match against the specified name. Elements are specified by name; attribute names are prefaced by the @ symbol. Wildcards are also allowed, using the * symbol. The child axis is implied if no axis is specified in a location step. For example:

 

*/Album/@artistname

 

The expression above first matches all element nodes that are children of the current node (the * wildcard), then matches all Album elements that are children of those child elements, and finally matches the artistname attribute of those Album elements, if such an attribute exists. The resulting node set would contain references to the artistname attribute nodes that satisfied the entire expression.
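Here is a rough Python approximation of that expression, again using the standard library's xml.etree.ElementTree. That module cannot select attribute nodes, so the final @artistname step is emulated by reading the attribute from each matched element; a full XPath engine would return the attribute nodes directly. The Shelf and Box elements and the artist names are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: Album elements sit two levels below the root.
XML = """
<Collection>
  <Shelf>
    <Album artistname="Artist A"/>
    <Album/>
  </Shelf>
  <Box>
    <Album artistname="Artist B"/>
  </Box>
</Collection>
"""

root = ET.fromstring(XML)  # the current (context) node

# "*/Album": any child element of the context node, then its Album children.
# No axis is written, so the child axis is implied for both steps.
albums = root.findall("*/Album")

# Emulate the "@artistname" step: keep the attribute value only where the
# attribute actually exists on the Album element.
artist_names = [
    album.get("artistname")
    for album in albums
    if album.get("artistname") is not None
]
print(len(albums))
print(artist_names)
```

Note that the Album element without an artistname attribute still matches the first two steps; it just contributes nothing to the final result.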

 

So if you start at the beginning of an XPath expression and take each location step in turn, you can visualize the result of each location step as a set of zero to many pointers to nodes within the document. Subsequent location steps are evaluated relative to those nodes, resulting in a new set of pointers, until all the location steps have been evaluated. The final set of pointers to nodes is the resulting node set of the XPath expression.

 

Node tests can also select the text contents of an element. You do this by specifying text() for the node test. For example:

 

Track/text()

 

The expression above would return the text contained inside the Track element. You can also return Boolean or numeric values from an expression using some of the built-in functions of the XPath language, such as the contains function or the count function, respectively.

 

Predicates. Appending a predicate to a location step adds a significant amount of expressiveness to the XPath language. Once the axis, node test, and previous location steps have set the context of a particular expression, a predicate can test assertions about the resulting nodes to further decide whether to declare a match or not. Predicates are specified between square brackets immediately following the node test. They do not change the current context of the expression, but they are evaluated relative to the context set by the location step. For example:

 

Track[@number='1']

 

The expression above matches any Track elements that are children of the current node (again, the child:: axis is implied if none is specified) and that have an attribute, named number, whose value is 1. The returned nodes are Track elements, not number attributes. The predicate is used to further restrict the results of the expression, but does not change the context or node type that the expression will return.

 

Multiple predicates can be strung together and represent an implicit AND Boolean operation. For example:

 

Track[@number='1'][text()='Forever']

 

The expression above would match all child Track elements whose number attribute was equal to 1, and whose text content was equal to Forever. You can also combine Boolean operators within a predicate to construct arbitrarily complex assertions about the nodes that an XPath expression will match.
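To see predicates narrowing a result, here is a small Python sketch built on the standard library's xml.etree.ElementTree. That module does not support the text() node test inside a predicate; it spells a text comparison as [.='value'] instead (Python 3.7 or later), so the second predicate below is an approximation of [text()='Forever'] rather than the same syntax. The track data is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
XML = """
<Album>
  <Track number="1">Forever</Track>
  <Track number="2">Always</Track>
  <Track number="3">Forever</Track>
</Album>
"""

album = ET.fromstring(XML)  # the context node

# One predicate: child Track elements whose number attribute equals 1.
numbered_first = album.findall("Track[@number='1']")

# A different single predicate: child Track elements whose text is Forever.
titled_forever = album.findall("Track[.='Forever']")

# Two predicates in a row act as an implicit AND.
both = album.findall("Track[@number='1'][.='Forever']")

for label, nodes in [("number=1", numbered_first),
                     ("text=Forever", titled_forever),
                     ("both", both)]:
    # Every result is a Track element; predicates filter, they do not
    # change the kind of node that is returned.
    print(label, [(n.tag, n.get("number")) for n in nodes])
```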

 

The effect of namespaces. The last thing to understand about XPath expressions is the effect that namespaces have on the specification of nodes within an expression. If elements or attributes within an XML document are scoped to a particular namespace, that namespace becomes part of the fully qualified name of that node. To specify that node as part of an XPath expression, you therefore need to include that namespace information as part of the node name.

 

How this is handled depends on the processing engine you're using to consume the XPath expressions. Typically, you'll need to associate a prefix with each namespace in the document from which you need to specify nodes. You then use that prefix as part of the node name in the XPath expression. For example:

 

descendant::musicns:Track

 

The expression above would match any Track elements underneath the current node at any level, where the Track element is declared within the namespace identified with the musicns prefix.

 

The association between a prefix and the actual namespace URI must be done through some means specific to the processing context. In the case of querying XML using XPath expressions in .NET, you do this by adding prefix and namespace URI pairs to an XmlNamespaceManager instance, then associating that namespace manager with the expression you are evaluating.
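The same idea of pairing prefixes with namespace URIs can be sketched outside .NET. In the Python example below, a plain dictionary passed to the standard library's xml.etree.ElementTree stands in for the role that XmlNamespaceManager plays in .NET; it is an analogy, not an equivalent API. ElementTree has no descendant:: axis syntax, so the abbreviated .// form is used for the descendant lookup. The namespace URI and the sample data are assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for this illustration.
MUSIC_NS = "urn:example:music"

# The document itself binds the namespace to the prefix "m".
XML = """
<Collection xmlns:m="urn:example:music">
  <Album>
    <m:Track>Forever</m:Track>
    <Track>Demo</Track>
  </Album>
</Collection>
"""

root = ET.fromstring(XML)

# Prefix-to-URI pairs for the query. The prefix chosen here ("musicns")
# does not have to match the one used in the document; only the URI matters.
namespaces = {"musicns": MUSIC_NS}

# Track elements in the music namespace, at any depth below the root.
qualified = root.findall(".//musicns:Track", namespaces)

# Track elements that are in no namespace at all.
unqualified = root.findall(".//Track")

qualified_titles = [t.text for t in qualified]
unqualified_titles = [t.text for t in unqualified]
print(qualified_titles)
print(unqualified_titles)
```

The two queries return different elements even though both are named Track, because the namespace is part of each node's fully qualified name.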

 

Step by step. Again, the key to understanding any XPath expression is to first take it one location step at a time, and then to break each location step into its axis, its node test, and its predicate(s). Starting with the first location step, determine what the resulting set of nodes is. Evaluate the successive location steps relative to the results of the preceding ones, until the entire XPath expression has been evaluated.
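As a reading aid, the decomposition described above can be mimicked with a few lines of Python. The helper names split_step and split_expression are hypothetical, and the parsing is deliberately naive: it handles only relative paths, assumes no slash or square bracket appears inside a predicate, and does not understand the abbreviated . and .. steps. It is a sketch for taking expressions apart on paper, not an XPath parser.

```python
import re

# One location step: optional "axis::", a node test, then zero or more
# "[predicate]" groups.
STEP_RE = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"     # optional axis, e.g. "descendant::"
    r"(?P<test>[^\[\]]+)"            # node test, e.g. "Track" or "@number"
    r"(?P<preds>(?:\[[^\]]*\])*)$"   # zero or more [predicate] groups
)


def split_step(step):
    """Split one location step into (axis, node test, predicates)."""
    match = STEP_RE.match(step)
    if match is None:
        raise ValueError(f"unsupported location step: {step!r}")
    axis = match.group("axis")
    test = match.group("test")
    if axis is None:
        if test.startswith("@"):
            # "@name" is shorthand for the attribute axis.
            axis, test = "attribute", test[1:]
        else:
            # The child axis is implied when no axis is written.
            axis = "child"
    predicates = re.findall(r"\[([^\]]*)\]", match.group("preds"))
    return axis, test, predicates


def split_expression(expression):
    """Naively split a relative path on "/" and decompose each step."""
    return [split_step(step) for step in expression.split("/")]


for axis, test, predicates in split_expression(
        "*/Album/Track[@number='1'][text()='Forever']"):
    print(f"axis={axis} node-test={test} predicates={predicates}")
```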

 

The download code for this article includes a simple .NET WinForms XPath Query Analyzer application that will allow you to play around with XPath expressions. You can load an XML document into the form, then enter XPath expressions to query the document (see Figure 1). The application will highlight the matching nodes by changing the font size and color. The application is not very efficient with the rendering of the document, so I don't recommend loading large documents into it. But it's a very handy little tool for trying out XPath expressions against a document to visualize what the results will be.

 


Figure 1: The XPath Query Analyzer application.

 

The best place to look for more information on XPath, including lots of examples, is in the MSXML 4.0 SDK documentation in MSDN. Under the XPath sections, there is comprehensive coverage of all the various combinations of axes, node tests, predicates, and built-in functions.

 

If you aren't already comfortable with XPath expressions, now is a great time to start getting used to them. They are vital to querying XML documents in .NET today, and are used by a number of related XML technologies that you may need now and in the future. In the next issue I'll cover how to execute XPath expressions in .NET to perform queries against XML data in memory.

 

The files referenced in this article are available for download.

 

Brian Noyes is a software architect with IDesign, Inc. (http://www.idesign.net), a .NET-focused architecture and design consulting firm. Brian is a Microsoft MVP in ASP.NET who specializes in designing and building data-driven distributed Windows and Web applications. Brian writes for a variety of publications and is working on a book for Addison-Wesley on building Windows Forms Data Applications with .NET 2.0. Contact him at brian.noyes@idesign.net.