View examples for writing XPaths
There are several ways to write an XPath that captures information from an HTML element. Choosing a strategy that suits the structure of the web page helps you capture the data you want more reliably.
Common axis names
The axis name specifies the direction in which the XPath navigates from the current element: toward the top or the bottom of the page, for example to an ancestor or to a sibling. If an axis has no abbreviated syntax, write the axis name directly in the XPath, followed by two colons (for example, following-sibling::p).
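As an illustration of how axes change direction, here is a minimal Python sketch. It assumes the third-party lxml package is installed, and the page snippet and its values are hypothetical.

```python
# Assumes the third-party lxml package is available (pip install lxml).
from lxml import html

# Hypothetical product snippet, used only for illustration.
PAGE = """
<div id="product">
  <h2>Television</h2>
  <p class="price">$499</p>
  <p class="stock">In stock</p>
</div>
"""

root = html.fromstring(PAGE)

# following-sibling:: moves forward to later elements that share the same parent.
price = root.xpath('//h2/following-sibling::p[@class="price"]/text()')

# preceding-sibling:: moves backward to earlier siblings.
heading = root.xpath('//p[@class="stock"]/preceding-sibling::h2/text()')

# ancestor:: climbs toward the top of the page.
container_id = root.xpath('//p[@class="price"]/ancestor::div/@id')

# With no axis written, a step uses the child axis: //div/p means //div/child::p.
children = root.xpath('//div/p/text()')

print(price)         # ['$499']
print(heading)       # ['Television']
print(container_id)  # ['product']
print(children)      # ['$499', 'In stock']
```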
Common HTML elements
Tag | Description |
---|---|
<div> | Division or section |
<h1> to <h6> | Heading |
<p> | Paragraph |
<a> | Hyperlink |
<ol> | Ordered list |
<ul> | Unordered list |
<li> | List item |
Common filters/predicates:
To apply a function to your XPath, add square brackets after the name of the HTML element and write the function inside them. Multiple functions can be applied to one element.
Attribute selector: Use the attributes to identify an element.
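A minimal sketch of an attribute selector, using Python's standard-library xml.etree.ElementTree, which supports only a limited subset of XPath. The snippet is hypothetical and deliberately well-formed; real pages usually need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet used only for illustration.
PAGE = """
<div>
  <p class="title">Television</p>
  <p class="price">$499</p>
  <a href="/tv" class="link">Details</a>
</div>
"""

root = ET.fromstring(PAGE)

# [@class='price'] keeps only <p> elements whose class attribute equals "price".
# ElementTree expects relative paths, hence the leading "." before "//".
prices = [p.text for p in root.findall(".//p[@class='price']")]

# [@href] keeps elements that have an href attribute, whatever its value.
links = [a.text for a in root.findall(".//a[@href]")]

print(prices)  # ['$499']
print(links)   # ['Details']
```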
Attribute contains function: Match only part of an attribute's value. This is helpful when the value is long or when there is a specific string within the value that you want.
Text contains function: Instead of using an attribute, use the text on the website to identify the element you want. For example, to select sponsored items, search for the word sponsored in the HTML and return the result.
Not contains: To exclude a specific element from your capture, add not before your contains function and wrap the contains function in its parentheses, as in not(contains(...)).
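The two contains functions and their negation might look like the following sketch. It assumes the third-party lxml package is installed, because contains() is not supported by Python's standard-library ElementTree. The list items and class names are hypothetical.

```python
# Assumes the third-party lxml package is available (pip install lxml).
from lxml import html

# Hypothetical search results, used only for illustration.
PAGE = """
<ul>
  <li class="item item-sponsored-large">Sponsored: Soundbar</li>
  <li class="item item-standard">Television</li>
  <li class="item item-standard">Sponsored: Speaker</li>
</ul>
"""

root = html.fromstring(PAGE)

# Attribute contains: match part of a long class value.
by_attribute = root.xpath('//li[contains(@class, "sponsored")]/text()')

# Text contains: match the visible text. contains() is case-sensitive.
by_text = root.xpath('//li[contains(text(), "Sponsored")]/text()')

# Not contains: keep only the items whose text does not include the word.
not_sponsored = root.xpath('//li[not(contains(text(), "Sponsored"))]/text()')

print(by_attribute)   # ['Sponsored: Soundbar']
print(by_text)        # ['Sponsored: Soundbar', 'Sponsored: Speaker']
print(not_sponsored)  # ['Television']
```

Because contains() is case-sensitive, "sponsored" in the class value and "Sponsored" in the text are different strings, so each query has to use the capitalization that appears in the HTML.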
And - or: Add multiple functions to identify your HTML element. For example, if the price you want to select has the attributes id="tv" and class="price", using both with and narrows the results to only those elements that match both tv and price. Use or instead to match elements that satisfy either condition.
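A minimal sketch of combining conditions with and and or, again assuming the third-party lxml package is installed. The id and class values follow the example above; the rest of the snippet is hypothetical.

```python
# Assumes the third-party lxml package is available (pip install lxml).
from lxml import html

# Hypothetical snippet, used only for illustration.
PAGE = """
<div>
  <span id="tv" class="price">$499</span>
  <span id="radio" class="price">$59</span>
  <span id="tv-name" class="title">Television</span>
</div>
"""

root = html.fromstring(PAGE)

# and: the element must match both conditions.
tv_price = root.xpath('//span[@id="tv" and @class="price"]/text()')

# or: the element may match either condition.
price_or_title = root.xpath('//span[@class="price" or @class="title"]/text()')

print(tv_price)        # ['$499']
print(price_or_title)  # ['$499', '$59', 'Television']
```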
Number operator: Adding a number inside brackets selects the HTML element at that position in numerical order. Selecting by numerical order is less reliable than specifying an element by an attribute such as class, because the order of HTML elements can change.
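The sketch below shows why a position can be less reliable than an attribute. It uses Python's standard-library xml.etree.ElementTree, and both lists are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical list as first captured.
ORIGINAL = """
<ol>
  <li class="featured">Television</li>
  <li>Soundbar</li>
  <li>Speaker</li>
</ol>
"""

# The same hypothetical list after the page changes its order.
REORDERED = """
<ol>
  <li>Soundbar</li>
  <li class="featured">Television</li>
  <li>Speaker</li>
</ol>
"""

before = ET.fromstring(ORIGINAL)
after = ET.fromstring(REORDERED)

# XPath positions start at 1, not 0; last() selects the final match.
first_before = before.find("./li[1]").text
last_before = before.find("./li[last()]").text

# The same position now points at a different item...
first_after = after.find("./li[1]").text

# ...while the attribute still finds the intended one.
featured_after = after.find("./li[@class='featured']").text

print(first_before, last_before)    # Television Speaker
print(first_after, featured_after)  # Soundbar Television
```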