XPath

A language for locating XML nodes

XPath stands for XML Path Language. It is a query language for selecting nodes from an XML document, and it lets you navigate the document's elements and attributes.

How do you use XPath? You write path expressions that identify the nodes you want to locate or select. A path expression evaluates to a node set; other XPath expressions can produce a string, number, or boolean (for example, count(//a) returns a number).

How does XPath work? A path expression is read one step at a time, much like the path to a file on a file system: /parent/child
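To try path expressions without a browser, the XPath engine bundled with the JDK (javax.xml.xpath) is enough. The sketch below is only an illustration: the library document, the class name PathExpressionDemo, and its helper methods are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PathExpressionDemo {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
          "<library>"
        + "<book id='b1'><title>Dune</title></book>"
        + "<book id='b2'><title>Emma</title></book>"
        + "</library>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Evaluates a path expression as a node set and returns its size.
    static int countNodes(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, parse(XML), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("bad expression: " + expression, e);
        }
    }

    // Evaluates an expression and returns its result converted to a string.
    static String asString(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, parse(XML));
        } catch (Exception e) {
            throw new IllegalStateException("bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countNodes("/library/book"));         // 2
        System.out.println(asString("/library/book[2]/title"));  // Emma
        System.out.println(asString("/library/book[1]/@id"));    // b1
    }
}
```

Each step in /library/book[2]/title narrows the selection by one level, which is why the expression reads like a file path.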

XPath Expressions:

  1. Tag-name[@attribute='value']
     Example: <input type="text" placeholder="Username">
     Expression: //input[@placeholder='Username']
     Explanation: Selects every input element whose placeholder attribute equals "Username".

  2. Tag-name[@attribute='value'][index]
     Example: <button class="submit">Submit</button> <button class="submit">Confirm</button>
     Expression: //button[contains(@class,'submit')][1]
     Explanation: Selects a button whose class attribute contains "submit" and that is the first such button under its parent; here that is the Submit button. The index is counted per parent, so to get the first match in the whole document, wrap the path in parentheses: (//button[contains(@class,'submit')])[1].

  3. Parent-Child and Sibling Relationship
     Example: <header><div><button>Submit</button><button>Cancel</button></div></header>
     Expression: //header/div/button[1]/following-sibling::button[1]
     Explanation: Walks from a header element to its child div and to that div's first button, then selects the first button sibling that follows it (the Cancel button). The leading // lets the header be found anywhere in the page; without it, the path only matches when header is a child of the node the expression is evaluated from.

  4. Parent Axis
     Example: <header><div><button>Submit</button></div></header>
     Expression: //header/div/button[1]/parent::div
     Explanation: Selects the parent div element of the first button inside a div that is a child of a header element.
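The four patterns above can be checked against a small page with the JDK's built-in XPath engine. This is a sketch under stated assumptions: the markup combines the snippets from the list into one hypothetical page, and ExpressionPatternsDemo and describe are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ExpressionPatternsDemo {
    // Hypothetical page assembled from the example snippets in the list.
    static final String PAGE =
          "<html><body>"
        + "<header><div>"
        + "<button class='submit'>Submit</button>"
        + "<button class='submit'>Cancel</button>"
        + "</div></header>"
        + "<form><input type='text' placeholder='Username'/></form>"
        + "</body></html>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample page", e);
        }
    }

    // Returns "tagName:textContent" for the first node the expression selects.
    static String describe(String expression) {
        try {
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, parse(PAGE), XPathConstants.NODE);
            return node == null
                    ? "no match"
                    : node.getNodeName() + ":" + node.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // 1. attribute match: prints "input:" (an input has no text content)
        System.out.println(describe("//input[@placeholder='Username']"));
        // 2. attribute match plus index: prints "button:Submit"
        System.out.println(describe("//button[contains(@class,'submit')][1]"));
        // 3. following-sibling axis: prints "button:Cancel"
        System.out.println(describe(
                "//header/div/button[1]/following-sibling::button[1]"));
        // 4. parent axis: prints "div:SubmitCancel"
        System.out.println(describe("//header/div/button[1]/parent::div"));
    }
}
```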

Relative vs Absolute XPath

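The main difference between relative and absolute XPath lies in how they locate elements: an absolute XPath spells out every step from the root of the HTML document down to the target, while a relative XPath identifies the target by its own traits (tag name, attributes, nearby elements) wherever it sits in the document.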

Let's explore these differences with Java examples:

Relative XPath Example:

WebElement usernameInput = driver.findElement(By.xpath("//input[@id='username']"));

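In this example, we use a relative XPath expression to locate an input element whose id attribute equals "username". The leading "//" matches the element at any depth in the HTML document, so none of its ancestors need to be named. Strictly speaking, a path that begins with "//" is still evaluated from the document root; "relative" is the usual test-automation term for a locator that does not list the full path. To search only below a particular element, start the path with ".//". Because relative XPath expressions do not depend on the surrounding layout, they adapt well to changes in the HTML structure, which makes them the preferred choice in many scenarios.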

Absolute XPath Example:

WebElement usernameInput = driver.findElement(By.xpath("/html/body/div[1]/div[2]/form/input[3]"));

Here, we use an absolute XPath expression to locate the same input element. The XPath starts from the root node (/html) and specifies the complete path to the element, including its position within the HTML structure. Absolute XPath expressions provide the full path from the root node to the target element, making them less flexible and more prone to breaking if there are changes in the HTML structure.

Key Differences:

  1. Flexibility: Relative XPath expressions match on an element's own attributes or on nearby elements, so they keep working when unrelated parts of the page move. Absolute XPath expressions depend on every step of the path from the root node and are less flexible.

  2. Maintainability: Relative XPath expressions are generally more maintainable because they are less affected by changes in the HTML structure. Absolute XPath expressions are brittle: inserting or removing a single wrapper element along the path means the locator must be updated.

  3. Reusability: A relative XPath pattern can often be reused for similar elements by changing one value, for example //input[@id='username'] and //input[@id='password']. An absolute XPath is tied to one element's exact position in the HTML structure and rarely carries over.

  4. Preferred Approach: Relative XPath is the preferred approach in most scenarios because of its flexibility and adaptability to changes in the HTML structure. Absolute XPath is typically used only when the HTML structure is fixed and known.

Here's an example to illustrate the syntax differences between relative and absolute XPaths:

Suppose we have the following HTML code:

<div class="container">
  <h1>Welcome to my website!</h1>
  <ul>
    <li><a href="https://example.com">Example</a></li>
    <li><a href="https://google.com">Google</a></li>
    <li><a href="https://yahoo.com">Yahoo</a></li>
  </ul>
</div>

Feature: Relative XPath vs Absolute XPath

Syntax

  • Relative XPath: //div[@class='container']//li/a

  • Absolute XPath: /html/body/div[1]/ul/li/a

Flexibility

  • Relative XPath: Can combine several ways of moving through the document tree, such as // to select descendants at any depth, . to select the current node, .. to select the parent node, and @ to select an attribute. For example, //div[@class='container']/h1 selects the h1 element that is a child of the div element with class='container'.

  • Absolute XPath: Relies on the exact position of each element in the HTML structure. For example, /html/body/div[1]/ul/li[2]/a selects the a element that is a child of the second li element, which is a child of the ul element, which is a child of the first div element under body.

Readability

  • Relative XPath: Uses short, descriptive expressions to identify elements. For example, //div[@class='container']//li/a selects all a elements that are children of li elements located anywhere inside the div element with class='container'.

  • Absolute XPath: Requires long expressions that specify the full path to an element. For example, /html/body/div[1]/ul/li[2]/a spells out every step down to the link in the second list item.

Maintainability

  • Relative XPath: Adapts to changes in the document tree because it does not depend on the absolute location of elements. For example, if we add a new div element around the ul element, the relative XPath //div[@class='container']//li/a still works.

  • Absolute XPath: May require updates when changes occur. For example, if we add a new div element around the ul element, the absolute XPath /html/body/div[1]/ul/li[2]/a must be updated to /html/body/div[1]/div[1]/ul/li[2]/a.

Performance

  • Relative XPath: Each // step makes the engine search every descendant of its starting point, so //div[@class='container']//li/a can visit more nodes than a direct path. On typical pages the difference is rarely noticeable.

  • Absolute XPath: Follows one child step at a time from the root, so /html/body/div[1]/ul/li[2]/a visits few nodes and is generally at least as fast. Speed is seldom a good reason to choose it, given its maintenance cost.
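The maintainability difference can be reproduced with the JDK's built-in XPath engine. The sketch below assumes the sample HTML above sits directly under html/body and is well-formed enough to parse as XML; LocatorRobustnessDemo and its helpers are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LocatorRobustnessDemo {
    static final String LIST =
          "<ul>"
        + "<li><a href='https://example.com'>Example</a></li>"
        + "<li><a href='https://google.com'>Google</a></li>"
        + "<li><a href='https://yahoo.com'>Yahoo</a></li>"
        + "</ul>";

    // The page as given, and a variant with an extra div around the list.
    static final String ORIGINAL = page(LIST);
    static final String WRAPPED = page("<div>" + LIST + "</div>");

    static String page(String listMarkup) {
        return "<html><body><div class='container'>"
             + "<h1>Welcome to my website!</h1>" + listMarkup
             + "</div></body></html>";
    }

    // Counts the nodes that the expression selects in the given markup.
    static int count(String expression, String html) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(html)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String relative = "//div[@class='container']//li/a";
        String absolute = "/html/body/div[1]/ul/li/a";

        System.out.println(count(relative, ORIGINAL)); // 3
        System.out.println(count(absolute, ORIGINAL)); // 3
        System.out.println(count(relative, WRAPPED));  // 3: still matches
        System.out.println(count(absolute, WRAPPED));  // 0: path is broken
        // The absolute path works again once the new div is added to it.
        System.out.println(count("/html/body/div[1]/div[1]/ul/li/a", WRAPPED)); // 3
    }
}
```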

What are the differences between single slash (/) and double slash (//) in XPath?

Single slash (/) - Selects from the root node, one level per step:

  • At the start of an expression, / anchors the path at the root of the document (an absolute XPath)

  • Between steps, / selects only direct children of the node matched so far

Example:

/html/body/div

This selects each div that is a direct child of body, which in turn is a direct child of the html root element.

Double slash (//) - Selects from anywhere in the DOM:

  • At the start of an expression, // matches nodes at any depth in the HTML document (a relative XPath in the sense used here)

  • Between steps, // matches descendants at any depth below the node matched so far, not only its direct children

Example:

//div

This selects every div element in the HTML document at any depth, not just the direct children of a particular node.
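The difference between / and // shows up as soon as elements are nested. A minimal sketch using the JDK's built-in XPath engine follows; the page with an outer and an inner div, and the names SlashDemo and ids, are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SlashDemo {
    // Hypothetical page: one div nested inside another.
    static final String PAGE =
        "<html><body><div id='outer'><div id='inner'/></div></body></html>";

    // Returns the id attribute of every element the expression selects.
    static List<String> ids(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Single slashes: only the div that is a direct child of body.
        System.out.println(ids("/html/body/div"));      // [outer]
        // Leading double slash: every div, at any depth (both divs).
        System.out.println(ids("//div"));
        // Double slash between steps: divs anywhere below the outer div.
        System.out.println(ids("/html/body/div//div")); // [inner]
    }
}
```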

Rules of XPath

XPath is used to navigate and locate XML/HTML elements.

Absolute XPath Rules

Absolute XPath provides the full path from the root element.

  • Starts with a single slash / at the root element

  • A double slash // inside the path (for example /html/body//img) selects descendants at any depth instead of direct children

  • Specifies every step from the root to the target node

  • Breaks easily when the HTML structure changes

<html>
 <body>
  <div>
   <img src="logo.png"/>
  </div>
 </body>
</html>

// Select the image by its absolute path from the root
WebElement logo = driver.findElement(By.xpath("/html/body/div/img"));

Relative XPath Rules

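Relative XPath locates a node without specifying the full path from the root.

  • Usually starts with a double slash //, which searches the whole document

  • Can start from an already-located node instead of the root by beginning with . or .//

  • More resilient to changes

  • A dot . refers to the current (context) node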

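// Find the first img anywhere in the document
WebElement img = driver.findElement(By.xpath("//img"));

// Search relative to an already-located element: "." is that element
WebElement div = driver.findElement(By.xpath("//div"));
WebElement imgInDiv = div.findElement(By.xpath("./img"));

// Selenium locators must return elements, so "./@src" is rejected;
// read the attribute from the element instead
String src = img.getAttribute("src");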
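How the current node changes the result can be seen outside Selenium with the JDK's built-in XPath engine, which, unlike a Selenium locator, can return attribute values directly. This is a sketch under assumptions: the page adds a second, hypothetical div with its own image to the markup above, and ContextNodeDemo and evalFrom are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ContextNodeDemo {
    // Hypothetical page: two divs, each holding one image.
    static final String PAGE =
          "<html><body>"
        + "<div><img src='logo.png'/></div>"
        + "<div><img src='photo.png'/></div>"
        + "</body></html>";

    // Locates a context node with contextPath, then evaluates expression
    // with that node as the current node and returns the string result.
    static String evalFrom(String contextPath, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node context = (Node) xpath.evaluate(
                    contextPath, doc, XPathConstants.NODE);
            return xpath.evaluate(expression, context);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String secondDiv = "/html/body/div[2]";
        // "." starts at the context node, so only the second div is searched.
        System.out.println(evalFrom(secondDiv, "./img/@src"));  // photo.png
        System.out.println(evalFrom(secondDiv, ".//img/@src")); // photo.png
        // A leading "//" ignores the context node and searches the document.
        System.out.println(evalFrom(secondDiv, "//img/@src"));  // logo.png
        // With the first img as the context node, "./@src" reads its attribute.
        System.out.println(evalFrom("//img", "./@src"));        // logo.png
    }
}
```

The third call is the common pitfall: an expression that starts with // searches the whole document even when it is evaluated from a nested node, so .// is the form to use for a scoped search.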

Best Practices

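  • Prefer relative XPath over absolute

  • Use a unique id attribute if one is available, for example //input[@id='username']

  • When several nodes match, pick one by index, for example (//li)[2]; the parentheses make the index apply to the whole result rather than to each parent separately

  • Avoid long, complex XPath expressions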
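The indexing advice hides a subtlety: an index written directly after a step is counted per parent, while an index after a parenthesized path is counted over the whole result. The sketch below illustrates this with the JDK's built-in XPath engine; the two-list page and the names IndexingDemo and texts are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class IndexingDemo {
    // Hypothetical page: two lists of two items each, plus one input with an id.
    static final String PAGE =
          "<html><body>"
        + "<ul><li>A1</li><li>A2</li></ul>"
        + "<ul><li>B1</li><li>B2</li></ul>"
        + "<input id='username'/>"
        + "</body></html>";

    // Returns the text content of every node the expression selects.
    static List<String> texts(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Second li under each parent: one match per list.
        System.out.println(texts("//li[2]"));   // [A2, B2]
        // Second li in the whole document: exactly one match.
        System.out.println(texts("(//li)[2]")); // [A2]
        // Third li in the whole document: the first item of the second list.
        System.out.println(texts("(//li)[3]")); // [B1]
        // A unique id needs no index at all.
        System.out.println(texts("//input[@id='username']").size()); // 1
    }
}
```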

Following these XPath rules and best practices helps create resilient and maintainable locators for test automation.
