# XPath Injection

*XPath* is a query language for looking up data within an XML document. If your web server looks up data in XML configuration files or an XML database, you need to ensure that XPath expressions are never built from untrusted input. Otherwise, an attacker can perform an **XPath injection** attack, crafting input that changes the meaning of the expression so that it returns data they should not have access to.

## XPath Injection in Python

XPath expressions can be evaluated in Python using the `lxml` package. The standard library `xml.etree.ElementTree` API also offers limited support for XPath expressions via the `findall(…)` method. Whichever you use, avoid constructing XPath queries by interpolating untrusted content into the expression string: a quote character in the input can end a string literal early and append extra conditions to the query.
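To make the risk concrete, here is a minimal sketch of a vulnerable credential check using `lxml`. The XML user store and the function name `credentials_are_valid_unsafe` are hypothetical, chosen only for illustration; the point is that the password value is spliced directly into the expression text.

```python
from lxml import etree

# Hypothetical user store, for illustration only.
USERS_XML = """<users>
  <user username="alice" password="wonderland"/>
  <user username="bob" password="builder"/>
</users>"""


def credentials_are_valid_unsafe(tree, username, password):
    # VULNERABLE: untrusted values are interpolated into the expression text.
    expression = "/users/user[@username='%s' and @password='%s']" % (
        username,
        password,
    )
    return bool(tree.xpath(expression))


tree = etree.fromstring(USERS_XML)

# Normal behaviour: a correct password matches, a wrong one does not.
legit = credentials_are_valid_unsafe(tree, "alice", "wonderland")
wrong = credentials_are_valid_unsafe(tree, "alice", "guess")

# The injected quote closes the string literal and appends an
# always-true clause, so the predicate matches every user.
bypass = credentials_are_valid_unsafe(tree, "alice", "' or '1'='1")
```

With the injected value, the expression becomes `/users/user[@username='alice' and @password='' or '1'='1']`. Because `and` binds more tightly than `or` in XPath, the trailing `'1'='1'` makes the predicate true for every `user` element, and the check passes without a valid password.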

## Mitigation

To safely generate an XPath expression from untrusted input such as an HTTP parameter, use *bind parameters*, which `lxml` accepts as keyword arguments to `xpath(…)`:

```python
def credentials_are_valid(tree, username, password):
    # Constructing the XPath expression with named placeholders is safer.
    expression = "/users/user[@username=$username and @password=$password]"

    return tree.xpath(expression, username=username, password=password)
```

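Because the values are handed to the XPath engine as variables rather than spliced into the expression text, quotes or operators in the input are treated as plain string data and cannot change the structure of the query.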
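As a rough check of this behaviour, the sketch below runs the parameterized function against a classic quote-based injection payload. The XML user store is a hypothetical stand-in, not part of any real application.

```python
from lxml import etree

# Hypothetical user store, for illustration only.
USERS_XML = """<users>
  <user username="alice" password="wonderland"/>
  <user username="bob" password="builder"/>
</users>"""


def credentials_are_valid(tree, username, password):
    # $username and $password are XPath variables bound by lxml.
    expression = "/users/user[@username=$username and @password=$password]"
    return tree.xpath(expression, username=username, password=password)


tree = etree.fromstring(USERS_XML)

# A genuine login returns the matching <user> element.
matches = credentials_are_valid(tree, "alice", "wonderland")

# The payload is compared literally against the password attribute,
# so no element matches and the result is an empty list.
injected = credentials_are_valid(tree, "alice", "' or '1'='1")
```

Note that `xpath(…)` returns a list of matching elements, so callers that need a strict boolean may want to wrap the result in `bool(…)`.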

## Further Considerations

When parsing untrusted XML in Python, consider the `defusedxml` package, which offers drop-in replacements for the XML parsers in the standard library, hardened against parser-level attacks such as entity expansion and external entity resolution. It does not prevent XPath injection by itself, so bind parameters are still required, but it is *strongly* recommended that you switch to this package if you want to avoid XML exploits in your Python code.
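A minimal sketch of the drop-in usage, assuming `defusedxml` is installed: its `fromstring` mirrors the one in `xml.etree.ElementTree` but, by default, raises `EntitiesForbidden` when a document declares entities. The helper `parse_untrusted` and both sample documents are hypothetical.

```python
from defusedxml import EntitiesForbidden
from defusedxml.ElementTree import fromstring

# Hypothetical payload: a tiny stand-in for an entity-expansion attack.
MALICIOUS_XML = """<!DOCTYPE users [<!ENTITY a "aaaa">]>
<users><user username="&a;"/></users>"""

SAFE_XML = '<users><user username="alice"/></users>'


def parse_untrusted(xml_text):
    """Return the root element, or None if the document declares entities."""
    try:
        return fromstring(xml_text)
    except EntitiesForbidden:
        return None


root = parse_untrusted(SAFE_XML)
rejected = parse_untrusted(MALICIOUS_XML)
```

Here the ordinary document parses as usual, while the one carrying an entity declaration is refused before any expansion takes place.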

## CWEs

* [CWE-643](https://cwe.mitre.org/data/definitions/643.html)

