# Host Header Poisoning
The `Host` header in an HTTP request is set by the browser and can be used by backend servers to distinguish requests for the different domains served from the same IP address. An attacker can easily spoof the header, however, so server-side code should not generate URLs from the supplied value.
## The “Host” Header in Python
Most Python web applications do not know what domain they are being hosted on. Domain names are registered in the *Domain Name System*, which tells a browser to route internet traffic for a domain to a number of IP addresses. HTTP and HTTPS traffic sent to these IP addresses will be routed to a load-balancer, which then distributes the requests amongst various web-servers. The Python application will sit downstream of this, unaware of how the requests entered the network, passing responses back along the same route.
Browsers will, however, set a `Host` header in each HTTP request, indicating the intended domain of the HTTP request. This can be used by server-side code to disambiguate requests on the same IP address requesting content for different domains, but this relies on trusting that the user-agent set the header authentically. In fact, there’s generally no way to check whether the domain in the `Host` header corresponds to the IP address the underlying TCP handshake was initiated with. A malicious user-agent – like a script run by a hacker – can set the `Host` header to anything they choose.
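To see how little effort spoofing takes, the following minimal sketch starts a throwaway local server that echoes back whatever `Host` header it receives, then sends it a request claiming to be for a different domain. The handler class, the helper function, and the `evil.example` domain are all illustrative names, not part of any real application.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reply with whatever Host header the client claimed.
        body = self.headers.get("Host", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Silence per-request logging.

def fetch_with_host(port, host_header):
    # Connect to the loopback address, but claim a different domain.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers={"Host": host_header})
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()

# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), EchoHostHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    seen_by_server = fetch_with_host(server.server_port, "evil.example")
finally:
    server.shutdown()
    server.server_close()
```

Although the TCP connection was made to `127.0.0.1`, the server sees `evil.example` as the host, and nothing in the request lets it tell the difference.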
### Relative vs Absolute URLs
Most of the time a Python application doesn’t actually need to know what domain it is running under, provided that links in HTML are *relative URLs* and client-side imports reference relative URLs in the same manner:

```html
<script src="js/navigation.js"></script>
<a href="/profile">Profile</a>
```
Using relative URLs is best practice in web development: they make it easier to deploy the web code under different environments. However, there are some circumstances when absolute URLs *are* required, such as when generating website links in transactional emails. If a user requests a password reset email, for example, the link in the email needs to contain the full domain of your website, since the user will be navigating from an external source (the email client).
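The way a client resolves relative URLs against the page it is currently on can be illustrated with `urllib.parse.urljoin` from the standard library. This is only a sketch of the resolution rule; the `www.example.com` address is a placeholder.

```python
from urllib.parse import urljoin

# Hypothetical page the browser is currently displaying.
current_page = "https://www.example.com/account/settings"

# A root-relative link keeps the scheme and host of the current page.
profile_url = urljoin(current_page, "/profile")

# A path-relative reference is resolved against the current directory.
script_url = urljoin(current_page, "js/navigation.js")

# An email client has no current page, so there is nothing to resolve against.
unresolved = urljoin("", "/profile")
```

Here `profile_url` is `https://www.example.com/profile` and `script_url` is `https://www.example.com/account/js/navigation.js`, while `unresolved` stays as `/profile`, which is why links in emails must carry the full domain.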
### Generating Absolute URLs
In the password reset scenario, it is tempting to simply take the domain from the `Host` header. However, remember this can be set to any value an attacker chooses, so the following function is open to abuse:
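```python
import os
import smtplib
from email.message import EmailMessage

def send_password_reset_email(request, user):
    message = EmailMessage()

    # Danger: this value for the "Host" header is under the control of a hacker.
    host = request.headers["Host"]
    # Assumed for illustration: request.headers is a mapping, and the /reset path
    # and user.reset_token stand in for your own reset route and token.
    password_reset_url = "https://" + host + "/reset?token=" + user.reset_token
    message.set_content("Click here to reset your password: " + password_reset_url)

    smtp = smtplib.SMTP(os.environ.get("SMTP_HOST"))
    # Assumed for illustration: SMTP_FROM and user.email name the sender and recipient.
    smtp.send_message(message, from_addr=os.environ.get("SMTP_FROM"), to_addrs=user.email)
```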
An attacker could abuse this function to send a password reset email to a victim with a link back to a malicious website under their control. If their site looks confusingly similar to yours – and has a similar enough domain name that the victim does not notice – the attacker has an easy way to harvest credentials.
For this reason **you should always take the domain name for absolute URLs from a configuration file stored securely on the server**. A safe way of generating password reset links is shown:
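```python
import os
import smtplib
from email.message import EmailMessage

def send_password_reset_email(user):
    message = EmailMessage()

    # Take the domain from configuration rather than the request.
    # Assumed for illustration: SITE_DOMAIN stands in for your own configuration
    # setting, and the /reset path and user.reset_token for your route and token.
    domain = os.environ["SITE_DOMAIN"]
    password_reset_url = "https://" + domain + "/reset?token=" + user.reset_token
    message.set_content("Click here to reset your password: " + password_reset_url)

    smtp = smtplib.SMTP(os.environ.get("SMTP_HOST"))
    # Assumed for illustration: SMTP_FROM and user.email name the sender and recipient.
    smtp.send_message(message, from_addr=os.environ.get("SMTP_FROM"), to_addrs=user.email)
```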
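One possible way to keep the domain in a configuration file is sketched below using the standard library's `configparser`. The `[site]` section, the `base_url` key, the `/reset-password` path, and the token value are all assumptions made for illustration; a real application would load the file from disk with `config.read(path)` and use its own route.

```python
import configparser
from urllib.parse import urlencode, urljoin

# Hypothetical configuration contents, inlined here so the sketch is
# self-contained. In practice this would be a file on the server.
EXAMPLE_CONFIG = """
[site]
base_url = https://www.example.com
"""

def load_base_url(config_text):
    config = configparser.ConfigParser()
    config.read_string(config_text)
    return config["site"]["base_url"]

def build_password_reset_url(base_url, token):
    # The request is never consulted, so a spoofed Host header cannot
    # influence the resulting link.
    query = urlencode({"token": token})
    return urljoin(base_url, "/reset-password") + "?" + query

reset_url = build_password_reset_url(load_base_url(EXAMPLE_CONFIG), "abc123")
```

Because the function takes no request argument at all, there is no path by which attacker-supplied headers can reach the generated link.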
## Mitigation
In summary, you can protect your site against `Host` header poisoning by:
* Using relative URLs wherever possible.
* Taking the domain name from server-side configuration where absolute URLs are required, such as in transactional emails.