The original article can be read here: AppSec Monkey - XSS. If you prefer a YouTube video, there is one here as well.
What are XSS vulnerabilities?
XSS (Cross-Site Scripting) vulnerabilities arise when untrusted data gets interpreted as code in a web context. They usually result from:
- Generating HTML unsafely (parameterizing without encoding properly).
- Allowing users to edit HTML directly (WYSIWYG editors for example).
- Allowing users to upload HTML/SVG files and serving those back unsafely.
- Using JavaScript unsafely (passing untrusted data into executable functions/properties).
- Using outdated and vulnerable JavaScript frameworks.
How to prevent XSS vulnerabilities?
Follow these steps:
- Generate HTML safely using a templating engine, or use a static JavaScript frontend to avoid HTML generation altogether.
- If you display untrusted HTML content on your website, purify it first and contain it in a sandboxed frame.
- Serve all downloads with a proper Content-Disposition header to prevent user-supplied HTML/SVG from being rendered in your origin.
- Don't pass untrusted data into executable JavaScript functions/properties such as eval, innerHTML, or href.
- Use well-known components with a good security history and keep them up to date.
- Implement a proper CSP (Content Security Policy).
What is untrusted data?
Before we begin, let's quickly touch on this point. Untrusted data, for the sake of this article, is anything that is or could be controlled by someone or something other than your web application.
User input is one clear example. But you should also consider any data retrieved from external sources, even your database or API, as potentially dangerous and render it with proper safety measures.
A good rule of thumb is that if it's not a static resource then it's untrusted data at least on some level.
Why are XSS vulnerabilities bad?
Being able to execute JavaScript code on a website in other people's browsers is equivalent to logging in to the hosting server and changing the HTML files served to the affected users.
As such, XSS attacks effectively let the attacker act as the logged-in victim, with the nasty addition of being able to trick the victim into giving some information (such as their password) to the attacker, or perhaps into downloading and executing malware on their workstation.
1. Generate HTML safely
A simple example
Here is a vulnerable PHP script:
echo "<p>Search results for: " . $_GET('search') . "</p>"
It is vulnerable because the HTML is generated unsafely: the search parameter is not encoded properly. An attacker can create a link such as the following, which would execute the attacker's JavaScript code in the victim's browser when the link is opened.
https://www.example.com/?search=<script>alert("XSS")</script>
Results in HTML like:
<p>Search results for: <script>alert("XSS")</script></p>
The importance of encoding
So how, then, can you safely display the value <script>alert("XSS")</script> in your HTML? The answer is HTML entity encoding:
& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;
PHP has a function called htmlspecialchars that performs this operation. So if we change our script a little bit (this is a horrible legacy approach, but it suffices for demonstration):
echo "<p>Search results for: " . htmlspecialchars($_GET['search']) . "</p>";
...the resulting HTML will be safe.
<p>Search results for: &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;</p>
Encoding contexts
HTML entity encoding is only appropriate when you put data between HTML tags or inside quoted HTML attributes.
If your variables go inside JavaScript strings or URL addresses, you need a different encoding function for that.
And never forget to quote your HTML attributes and JavaScript strings, or no encoding in the world will save you.
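For example, if you build a URL on the client side from untrusted input, a URL-context encoder such as encodeURIComponent is the right tool. A minimal sketch (the search-link element id is made up for illustration):
// A minimal sketch: URL-context encoding on the client side.
// encodeURIComponent handles characters that are dangerous inside a URL query string;
// it is NOT a substitute for HTML entity encoding in HTML contexts.
var untrustedSearch = 'rock & roll <script>';
var link = 'https://www.example.com/?search=' + encodeURIComponent(untrustedSearch);
// The fixed https:// prefix means the attacker cannot turn this into a javascript: URL.
document.getElementById('search-link').setAttribute('href', link);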
Also, never put untrusted data, encoded or not, into HTML attributes that get registered as DOM event handlers. These are onClick, onMouseEnter, and friends.
There are quite a few things that should be accounted for and OWASP has curated a nice list of them here:
Cross-Site Scripting Prevention Cheat Sheet
However, I wouldn't advise you to focus on that too much, because... are we really happy with this?
echo "<p>Search results for: " . htmlspecialchars($_GET('search')) . "</p>"
No. That's horrible. Mixing presentation and code is so 90's.
$ rm legacy.php
Template engines to the rescue
What you should do instead is have your controller method render a template with the data that you want to display. In the case of PHP, Twig is a good option. You would have search.html.twig with the following content:
<p>Search results for: {{search}}</p>
And your search parameter would be automatically escaped by the template engine.
There are good template engines for all programming languages worth their salt. Jinja for Python, Thymeleaf for Java, and so on.
...or just don't generate HTML at all!
Another great way to avoid the dangers of HTML generation is to not generate HTML at all. You can do this by creating a static HTML/JavaScript frontend and perhaps a backend API.
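As a rough illustration, a static frontend can fetch data from a backend API and insert it into the page as plain text rather than HTML. A minimal sketch (the /api/search endpoint and its response shape are made up):
// Minimal sketch: a static frontend rendering API data as text, not HTML.
// The /api/search endpoint and the "query" field are made up for illustration.
fetch('/api/search?q=' + encodeURIComponent('kittens'))
  .then(function (response) { return response.json(); })
  .then(function (data) {
    var p = document.createElement('p');
    // textContent treats the value as plain text, so no HTML or scripts are parsed.
    p.textContent = 'Search results for: ' + data.query;
    document.body.appendChild(p);
  });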
2. Purify and sandbox untrusted content
There are scenarios where you might want to render content that you don't fully trust. Maybe you want your users to create HTML in a WYSIWYG editor, or perhaps you want to download an HTML response from a third party and display it to the user.
Whatever the use case, the solution is the same. Purify and sandbox.
Purify
Purifying is the act of removing any dangerous parts from an HTML string. This can be done on the client side with DOMPurify, or on the server side with several tools such as the OWASP Java HTML Sanitizer for Java or Mozilla's bleach for Python. Just pick a well-esteemed one.
- https://github.com/cure53/DOMPurify
- https://owasp.org/www-project-java-html-sanitizer/
- https://github.com/mozilla/bleach
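On the client side, DOMPurify usage looks roughly like this (a minimal sketch; the content element id is made up):
// Minimal sketch of client-side purification with DOMPurify.
var untrustedHtml = "<h1>test<\/h1><script>alert('evilness');<\/script>";
// DOMPurify.sanitize strips script tags, event handler attributes,
// javascript: URLs and other dangerous constructs from the markup.
var cleanHtml = DOMPurify.sanitize(untrustedHtml);
document.getElementById('content').innerHTML = cleanHtml;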
Sandbox
Purifying is a good first step but I wouldn't leave my website's security hanging on one thing like that. Especially since there is a fantastic control that we can easily use to present untrusted HTML content: sandboxed iframes!
Sandboxed iframes run, by default, in their own unique origin: if anything bad happens inside the frame, it cannot touch your website. Sandboxed iframes also prevent script execution and even link navigation by default. Very useful for our purposes!
Here is an example of a non-sandboxed frame. If you run it, you should see an alert box with the message "evilness".
var untrustedHtml = "<h1>test<\/h1><script>alert('evilness');<\/script>";
document.getElementById('test-frame').src = "data:" + "text/html;charset=utf-8," + escape(untrustedHtml);
Try it here!
Here is another fiddle with the sandbox attribute specified. Notice that this time the script does not get executed.
var untrustedHtml = "<h1>test<\/h1><script>alert('evilness');<\/script>";
document.getElementById('test-frame').src = "data:" + "text/html;charset=utf-8," + escape(untrustedHtml);
Try it here!
Such is the magic of sandboxed frames.
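The fiddles above load the untrusted HTML into an existing iframe; the protection itself comes from the sandbox attribute on that iframe. Here is a sketch of doing the whole thing programmatically (using the same untrusted HTML as above):
// Sketch: create an iframe, sandbox it, then load the untrusted HTML into it.
var frame = document.createElement('iframe');
// An empty sandbox attribute applies all restrictions:
// unique origin, no scripts, no forms, no top-level navigation.
frame.setAttribute('sandbox', '');
var untrustedHtml = "<h1>test<\/h1><script>alert('evilness');<\/script>";
frame.src = 'data:text/html;charset=utf-8,' + encodeURIComponent(untrustedHtml);
document.body.appendChild(frame);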
3. Serve downloads properly
When you allow users to upload files, there is a risk that they upload a malicious HTML, SVG, etc. file that will then be downloadable from your domain. And if the download is not served properly, it might open in a web browser just like any other content on your website, and contain for example malicious JavaScript code.
You should of course prevent users from uploading anything except whitelisted extensions and validate the file contents as well. But just to be safe, serve all downloads so that, instead of being rendered in the browser, a "save file" dialog is presented to the user.
This can be achieved via the Content-Disposition header. By specifying attachment, you tell browsers to show the save file dialog.
Content-Disposition: attachment; filename="filename.jpg"
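In a Node.js backend, for example, serving a download could look roughly like this (a sketch; the file path and filename are made up):
// Minimal sketch: forcing a download with Node's built-in http module.
const http = require('http');
const fs = require('fs');

http.createServer(function (req, res) {
  // attachment tells the browser to save the file instead of rendering it.
  res.setHeader('Content-Disposition', 'attachment; filename="upload.svg"');
  // A generic content type further discourages in-browser interpretation.
  res.setHeader('Content-Type', 'application/octet-stream');
  fs.createReadStream('./uploads/upload.svg').pipe(res);
}).listen(8080);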
4. Use JavaScript safely
Not all XSS vulnerabilities arise from unsafe HTML generation on the server. Sometimes the vulnerability lives in your JavaScript code itself.
To get such a vulnerability, all you have to do is pass untrusted data into a function or property that either executes something or changes the HTML or the href of something.
Here are a couple of examples just so that you get the idea.
- Passing untrusted data to jQuery's append. This is an example of passing untrusted data to a function that writes HTML.
var untrusted = "hey<script>alert('xss')<\/script>";
$('#content').append(untrusted);
Try it here!
- Passing untrusted data to an href attribute. This example demonstrates that not even React applications are safe from XSS if you don't know what you are doing (click the user's homepage link to see).
class VulnerableApp extends React.Component {
constructor(props) {
super(props)
this.state = {
usersHomepage: "javascript:alert('xss')"
}
}
render() {
return (
<div>
<p>User's homepage: <a href={this.state.usersHomepage}>User's homepage</a></p>
</div>
)
}
}
ReactDOM.render(<VulnerableApp />, document.querySelector("#app"))
Try it here!
- Passing untrusted data to eval. This is an example of passing untrusted data to a function that executes something. This simple calculator will execute code if you enter a value like ;alert('xss'); into one of the operands.
function calculate() {
var op1 = $('#operand1').val();
var op2 = $('#operand2').val();
var answer = eval(op1 + ' + ' + op2);
$('#answer').text(answer);
}
Try it here!
There is a virtually infinite list of functions and properties that you shouldn't pass untrusted data into, so I couldn't possibly list them all here. They include things like innerHTML, outerHTML, setTimeout and so on, and of course, the JavaScript libraries that you use have their own, like the jQuery example above.
It's better to be safe than sorry, so check the documentation for the function/property before assigning/appending untrusted data into it.
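When in doubt, prefer APIs that treat data as data. A few safer counterparts to the examples above (a sketch, not an exhaustive list; the element ids are the ones from the earlier snippets):
// Safer counterparts to the examples above (not an exhaustive list).
var untrusted = "hey<script>alert('xss')<\/script>";

// jQuery .text() writes text, not HTML (unlike .append with an HTML string).
$('#content').text(untrusted);

// Plain DOM equivalent: textContent instead of innerHTML.
document.getElementById('content').textContent = untrusted;

// For the calculator: parse numbers instead of eval-ing strings.
var op1 = Number($('#operand1').val());
var op2 = Number($('#operand2').val());
$('#answer').text(op1 + op2);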
5. Use well-known JavaScript libraries and keep them up to date
Don't use NPM packages with a small number of downloads, because they can have vulnerabilities and even contain purposefully malicious code. Try to use well-known libraries with a decent security record instead.
Even the best of libraries have vulnerabilities now and then, so you should keep them up to date. You can use tools such as retire.js and npm audit to scan your web application for vulnerable, outdated JavaScript libraries.
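For example, in an npm-based project (retire.js also ships as a command-line tool):
$ npm audit     # report known vulnerabilities in installed dependencies
$ npx retire    # scan the project for known-vulnerable JavaScript libraries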
6. Implement a Content Security Policy
Content Security Policy (CSP) is an amazing browser security feature that can armor your web application against XSS in case, despite your best efforts to follow the guidelines in this article, you still end up with a vulnerability in your application.
CSP restricts what your web application can do, for example which resources it can load and which scripts it can execute. Like the iframe sandbox described above, CSP restricts everything by default, and then you add exceptions for the resources that you do need.
Here is a good policy to get started with. It prevents lots of things, but most importantly:
- It prevents eval and friends.
- It prevents inline JavaScript tags.
- It prevents loading JavaScript files from external domains.
- It prevents javascript: URLs.
Content-Security-Policy: script-src 'self'; form-action 'self'; object-src 'none';
So basically it would prevent all of the XSS vulnerabilities we have described so far from being meaningfully exploited. The form-action directive even prevents the attacker from inserting a fake form on the page that asks for e.g. the victim's password and submits it to the attacker's server.
So what's the catch? Your inline scripts won't work either. For CSP to work in this simple form, you will have to refactor your code in such a way that you won't break your own rules.
- You don't use inline scripts.
- You don't use inline DOM event handlers (onClick, etc).
- You don't use eval, and the scripts/frameworks that you use don't use it either.
This is where you start adding exceptions. If you absolutely must use a JavaScript framework that uses eval, then you will have to specify 'unsafe-eval'. And if you want to load a script from an external domain, then you will have to add that URL to the script-src directive.
These are not the end of the world. But please, whatever you do, do not specify 'unsafe-inline' in script-src, because then you downgrade your CSP to the point where it's almost useless.
If you absolutely must have inline JavaScript tags, use CSP nonces and/or hashes to allow those specific tags. I won't go into detail about that in this article but you can read more about this approach here:
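As a rough illustration of the nonce approach (the nonce value below is a placeholder; real nonces must be unpredictable and freshly generated for every response):
Content-Security-Policy: script-src 'self' 'nonce-R4nd0mV4lu3'

<script nonce="R4nd0mV4lu3">
  // Only script tags carrying the matching nonce are allowed to run.
  console.log("allowed by the nonce");
</script>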
When you are done with your policy, you can use Google's CSP evaluator to check it.
Bonus: Implement Strict SameSite cookies
There is one more browser security feature that you can use to harden your application against reflected XSS attacks, that is, attacks that originate through malicious links or websites. And that is strict SameSite cookies.
The crux of it is that you can set your session cookies with the SameSite=Strict attribute, and then web browsers will no longer send the user's session cookie in requests that originate from other websites, even if they are GET requests.
Set-Cookie: SessionId=123; SameSite=Strict
The catch is that links to your application will break in the sense that your logged-in user will appear logged out in any tab/window opened from anywhere other than your own website. But if that doesn't bother you, do take advantage of SameSite=Strict! At any rate, you should at least use SameSite=Lax, which protects against CSRF (Cross-Site Request Forgery) vulnerabilities.
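In a Node.js backend, issuing such a cookie could look roughly like this (a sketch, reusing the cookie from the header example above):
// Minimal sketch: issuing a strict SameSite session cookie with Node's http module.
const http = require('http');

http.createServer(function (req, res) {
  // HttpOnly and Secure are sensible companions for a session cookie.
  res.setHeader('Set-Cookie', 'SessionId=123; SameSite=Strict; HttpOnly; Secure; Path=/');
  res.end('Hello');
}).listen(8080);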
Conclusion
It is quite possible to avoid XSS vulnerabilities and, as an additional layer of security, to mitigate them with CSP. But this requires using modern technologies, knowing what you are doing, and implementing an effective CSP header.
Don't stop here
If you like this article, check out the other application security guides we have on AppSec Monkey as well.
Thanks for reading.