Rodrigo Javornik

Posted on Sep 13, 2023 • Edited on Sep 28, 2023

XSS Attack - Why strip_tags is not enough

#php #xss #security #programming

In PHP, it is common to use the strip_tags() function as a way to prevent XSS. However, this function does not reliably mitigate this type of attack and gives a false sense of security. But why?

What is XSS?

XSS (Cross-Site Scripting) is a form of attack that occurs when an attacker exploits a vulnerability in a web application to insert malicious scripts into its pages. These scripts are executed in the browsers of the application's users and can compromise sensitive information, allow session theft, redirect to other sites, etc.

Why doesn't strip_tags work?

The strip_tags() function is commonly used to remove HTML and PHP tags from a string. However, it is not designed to handle all forms of malicious input that can lead to XSS (Cross-Site Scripting) attacks.

Here are some reasons why strip_tags() falls short in mitigating XSS attacks:

Attribute-based attacks: XSS attacks can occur through event-handler attributes such as onmouseover or onclick, which execute JavaScript code when triggered. When tags are allowed through the second parameter of strip_tags(), all of their attributes are kept untouched, so these handlers survive. Input that is later placed inside an HTML attribute can also inject a handler without containing any tag at all.
Tag obfuscation: Attackers can obfuscate HTML tags and their attributes to get past strip_tags(), using techniques such as HTML entity encoding or JavaScript-based obfuscation. strip_tags() does not decode entities, so an encoded tag passes through unchanged.
Context-awareness: The right defense depends on the context in which the user input is displayed (HTML body, attribute value, JavaScript, URL). strip_tags() has no knowledge of that context, so its output can still lead to XSS in some of them.
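
The attribute-based case above can be reproduced with a few lines. This is a minimal sketch: the payloads and the input field markup are made up for illustration, and only the built-in strip_tags() is used.

```php
<?php

// Case 1: allowing a "harmless" tag keeps every attribute on it,
// including event handlers.
$input = '<b onmouseover="alert(1)">hover me</b><script>alert(2)</script>';
$kept = strip_tags($input, '<b>');
// $kept is: <b onmouseover="alert(1)">hover me</b>alert(2)

// Case 2: a payload with no tags at all is returned untouched...
$payload = '" onfocus="alert(1)" autofocus="';
$stripped = strip_tags($payload);

// ...and breaks out of the quoted attribute when concatenated into markup.
// (Hypothetical template, shown only to illustrate the attribute context.)
$html = '<input type="text" value="' . $stripped . '">';
// $html is: <input type="text" value="" onfocus="alert(1)" autofocus="">

echo $kept, PHP_EOL, $html, PHP_EOL;
```

In both cases strip_tags() did exactly what it is documented to do, and the resulting markup still carries an executable handler.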

An example of a malicious string that can be used in an XSS attack is as follows:

this is a XSS attack <script>alert("hello world")</script>

If we apply the strip_tags() function, we obtain the following result:

this is a XSS attack alert("hello world")

Okay, in this case the script tags were removed and the text that remains is harmless. However, the attacker can send the following string instead:

this is a XSS attack &lt;script&gt; alert('oi') &lt;/script&gt;

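The strip_tags() function returns this string unchanged, because it does not decode HTML entities and therefore finds no tags to remove. If the application decodes the entities at any later point before rendering, the script tag is restored and the code is injected into the page.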
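
Both cases can be checked directly. The sketch below uses the two strings from this article; the html_entity_decode() call stands in for any later decoding step in the application and is an assumption made for illustration.

```php
<?php

// Plain tags: strip_tags() removes them and leaves inert text behind.
$plain = 'this is a XSS attack <script>alert("hello world")</script>';
$plainResult = strip_tags($plain);
// $plainResult is: this is a XSS attack alert("hello world")

// Entity-encoded tags: strip_tags() sees no tags, so nothing changes.
$encoded = "this is a XSS attack &lt;script&gt; alert('oi') &lt;/script&gt;";
$encodedResult = strip_tags($encoded);

// Assumed later step: something decodes the entities before output.
$decoded = html_entity_decode($encodedResult);
// $decoded is: this is a XSS attack <script> alert('oi') </script>

echo $plainResult, PHP_EOL, $encodedResult, PHP_EOL, $decoded, PHP_EOL;
```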

How to prevent it?

A good way to deal with untrusted data is:

Filter on input, escape on output

This means that you validate or clean the data when you receive it (filter), but only transform it (escape or encode) at the moment you send it as output to another system that requires that encoding.
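
As a rough sketch of that rule, the snippet below validates fields on the way in and escapes only when building HTML. The $request array, its field names, and the age limits are hypothetical; filter_var() and htmlspecialchars() are PHP built-ins.

```php
<?php

// Hypothetical request data; in a real application this would come from $_POST.
$request = [
    'email' => ' alice@example.com ',
    'age'   => '42',
    'bio'   => '<b>Hi</b> & welcome',
];

// Filter on input: check each field against what it is supposed to be.
$email = filter_var(trim($request['email']), FILTER_VALIDATE_EMAIL);
$age = filter_var(
    $request['age'],
    FILTER_VALIDATE_INT,
    ['options' => ['min_range' => 0, 'max_range' => 150]]
);
$bio = trim($request['bio']); // free text is kept as received

if ($email === false || $age === false) {
    throw new InvalidArgumentException('Invalid input');
}

// Escape on output: encode only at the point where HTML is produced.
$html = '<p>' . htmlspecialchars($bio, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</p>';

echo $html, PHP_EOL;
```

Keeping the stored value unescaped means the same data can later be encoded differently for other outputs, such as JSON or a CSV export.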

In the data sanitization phase, hand-written filters are easy to bypass, so the most reliable way to prevent XSS attacks is to use a dedicated library, such as AntiXSS (the voku\helper\AntiXSS class used in the example below).

Libraries of this kind provide robust mechanisms for preventing XSS attacks by sanitizing and properly handling user input and output.

Here, we are going to use the AntiXSS library.
Now we can sanitize our strings in a much safer way:

<?php

use voku\helper\AntiXSS;

require_once __DIR__ . '/vendor/autoload.php';

$antiXss = new AntiXSS();
$xssString = "this is a XSS attack &lt;script&gt; alert('oi') &lt;/script&gt;";
$clearString = $antiXss->xss_clean($xssString);

//this is a XSS attack
echo $clearString;

In the output phase, escape the data with a template engine such as Twig or Blade, both of which escape printed variables by default, or call the htmlspecialchars function directly.
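
For plain PHP templates, output escaping might look like the sketch below. The escapeHtml() helper name is hypothetical; it only wraps htmlspecialchars() with explicit flags and charset.

```php
<?php

// Hypothetical helper: escape a value for use in HTML text or a quoted attribute.
function escapeHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// The two payloads used earlier in this article.
$script = '<script>alert("hello world")</script>';
$encoded = "&lt;script&gt; alert('oi') &lt;/script&gt;";

echo escapeHtml($script), PHP_EOL;
// &lt;script&gt;alert(&quot;hello world&quot;)&lt;/script&gt;

echo escapeHtml($encoded), PHP_EOL;
// &amp;lt;script&amp;gt; alert(&#039;oi&#039;) &amp;lt;/script&amp;gt;
```

Because the ampersands are encoded too, the browser displays the entity-encoded payload literally instead of turning it back into a tag.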

Great! Now we have a good way to sanitize XSS.

It's worth mentioning that sanitization is just one of the steps in preventing XSS. But that is a topic for another article...

Do you like data sanitization? Then take a look at my PHP data sanitization library!

rodrigojavornik / PHPCleanup

A PHP Sanitation Library

PHP Cleanup

A powerful sanitization library for PHP and Laravel. No dependencies

Installation

composer require rodrigojavornik/php-cleanup

Usage

use PHPCleanup\Sanitize;

Sanitize::input()->sanitize(' <h1>Hello World</h1> ');//Hello World
Sanitize::trim()->captalize()->sanitize(' string    ');//String
Sanitize::trim()->lowercase()->sanitize(' MY name IS    ');//my name is
Sanitize::onlyNumbers()->sanitize(' abc1234');//1234

Available filters

captalize: Capitalize a string;
captalizeAll: Capitalize all string;
dateTime: Transform a string into a DateTime object;
email: Removes all characters not allowed in an email address;
escape: Applies htmlspecialchars to value;
formatNumber: Format a number with grouped thousands;
input: Strip whitespace from the beginning and end of a string and remove any HTML and PHP tags;
keys: Applies sanitize to elements of an array;
…

Top comments (1)

spO0q • Sep 14 '23

Although it's always debatable, I tend to prefer "validating" over "sanitizing" from a security perspective.