awesome MLSecOps: aka prompt injection is the new SQL injection

#ai #security #llm #gpt3

MLSecOps is the cybersecurity field that focuses on exploiting and securing Machine Learning-based models and user-facing applications.

List of Resources

The term derives from MLOps - which you can think of as the software engineering field that encompasses the techniques used to put machine learning models in production - from testing and debugging to deploying, monitoring and scaling and anything in between-. MLOps focuses more on the operational aspect of ML rather than, for instance, the development of a ML model, with a specific emphasis on automating some key parts of a ML model lifecycle - some of the automation techniques might sound familiar -such as continuous integration and continuous delivery- whereas others are more specific to the AI/ML field - such as continuous training.

Roughly speaking, MLOps is to Machine Learning engineering what DevOps is to software engineering.

Similarly to how the cloud created a whole new security threat landscape, MLOps is also paving the way to a new class of attack vectors.

Which brings us to the topic of MLSecOps:

MLSecOps is to MLOps what DevSecOps is to DevOps.

Similarly to how MLOps occasionally intersects with existing practices, techniques and tooling from the DevOps field, some of the MLSecOps attack vectors also resemble old-fashioned exploits.

Prompt Injection is one of the MLSecOps attack vectors that has received mainstream attention. According to the OWASP definition in their (very recent) LLM top 10 threat catalog, prompt injection allows an attacker to manipulate a large language model (LLM) through crafted inputs, causing the LLM to unknowingly execute the attacker's intentions.

Basically, it boils down to taking advantage of a lack of input validation. It is similar to more old-fashioned attack vectors like SQL Injection in that they both involve injecting malicious queries into an application that is not correctly sanitizing untrusted data.
On the other hand, some exploit techniques are more unique to the MLSecOps field - such as Data Poisoning attacks, which target the training data of a model and manipulate it in way designed to cause unpredictable/undesirable behaviors.

As someone who works in InfoSec and occasionally audits applications that deal with some flavor of AI -machine learning, reinforcement learning, LLMs etc.-, lately I have been starting to research the attack vectors to exploit AI models/processes/applications, how to secure them and the AI security tooling ecosystem.

The result is the awesome-MLSecOps Github repository, where I am keeping track of useful OSS security tools, academic literature & papers and other resources about the field. If you have any recommendations, contributions are greatly appreciated.

And if you are interested in AI security & getting familiar with the field, feel free to reach out on Twitter or Telegram. I am planning to publish more in-depth articles/tutorials on the field as I go.