Bombarded with multiple alerts coming from multiple disconnected services, security analysts continue to struggle with alert fatigue. While they need context about threats facing their organizations, many also find identifying the right context challenging.
Today, companies expect security analysts to be experts in everything from the technical to the criminal underground. In reality, this just isn’t possible. However, large language models (LLMs) excel at summarizing large quantities of data, offering security teams a starting point for their analyses.
Flare uses LLMs to help analysts get the insights and answers they need quickly so that they can filter out false positives and focus on what really matters.
LLMs can help separate out important context. Typically we see two different use cases: non-technical and technical.
From a non-technical point of view, LLMs adapt well to given slang language. They understand the context and quickly summarize information, eliminating the need to do Google searches.
First, criminal underground chatter uses slang that has a different meaning within their community when compared to the technical community.
For example, they often use the term “logs,” referring to stolen usernames and passwords. Technical professionals usually recognize this term as the digital record of activity happening in their environment. With the prompt, LLMs can “translate” this for you.
Second, LLMs can help you understand what an application does. For example, when threat actors post stolen credentials for sale, they usually name the application. By explaining the application, LLM tools make it easier to determine whether the stolen credentials pose a risk to your organization’s security posture.
At a technical level, ChatGPT does an excellent job of helping analysts better identify real, exploitable risks to their environments.
For example, when Flare identifies a GitHub match, the LLM can look at the code and understand whether the mention indicates a risk or not. When developers use a base template for an app rather than starting from scratch, they often make a copy and rename it for their company before they start working. Sometimes, these files may have a hardcoded password. However, this public facing hardcoded password poses little risk to the company until the developer starts modifying the code to build the company-specific application.
When Flare detects the company name in GitHub, the LLM provides context around whether the match poses little risk because it was in a default config file or could indicate a problem because the developer made a modification.
While LLMs provide various benefits, they also come with some limitations. LLMs analyze unstructured data which poses difficulties when writing prompts.
When inputting a query, the LLM has a hard time deciphering the difference between the question being asked and the data being supplied. For example, if you’re providing the text of a message and asking if it was written by a specific threat group, the LLM gets confused. It can often focus on the threat group’s name or a username in the prompt rather than analyzing the message’s text that you want to know about.
LLMs can limit the amount of data that it can summarize so choosing what information the prompt includes is critical.
Although ChatGPT’s context window simulates a conversation to look human, the LLM doesn’t “remember” the beginning of conversations. If you engage in a long conversation, then it won’t be able to answer appropriately because it may not be accessing the original information, such as the threat actor's name.
To get the most accurate actor profile possible, you need to choose the inputs carefully. Ideally, when an actor is active on the criminal underground and the clear web, you want to take a little bit of both. You need to combine some of the:
- Oldest criminal underground activity
- Recent criminal underground activity
- Oldest clear web activity
- Recent clear web activity
With the right inputs, the model can fill the gaps, giving you a good high level view of what the threat actor does.
LLM models use tokenization that transforms texts into numbers which often means that ChatGPT can analyze less code in a given input than it can text. Consider the following:
- CHAT GPT
- chat GPT
While the human eye and mind reads these as the same two phrases, the LLM model transforms following data point into a separate number:
What people think is two pieces of data, the LLM model views it as five.
When communicating code with ChatGPT, you have a smaller token budget than you do with text, meaning that you have to be more careful when inputting queries so that you don’t waste energy computing information that doesn’t matter.
In this case, you want to look for the interesting parts of the source code and only send those. Some examples might be inputting:
- The metadata for the part of a file where your company’s name is mentioned
- Specific project name, location, and developer
When dealing with token budgets, engineering prompts become incredibly important. Unfortunately, no clear answer around best technique exists. The process requires you to test and iterate the prompt. Some prompt improvements more than doubled the output’s value.
LLMs can assume that all information provided at input is true or factual. For example, a prompt might ask a question about whether your company’s data was part of a specific data breach. The LLM will assume that the corporate data was part of the breach rather than researching the breach to look for the corporate data.
A fundamental problem in cybersecurity is the communication gap across different people in the company. LLMs are powerful for communication between different audiences, especially when looking at how teams can use them for reporting. LLMs empower less experience security team members by helping them understand context so that they know how to efficiently escalate alerts. Additionally, CISOs and others interacting with business leadership can use LLMs to explain technical bugs or data breach information in a way that addresses that audience’s needs.