APIs are a necessary and central part of the strategy of any digital business that wants to stay competitive and monetize its assets. Additionally, end users’ form factor of choice when using digital services is now firmly mobile. The trend towards APIs and mobile devices has moved the attack surface in a significant way and digital businesses must adapt and evolve their security policies accordingly.
Any business utilizing APIs will certainly do some research to understand the latest trends in mobile security, bot mitigation, and API security in order to protect their business from various types of system abuse through unauthorized use of their API. Credential stuffing, fraudulent transactions, competitive data scraping, personal data breaches, and fake account creation are just some of the risks of not protecting your API eco-system effectively. Businesses will want to understand the impact on their risk profile of the many different mobile and API security approaches in order to decide what they implement.
In this article we will explore the most common techniques used to protect an API, including how important it is to use HTTPS to protect the communication channel between mobile app and API, how API keys are used to identify the mobile app on each API request, how user agents, captchas and IP addresses are used for bot mitigation, and finally how user authentication contributes to mobile security and API security. We will discuss each of these techniques and how they impact the business risk profile, i.e. how easy they are to get around.
The reader will come to understand why today’s commonly used mobile API protection techniques are very naive and not fit for purpose to defend digital businesses against API abuse. API abuse in its various forms is much more commonplace than most businesses realize, so it is important to employ the right techniques to maintain revenue and brand reputation.
Most businesses know that their API channels should all be encrypted. Let’s Encrypt is free to use so there is no excuse for not using SSL certificates and HTTPS everywhere in your stack.
Because you are using a secure channel everywhere, you might feel that you have low risk of your data being accessed via a man-in-the-middle (MITM) attack. Since all of your data in transit is encrypted, its integrity and confidentiality are guaranteed, right? That's certainly a reasonable expectation to have as a developer, but in the past some protocols were thought to be secure, only for vulnerabilities to become public and for it to become clear that hackers had been exploiting them for a long time, such as in the POODLE attack, so beware!
Nowadays, in order to be protected from attacks on HTTPS, we must not use the deprecated protocols SSL, TLS 1.0 or TLS 1.1. Instead we should be using TLS 1.2 or the newer TLS 1.3. TLS 1.3 is very fast, but it has been criticized for its optional Zero Round Trip Time (0-RTT) resumption feature, which lets a client that has met the server before send data in its very first message. Because this early data is not protected against replay by the protocol itself, a hacker who captures such a request may be able to replay it many times unless the server takes its own countermeasures.
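As a minimal sketch of what refusing old protocol versions can look like on the client side, the snippet below uses Python's standard `ssl` module. The helper name `make_client_context` and the choice of TLS 1.2 as the floor are assumptions made for this illustration, not settings taken from any particular product.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumption for this sketch: TLS 1.2 is the lowest version we accept.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

A context like this can then be passed to whatever HTTP client the app or server uses, so that a handshake offering only an older protocol version fails instead of silently downgrading.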
Further, when hackers control the devices, such as mobile phones, they can intercept HTTPS requests by routing traffic through a proxy with custom TLS certificates installed on the device and the proxy. This allows them to inspect the secure traffic in order to understand how the mobile app communicates with the API server and what responses it gets back from it.
With this valuable information, they can reverse engineer the API and mount attacks against the API server in order to exfiltrate the data they are interested in. The API server and mobile app can protect against MITM attacks by using certificate pinning, but even this can be circumvented by hackers with the use of introspection frameworks like Xposed and Frida, as we can see in the article The Problem With Pinning.
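To make the idea of certificate pinning concrete, here is a rough sketch that compares the SHA-256 fingerprint of the certificate presented by the server against a list of fingerprints shipped with the client. The function names and the byte strings standing in for real certificates are hypothetical, and pinning the whole certificate rather than its public key is a simplification chosen for brevity.

```python
import hashlib
import hmac
from typing import Iterable


def fingerprint(der_certificate: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_certificate).hexdigest()


def is_pinned(der_certificate: bytes, pinned_fingerprints: Iterable[str]) -> bool:
    """Accept the connection only if the certificate matches a known pin."""
    presented = fingerprint(der_certificate)
    # compare_digest avoids leaking how much of the fingerprint matched.
    return any(hmac.compare_digest(presented, pin) for pin in pinned_fingerprints)


# Hypothetical stand-ins for real DER bytes, which a Python client could
# obtain with SSLSocket.getpeercert(binary_form=True).
genuine_cert = b"genuine-server-certificate"
proxy_cert = b"attacker-proxy-certificate"
PINS = [fingerprint(genuine_cert)]

print(is_pinned(genuine_cert, PINS))  # genuine server
print(is_pinned(proxy_cert, PINS))    # intercepting proxy
```

Note that the check runs inside the app itself. On a device the hacker controls, a hooking framework can patch a function like `is_pinned` so that it always returns true, which is why pinning raises the bar without closing the door.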
While it is critical to secure the communication channel between the mobile app and the API server, it should be clear by now that HTTPS by itself is not enough to secure an API server, because it can be worked around and does not guarantee to the API server that it is indeed connected to a genuine mobile device running the original app we uploaded to Google Play or the Apple App Store.
As a final note, even when using the latest TLS versions, one should not feel entirely safe about the secure channel, because we never know what zero-day exploits hackers may be using without our knowledge.
It should be clear that HTTPS by itself is not enough to protect against API abuse.
The most common approach used to protect access to a backend server via an API is without doubt to add an API key in the header of each HTTPS request. Passing the API key as a query parameter in the URL is not recommended, since query parameters are often logged in the clear. Each application consuming the API should have its own unique API key, issued through your developer portal.
API keys are popular because they provide a simple way of identifying the probable origin of the request, and they are easy to integrate and deploy both to the API server and any application consuming it. They can also be used as a basic form of authorizing access to resources or to identify and rate limit the frequency of resource access.
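A server-side lookup of this kind can be sketched in a few lines. The header name `X-API-Key`, the key values and the app names below are all hypothetical, and the sketch assumes the web framework hands us the request headers as a mapping with that exact spelling.

```python
import hmac
from typing import Mapping, Optional

# Hypothetical registry of issued keys: API key -> application name.
ISSUED_KEYS = {
    "k-android-7f3a9c": "shop-android",
    "k-ios-51be02": "shop-ios",
}


def identify_app(headers: Mapping[str, str]) -> Optional[str]:
    """Return the app an API key was issued to, or None if the key is unknown."""
    presented = headers.get("X-API-Key", "")
    for key, app_name in ISSUED_KEYS.items():
        # Constant-time comparison of the presented key with each issued key.
        if hmac.compare_digest(presented.encode(), key.encode()):
            return app_name
    return None


print(identify_app({"X-API-Key": "k-ios-51be02"}))
print(identify_app({"X-API-Key": "not-a-real-key"}))
```

The result tells us which app the key was issued to, which is only the probable origin of the request: anyone who holds the key gets the same answer.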
While an API key may look like a good way of identifying what application is making a request to the API server, the truth is that on its own it is not able to prove that the request is indeed from the app the API key was issued for. Hackers can easily reverse engineer a mobile app to extract the API key by using a tool like the Mobile Security Framework, or inspect the traffic between the API and the mobile app with a tool like mitmproxy to extract the key for later reuse.
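To give a feel for how little effort static extraction takes, the sketch below scans raw bytes for anything shaped like a key, much as the Unix `strings` utility followed by a search would. The key format (a `key-` prefix plus 24 hex characters) and the fake binary are assumptions invented for this example.

```python
import re
from typing import List

# Assumed key shape for this sketch: "key-" followed by 24 hex characters.
KEY_PATTERN = re.compile(rb"key-[0-9a-f]{24}")


def find_candidate_keys(binary_blob: bytes) -> List[str]:
    """Scan raw bytes for substrings that look like API keys."""
    return [match.decode("ascii") for match in KEY_PATTERN.findall(binary_blob)]


# Hypothetical app binary with a key embedded between unrelated bytes.
fake_binary = b"\x00\x7fELF\x01lib\x00" + b"key-0123456789abcdef01234567" + b"\x00\xff\x10"

print(find_candidate_keys(fake_binary))
```

Obfuscating the key inside the app makes this particular scan fail, but the key still has to appear in the request, where a proxy on a controlled device can read it.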
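When the API key is used for access control and where rate limiting and blacklisting are enabled to identify high frequency traffic, legitimate users may be impacted by a stolen key, because abusive traffic sent with that key counts against the same limits and can get the key blocked for everyone.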
Using API keys is good, but not as a reliable security measure, because they are very easy to extract from a mobile app. They can be reused in replay attacks by automated scripts for data scraping, fake account creation or manual requests from tools like Postman.
It should be clear that API keys are not enough to prove that the API server is indeed responding to the original mobile app uploaded to Google Play or the Apple App Store.
API calls typically include a user agent string which identifies what type of application (often a browser) is making the call. Using a user agent to block access to an API server may work well against good bots, such as search engine robots, but when a malicious bot is making API calls, it will more often than not disguise itself with a user agent string from a well known browser, thus easily bypassing this defense. There are even packages dedicated to faking a user agent.
API developers often use tools like Postman to query APIs, and Postman likes to advertise itself with its own user agent. An ex-colleague from my previous job was working on a mobile app for a big European retailer that was defending its API based on user agent detection and was blocking any request from Postman. Since Postman was blacklisted, my ex-colleague switched to Insomnia, another API tool similar to Postman, and easily bypassed that protection.
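The weakness is easy to see in a small sketch of such a blocklist. The marker list and the three user agent strings are illustrative values chosen for this example, not the retailer's actual configuration.

```python
# Hypothetical blocklist of substrings looked for in the User-Agent header.
BLOCKED_AGENT_MARKERS = ("postman",)


def is_allowed(user_agent: str) -> bool:
    """Reject requests whose User-Agent contains a blocklisted marker."""
    lowered = user_agent.lower()
    return not any(marker in lowered for marker in BLOCKED_AGENT_MARKERS)


# Illustrative user agent strings.
postman = "PostmanRuntime/7.26.8"
insomnia = "insomnia/2021.5.3"
spoofed_browser = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0 Safari/537.36"
)

for agent in (postman, insomnia, spoofed_browser):
    print(agent.split("/")[0], is_allowed(agent))
```

Only the first request is rejected. Switching tools, or simply setting the header to a browser-like value, is enough to pass, because the server has no way to verify a string that the client chooses freely.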
There is nothing wrong with adding user agent detection as one more defense layer, but it should be pretty clear by now why this is not enough to defend against your API being used to extract your valuable data.
Captchas are usually used when the user needs to submit something using forms, for example, in login screens or upload pages.
The use of captchas comes at a price, because the user friction they cause will lower the conversion rates on the pages where they are used. To reduce the friction, Google has released reCAPTCHA v3, which does not require user interaction and relies instead on analyzing user behavior across the entire application to decide in the background whether the user is a real human or a bot. Google recommends its use across the full application, not only on screens where forms are used. So, we can now use the reCAPTCHA v3 score during each API request to decide whether or not to fulfill the request.
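A decision based on that score could look roughly like the sketch below. It assumes the API server has already sent the client's token to Google's verification endpoint and parsed the JSON reply into a dictionary containing the `success`, `score` and `action` fields; the function name and the 0.5 threshold are assumptions for this illustration and would need tuning per endpoint.

```python
from typing import Any, Mapping


def should_fulfill(
    verification: Mapping[str, Any],
    expected_action: str,
    threshold: float = 0.5,  # assumed cut-off, to be tuned per endpoint
) -> bool:
    """Decide whether to serve a request from a parsed reCAPTCHA v3 verification reply."""
    if not verification.get("success", False):
        return False  # token was invalid or expired
    if verification.get("action") != expected_action:
        return False  # token was issued for a different screen or action
    # Scores range from 0.0 (very likely a bot) to 1.0 (very likely a human).
    return float(verification.get("score", 0.0)) >= threshold


likely_human = {"success": True, "score": 0.9, "action": "login"}
likely_bot = {"success": True, "score": 0.1, "action": "login"}

print(should_fulfill(likely_human, "login"))
print(should_fulfill(likely_bot, "login"))
```

Checking the action as well as the score keeps a token obtained on one screen from being reused against a different endpoint.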
This is effective in stopping simple bots, but it may not be enough to stop the most intelligent ones, which have algorithms to solve captchas or will use third party services to solve them. Search the web for “solving captchas” and you will be presented with an array of service providers offering to solve captchas for you. If you prefer to do it yourself, GitHub holds lots of projects aimed at solving them; some even use machine learning, and some claim to solve reCAPTCHA v3 (2captchadotcom/blog/solving_recaptcha_v3) as well.
Unfortunately captchas by themselves are not enough to stop bots from submitting forms against your API.
Blocking by IP address is normally used to block high frequency requests to a server and can be deployed at the server level in a firewall or at the application level.
When used in the server firewall, the approach cannot be too aggressive or it may block legitimate users. For example, if a mobile app tries to load a view with lots of images, there may be a request for each image along with some other requests to get data, and this may accidentally trip a rate limit for complex pages. Another disadvantage here is that the IP address may have many users, and though only one is behaving badly, all users will be blocked. On the plus side, the time spent per check is not noticeable, so it can be useful to employ rate limiting to cheaply block the most obvious abusers.
IP blocking is more effective at the application level where the IP address can be combined with other information in the request to decide when to block or not. Blocking can be done by the API endpoint for a specific user, the API key, the user agent, the region, or with a combination of all of these. The main disadvantage here is that the verification may have an impact on the processing time of the request, depending on the complexity of those verifications. The advantage is that we may block an IP address without blocking all users on it, just the badly behaving ones.
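The difference between the two levels can be sketched with a small sliding-window rate limiter that accepts any hashable key. The class name, the limit of three requests per minute and the addresses and user names are all hypothetical; timestamps are passed in explicitly to keep the example deterministic, where a real server would use a clock such as `time.monotonic()`.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._hits: Dict[Hashable, Deque[float]] = defaultdict(deque)

    def allow(self, key: Hashable, now: float) -> bool:
        hits = self._hits[key]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


by_ip = SlidingWindowLimiter(limit=3, window=60.0)
by_ip_and_user = SlidingWindowLimiter(limit=3, window=60.0)
shared_ip = "203.0.113.7"  # two users behind one address

# "mallory" sends four requests in four seconds...
ip_results = [by_ip.allow(shared_ip, now=float(t)) for t in range(4)]
user_results = [by_ip_and_user.allow((shared_ip, "mallory"), now=float(t)) for t in range(4)]

# ...then "alice" sends her first request from the same address.
alice_by_ip = by_ip.allow(shared_ip, now=5.0)
alice_by_ip_and_user = by_ip_and_user.allow((shared_ip, "alice"), now=5.0)

print(ip_results, alice_by_ip)
print(user_results, alice_by_ip_and_user)
```

Keyed by IP address alone, the limiter turns alice away because of mallory's burst. Keyed by address and user together, only mallory is throttled, at the cost of needing to parse and trust more of the request before deciding.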
With cloud computing, it is unfortunately very easy for a hacker to circumvent IP address blocking. All it takes is to shut down the current instance once it gets blocked and spin up a new one that will get a brand new IP address. It is also possible to perform the attacks at a very low frequency or from multiple addresses, which will be very hard to distinguish from normal requests performed by legitimate consumers of the API.
Hackers are now far more advanced than most businesses appreciate, employing machine learning and artificial intelligence to conduct very sophisticated attacks that learn and adapt to the API defenses, or even poison the machine learning engine of the behavior analytics tools that some APIs use to defend themselves.
That said, defeating a rate limiting protection strategy does not need such a sophisticated approach.
Another of the most common ways to authenticate an API request is by using an access token which is unique for each user and has an expiration time.
Many security conscious businesses use OAuth 2.0 and OpenID Connect for their user authentication and authorization flows, which provide these access tokens, also referred to as bearer tokens. The best practice is to not roll your own solution but instead use the established libraries. The optimal flows for web and mobile applications use the authorization code grant flow, which separates user authentication from user authorization and prevents user credentials from leaking into the app. The simplistic implicit flow has been proposed for discontinuation by the OAuth working group in their draft from November 09, 2018.
Let's see how, in the OAuth 2.0 authorization code grant flow, user authentication is separated from user authorization.
In this flow, the user agent never has knowledge of the access token, which prevents the leakage that often happens in the implicit grant flow.
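The separation described above can be sketched with a toy, in-memory authorization server. Everything here is hypothetical and heavily simplified: the class and method names are invented, the client is assumed to be a confidential one with a secret held on a backend, and real deployments should rely on an established library (native mobile apps would normally also use PKCE rather than an embedded secret).

```python
import hmac
import secrets
from typing import Dict, Optional


class ToyAuthorizationServer:
    """In-memory stand-in for an authorization server (not a real library)."""

    def __init__(self, client_id: str, client_secret: str) -> None:
        self._client_id = client_id
        self._client_secret = client_secret
        self._pending_codes: Dict[str, str] = {}  # authorization code -> user

    def authorize(self, user: str, password_ok: bool) -> Optional[str]:
        """Front channel: the user logs in and the user agent receives only a code."""
        if not password_ok:
            return None
        code = secrets.token_urlsafe(16)
        self._pending_codes[code] = user
        return code

    def exchange(self, code: str, client_id: str, client_secret: str) -> Optional[str]:
        """Back channel: the client swaps the code for an access token."""
        if client_id != self._client_id:
            return None
        if not hmac.compare_digest(client_secret.encode(), self._client_secret.encode()):
            return None
        user = self._pending_codes.pop(code, None)  # each code works only once
        if user is None:
            return None
        return "access-token-for-" + user + "-" + secrets.token_urlsafe(8)


server = ToyAuthorizationServer("shop-app", "s3cret")
code = server.authorize("alice", password_ok=True)     # all the user agent sees
token = server.exchange(code, "shop-app", "s3cret")    # only the client sees this
replayed = server.exchange(code, "shop-app", "s3cret") # the code is already spent
```

The user's password only ever reaches `authorize`, and the access token only ever leaves `exchange`, so the app never handles the credentials and the user agent never handles the token.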
Developers may think that using the above flow alone is enough to have a good OAuth 2.0 solution to secure their mobile app and API server, but let’s look at a few example scenarios where this belief is shown to be false.
Consider the scenario where hackers use a captive WiFi portal to spy on HTTPS traffic from devices they do not control, where users are instructed to download a TLS certificate provided by the hacker, either a self signed certificate or a trusted one from a certification authority like Let’s Encrypt. Many users are willing to perform simple actions to get free WiFi, and when they do, they are enabling the hacker to intercept all their HTTPS traffic, inspect it, and replay it, with or without modification, back to the API server as if it were coming from the original mobile app. Note also that the reverse will happen when the API server sends back a response. So now the hacker not only has access to the user’s access tokens, but can also do whatever they want with the request, while the user continues to believe that they are interacting in a safe environment and the API server believes that it is accepting requests from the original mobile app and from the real user.
Consider another scenario where users are socially engineered to install malware that will allow hackers to impersonate them. Users may engage with third party services that offer cheating as a service, for example, in mobile games to accelerate the accumulation of rewards/points/credits which help users progress faster in the game. The user will freely give their authentication credentials to these third parties in order to enhance their progress in the game, but these credentials can then be used by hackers for API abuse.
As we have seen, the access token allows us to identify the user, but it does not guarantee to the API server that the token being sent is actually coming from the mobile app that the real user has been authenticated for. It is therefore clear that a user access token is not enough to protect your API server from being abused.
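A minimal sketch of bearer token validation shows where the gap lies. The token store, token value and function name are hypothetical, and the current time is passed in explicitly so that the example is deterministic.

```python
from typing import Dict, Mapping, NamedTuple, Optional


class TokenRecord(NamedTuple):
    user: str
    expires_at: float  # seconds since the epoch


# Hypothetical token store: access token -> who it was issued to and until when.
TOKENS: Dict[str, TokenRecord] = {
    "tok-abc123": TokenRecord(user="alice", expires_at=1_000.0),
}


def authenticate(headers: Mapping[str, str], now: float) -> Optional[str]:
    """Return the user a bearer token belongs to, or None if it is invalid or expired."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer":
        return None
    record = TOKENS.get(token)
    if record is None or now >= record.expires_at:
        return None
    return record.user


from_genuine_app = {"Authorization": "Bearer tok-abc123"}
from_attacker_script = {"Authorization": "Bearer tok-abc123"}  # same stolen token

print(authenticate(from_genuine_app, now=500.0))
print(authenticate(from_attacker_script, now=500.0))
```

Both calls identify the same user, because nothing in the inputs says which piece of software sent the header. The expiry time limits how long a stolen token is useful, but not what can be done with it in the meantime.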
Of course we should always use a secure communication channel on our APIs, but we must recognize that this security layer can be bypassed by hackers, even when certificate pinning is being used.
API keys are probably the most popular and easy method used today to identify the mobile app consuming an API, but by themselves they cannot prove that they are being used solely by the mobile app they have been issued for.
User agents may be seen as an easy way to detect and provide bot mitigation on an API, but they are relatively easy for bots to bypass, since a bot will just disguise itself as a browser by sending user agent strings used by well known browsers.
Captchas are a very popular technique to protect forms from being submitted by bots, but they can be bypassed with available software solutions from third parties or with the use of advanced algorithms in combination with machine learning and artificial intelligence from the open source community.
User authentication is very important when we need to know who is using the API, but it cannot be blindly trusted as proof that the requests to the API are from the user or from the original mobile app, because users may freely give their credentials away to third parties in exchange for access to interesting services.
IP address blocking and rate limiting techniques can be used to try to stop bots from abusing APIs, but when used at server level they cannot be too aggressive or they will block legitimate users. At the application level they can be more effective at the price of some increase in the response time of the request, but when bots are deployed through cloud services and employ machine learning and artificial intelligence, they will be able to defeat these techniques and appear to behave like a human.
After reading this article you may think your API is doomed to suffer abuse no matter what you do. Don’t be downhearted though. Instead, take a look at some more advanced API protection techniques, such as mobile app attestation. Interestingly, you may find that these more sophisticated techniques are actually easier to adopt and deploy, and bring more effective results.
Do you have anything to say, ask, recommend? Please leave your comment below. Your feedback is appreciated and will help me to improve my future blog posts.
See you in my next article, Why Does Your Mobile App Need an API Key?
This article was previously published here.