Robin Guldener

Originally published at nango.dev

Why is OAuth still hard in 2023?

OAuth is a standard protocol. Right? And there are client libraries for OAuth 2.0 available in basically every programming language you can imagine.

You could conclude that, armed with a client library, you should be able to implement OAuth for any API in about 10 minutes. Or at least in an hour.

If you manage, please email us: We would like to invite you for a really delicious dinner and hear how you did it.

OAuth in practice: We implemented OAuth for the 50 most popular APIs

These include Google (Gmail, Calendar, Sheets etc.), HubSpot, Shopify, Salesforce, Stripe, Jira, Slack, Microsoft (Azure, Outlook, OneDrive), LinkedIn, Facebook and many more.

Our conclusion: The real-world OAuth experience is comparable to JavaScript browser APIs in 2008. There is a general consensus on how things should be done, but in reality every API has its own interpretation of the standard, implementation quirks, non-standard behaviours and extensions. The result: Footguns around every corner.

If it wasn’t so annoying, it would be quite funny. Let’s dive in!

Problem 1: The OAuth standard is just too big & complex

“Hey this API also uses OAuth 2.0, we already did that a few weeks ago. I should be done by tomorrow.”
– Famous last words from the intern

OAuth is a very big standard. The official site of OAuth 2 (currently) lists 17 different RFCs (documents defining a standard) that together define how OAuth 2 works. They cover everything from the OAuth framework and bearer tokens to threat models and private key JWTs.

“But”, I hear you say, “surely not all of these RFCs are relevant for a simple third party access token authorization with an API?”
You are right. Let’s focus only on the things that are likely to be relevant for the typical API third-party-access use case:

  • OAuth standard: OAuth 2.0 is the default now, but OAuth 1.0a is still used by some (and 2.1 is around the corner). Once you know which one your API uses, move on to:
  • Grant type: Do you need authorization_code, client_credentials or device_code? What do they do and when should you use which? If in doubt, try authorization_code (see the sketch after this list).
  • Side note, refresh tokens are also a grant type, but kind of a special one. How they work is standardized, but not how you ask for them in the first place. More on that later.
  • Now that you are ready for your requests, let’s look at the many (72, to be precise) official OAuth parameters with a defined meaning and behaviour. Common examples are prompt, scope, audience, resource, assertion and login_hint. From our experience, though, most API providers seem to be as oblivious to this list as you probably were until now, so don't worry too much about it.
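
To make the authorization_code grant a bit more concrete, here is a minimal sketch of its first leg: building the URL you redirect your user to. The endpoint, client ID, redirect URI and scopes below are placeholders for a generic provider, not any specific API's values.

```typescript
// Minimal sketch of the authorization_code grant's first leg.
// Endpoint, client ID, redirect URI and scopes are placeholders for a generic provider.
import { randomBytes } from "crypto";

const AUTHORIZE_URL = "https://provider.example.com/oauth/authorize"; // provider-specific
const CLIENT_ID = process.env.OAUTH_CLIENT_ID!;
const REDIRECT_URI = "https://yourapp.example.com/oauth/callback";

// `state` protects against CSRF: generate it per request and verify it on the callback.
const state = randomBytes(16).toString("hex");

const params = new URLSearchParams({
  response_type: "code", // tells the provider we want the authorization_code grant
  client_id: CLIENT_ID,
  redirect_uri: REDIRECT_URI,
  scope: "read write",    // scope names are provider-specific
  state,
});

// Redirect the user's browser here; the provider calls you back with ?code=...&state=...
const redirectTo = `${AUTHORIZE_URL}?${params.toString()}`;
```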

If this still feels too complicated and like a lot to learn, we tend to agree.

It seems most teams building public APIs agree as well. Instead of implementing the full OAuth 2.0 standard, they just implement the parts they think they need for their API’s use case. This leads to pretty long docs pages outlining how OAuth works for this particular API, but we have a hard time blaming them. They have only the best intentions in mind for their DX. And if they truly tried to implement the full standard, you would need to read a small book!

Luckily most implementations today can be packed into a flow chart like this one from Salesforce:

Salesforce OAuth flow

The trouble is that everybody has a slightly different idea of which subset of OAuth is relevant for them, so you end up with lots of different (sub) implementations.

Problem 2: Everybody’s OAuth is different in subtle ways

As every API implements a different subset of OAuth, you quickly get into a situation where you are forced to read their long pages of OAuth docs in detail:

  • Which parameters do they require in the authorize call?
    • For Jira the audience parameter is key (and must be set to a specific fixed value). Google prefers to handle that through different scopes, but really cares about the prompt parameter. Meanwhile, somebody at Microsoft discovered the response_mode parameter and demands you always set it to query.
    • The (newish) Notion API takes a radical approach and does away with the ubiquitous scope parameter. In fact, you will not even find the word “scope” in their API docs. Notion calls them “capabilities” and you set them when you register the app. It took us 30 confused minutes to understand what was going on. Why reinvent this wheel?
    • It gets worse with offline_access: Most APIs these days expire access tokens after a short while. To get a refresh token you need to request “offline_access”, which either needs to be done through a parameter, a scope, or a setting when you register your OAuth app. Ask your API or OAuth doctor for details.
  • What do they want to see in the token request call?
    • Some APIs, like Fitbit, insist on getting data in the headers. Most really want it in the body, encoded as application/x-www-form-urlencoded. Except for a few (looking at you, Notion) who prefer to get it in the body as JSON. A sketch contrasting these variants follows this list.
    • Some want you to authenticate this request with Basic auth. Many don’t bother with that. But beware, they may change their mind tomorrow.
  • Where should I redirect my users to authorize?
    • Shopify and Zendesk have a model where every user gets a subdomain like {subdomain}.myshopify.com. And yes, that includes the OAuth authorization page, so you'd better build dynamic URLs into your model and frontend code.
    • Zoho Books has different data centers for different locations of their customers. Hopefully they remember where their data resides: To authorize your app, your US customers should go to https://accounts.zoho.com, whilst Europeans should visit https://accounts.zoho.eu and Indians are welcome at https://accounts.zoho.in. The list goes on.
  • But at least I can pick my callback URL, no?
    • If you enter http://localhost:3003/callback as a callback for the Slack API they kindly remind you to “Please use https for security”. Yes, also for localhost.
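
To illustrate how much the token request can differ between providers, here is a small sketch contrasting the two variants mentioned above: a form-encoded body versus a JSON body with Basic auth. The token URL and credentials are placeholders, not any particular provider's values.

```typescript
// Sketch of the token exchange step, contrasting two real-world variants.
// URLs and credentials below are placeholders for a generic provider.

const TOKEN_URL = "https://provider.example.com/oauth/token";
const creds = {
  client_id: process.env.OAUTH_CLIENT_ID!,
  client_secret: process.env.OAUTH_CLIENT_SECRET!,
};

// Variant A: the common case, parameters form-encoded in the body.
async function exchangeCodeFormEncoded(code: string, redirectUri: string) {
  const res = await fetch(TOKEN_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "authorization_code",
      code,
      redirect_uri: redirectUri,
      ...creds,
    }),
  });
  return res.json();
}

// Variant B: some providers want JSON in the body and the client credentials
// sent as HTTP Basic auth instead.
async function exchangeCodeJsonBasicAuth(code: string, redirectUri: string) {
  const basic = Buffer.from(`${creds.client_id}:${creds.client_secret}`).toString("base64");
  const res = await fetch(TOKEN_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Basic ${basic}`,
    },
    body: JSON.stringify({
      grant_type: "authorization_code",
      code,
      redirect_uri: redirectUri,
    }),
  });
  return res.json();
}
```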

We could go on for a long time, but we think you probably got the point by now.

Relevant XKCD

Problem 3: Many APIs add non-standard extensions to OAuth

Even though the OAuth standard is vast, many APIs still seem to find gaps in it for features they need. A common issue we see is that you need some data besides the access_token to work with the API. Wouldn’t it be neat if this additional data could be returned to you together with the access_token in the OAuth flow?

We think this is actually a good idea. Or at least better than forcing users to do quirky additional API requests afterwards to fetch this information (looking at you, Jira). But it does mean more non-standard behaviour that you specifically need to implement for every API.

Here is a small list of non-standard extensions we have seen:

  • Quickbooks has a concept of a realmID which you need to pass in with every API request. The only time they tell you this realmID is as an additional parameter in the OAuth callback. Better store it somewhere safe!
  • Braintree does the same with a companyID.
  • Salesforce uses different API base URLs for different customers; they call this the instance_url. Thankfully they return the instance_url of the user together with the access_token in the token response, but you do need to parse it out from there and store it (see the sketch after this list).
  • Unfortunately Salesforce also does even more annoying things: Access tokens expire (good!) after a pre-set period of time, which can be customized by the user. Fine so far, but for some reason they don’t tell you in the token response when the access token you just received expires (like everybody else does). Instead you need to query an additional token details endpoint to get the (current) expiration date of the token. Why, Salesforce, why?
  • Slack has two different types of scopes: Scopes you hold as a Slack bot and scopes that allow you to take action on behalf of the user who authorized your app. Smart, but instead of just adding different scopes for each they implemented a separate user_scopes parameter you need to pass in the authorization call. Better know about that and good luck finding support for this in your OAuth library.
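
As an illustration, this is roughly how we end up modelling these extras: a sketch of keeping provider-specific fields next to the tokens. The Connection shape and helper are our own invention; only the field names mirror the quirks described above.

```typescript
// Sketch: keeping provider-specific extras from the OAuth flow alongside the tokens.
// The Connection shape is illustrative; only the extra fields mirror the quirks above.
interface Connection {
  accessToken: string;
  refreshToken?: string;
  // Non-standard extras some providers return (or pass in the callback):
  instanceUrl?: string; // Salesforce: per-customer API base URL from the token response
  realmId?: string;     // QuickBooks: company ID that only arrives as a callback query parameter
}

function connectionFromTokenResponse(
  tokenResponse: Record<string, any>,
  callbackQuery: URLSearchParams
): Connection {
  return {
    accessToken: tokenResponse.access_token,
    refreshToken: tokenResponse.refresh_token,
    instanceUrl: tokenResponse.instance_url,              // only present for Salesforce-style responses
    realmId: callbackQuery.get("realmId") ?? undefined,   // only present for QuickBooks-style callbacks
  };
}
```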

For the sake of brevity and simplicity we are skipping the many not-really-standard OAuth flows we have encountered.

Problem 4: “invalid_request” - debugging OAuth flows is hard

Debugging distributed systems is always hard. It is harder when the service you are working with uses broad, generic error messages. OAuth has standardized error messages, but they are about as useful in telling you what is going on as the example in the title above (which, btw, is one of the recommended error messages from the OAuth standard).

But it's a standard and there are docs, so what is there to debug?
A lot. I cannot tell you how often the docs are wrong. Or missing a detail. Or have not been updated for the latest change. Or you missed something when you first looked at them. A good 80% of OAuth flows we implement have some problem upon first implementation and require debugging.

XKCD on debugging

Some flows also break for, what seems to be, random reasons: LinkedIn OAuth for instance breaks if you pass in PKCE parameters. The error you get? “client error - invalid OAuth request”. That is… telling? It took us an hour to understand that passing in (optional, usually disregarded) PKCE parameters is what breaks the flow.

Another common mistake is sending scopes that don’t match the ones you pre-registered with the app (pre-register scopes? Yes, a lot of APIs these days demand that). This often results in a generic error message about there being an issue with scopes. Duh.

Problem 5: Cumbersome approvals to build on top of APIs

The truth is, if you build on top of some other system by using their API, you are probably in the weaker position. Your customers are asking for the integration because they are already using the other system. Now you need to make them happy.

To be fair, many APIs are liberal and provide easy self-service signup flows for developers to register their apps and start using OAuth. But some of the most popular APIs out there require reviews before your app becomes public and can be used by any of their users. Again, to be fair, most review processes are sane and can be completed in a few days. They are probably a net gain in terms of security and quality for end users.

Reviews

But some notorious examples can take months to complete or even require you to enter into revenue share agreements:

  • Google requires a “security review” if you want to access scopes with more sensitive user data, such as email contents. We have heard these can take days or weeks to pass and require a non-trivial amount of work on your side.
  • Looking to integrate with Rippling? Get ready for their 30+ questions security pre-production screening. We hear access takes months (if you are approved).
  • HubSpot, Notion, Atlassian, Shopify and pretty much everybody else who has an integrations marketplace/app store requires a review to get listed there. Some reviews are mild and some ask for demo logins, video walkthroughs, blog posts (yes!) and more. However, listing on the marketplace/store is often optional.
  • Ramp, Brex, Twitter and a good number of others don’t have a self-service signup flow for developers and require you to fill in forms for manual access. Many are quick to process requests, but for some we are still waiting to hear back, weeks later…
  • Xero is a particularly drastic example of how to monetize an API: If you want to remove the limit of 25 connected accounts you have to become a Xero partner and list your app in their app store. They will then take an (as of the time of writing) 15% revenue cut from every lead generated from that store.

Problem 6: OAuth security is hard and a moving target

As attacks have been uncovered, and the available web technologies have evolved, the OAuth standard has changed as well. If you are looking to implement the current security best practices, the OAuth working group has a rather lengthy guide for you. And if you are working with an API that is still using OAuth 1.0a today, you realize that backwards compatibility is a never-ending struggle.

Luckily, security is getting better with every iteration, but it often comes at the cost of more work for developers. The upcoming OAuth 2.1 standard will make some current best practices mandatory, including PKCE (today only a handful of APIs require it) and additional restrictions on refresh tokens.
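
For reference, here is a minimal sketch of the PKCE pieces: a per-attempt code_verifier and its S256 code_challenge. The parameter names come from the spec; the helper function and variable names are our own.

```typescript
// Sketch of the PKCE values OAuth 2.1 will make mandatory: a random code_verifier
// and its S256 code_challenge, sent on the authorize and token calls respectively.
import { createHash, randomBytes } from "crypto";

function base64url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

const codeVerifier = base64url(randomBytes(32)); // keep this server-side, one per auth attempt
const codeChallenge = base64url(createHash("sha256").update(codeVerifier).digest());

// Authorize call gets:  code_challenge=<codeChallenge>&code_challenge_method=S256
// Token call later gets: code_verifier=<codeVerifier>
```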

OAuth security

The biggest change has probably been ushered in with expiring access tokens and the rise of refresh tokens. On the surface the process seems simple: Whenever an access token expires, refresh it with the refresh token and store the new access token and refresh token.

In reality, when we implemented this we had to consider:

  • Race conditions: How can we make sure no other requests run whilst we refresh the current access token? (See the sketch after this list.)
  • Some APIs also expire the refresh token if you don’t use it for X days (or the user has revoked the access). Expect some refreshes to fail.
  • Some APIs issue you a new refresh token with every refresh request…
  • …but some also silently assume that you will keep the old refresh token and keep on using it.
  • Some APIs will tell you the access token expiration time in absolute values. Others only in relative “seconds from now”. And some, like Salesforce, don’t divulge this kind of information easily.
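
Below is a simplified sketch of a refresh that keeps these pitfalls in mind: it shares one in-flight refresh per connection and handles both rotating and non-rotating refresh tokens. The token URL, field names and timings are illustrative, not any provider's actual values.

```typescript
// Sketch of a token refresh that avoids concurrent refreshes for the same connection
// by sharing one in-flight promise. URL, fields and timings are placeholders.

interface Tokens {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // unix ms
}

const inFlight = new Map<string, Promise<Tokens>>();

async function getFreshTokens(connectionId: string, current: Tokens): Promise<Tokens> {
  // Refresh a bit early so requests already in flight don't race the expiry.
  if (Date.now() < current.expiresAt - 60_000) return current;

  // If a refresh is already running for this connection, wait for it instead of
  // firing a second one (some providers rotate the refresh token on every use).
  let pending = inFlight.get(connectionId);
  if (!pending) {
    pending = refresh(current).finally(() => inFlight.delete(connectionId));
    inFlight.set(connectionId, pending);
  }
  return pending;
}

async function refresh(current: Tokens): Promise<Tokens> {
  const res = await fetch("https://provider.example.com/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "refresh_token",
      refresh_token: current.refreshToken,
      client_id: process.env.OAUTH_CLIENT_ID!,
      client_secret: process.env.OAUTH_CLIENT_SECRET!,
    }),
  });
  if (!res.ok) throw new Error(`Refresh failed: ${res.status}`); // refresh tokens do expire or get revoked
  const body = await res.json();
  return {
    accessToken: body.access_token,
    // Some providers rotate the refresh token, others expect you to keep using the old one.
    refreshToken: body.refresh_token ?? current.refreshToken,
    expiresAt: Date.now() + (body.expires_in ?? 3600) * 1000,
  };
}
```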

Last but not least: Some things we have not talked about yet

Sadly, we have only just scratched the surface of your OAuth implementation. Now that your OAuth flow runs and you get access tokens it is time to think about:

  • How to securely store these access tokens and refresh tokens: they are like passwords to your users’ accounts, but hashing is not an option; you need secure, reversible encryption (see the sketch after this list)
  • Checking that the granted scopes match the requested scopes (some APIs allow users to change the scopes they grant in the authorize flow)
  • Avoiding race conditions when refreshing tokens
  • Detecting access tokens revoked by the user on the provider side
  • Letting users know that access tokens have expired, so they can re-authorize your app if needed
  • How to revoke access tokens you no longer need (or the user has requested you delete them under GDPR)
  • Changes in available OAuth scopes, provider bugs, missing documentation etc.
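
For the first point, here is a small sketch of reversible encryption with AES-256-GCM from Node's crypto module. Key management (where the key lives, how it is rotated) is deliberately out of scope, and the environment variable name is our own.

```typescript
// Sketch: encrypting tokens at rest with AES-256-GCM.
// ENCRYPTION_KEY is assumed to be a base64-encoded 32-byte key; key management is out of scope.
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const key = Buffer.from(process.env.ENCRYPTION_KEY!, "base64"); // 32 bytes for aes-256-gcm

export function encryptToken(plain: string): string {
  const iv = randomBytes(12); // standard GCM nonce size
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const encrypted = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store iv + auth tag + ciphertext together so decryption is self-contained.
  return Buffer.concat([iv, tag, encrypted]).toString("base64");
}

export function decryptToken(stored: string): string {
  const buf = Buffer.from(stored, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const ciphertext = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```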

A better way?

If you have read this far you might be thinking “There must be a better way!”

We think there is, which is why we are building Nango: An open-source, self-contained service that comes with pre-built OAuth flows, secure token storage and automatic token refreshes for 50+ APIs.

If you give it a try we would love to hear your feedback. And if you want to share your worst OAuth horror story with us we would love to hear about it in the comments or on our Slack community.

Thanks for reading and happy authorizing!

Top comments (2)

Joe Mainwaring

Is there a reason you're building on top of OAuth and not its successor, OIDC?

Robin Guldener

Good question!
OIDC (OpenID Connect) is a layer on top of OAuth aimed mostly at Single Sign On (SSO) use cases. It still means you need to implement the underlying OAuth flow first (which is what we cover in the article).