Recently I got a chance to work with OAuth 2.0, and I noticed how much complexity and confusion surrounds OAuth 2.0 and OpenID Connect on the internet.
The reason is that it's a difficult protocol to understand, with tons of jargon and terminology.
When I started learning about OAuth 2.0 and OpenID Connect, I couldn't find a single source of truth. There is a plethora of blogs, and everybody on StackOverflow seems to explain how to implement it in their own way.
If you have ever felt confused or overwhelmed, you are not the only one; pretty much everybody has. In this blog, I will try to make things a little less intimidating, and you can let me know in the comments if it makes sense.
Why, when we have a concrete spec, are there so many different interpretations of how to use it? That doesn't happen with a protocol like HTTP, where there is a pretty clear and concrete way to use it. With OAuth, there is some fuzziness in the implementation, which makes it more difficult and creates confusion about how to use it.
Before explaining OAuth and OpenID Connect, I want to set the stage with simple login, or forms authentication, which is the most basic type of authentication on the web. The user enters a username/email and password, the request goes to the server, and the server talks to the database and verifies the username and password (hopefully a hashed one 😉).
If the verification is successful, the server sends a cookie to the web browser to keep track of the user.
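To make the forms-login steps concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the in-memory `USERS` and `SESSIONS` dictionaries stand in for a real database and session store, and the function names, the PBKDF2 iteration count, and the cookie attributes are all hypothetical choices, not a recommendation for production.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory "database": username -> (salt, password hash).
USERS = {}
# Hypothetical session store: session id -> username.
SESSIONS = {}

ITERATIONS = 100_000  # assumed work factor, chosen only for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str) -> None:
    """Store a random salt and the derived hash for a new user."""
    salt = secrets.token_bytes(16)
    USERS[username] = (salt, hash_password(password, salt))


def login(username: str, password: str):
    """Return a Set-Cookie header value on success, or None on failure."""
    record = USERS.get(username)
    if record is None:
        return None
    salt, stored_hash = record
    # Compare in constant time to avoid leaking information through timing.
    if not hmac.compare_digest(stored_hash, hash_password(password, salt)):
        return None
    # The random session id is what the browser sends back on later requests.
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = username
    return f"session={session_id}; HttpOnly; Secure; SameSite=Lax"
```

Calling `register("alice", "s3cret")` and then `login("alice", "s3cret")` returns a cookie string, while a wrong password returns `None`.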
The industry is now moving away from this kind of homegrown solution. Simple forms authentication has nothing to do with OAuth, but I want you to keep it in mind because it is what we are going to compare OAuth with.
There are a couple of downsides to this approach.
Because you are responsible for maintaining the authentication system, it is your own server-side code that talks to the database and verifies the password. You also have to stay on top of new security standards and best practices such as hashing, encryption, and storing user information securely.
Let's go back in time to 2006-07. Applications and websites back then also had several use cases such as login, authentication, and authorization, so we can combine these and call them identity use cases.
Identity use cases (pre-2010)
- Simple login (forms and cookies)
- Single sign-on across sites (SAML)
- Mobile login (the user should stay logged in even after the app is closed; long-lived sessions)
- Delegated authorization (????)
Delegated authorization is the genesis of the OAuth protocol.
Despite sounding very boring, this is the thing I am going to focus on.
You might already be familiar with delegated authorization; it is something you interact with every day without actually knowing what it is. It is a common pattern now, but it was not 10 years ago.
Delegated authorization is the process of letting specific websites access our data without actually giving away our credentials.
Back then there was no good way to solve this problem, and one company, Yelp, tried to solve it in a really bad way.
At the end of registration, Yelp asked users whether they wanted to share Yelp with their Gmail or Yahoo contacts for sign-ups, promotions, and so on.
For this, Yelp asked for the user's actual Gmail address and password. In other words, Yelp would log in to the user's Gmail account, grab the user's contacts, send emails to them, log out, throw away the password, and promise not to do anything evil 😱
For many people, their Gmail or Yahoo account was their main email account and the key to everything they had: bank details, finances, password resets, etc. So it is a really bad idea to hand that password to any random app or website.
I am not singling out Yelp for designing this type of solution; it is just a good example to illustrate what the problem was and that there was no proper way to solve it back then.
Today we have a better way to solve this problem using the OAuth protocol, like in the image above, where GitLab wants to access GitHub user information.
Now we will see how Yelp should have solved this problem using OAuth.
Nowadays almost every application has a link or button that says something like "connect with google". When the user clicks this button, the user is put into an OAuth flow, and the ultimate result is that the application can access the information the user has authorized.
The user can be sure that an application asking for contact access can only read the contacts. It cannot delete a contact, send an email, or view the photos on Google Photos or the files on Drive.
When the user clicks the "connect with google" button, the browser redirects to a Google login page such as accounts.google.com. This is already a bit better than the previous solution, because the user is giving the password to Google and not to Yelp or anybody else.
Assume the user has logged in successfully. A prompt then appears saying that the application Yelp wants to access a list of your information, such as your public profile or contact details, and asking whether you are sure you want to allow Yelp to access it.
Now the user explicitly consents to whatever access is being granted. That is important, so that the user doesn't get tricked into agreeing to something they don't know about.
If the user clicks No, we are done: nothing interesting happens, and the flow would have to start over. If the user clicks Yes, the browser redirects one more time, back to the application where it all started, to a special place in the application called a callback or redirect URI, which we will discuss later in this blog.
Then, with a little bit of magic, the application (Yelp in our case) is allowed to talk to some other API, say the Google Contacts API. Yelp usually doesn't have access to this API, but now it has something (an access_token) that it received after the user clicked the Yes button.
Here are the things that go into this magic:
- Resource Owner
- Client
- Authorization Server
- Resource Server
- Authorization Grant
- Redirect URI
- Access Token
A resource owner is a fancy way of talking about you and me: the user sitting in front of the computer, clicking the Yes button, who owns the data that the application Yelp wants to access. In this context, I am the resource owner who has a Gmail account and contacts, and Yelp wants to access my Gmail contacts.
A client is just a way to refer to the application. In this case Yelp is the client: the application that wants the resource owner's data.
An authorization server is the system the resource owner uses to say "yes, I authorize this permission and I allow this to happen". In this case, the authorization server is accounts.google.com.
If you search Google for the term "authorization server", this is what you get 😧!! And this is why I am writing this blog: to make things a little bit clearer.
A resource server is different from the authorization server. The resource server is the API server that holds the data the client wants to access. In this case, the Google Contacts API is the system that holds my contacts.
Sometimes the authorization server and the resource server are melded into the same system, but many times they are separate.
The whole point of the OAuth flow, going over to the authorization server and coming back to the client, is to get something called an authorization grant. The authorization grant is the thing that proves that I clicked Yes, that I consent to this level of permission, or that I allow you to access my Gmail contacts.
When the authorization server redirects back to the client application, it needs to know where to redirect to. This is called the callback, or sometimes the redirect URI. It is simply where the user should end up at the end of the flow after clicking Yes.
I mentioned above that the authorization grant is the point of this whole flow, but at an even higher level, the thing the client really needs is something called an access token.
An access token is the key the client uses to get at whatever data the user has granted access to on the resource server.
Phew, so much information! If you have read this far you are probably confused and overwhelmed, but show a little patience and all your doubts and confusion will clear up in a bit.
Let's revisit the diagram I described earlier, this time with all of this jargon.
We start the flow again with me, the resource owner, sitting on the client website Yelp. I click the "connect with google" button, and I am redirected to the authorization server, which is accounts.google.com in this case. It could just as well be the Facebook authorization server, an authorization server hosted by Okta, or somebody else's.
Right at the beginning of this flow, when the client redirects over to the authorization server, it already passes along some configuration information that the authorization server needs. It is saying: "Hey, when you are done, assuming everything is successful, here is the URI where I want you to redirect back to at the end." So we have to pass the redirect URI at the very beginning, and we also have to provide some other information, such as what type of authorization grant we want.
There are actually a few different types of authorization grant. In this case, we will use a simple and widely used one called the code grant, or authorization code grant. Because we are requesting a code, this is called the authorization code flow.
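The redirect that starts the authorization code flow can be sketched as building a URL. This is just a sketch under assumptions: the endpoint `https://auth.example.com/authorize`, the callback URL, and the `client_id` value are placeholders I made up, and the `state` value is an extra anti-forgery parameter that this post does not cover. The parameter names themselves (`response_type`, `client_id`, `redirect_uri`, `state`) come from the OAuth 2.0 spec.

```python
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, state):
    """Build the URL the browser is redirected to at the start of the flow."""
    params = {
        "response_type": "code",       # we are asking for an authorization code
        "client_id": client_id,        # identifies the client (Yelp) to the server
        "redirect_uri": redirect_uri,  # where to send the user back afterwards
        "state": state,                # opaque value echoed back to the client
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


# Hypothetical values, for illustration only.
url = build_authorization_url(
    "https://auth.example.com/authorize",
    "yelp-client-id",
    "https://yelp.example.com/callback",
    "xyz",
)
print(url)
```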
The authorization server now prompts for the login, asks for consent to the permission, and all that good stuff. Then it redirects back to the place specified at the beginning, the redirect URI, and it attaches something called an authorization code, because that is what we asked for at the beginning.
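On the client side, handling that redirect mostly means reading the query string of the callback URL. Here is a rough sketch, assuming the code arrives in a `code` query parameter and a denial arrives in an `error` parameter, as the OAuth 2.0 spec describes. The function name, the callback URL, and the `state` check are my own assumptions for illustration.

```python
from urllib.parse import parse_qs, urlparse


def extract_authorization_code(callback_url, expected_state):
    """Pull the authorization code out of the URL the user was redirected to."""
    query = parse_qs(urlparse(callback_url).query)
    # The state must match what the client sent at the start of the flow.
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch, ignoring this callback")
    # If the user clicked No, the server reports an error instead of a code.
    if "error" in query:
        raise PermissionError(f"authorization failed: {query['error'][0]}")
    if "code" not in query:
        raise ValueError("callback did not include an authorization code")
    return query["code"][0]
```

For a hypothetical callback such as `https://yelp.example.com/callback?code=abc123&state=xyz`, the function returns `abc123` when the expected state is `xyz`.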
The client can't do a whole lot with that authorization code. In fact, there is only one thing the client can do with it: go back to the authorization server one more time and exchange the authorization code for an access token.
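That exchange is a plain HTTP POST from the client's server to the authorization server. The sketch below only builds the request and does not send it. The token endpoint URL, the `client_id`, and the `client_secret` are hypothetical placeholders (a client secret is not discussed in this post; I am assuming a server-side client that has one). The form field names follow the OAuth 2.0 spec.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_endpoint, code, redirect_uri, client_id, client_secret):
    """Build (but do not send) the request that trades a code for a token."""
    body = urlencode({
        "grant_type": "authorization_code",  # tells the server what we are exchanging
        "code": code,                        # the code received on the callback
        "redirect_uri": redirect_uri,        # must match the one used earlier
        "client_id": client_id,
        "client_secret": client_secret,      # proves the request comes from the client
    }).encode("ascii")
    return Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical values, for illustration only.
request = build_token_request(
    "https://auth.example.com/token",
    "abc123",
    "https://yelp.example.com/callback",
    "yelp-client-id",
    "yelp-client-secret",
)
```

Sending it with `urllib.request.urlopen(request)` against a real authorization server would return a JSON document that contains the access token.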
After getting the access token, the client Yelp can do what it actually wanted to do in the first place, which is to go to the resource server, contacts.google.com, and ask for the user's contacts.
Normally the resource server would not allow access to the user's contacts, but in this case the client attaches the access token to the request, which proves that the user said it is OK for the client to access the contact information.
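Attaching the access token usually means putting it in the `Authorization` header as a bearer token. Here is a small sketch that builds such a request without sending it. The API URL and the token value are made-up placeholders, not real Google endpoints.

```python
from urllib.request import Request


def build_contacts_request(api_url, access_token):
    """Build (but do not send) an API request that carries the access token."""
    return Request(
        api_url,
        headers={
            # The token travels in the Authorization header as a bearer token.
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


# Hypothetical values, for illustration only.
contacts_request = build_contacts_request(
    "https://contacts.example.com/v1/contacts",
    "token-123",
)
```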
Now suppose the client tries to do something else, say, not retrieve my contacts but delete all of them. In this case, the resource server says: "OK, you have an access token, but that doesn't mean you can do just anything. The user has given you read-only access to the contacts."
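The check the resource server performs can be pictured roughly like this. It is a toy sketch, not how any real API is implemented: the token table, the permission names `contacts.read` and `contacts.write`, and the paths are all invented for the example.

```python
# Hypothetical table of issued tokens and the permissions the user granted.
TOKENS = {"token-123": {"contacts.read"}}

# Hypothetical mapping from API operations to the permission each one needs.
REQUIRED_PERMISSION = {
    ("GET", "/contacts"): "contacts.read",
    ("DELETE", "/contacts"): "contacts.write",
}


def authorize(method, path, access_token):
    """Return the HTTP status the resource server would answer with."""
    granted = TOKENS.get(access_token)
    if granted is None:
        return 401  # unknown or missing token
    required = REQUIRED_PERMISSION.get((method, path))
    if required is None:
        return 404  # no such operation
    if required not in granted:
        return 403  # valid token, but the user never granted this permission
    return 200


print(authorize("GET", "/contacts", "token-123"))     # reading is allowed
print(authorize("DELETE", "/contacts", "token-123"))  # deleting is refused
```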
So an access token lets the client do specific things, but how does the client specify which things it wants to do?
Let's add some more terminology.
An authorization server has a list of scopes that it understands, and the client asks for the ones it needs, for example, read-only access to contacts.
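In practice the client names the scopes it wants in the authorization request, as a single space-separated `scope` parameter (that format comes from the OAuth 2.0 spec). A rough sketch follows; the scope names `contacts.read` and `profile.read`, the `client_id`, and the callback URL are hypothetical, since every authorization server defines its own scope names.

```python
from urllib.parse import urlencode


def authorization_query(client_id, redirect_uri, scopes):
    """Build the query string of an authorization request that asks for scopes."""
    return urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # Scopes are sent as one space-separated string.
        "scope": " ".join(scopes),
    })


# Hypothetical values, for illustration only.
query = authorization_query(
    "yelp-client-id",
    "https://yelp.example.com/callback",
    ["contacts.read", "profile.read"],
)
print(query)
```

The space between the two scopes is encoded as `+` in the resulting query string.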
Too much? OK, I get it. Let's continue with scopes and consent in the next blog.