Andrea Chiarelli

Posted on Nov 19, 2023

Sessions, Tokens, and Rock'n'Roll

#cookies #jwt #token #session

Does dealing with tokens eliminate the need for cookie-based sessions? You will find plenty of questions like this on Stack Overflow, Reddit, and other developer communities. Very often, developers confuse sessions, cookies, tokens, and other similar concepts that are somewhat related to authentication and authorization. Let’s try to get some clarity.

Disclaimer: This is not a complete guide on sessions, cookies, and tokens. This is an attempt to clarify the basic concepts on which sessions, cookies, and tokens are grounded and dispel the most common misunderstandings on the topic.

What Is a Session?

Let's start by clarifying what a session is. A session is a sequence of interactions between a user or a client application and a server. A session starts with the initial interaction and lasts until it is explicitly closed or a given period of inactivity elapses. A session can also last indefinitely if no specific termination rule is defined.

Usually, sessions have an associated state containing specific data, for example, the user's name, email, and other data related to the current user's activity.

From a conceptual point of view, this is what you need to know about a session. But you may be thinking about how to implement it, and that's probably where the problems start.

Before moving on, let us point out that the implementation of a session depends on the specific communication protocol. In this article, only HTTP-based sessions will be addressed.

Sessions and Cookies

On the Web, a session is a sequence of HTTP requests from a client and related responses from the server. However, by design, each HTTP request is independent of all others. That means that the HTTP protocol itself can't keep information between requests. In other words, it's a stateless protocol. How can a server recognize that a group of requests is coming from a specific client? How can it maintain the state between independent calls? Here's where cookies traditionally come in.

Here are the cookies!

I guess you know what a cookie is. Either as a user or as a developer, you have dealt with them directly or indirectly. Although, as a user, you may dislike them for their sometimes intrusive use, as a developer, you may have appreciated their usefulness.

Often, the names assigned to these artifacts seem to attribute thaumaturgic properties to them: authentication cookies, tracking cookies, session cookies. But beyond that, a cookie is just a small block of data created by the server and stored on the client, your web browser. Nothing more. The information stored in the cookie makes it special.

Session cookies

The typical approach to building HTTP-based sessions is for the server to send the browser a cookie with a unique ID: the session ID. That cookie, which magically becomes the session cookie, will be sent back to the server with any subsequent request. This lets the server recognize the HTTP requests coming from the same client, which together make up a session. In addition, the server can use the session ID to store and retrieve data related to the current user's activity: that's how the session state is maintained!

Session cookies rely on the native ability of web browsers to manage them without the developer's intervention. Of course, they should have the HttpOnly and Secure attributes set, and other requirements must be met to ensure security, but this is beyond the scope of the topics to be discussed here.

The main takeaway is that cookies allow tracking clients' requests so that they can build a session with a state stored on the server side. Usually, the session cookie only contains the session ID value.
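To make the flow concrete, here is a minimal sketch using only Python's standard library. The in-memory `SESSIONS` store, the `session_id` cookie name, and the `start_session` and `load_session` helpers are assumptions made for illustration; a real application would normally rely on its web framework's session support.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Optional

# In-memory session store: session ID -> session state (illustrative only).
SESSIONS: dict = {}

def start_session(state: dict) -> str:
    """Create a session and return the Set-Cookie header value for it."""
    session_id = secrets.token_urlsafe(32)  # random, hard-to-guess ID
    SESSIONS[session_id] = state
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True  # not readable from JavaScript
    cookie["session_id"]["secure"] = True    # only sent over HTTPS
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()

def load_session(cookie_header: str) -> Optional[dict]:
    """Look up the session state from the Cookie header of a request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return SESSIONS.get(morsel.value) if morsel else None

# The server answers the first request with a Set-Cookie header...
set_cookie = start_session({"user": "alice"})
# ...and the browser sends the name=value pair back with later requests.
cookie_header = set_cookie.split(";", 1)[0]
print(load_session(cookie_header))
```

Note that the cookie carries only the session ID: the state itself never leaves the server.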

Pros and cons of cookie-based sessions

Now that you understand how cookie-based sessions work, let's summarize the main advantages and disadvantages of this approach.

Pros

Cookies are tied to a domain, that is, the browser sends them back only to the server that created them. They can be configured to be shared with subdomains, but by default, they can't be shared with other domains.
They require little storage on the browser. As a result, they have a low impact on the amount of data transmitted in each individual request.
They are automatically managed by the browser.

Cons

Without proper precautions, cookie-based sessions are exposed to CSRF attacks, and cookies readable by scripts can be stolen through XSS attacks, giving an attacker improper access to active user sessions.
You may run into CORS problems when building apps across multiple domains.
They have a scalability problem. If you replicate the server to spread its high workload, you need to centralize the session state management. In fact, the server that receives a request with a cookie might be different from the server that generated it. This increases the complexity of each interaction.

Sessions and Tokens

Now you know that requests sent by a browser to a server are bundled into sessions using cookies. So far, so good!

But the Web is not just for browsers. Mobile and desktop clients, also known as native clients, can make HTTP requests as well. Leaving out clients that use an embedded browser to render web pages, native clients typically lack the automatic cookie handling that browsers provide. So the problem arises again: how do you build a session from a sequence of stateless HTTP requests? This is a job for tokens. 😎

Meet the token

Before going ahead to see how tokens can help you build HTTP-based sessions, let's try to understand what a token is. Generally speaking, "token" is an extremely generic term. However, in the context of this article, a token is just a string, a sequence of generated characters. In particular, it's a unique string that identifies requests coming from the same client.

What are you saying? Do you have a feeling of déjà vu? Well, actually, the basic idea of building sessions here is the same as the cookie-based one. The token value is the session ID. The native client must send it to the server along with each request.

Unlike the cookie case, however, you don't have a standard specification that tells you how to store and manage tokens. You can send them to the server as a header, in the query string, or in the request's body. You don't even have a standard tool that takes charge of automatically managing tokens, as the browser does with cookies. As a developer of the native application, it's your responsibility to take care of the tokens according to what the server expects. So, you have to store the token sent by the server somewhere in your application, maybe just keeping it in a variable, but possibly you would save it to storage. You have to add the token to each request sent to the server. If your application interacts with different servers, you have to take care to send the right token to the right server. In other words, you have to replicate much of the work that browsers automatically do with cookies.
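As a rough sketch of that client-side bookkeeping, the hypothetical `TokenStore` class below keeps one token per server host and attaches it to outgoing requests. Sending the token in an `Authorization: Bearer` header is an assumption made here; your server may expect it in the query string or in the request body instead. The example only builds requests and never touches the network.

```python
from typing import Dict
from urllib.parse import urlsplit
from urllib.request import Request

class TokenStore:
    """Keeps one session token per server and attaches it to requests."""

    def __init__(self) -> None:
        self._tokens: Dict[str, str] = {}  # host -> token

    def save(self, url: str, token: str) -> None:
        """Remember the token issued by the server behind `url`."""
        self._tokens[urlsplit(url).netloc] = token

    def build_request(self, url: str) -> Request:
        """Build a request, adding the token only for the matching server."""
        request = Request(url)
        token = self._tokens.get(urlsplit(url).netloc)
        if token is not None:
            request.add_header("Authorization", f"Bearer {token}")
        return request

store = TokenStore()
store.save("https://api.example.com/login", "opaque-token-123")

same_server = store.build_request("https://api.example.com/orders")
other_server = store.build_request("https://other.example.org/data")
print(same_server.get_header("Authorization"))
print(other_server.get_header("Authorization"))
```

Here the token lives in a plain variable in memory; an application that must survive restarts would save it to the platform's secure storage.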

On the server side, everything remains pretty similar to the cookie-based session management. The server maintains the session state, and the session ID is the key to retrieving it. The only difference is that the session ID is not passed in a cookie but is the token's value.

JWTs

The type of token described so far is what is known as an opaque token, that is, a token whose value has no specific meaning for the client application. But in the magical world of tokens, there are some "talking" ones, such as JSON Web Tokens (JWTs). These are tokens containing encoded data that the client application can decode and read. This feature opens new horizons in session management: the inversion of state maintenance! Why keep the session state on the server if it can be kept on the client application?

Think about this for a moment. Keeping the session state on the server comes at a cost in terms of both storage and performance. Also, if the client needs certain information about the current session, for example, the user's profile or other technical data, it has to make calls to the server. Then there is the system scalability problem: if you replicate the server to spread its high workload, you need to centralize the session state management. In fact, the server instance that receives a request with a token might be different from the server instance that generated that token. Remember that this detail also applies to session cookies. In short, server-side session management is a bit of a hassle for the server.

JWTs can solve these problems. They encode into the token the information (or at least part of it) that the server would store in the session state. The client application will have the user's data and other stuff at its disposal as soon as the session starts and no longer needs to request them from the server for the duration of the current session. This frees some server memory and lightens its workload a bit.

Also, the client application sends the token back to the server when it makes a request. The server instance receiving that request will get the session state from the token, and the scalability problem is solved as well. That's fantastic! 🎉

JWTs and browsers

The good news is that JWTs are not specific to native clients. Web applications can also use them. You can use JWTs in your Single Page Application to lighten the server workload and promote server scalability.

Of course, you will need to do some additional work on your SPA's code. You have to decode the JWT and verify its signature to make sure it comes from the right server. You have to store it somewhere to prevent losing it after a page refresh. You also have to add the JWT to each request sent to the server and take care of sending the right token to the right server, in case you talk with multiple servers.

Wait! Isn't part of this stuff automatically handled by the browser when you use cookies? What about storing your JWT in a cookie? That's a great idea! You still have the session information at your fingertips and leverage the standard cookie behavior to handle the interaction with the server. Awesome! 🙌

While storing a JWT in a cookie allows you to delegate part of the token management to the browser, you should also consider cookies' disadvantages, as mentioned earlier.

Pros and cons of token-based sessions

Let's summarize the main benefits and drawbacks of using tokens to track your sessions.

Pros

Tokens are platform-independent: they can be used by web applications as well as native apps.
JWTs promote server scalability, since they can relieve the server of the responsibility of maintaining the session state.
Tokens can be saved in different types of storage: they can be stored in memory, in cookies, in the browser's session storage, or in native secure storage.

Cons

Tokens may be exposed to greater security risk since there are no standard protection mechanisms like those applied by browsers to cookies, and their management is left to the developer.
JWTs can become large if a lot of state information is kept in them. Since the token travels with every request, this increases the amount of data transmitted to the server.
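The server can't easily invalidate a JWT once it has been sent to the client: the token stays valid until it expires unless the server keeps some extra state to track revoked tokens. In the meantime, the data it carries can become outdated.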
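The last drawback can be sketched as follows. The check below assumes the JWT's signature has already been verified and that the token carries the standard `exp` (expiration time) and `jti` (token ID) claims. The `REVOKED_JTI` set is a hypothetical server-side denylist; note that keeping one brings back some of the server-side state that JWTs were meant to avoid.

```python
import time
from typing import Optional, Set

REVOKED_JTI: Set[str] = set()  # hypothetical denylist of revoked token IDs

def is_token_usable(claims: dict, now: Optional[float] = None) -> bool:
    """Return True if already-verified JWT claims are still acceptable."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False  # expired: the client needs a new token
    if claims.get("jti") in REVOKED_JTI:
        return False  # explicitly revoked before its expiration
    return True

claims = {"sub": "alice", "jti": "token-1", "exp": 1_000}
print(is_token_usable(claims, now=900))   # still within its lifetime
REVOKED_JTI.add("token-1")
print(is_token_usable(claims, now=900))   # same token after revocation
```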

ID Tokens and Sessions

Do you know the law of the hammer? It says that when you have a hammer, everything looks like a nail. Maybe that's what happens to some developers when they discover JWTs. Let me explain briefly.

When does a session start? Usually, after the first request is sent to the server. In its response, the server sends a brand new session ID (through a cookie or as a token), and everything begins.

And what is one of the most common first requests sent to a server by an application? An authentication request.

If your application uses OpenID Connect (OIDC) for user authentication, it will receive a JWT as a confirmation of successful authentication and, optionally, some data about the user. Can you consider this JWT a session token? If you are an OIDC-experienced developer, you know that the answer is "no!". But once I was asked by a developer: "Why do you still need cookies if you get a JWT from the authentication server?"

Let's clarify that the JWT sent by the OIDC provider is a token named the ID token, which has a very specific task: to prove that the user has been successfully authenticated. It's not meant for session management. You should still use cookies or yet another token to manage your application's session.

To learn more about the nature of ID tokens and their correct use, check out this article.

When your app receives an ID token, it knows that the user has been successfully authenticated, and it can start a session — but with its own session ID.
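A minimal sketch of that step, assuming the ID token has already been validated (signature, issuer, audience, expiration) and decoded into a dictionary of claims. The `APP_SESSIONS` store and the `start_app_session` helper are hypothetical names; `sub` and `name` are standard OIDC claims.

```python
import secrets

APP_SESSIONS: dict = {}  # hypothetical application session store

def start_app_session(id_token_claims: dict) -> str:
    """Start the app's own session after a successful OIDC login.

    `id_token_claims` is assumed to come from an ID token that was
    already validated; this function does not verify anything itself.
    """
    session_id = secrets.token_urlsafe(32)  # brand new ID, unrelated to the ID token
    APP_SESSIONS[session_id] = {
        "user_id": id_token_claims["sub"],
        "name": id_token_claims.get("name"),
    }
    return session_id

session_id = start_app_session({"sub": "user-123", "name": "Alice"})
print(APP_SESSIONS[session_id]["user_id"])
```

The ID token is used once, to learn who the user is. From then on, the application tracks requests with its own session ID, sent through a cookie or as a token.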

Session & session

Consider the following picture, which describes a common scenario involving OIDC authentication:

In this scenario, you have two active sessions:

The session with the OpenID Connect provider. This session starts with user authentication and tracks all interactions with the authentication server.
The session with your own server. This session starts when your application receives the ID token and tracks all the interactions with your own server.

What is the relationship between the two sessions?

They are largely independent of each other. If the session with your own server expires or is closed, all the data related to this session should be destroyed, including your ID token. However, the end of your session normally has no effect on the session with the OIDC provider. It can continue to be alive. This can be convenient because it allows the user to log back in to your app without having to re-enter their credentials as long as the session with the OIDC provider is active.

What happens if the session with the OIDC provider expires? As said before, the two sessions are independent of each other, so, in practice, nothing happens to your server's session unless you have specific requirements. If you do, you have to look for a solution to keep the two sessions in sync.

Remember, your ID token has nothing to do with your session apart from telling you that you can start it.

Access Tokens and Sessions

In the OAuth 2 context, access tokens allow a client application to access a resource, such as an API, on behalf of the user.

To learn more about access tokens and how they differ from ID tokens, check out this article.

In a client-server interaction, access tokens are credentials indicating that the client application is authorized to send requests to the server. Fundamentally, they have a different role from a session token. That is, their job is not to keep track of requests made by the same client and possibly enable session state management.

Analogies with session tokens

Within certain limits, however, the use of an access token can resemble that of a session token. It is sent to the server with each request and the server uses it to retrieve information about the permissions granted to the client for this set of interactions. This is true both when the access token is an opaque token and when it is a JWT. Note the analogy with session state retrieval described earlier.
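For example, when the access token is a JWT, the server could check permissions along these lines. The sketch assumes the token was already verified and that permissions are expressed as a space-separated `scope` claim, which is a common convention but not the only one. The `is_allowed` helper and the scope names are made up for illustration.

```python
def is_allowed(access_token_claims: dict, required_scope: str) -> bool:
    """Check whether verified access token claims grant `required_scope`."""
    granted = access_token_claims.get("scope", "").split()
    return required_scope in granted

claims = {"sub": "alice", "scope": "read:orders write:orders"}
print(is_allowed(claims, "read:orders"))
print(is_allowed(claims, "delete:orders"))
```

With an opaque access token, the lookup would go through a server-side store or the authorization server instead, much like retrieving session state from a session ID.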

Differences from session tokens

Unlike session tokens, the information represented by access tokens is critical to client and server security. One strategy to mitigate the risk of theft and unauthorized use of an access token is to reduce its validity to a limited time.

Of course, this contrasts with the classic concept of a session as a potentially unbounded series of interactions between the client and the server. If the session depended on the access token alone, the token's expiration would necessarily terminate the current session: the token is no longer valid, and the user must authenticate again to obtain another access token for the client application. This would mean starting a new session.

Fortunately, OAuth 2 provides refresh tokens, that is, tokens designed to obtain a new access token without user intervention when the current one expires. They provide a compromise between security needs and user experience.
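On the client side, the refresh logic can be sketched like this. The `TokenManager` class is hypothetical, and `fake_refresh` is a stand-in for the real call to the authorization server's token endpoint, so that the example runs without any network access.

```python
import time
from typing import Callable, Optional, Tuple

class TokenManager:
    """Returns a valid access token, refreshing it once it has expired."""

    def __init__(self, access_token: str, expires_at: float, refresh_token: str,
                 refresh: Callable[[str], Tuple[str, float]]) -> None:
        self.access_token = access_token
        self.expires_at = expires_at
        self.refresh_token = refresh_token
        self._refresh = refresh  # stands in for the token endpoint call

    def get_access_token(self, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        if now >= self.expires_at:
            # Exchange the refresh token for a new access token; no user involved.
            self.access_token, self.expires_at = self._refresh(self.refresh_token)
        return self.access_token

def fake_refresh(refresh_token: str) -> Tuple[str, float]:
    """Stand-in for the token endpoint: returns (new_token, new_expiry)."""
    return "access-2", 2_000.0

manager = TokenManager("access-1", expires_at=1_000.0,
                       refresh_token="refresh-1", refresh=fake_refresh)
print(manager.get_access_token(now=500.0))    # token still valid
print(manager.get_access_token(now=1_500.0))  # token expired, refreshed silently
```

From the user's point of view nothing happens: the application keeps working, even though the access token it sends has changed.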

To learn more about how refresh tokens work, read this article.

In addition, access tokens and refresh tokens can be used even when the user has no current active session. This is the so-called offline access mode. Think of an application that sends tweets or emails on your behalf on a scheduled basis: once you authorize it, you don't need to have an active session on that app when the scheduled job is triggered.

These features highlight how an access token does not have the typical characteristics of a classic session token. Its value is not simply a mechanism for grouping calls to a server and retrieving information related to user activity. It serves a very different purpose.

What About Rock’n’Roll?

If you reached this section, you have learned a lot about sessions, cookies, and tokens.

At this point, you may be wondering: what does rock'n'roll have to do with this? Well, to tell you the truth, it only served to... set the tone for the title of the article.

But in fact, there is some relevance. The interactions between a client and a server, the going back and forth between the two actors, are somewhat reminiscent of a dance. Incidentally, the interactions between the client and the OIDC provider for user authentication are sometimes called the authentication dance. It may not be quite rock'n'roll, but it is still dancing. 💃🕺🏻

Summary

In this journey, you explored the session concept in the HTTP context and how it can be implemented using cookies or tokens. Beyond the main pros and cons that each approach brings, you learned that a token does not necessarily have anything to do with session management. The case of ID and access tokens is an example: each of them has its own specific role in authentication and authorization.