-- Photo by Omer Rana on Unsplash
When working on real production system recently I asked myself a question:
Should REST API be constrained by the current architecture choices?
To illustrate the problem I invented an example.
Let's imagine a system which is used for renting cars. The business is sort of Airbnb for "car rental". Small companies which rent cars can register their spot in the system to access wider range of potential customers. The system allows them to manage their cars but not any other rental spot cars.
Let's imagine we're a startup and build the system. We start small and use relational database to store all cars in a single table. Each car is identified by unique ID in this table.
We want to export a RESTful API for our system.
Among others, we would need APIs to browse all cars in a spot and get single car details.
The API for listing all cars in a spot could look like:
GET /spots/{spotId}/cars
It would return a list of cars from which we could get IDs of the cars.
The API for getting a car by ID could look like:
GET /spots/{spotId}/cars/{carId}
or
GET /cars/{carId}
Since we want to be aligned with good practices of API design, we've decided to go with the longer path, because the cars are resources which cannot exist alone and always belong to a given spot. The path /spots/{spotId}/cars
clearly explains the relationship.
However, the spotId
in the path is redundant.
Since we have all the cars in single table and we know the car ID, because we got it from the /spots/{spotId}/cars
endpoint, the only variable we really need is the carId
.
Of course, in our relational database we will have relation from car to a spot and we could add the spotId
in out query, but it's not crucial.
E.g. we could have a query like:
select c.* from cars c inner join spots s
on s.id = c.spot_id
where s.id = :spotId and c.id = :carId
but it would get the same result as:
select * from cars
where id = :carId
So, should we use /spots/{spotId}/cars/{carId}
or /cars/{carId}
as the endpoint path?
I've been thinking about it and both options have pros & cons. As mentioned before, the longer one sounds more appropriate from the semantics of the API perspective, but the shorter one is easier to use and implement in the current state of the backend architecture.
If we think about the evolution of our service, then we can imagine that we may want to split the cars table into separate per each spot. This may happen if the volume of data grows, or if we want to distribute database and set several instances in locations nearby to each spot (for better performance and scaling). Each car would then be unique but within the single DB instance (or instances if we consider a cluster of instances in specific location for given spot). Then we could only distinguish a car by a pair of spotId
and carId
and the longer API path would make more sense.
Finally, I answered to myself:
API is not still. When the architecture evolves so does API.
What currently makes sense and is simple (/cars/{id}
) may not be applicable anymore in future. In future, if I need to split car storage into separate table/database for each car rental spot, the new API may look like: /spots/{spotId}/cars/{carId}
. On the other hand this might never happen and as Donald Knuth used to say “Premature optimization is the root of all evil”.
What is your answer to the problem. If you have more thoughts please share with me and other readers in comments.
Top comments (25)
We can all picture what a “car” represents, but what does a “spot” mean in this context? A physical parking spot? If so, think about the real relationship between them as entities. I’d argue that a car can exist without a spot (what if it’s out being driven and another car gets put in the previous spot?) and that a spot can exist without a car (the car that used to be in that spot got crashed, now what?)
If each one can exist in your system without the other, then I think there’s no real hierarchy between them and having both a /car/{carid} and a /spot/{spotid} endpoint makes more sense.
Why don't use grapql instead.. You only need to design the query.. Not the endpoints..
Sure, but my post refers to REST API, of course you can use different technologies where you don't have this problem, but sometimes you're constrained to use REST.
Maybe I was not precise enough, please see my answer above.
I've been dealing with this lately with regards to shared user profiles. A user can be a member of any profile so long as they are invited, and can perform CRUD operations on entities owned by the profile.
Profiles have private items with all the standard CRUD ops, so I've gone with the "long approach:"
/profiles/:profile_id/items/:item_id
. In my case, I think this is semantically correct because a user should be operating in the context of a profile. If the given profile tries to update an item owned by another profile, I issue anunauthorized
response informing the user of the API that the given profile doesn't own the item.The short approach, on the other hand, has no sense of profile context. Though there are plenty of cases where the nested resources don't make sense, I think they are especially nice when a user of your API operates under a given context, such as a profile
I've been digging more and encountered this post from Google which also has a section mentioning the concern I described in my post as well as many other interesting ideas. You may find it worth reading too :)
Had same thoughts when started new product few weeks ago. Ended up with exposing sub object without the root eventhough the root can be extracted is like a postcode without an actual address to the mailman or the recipient. APIs are for people.
Either option is entirely logic and as a developer using your API I would understand both use cases. As long as it's well documented then both options are perfectly valid.
That said, spots and cars are seperate entities that could exist without each other. If you wanted to know where a car was at any given time then having spot id in the request to get that info seems strange.
I'd probably go for car/{carid} and spot/{spotId}
By saying in post that car is a part of a spot I mean that you cannot rent the same car in other spot or park it there as each spot is managed by completely independent company, same way you have flats for rent on Airbnb.
Aha! I see, sorry my misunderstanding.
Generally the concept of a resource should be fully decoupled from any middleware that generates the representation of that resource. A resource is a concept, and, while the representation is dependent on your architecture, the resource should remain conceptually independent and coupled only to the problem domain.
Additionally, it's recommended REST practice to try to make your URIs as still and unchanging as possible, as the concept of the resource should change very slowly if at all (see "Cool URIs Don't Change" w3.org/Provider/Style/URI). The representation of this resource can change as often as it want so long as it conceptually equivalent to the original resource.
Yes, this makes also a lot of sense. This reason would probably win if I am doing a publicaly available API.
I had the exact same thought because I had users, memberships, and organizations. Each membership connects a user with an organization, so each org can have multiple members and each user can be part of multiple orgs.
The basic RESTs are easy:
GET /users/:id/memberships
GET /organizations/:id/memberships
But it’s the same problem when thinking of a single membership:
You could GET /memberships/:id or have two endpoints, GET /users/:id/memberships/:mid and GET /organizations/:id/memberships/:mid. We chose the second, longer one and added the additional condition in the query because it felt semantically logical.
Discussing href structure has nothing to do with RESTful architecture. To be restful you need to be discussing the hypermedia controls between resources. URLs are irrelevant in RESTful.
It depends.
As far as I know there are 4 different maturity levels of REST according to Richardson. You're probably referring to 3rd level also known as HATEAOS. In practice I have never worked in project doing the API this way. Even big companies like FB or Google does not build their APIs this way because in most cases API clients don't need this kind of intelligence. So if your client needs this, then OK, otherwise it just adds too much work overhead and clutters the API.
Usually what I do is the 2nd level or maturity so I would not agree with your saying.
Moreover, please note that I am discussing href structure in context of service architecture (backend app design), not the REST architecture (multimedia headers, http methods, maturity).
It does not depend. Fielding gave one definition of rest despite what Richardson says. But EVEN if we take Richardson, not a single level mentions href structure so again..this has nothing to do with RESTful at all.
Google and fb build tons of RESTful services. Their entire html applications are RESTful. You are way off track here.
there is no such distinction between restful architecture and backend design as you say. RESTful goes from service to client, that’s a core tenant of its very definition. Try to think about your problem restfully and you’ll soon see why your questions are so confusing now. Your trying to model your problem domain with a broken language.
Ok, we have different opinions and I appreciate it. So what's the right API for my problem domain according to you and can you justify it?
That’s a fair question, lemme think about it today
Instead of focusing so much on the architecture constraints, I would concentrate on defining common use cases for my API. If a use case makes sense for my business, there is a big chance I will need to change your architecture, your db design or both.
Conceptually speaking, API resource ids should be decoupled from your legacy data models. Otherwise your API will eventually be limited and there would be not much room for API new versions.
In order to enforce that decoupling, when designing a new API, I always consider building an intermediate data model for my APIs that can bind resource ids with legacy apps ids.
Mostly I agree with what you wrote except decoupling API IDs from data model IDs. I've never seen this in practice. How would you achieve this? e.g. if I have a UUID in my DB record how would you translate it to API ID?
I like the shorter URL for a couple reasons: