K for moesif

Posted on May 6, 2019 • Originally published at moesif.com

REST API Design Best Practices for Parameters and Query String Usage


Cover image by Edwin Young, on Flickr.

When we design APIs, the goal is to give our users some amount of power over the service we provide. While HTTP verbs and resource URLs allow the basic interactions, it is often important to supply some additional functionality; otherwise, the system can become too cumbersome to work with.

One example of this is pagination: we can't send every article to a client in one response if we have millions in our database.

A way to get this done is parametrization.

What is Parametrization

Generally speaking, parametrization is some kind of configuration for a request.

In a programming language, we can request a return value from a function. If the function doesn't take any parameters, we can't directly affect this return value.

The same goes for APIs, especially stateless ones like REST APIs, as follows from this Roy Fielding quote:

All REST interactions are stateless. That is, each request contains all of the information necessary for a connector to understand the request, independent of any requests that may have preceded it.

There are many ways in HTTP to add parameters to our request: the query string, the body of POST, PUT and PATCH requests, and the headers. Each has its own use-cases and rules.

So we have to ask some questions before deciding where to put our parameters, and then work out how best to express them in the place we chose.

The simplest way to get all data in without too many constraints is to put everything in the body. I have seen many APIs that work that way: every endpoint uses POST and all parameters are in the body. Legacy APIs in particular, which grew over decades and accumulated more and more parameters, have to do it like this because the sheer amount of data wouldn't fit in the query string.

While this is more often the case than not, I'd consider it an edge case in API design. If we ask the right questions up front, we can prevent such results early on.

What kind of parameter do we want to add?

The first question we should ask ourselves is what kind of parameter we want to add.

Maybe it's a parameter that is a header field already standardized in the HTTP specification.

There are many standardized fields, and sometimes we reinvent the wheel by adding this information somewhere else. I'm not saying we can't do it differently. GraphQL, for example, does many things that would seem crazy from a REST perspective, and it still works. But sometimes it's just simpler to use what's already there.

For example, there is an Accept header that allows us to define the format, or media type, the response should have. We can use this to tell the API that we need JSON or XML. We can also use this to version our API responses.

There is also a Cache-Control header. Sending it with the value no-cache keeps the API from returning a cached response, which is cleaner than using a query string as a cache buster (?cb=<RANDOM_STRING>).

Authorization can be seen as a parameter too: depending on how fine-grained the API's authorization is, the same request can produce different responses when sent with or without credentials. HTTP defines an Authorization header for this purpose.

After we have checked all the default header fields, the next step is to evaluate whether we should create a custom header field for our parameter or put it into the query string of our URL.
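As a rough sketch, the three standard headers mentioned above could be set on a request like this. The helper name, the token value and the Bearer scheme are assumptions for illustration, not part of any particular API.

```javascript
// Hypothetical helper: relies on standard HTTP headers instead of
// inventing query-string parameters for the same job.
function buildStandardHeaders({ token, bypassCache = false } = {}) {
  const headers = {
    // Ask for JSON rather than, say, XML.
    Accept: "application/json",
  };
  if (bypassCache) {
    // Takes the place of a cache buster like ?cb=<RANDOM_STRING>.
    headers["Cache-Control"] = "no-cache";
  }
  if (token) {
    // Bearer is one common scheme (an assumption); an API may use another.
    headers.Authorization = `Bearer ${token}`;
  }
  return headers;
}

const exampleHeaders = buildStandardHeaders({ token: "example-token", bypassCache: true });
```

The resulting object can be passed as the headers option of fetch or a similar HTTP client.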

When should we use the query string?

If we know the parameters we want to add don't belong in a default header field and aren't sensitive, we can then check if the query string is a good place for them.

Historically the use of the query string was, as the name implies, to query data. There was an <isindex> HTML element that could be used to send some keywords to a server, and the server would respond with a list of pages that matched the keywords somehow.

Later the query string was repurposed for web-forms to send data to a server via a GET request.

So the main use-case that fits the usage of the query string is filtering and two special cases of filtering: searching and pagination. I won't go into detail here, because we already tackled them in this article.
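To make the pagination case a bit more concrete, here is a minimal sketch of how a server might read paging parameters from the query string. The parameter names limit and offset, the defaults and the cap are assumptions chosen for illustration.

```javascript
// Hypothetical parameter names (limit, offset) and defaults.
function parsePagination(queryString, { defaultLimit = 25, maxLimit = 100 } = {}) {
  const params = new URLSearchParams(queryString);
  const limit = Number.parseInt(params.get("limit") ?? "", 10);
  const offset = Number.parseInt(params.get("offset") ?? "", 10);
  return {
    // Fall back to the default for missing or invalid values and cap the page size.
    limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, maxLimit) : defaultLimit,
    // Negative or invalid offsets start from the beginning.
    offset: Number.isInteger(offset) && offset >= 0 ? offset : 0,
  };
}

const page = parsePagination("limit=10&offset=20");
```

Capping the limit on the server keeps a single request from asking for millions of articles at once.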

But as the repurposing for web-forms shows, it can also be used for different types of parameters. A RESTful API would use a POST or PUT request with a body to send form data to a server, so that isn't a use-case here, but still, there can be other parameters.

One example would be a parameter for nested representations. By default, we return a plain representation of an article, and when a ?withComments query string is added to the endpoint, we return the comments of that article in-line, so only one request is needed.

Whether such a parameter should go into a custom header or the query string is mostly a question of developer experience.

The HTTP specification describes header fields as something like function parameters, so they are indeed meant for the kind of parameters we want to use, but adding a query string to a URL is quickly done and more obvious than creating a custom header for this.
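A minimal server-side sketch of this idea could look as follows. The in-memory data, the function name and the example host are made up for illustration.

```javascript
// Hypothetical in-memory data standing in for a database.
const article = { id: 1, title: "Query strings" };
const comments = [{ id: 10, articleId: 1, text: "Nice read" }];

function representArticle(requestUrl) {
  // The base URL is only needed so relative URLs can be parsed.
  const url = new URL(requestUrl, "https://api.example.com");
  // ?withComments carries no value, so we only check for its presence.
  if (url.searchParams.has("withComments")) {
    return { ...article, comments: comments.filter((c) => c.articleId === article.id) };
  }
  return { ...article };
}

const plainArticle = representArticle("/articles/1");
const nestedArticle = representArticle("/articles/1?withComments");
```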

These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method invocation.

Parameters that stay the same on all endpoints are better suited for headers. For example, authentication tokens get sent on every request.

Parameters that are highly dynamic, especially when they are only valid for one or a few endpoints, should go in the query string. For example, filter parameters are different for every endpoint.
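One way to picture this split is a small client that is configured once with the token and takes filters per call. This is only a sketch; createClient, the host and the filter names are hypothetical.

```javascript
// Hypothetical client factory: the token is configured once and goes into a
// header on every request, while filters vary per call and go into the query string.
function createClient(baseUrl, token) {
  return (path, filters = {}) => {
    const url = new URL(path, baseUrl);
    for (const [key, value] of Object.entries(filters)) {
      url.searchParams.set(key, String(value));
    }
    return {
      url: url.toString(),
      headers: { Authorization: `Bearer ${token}` },
    };
  };
}

const describeRequest = createClient("https://api.example.com", "example-token");
const articleRequest = describeRequest("/articles", { author: "kay", limit: 25 });
```

The function only describes the request instead of sending it, so it can be inspected or handed to whatever HTTP client is in use.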

Bonus: Array and Map Parameters

One question that crops up rather often is what to do about array parameters inside the query string.

For example, if we have multiple names we want to search.

One solution is the use of square brackets.

/authors?name[]=kay&name[]=xing

But the URI specification (RFC 3986) states:

A host identified by an Internet Protocol literal address, version 6 [RFC3513] or later, is distinguished by enclosing the IP literal within square brackets ("[" and "]"). This is the only place where square bracket characters are allowed in the URI syntax.

Many implementations of HTTP servers and clients don't care about this fact, but it should be kept in mind.
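If we want to use the bracket convention and still stay within the URI syntax, the brackets can be percent-encoded. The standard URLSearchParams class does this when serializing, as the following sketch shows. Note that it treats name[] as an ordinary key; reading it as an array is a convention of the server framework, which is assumed here.

```javascript
const bracketParams = new URLSearchParams();
bracketParams.append("name[]", "kay");
bracketParams.append("name[]", "xing");

// The serializer percent-encodes the brackets: [ becomes %5B and ] becomes %5D.
const encodedQuery = bracketParams.toString();

// Parsing accepts both the encoded and the raw form and yields the same key.
const fromEncoded = new URLSearchParams(encodedQuery).getAll("name[]");
const fromRaw = new URLSearchParams("name[]=kay&name[]=xing").getAll("name[]");
```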

Another solution that is offered is simply using one parameter name multiple times:

/authors?name=kay&name=xing
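On the receiving side, repeated names are easy to read with the standard URLSearchParams class, as long as we ask for all values and not just one:

```javascript
const repeatedParams = new URLSearchParams("name=kay&name=xing");

// get() returns only the first value for the key ...
const firstName = repeatedParams.get("name");

// ... while getAll() returns every value for the key, in order.
const allNames = repeatedParams.getAll("name");
```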

This is a valid solution, but it can hurt developer experience. Clients often keep their parameters in a map-like data structure that goes through a simple string conversion before being added to the URL. Since a map holds only one value per key, later values overwrite earlier ones, so a more complex conversion is needed before the request can be sent.

Another way is to separate the values with commas (,), which are allowed unencoded inside URLs.
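The extra conversion step does not have to be large. A minimal sketch, assuming array values should become repeated keys (toQueryString is a hypothetical helper name):

```javascript
// Hypothetical helper: turns a map-like object into a query string,
// expanding array values into repeated keys instead of overwriting them.
function toQueryString(filters) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(filters)) {
    const values = Array.isArray(value) ? value : [value];
    for (const item of values) {
      params.append(key, String(item));
    }
  }
  return params.toString();
}

const filterQuery = toQueryString({ limit: 25, name: ["kay", "xing"] });
// filterQuery is "limit=25&name=kay&name=xing"
```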

/authors?name=kay,xing
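A small sketch of building and parsing such a list follows. The helper names are hypothetical. Each value is percent-encoded on its own, so a comma inside a value cannot be confused with a separator.

```javascript
// Builds key=a,b,c with each value percent-encoded individually.
function buildCommaList(key, values) {
  return `${encodeURIComponent(key)}=${values.map(encodeURIComponent).join(",")}`;
}

// Splits the raw (still encoded) value first, then decodes each part.
function parseCommaList(rawValue) {
  return rawValue.split(",").map(decodeURIComponent);
}

const commaQuery = buildCommaList("name", ["kay", "xing"]);
// commaQuery is "name=kay,xing"
const parsedNames = parseCommaList("kay,xing");
```

The order matters when parsing: decoding before splitting would turn an encoded comma inside a value back into a separator.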

For map-like data structures, we can use the dot (.), which is also allowed unencoded.

/articles?age.gt=21&age.lt=40
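On the server, keys in this style could be folded back into a nested structure. The following is a minimal sketch; the function name and the eq default for keys without a dot are assumptions.

```javascript
// Hypothetical parser: turns age.gt=21&age.lt=40 into { age: { gt: "21", lt: "40" } }.
function parseDottedFilters(queryString) {
  const filters = {};
  for (const [key, value] of new URLSearchParams(queryString)) {
    // Keys without a dot get the assumed default operator "eq".
    const [field, operator = "eq"] = key.split(".");
    filters[field] = { ...filters[field], [operator]: value };
  }
  return filters;
}

const ageFilters = parseDottedFilters("age.gt=21&age.lt=40");
```

The values stay strings here; converting them to numbers or dates would depend on what each field means.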

It is also possible to URL-encode the whole query string, so it can use whatever characters or format we want. It should be kept in mind that this can also decrease developer experience by quite a bit.

When shouldn't we use the query string?

The query string is part of our URL, and URLs are routinely recorded in server logs, proxy logs and browser histories, and can be read by everyone sitting between the clients and the API when the connection isn't encrypted, so we shouldn't put sensitive data like passwords into the query string.

Also, the developer experience suffers greatly if we don't take URL design and length seriously. Sure, most HTTP clients will allow a five-figure number of characters in a URL, but debugging such a string is not very pleasant.

Since anything can be defined as a resource, sometimes it can make sense to use a POST endpoint for heavy parameter usage. This lets us send all the data in the body to the API.

Instead of sending a GET request with many parameters in the query string, which could lead to a really long, undebuggable URL, we could design it as the creation of a resource, such as a search resource. Depending on what our API needs to do to satisfy our request, we could even use this to cache our computation results.

We would POST a new request to our /searches endpoint that holds our search configuration in the body and get back a search ID we can use later to GET the results of our search.
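Stripped of the HTTP layer, the idea can be sketched with two functions and an in-memory store. All names, the data and the ID scheme are hypothetical; a real API would wire these into its POST /searches and GET handlers and use a persistent store.

```javascript
// Hypothetical in-memory stand-ins for a database and a search store.
const authors = ["kay", "xing", "derric"];
const searches = new Map();
let nextSearchId = 1;

// Would back POST /searches: run the search described in the body, return an ID.
function createSearch(body) {
  const id = String(nextSearchId++);
  const results = authors.filter((name) => body.names.includes(name));
  // Keeping the results next to the parameters acts as a simple cache.
  searches.set(id, { params: body, results });
  return { id };
}

// Would back a GET for a single search: return the stored entry, or null.
function getSearch(id) {
  return searches.get(id) ?? null;
}

const { id: searchId } = createSearch({ names: ["kay", "xing"] });
const storedSearch = getSearch(searchId);
```

Because the parameters travel in the body, they can grow as large as needed, and the later GET only needs the short ID in its URL.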

Conclusion

As with all best practices, our job as API designers or architects isn't to follow one approach as "the best solution" but to find out how our APIs are used.

The most frequent use-cases should be the simplest to accomplish and it should be really hard to do something wrong.

So it's always important to analyze our API usage patterns right from the start. The earlier we have data, the easier it is to implement changes if we messed up our design. Moesif's analytics service can help with that.

Moesif is the most advanced API analytics service used by over 2000 organizations to measure usage patterns of their customers.

If we go one way, because it's simpler to grasp or easier to implement, we have to look at what we get out of it.

While nested resources can make URLs more readable, they can also make them too long and unreadable if we nest too much. The same goes for parameters. If we find ourselves creating one endpoint that has a huge query string, it could be better to extract another resource out of it and send the parameters inside the body.


Top comments (4)

Connor Peet • May 14 '19 • Edited

If such a parameter should go into a custom header or the query string is mostly a question of developer experience.

I tend to favor query strings for a few reasons. Aside from the theological beliefs, a few of the reasons are concrete or footgun-avoidance :)

If your API is or can be called cross-domain, adding a custom header will require browsers to do a CORS preflight check where they might have been able to omit it before--making requests to your API slower!
Also, if you're cross-origin, you need to remember to add the header to your Access-Control-Allow-Headers as appropriate.
If you want your endpoint to be cached, you also need to remember to add the new header in the Vary headers in your response.
Query strings generally show up better by default in your logging and analytic tools.

Mert Yazıcıoğlu • May 16 '19

If your API is or can be called cross-domain, adding a custom header will require browsers to do a CORS preflight check where they might have been able to omit it before--making requests to your API slower!

Although that’s technically true, you’ll most likely trigger the CORS preflight check anyway due to the Content-Type as it’s quite unlikely you’ll be using one of the plain-text or form data types.

Conal Tuohy • May 14 '19

I don't really see the point of your criticism of the practice of having multiple URI parameters with the same name. You point out that some developers might add parameters to an overly simplistic data structure like a map, before encoding the map as URI parameters, and that they might lose data that way unless they used a more complex procedure to build the URI. That's certainly true, but it would be a silly mistake to make, and the alternative you suggest, of encoding multiple parameter values as a comma-separated string, is no less complex.

K • May 14 '19 • Edited

I just had a common JavaScript data structure in mind.

const filters = {
  limit: 25,
  afterDate: Date.now(),
  names: ["Kay", "Xing", "Derric"]
}

But you are probably right, doing filters.names.join(",") isn't much simpler than filters.names.map(n => "name=" + n).join("&").