Our API design journey was long and adventurous, but we are getting to the end of it. This is the final installment of the series, and it will include all the little things I had difficulty classifying into any of the previous chapters, or that I had simply forgotten about earlier, as well as some closing thoughts. If you don’t know how to name a chapter or a presentation slide holding a random list of tips – you can always slip away with “miscellaneous”.
Image by Grosnez
It took a bit longer than usual, I know. For the last few months I had a big project called “let’s pass another cloud certification” and I decided to focus on that. Before we continue with our final piece of API design thoughts and ideas, let’s briefly recap what we did up to this point.
The Road So Far
Formally, it all started on Tuesday, 22 August 2017, at 10:57:40 AM, according to the Google Drive document creation timestamp. I created the first doc with notes while preparing a talk on web API design for the opening of a new office at my company. I repeated this talk several times later and meanwhile decided to reforge it into a series of articles. Originally there were supposed to be three of them, but as you can see, I’m not the best at estimations. The first article was published in December 2017. Let’s quickly recap all parts of our journey:
Part One: Tech and UX – Answers to three important questions – Why, What and How – and a look at API qualities from a user experience perspective. An API is a product, and a good product should be useful, usable, desirable, findable, accessible and credible.
Part Two: The Origins of REST – A dive into Roy Fielding’s Ph.D. thesis where the Representational State Transfer was born. Its architectural constraints are client-server, stateless, cache, uniform interface, layered system and code on demand.
Part Three: Core Concepts – The role of resources and the relations between them in REST, naming conventions and functions. How a particular HTTP method works on a single resource and on a collection.
Part Four: Collections – How to parametrize an HTTP request. Various ways of expressing operations on collections – sorting, searching, and pagination.
Part Five: Status and Error Handling – How to properly inform the user that something went awfully wrong. The meaning and usage of various HTTP status codes, and how many of them we actually need.
Part Six: Cache – They say there are two difficult things in computer science, and we already covered naming… Cache policies, ETags and the fine balance between speed, freshness, privacy and sensitivity.
Part Seven: Security – Protecting an API: authentication, tokens, JWT, OAuth, throttling and various tips on how not to get hacked.
Part Eight: HATEOAS – Hypermedia as the engine of application state. Pros and cons of using hyperlinks with API as well as a quick overview of existing formats.
Part Nine: Versioning – Dealing with API shapeshifting. Breaking changes, versioning schemas, URIs, parameters, headers, continuous versioning or none at all, all covered here.
Part Ten: Management – All the things around API that prevent us from descending into chaos. Lifecycle, productivity tools, documentation, support, traffic control, analytics, and monetization.
Image by Grosnez
Let’s get back to the main point of this episode. Here is a bunch of additional tips and ideas for your API that did not make it into any of the previous chapters.
When creating or updating resources, it’s good practice to return the resulting resource as confirmation. There may be several fields that are up to the server to fill in, like an ID or a creation timestamp, which can be useful to the client after receiving the response.
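A minimal sketch of this pattern (the handler, resource names and field names are hypothetical, not from any particular framework): the server fills in the fields only it can know and echoes the full resource back with the 201 response.

```python
import uuid
from datetime import datetime, timezone

def create_order(payload: dict) -> tuple[int, dict]:
    """Hypothetical POST /orders handler: fill in server-side
    fields and return the resulting resource as confirmation."""
    resource = dict(payload)
    resource["id"] = str(uuid.uuid4())                       # server-assigned ID
    resource["createdAt"] = datetime.now(timezone.utc).isoformat()
    return 201, resource                                     # 201 Created + full body

status, order = create_order({"item": "book", "quantity": 2})
```

The client gets back everything it sent, plus the ID and timestamp it could not have known, so it never needs a follow-up GET just to learn them.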
In the case of nested resources, it may sometimes take a number of calls to get all the required data. One solution is to let the client define which nested fields should be fetched eagerly, for instance with an “embed” query param like this: …?embed=customer. For more elaborate structures and multiple nesting levels, you might want to consider GraphQL.
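The embed idea can be sketched like this (the in-memory data, resource names and field names are made up for illustration): the server inlines a nested resource only when the client asks for it.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory data standing in for a database
CUSTOMERS = {"c1": {"id": "c1", "name": "Alice"}}
ORDERS = {"o1": {"id": "o1", "customerId": "c1", "total": 42}}

def get_order(url: str) -> dict:
    """Return an order; with ?embed=customer, inline the nested resource."""
    query = parse_qs(urlparse(url).query)
    # embed may hold a comma-separated list, e.g. ?embed=customer,items
    embeds = set(query.get("embed", [""])[0].split(","))
    order = dict(ORDERS["o1"])
    if "customer" in embeds:
        order["customer"] = CUSTOMERS[order["customerId"]]
    return order
```

Without the param the client gets just the customerId reference and would need a second call; with it, one round trip is enough.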
Remember when we talked about the product being accessible to people with disabilities in the first chapter? If your disability is sitting behind a proxy that only passes through GET requests, and you want to use another method, you might want to use the X-HTTP-Method-Override header. The API may interpret the content of this header as the actual method and treat the request as such, even though it’s a simple GET. What if we can’t add the header for some reason? There is also the option of specifying the desired HTTP method as a query param, like: …?method=put.
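Server-side, resolving the override can be a tiny helper like this (a sketch under the assumptions above: the header takes precedence, the ?method= query param is the fallback):

```python
def effective_method(method: str, headers: dict, query: dict) -> str:
    """Resolve the HTTP method the client actually intended.
    Hypothetical helper: header wins, ?method=... is the fallback."""
    override = headers.get("X-HTTP-Method-Override") or query.get("method")
    return override.upper() if override else method
```

So a tunneled GET with X-HTTP-Method-Override: PUT is routed as a PUT, and a plain request keeps its original method.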
There is a standard header for indicating that the payload is compressed, and how: Content-Encoding. Possible values include: gzip – LZ77 encoding, as used by the classic gzip program; compress – the LZW algorithm; deflate – deflate compression inside the zlib structure; br – the Brotli algorithm; and identity – no compression at all.
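For example, a gzip-compressed JSON response pairs the compressed body with a matching Content-Encoding header (a self-contained sketch, not tied to any framework):

```python
import gzip

body = b'{"status": "ok"}' * 100          # a repetitive JSON payload compresses well
compressed = gzip.compress(body)

# Headers describing the compressed response body
headers = {
    "Content-Type": "application/json",
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed)),   # length of the compressed bytes
}
```

Note that Content-Length describes the bytes on the wire, i.e. the compressed size, not the original payload size.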
Please, please don’t invent your own time and date formats. Use ISO 8601 instead. A UNIX timestamp is not the best idea either – it’s neither human-readable nor explicit about the time zone.
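In most languages ISO 8601 costs nothing; in Python, for instance, producing and parsing it is built in:

```python
from datetime import datetime, timezone

moment = datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc)
stamp = moment.isoformat()                 # ISO 8601, time zone included

# Clients can parse it back without any custom format strings
parsed = datetime.fromisoformat(stamp)
```

The round trip is lossless, and the string sorts chronologically as a bonus.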
When creating IDs for your resources, don’t just take the previous largest numeric ID and increment it by one. This can be a security vulnerability, as it allows clients to guess resource IDs, and a performance bottleneck in distributed systems. For publicly available resources it lets a malicious client fetch everything simply by enumerating IDs. So not only do you hand over all your data on a plate, you also send out an invitation to a DDoS attack. To prevent that, use UIDs that are long enough to be impossible to guess. If you have existing numeric IDs in your system, you can always map them to the UIDs used in the API.
Image by Grosnez
A request ID is also called a correlation ID or a support ID. In microservice systems, an external API request usually fans out into a number of requests between microservices that work together to fulfill the original one. When troubleshooting, it’s essential to correlate those requests as parts of a single flow, and one way to do that is to add and propagate a custom header carrying the original request ID. This ID can also be included in the final response and used to speed up support conversations with the customer if there are any issues.
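The propagation rule is simple enough to sketch (the header name X-Request-Id is a common convention, not a standard; the helper names are made up): reuse the caller’s ID if present, mint one at the edge otherwise, and carry the same ID into every downstream call.

```python
import uuid

REQUEST_ID_HEADER = "X-Request-Id"   # a convention; pick and document your own

def accept_request(headers: dict) -> dict:
    """At the system edge: reuse the caller's request ID or mint a fresh one."""
    rid = headers.get(REQUEST_ID_HEADER) or str(uuid.uuid4())
    return {**headers, REQUEST_ID_HEADER: rid}

def downstream_headers(headers: dict) -> dict:
    """Every internal call carries the same ID, so logs can be correlated."""
    return {REQUEST_ID_HEADER: headers[REQUEST_ID_HEADER]}
```

Log the ID in every service it passes through, and a single grep reconstructs the whole flow.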
Machine vs Browser
Some API vendors check the User-Agent header and return HTML instead of JSON if the client is a browser, assuming there is a human behind it who would prefer something human-readable. This is a bad idea – content negotiation should be handled by the Accept header. As a developer, you sometimes use the browser to check the response from an endpoint and you still want JSON there; as a non-developer, you are probably not looking at API endpoint output at all.
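Proper negotiation can be as small as this sketch (a simplified reading of Accept that ignores quality values, which a production implementation should honor):

```python
def negotiate(headers: dict) -> str:
    """Pick the response format from Accept – never from User-Agent."""
    accept = headers.get("Accept", "*/*")
    if "text/html" in accept and "application/json" not in accept:
        return "text/html"
    return "application/json"    # sensible default for an API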
When in trouble over how to name a link in a response, refer to the IANA Link Relations Registry to find a standard, suitable name instead of inventing your own.
Health and Info
This one is more geared towards internal APIs. Health endpoints are commonly used to indicate whether an application is capable of serving requests. Container orchestration might use a single endpoint for that, or separate ones for liveness (whether the application is fine or should be restarted) and readiness (whether the application is healthy but not yet ready to serve requests, for instance because it is still initializing). Aside from information driving automatic actions, we might want an endpoint with information useful for developers: the Git commit from which the application was deployed, the values of relevant config variables (nowadays there are dozens of ways to override those, and one can get confused at some point), system status, or any other information that might help when troubleshooting.
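The liveness/readiness split can be sketched as two tiny handlers (endpoint names, status strings and the READY flag are all hypothetical): liveness always answers 200 while the process is up, readiness returns 503 until initialization finishes.

```python
import json

READY = {"value": False}   # flipped to True once initialization finishes

def health():
    """Liveness: the process is up and should not be restarted."""
    return 200, json.dumps({"status": "UP"})

def readiness():
    """Readiness: 503 keeps traffic away until the app can serve it."""
    if not READY["value"]:
        return 503, json.dumps({"status": "INITIALIZING"})
    return 200, json.dumps({"status": "READY"})
```

An orchestrator restarts the container when liveness fails, but merely withholds traffic while readiness fails, which is exactly the distinction described above.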
External API Trap
Trap or opportunity, however you want to call it. Adding an external B2B API to a B2C-oriented legacy system (or even a B2B one) seems quite easy at first glance. We will just see what we have internally, put it behind some façade, add security and a few other things, and voilà! We have an external API. The reality, however, is often very different.

Aside from all the aspects of API design we discussed in this series, it is often not that easy to put together a bunch of internal APIs, as each of them is a bit different at the technical, and sometimes business, level. People who have used the various APIs internally for years have this figured out, but for the client we often have to clean up a lot of the mess, unify naming, and add or restrict features of various APIs to form one coherent product. Second, we carelessly share a lot of things internally that we might want to restrict for outside users for security reasons. Third, and perhaps most important – B2C systems may lack flows specific to B2B systems, like paying by invoice instead of by credit card. Adding an external API in that case might mean changes much deeper than we initially assumed, often including a lot of small changes in many places, like adding a “partner-id” header or something similar. This can be a lot of work, but on the other hand it can be an interesting opportunity to gain unique insight into the system from end to end, instead of sitting in one area forever.
We have reached the end of our web API design adventure. A series that was initially planned to be three or maybe four articles turned out to be eleven, which proves once again that software developers are awful at estimating things. It could have been even longer, as with most of the topics we could continue to go deeper or branch out, but you have to stop somewhere. I hope you enjoyed reading as much as I enjoyed writing, and found some useful information that will help you design better APIs.
Image by Grosnez
We are three episodes short of a hundred now, so let’s continue. I’m not sure yet what’s coming next – probably something cloud- or architecture-related, as I’ve spent a lot of time with Google Cloud Platform recently. Or maybe something else entirely, we will see. Stay tuned for the next episode!