Two months ago, we started with the motivations behind web APIs and looked at their design from a UX point of view. The important conclusion was that an API and its ecosystem are to developers what a GUI is to regular users of web applications. A month ago, we took a scientific look at the properties of REST, the architectural style of modern web systems, through the lens of Roy Fielding’s famous Ph.D. dissertation.
With those foundations in place, today we are going to get our hands dirty and talk about how to actually get the work done: resources and representations, naming, relations, HTTP methods, collections, functions and sanity checks.
A REST web API is built around exposing representations of the resources that make up our system. The distinction is important: a resource is some piece of data stored in our system or accessed ad hoc from somewhere else. It might be a record in a relational database, a document in a NoSQL store, a file on disk, or a stone tablet with hieroglyphs stored in an ancient tomb. A representation is how we send that data to whoever requested it, most likely as JSON, HTML, plain text or maybe a piece of paper carried by a homing pigeon. Each resource is identified by a unique URI, and the name in it should be a plural noun. That way we can easily represent both a single object and a list of objects. We will get into detail about that later.
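To make the resource/representation split concrete, here is a minimal sketch in Python. The building record, its fields and the represent function are made up for illustration; the point is only that one resource can be rendered in more than one way.

```python
import json

# The resource: one record as it might sit in storage (hypothetical fields).
building = {"id": 123, "name": "Headquarters", "floors": 4}


def represent(resource: dict, media_type: str) -> str:
    """Render the same resource as one of several representations."""
    if media_type == "application/json":
        return json.dumps(resource, sort_keys=True)
    if media_type == "text/plain":
        # One "key: value" pair per line, keys in alphabetical order.
        return "\n".join(f"{key}: {value}" for key, value in sorted(resource.items()))
    raise ValueError(f"unsupported media type: {media_type}")


json_body = represent(building, "application/json")
text_body = represent(building, "text/plain")
```

Both bodies describe the same building; only the form in which it travels to the client differs.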
We should avoid exposing implementation details in the URL (like including the words “Apache” or “Spring” in it). This keeps the API surface clean and technology-agnostic (even if we are just talking about names) and shrinks the attack surface, since an attacker could take advantage of knowledge about the system internals.
Camels vs Snakes vs Kebabs
My personal preference is for camelCase (perhaps I’m biased) and I will use it throughout this series. It is worth considering what our audience prefers, to whatever extent that can be measured, but the important takeaway when choosing a style is consistency. Choose one and stick with it.
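For a quick feel of how the same name looks in each style, here is a small sketch in Python. The helper names are mine, and the conversion is deliberately naive: it assumes plain camelCase input without acronyms or digits.

```python
import re


def to_snake_case(camel: str) -> str:
    """groupMembership -> group_membership (naive: no acronym handling)."""
    # Insert "_" before every uppercase letter except at the very start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", camel).lower()


def to_kebab_case(camel: str) -> str:
    """groupMembership -> group-membership."""
    return to_snake_case(camel).replace("_", "-")


name = "groupMembership"
styles = {
    "camel": name,
    "snake": to_snake_case(name),
    "kebab": to_kebab_case(name),
}
```

Whichever of the three we pick, a check like this could sit in a test suite to keep every field and path segment in the same style.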
Between resources, there are relations, and with them comes a design decision: how should they be represented in URLs? In general, if all members of the relation can exist independently, the relation should be a separate resource that can be created, deleted or modified. For example, groupMembership/123 would identify one membership in a many-to-many relation between groups and people. For a one-to-one relation (think of two people and marriage, in the classical Western meaning at least) it might be tempting to define it simply as hyperlinks in the respective objects. However, it still makes sense to have a separate resource to represent it, as we can then manipulate the relation independently (like, you know, get a divorce…).
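A rough sketch of what "the relation is its own resource" could mean in practice, using the group membership example. Everything here (class names, fields, the in-memory store) is an assumption made for illustration, not a prescribed design.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class GroupMembership:
    """One group-person link with its own identity (hypothetical fields)."""
    id: int
    group_id: int
    person_id: int


class MembershipStore:
    """In-memory stand-in for whatever backs the membership resource."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._items: dict[int, GroupMembership] = {}

    def create(self, group_id: int, person_id: int) -> GroupMembership:
        # Roughly what creating a new membership resource would do.
        membership = GroupMembership(next(self._ids), group_id, person_id)
        self._items[membership.id] = membership
        return membership

    def delete(self, membership_id: int) -> bool:
        # Removes only the link; the group and the person are untouched.
        return self._items.pop(membership_id, None) is not None

    def groups_of(self, person_id: int) -> list[int]:
        return sorted(m.group_id for m in self._items.values()
                      if m.person_id == person_id)
```

Because the membership has an id of its own, ending it is a plain delete of that one resource, and neither the group nor the person has to be modified.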
Another case is a relation where one object can’t exist independently. Think of a room inside a building. (Well, in theory, we could argue that the building can collapse and the room can still exist as part of a historical address of a company, but you get the gist.) In that case, we don’t usually need a separate resource for the relation, as it doesn’t really make sense to manipulate it; we just express that one resource belongs to another via the URL of the dependent one. In the case of buildings and rooms it might look like this: /buildings/123/rooms/456 (notice the plural forms).
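Here is a minimal sketch of how such a nested URL might be taken apart and resolved. The in-memory data and both function names are invented for the example; a real framework router would normally do the parsing for us.

```python
def parse_nested_path(path: str) -> list[tuple[str, str]]:
    """Split a path into (collection, id) pairs, parent first."""
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) % 2 != 0:
        raise ValueError("expected alternating collection/id segments")
    return list(zip(segments[0::2], segments[1::2]))


# Hypothetical data: rooms live inside their building and nowhere else.
buildings = {"123": {"rooms": {"456": {"name": "Conference room"}}}}


def find_room(path: str):
    """Resolve /buildings/{id}/rooms/{id}; return None if anything is missing."""
    (parent, building_id), (child, room_id) = parse_nested_path(path)
    if (parent, child) != ("buildings", "rooms"):
        raise ValueError(f"unexpected path: {path}")
    building = buildings.get(building_id)
    if building is None:
        return None
    return building["rooms"].get(room_id)


room = find_room("/buildings/123/rooms/456")
```

The room is only reachable through its building, which is exactly the dependency the URL is meant to express.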
Let’s examine what should happen when we use a particular HTTP method on a collection of objects or on a single object.
GET – on a collection retrieves the entire collection, which is usually a bad idea unless the collection is small and of constant size. In any other case, we should provide means of limiting, filtering, sorting and paging, which we will get to later. GET on a single object retrieves it, no gimmicks here.
POST – on a collection creates a new object in a scenario where we don’t know the object id, and thus its URL, in advance. On a single object, it’s a bit tricky. Some time ago, before PATCH was introduced and gained popularity, it was a way to partially update an object, and many sources still suggest it. Nowadays there is no reason to do it that way, so it should result in an error.
PUT – on a collection appends a collection of objects to the existing collection. One might think that consistency dictates that it should replace the entire collection instead, but that doesn’t make much sense in practice. PUT on a single object replaces that object if it already exists. If the object does not exist, there are two options: it either creates a resource with the given id or results in an error. The first option makes sense if we exercise control over the client that supplies the id. It’s generally not the best idea to let strangers decide on object ids, even if the id is only a key in the URL and not in the database or other means of storage, as it might be error-prone and affect security and possibly performance.
DELETE – on a collection, well, destroys the entire collection, which is often not the best idea. We should carefully verify that the client is authorized, or perhaps forbid that option entirely, especially if we are talking about a publicly available external API. DELETE on a single object is simpler, as it has less destructive potential.
PATCH – this method was added in 2010 in RFC 5789. It is intended to update an object partially, so it’s not really suitable for collections. It’s tricky to use, as it’s not just about sending a part of a JSON object: the semantics depend on the content type. application/json-patch+json is described in RFC 6902, while application/merge-patch+json is described in RFC 7396. The details are a bit out of the scope of this article, but I recommend reading this one if you want to know more without delving into the RFCs.
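The rules above can be sketched as a small in-memory dispatcher. This is only one possible reading of them: the Collection class, the allow_client_ids switch and the concrete status codes (201, 204, 404, 405) are my assumptions, collection-level DELETE is simply refused, and PATCH is implemented as a merge patch in the spirit of RFC 7396 (JSON null arrives as None and removes a key).

```python
from itertools import count


def merge_patch(target, patch):
    """Merge-patch semantics: dicts merge recursively, None deletes a key."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


class Collection:
    """In-memory stand-in for one collection such as /buildings."""

    def __init__(self, allow_client_ids=False):
        self.items = {}                      # id -> object
        self._ids = count(1)
        self.allow_client_ids = allow_client_ids

    def _create(self, obj):
        new_id = str(next(self._ids))
        while new_id in self.items:          # skip ids a client already used
            new_id = str(next(self._ids))
        self.items[new_id] = dict(obj)
        return new_id

    def handle(self, method, item_id=None, body=None):
        """Return (status, payload); item_id=None means the whole collection."""
        if item_id is None:
            if method == "GET":
                return 200, list(self.items.values())
            if method == "POST":             # server picks the id
                return 201, self._create(body)
            if method == "PUT":              # append a batch of objects
                return 200, [self._create(obj) for obj in body]
            return 405, None                 # DELETE and PATCH refused here

        exists = item_id in self.items
        if method == "POST":                 # no POST on a single object
            return 405, None
        if method == "PUT":
            if not exists and not self.allow_client_ids:
                return 404, None
            self.items[item_id] = dict(body)
            return (200 if exists else 201), self.items[item_id]
        if not exists:
            return 404, None
        if method == "GET":
            return 200, self.items[item_id]
        if method == "DELETE":
            del self.items[item_id]
            return 204, None
        if method == "PATCH":
            self.items[item_id] = merge_patch(self.items[item_id], body)
            return 200, self.items[item_id]
        return 405, None
```

For example, after a POST creates object "1" with a name and a floor count, a PATCH with {"floors": 3, "name": None} leaves just {"floors": 3}. Flipping allow_client_ids to True corresponds to the "we control the client" case for PUT on an unknown id.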
Functions and the Dogma
What if we want to express a function in a REST API, that is, use a verb instead of a noun? There are two ways to do it. The first is to mask the function as a resource. Often the result of a function is the creation of some resource. For example, canceling a train ticket might sound like an action, which does not fit the REST idea very well. However, we can think of it as creating a cancellation object that has its own identity and is connected with the ticket that was to be canceled. This is especially useful if we would like to store some additional information, like perhaps a reason for the cancellation or a timestamp. The other way is… to bend the rules and just create the damn function. REST is not a religion but an architectural style. When architecting stuff, you can’t always fit everything perfectly, and a sanity check has to be performed in some cases. If we are dealing with exceptions and special cases, let’s make sure to document them properly so it’s clear what’s going on. This is a general rule: do stuff that makes sense, and follow the rules, but not blindly.
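The first approach, masking the function as a resource, might look roughly like this. The Cancellation class, its fields and the create_cancellation helper are all hypothetical; the idea is just that "cancel the ticket" becomes "create a cancellation that points at the ticket".

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cancellation:
    """A cancellation with its own identity, linked to one ticket."""
    id: int
    ticket_id: int
    reason: str
    created_at: datetime


# In-memory stand-in for the cancellations collection.
cancellations: dict[int, Cancellation] = {}


def create_cancellation(ticket_id: int, reason: str) -> Cancellation:
    """Roughly what a POST on a cancellations collection could do."""
    cancellation = Cancellation(
        id=len(cancellations) + 1,
        ticket_id=ticket_id,
        reason=reason,
        created_at=datetime.now(timezone.utc),
    )
    cancellations[cancellation.id] = cancellation
    return cancellation


cancellation = create_cancellation(ticket_id=42, reason="changed plans")
```

The reason and the timestamp now have a natural home, and the cancellation can later be retrieved like any other resource.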
Huh, that was a long one and there is still so much to dive into. While writing this series I’m looking at the slides for my presentation on the topic, and I’m currently on the 17th out of 42 in total. To be specific, this article covered 5 slides, the previous one covered about 2 and the first one somewhere near 4, excluding titles, introductions, agendas and yadda yadda yadda. If you want to actually hear me talk, you are welcome to attend the Wrocław Java User Group meetup on February 20th. I will also be speaking at the Boiling Frogs conference on March 24th, so why not grab a beer there. Meanwhile, in the next episode we will continue with the art of web API design and talk about handling collections and perhaps errors and statuses; we will see how it goes. Stay tuned!