Web API Design Part Five: Status and Error Handling

22 Apr

Episode 90

In the previous installment of the web API design series we looked into ways of implementing various operations on object collections: filtering, searching, sorting and pagination. We also tackled ways to parametrize HTTP requests in order to use those operations.

After an article focused on requests, let’s work on our responses. We will talk about HTTP codes, which of them are actually worth using and how to use them. Before delving into specific codes, let’s talk about how we should inform our API consumer about problems.

There is a dragon in the server room

“The dragon set the server on fire and we are having trouble processing your request, please try again later.” Sometimes bad things happen. One of the six qualities of good product design I wrote about in Tech and UX is credibility, and one aspect of credibility (besides not letting dragons into data centers) is the ability to deal with failure gracefully. If we are unable to process the request for whatever reason, whether the cause lies with us or with our client, we should try to provide as much feedback about the situation as possible. Aside from an appropriate HTTP code, it’s good to include a JSON object containing a few fields with details of the situation. This object should have the same structure across our entire API to simplify error handling on the client side.

The first piece of information would be some kind of error code – a member of an enumeration of all possible error situations that we can distinguish. It should be language agnostic and concise. One option here is to use arbitrary numbers or characters. The second option is to extend existing HTTP status codes by adding one or two digits to them. For example, error code 42201, an extension of HTTP status code 422 Unprocessable Entity, might indicate a problem with the arrival date not being after the departure date in a journey. The third option is to use something a bit more descriptive, for example: “PAYMENT_FRAUD”. This way error mappings on the client side can be more direct and the code might be more readable.
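As a rough sketch of how the second and third options can be combined, the codes can live in one enumeration whose numeric values extend the HTTP status. The class name `ApiError`, the member names and the value 40301 are made up for illustration; only 42201 comes from the example above.

```python
from enum import Enum


class ApiError(Enum):
    """Hypothetical error catalog: value = HTTP status followed by two digits."""

    ARRIVAL_BEFORE_DEPARTURE = 42201  # the journey example from the text
    PAYMENT_FRAUD = 40301  # assumed value, chosen only for illustration

    @property
    def http_status(self) -> int:
        # Dropping the last two digits recovers the HTTP status code.
        return self.value // 100
```

With this layout a client can branch on the descriptive name, while a generic handler can still fall back to `http_status` when it meets a code it does not know.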

The second piece of information would be the message. Whichever way of representing error codes we choose, we should make sure that we include a brief explanation of what happened, ideally both in the response and in the documentation, and ideally both generated from a single source of truth. Otherwise, there is a good chance they will start to drift apart. The message might be localized for our users’ convenience. Apparently, not everyone around the world who writes code also speaks English.

The third piece of information is a link to more details. After all, we don’t have to squeeze everything we have on the problem into JSON and send it down the wire. We can point our users to an appropriate section of the documentation or FAQ, where we can describe the problem in more detail and offer possible solutions and ways to avoid it or deal with it in the future.
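Put together, the three pieces could look roughly like the sketch below. The field names (`code`, `message`, `more_info`), the message text and the documentation URL are assumptions for illustration, not a standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ErrorBody:
    """Hypothetical error payload, identical in shape for every endpoint."""

    code: str       # member of our error enumeration
    message: str    # brief, possibly localized explanation
    more_info: str  # link to documentation or FAQ


def to_json(error: ErrorBody) -> str:
    # ensure_ascii=False keeps localized messages readable on the wire.
    return json.dumps(asdict(error), ensure_ascii=False)


example = ErrorBody(
    code="PAYMENT_FRAUD",
    message="The payment was rejected by fraud detection.",
    more_info="https://api.example.com/docs/errors#PAYMENT_FRAUD",  # placeholder URL
)
```

Calling `to_json(example)` yields a single JSON object with exactly these three fields, which is what lets the client side use one parser for every error our API returns.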

We should not take sharing details too far. I’ve encountered some APIs that returned stack traces in error responses. This is acceptable for internal systems, but not for the outside world, as it might expose security vulnerabilities and allow bad guys to hurt us. Don’t spill your guts, unless you are doing open source.
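One way to keep the guts inside is to log the stack trace on our side and hand the client only a generic body. The sketch below assumes a hypothetical `handle` wrapper and an `incident_id` field (my addition, not something the text prescribes) that lets support match a client report with the log entry.

```python
import logging
import traceback
import uuid

logger = logging.getLogger("api")


def handle(operation):
    """Run operation(); on failure return (500, body) without internal details."""
    try:
        return 200, operation()
    except Exception:
        incident_id = str(uuid.uuid4())
        # The full stack trace goes to our own logs only.
        logger.error("incident %s\n%s", incident_id, traceback.format_exc())
        return 500, {
            "code": "INTERNAL_ERROR",
            "message": "Something went wrong on our side.",
            "incident_id": incident_id,  # assumed field for correlating with logs
        }
```

The response body never contains the exception text, so even a chatty exception message stays out of the client’s sight.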

We have covered error details; now let’s have a look at particular HTTP codes for various situations.

Information

The 1xx class of codes is not used very often, but it’s good to know the first one.

100 Continue – instead of sending a request with a big body that might be rejected by the server, we can send only the headers first and include an Expect: 100-continue header. The server can choose to respond with the 100 Continue status to indicate that it’s ready to accept the body, or return some error code, without the body being transferred needlessly.
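The server side of that decision can be sketched as a small function. The name `early_status` and the `max_body_bytes` limit are hypothetical, and rejecting oversized bodies with 413 is just one example of “some error code”; 417 Expectation Failed is the standard answer to an expectation the server does not support.

```python
from typing import Mapping, Optional


def early_status(headers: Mapping[str, str], max_body_bytes: int) -> Optional[int]:
    """Decide what to answer before the client sends the body.

    Returns None when the request carries no Expect header at all.
    """
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    expect = lowered.get("expect")
    if expect is None:
        return None  # no expectation: read the body as usual
    if expect.strip().lower() != "100-continue":
        return 417  # Expectation Failed
    declared = int(lowered.get("content-length", "0"))
    if declared > max_body_bytes:
        return 413  # body too large: reject before it is transferred
    return 100  # Continue: the client may now send the body
```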

Success

The 2xx class of codes represents successful request processing.

200 OK – standard success status. Some APIs use this exclusively, even on errors.
201 Created – indicates that the request resulted in the creation of a resource.
202 Accepted – confirms that the request was successfully scheduled for asynchronous processing, which does not yield immediate results to return.
204 No Content – indicates success with no response body, for example after successful removal of a resource.
206 Partial Content – used in response to a request containing a Range header. The response must include a Content-Range header indicating which part of the payload is being transmitted.
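To make the 206 case concrete, here is a minimal sketch of turning a Range header into a chunk and its Content-Range value. The function name `partial_content` is made up, and the sketch deliberately handles only a single `bytes=first-last` or `bytes=first-` range; suffix ranges and multiple ranges are left out.

```python
import re
from typing import Optional, Tuple

_RANGE = re.compile(r"^bytes=(\d+)-(\d*)$")


def partial_content(range_header: str, payload: bytes) -> Optional[Tuple[bytes, str]]:
    """Return (chunk, Content-Range value), or None if the range can't be served."""
    match = _RANGE.match(range_header.strip())
    if match is None:
        return None
    total = len(payload)
    first = int(match.group(1))
    # An open-ended range ("bytes=7-") runs to the end of the payload.
    last = int(match.group(2)) if match.group(2) else total - 1
    last = min(last, total - 1)
    if first > last:
        return None
    # Range positions are inclusive, hence the +1 in the slice.
    return payload[first:last + 1], f"bytes {first}-{last}/{total}"
```

For a ten-byte payload, a request for `bytes=2-5` gets four bytes back together with `bytes 2-5/10`, which tells the client both where the chunk sits and how big the whole thing is.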

Redirection

The 3xx class of codes tells a client that additional action must be taken to obtain the result of the request. Responses should include a Location header with the URL that should be followed.

300 Multiple Choices – there is more than one possible response and the client should choose one. As there is no standard way of choosing the response, this code is rarely used, unlike 200, 400 or 500. With this status, the Location header is included only if the server has a preferred choice.
301 Moved Permanently – means that the URL is obsolete and a new one should be used instead.
302 Found – this status was commonly handled incorrectly in the early days of web browsers, so two unambiguous status codes were introduced later: 303 See Other and 307 Temporary Redirect. 302 is still part of the standard, but the more specific codes are the better choice.
303 See Other – means that the requested resource is available at a different URL and the GET method should be used to obtain it.
307 Temporary Redirect – means that the request should be repeated with the same method and body, but using the provided URL. It’s different from 301, as the provided URL is temporary and should not be taken for granted in future requests, and it’s different from 303, as the original HTTP method should be used instead of GET.
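The difference between 303 and 307 boils down to which method the client uses for the follow-up request. A minimal client-side sketch, with a hypothetical function name, might look like this; 301 and 302 are left out on purpose because clients have historically treated them inconsistently.

```python
def method_after_redirect(status: int, original_method: str) -> str:
    """Pick the HTTP method for the request that follows a redirect."""
    if status == 303:
        # See Other: fetch the result with GET (a HEAD request stays HEAD).
        return "HEAD" if original_method.upper() == "HEAD" else "GET"
    if status == 307:
        # Temporary Redirect: repeat the request with the same method and body.
        return original_method
    raise ValueError(f"redirect status {status} is not covered by this sketch")
```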

Client Error

The 4xx class of codes means that the client used the API incorrectly, either from a technical or a business point of view.

400 Bad Request – a generic status indicating that the server can’t make any sense of the request, and that it should not be repeated without modification. It can be used for low-level errors, such as a malformed JSON body in a POST.
401 Unauthorized – means that the server doesn’t know who the user is. According to standard security nomenclature, this should, in fact, be called “Unauthenticated” instead, but we are probably stuck with this one now.
403 Forbidden – the server knows the user’s identity, but the user has no privileges to perform the desired request. Again, this status name should, in fact, be “Unauthorized”, but well, it was already taken…
404 Not Found – probably the most famous HTTP status code. We don’t know whether the resource ever existed or if it will be available in the future.
405 Method Not Allowed – the resource is present, but the HTTP method used is not available. This should be used when the method is globally not allowed, for example, PUT on an immutable resource. It should not be used when the method is unavailable due to a lack of privileges of a particular user.
409 Conflict – the request contains data that is incoherent with the current state of the resource, for example when we want to delete an AWS S3 bucket that is not empty.
410 Gone – seems similar to 404 Not Found; however, in this case, it’s an indication that there was, in fact, such a resource, but it was removed intentionally and the condition is permanent.
422 Unprocessable Entity – a bit similar to 400 Bad Request but on a higher level of abstraction, meaning, for example, that the JSON body has all its braces and fields in the right place, but the arrival date is set before the departure date, which violates our business rules (unless we provide time travel services). It’s also different from 409 Conflict, as it doesn’t refer to the current state of the resource, and there might actually be no existing resource yet.
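The split between 400 and 422 can be sketched with the journey example. Everything named here (`validate_journey`, the `departure` and `arrival` fields, the error code strings) is hypothetical, and treating a missing field as 400 rather than 422 is a judgment call that different APIs make differently.

```python
import json
from datetime import date
from typing import Tuple


def validate_journey(raw_body: str) -> Tuple[int, str]:
    """Return (HTTP status, code) for a hypothetical 'create journey' request."""
    try:
        data = json.loads(raw_body)
        departure = date.fromisoformat(data["departure"])
        arrival = date.fromisoformat(data["arrival"])
    except (ValueError, KeyError, TypeError):
        # Low-level problem: broken JSON, missing field or unparsable date.
        return 400, "MALFORMED_REQUEST"
    if arrival < departure:
        # Well-formed request that violates a business rule.
        return 422, "ARRIVAL_BEFORE_DEPARTURE"
    return 201, "CREATED"
```

A body like `{"departure": "2018-04-22", "arrival": "2018-04-20"}` parses fine and still ends up as 422, while a body with a missing brace never gets past the first step and is answered with 400.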

Server Error

The 5xx class of codes is used when our system screwed something up and it’s not the client’s fault. Sometimes it’s difficult to clearly draw the line between 4xx and 5xx, especially in the case of business errors and third-party presence.

500 Internal Server Error – there was an unexpected event that prevented the server from doing its job. One can argue whether some situations qualify as a server or a client error – for example, we want to buy a book, but it turns out it’s sold out. Is it a client conflict or a server error?
501 Not Implemented – the HTTP method on the resource is not supported, but that is likely to change in the future, as opposed to 405 Method Not Allowed. It might be thought of as a client error, but it’s a nod towards the client that “hey, it’s not working yet, but we know you are waiting for it so come and check later, ok?”.
502 Bad Gateway – we got an invalid response when acting as a gateway, that is, when making a request to a third party in order to fulfill our client’s request. It mostly has value for internal APIs, to indicate that the problem lies essentially neither with the client nor with us, but someone else screwed up (an enterprise pattern of shit deflector). In the case of an external API, our client doesn’t really care whom we need to talk to in order to fulfill their request. The fraud detection service we use gives us a malformed response? Well, it’s our problem now from the client’s perspective, so we should deal with that appropriately.
503 Service Unavailable – means the service is temporarily unable to handle requests, for example because of planned maintenance, and the client should come back later. We should return this from the front edge of our system when bringing down our backends if we are unable to do a zero-downtime deployment.
504 Gateway Timeout – means that the server did not receive a response in time when acting as a gateway. Its usage has the same reasoning behind it as 502 Bad Gateway, depending on whether we are within our internal comfort zone or dealing with external users.
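The internal versus external reasoning behind 502 and 504 can be written down as a small mapping. The names below are hypothetical, and answering external clients with a plain 500 is only my reading of “deal with that appropriately”; other choices are defensible.

```python
from enum import Enum


class UpstreamFailure(Enum):
    """Hypothetical ways a third-party call can fail."""

    INVALID_RESPONSE = "invalid_response"
    TIMEOUT = "timeout"


def status_for_upstream_failure(failure: UpstreamFailure, internal_api: bool) -> int:
    """Choose the status to return when a dependency let us down."""
    if not internal_api:
        # Assumption: external clients don't care who failed, so we own the error.
        return 500
    if failure is UpstreamFailure.TIMEOUT:
        return 504  # Gateway Timeout
    return 502  # Bad Gateway
```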

How many do I need?

That’s not an easy question. It depends on the type of users we have and on our philosophy. There are 75 statuses in HTTP 1.1. It seems that most major APIs use between 5 and 20 of them. Facebook uses only 200 OK and then specifies what happened in detail, but I do not recommend that approach. There is no good reason not to use HTTP semantics to indicate what is happening in our API, as it is just easier to work with. Personally, I think that 10 to 15 is a reasonable number.

With this length-record-setting article, we made it to the end of the core part of the web API design series. In the next episodes, we will focus on the more supportive aspects of APIs that are not that visible from the business point of view, but are important nonetheless. Stay tuned!

Image sources:

Tamer Karatas

Posted by gvaireth on April 22, 2018 in API, Technology

Tags: API, HTTP, REST

How To Train Your Java