Web API Design Part Six: Cache

20 May

Episode 91

In the previous episode of this series, we talked about communicating response status through HTTP codes and error objects with several fields describing what the hell happened and how to deal with it. It was the last part of the core business aspects of web APIs, those that are the most visible to our clients and connected with the product domain. Today we start the second big part of our journey, supporting aspects: things that are more generic, technical, a bit in the background and not always noticeable from a business perspective, but nonetheless important. The first topic here is caching.


As we might remember from the episode about the origins of REST, cacheability is one of the six fundamental constraints of REST. The idea is that every response from the server must state, explicitly or implicitly, whether it can be saved and reused, for how long and on what terms.


Why do we do this? For speed and resilience mainly. Having proper caching can save our servers and clients tons of work in terms of both bandwidth and computing time. It might also save our ass in case of system failure – content can still be served from cache. There are two basic types of cache – forward and reverse.

A forward cache is located on the client side, which might mean the originating computer itself or the client organization’s network. If the response can be retrieved from the cache on the client’s computer, we might get rid of any network activity altogether. If there is too much content to store on the client machine, a shared cache within the organization can be employed.

A reverse cache is located on the server side. If the response can be retrieved from it, we save our backend resources like application servers and databases, and we can deliver the response faster, especially when using a dense content delivery network, which might sit much closer geographically to the client than our origin servers.


In order to control caching, we need to add a Cache-Control header to our response. It can contain several comma-separated directives at once. Let’s have a look at those controlling the general caching policy.

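Private – the response can be stored only in the client’s own (private) cache, such as a browser cache, and must not be stored by shared caches. Note that it is not implied when Cache-Control is missing: without the header, caches may fall back to heuristic rules, so it is safer to state the policy explicitly.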

Public – the response can be stored by intermediate (shared) caches, which usually means it contains data common to many clients, like the company logo on a website.

No-Cache – it doesn’t mean we can’t benefit from caching. The response may be stored, but the cache has to validate it with the origin server every time before reusing it, which guarantees freshness.

No-store – this, in turn, means that nothing can be cached. It should be used for private and sensitive data, like medical or finance stuff that should not be persisted even on the client’s machine.

Only-if-cached – the client wants a response only if the cache already holds one. The cache answers from its store without bothering the origin server, or returns 504 Gateway Timeout if it has nothing suitable. Request-only directive.

No-transform – some proxies might re-encode images or other media in order to reduce their size (and possibly quality). This directive prevents it.
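To make the storage-related directives above more tangible, here is a minimal sketch in Python of how a cache might read a Cache-Control value and decide whether it is allowed to store the response. The function names parse_cache_control and may_store are made up for this illustration, and the decision deliberately looks only at no-store and private; a real cache checks much more (status code, request method, Authorization and so on).

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}.

    Simplification: splits on every comma, so quoted arguments that
    themselves contain commas are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def may_store(header: str, shared_cache: bool) -> bool:
    """Hypothetical storage decision based only on no-store and private."""
    directives = parse_cache_control(header)
    if "no-store" in directives:
        return False  # nothing may be persisted anywhere
    if shared_cache and "private" in directives:
        return False  # only the client's own cache may keep it
    return True  # note: no-cache still allows storing, it only forces validation
```

With this sketch, may_store("private, max-age=60", shared_cache=True) is False, while the same header is accepted by a private cache.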



When not using the No-Cache or No-Store directives, clients and servers can control for how long a cached response is considered fresh and when it’s time to obtain a new one.

Max-Age – defines how long the response can be reused from the cache without contacting the origin server, in seconds.

S-Maxage (spelled s-maxage in the header) – same as above, but relevant only to shared caches, where it overrides Max-Age, and ignored by private ones.

Max-Stale – request-only directive indicating that the client is willing to accept a cached response that is past its expiry time. An optional number that follows limits by how many seconds it may be stale.

Min-Fresh – request-only directive saying that the response has to remain fresh for at least the given number of seconds to be taken from the cache.

Expires header – a separate header that says when the response is no longer valid. If both Max-Age and Expires are set, then Max-Age overrides Expires.
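The interplay of these expiration settings can be sketched as a small calculation. The code below is an illustration under simplifying assumptions, not a complete implementation: directives is a plain dictionary of already parsed Cache-Control directives, seconds_until_expires stands for the Expires header already converted to a number of seconds, and both function names are hypothetical. Directives such as must-revalidate are not modelled here.

```python
from typing import Mapping, Optional


def freshness_lifetime(
    directives: Mapping[str, str],
    seconds_until_expires: Optional[int],
    shared_cache: bool,
) -> int:
    """How many seconds a stored response counts as fresh."""
    if shared_cache and "s-maxage" in directives:
        return int(directives["s-maxage"])  # shared caches prefer s-maxage
    if "max-age" in directives:
        return int(directives["max-age"])  # max-age overrides Expires
    if seconds_until_expires is not None:
        return max(0, seconds_until_expires)
    return 0  # assumption: no heuristic freshness in this sketch


def can_reuse(
    age: int,
    lifetime: int,
    max_stale: Optional[float] = None,
    min_fresh: int = 0,
) -> bool:
    """Apply the request directives max-stale and min-fresh.

    max_stale=None means the client sent no max-stale at all;
    pass float("inf") for a max-stale without a number.
    """
    remaining = lifetime - age
    if remaining > 0:
        # Still fresh: it must stay fresh for at least min_fresh seconds.
        return remaining >= min_fresh
    # Stale: acceptable only within the tolerance the client declared.
    return max_stale is not None and -remaining <= max_stale
```

For example, a response that is 70 seconds old with a lifetime of 60 seconds is rejected by default, but accepted when the client sends max-stale=15.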


When we operate under the No-Cache directive, or when a stored response has expired, we have to validate whether the given resource has changed since the last time we accessed it. There are two mechanisms for that, namely content-based (ETag) and time-based (Last-Modified).

ETag header – also known as Entity Tag, it is included in the server response and acts as a fingerprint of the current version of the resource. The client can save it and send its value in the If-None-Match header of a subsequent GET request. If the resource has not changed, the server should return HTTP status code 304 Not Modified, which means that the client can use its cached version. If the resource has changed, the server returns the new version together with a new ETag. When modifying a resource with PUT, we can send our saved ETag in the If-Match header. If the resource is unchanged, we are good to go. Otherwise, we will get HTTP 412 Precondition Failed and we should obtain the current version of the resource in order to avoid a conflict. ETags can be strong (binary) or weak (semantic, marked with the W/ prefix). The latter means that although the resource is not the same byte for byte, it can be considered as such for the client’s purposes.

Last-Modified header – a time-based alternative to ETag. Instead of a resource fingerprint, we rely on the last modification date from the response and send it in the If-Modified-Since header in subsequent requests. For PUT we can analogously use If-Unmodified-Since. If the resource has not been modified since that date, the GET is answered with 304 Not Modified; if it has been modified, the PUT is rejected with 412 Precondition Failed. When a request carries both kinds of validators, ETag takes priority over Last-Modified.

Must-revalidate – a Cache-Control directive saying that once the response has expired, a cache must not reuse it without successfully validating it with the origin server first.

Proxy-revalidate – same as above, but relevant to shared caches and ignored by private ones.
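The validation flow with ETag and Last-Modified can be sketched on the server side roughly as follows. This is a simplified illustration, not a reference implementation: request headers are assumed to arrive as a plain dictionary with exactly these header names, last_modified is assumed to be a timezone-aware datetime, and all function names are invented for the example.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Mapping


def make_etag(body: bytes) -> str:
    """Strong ETag derived from the exact bytes of the representation."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def http_date(moment: datetime) -> str:
    """Format an aware datetime for the Last-Modified header."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def _opaque(tag: str) -> str:
    """Drop the weak marker so that W/"x" and "x" compare as equal."""
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get_status(
    headers: Mapping[str, str], etag: str, last_modified: datetime
) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = headers.get("If-None-Match")
    if if_none_match is not None:
        # ETag takes priority: If-Modified-Since is ignored in this case.
        candidates = {_opaque(t.strip()) for t in if_none_match.split(",")}
        matched = "*" in candidates or _opaque(etag) in candidates
        return 304 if matched else 200
    if_modified_since = headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparsable date: treat the request as unconditional
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        if last_modified.replace(microsecond=0) <= since:
            return 304
    return 200


def put_precondition_ok(headers: Mapping[str, str], current_etag: str) -> bool:
    """False means the server should answer 412 Precondition Failed."""
    if_match = headers.get("If-Match")
    if if_match is None:
        return True  # unconditional PUT
    candidates = [t.strip() for t in if_match.split(",")]
    if "*" in candidates:
        return True  # assumption: the resource exists
    # If-Match needs a strong comparison, so weak tags never match.
    return not current_etag.startswith("W/") and current_etag in candidates
```

A client that stored the ETag from an earlier response sends it back in If-None-Match and gets 304 as long as the fingerprint is unchanged; the same ETag sent in If-Match protects a PUT against overwriting somebody else's change.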



Having the necessary tools at our disposal, let’s look at the tradeoffs in various situations when the cache is at play.

Fast vs Fresh. The longer we allow responses to be cached, the more resources we save, but the greater the risk of serving outdated data. Determine whether it is important for a given resource to be up to date and how often it is going to change in practice.

Public vs Private. If data is specific to one user, it should be private; if it’s common it can be public.

Sensitive vs Transparent. If data is confidential, it should not be kept even on client’s machine (or especially on client’s machine in many cases…) in order to reduce security risk.

Invalidation. What if a resource changes while the caching policy does not require it to be validated, because we made a mistake, and we are now in a hurry to force clients to use the new version? We can invalidate the reverse caches we control, but we don’t really have a way to force clients to do that on their side. Aside from waiting, we can change the resource URL, effectively creating a new resource, and include the new URL in responses whenever we point to it.
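One common way to change the URL whenever the content changes is to put a fingerprint of the content into it. Below is a minimal sketch, assuming a query parameter named v and a hypothetical helper versioned_url; putting the hash into the path or the file name works just as well.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, content: bytes) -> str:
    """Return the URL with a short content hash in the `v` query parameter.

    A different content gives a different URL, so caches treat it as a
    brand new resource and the old entry is simply never asked for again.
    """
    version = hashlib.sha256(content).hexdigest()[:12]
    parts = urlsplit(url)
    # Keep other query parameters, replace any previous version marker.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "v"]
    query.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Since the versioned URL never points to changed content, such resources can be served with a very long Max-Age.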

Lots of stuff. If you feel confused, don’t worry. There are two difficult things in computer science – naming things, caches, and off-by-one errors. A caching strategy is worth taking care of when designing a web API, as it might make a tremendous difference in performance. Performance in web systems translates directly into the money we can save on our production environment, and the more traffic we handle, the bigger the difference.

In the next episode, we will continue with other aspects of good web API design. If you prefer listening to reading – I will be speaking on the subject at the Devoxx PL conference at the end of June, feel free to contact me if you want to grab a beer there.


Posted on May 20, 2018 in API, Technology