Episode 87
What is REST, or what does it mean to be RESTful? Judging by various API implementations and their documentation, it seems to mean something a bit different to everyone. When examining different materials on the art of web API design, one eventually stumbles upon one particular name: Roy Fielding, a computer scientist who was a member of the team behind the HTTP 1.1 and URI specifications. During this undertaking he created a set of principles around the HTTP object model that culminated in his PhD dissertation “Architectural Styles and the Design of Network-based Software Architectures”, published in 2000. Probably not too many people in the software development industry read PhD dissertations, but I’ve decided to do just that and extract the essence in this article.
The first part of the thesis introduces various preliminary notions relevant to the subsequent parts. We learn the definition of software architecture and its elements, including components, connectors and data. Then we proceed to configurations, properties, styles, patterns and views.
Architecture Properties and Styles
After dealing with the basics, we can advance to more specific architectures. Fielding focuses on network-based architectures and introduces a number of properties that can be used to evaluate a particular design, namely:
Performance – exchanging information between system components comes at a cost. Network performance consists of attributes like throughput, overhead, bandwidth and usable bandwidth. User-perceived performance is about latency and completion time.
Scalability – the ability to support large numbers of components, or interactions among them, within an active configuration.
Simplicity – includes complexity, understandability and verifiability.
Modifiability – a software system changes a lot, so there is a lot of stuff in this section. It includes evolvability, extensibility, customizability, configurability and reusability.
Visibility – ability of a component to monitor and mediate communication between two other components.
Portability – assesses whether a system can run in different environments.
Reliability – how easy it is to bring our system down, that is, how susceptible the system as a whole is to failure when some of its parts fail.
In the next chapter, a number of architectural styles are introduced and assessed using the above criteria. Styles are grouped into categories: Data-flow, Replication, Hierarchical, Mobile code and Peer-to-peer.
REST constraints
Showtime. The answer to the question “what is REST?” is: it’s an architectural style, or a coordinated set of architectural constraints, that restricts the roles and features of architectural elements and the allowed relationships among those elements. It’s not a specification, a standard or a technology; it’s a style. Let’s examine the constraints, drawing from the architectural styles listed earlier.
Client-Server
A pretty simple and natural constraint. Client and server communicate over the network and are decoupled, so they can evolve independently. The original intent was separation of concerns within a system: the client was presented as the user interface, while the server dealt with data storage. Keep in mind, however, that the client doesn’t have to be a front-end component or a browser; it can just as well be another back-end application, which can itself simultaneously act as a server to other clients.
Stateless
Each request contains all the information required to fulfill it, and no session state is stored on the server side; whatever state the conversation needs is kept by the client. It improves visibility, because we can tell what happened within the system based on a single entry from monitoring. It’s easier to recover from failures, because we don’t rely on state that might go down with a particular server instance. Scalability is easier, as the servers are independent. Server performance is improved in the sense that resources can be freed up immediately upon request completion. However, network performance suffers, as each request has to carry all the necessary information.
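To make this more tangible, here is a minimal sketch of a stateless handler for a paginated listing. The class name, the token parameter and the in-memory item list are all assumptions made up for illustration; the point is only that the position in the listing travels with every request instead of being remembered by the server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stateless handler: nothing is remembered between calls,
// so any server instance could answer any request.
final class StatelessCatalog {
    // Stand-in for a data store shared by all server instances (assumption).
    private static final List<String> ITEMS =
            Arrays.asList("apple", "banana", "cherry", "date", "elderberry");

    private StatelessCatalog() {}

    // Everything needed to answer arrives with the request itself:
    // the caller's credentials and the position in the listing.
    static List<String> handle(String authToken, int offset, int limit) {
        if (authToken == null || authToken.isEmpty()) {
            throw new IllegalArgumentException("missing credentials");
        }
        // Clamp the requested window to the available items.
        int from = Math.min(Math.max(offset, 0), ITEMS.size());
        int to = (int) Math.min((long) from + Math.max(limit, 0), ITEMS.size());
        return new ArrayList<>(ITEMS.subList(from, to));
    }
}
```

A client that wants the next page simply sends a larger offset; the server never needs to know which page it served last.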
Cache
In order to offset the negative network performance impact of the stateless constraint, we add a requirement that server responses should be explicitly or implicitly marked as cacheable or non-cacheable. This allows messages to be reused and improves user experience, as the requested data is obtained faster and with lower bandwidth. It also improves server performance and allows the server to process more requests. Cache control lets us improve performance without losing any freshness of the data; however, we can improve it even further if freshness is not that important. Part of designing a cache strategy is determining how often the data changes, whether it’s private and how sensitive it is.
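In HTTP, this marking is typically expressed with the `Cache-Control` response header. The sketch below is a simplified illustration, not a full implementation of HTTP caching rules: the `CachePolicy` class and its method names are hypothetical, and only the `no-store`, `private`, `public` and `max-age` directives are covered.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper showing how a server might label a response
// and how a cache might decide whether a stored copy is still usable.
final class CachePolicy {
    private CachePolicy() {}

    // Builds a Cache-Control header value from the questions raised above:
    // may the data be cached at all, is it user-specific, how long is it valid?
    static String cacheControl(boolean cacheable, boolean userSpecific, Duration maxAge) {
        if (!cacheable) {
            return "no-store";
        }
        String scope = userSpecific ? "private" : "public";
        return scope + ", max-age=" + maxAge.getSeconds();
    }

    // A stored response is fresh while its age is below the allowed lifetime.
    static boolean isFresh(Instant storedAt, Duration maxAge, Instant now) {
        Duration age = Duration.between(storedAt, now);
        return age.compareTo(maxAge) < 0;
    }
}
```

For example, `CachePolicy.cacheControl(true, true, Duration.ofMinutes(5))` yields `private, max-age=300`, which tells shared caches to stay away while letting the user's own cache reuse the response for five minutes.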
Uniform Interface
Interfaces between system components should be general and decoupled from the implementation. It simplifies the system, improves visibility and allows components to evolve separately. The cost is performance, as in many cases specialized interfaces and protocols would yield better results. REST is optimized for the large-grain hypermedia data transfer typical of the Web. There are four sub-constraints here:
Resource identification in requests – the specific resource is identified by the address (URI) of the request and not hidden in the body.
Resource manipulation through representations – the resource and its representation are conceptually separate, and holding the representation and its metadata is enough to operate on the resource.
Self-descriptive messages – each message has enough information attached to it to be processed.
Hypermedia as the engine of application state (HATEOAS) – the message should contain links that allow the client to discover all related resources and actions.
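The last sub-constraint is the one most often skipped in practice, so a small sketch may help. It assumes a hypothetical order resource with made-up paths and link relation names; the idea is that the links included in a representation depend on the current state of the resource, so the client discovers what it can do next instead of hard-coding it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for the links section of an order representation.
final class OrderLinks {
    private OrderLinks() {}

    // Returns link relation -> URI; all paths and relation names are illustrative.
    static Map<String, String> linksFor(int orderId, String status) {
        Map<String, String> links = new LinkedHashMap<>();
        String self = "/orders/" + orderId;
        links.put("self", self);                    // the resource identifies itself by URI
        links.put("customer", self + "/customer");  // related resource
        if ("PENDING".equals(status)) {
            // Actions are advertised only while they are actually available.
            links.put("payment", self + "/payment");
            links.put("cancel", self + "/cancellation");
        }
        return links;
    }
}
```

A pending order advertises `payment` and `cancel`, while a shipped one offers only `self` and `customer`, so a client that follows links never attempts an action the server would reject.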
Layered System
The system may be composed of multiple transparent layers. Each layer only has knowledge of the layer it directly interacts with. It promotes separation of concerns and component independence, and allows functionality to be added or removed dynamically, as well as security or legal constraints to be enforced. Layers can also be scaled independently, which improves resource utilization. However, the more network layers there are, the worse the performance. The problem can be partially or fully offset with a suitable caching strategy.
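The layering idea can be sketched in-process with plain decorators. This is only an analogy under stated assumptions: real layers would be proxies, gateways or load balancers talking over the network, and the `Layer` interface and class names here are invented for the example. Each layer holds a reference only to the next one, and the caller cannot tell how many layers sit behind the entry point.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical common contract: every layer looks the same to its caller.
interface Layer {
    String handle(String path);
}

// The innermost layer that actually produces content.
final class OriginServer implements Layer {
    int hits = 0; // counts how often the origin was reached

    @Override
    public String handle(String path) {
        hits++;
        return "content of " + path;
    }
}

// Answers repeated requests itself and forwards only cache misses.
final class CachingLayer implements Layer {
    private final Layer next;
    private final Map<String, String> cache = new HashMap<>();

    CachingLayer(Layer next) {
        this.next = next;
    }

    @Override
    public String handle(String path) {
        return cache.computeIfAbsent(path, next::handle);
    }
}

// Enforces a simple access rule before anything reaches the inner layers.
final class AccessControlLayer implements Layer {
    private final Layer next;

    AccessControlLayer(Layer next) {
        this.next = next;
    }

    @Override
    public String handle(String path) {
        if (path.startsWith("/internal")) {
            return "403 Forbidden";
        }
        return next.handle(path);
    }
}
```

Wiring them as `new AccessControlLayer(new CachingLayer(new OriginServer()))` gives a chain in which the cache also illustrates the point above: a repeated request never travels all the way to the origin.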
Code on Demand
An optional sixth constraint states that the server can extend and customize client functionality by transferring executable code. The original example was an applet or a script. This, however, makes assumptions about the technology used by the client, so it has to be applied with care. The most common example would be JavaScript delivered as part of a website, which allows functionality to be extended ad hoc, without changing or redeploying any client software.
The rest of the chapter lists the architectural elements of REST. The examples of those might seem a bit off, but keep in mind that the paper was written almost two decades ago. Data elements include: resource, resource identifier, representation, resource metadata, representation metadata and control data. Connector elements include: client, server, cache, resolver and tunnel. Component elements include: origin server, gateway, proxy and user agent. The final chapter of the thesis describes the lessons learned when designing the HTTP and URI standards and the application of REST principles there.
Conclusion
Oh my, that was a lot of theory. If you managed to reach this point, I will sneak in one more topic around REST: the Richardson Maturity Model. It’s not another PhD thesis, don’t worry. It was originally part of a QCon talk and later became quite well known. Leonard Richardson analyzed tons of famous APIs and came up with four levels to determine how close an API is to being properly RESTful. The model is well described in Martin Fowler’s article, which I recommend reading.
After tackling the design concepts and analogies a month ago, and after today’s piece of computer science theory, we now have a solid background to proceed with our Web API Design series in a more practical direction. This will be the objective of further articles on How to Train Your Java. There is a lot of ground to cover, so stay tuned!