2013-07-11

HTTP/2: More that you need to know...

Ok, so the first two posts I made about HTTP/2 have been receiving excellent feedback. Lots of people seem quite surprised by the amount of new complexity being introduced. Sadly, I wish I could say I covered everything in those first two articles... I did not. Here are a few more bits of information an implementer would need to know...

Multiplexing, The Basics

When it comes to managing a request and response flow over a TCP connection, HTTP/1.1 is actually very simple. Generally speaking it's: open a connection, send a request, wait for the response, then either send a new request or close the connection. Unless pipelining is being used there's only one outstanding request at any given time, and even with pipelining, responses must be sent in the same order the requests were received. Simple. Unfortunately, when it comes to efficiently using TCP connections, HTTP/1.1 is extremely wasteful. The reasons are all very well documented so I won't go into them here.
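
To make that serial model concrete, here's a minimal sketch of the classic HTTP/1.1 flow over a raw socket: one connection, one request, read the whole response, done. It's illustrative only, not a real client.

    import socket

    # The HTTP/1.1 pattern in miniature: one request in flight at a time,
    # and the response is read, in full, before anything else can happen.
    def http11_get(host, path):
        sock = socket.create_connection((host, 80))
        try:
            request = ("GET {} HTTP/1.1\r\n"
                       "Host: {}\r\n"
                       "Connection: close\r\n\r\n").format(path, host)
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)  # block until the single response arrives
                if not data:            # server closed the connection
                    break
                chunks.append(data)
            return b"".join(chunks)
        finally:
            sock.close()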

Improving the efficiency of how the TCP connection is used is the key reason why HTTP/2 is now a binary protocol rather than a text protocol. Specifically, by dividing the flow of data between endpoints into frames, we make it possible to multiplex request and response flows simultaneously over a single connection, allowing us to make full use of the available bandwidth. This alone makes the additional complexity the binary framing layer requires worth the effort. If you're just now starting to look at HTTP/2, this fact might not be readily apparent, but the benefits here are very real.
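
To give you a feel for the framing layer, here's a minimal sketch of packing and parsing a frame header. I'm assuming the 8-octet layout from the current drafts (a 16-bit length, an 8-bit type, 8-bit flags, and a 31-bit stream identifier with one reserved bit); the exact layout may well change before the spec is finalized.

    import struct

    FRAME_HEADER = "!HBBI"  # 16-bit length, 8-bit type, 8-bit flags, then
                            # a reserved bit plus a 31-bit stream identifier

    def pack_frame(frame_type, flags, stream_id, payload):
        """Prefix a payload with a frame header so it can be multiplexed."""
        if len(payload) > 0xFFFF:
            raise ValueError("payload too large for a single frame")
        if stream_id > 0x7FFFFFFF:
            raise ValueError("stream identifiers are limited to 31 bits")
        header = struct.pack(FRAME_HEADER, len(payload), frame_type,
                             flags, stream_id)
        return header + payload

    def unpack_frame_header(data):
        """Parse the 8-octet header into (length, type, flags, stream_id)."""
        length, ftype, flags, stream_id = struct.unpack(FRAME_HEADER, data[:8])
        return length, ftype, flags, stream_id & 0x7FFFFFFF  # mask reserved bit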

Let's take a moment to understand how it works. There are a couple of new concepts you need to learn...

A "stream" is a virtual bidirectional communication channel within an HTTP/2 connection. There are 2^31 - 1 streams available within a single HTTP/2 connection. Whenever the client sends a new HTTP request, it consumes a single stream. Clients only ever use odd-numbered streams, which means a client can potentially send up to 1,073,741,824 individual HTTP requests over a single HTTP/2 connection (trust me, there are existing cases where that's not enough and the client has to create a new separate connection... but that's not important here).
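
A hypothetical stream-identifier allocator makes those numbering rules concrete: odd numbers only, strictly increasing, and once you run out you have to open a new connection.

    # A hypothetical allocator for client-side stream identifiers.
    class ClientStreamIds(object):
        MAX_STREAM_ID = 2 ** 31 - 1

        def __init__(self):
            self._next = 1  # client-initiated streams are odd, starting at 1

        def allocate(self):
            stream_id = self._next
            if stream_id > self.MAX_STREAM_ID:
                raise RuntimeError("stream ids exhausted; open a new connection")
            self._next += 2  # skip even ids, which belong to the server
            return stream_id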

What is important is the fact that every frame sent over the connection includes a stream identifier, and any number of frames referencing any number of streams can be in flight at any given time. A client, then, can open a new connection and send any number of simultaneous HTTP requests to the server, in any order it wants. The server can then respond to those requests in any order it wants... so you could end up receiving a response to your second request before receiving a response to your first (which, as I will discuss later, can have rather interesting results when it comes to non-idempotent API requests).

Let's illustrate that for a second... suppose a client sends two requests to a server...


CLIENT         SERVER
------         ------
  |              |
  |------------->|
  |  request #1  |
  |------------->|
  |  request #2  |
  |              |
  |<-------------|
  |  response #2 |
  |<-------------|
  |  response #1 |
  |              |

When we're talking about loading an HTML page with a large number of embedded links to JavaScript, stylesheets and images, this kind of parallel, multiplexed request and response flow is extremely beneficial for optimizing page load times and perceived performance. But here's an important question: what if request #1 is a GET for /image.jpg and request #2 is a DELETE for /image.jpg? Oooo, that could get quite interesting indeed, especially when it comes to caching. The net result is that REST API clients using HTTP/2 are going to have to be very careful when multiplexing sequences of non-idempotent / unsafe requests. As a general rule of thumb: don't do it.
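
Here's a rough sketch of what that rule of thumb might look like in client code. The connection object and its send()/drain() methods are hypothetical; the point is simply that unsafe requests get serialized rather than multiplexed.

    # Multiplex safe requests freely, but serialize anything unsafe.
    SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}

    def send_requests(connection, requests):
        for method, path in requests:
            if method in SAFE_METHODS:
                connection.send(method, path)  # fire away, order doesn't matter
            else:
                connection.drain()             # let all earlier streams finish
                connection.send(method, path)
                connection.drain()             # and finish before sending more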

Server push is another area where multiplexing and non-idempotent requests become very... um... interesting. I haven't really covered server push much yet so I won't delve into the details here, but consider the following scenario: the server is happily pushing Resource Foo to an intermediate cache sitting between it and the client. The client issues a DELETE on that resource after the server begins pushing it, so the DELETE is received and processed by the server AFTER the server begins delivering the data to the intermediary. In order for everything to keep working correctly, the server and the intermediary need to cancel the pushed resource; otherwise the cache could end up holding an old representation of an already-deleted resource. The client might have no idea that the resource is even being pushed, so care needs to be taken by the server and intermediary to make sure things are handled correctly.
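
Just to sketch the intermediary-side bookkeeping that implies (the cache structure and event hooks here are entirely hypothetical): if a DELETE arrives while a push for the same resource is still in flight, the push has to be abandoned rather than cached.

    # Hypothetical intermediary bookkeeping: never cache a push that was
    # invalidated mid-flight by a DELETE for the same resource.
    class PushAwareCache(object):
        def __init__(self):
            self.entries = {}     # url -> cached representation
            self.pushing = set()  # urls with a push still in flight

        def on_push_start(self, url):
            self.pushing.add(url)

        def on_push_complete(self, url, representation):
            if url in self.pushing:  # skip if the push was canceled mid-flight
                self.pushing.discard(url)
                self.entries[url] = representation

        def on_delete(self, url):
            self.entries.pop(url, None)  # drop any cached copy
            self.pushing.discard(url)    # cancel the in-flight push, if any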

When starting down the path of enabling multiplexing and simultaneous request/response flows, one distinct problem became obvious: head-of-line blocking. That is, multiplexing buys you nothing if sending data on one stream holds up the transmission of data for all the other streams. To address that issue, the notion of Stream Priority was developed.

Every stream has a priority, currently represented as a 31-bit value. The lower the number, the higher the priority. Frames for streams with higher priority ought to be transmitted sooner and more often (though still fairly) than frames for streams with lower priority. The client gets to determine the priority of HTTP requests and responses based on its own needs, and it can dynamically change a stream's priority mid-stream (e.g. imagine a user switching away from a browser tab that is currently playing a video... the transmission priority of that video can be lowered on the fly so it does not eat up your bandwidth... this is a good thing, trust me).
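
A toy frame scheduler shows the idea, assuming the convention above that a lower 31-bit value means a more important stream. A real implementation also has to guard against starving low-priority streams entirely.

    import heapq

    # Frames on streams with lower priority values are dequeued first.
    # A sequence counter keeps ordering stable within a priority level.
    class FrameScheduler(object):
        def __init__(self):
            self._heap = []
            self._seq = 0

        def enqueue(self, priority, frame):
            heapq.heappush(self._heap, (priority, self._seq, frame))
            self._seq += 1

        def next_frame(self):
            if not self._heap:
                return None
            priority, seq, frame = heapq.heappop(self._heap)
            return frame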

Another challenge with multiplexing is that it becomes very easy for a sender to overwhelm the receiving endpoint with too much data. Thus, the idea of flow control within HTTP/2 was born. Flow control allows the recipient to place strict caps on the amount of data a peer can send it at any given time. And yes, it is completely independent of the flow control implemented at the TCP layer.
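
The accounting involved is simple enough to sketch: the sender tracks a window of credit granted by the receiver and refuses to exceed it, and the receiver replenishes that credit as it consumes data (in HTTP/2, via WINDOW_UPDATE frames).

    # A sender-side window: the receiver grants credit, the sender spends it.
    class FlowControlWindow(object):
        def __init__(self, initial=65535):
            self.available = initial

        def consume(self, nbytes):
            if nbytes > self.available:
                raise RuntimeError("flow control violation: window exceeded")
            self.available -= nbytes

        def replenish(self, increment):
            self.available += increment  # e.g. on receiving a WINDOW_UPDATE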

While all of this may be a bit overwhelming at first, the performance gains the design allows are quite impressive and, at least in my opinion, well worth the added complexity. The other bits I've been complaining about (header compression, upgrade negotiation, etc.) are all orthogonal to the framing layer and are not critical to making HTTP/2 work. The framing layer and multiplexing are the one killer feature of HTTP/2 that makes the new protocol worth the effort.

Let's talk about REST APIs...

Ok, so one of the key ideas for HTTP/2 is that it ought to "preserve existing HTTP semantics" while providing this entirely new framing model. What does that really mean, though? It means that the fundamental notion of sending an HTTP request and receiving an HTTP response in return is unchanged. It also means that the fundamental abstract structure of HTTP requests and responses (headers, followed by payload, followed by optional trailers) remains intact. And it means that no changes are being made to the fundamental HTTP methods: a GET is still a GET, a POST is still a POST, etc. The differences lie mainly in how those semantics are expressed on the wire. However, as I described above, multiplexing and server push make the overall picture far more complicated.

HTTP responses might be received and processed in any order! For instance, if you are sending multiple requests to a server via an intermediary, there is no requirement that the intermediary forward those requests to the server in the same order it received them! So be very careful when sending a mix of idempotent and non-idempotent requests over a single connection! The results could be unpredictable, especially if you have no idea what the side effects of a particular non-idempotent operation are. I said this before, but it's worth saying again: take great care when multiplexing non-idempotent, unsafe requests!

One genuinely good thing about the new framing layer is that it provides a mechanism for indicating that a request has not been acted upon in any way. By sending an RST_STREAM frame with the REFUSED_STREAM error code (again, I won't go into the details yet), a server can indicate to the client that it did absolutely nothing with the stream... so even if the HTTP request uses a non-idempotent or unsafe method, it can be safely repeated by the client without fear of adverse side effects. This is a major improvement over HTTP/1.1.
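
A sketch of what that enables on the client side (the connection API here is hypothetical, and the numeric error-code value is illustrative): on REFUSED_STREAM, any request, even an unsafe one, can simply be retried.

    REFUSED_STREAM = 7  # numeric value is illustrative

    # A REFUSED_STREAM reset means the server provably did nothing with the
    # request, so even unsafe methods may be retried without side effects.
    def send_with_retry(connection, method, path, body=None, attempts=3):
        for _ in range(attempts):
            response = connection.request(method, path, body)
            if getattr(response, "rst_error", None) == REFUSED_STREAM:
                continue  # never acted upon; safe to resend
            return response
        raise RuntimeError("request refused repeatedly; giving up")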

That's all for now. Look for another post that delves even further into the features and issues...