2013-07-10

HTTP/2 Status Update

So work on HTTP/2 is progressing. The first "Implementer's Draft" has been published, and work is underway to get initial implementations ready for interop testing at an upcoming face-to-face event in Germany in August.

For those of you who may not be familiar with HTTP/2 yet, you might be asking what's changed... Well, the short answer is: pretty much everything. Let's look at a comparison.

Oh, HTTP/1.1, how I love thee...

For all of HTTP/1.1's flaws, there was one thing it absolutely excelled at: it's flippin' easy as hell to implement. (Note, I didn't say "implement correctly"... that's a different story.) For instance, all you really need to do to send an HTTP request is open a telnet connection to the server on port 80 and send a few lines of text:


  bash-3.2$ telnet www.google.com 80
  Trying 74.125.225.178...
  Connected to www.google.com.
  Escape character is '^]'.
  GET / HTTP/1.1
  Host: www.google.com
  User-Agent: my-user-agent
  X-Some-Header: first

Follow the headers with a blank line, and that's it. Nothing more. What you type is sent to the server, and the server responds in kind. So how have things changed in HTTP/2 so far?
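In fact, the telnet session above can be reproduced with a few lines of plain socket code. A minimal sketch in Python (the request text matches the telnet example; note the CRLF line endings and the blank line that terminates the header block):

```python
import socket

# The exact bytes an HTTP/1.1 GET puts on the wire: what we typed into
# telnet, with CRLF line endings and a blank line ending the headers.
request = (
    "GET / HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "User-Agent: my-user-agent\r\n"
    "X-Some-Header: first\r\n"
    "\r\n"
).encode("ascii")

def send_request(host="www.google.com", port=80):
    """Open a TCP connection, send the request, return the first chunk
    of the response."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        return sock.recv(4096)
```

Calling send_request() performs the same exchange the telnet session does; the point is simply that the request is nothing but human-readable text.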

Holy complication, Batman!

Ok, so the most important difference between HTTP/1.1 and HTTP/2 is the fact that we've switched from a text-based protocol to a binary protocol. That means no more simple telnet connections unless You're Just That Good.

Replacing the text-based design is a new "Framing Layer" that divides messages into sequences of typed binary frames. There are quite a few of these frame types: DATA, HEADERS, PUSH_PROMISE, RST_STREAM, GOAWAY, PING, WINDOW_UPDATE, SETTINGS, and PRIORITY. Each serves a distinct purpose, and even simple implementations will use most of them.

So how do I send a simple GET request like the HTTP/1.1 example above? Unfortunately it's no longer simple.

Step 1: Establish the Connection

The first step is to establish a TCP connection and send a SETTINGS frame. (The server will send its own SETTINGS frame, but you don't have to wait for that to arrive before you begin sending your request, so let's ignore it for now.) The SETTINGS frame the client sends must be prefixed with a very specific sequence of octets, followed by the 8 bytes of the SETTINGS frame header, followed by the individual settings themselves. These settings give the server some parameters it needs to communicate back to the client; we'll get into what those settings are later. For now, let's assume we're sending an empty SETTINGS frame...

This is the absolute minimum our client needs to send over the wire to initialize the connection:


  50 52 49 20 2a 20 48 54 54 50 
  2f 32 2e 30 0d 0a 0d 0a 53 4d 
  0d 0a 0d 0a 00 00 04 00 00 00 
  00 00
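Those 32 bytes can be generated with a short sketch. It assumes the draft's 8-byte frame header layout (16-bit length, 8-bit type, 8-bit flags, 32-bit stream identifier) and uses 0x04 as the SETTINGS type code, both of which you can read straight out of the dump above:

```python
import struct

# The fixed connection preface: the ASCII text "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n".
PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

SETTINGS = 0x04  # SETTINGS frame type code in the implementer's draft

def frame_header(length, frame_type, flags, stream_id):
    """Pack the 8-byte frame header: 16-bit payload length, 8-bit type,
    8-bit flags, 32-bit stream identifier."""
    return struct.pack(">HBBI", length, frame_type, flags, stream_id)

# An empty SETTINGS frame on stream 0: zero-length payload, no flags.
connection_init = PREFACE + frame_header(0, SETTINGS, 0, 0)

print(connection_init.hex())
```

Printing the hex reproduces the dump above byte for byte.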

I guess that's simple enough, what's next?

Step 2: Prepare the Request!

In HTTP/2, a "Request" is encoded as one or more HEADERS frames, followed by zero or more DATA frames, followed by zero or more trailing HEADERS frames. If the request does not have a payload (e.g. our GET request above), then it'll just end up being one or more HEADERS frames, depending on the number of headers we want to send (I'll address that bit of complexity later). Since our example is very simple, we'll end up only having to send a single HEADERS frame for our request.

This is what we send:


00 39 01 05 00 00 00 01 84 83 
42 11 77 77 77 2e 67 6f 6f 67 
6c 65 2e 63 6f 6d 3a 38 30 4c 
0d 6d 79 2d 75 73 65 72 2d 61 
67 65 6e 74 40 0d 78 2d 73 6f 
6d 65 2d 68 65 61 64 65 72 05 
66 69 72 73 74 

Clear as mud right? Let's break it down a bit.

  • The first two bytes (00 39) give the length of the frame payload; this count does not include the eight-byte frame header. There are 57 bytes of payload data in our HEADERS frame.
  • The third octet identifies the frame type. 01 == HEADERS frame.
  • The fourth octet provides flags we need for processing the frame. 05 == the END_STREAM and END_HEADERS flags are set. END_HEADERS means this single HEADERS frame carries all of our headers, and END_STREAM means it's also the last frame we'll send on this stream.
  • The next four octets identify the Stream ID. Every request needs a new Stream ID number. This is a 31-bit integer, and client-initiated streams use odd numbers. We're using Stream #1.
  • The remaining octets (there should be 57 of them) provide the compressed header (name,value) pairs.
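The first four bullets, length, type, flags, and stream identifier, can be pulled back out of the raw bytes with a few lines of code. A sketch (the 0x1 and 0x4 bit values for END_STREAM and END_HEADERS come from the draft, not from the dump itself):

```python
import struct

def parse_frame_header(data):
    """Unpack the first 8 bytes of a frame: length, type, flags, stream id.
    The stream id field is 31 bits; the high bit is reserved, so mask it off."""
    length, frame_type, flags, stream_id = struct.unpack(">HBBI", data[:8])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF

# The header of the HEADERS frame above: 00 39 01 05 00 00 00 01
header = bytes.fromhex("0039010500000001")
length, frame_type, flags, stream_id = parse_frame_header(header)

print(length, frame_type, flags, stream_id)   # 57 1 5 1
assert flags & 0x1   # END_STREAM is set
assert flags & 0x4   # END_HEADERS is set
```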

Wait... compressed header pairs? What does that mean, you ask? Well, this is what it means... Headers are no longer encoded simply as text. For this implementation draft of HTTP/2, a STATEFUL header compression algorithm has been adopted. Each endpoint maintains a compression state that persists throughout the life of the connection. This state is used to encode header (name,value) pairs as compactly as possible. Let's break it down...

  • The first octet in the payload (84) is what's called an "indexed reference". As part of the header compression mechanism, the compression state is pre-populated with a set of extremely common header (name,value) pairs. For the most part, these can be referenced within our HEADERS frame by simply pointing to the index position. Hex value 84 is 10000100 in binary. The most significant bit flags this as an Indexed Representation. Without going into detail, the remaining bits specify Index Position #4, which in our pre-populated compression state, references the (name,value) pair ":method = GET".
  • The next octet (83) is also an "indexed reference". We know because that most significant bit (0x80) is set. The referenced index is Position #3, which in our pre-populated table points to ":path = /". Now we're getting somewhere! Using just two bytes we've encoded two full headers! Woo hoo Compression FTW!
  • The next octet is 42. Uh oh, the most significant bit isn't set. What the hell is 42? Well, in binary, the hex number 42 is 01000010. Look at the three most significant bits. 010 means that it's a "Literal Header with Incremental Indexing". What the hell does that mean? Well, it means that we're passing a new (name,value) pair that does not currently exist in the persistent compression state but needs to be added to that state. However, the 5 remaining least-significant bits tell us that our header uses the same name as another (name,value) pair that is already in the compression state. Without going into the specifics, this is the ":host" header field. The next octet tells us the length of the header value (hex 11 = dec 17)... Looking at the next 17 octets, then, decoding as a UTF-8 string, gives us "www.google.com:80". (Wait... UTF-8? Yes, HTTP/2 allows for UTF-8 encoded header values! Woohoo! Finally!). So that gives us ":host = www.google.com:80".
  • Next, we have the following sequence of octets: 4c 0d 6d 79 2d 75 73 65 72 2d 61 67 65 6e 74. The first (4c) once again tells us that we have a "Literal Header with Incremental Indexing". It also tells us that we're reusing an existing header name ("user-agent" in this case). 0d tells us that the value is 13 UTF-8 bytes long. Decoding the remaining bytes gives us "my-user-agent".
  • Finally, we have the sequence: 40 0d 78 2d 73 6f 6d 65 2d 68 65 61 64 65 72 05 66 69 72 73 74. Once again, 40 indicates we have a Literal Header with Incremental Indexing. However, since none of the least significant bits are set, our header name is being passed as a literal rather than re-using an existing header name. This likely means we're dealing with a brand new header name that does not currently exist in the compression state. The next octet (0d) tells us that our header name is 13 UTF-8 octets long. Decoding those gives us "x-some-header". 05 tells us that our value is 5 octets long, giving us "first".
  • That's it! See, all of our request headers are in there!
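For the curious, here's a toy decoder that handles just the two representations used in this frame. It follows the bullets above rather than the full draft algorithm: the table positions for ":method" and ":path" come from the text, the positions I've used for ":host" and "user-agent" are guesses chosen to make the example decode, and real "incremental indexing" (adding each literal pair back into the table) is skipped.

```python
# Hypothetical pre-populated table, for illustration only. Positions #3
# and #4 are described in the text; #2 and #12 are my guesses.
TABLE = {
    2: (":host", ""),
    3: (":path", "/"),
    4: (":method", "GET"),
    12: ("user-agent", ""),
}

def decode(payload, table):
    """Toy decoder for the two representations used in the example frame.
    A real decoder would also append each literal pair back into the
    table (the "incremental indexing" part); that's skipped here."""
    headers, i = [], 0
    while i < len(payload):
        b = payload[i]; i += 1
        if b & 0x80:                        # 1xxxxxxx: indexed representation
            headers.append(table[b & 0x7F])
        elif b >> 5 == 0b010:               # 010xxxxx: literal w/ incremental indexing
            name_index = b & 0x1F
            if name_index:                  # reuse an already-indexed header name
                name = table[name_index][0]
            else:                           # brand new name: length-prefixed UTF-8
                n = payload[i]; i += 1
                name = payload[i:i + n].decode("utf-8"); i += n
            n = payload[i]; i += 1          # value: length-prefixed UTF-8
            headers.append((name, payload[i:i + n].decode("utf-8"))); i += n
        else:
            raise ValueError("representation not handled by this toy decoder")
    return headers

# The 57 payload octets from the HEADERS frame above:
payload = (bytes.fromhex("8483")
           + bytes([0x42, 17]) + b"www.google.com:80"
           + bytes([0x4C, 13]) + b"my-user-agent"
           + bytes([0x40, 13]) + b"x-some-header"
           + bytes([5]) + b"first")

for name, value in decode(payload, TABLE):
    print(name, "=", value)
```

Running it prints the five headers from our original request, in order.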

So to recap, to send our simple HTTP GET request, we open the TCP connection and send the following complete sequence of bytes:


  50 52 49 20 2a 20 48 54 54 50 
  2f 32 2e 30 0d 0a 0d 0a 53 4d 
  0d 0a 0d 0a 00 00 04 00 00 00 
  00 00 00 39 01 05 00 00 00 01 
  84 83 42 11 77 77 77 2e 67 6f 
  6f 67 6c 65 2e 63 6f 6d 3a 38 
  30 4c 0d 6d 79 2d 75 73 65 72 
  2d 61 67 65 6e 74 40 0d 78 2d 
  73 6f 6d 65 2d 68 65 61 64 65 
  72 05 66 69 72 73 74 

After the server receives the HEADERS frame, it'll respond with its own similarly encoded HEADERS frame, likely followed by a bunch of DATA frames that carry the actual payload data. For the sake of sanity, I won't get into illustrating all that.

One of the downsides of this new binary framing approach is that it makes it exceedingly difficult to show a simple example of an HTTP/2 message flow. With HTTP/1.1 (as in the example above) I can simply type in the exact message that is sent to the server, exactly as it would appear on the wire. For HTTP/2.0, I have to show it in hex because of the binary framing. To make it easier to communicate what is happening, I've devised a relatively simple convention that abstracts away the binary encoding details:


CLIENT
=> HEADERS(1)
    + END_STREAM
    + END_HEADERS
    :method = GET
    :path = /
    :host = www.google.com:80
    user-agent = my-user-agent
    x-some-header = first
<= HEADERS
    - END_STREAM
    + END_HEADERS
<= DATA
    + END_STREAM
    {payload}          

This makes it a bit easier to explain what's happening with the message flow. Unfortunately it does not tell the whole story...

Multiplex! Flow Control! Priority!

HTTP/2 has a bunch of stuff built in that simply does not exist in HTTP/1.1. For one, the framing layer design is built around the notion of multiplexing multiple request and response flows (called "streams") simultaneously over a single connection. In other words, unlike HTTP/1.1, which generally requires a sequential, synchronous request-response flow, HTTP/2 allows a fully asynchronous, bidirectional flow where data from multiple independent requests and responses can be sent at the same time over a single connection. This allows us to use the connection more efficiently, but it opens up a world of other issues... issues that are not easily ignored.

Every stream in an HTTP/2 connection has a Priority. The higher the priority value, the more preference the two endpoints should give to that stream's data. In other words, if a server has two requests waiting for responses, and the priority of request #2 is higher than that of request #1, the data for request #2 should be sent first, but in a way that is fair to #1 and doesn't force #1 to block until #2 is done. Since we have the framing layer, this might mean sending 2 of Stream #2's DATA frames for every 1 of Stream #1's DATA frames. Clear as mud? I'll try to post more on priority and flow control later.
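That two-for-one interleaving can be sketched as a toy frame scheduler. Nothing here is mandated by the draft; it's just one illustration of a weighted, non-blocking scheduling policy:

```python
from collections import deque

def interleave(queues, weights):
    """Toy weighted round-robin: each pass sends up to weights[stream_id]
    frames from each stream's queue, so the higher-priority stream gets
    more frames per pass but the other stream is never blocked outright."""
    sent = []
    while any(queues.values()):
        for stream_id, queue in queues.items():
            for _ in range(weights[stream_id]):
                if queue:
                    sent.append(queue.popleft())
    return sent

queues = {
    1: deque(["DATA(1)"] * 3),   # lower-priority stream
    2: deque(["DATA(2)"] * 6),   # higher-priority stream
}
weights = {1: 1, 2: 2}           # stream 2 gets twice the bandwidth

schedule = interleave(queues, weights)
print(schedule)
```

The output alternates one Stream #1 frame with two Stream #2 frames until both queues drain.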

In addition to Priority, every stream (and the connection itself) can be flow controlled. This means that the receiving endpoint can limit how much data the sending endpoint can send (either client or server). For instance, if you're sending a POST request, the server can incrementally limit the amount of payload data you can send at any given time. (e.g. "Give me no more than ten bytes... ok, now give me ten more..." etc.)
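The "ten bytes at a time" idea looks roughly like this. A sketch of a per-stream sender window; the function name is mine, and the credit grants stand in for WINDOW_UPDATE frames arriving from the receiver:

```python
def send_with_flow_control(payload, initial_window, grants):
    """Toy per-stream flow controller: the sender may transmit only as
    many bytes as the current window allows, then must stop until the
    receiver grants more credit (a WINDOW_UPDATE in HTTP/2 terms)."""
    window, offset, chunks = initial_window, 0, []
    grants = iter(grants)
    while offset < len(payload):
        if window == 0:
            window += next(grants)      # blocked until credit arrives
        n = min(window, len(payload) - offset)
        chunks.append(payload[offset:offset + n])
        offset += n
        window -= n
    return chunks

# "Give me no more than ten bytes... ok, now give me ten more..."
body = b"x" * 25
print(send_with_flow_control(body, initial_window=10, grants=[10, 10]))
```

A 25-byte payload with a 10-byte window goes out as three chunks of 10, 10, and 5 bytes, each of the last two waiting on a grant.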

Together, Priority and Flow Control are intended to allow for more intelligent use of the TCP connection when using multiplexing. For Web Browser type clients, this is a very very good thing.

But wait! There's more!

As if a new binary framing model, multiplexing, stateful header compression, built-in flow control, request priority, and UTF-8 header support weren't enough, HTTP/2 also introduces server-pushed resources. A pushed resource is essentially a pre-emptive response to an implied GET request the server expects the client to make. For instance, if you GET an HTML page that has embedded links to a bunch of JPEG files, the server can infer that you're about to send a bunch of additional GET requests for those JPEG images. Rather than waiting for those GETs, the server can just go ahead and start sending them. Typically, these pushed resources are used to populate a local cache, improving overall performance. The details here are still being worked out, but the concept is actually fairly simple.
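Using the same notation from earlier, a pushed resource might flow something like this (a sketch only; the even stream numbering for the server-initiated stream is per the draft, but the exact placement of the PUSH_PROMISE is my own illustration, since those details are still being worked out):


CLIENT
=> HEADERS(1)
    + END_STREAM
    + END_HEADERS
    :method = GET
    :path = /page.html
    :host = www.example.com:80
<= PUSH_PROMISE(2)
    {server promises a response for /image.jpeg}
<= HEADERS(1)
    + END_HEADERS
<= DATA(1)
    + END_STREAM
    {page.html payload}
<= HEADERS(2)
    + END_HEADERS
<= DATA(2)
    + END_STREAM
    {image.jpeg payload}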

There's quite a bit more to it all, of course, but these are the basics. I'll post more later with additional details on various topics.