2013-08-19

Sigh..

Update: Isaac Schlueter asked me to clarify a few things from this post. I did so in a comment to his post. It's worth reading both.

Update: Git pull request submitted.

So a few folks have argued that using LINK for things like WebMention and PuSH is not a viable alternative because some HTTP server implementations don't support it. My first reaction to such arguments is generally, "Support for HTTP extension verbs is trivial! What's there to support? You look at the verb and report it to the application." It really shouldn't be that difficult. But I decided to perform a quick test using Node.js, and sure enough, the Node.js server throws up on you if you use LINK.

Specifically, give this a try. Create a simple Node.js server and save it as server.js:


  var http = require("http");

  http.createServer(function(request, response) {
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.write("Hello World");
    response.end();
  }).listen(8888);

Start it up:


  bash-3.2$ node server.js

Then fire up telnet and give this a shot:


  bash-3.2$ telnet 127.0.0.1 8888
  Trying 127.0.0.1...
  Connected to localhost.
  Escape character is '^]'.
  LINK / HTTP/1.1
  Connection closed by foreign host.
  bash-3.2$ 

Yep, Node.js doesn't even return a proper 405 Method Not Allowed response... it just literally kills the connection and moves on.
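For comparison, here is a minimal sketch of what a polite refusal could look like if the method actually reached the application. The names `ALLOWED` and `handle` are made up for illustration, and the choice of GET and HEAD as the only implemented methods is an assumption, not anything Node.js prescribes.

```javascript
// Hypothetical handler: pretend this application only implements GET and HEAD.
var ALLOWED = ["GET", "HEAD"];

function handle(request, response) {
  if (ALLOWED.indexOf(request.method) === -1) {
    // Refuse gracefully: 405 plus an Allow header listing what is supported.
    response.writeHead(405, {
      "Content-Type": "text/plain",
      "Allow": ALLOWED.join(", ")
    });
    response.end("Method Not Allowed\n");
    return;
  }
  response.writeHead(200, {"Content-Type": "text/plain"});
  if (request.method === "HEAD") {
    response.end(); // HEAD responses carry no body
  } else {
    response.end("Hello World");
  }
}
```

You would wire it up with `require("http").createServer(handle).listen(8888)`. Of course, this only helps if the parser hands the method to the handler in the first place, which is exactly what does not happen with LINK here.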

At this point, I thought to myself, WTF! I thought Node.js was supposed to be cool, so what the hell is going on here? So I switched over to GitHub to look at the source. This is the mess that I found starting at around line 891 of the http_parser.c file:


        parser->method = (enum http_method) 0;
        parser->index = 1;
        switch (ch) {
          case 'C': parser->method = HTTP_CONNECT; /* or COPY, CHECKOUT */ break;
          case 'D': parser->method = HTTP_DELETE; break;
          case 'G': parser->method = HTTP_GET; break;
          case 'H': parser->method = HTTP_HEAD; break;
          case 'L': parser->method = HTTP_LOCK; break;
          case 'M': parser->method = HTTP_MKCOL; /* or MOVE, MKACTIVITY, MERGE, M-SEARCH */ break;
          case 'N': parser->method = HTTP_NOTIFY; break;
          case 'O': parser->method = HTTP_OPTIONS; break;
          case 'P': parser->method = HTTP_POST;
            /* or PROPFIND|PROPPATCH|PUT|PATCH|PURGE */
            break;
          case 'R': parser->method = HTTP_REPORT; break;
          case 'S': parser->method = HTTP_SUBSCRIBE; /* or SEARCH */ break;
          case 'T': parser->method = HTTP_TRACE; break;
          case 'U': parser->method = HTTP_UNLOCK; /* or UNSUBSCRIBE */ break;
          default:
            SET_ERRNO(HPE_INVALID_METHOD);
            goto error;
        }
        parser->state = s_req_method;
        ...

First of all... WTF? Unfortunately, it only gets worse from there as you keep reading through that code. What ought to be painfully obvious is that this is certainly NOT the right way to implement support for HTTP methods.

According to the HTTP specification, the ABNF grammar definition for an HTTP method is:


   tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
    "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
   token = 1*tchar
   method = token

This means that an HTTP method is any non-empty sequence of characters drawn from the letters A-Z and a-z, the digits 0-9, and the symbols "!", "#", "$", "%", "&", "'", "*", "+", "-", ".", "^", "_", "`", "|", and "~". (Yes, a single "+" is a valid HTTP method name.)
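To show how little is needed to honor that grammar, here is a rough sketch of a method check that follows the `token` rule instead of a hard-coded list of verbs. The names `TOKEN`, `isValidMethod`, and `parseRequestLine` are hypothetical, and the request-line splitting is deliberately simplified (it assumes exactly one space between the three parts).

```javascript
// tchar set taken from the grammar quoted above; token = 1*tchar
var TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// True when name is a syntactically valid HTTP method (any non-empty token).
function isValidMethod(name) {
  return typeof name === "string" && TOKEN.test(name);
}

// Hypothetical helper: split "METHOD target HTTP-version" into its parts.
// Returns null when the line is malformed or the method is not a token.
function parseRequestLine(line) {
  var parts = line.split(" ");
  if (parts.length !== 3 || !isValidMethod(parts[0])) {
    return null;
  }
  return { method: parts[0], target: parts[1], version: parts[2] };
}
```

With a check like this, `parseRequestLine("LINK / HTTP/1.1")` yields a method of `"LINK"` that can simply be reported to the application, which then decides whether it supports the verb.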

Let's take a look at how Node.js handles a few other "extension" HTTP methods. First, however, let's modify our server script just a bit so we can see which request method the Node.js server thinks it received:


  var http = require("http");

  http.createServer(function(request, response) {
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.write("Hello World " + request.method);
    response.end();
  }).listen(8888);

Start it up:


  bash-3.2$ node server.js

Now do:


  bash-3.2$ telnet 127.0.0.1 8888
  Trying 127.0.0.1...
  Connected to localhost.
  Escape character is '^]'.
  GEM / HTTP/1.1
  Host: example.org

  HTTP/1.1 200 OK
  Content-Type: text/plain
  Date: Mon, 19 Aug 2013 20:48:04 GMT
  Connection: keep-alive
  Transfer-Encoding: chunked

  f
  Hello World GET
  0

  PUN / HTTP/1.1
  Host: example.org

  HTTP/1.1 200 OK
  Content-Type: text/plain
  Date: Mon, 19 Aug 2013 20:49:18 GMT
  Connection: keep-alive
  Transfer-Encoding: chunked

  f
  Hello World PUT
  0

  POSH / HTTP/1.1
  Connection closed by foreign host.
  bash-3.2$ 

So, we sent "GEM" and the server interpreted it as "GET". We sent "PUN" and the server interpreted it as "PUT". We sent "POSH" and the server treated it as an error and dropped the connection. And if we send "LINK", which is a legitimate, registered extension method that has been around since HTTP/1.0, Node.js dies. That's just silly.
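If you'd rather not type into telnet, the same experiment can be scripted. This is a minimal sketch using Node's `net` module, assuming the test server from above is listening on port 8888. `buildRequest` and `probe` are hypothetical names, and what you see will depend on the Node.js version you run it against.

```javascript
var net = require("net");

// Build the same minimal request typed into telnet above.
function buildRequest(method) {
  return method + " / HTTP/1.1\r\nHost: example.org\r\n\r\n";
}

// Send one raw request and report the status line, or note that the
// server closed the connection without sending anything back.
function probe(method, port, done) {
  var received = "";
  var socket = net.connect(port, "127.0.0.1", function () {
    socket.write(buildRequest(method));
  });
  socket.setEncoding("utf8");
  socket.on("data", function (chunk) {
    received += chunk;
    socket.end(); // the first chunk is enough to see the status line
  });
  socket.on("error", function () {
    // Swallow the error here; the "close" event still fires afterwards.
  });
  socket.on("close", function () {
    var statusLine = received.split("\r\n")[0];
    done(method, statusLine || "(connection closed, no response)");
  });
}

// Example usage (uncomment with the server running):
// ["GEM", "PUN", "POSH", "LINK"].forEach(function (m) {
//   probe(m, 8888, function (method, result) {
//     console.log(method + " -> " + result);
//   });
// });
```

Note that the status line alone does not show how the method was interpreted; for that you still need to look at the response body from the modified server.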

Node.js is clearly doing things incorrectly. It's no wonder some Node.js developers don't like the idea of using LINK: their tools are getting in the way!

Update: Node.js issue filed.