Wednesday, November 5, 2008

headersComplete

Hoo boy. I have really not had much time for this project lately. A few days ago, I did manage to do yet more prep work for what will eventually become session support. It may not seem like it at first, but bear with me.

extensions.org.icongarden.http needed to be able to report whether the script has finished writing the response headers. Since these headers should be in ASCII (specifically a variant called NET-ASCII, which ends lines with a CR LF sequence), I assume that people will use the writeASCII function to write headers. I embedded a little state machine in the extension which watches characters as they pass through writeASCII and writeUTF8 and detects the sequence CR LF CR LF, which signals the end of the headers. The machine has a state for each character in that sequence, so it always knows how much of the sequence has already been written. That means the script doesn't have to get this exactly right; the extension will finish the sequence if necessary.

And when might that happen? When the script calls writeUTF8, the function checks the machine's state and, before writing any of the UTF-8 data, writes however many characters of the CR LF CR LF sequence are still missing. It does this on the assumption that if you're writing UTF-8 then you must be done writing headers, because headers must be ASCII. You can of course use writeASCII to write the body of your response, but if you do, you're responsible for providing the CR LF CR LF yourself.
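The behavior described above can be sketched in JavaScript. This is only an illustration of the idea, not the extension's actual code, which isn't shown in this post. The names createResponse, step, TERMINATOR, and output are made up for the sketch; only writeASCII and writeUTF8 come from the extension, and here they collect strings in memory rather than writing to a real response.

```javascript
// The sequence that ends the header block: CR LF CR LF.
const TERMINATOR = "\r\n\r\n";

// Advance the machine by one character. `matched` is the state: how many
// characters of TERMINATOR have been seen in a row (0 through 4).
function step(matched, ch) {
  if (matched === TERMINATOR.length) return matched; // headers already ended
  if (ch === TERMINATOR[matched]) return matched + 1; // sequence continues
  return ch === "\r" ? 1 : 0; // a stray CR may start a new sequence
}

// Hypothetical in-memory stand-in for the response object.
function createResponse() {
  let matched = 0;
  const chunks = [];
  return {
    writeASCII(text) {
      for (const ch of text) matched = step(matched, ch);
      chunks.push(text);
    },
    writeUTF8(text) {
      // Supply whatever part of CR LF CR LF the script left out, on the
      // assumption that UTF-8 data means the headers are over.
      if (matched < TERMINATOR.length) {
        chunks.push(TERMINATOR.slice(matched));
        matched = TERMINATOR.length;
      }
      chunks.push(text);
    },
    output() {
      return chunks.join("");
    },
  };
}

// The script ends its only header line with a single CR LF (state 2), so
// writeUTF8 adds the second CR LF before the body.
const response = createResponse();
response.writeASCII("Content-Type: text/plain\r\n");
response.writeUTF8("héllo");
// response.output() is "Content-Type: text/plain\r\n\r\nhéllo"
```

Because the state is just a count of matched characters, the missing part of the sequence is whatever follows that many characters of CR LF CR LF, which is why a single slice is enough.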

But the real purpose of the state machine is to enable a property called headersComplete, which is a (read-only) boolean indicating whether the state machine has detected that the script is done sending response headers. The forthcoming cookies extension will use this to verify that it can still write response header lines (which will contain cookies). I expect that the cookies extension will throw an exception if headers are complete when the script tries to write cookies into the response.
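Since the cookies extension doesn't exist yet, the following is only a guess at how such a check might look. createStubResponse and writeCookie are hypothetical names, the stub detects the end of the headers with a plain string search rather than the real state machine, and the Set-Cookie line is deliberately simplified. The only thing taken from the extension is the read-only headersComplete property.

```javascript
// Hypothetical stub standing in for the real response object.
function createStubResponse() {
  let written = "";
  const response = {
    writeASCII(text) {
      written += text;
    },
  };
  // A getter with no setter makes the property read-only for scripts.
  Object.defineProperty(response, "headersComplete", {
    get() {
      return written.includes("\r\n\r\n");
    },
    enumerable: true,
  });
  return response;
}

// Hypothetical cookie writer: refuse to add a header line once the
// header block has ended.
function writeCookie(response, name, value) {
  if (response.headersComplete) {
    throw new Error("cannot write cookie: response headers are complete");
  }
  response.writeASCII("Set-Cookie: " + name + "=" + value + "\r\n");
}

const response = createStubResponse();
writeCookie(response, "session", "abc123"); // fine, headers still open
response.writeASCII("\r\n");                // blank line ends the headers
// A further writeCookie(response, ...) call would now throw.
```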

This is all very low-level stuff and I expect that most scripts will use a higher-level layer written entirely in JavaScript which abstracts away these details.

Update: Ah, silly me. For some reason I got the impression that HTTP headers were in NET-ASCII, but that turns out not to be the case. At least one of the cookie response headers is specified as having UTF-8, and, really, why not? It's not as if UTF-8 will confuse anyone who consumes text as bytes, even if they think it's ASCII. I tried to take advantage of a detail which simply does not exist.