Web Application Interface
It is a problem almost every language used for web development has dealt with: the low level interface between the web server and the application. The earliest example of a solution is the venerable and battle-worn Common Gateway Interface (CGI), providing a language-agnostic interface using only standard input, standard output and environment variables.
Back when Perl was becoming the de facto web programming language, a major shortcoming of CGI became apparent: the process needed to be started anew for each request. When dealing with an interpretted language and application requiring database connection, this overhead became unbearable. FastCGI (and later SCGI) arose as a successor to CGI, but it seems that much of the programming world went in a different direction.
Each language began creating its own standard for interfacing with servers. mod_perl. mod_python. mod_php. mod_ruby. Within the same language, multiple interfaces arose. In some cases, we even had interfaces on top of interfaces. And all of this led to much duplicated effort: a Python application designed to work with FastCGI wouldn’t work with mod_python; mod_python only exists for certain webservers; and these programming language specific web server extensions need to be written for each programming language.
Haskell has its own history. We originally had the cgi package, which provided a monadic interface. The fastcgi package then provided the same interface. Meanwhile, it seemed that the majority of Haskell web development focused on the standalone server. The problem is that each server comes with its own interface, meaning that you need to target a specific backend. This means that it is impossible to share common features, like GZIP encoding, development servers, and testing frameworks.
WAI attempts to solve this, by providing a generic and efficient interface between web servers and applications. Any handler supporting the interface can serve any WAI application, while any application using the interface can run on any handler.
At the time of writing, there are various backends, including Warp, FastCGI, and development server. There are even more esoteric backends like wai-handler-webkit for creating desktop apps. wai-extra provides many common middleware components like GZIP, JSON-P and virtual hosting. wai-test makes it easy to write unit tests, and wai-handler-devel lets you develop your applications without worrying about stopping to compile. Yesod targets WAI, as well as other Haskell web frameworks such as Scotty and MFlow. It’s also used by some applications that skip the framework entirely, including Hoogle.
The Interface
The interface itself is very straight-forward: an application takes a request and returns a response. A response is an HTTP status, a list of headers and a response body. A request contains various information: the requested path, query string, request body, HTTP version, and so on.
In order to handle resource management in an exception-safe manner, we use
continuation passing style for returning the response, similar to how the
bracket
function works. This makes our definition of an application look
like:
type Application =
Request ->
(Response -> IO ResponseReceived) ->
IO ResponseReceived
The first argument is a Request
, which shouldn’t be too surprising. The
second argument is the continuation, or what we should do with a Response
.
Generally speaking, this will just be sending it to the client. We use the
special ResponseReceived
type to ensure that the application does in fact
call the continuation.
This may seem a little strange, but usage is pretty straight-forward, as we’ll demonstrate below.
Response Body
Haskell has a datatype known as a lazy bytestring. By utilizing laziness, you can create large values without exhausting memory. Using lazy I/O, you can do such tricks as having a value which represents the entire contents of a file, yet only occupies a small memory footprint. In theory, a lazy bytestring is the only representation necessary for a response body.
In practice, while lazy byte strings are wonderful for generating "pure" values, the lazy I/O necessary to read a file introduces some non-determinism into our programs. When serving thousands of small files a second, the limiting factor is not memory, but file handles. Using lazy I/O, file handles may not be freed immediately, leading to resource exhaustion. To deal with this, WAI provides its own streaming data interface.
The core of this streaming interface is the Builder
. A Builder
represents
an action to fill up a buffer with bytes of data. This is more efficient than
simply passing ByteString
s around, as it can avoid multiple copies of data.
In many cases, an application needs only to provide a single Builder
value.
And for that simple case, we have the ResponseBuilder
constructor.
However, there are times when an Application
will need to interleave IO
actions with yielding of data to the client. For that case, we have
ResponseStream
. With ResponseStream
, you provide a function. This
function in turn takes two actions: a "yield more data" action, and a "flush
the buffer" action. This allows you to yield data, perform IO
actions, and
flush, as many times as you need, and with any interleaving desired.
There is one further optimization: many operating systems provide a sendfile
system call, which sends a file directly to a socket, bypassing a lot of the
memory copying inherent in more general I/O system calls. For that case, we
have a ResponseFile
.
Finally, there are some cases where we need to break out of the HTTP mode
entirely. Two examples are WebSockets, where we need to upgrade a half-duplex
HTTP connection to a full-duplex connection, and HTTPS proxying, which requires
our proxy server to establish a connection, and then become a dumb data
transport agent. For these cases, we have the ResponseRaw
constructor. Note
that not all WAI handlers can in fact support ResponseRaw
, though the most
commonly used handler, Warp, does provide this support.
Request Body
Like response bodies, we could theoretically use a lazy ByteString for request
bodies, but in practice we want to avoid lazy I/O. Instead, the request body is
represented with a IO ByteString
action (ByteString
here being a strict
ByteString). Note that this does not return the entire request body, but
rather just the next chunk of data. Once you’ve consumed the entire request
body, further calls to this action will return an empty ByteString.
Note that, unlike response bodies, we have no need for using Builder
s on
the request side, since our purpose is purely for reading.
The request body could in theory contain any type of data, but the most common are URL encoded and multipart form data. The wai-extra package contains built-in support for parsing these in a memory-efficient manner.
Hello World
To demonstrate the simplicity of WAI, let’s look at a hello world example. In this example, we’re going to use the OverloadedStrings language extension to avoid explicitly packing string values into bytestrings.
{-# LANGUAGE OverloadedStrings #-}
import Network.Wai
import Network.HTTP.Types (status200)
import Network.Wai.Handler.Warp (run)
application _ respond = respond $
responseLBS status200 [("Content-Type", "text/plain")] "Hello World"
main = run 3000 application
Lines 2 through 4 perform our imports. Warp is provided by the warp package,
and is the premiere WAI backend. WAI is also built on top of the http-types
package, which provides a number of datatypes and convenience values, including
status200
.
First we define our application. Since we don’t care about the specific request
parameters, we ignore the first argument to the function, which contains the
request value. The second argument is our "send a response" function, which we
immediately use. The response value we send is built from a lazy ByteString
(thus responseLBS
), with status code 200 ("OK"), a text/plain content type,
and a body containing the words "Hello World". Pretty straight-forward.
Resource allocation
Let’s make this a little more interesting, and try to allocate a resource for
our response. We’ll create an MVar
in our main
function to track the number
of requests, and then hold that MVar
while sending each response.
{-# LANGUAGE OverloadedStrings #-}
import Blaze.ByteString.Builder (fromByteString)
import Blaze.ByteString.Builder.Char.Utf8 (fromShow)
import Control.Concurrent.MVar
import Data.Monoid ((<>))
import Network.HTTP.Types (status200)
import Network.Wai
import Network.Wai.Handler.Warp (run)
application countRef _ respond = do
modifyMVar countRef $ \count -> do
let count' = count + 1
msg = fromByteString "You are visitor number: " <>
fromShow count'
responseReceived <- respond $ responseBuilder
status200
[("Content-Type", "text/plain")]
msg
return (count', responseReceived)
main = do
visitorCount <- newMVar 0
run 3000 $ application visitorCount
This is where WAI’s continuation interface shines. We’re able to use the
standard modifyMVar
function to acquire the MVar
lock and send our
response. Note how we thread the responseReceived
value through, though we
never actually use the value for anything. It is merely witness to the fact
that we have, in fact, sent a response.
Notice also how we take advantage of Builder
s in constructing our msg
value. Instead of concatenating two ByteString
s together directly, we
monoidally append two different Builder
values. The advantage to this is that
the results will end up being copied directly into the final output buffer,
instead of first being copied into a temporary ByteString
buffer to only
later be copied into the final buffer.
Streaming response
Let’s give our streaming interface a test as well:
{-# LANGUAGE OverloadedStrings #-}
import Blaze.ByteString.Builder (fromByteString)
import Control.Concurrent (threadDelay)
import Network.HTTP.Types (status200)
import Network.Wai
import Network.Wai.Handler.Warp (run)
application _ respond = respond $ responseStream status200 [("Content-Type", "text/plain")]
$ \send flush -> do
send $ fromByteString "Starting the response...\n"
flush
threadDelay 1000000
send $ fromByteString "All done!\n"
main = run 3000 application
We use responseStream
, and our third argument is a function which takes our
"send a builder" and "flush the buffer" functions. Notice how we flush after
our first chunk of data, to make sure the client sees the data immediately.
However, there’s no need to flush at the end of a response. WAI requires that
the handler automatically flush at the end of a stream.
Middleware
In addition to allowing our applications to run on multiple backends without code changes, the WAI allows us another benefits: middleware. Middleware is essentially an application transformer, taking one application and returning another one.
Middleware components can be used to provide lots of services: cleaning up URLs, authentication, caching, JSON-P requests. But perhaps the most useful and most intuitive middleware is gzip compression. The middleware works very simply: it parses the request headers to determine if a client supports compression, and if so compresses the response body and adds the appropriate response header.
The great thing about middlewares is that they are unobtrusive. Let’s see how we would apply the gzip middleware to our hello world application.
{-# LANGUAGE OverloadedStrings #-}
import Network.Wai
import Network.Wai.Handler.Warp (run)
import Network.Wai.Middleware.Gzip (gzip, def)
import Network.HTTP.Types (status200)
application _ respond = respond $ responseLBS status200 [("Content-Type", "text/plain")]
"Hello World"
main = run 3000 $ gzip def application
We added an import line to actually have access to the middleware, and then
simply applied gzip to our application. You can also chain together multiple
middlewares: a line such as gzip False $ jsonp $ othermiddleware $
myapplication
is perfectly valid. One word of warning: the order the
middleware is applied can be important. For example, jsonp needs to work on
uncompressed data, so if you apply it after you apply gzip, you’ll have
trouble.