Continuing the pattern I started with conduit, I've written up a resourcet package overview on the School of Haskell. For convenience, here's the content of the tutorial. But for the best experience, I recommend reading the School of Haskell version, which includes active code.
Proper exception handling, especially in the presence of asynchronous exceptions, is a non-trivial task. But such proper handling is absolutely vital to any large scale application. Leaked file descriptors or database connections will simply not be an option when writing a popular web application, or a high concurrency data processing tool. So the question is, how do you deal with it?
The standard approach is the bracket pattern, which appears throughout much of
the standard libraries. withFile
uses the bracket pattern to safely wrap up
openFile
and closeFile
, guaranteeing that the file handle will be closed no
matter what. This approach works well, and I highly recommend using it.
However, there's another approach available: the resourcet package. If the bracket pattern is so good, why do we need another one? The goal of this post is to answer that question.
What is ResourceT
ResourceT is a monad transformer which creates a region of code where you can safely allocate resources. Let's write a simple example program: we'll ask the user for some input and pretend like it's a scarce resource that must be released. We'll then do something dangerous (potentially introducing a divide-by-zero error). We then want to immediately release our scarce resource and perform some long-running computation.
import Control.Monad.Trans.Resource
import Control.Monad.IO.Class
main = runResourceT $ do
(releaseKey, resource) <- allocate
(do
putStrLn "Enter some number"
readLn)
(\i -> putStrLn $ "Freeing scarce resource: " ++ show i)
doSomethingDangerous resource
liftIO $ putStrLn $ "Going to release resource immediately: " ++ show resource
release releaseKey
somethingElse
doSomethingDangerous i =
liftIO $ putStrLn $ "5 divided by " ++ show i ++ " is " ++ show (5 `div` i)
somethingElse = liftIO $ putStrLn
"This could take a long time, don't delay releasing the resource!"
Try entering a valid value, such as 3, and then enter 0. Notice that in both cases the "Freeing scarce resource" message it printed. And by using release
before somethingElse
, we guarantee that the resource is freed before running the potentially long process.
In this specific case, we could easily represent our code in terms of bracket with a little refactoring.
import Control.Exception (bracket)
main = do
bracket
(do
putStrLn "Enter some number"
readLn)
(\i -> putStrLn $ "Freeing scarce resource: " ++ show i)
doSomethingDangerous
somethingElse
doSomethingDangerous i =
putStrLn $ "5 divided by " ++ show i ++ " is " ++ show (5 `div` i)
somethingElse = putStrLn
"This could take a long time, don't delay releasing the resource!"
In fact, the bracket
version is cleaner than the resourcet version. If so, why bother with resourcet at all? Let's build up to the more complicated cases.
bracket in terms of ResourceT
The first thing to demonstrate is that ResourceT
is strictly more powerful than bracket
, in the sense that:
bracket
can be implemented in terms ofResourceT
.ResourceT
cannot be implemented in terms ofbracket
.
The first one is pretty easy to demonstrate:
import Control.Monad.Trans.Resource
import Control.Monad.Trans.Class
bracket alloc free inside = runResourceT $ do
(_releaseKey, resource) <- allocate alloc free
lift $ inside resource
main = bracket
(putStrLn "Allocating" >> return 5)
(\i -> putStrLn $ "Freeing: " ++ show i)
(\i -> putStrLn $ "Using: " ++ show i)
Now let's analyze why the second statement is true.
What ResourceT adds
The bracket
pattern is designed with nested resource allocations. For example, consider the following program which copies data from one file to another. We'll open up the source file using withFile
, and then nest within it another withFile
to open the destination file, and finally do the copying with both file handles.
{-# START_FILE main.hs #-}
import System.IO
import qualified Data.ByteString as S
main = do
withFile "input.txt" ReadMode $ \input ->
withFile "output.txt" WriteMode $ \output -> do
bs <- S.hGetContents input
S.hPutStr output bs
S.readFile "output.txt" >>= S.putStr
{-# START_FILE input.txt #-}
This is the input file.
But now, let's tweak this a bit. Instead of reading from a single file, we want to read from two files and concatenate them. We could just have three nested withFile
calls, but that would be inefficient: we'd have two Handle
s open for reading at once, even though we'll only ever need one. We could restructure our program a bit instead: put the withFile
for the output file on the outside, and then have two calls to withFile
for the input files on the inside.
But consider a more complicated example. Instead of just a single destination file, let's say we want to break up our input stream into chunks of, say, 50 bytes each, and write each chunk to successive output files. We now need to interleave allocations and freeings of both the source and destination files, and we cannot statically know exactly how the interleaving will look, since we don't know the size of the files at compile time.
This is the kind of situation that resourcet
solves well (we'll demonstrate in the next section). As an extension of this, we can write library functions which allow user code to request arbitrary resource allocations, and we can guarantee that they will be cleaned up. A prime example of this is in WAI (Web Application Interface). The user application may wish to allocate some scarce resources (such as database statements) and use them in the generation of the response body. Using ResourceT
, the web server can guarantee that these resources will be cleaned up.
Interleaving with conduit
Let's demonstrate the interleaving example described above. To simplify the code, we'll use the conduit package for the actual chunking implementation. Notice when you run the program that there are never more than two file handles open at the same time.
import Control.Monad.IO.Class (liftIO)
import Data.Conduit (addCleanup, runResourceT, ($$), (=$))
import Data.Conduit.Binary (isolate, sinkFile, sourceFile)
import Data.Conduit.List (peek)
import Data.Conduit.Zlib (gzip)
import System.Directory (createDirectoryIfMissing)
-- show
-- All of the files we'll read from
infiles = map (\i -> "input/" ++ show i ++ ".bin") [1..10]
-- Generate a filename to write to
outfile i = "output/" ++ show i ++ ".gz"
-- Monad instance of Source allows us to simply mapM_ to create a single Source
-- for reading all of the files sequentially.
source = mapM_ sourceFileTrace infiles
-- The Sink is a bit more complicated: we keep reading 30kb chunks of data into
-- new files. We then use peek to check if there is any data left in the
-- stream. If there is, we continue the process.
sink =
loop 1
where
loop i = do
isolate (30 * 1024) =$ sinkFileTrace (outfile i)
mx <- peek
case mx of
Nothing -> return ()
Just _ -> loop (i + 1)
-- Putting it all together is trivial. ResourceT guarantees we have exception
-- safety.
transform = runResourceT $ source $$ gzip =$ sink
-- /show
-- Just some setup for running our test.
main = do
createDirectoryIfMissing True "input"
createDirectoryIfMissing True "output"
mapM_ fillRandom infiles
transform
fillRandom fp = runResourceT
$ sourceFile "/dev/urandom"
$$ isolate (50 * 1024)
=$ sinkFile fp
-- Modified sourceFile and sinkFile that print when they are opening and
-- closing file handles, to demonstrate interleaved allocation.
sourceFileTrace fp = do
liftIO $ putStrLn $ "Opening: " ++ fp
addCleanup (const $ liftIO $ putStrLn $ "Closing: " ++ fp) (sourceFile fp)
sinkFileTrace fp = do
liftIO $ putStrLn $ "Opening: " ++ fp
addCleanup (const $ liftIO $ putStrLn $ "Closing: " ++ fp) (sinkFile fp)
resourcet is not conduit
resourcet was originally created in the process of writing the conduit package. As a result, many people have the impression that these two concepts are intrinsically linked. In fact, this is not true: each can be used separately from the other. The canonical demonstration of resourcet combined with conduit is the file copy function:
import Data.Conduit
import Data.Conduit.Binary
fileCopy src dst = runResourceT $ sourceFile src $$ sinkFile dst
main = do
writeFile "input.txt" "Hello"
fileCopy "input.txt" "output.txt"
readFile "output.txt" >>= putStrLn
However, since this function does not actually use any of ResourceT's added functionality, it can easily be implemented with the bracket pattern instead:
import Data.Conduit
import Data.Conduit.Binary
import System.IO
fileCopy src dst = withFile src ReadMode $ \srcH ->
withFile dst WriteMode $ \dstH ->
sourceHandle srcH $$ sinkHandle dstH
main = do
writeFile "input.txt" "Hello"
fileCopy "input.txt" "output.txt"
readFile "output.txt" >>= putStrLn
Likewise, resourcet can be freely used for more flexible resource management without touching conduit. In other words, these two libraries are completely orthogonal and, while they complement each other nicely, can certainly be used separately.
Conclusion
ResourceT provides you with a flexible means of allocating resources in an exception safe manner. Its main advantage over the simpler bracket pattern is that it allows interleaving of allocations, allowing for more complicated programs to be created efficiently. If your needs are simple, stick with bracket. If you have need of something more complex, resourcet may be your answer.