The Resource monad transformer
The Resource transformer (ResourceT
) plays a vital role in proper resource
management in the conduit project. It is included within the conduit
package
itself. We'll explaining ResourceT
as its own entity. While some of the design
decisions clearly are biased towards conduits, ResourceT
should remain a usable
tool in its own right.
Goals
What's wrong with the following code?
import System.IO main = do output <- openFile "output.txt" WriteMode input <- openFile "input.txt" ReadMode hGetContents input >>= hPutStr output hClose input hClose output
If the file input.txt
does not exist, then an exception will be thrown
when trying to open it. As a result, hClose output
will never be called, and
we'll have leaked a scarce resource (a file descriptor). In our tiny program, this isn't a big
deal, but clearly we can't afford such waste in a long running, highly active server process.
Fortunately, solving the problem is easy:
import System.IO main = withFile "output.txt" WriteMode $ \output -> withFile "input.txt" ReadMode $ \input -> hGetContents input >>= hPutStr output
withFile
makes sure that the Handle
is always closed, even in
the presence of exceptions. It also handles asynchronous exceptions. Overall, it's a great
approach to use... when you can use it. While often withFile
is easy to use,
sometimes it can require restructuring our programs. And this restructuring can range from mildly
tedious to wildly inefficient.
Let's take enumerators for example. If you look in the documentation, there is an
enumFile
function (for reading contents from a file), but no
iterFile
(for writing contents to a file). That's because the flow of control
in an iteratee doesn't allow proper allocation of the Handle. Instead, in order to write to a
file, you need to allocate the Handle before entering the Iteratee, e.g.:
import System.IO import Data.Enumerator import Data.Enumerator.Binary main = withFile "output.txt" WriteMode $ \output -> run_ $ enumFile "input.txt" $$ iterHandle output
This code works fine, but imagine that, instead of simply piping data directly to the file, there was a huge amount of computation that occurred before we need to use the output handle. We will have allocated a file descriptor long before we needed it, and thereby locked up a scarce resource in our application. Besides this, there are times when we can't allocate the file before hand, such as when we won't know which file to open until we've read from the input file.
One of the stated goals of conduits is to solve this problem, and it does so via
ResourceT
. As a result, the above program can be written in conduit as:
{-# LANGUAGE OverloadedStrings #-} import Data.Conduit import Data.Conduit.Binary main = runResourceT $ sourceFile "input.txt" $$ sinkFile "output.txt"
How it Works
There are essentially three base functions on ResourceT
, and then a bunch of
conveniences thrown on top. The first function is:
register :: IO () -> ResourceT IO ReleaseKey
This function registers a piece of code that it asserts must be run. It gives back a
ReleaseKey
, which is used by the next function:
release :: ReleaseKey -> ResourceT IO ()
Calling release
on a ReleaseKey
immediately performs the
action you previously registered. You may call release
on the same
ReleaseKey
as many times as you like; the first time it is called, it
unregisters the action. This means you can safely register an action like a memory
free, and have no concerns that it will be called twice.
Eventually, we'll want to exit our special ResourceT
. To do so, we use:
runResourceT :: ResourceT IO a -> IO a
This seemingly innocuous function is where all the magic happens. It runs through all of the
registered cleanup actions and performs them. It is fully exception safe, meaning the cleanups
will be performed in the presence of both synchronous and asynchronous exceptions. And as
mentioned before, calling release
will unregister an action, so there is no
concern of double-freeing.
Finally, as a convenience, we provide one more function for the common case of allocating a resource and registering a release action:
with :: IO a -- ^ allocate -> (a -> IO ()) -- ^ free resource -> ResourceT IO (ReleaseKey, a)
So, to rework our first buggy example to use ResourceT
, we would write:
import System.IO import Control.Monad.Trans.Resource import Control.Monad.Trans.Class (lift) main = runResourceT $ do (releaseO, output) <- with (openFile "output.txt" WriteMode) hClose (releaseI, input) <- with (openFile "input.txt" ReadMode) hClose lift $ hGetContents input >>= hPutStr output release releaseI release releaseO
Now there is no concern of any exceptions preventing the releasing of resources. We could skip
the release
calls if we want to, and in an example this small, it would not make
any difference. But for larger applications, where we want processing to continue, this ensures
that the Handle
s are freed as early as possible, keeping our scarce resource
usage to a minimum.
Some Type Magic
As alluded to, there's a bit more to ResourceT
than simply running in
IO
. Let's cover some of the things we need from this underlying
Monad
.
- Mutable references to keep track of the registered release actions. You might think we could
just use a
StateT
transformer, but then our state wouldn't survive exceptions. - We only want to register actions in the base monad. For example, if we have a
ResourceT (WriterT [Int] IO)
stack, we only want to registerIO
actions. This makes it easy to lift our stacks around (i.e., add an extra transformer to the middle of an existing stack), and avoids confusing issues about the threading of other monadic side-effects. - Some way to guarantee an action is performed, even in the presence of exceptions. This boils
down to needing a
bracket
-like function.
For the first point, we define a new typeclass to represent monads that have mutable references:
class Monad m => HasRef m where type Ref m :: * -> * newRef' :: a -> m (Ref m a) readRef' :: Ref m a -> m a writeRef' :: Ref m a -> a -> m () modifyRef' :: Ref m a -> (a -> (a, b)) -> m b mask :: ((forall a. m a -> m a) -> m b) -> m b mask_ :: m a -> m a try :: m a -> m (Either SomeException a)
We have an associated type to signify what the reference type should be. (For fans of fundeps,
you'll see in the next section that this has to be an associated type.) Then we provide a
number of basic reference operations. Finally, there are some functions to help with exceptions,
which are needed to safely implement the functions described in the last section. The instance
for IO
is very straight-forward:
instance HasRef IO where type Ref IO = I.IORef newRef' = I.newIORef modifyRef' = I.atomicModifyIORef readRef' = I.readIORef writeRef' = I.writeIORef mask = E.mask mask_ = E.mask_ try = E.try
However, we have a problem when it comes to implementing the ST
instance:
there is no way to deal with exceptions in the ST
monad. As a result,
mask
, mask_
and try
are given default
implementations that do no exception checking. This gives rise to the first word of warning:
operations in the ST monad are not exception safe. You should not be allocating scarce
resources in ST when using ResourceT
. You might be wondering why bother with
ResourceT
at all then for ST
. The answer is that there is a
lot you can do with conduits without allocating scarce resources, and ST
is a
great way to do this in a pure way. But more on this later.
Now onto point 2: we need some way to deal with this base monad concept. Again, we use an associated type (again explained in the next section). Our solution looks something like:
class (HasRef (Base m), Monad m) => Resource m where type Base m :: * -> * resourceLiftBase :: Base m a -> m a
But we forgot about point 3: some bracket
-like function. So we need one more
method in this typeclass:
resourceBracket_ :: Base m a -> Base m b -> m c -> m c
The reason the first two arguments to resourceBracket_
(allocation and
cleanup) live in Base m
instead of m
is that, in
ResourceT
, all allocation and cleanup lives in the base monad.
So on top of our HasRef
instance for IO
, we now need a
Resource
instance as well. This is similarly straight-forward:
instance Resource IO where type Base IO = IO resourceLiftBase = id resourceBracket_ = E.bracket_
We have similar ST
instances, with resourceBracket_
having no
exception safety. The final step is dealing with monad transformers. We don't need to provide a
HasRef
instance, but we do need a Resource
instance. The
tricky part is providing a valid implementation of resourceBracket_
. For this,
we use some functions from monad-control:
instance (MonadTransControl t, Resource m, Monad (t m)) => Resource (t m) where type Base (t m) = Base m resourceLiftBase = lift . resourceLiftBase resourceBracket_ a b c = control' $ \run -> resourceBracket_ a b (run c) where control' f = liftWith f >>= restoreT . return
For any transformer, its base is the base of its inner monad. Similarly, we lift to the base by
lifting to the inner monad and then lifting to the base from there. The tricky part is the
implemetnation of resourceBracket_
. I will not go into a detailed explanation,
as I would simply make a fool of myself.
Definition of ResourceT
We now have enough information to understand the definition of ResourceT
:
newtype ReleaseKey = ReleaseKey Int type RefCount = Int type NextKey = Int data ReleaseMap base = ReleaseMap !NextKey !RefCount !(IntMap (base ())) newtype ResourceT m a = ResourceT (Ref (Base m) (ReleaseMap (Base m)) -> m a)
We see that ReleaseKey
is simply an Int
. If you skip a few
lines down, this will make sense, since we're using an IntMap
to keep track of
the registered actions. We also define two type synonyms: RefCount
and
NextKey
. NextKey
keeps track of the most recently assigned
value for a key, and is incremented each time register
is called. We'll touch on
RefCount
later.
The ReleaseMap
is three pieces of information: the next key and the reference
count, and then the map of all registered actions. Notice that ReleaseMap
takes
a type parameter base
, which states which monad release actions must live
in.
Finally, a ResourceT
is essentially a ReaderT
that keeps a
mutable reference to a ReleaseMap
. The reference type is determined by the base
of the monad in question, as is the cleanup monad. This is why we need to use associated
types.
The majority of the rest of the code in the Control.Monad.Trans.Resource
module is just providing instances for the ResourceT
type.
Other Typeclasses
There are three other typeclasses provided by the module:
- ResourceUnsafeIO
- Any monad which can lift
IO
actions into it, but that this may be considered unsafe. The prime candidate here isST
. Care should be taken to only lift actions which do not acquire scarce resources and which don't "fire the missiles." In other words, all the normal warnings ofunsafeIOToST
apply. - ResourceThrow
- For actions that can throw exceptions. This automatically applies to all
IO
-based monads. ForST
-based monads, you can use the suppliedExceptionT
transformer to provide exception-throwing capabilities. Some functions in conduit, for example, will require this (e.g., text decoding). - ResourceIO
- A convenience class tying together a bunch of other classes, included the two mentioned above. This is purely for convenience; you could achieve the same effect without this type class, you'd just have to do a lot more typing.
Forking
It would seem that forking a thread would be inherently unsafe with ResourceT
,
since the parent thread may call runResourceT
while the child thread is still
accessing some of the allocated resources. This is indeed true, if you use the normal
forkIO
function.
In order to solve this, ResourceT
includes reference counting. When you fork a
new thread via resourceForkIO
, the RefCount
value of the
ReleaseMap
is incremented. Every time runResourceT
is called,
the value is decremented. Only when the value hits 0 are all the release actions called.
Convenience Exports
In addition to what's been listed so far, there are a few extra functions exported (mostly) for convenience.
newRef
,writeRef
, andreadRef
wrap up theHasRef
versions of the functions and allow them to run in anyResourceT
.withIO
is essentially a type-restricted version ofwith
, but working around some of the nastiness with types you would otherwise run into. In general: you'll want to usewithIO
when writingIO
code.transResourceT
let's you modify which monad your ResourceT is running in, assuming it keeps the same base.transResourceT :: (Base m ~ Base n) => (m a -> n a) -> ResourceT m a -> ResourceT n a transResourceT f (ResourceT mx) = ResourceT (\r -> f (mx r))