Kazu and I are happy to announce the first release of auto-update, a library to run update actions on a given schedule. To make it more concrete, let's start with a motivating example.
Suppose you're writing a web service which will return the current time. This is simple enough with WAI and Warp, e.g.:
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString.Lazy.Char8 (pack)
import Data.Time (formatTime, getCurrentTime)
import Network.HTTP.Types (status200)
import Network.Wai (responseLBS)
import Network.Wai.Handler.Warp (run)
import System.Locale (defaultTimeLocale)
main :: IO ()
main =
run 3000 app
where
app _ respond = do
now <- getCurrentTime
respond $ responseLBS status200 [("Content-Type", "text/plain")]
$ pack $ formatTime defaultTimeLocale "%c" now
This is all well and good, but it's a bit inefficient. Imagine if you have a
thousand requests per second (some people really like do know what time it
is). We will end up recalculating the string representation of the time a 999
extra times than is necessary! To work around this, we have a simple solution:
spawn a worker thread to calculate the time once per second. (Note: it will
actually calculate it slightly less than once per second due to the way
threadDelay
works; we're assuming we have a little bit of latitude in
returning a value thats a few milliseconds off.)
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever)
import Data.ByteString.Lazy.Char8 (ByteString, pack)
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Time (formatTime, getCurrentTime)
import Network.HTTP.Types (status200)
import Network.Wai (responseLBS)
import Network.Wai.Handler.Warp (run)
import System.Locale (defaultTimeLocale)
getCurrentTimeString :: IO ByteString
getCurrentTimeString = do
now <- getCurrentTime
return $ pack $ formatTime defaultTimeLocale "%c" now
main :: IO ()
main = do
timeRef <- getCurrentTimeString >>= newIORef
_ <- forkIO $ forever $ do
threadDelay 1000000
getCurrentTimeString >>= writeIORef timeRef
run 3000 (app timeRef)
where
app timeRef _ respond = do
time <- readIORef timeRef
respond $ responseLBS status200 [("Content-Type", "text/plain")] time
Now we will calculate the current time once per second, which is far more efficient... right? Well, it depends on server load. Previously, we talked about a server getting a thousand requests per second. Let's instead reverse it: a server that gets one request every thousand seconds. In that case, our optimization turns into a pessimization.
This problem doesn't just affect getting the current time. Another example is flushing logs. A hot web server could be crippled by flushing logs to disk on every request, whereas flushing once a second on a less popular server simply keeps the process running for no reason. One option is to put the power in the hands of users of a library to decide how often to flush. But often times, we won't know until runtime how frequently a service will be requested. Or even more complicated: traffic will come in spikes, with both busy and idle times.
(Note that I've only given examples of running web servers, though I'm certain there are plenty of other examples out there to draw from.)
This is the problem that auto-update comes to solve. With auto-update, you declare an update function, a frequency with which it should run, and a threshold at which it should "daemonize". The first few times you request a value, it's calculated in the main thread. Once you cross the daemonize threshold, a dedicated worker thread is spawned to recalculate the value. If the value is not requested during an update period, the worker thread is shut down, and we go back to the beginning.
Let's see how our running example works out with this:
{-# LANGUAGE OverloadedStrings #-}
import Control.AutoUpdate (defaultUpdateSettings,
mkAutoUpdate, updateAction)
import Data.ByteString.Lazy.Char8 (ByteString, pack)
import Data.Time (formatTime, getCurrentTime)
import Network.HTTP.Types (status200)
import Network.Wai (responseLBS)
import Network.Wai.Handler.Warp (run)
import System.Locale (defaultTimeLocale)
getCurrentTimeString :: IO ByteString
getCurrentTimeString = do
now <- getCurrentTime
return $ pack $ formatTime defaultTimeLocale "%c" now
main :: IO ()
main = do
getTime <- mkAutoUpdate defaultUpdateSettings
{ updateAction = getCurrentTimeString
}
run 3000 (app getTime)
where
app getTime _ respond = do
time <- getTime
respond $ responseLBS status200 [("Content-Type", "text/plain")] time
If you want to see the impact of this change, add a putStrLn
call to
getCurrentTimeString
and make a bunch of requests to the service. You should
see just one request per second, once you get past that initial threshold
period (default of 3).
Kazu and I have started using this library in a few places:
- fast-logger no longer requires explicit flushing; it's handled for you automatically.
- wai-logger and wai-extra's request logger, by extension, inherit this functionality.
- Warp no longer has a dedicated thread for getting the current time.
- The Yesod scaffolding was able to get rid of an annoying bit of commentary.
Hopefully others will enjoy and use this library as well.
Control.Reaper
The second module in auto-update is Control.Reaper
. This provides something
similar, but slightly different, from Control.AutoUpdate
. The goal is to
spawn reaper/cleanup threads on demand. These threads can handle such things
as:
- Recycling resources in a resource pool.
- Closing out unused connections in a connection pool.
- Terminating threads that have overstayed a timeout.
This module is currently being used in Warp for slowloris timeouts and file descriptor cache management, though I will likely use it in http-client in the near future as well for its connection manager management.