Persistent
Forms deal with the boundary between the user and the application. Another boundary we need to deal with is between the application and the storage layer. Whether it be a SQL database, a YAML file, or a binary blob, odds are your storage layer does not natively understand your application’s data types, and you’ll need to perform some marshaling. Persistent is Yesod’s answer to data storage- a type-safe, universal data store interface for Haskell.
Haskell has many different database bindings available. However, most of these have little knowledge of a schema and therefore do not provide useful static guarantees. They also force database-dependent APIs and data types on the programmer.
Some Haskellers have attempted a more revolutionary route: creating Haskell specific data stores that allow one to easily store any strongly typed Haskell data. These options are great for certain use cases, but they constrain one to the storage techniques provided by the library and do not interface well with other languages.
In contrast, Persistent allows us to choose among existing databases that are highly tuned for different data storage use cases, interoperate with other programming languages, and to use a safe and productive query interface, while still keeping the type safety of Haskell datatypes.
Persistent follows the guiding principles of type safety and concise, declarative syntax. Some other nice features are:
-
Database-agnostic. There is first class support for PostgreSQL, SQLite, MySQL and MongoDB, with experimental Redis support.
-
Convenient data modeling. Persistent lets you model relationships and use them in type-safe ways. The default type-safe persistent API does not support joins, allowing support for a wider number of storage layers. Joins and other SQL specific functionality can be achieved through using a raw SQL layer (with very little type safety). An additional library, Esqueleto, builds on top of the Persistent data model, adding type-safe joins and SQL functionality.
-
Automatically perform database migrations
Persistent works well with Yesod, but it is quite usable on its own as a standalone library. Most of this chapter will address Persistent on its own.
Synopsis
The required dependencies for the below are: persistent, persistent-sqlite and persistent-template.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Control.Monad.IO.Class (liftIO)
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
deriving Show
BlogPost
title String
authorId PersonId
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
johnId <- insert $ Person "John Doe" $ Just 35
janeId <- insert $ Person "Jane Doe" Nothing
insert $ BlogPost "My fr1st p0st" johnId
insert $ BlogPost "One more for good measure" johnId
oneJohnPost <- selectList [BlogPostAuthorId ==. johnId] [LimitTo 1]
liftIO $ print (oneJohnPost :: [Entity BlogPost])
john <- get johnId
liftIO $ print (john :: Maybe Person)
delete janeId
deleteWhere [BlogPostAuthorId ==. johnId]
Solving the boundary issue
Suppose you are storing information on people in a SQL database. Your table might look something like:
CREATE TABLE person(id SERIAL PRIMARY KEY, name VARCHAR NOT NULL, age INTEGER)
And if you are using a database like PostgreSQL, you can be guaranteed that the database will never store some arbitrary text in your age field. (The same cannot be said of SQLite, but let’s forget about that for now.) To mirror this database table, you would likely create a Haskell datatype that looks something like:
data Person = Person
{ personName :: Text
, personAge :: Int
}
It looks like everything is type safe: the database schema matches our Haskell datatypes, the database ensures that invalid data can never make it into our data store, and everything is generally awesome. Well, until:
-
You want to pull data from the database, and the database layer gives you the data in an untyped format.
-
You want to find everyone older than 32, and you accidentally write "thirtytwo" in your SQL statement. Guess what: that will compile just fine, and you won’t find out you have a problem until runtime.
-
You decide you want to find the first 10 people alphabetically. No problem… until you make a typo in your SQL. Once again, you don’t find out until runtime.
In dynamic languages, the answer to these issues is unit testing. For everything that can go wrong, make sure you write a test case. But as I am sure you are aware by now, that doesn’t jive well with the Yesod approach to things. We like to take advantage of Haskell’s strong typing to save us wherever possible, and data storage is no exception.
So the question remains: how can we use Haskell’s type system to save the day?
Types
Like routing, there is nothing intrinsically difficult about type-safe data access. It just requires a lot of monotonous, error prone, boiler plate code. As usual, this means we can use the type system to keep us honest. And to avoid some of the drudgery, we’ll use a sprinkling of Template Haskell.
PersistValue
is the basic building block of Persistent. It is a sum type that
can represent data that gets sent to and from a database. Its definition is:
data PersistValue
= PersistText Text
| PersistByteString ByteString
| PersistInt64 Int64
| PersistDouble Double
| PersistRational Rational
| PersistBool Bool
| PersistDay Day
| PersistTimeOfDay TimeOfDay
| PersistUTCTime UTCTime
| PersistNull
| PersistList [PersistValue]
| PersistMap [(Text, PersistValue)]
| PersistObjectId ByteString
-- ^ Intended especially for MongoDB backend
| PersistDbSpecific ByteString
-- ^ Using 'PersistDbSpecific' allows you to use types
-- specific to a particular backend
Each Persistent backend needs to know how to translate the relevant values into
something the database can understand. However, it would be awkward to have to
express all of our data simply in terms of these basic types. The next layer is
the PersistField
typeclass, which defines how an arbitrary Haskell datatype
can be marshaled to and from a PersistValue
. A PersistField
correlates to a
column in a SQL database. In our person example above, name and age would be
our PersistField
s.
To tie up the user side of the code, our last typeclass is PersistEntity
. An
instance of PersistEntity
correlates with a table in a SQL database. This
typeclass defines a number of functions and some associated types. To review,
we have the following correspondence between Persistent and SQL:
SQL | Persistent |
Datatypes (VARCHAR, INTEGER, etc) |
PersistValue |
Column |
PersistField |
Table |
PersistEntity |
Code Generation
In order to ensure that the PersistEntity instances match up properly with your Haskell datatypes, Persistent takes responsibility for both. This is also good from a DRY (Don’t Repeat Yourself) perspective: you only need to define your entities once. Let’s see a quick example:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.TH
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)
mkPersist sqlSettings [persistLowerCase|
Person
name String
age Int
deriving Show
|]
We use a combination of Template Haskell and Quasi-Quotation (like when
defining routes): persistLowerCase
is a quasi-quoter which converts a
whitespace-sensitive syntax into a list of entity definitions. "Lower case"
refers to the format of the generated table names. In this scheme, an
entity like SomeTable
would become the SQL table some_table
. You can also
declare your entities in a separate file using persistFileWith
. mkPersist
takes that list of entities and declares:
-
One Haskell datatype for each entity.
-
A
PersistEntity
instance for each datatype defined.
The example above generates code that looks like the following:
{-# LANGUAGE TypeFamilies, GeneralizedNewtypeDeriving, OverloadedStrings, GADTs #-}
import Database.Persist
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)
import Control.Applicative
data Person = Person
{ personName :: !String
, personAge :: !Int
}
deriving Show
type PersonId = Key Person
instance PersistEntity Person where
newtype Key Person = PersonKey (BackendKey SqlBackend)
deriving (PersistField, Show, Eq, Read, Ord)
-- A Generalized Algebraic Datatype (GADT).
-- This gives us a type-safe approach to matching fields with
-- their datatypes.
data EntityField Person typ where
PersonId :: EntityField Person PersonId
PersonName :: EntityField Person String
PersonAge :: EntityField Person Int
data Unique Person
type PersistEntityBackend Person = SqlBackend
toPersistFields (Person name age) =
[ SomePersistField name
, SomePersistField age
]
fromPersistValues [nameValue, ageValue] = Person
<$> fromPersistValue nameValue
<*> fromPersistValue ageValue
fromPersistValues _ = Left "Invalid fromPersistValues input"
-- Information on each field, used internally to generate SQL statements
persistFieldDef PersonId = FieldDef
(HaskellName "Id")
(DBName "id")
(FTTypeCon Nothing "PersonId")
SqlInt64
[]
True
NoReference
persistFieldDef PersonName = FieldDef
(HaskellName "name")
(DBName "name")
(FTTypeCon Nothing "String")
SqlString
[]
True
NoReference
persistFieldDef PersonAge = FieldDef
(HaskellName "age")
(DBName "age")
(FTTypeCon Nothing "Int")
SqlInt64
[]
True
NoReference
As you might expect, our Person
datatype closely matches the definition we
gave in the original Template Haskell version. We also have a Generalized
Algebraic Datatype (GADT) which gives a separate constructor for each field.
This GADT encodes both the type of the entity and the type of the field. We use
its constructors throughout Persistent, such as to ensure that when we apply a
filter, the types of the filtering value match the field. There’s another
associated newtype for the database primary key of this entity.
We can use the generated Person
type like any other Haskell type, and then
pass it off to other Persistent functions.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Control.Monad.IO.Class (liftIO)
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
michaelId <- insert $ Person "Michael" $ Just 26
michael <- get michaelId
liftIO $ print michael
We start off with some standard database connection code. In this case, we used the single-connection functions. Persistent also comes built in with connection pool functions, which we will generally want to use in production.
In this example, we have seen two functions: insert
creates a new record in
the database and returns its ID. Like everything else in Persistent, IDs are
type safe. We’ll get into more details of how these IDs work later. So when you
call insert $ Person "Michael" 26
, it gives you a value back of type
PersonId
.
The next function we see is get
, which attempts to load a value from the
database using an Id
. In Persistent, you never need to worry that you are
using the key from the wrong table: trying to load up a different entity (like
House
) using a PersonId
will never compile.
PersistStore
One last detail is left unexplained from the previous example: what exactly
does runSqlite
do, and what is that monad that our database actions are
running in?
All database actions require a parameter which is an instance of
PersistStore
. As its name implies, every data store (PostgreSQL, SQLite,
MongoDB) has an instance of PersistStore
. This is where all the translations
from PersistValue
to database-specific values occur, where SQL query
generation happens, and so on.
runSqlite
creates a single connection to a database using its supplied
connection string. For our test cases, we will use :memory:
, which uses an
in-memory database. All of the SQL backends share the same instance of
PersistStore
: SqlBackend
. runSqlite
then provides the SqlBackend
value
as an environment parameter to the action via runReaderT
.
One important thing to note is that everything which occurs inside a single
call to runSqlite
runs in a single transaction. This has two important
implications:
-
For many databases, committing a transaction can be a costly activity. By putting multiple steps into a single transaction, you can speed up code dramatically.
-
If an exception is thrown anywhere inside a single call to
runSqlite
, all actions will be rolled back (assuming your backend has rollback support).
Migrations
I’m sorry to tell you, but so far I have lied to you a bit: the example from the previous section does not actually work. If you try to run it, you will get an error message about a missing table.
For SQL databases, one of the major pains can be managing schema changes. Instead of leaving this to the user, Persistent steps in to help, but you have to ask it to help. Let’s see what this looks like:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.TH
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)
share [mkPersist sqlSettings, mkSave "entityDefs"] [persistLowerCase|
Person
name String
age Int
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
-- this line added: that's it!
runMigration $ migrate entityDefs $ entityDef (Nothing :: Maybe Person)
michaelId <- insert $ Person "Michael" 26
michael <- get michaelId
liftIO $ print michael
With this one little code change, Persistent will automatically create your
Person
table for you. This split between runMigration
and migrate
allows
you to migrate multiple tables simultaneously.
This works when dealing with just a few entities, but can quickly get tiresome
once we are dealing with a dozen entities. Instead of repeating yourself,
Persistent provides a helper function, mkMigrate
:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int
deriving Show
Car
color String
make String
model String
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do runMigration migrateAll
mkMigrate
is a Template Haskell function which creates a new function that
will automatically call migrate
on all entities defined in the persist
block. The share
function is just a little helper that passes the information
from the persist block to each Template Haskell function and concatenates the
results.
Persistent has very conservative rules about what it will do during a migration. It starts by loading up table information from the database, complete with all defined SQL datatypes. It then compares that against the entity definition given in the code. For the following cases, it will automatically alter the schema:
-
The datatype of a field changed. However, the database may object to this modification if the data cannot be translated.
-
A field was added. However, if the field is not null, no default value is supplied (we’ll discuss defaults later) and there is already data in the database, the database will not allow this to happen.
-
A field is converted from not null to null. In the opposite case, Persistent will attempt the conversion, contingent upon the database’s approval.
-
A brand new entity is added.
However, there are some cases that Persistent will not handle:
-
Field or entity renames: Persistent has no way of knowing that "name" has now been renamed to "fullName": all it sees is an old field called name and a new field called fullName.
-
Field removals: since this can result in data loss, Persistent by default will refuse to perform the action (you can force the issue by using
runMigrationUnsafe
instead ofrunMigration
, though it is not recommended).
runMigration
will print out the migrations it is running on stderr
(you can
bypass this by using runMigrationSilent
). Whenever possible, it uses ALTER
TABLE
calls. However, in SQLite, ALTER TABLE
has very limited abilities, and
therefore Persistent must resort to copying the data from one table to another.
Finally, if instead of performing a migration, you want Persistent to give
you hints about what migrations are necessary, use the printMigration
function. This function will print out the migrations which runMigration
would perform for you. This may be useful for performing migrations that
Persistent is not capable of, for adding arbitrary SQL to a migration, or just
to log what migrations occurred.
Uniqueness
In addition to declaring fields within an entity, you can also declare uniqueness constraints. A typical example would be requiring that a username be unique.
User
username Text
UniqueUsername username
While each field name must begin with a lowercase letter, the uniqueness constraints must begin with an uppercase letter, since it will be represented in Haskell as a data constructor.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
import Control.Monad.IO.Class (liftIO)
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
firstName String
lastName String
age Int
PersonName firstName lastName
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
insert $ Person "Michael" "Snoyman" 26
michael <- getBy $ PersonName "Michael" "Snoyman"
liftIO $ print michael
To declare a unique combination of fields, we add an extra line to our declaration. Persistent knows that it is defining a unique constructor, since the line begins with a capital letter. Each following word must be a field in this entity.
The main restriction on uniqueness is that it can only be applied to non-null
fields. The reason for this is that the SQL standard is ambiguous on how
uniqueness should be applied to NULL
(e.g., is NULL=NULL
true or false?).
Besides that ambiguity, most SQL engines in fact implement rules which would be
contrary to what the Haskell datatypes anticipate (e.g., PostgreSQL says that
NULL=NULL
is false, whereas Haskell says Nothing == Nothing
is True
).
In addition to providing nice guarantees at the database level about
consistency of your data, uniqueness constraints can also be used to perform
some specific queries within your Haskell code, like the getBy
demonstrated
above. This happens via the Unique
associated type. In the example above, we
end up with a new constructor:
PersonName :: String -> String -> Unique Person
Queries
Depending on what your goal is, there are different approaches to querying the database. Some commands query based on a numeric ID, while others will filter. Queries also differ in the number of results they return: some lookups should return no more than one result (if the lookup key is unique) while others can return many results.
Persistent therefore provides a few different query functions. As usual, we try
to encode as many invariants in the types as possible. For example, a query
that can return only 0 or 1 results will use a Maybe
wrapper, whereas a query
returning many results will return a list.
Fetching by ID
The simplest query you can perform in Persistent is getting based on an ID.
Since this value may or may not exist, its return type is wrapped in a Maybe
.
personId <- insert $ Person "Michael" "Snoyman" 26
maybePerson <- get personId
case maybePerson of
Nothing -> liftIO $ putStrLn "Just kidding, not really there"
Just person -> liftIO $ print person
This can be very useful for sites that provide URLs like /person/5. However,
in such a case, we don’t usually care about the Maybe
wrapper, and just want
the value, returning a 404 message if it is not found. Fortunately, the
get404
(provided by the yesod-persistent package) function helps us out here.
We’ll go into more details when we see integration with Yesod.
Fetching by unique constraint
getBy
is almost identical to get
, except:
-
it takes a uniqueness constraint; that is, instead of an ID it takes a
Unique
value. -
it returns an
Entity
instead of a value. AnEntity
is a combination of database ID and value.
personId <- insert $ Person "Michael" "Snoyman" 26
maybePerson <- getBy $ PersonName "Michael" "Snoyman"
case maybePerson of
Nothing -> liftIO $ putStrLn "Just kidding, not really there"
Just (Entity personId person) -> liftIO $ print person
Like get404
, there is also a getBy404
function.
Select functions
Most likely, you’re going to want more powerful queries. You’ll want to find everyone over a certain age; all cars available in blue; all users without a registered email address. For this, you need one of the select functions.
All the select functions use a similar interface, with slightly different outputs:
Function | Returns |
selectSource |
A NOTE: A |
selectList |
A list containing all the IDs and values from the database. All records will be loaded into memory. |
selectFirst |
Takes just the first ID and value from the database, if available |
selectKeys |
Returns only the keys, without the values, as a |
selectList
is the most commonly used, so we will cover it specifically. Understanding the others should be trivial after that.
selectList
takes two arguments: a list of Filter
s, and a list of
SelectOpt
s. The former is what limits your results based on
characteristics; it allows for equals, less than, is member of, and such.
SelectOpt
s provides for three different features: sorting, limiting output
to a certain number of rows, and offsetting results by a certain number of
rows.
Let’s jump straight into an example of filtering, and then analyze it.
people <- selectList [PersonAge >. 25, PersonAge <=. 30] []
liftIO $ print people
As simple as that example is, we really need to cover three points:
-
PersonAge
is a constructor for an associated phantom type. That might sound scary, but what’s important is that it uniquely identifies the "age" column of the "person" table, and that it knows that the age field is anInt
. (That’s the phantom part.) -
We have a bunch of Persistent filtering operators. They’re all pretty straight-forward: just tack a period to the end of what you’d expect. There are three gotchas here, I’ll explain below.
-
The list of filters is
AND
ed together, so that our constraint means "age is greater than 25 AND age is less than or equal to 30". We’ll describe ORing later.
The one operator that’s surprisingly named is "not equals." We use !=.
, since
/=.
is used for updates (for "divide-and-set", described later). Don’t worry:
if you use the wrong one, the compiler will catch you. The other two surprising
operators are the "is member" and "is not member". They are, respectively,
<-.
and /<-.
(both end with a period).
And regarding OR
s, we use the ||.
operator. For example:
people <- selectList
( [PersonAge >. 25, PersonAge <=. 30]
||. [PersonFirstName /<-. ["Adam", "Bonny"]]
||. ([PersonAge ==. 50] ||. [PersonAge ==. 60])
)
[]
liftIO $ print people
This (completely nonsensical) example means: find people who are 26-30, inclusive, OR whose names are neither Adam or Bonny, OR whose age is either 50 or 60.
SelectOpt
All of our selectList
calls have included an empty list as the second
parameter. That specifies no options, meaning: sort however the database wants,
return all results, and don’t skip any results. A SelectOpt
has four
constructors that can be used to change all that.
- Asc
-
Sort by the given column in ascending order. This uses the same phantom type as filtering, such as
PersonAge
. - Desc
-
Same as
Asc
, in descending order. - LimitTo
-
Takes an
Int
argument. Only return up to the specified number of results. - OffsetBy
-
Takes an
Int
argument. Skip the specified number of results.
The following code defines a function that will break down results into pages. It returns all people aged 18 and over, and then sorts them by age (oldest person first). For people with the same age, they are sorted alphabetically by last name, then first name.
resultsForPage pageNumber = do
let resultsPerPage = 10
selectList
[ PersonAge >=. 18
]
[ Desc PersonAge
, Asc PersonLastName
, Asc PersonFirstName
, LimitTo resultsPerPage
, OffsetBy $ (pageNumber - 1) * resultsPerPage
]
Manipulation
Querying is only half the battle. We also need to be able to add data to and modify existing data in the database.
Insert
It’s all well and good to be able to play with data in the database, but how
does it get there in the first place? The answer is the insert
function. You
just give it a value, and it gives back an ID.
At this point, it makes sense to explain a bit of the philosophy behind Persistent. In many other ORM solutions, the datatypes used to hold data are opaque: you need to go through their defined interfaces to get at and modify the data. That’s not the case with Persistent: we’re using plain old Algebraic Data Types for the whole thing. This means you still get all the great benefits of pattern matching, currying and everything else you’re used to.
However, there are a few things we can’t do. For one, there’s no way to automatically update values in the database every time the record is updated in Haskell. Of course, with Haskell’s normal stance of purity and immutability, this wouldn’t make much sense anyway, so I don’t shed any tears over it.
However, there is one issue that newcomers are often bothered by: why are IDs and values completely separate? It seems like it would be very logical to embed the ID inside the value. In other words, instead of having:
data Person = Person { name :: String }
have
data Person = Person { personId :: PersonId, name :: String }
Well, there’s one problem with this right off the bat: how do we do an insert
? If a Person needs to have an ID, and we get the ID by inserting, and an insert needs a Person, we have an impossible loop. We could solve this with undefined
, but that’s just asking for trouble.
OK, you say, let’s try something a bit safer:
data Person = Person { personId :: Maybe PersonId, name :: String }
I definitely prefer insert $ Person Nothing "Michael"
to insert $ Person
undefined "Michael"
. And now our types will be much simpler, right? For
example, selectList
could return a simple [Person]
instead of that ugly
[Entity SqlPersist Person]
.
The problem is that the "ugliness" is incredibly useful. Having Entity Person
makes it obvious, at the type level, that we’re dealing with a value that
exists in the database. Let’s say we want to create a link to another page that
requires the PersonId
(not an uncommon occurrence as we’ll discuss later).
The Entity Person
form gives us unambiguous access to that information;
embedding PersonId
within Person
with a Maybe
wrapper means an extra
runtime check for Just
, instead of a more error-proof compile time check.
Finally, there’s a semantic mismatch with embedding the ID within the value.
The Person
is the value. Two people are identical (in the context of
Haskell) if all their fields are the same. By embedding the ID in the value,
we’re no longer talking about a person, but about a row in the database.
Equality is no longer really equality, it’s identity: is this the same
person, as opposed to an equivalent person.
In other words, there are some annoyances with having the ID separated out, but overall, it’s the right approach, which in the grand scheme of things leads to better, less buggy code.
Update
Now, in the context of that discussion, let’s think about updating. The simplest way to update is:
let michael = Person "Michael" 26
michaelAfterBirthday = michael { personAge = 27 }
But that’s not actually updating anything, it’s just creating a new Person
value based on the old one. When we say update, we’re not talking about
modifications to the values in Haskell. (We better not be of course, since
data in Haskell is immutable.)
Instead, we’re looking at ways of modifying rows in a table. And the simplest
way to do that is with the update
function.
personId <- insert $ Person "Michael" "Snoyman" 26
update personId [PersonAge =. 27]
update
takes two arguments: an ID, and a list of Update
s. The simplest
update is assignment, but it’s not always the best. What if you want to
increase someone’s age by 1, but you don’t have their current age? Persistent
has you covered:
haveBirthday personId = update personId [PersonAge +=. 1]
And as you might expect, we have all the basic mathematical operators:
+=.
, -=.
, *=.
, and /=.
(full stop). These can be convenient for
updating a single record, but they are also essential for proper ACID
guarantees. Imagine the alternative: pull out a Person
, increment the age,
and update the new value. If you have two threads/processes working on this
database at the same time, you’re in for a world of hurt (hint: race
conditions).
Sometimes you’ll want to update many rows at once (give all your employees a
5% pay increase, for example). updateWhere
takes two parameters: a list of
filters, and a list of updates to apply.
updateWhere [PersonFirstName ==. "Michael"] [PersonAge *=. 2] -- it's been a long day
Occasionally, you’ll just want to completely replace the value in a database
with a different value. For that, you use (surprise) the replace
function.
personId <- insert $ Person "Michael" "Snoyman" 26
replace personId $ Person "John" "Doe" 20
Delete
As much as it pains us, sometimes we must part with our data. To do so, we have three functions:
- delete
-
Delete based on an ID
- deleteBy
-
Delete based on a unique constraint
- deleteWhere
-
Delete based on a set of filters
personId <- insert $ Person "Michael" "Snoyman" 26
delete personId
deleteBy $ PersonName "Michael" "Snoyman"
deleteWhere [PersonFirstName ==. "Michael"]
We can even use deleteWhere
to wipe out all the records in a table, we just
need to give some hints to GHC as to what table we’re interested in:
deleteWhere ([] :: [Filter Person])
Attributes
So far, we have seen a basic syntax for our persistLowerCase
blocks: a line
for the name of our entities, and then an indented line for each field with two
words: the name of the field and the datatype of the field. Persistent handles
more than this: you can assign an arbitrary list of attributes after the first
two words on a line.
Suppose we want to have a Person
entity with an (optional) age and the
timestamp of when he/she was added to the system. For entities already in the
database, we want to just use the current date-time for that timestamp.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
import Control.Monad.IO.Class
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
created UTCTime default=CURRENT_TIME
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
time <- liftIO getCurrentTime
runMigration migrateAll
insert $ Person "Michael" (Just 26) time
insert $ Person "Greg" Nothing time
return ()
Maybe
is a built in, single word attribute. It makes the field optional. In
Haskell, this means it is wrapped in a Maybe
. In SQL, it makes the column
nullable.
The default
attribute is backend specific, and uses whatever syntax is
understood by the database. In this case, it uses the database’s built-in
CURRENT_TIME
function. Suppose that we now want to add a field for a person’s
favorite programming language:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
created UTCTime default=CURRENT_TIME
language String default='Haskell'
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
We need to surround the string with single quotes so that the database can properly interpret it. Finally, Persistent can use double quotes for containing white space, so if we want to set someone’s default home country to be El Salvador:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
created UTCTime default=CURRENT_TIME
language String default='Haskell'
country String "default='El Salvador'"
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
One last trick you can do with attributes is to specify the names to be used for the SQL tables and columns. This can be convenient when interacting with existing databases.
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person sql=the-person-table id=numeric_id
firstName String sql=first_name
lastName String sql=fldLastName
age Int "sql=The Age of the Person"
PersonName firstName lastName
deriving Show
|]
There are a number of other features to the entity definition syntax. An up-to-date list is maintained on the Yesod wiki.
Relations
Persistent allows references between your data types in a manner that is consistent with supporting non-SQL databases. We do this by embedding an ID in the related entity. So if a person has many cars:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Control.Monad.IO.Class (liftIO)
import Data.Time
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
deriving Show
Car
ownerId PersonId
name String
deriving Show
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
bruce <- insert $ Person "Bruce Wayne"
insert $ Car bruce "Bat Mobile"
insert $ Car bruce "Porsche"
-- this could go on a while
cars <- selectList [CarOwnerId ==. bruce] []
liftIO $ print cars
Using this technique, you can define one-to-many relationships. To define many-to-many relationships, we need a join entity, which has a one-to-many relationship with each of the original tables. It is also a good idea to use uniqueness constraints on these. For example, to model a situation where we want to track which people have shopped in which stores:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
Store
name String
PersonStore
personId PersonId
storeId StoreId
UniquePersonStore personId storeId
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
bruce <- insert $ Person "Bruce Wayne"
michael <- insert $ Person "Michael"
target <- insert $ Store "Target"
gucci <- insert $ Store "Gucci"
sevenEleven <- insert $ Store "7-11"
insert $ PersonStore bruce gucci
insert $ PersonStore bruce sevenEleven
insert $ PersonStore michael target
insert $ PersonStore michael sevenEleven
return ()
Closer look at types
So far, we’ve spoken about Person
and PersonId
without really explaining
what they are. In the simplest sense, for a SQL-only system, the PersonId
could just be type PersonId = Int64
. However, that means there is nothing
binding a PersonId
at the type level to the Person
entity. As a result, you
could accidentally use a PersonId
and get a Car
. In order to model this
relationship, we could use phantom types. So, our next naive step would be:
newtype Key entity = Key Int64
type PersonId = Key Person
And that works out really well, until you get to a backend that doesn’t use
Int64 for its IDs. And that’s not just a theoretical question; MongoDB uses
ByteString
s instead. So what we need is a key value that can contain an
Int
and a ByteString
. Seems like a great time for a sum type:
data Key entity = KeyInt Int64 | KeyByteString ByteString
But that’s just asking for trouble. Next we’ll have a backend that uses
timestamps, so we’ll need to add another constructor to Key
. This could go on
for a while. Fortunately, we already have a sum type intended for representing
arbitrary data: PersistValue
:
newtype Key entity = Key PersistValue
And this is (more or less) what Persistent did until version 2.0. However, this
has a different problem: it throws away data. For example, when dealing with a
SQL database, we know that the key type will be an Int64
(assuming defaults
are being used). However, you can’t assert that at the type level with this
construction. So instead, starting with Persistent 2.0, we now use an
associated datatype inside the PersistEntity
class:
class PersistEntity record where
data Key record
...
When you’re working with a SQL backend, and aren’t using a custom key type,
this becomes a newtype wrapper around an Int64
, and the
toSqlKey
/fromSqlKey
functions can perform that type-safe conversion for
you. With MongoDB, on the other hand, it’s a wrapper around a ByteString
.
More complicated, more generic
By default, Persistent will hard-code your datatypes to work with a specific
database backend. When using sqlSettings
, this is the SqlBackend
type. But
if you want to write Persistent code that can be used on multiple backends, you
can enable more generic types by replacing sqlSettings
with sqlSettings {
mpsGeneric = True }
.
To understand why this is necessary, consider relations. Let’s say we want to represent blogs and blog posts. We would use the entity definition:
Blog title Text Post title Text blogId BlogId
We know that BlogId
is just a type synonym for Key Blog
, but how will Key
Blog
be defined? We can’t use an Int64
, since that won’t work for MongoDB.
And we can’t use ByteString
, since that won’t work for SQL databases.
To allow for this, once mpsGeneric
is set to True
, out resulting datatypes have a type parameter to indicate the database backend they use, so that keys can be properly encoded. This looks like:
data BlogGeneric backend = Blog { blogTitle :: Text }
data PostGeneric backend = Post
{ postTitle :: Text
, postBlogId :: Key (BlogGeneric backend)
}
Notice that we still keep the short names for the constructors and the records. Finally, to give a simple interface for normal code, we define some type synonyms:
type Blog = BlogGeneric SqlBackend
type BlogId = Key Blog
type Post = PostGeneric SqlBackend
type PostId = Key Post
And no, SqlBackend
isn’t hard-coded into Persistent anywhere. That
sqlSettings
parameter you’ve been passing to mkPersist
is what tells us to
use SqlBackend
. Mongo code will use mongoSettings
instead.
This might be quite complicated under the surface, but user code hardly ever
touches this. Look back through this whole chapter: not once did we need to
deal with the Key
or Generic
stuff directly. The most common place for it
to pop up is in compiler error messages. So it’s important to be aware that
this exists, but it shouldn’t affect you on a day-to-day basis.
Custom Fields
Occasionally, you will want to define a custom field to be used in your datastore. The most common case is an enumeration, such as employment status. For this, Persistent provides a helper Template Haskell function:
-- @Employment.hs
{-# LANGUAGE TemplateHaskell #-}
module Employment where
import Database.Persist.TH
data Employment = Employed | Unemployed | Retired
deriving (Show, Read, Eq)
derivePersistField "Employment"
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist.Sqlite
import Database.Persist.TH
import Employment
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
employment Employment
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
insert $ Person "Bruce Wayne" Retired
insert $ Person "Peter Parker" Unemployed
insert $ Person "Michael" Employed
return ()
derivePersistField
stores the data in the database using a string field, and
performs marshaling using the Show
and Read
instances of the datatype. This
may not be as efficient as storing via an integer, but it is much more future
proof: even if you add extra constructors in the future, your data will still
be valid.
Persistent: Raw SQL
The Persistent package provides a type safe interface to data stores. It tries to be backend-agnostic, such as not relying on relational features of SQL. My experience has been you can easily perform 95% of what you need to do with the high-level interface. (In fact, most of my web apps use the high level interface exclusively.)
But occassionally you’ll want to use a feature that’s specific to a backend. One feature I’ve used in the past is full text search. In this case, we’ll use the SQL "LIKE" operator, which is not modeled in Persistent. We’ll get all people with the last name "Snoyman" and print the records out.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Database.Persist.TH
import Data.Text (Text)
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)
import Data.Conduit
import qualified Data.Conduit.List as CL
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name Text
|]
main :: IO ()
main = runSqlite ":memory:" $ do
runMigration migrateAll
insert $ Person "Michael Snoyman"
insert $ Person "Miriam Snoyman"
insert $ Person "Eliezer Snoyman"
insert $ Person "Gavriella Snoyman"
insert $ Person "Greg Weber"
insert $ Person "Rick Richardson"
-- Persistent does not provide the LIKE keyword, but we'd like to get the
-- whole Snoyman family...
let sql = "SELECT name FROM Person WHERE name LIKE '%Snoyman'"
rawQuery sql [] $$ CL.mapM_ (liftIO . print)
There is also higher-level support that allows for automated data marshaling. Please see the Haddock API docs for more details.
Integration with Yesod
So you’ve been convinced of the power of Persistent. How do you integrate it with your Yesod application? If you use the scaffolding, most of the work is done for you already. But as we normally do, we’ll build up everything manually here to point out how it works under the surface.
The yesod-persistent package provides the meeting point between Persistent and
Yesod. It provides the YesodPersist
typeclass, which standardizes access to
the database via the runDB
method. Let’s see this in action.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE ViewPatterns #-}
import Yesod
import Database.Persist.Sqlite
import Control.Monad.Trans.Resource (runResourceT)
import Control.Monad.Logger (runStderrLoggingT)
-- Define our entities as usual
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
firstName String
lastName String
age Int
deriving Show
|]
-- We keep our connection pool in the foundation. At program initialization, we
-- create our initial pool, and each time we need to perform an action we check
-- out a single connection from the pool.
data PersistTest = PersistTest ConnectionPool
-- We'll create a single route, to access a person. It's a very common
-- occurrence to use an Id type in routes.
mkYesod "PersistTest" [parseRoutes|
/ HomeR GET
/person/#PersonId PersonR GET
|]
-- Nothing special here
instance Yesod PersistTest
-- Now we need to define a YesodPersist instance, which will keep track of
-- which backend we're using and how to run an action.
instance YesodPersist PersistTest where
type YesodPersistBackend PersistTest = SqlBackend
runDB action = do
PersistTest pool <- getYesod
runSqlPool action pool
-- List all people in the database
getHomeR :: Handler Html
getHomeR = do
people <- runDB $ selectList [] [Asc PersonAge]
defaultLayout
[whamlet|
<ul>
$forall Entity personid person <- people
<li>
<a href=@{PersonR personid}>#{personFirstName person}
|]
-- We'll just return the show value of a person, or a 404 if the Person doesn't
-- exist.
getPersonR :: PersonId -> Handler String
getPersonR personId = do
person <- runDB $ get404 personId
return $ show person
openConnectionCount :: Int
openConnectionCount = 10
main :: IO ()
main = runStderrLoggingT $ withSqlitePool "test.db3" openConnectionCount $ \pool -> liftIO $ do
runResourceT $ flip runSqlPool pool $ do
runMigration migrateAll
insert $ Person "Michael" "Snoyman" 26
warp 3000 $ PersistTest pool
There are two important pieces here for general use. runDB
is used to run a
DB action from within a Handler
. Within the runDB
, you can use any of the
functions we’ve spoken about so far, such as insert
and selectList
.
The other new feature is get404
. It works just like get
, but instead of
returning a Nothing
when a result can’t be found, it returns a 404 message
page. The getPersonR
function is a very common approach used in real-world
Yesod applications: get404
a value and then return a response based on it.
More complex SQL
Persistent strives to be backend-agnostic. The advantage of this approach is code which easily moves from different backend types. The downside is that you lose out on some backend-specific features. Probably the biggest casualty is SQL join support.
Fortunately, thanks to Felipe Lessa, you can have your cake and eat it too. The Esqueleto library provides support for writing type safe SQL queries, using the existing Persistent infrastructure. The Haddocks for that package provide a good introduction to its usage. And since it uses many Persistent concepts, most of your existing Persistent knowledge should transfer over easily.
For a simple example of using Esqueleto, please see the SQL Joins chapter.
Something besides SQLite
To keep the examples in this chapter simple, we’ve used the SQLite backend. Just to round things out, here’s our original synopsis rewritten to work with PostgreSQL:
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Logger (runStderrLoggingT)
import Database.Persist
import Database.Persist.Postgresql
import Database.Persist.TH
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
deriving Show
BlogPost
title String
authorId PersonId
deriving Show
|]
connStr = "host=localhost dbname=test user=test password=test port=5432"
main :: IO ()
main = runStderrLoggingT $ withPostgresqlPool connStr 10 $ \pool -> liftIO $ do
flip runSqlPersistMPool pool $ do
runMigration migrateAll
johnId <- insert $ Person "John Doe" $ Just 35
janeId <- insert $ Person "Jane Doe" Nothing
insert $ BlogPost "My fr1st p0st" johnId
insert $ BlogPost "One more for good measure" johnId
oneJohnPost <- selectList [BlogPostAuthorId ==. johnId] [LimitTo 1]
liftIO $ print (oneJohnPost :: [Entity BlogPost])
john <- get johnId
liftIO $ print (john :: Maybe Person)
delete janeId
deleteWhere [BlogPostAuthorId ==. johnId]
Summary
Persistent brings the type safety of Haskell to your data access layer. Instead of writing error-prone, untyped data access, or manually writing boilerplate marshal code, you can rely on Persistent to automate the process for you.
The goal is to provide everything you need, most of the time. For the times when you need something a bit more powerful, Persistent gives you direct access to the underlying data store, so you can write whatever 5-way joins you want.
Persistent integrates directly into the general Yesod workflow. Not only do
helper packages like yesod-persistent
provide a nice layer, but packages like
yesod-form
and yesod-auth
also leverage Persistent’s features as well.
For more information on the syntax of entity declarations, database connection, etc. Checkout https://github.com/yesodweb/persistent/wiki