Note:
This is part 2 in a 2-part series. Click here to go to part 1. The complete code associated with parts 1 and 2 can be seen here.
In the previous part of this series we saw some examples of usage of Frames, along with some explanation of how it works under the hood. In this part we take a look at the Postgres access code using beam, and then we finally tie things together by actually converting a ‘beam record’ (place holder for “a regular haskell record used by beam
to describe tables”) to a vinyl
record, and from a list of vinyl
s to an actual dataframe.
A Short Introduction to Beam
Haskell has many libraries for Postgres/other SQL database access/usage. To get some idea about how beam
compares with library $x, I would refer you to beam’s FAQ page. Now onward to a mini-tutorial of how to write some data to beam
and read it back.
Consider the following:
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ImpredicativeTypes #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE PartialTypeSignatures #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeFamilies #-}
module PostgresAccess where
import Data.Text (Text)
import Database.Beam as B
import Database.Beam.Backend.SQL.BeamExtensions
import Database.Beam.Postgres
import Database.Beam.Postgres.Syntax
import Database.PostgreSQL.Simple
import Lens.Micro
-- A
data UserT f = User
{ _userEmail :: Columnar f Text
, _userFirstName :: Columnar f Text
, _userLastName :: Columnar f Text
, _userIsMember :: Columnar f Bool
, _userDaysInQueue :: Columnar f Int
} deriving (Generic)
-- B
type User = UserT Identity
type UserId = PrimaryKey UserT Identity
deriving instance Show User
instance Beamable UserT
instance Beamable (PrimaryKey UserT)
instance Table UserT where
data PrimaryKey UserT f = UserId (Columnar f Text) deriving Generic
primaryKey = UserId . _userEmail
-- C
data ShoppingCartDb f = ShoppingCartDb
{ _shoppingCartUsers :: f (TableEntity UserT) }
deriving Generic
instance Database be ShoppingCartDb
-- D
ShoppingCartDb (TableLens shoppingCartUsers) = dbLenses
User (LensFor userEmail) (LensFor userFirstName)
(LensFor userLastName) (LensFor userIsMember)
(LensFor userDaysInQueue) = tableLenses
-- E
shoppingCartDb :: DatabaseSettings be ShoppingCartDb
shoppingCartDb = defaultDbSettings
-- F
allUsers :: Q PgSelectSyntax ShoppingCartDb s (UserT (QExpr PgExpressionSyntax s))
allUsers = all_ (shoppingCartDb ^. shoppingCartUsers)
insertUsers :: Connection -> IO [User]
insertUsers conn =
runBeamPostgresDebug putStrLn conn $ runInsertReturningList (shoppingCartDb ^. shoppingCartUsers) $
insertValues users
users = [ User "james@example.com" "James" "Smith" True 1 {- james -}
, User "betty@example.com" "Betty" "Jones" False 42 {- betty -}
, User "james@pallo.com" "James" "Pallo" True 1 {- james -}
, User "betty@sims.com" "Betty" "Sims" False 42 {- betty -}
, User "james@oreily.com" "James" "O'Reily" True 1 {- james -}
, User "sam@sophitz.com" "Sam" "Sophitz" False 42 {- sam -}
, User "sam@jely.com" "Sam" "Jely" True 1 {- sam -}
, User "sam@example.com" "Sam" "Taylor" False 42 {- sam -}
]
selectAllUsers :: Connection -> IO [User]
selectAllUsers conn =
runBeamPostgresDebug putStrLn conn $ do
users <- runSelectReturningList $ select allUsers
(liftIO . return) users
Here’s how I would unpack all that is happening in the above:
A. The UserT
record declaration (we’ll call it a ‘beam record’ from now on) is what maps onto the underlying database table. Fields in the UserT
beam record are of the type Columnar f a
; here Columnar
is a type family such that when f is the Identity functor, then Columnar f a = a
.
B. Next we declare some type synonyms for the table and its primary key. We also do ‘StandAloneDeriving’ of several typeclass instances needed by beam
(hence we also use the language extension of the same name). The implementation of these instances is filled in by the compiler at compile time (we need the ‘DeriveGeneric’ extension for this). We also state that _userEmail
is our primary key.
C. Here we have the data declaration corresponding to the database (ShoppingCartDb
) and the table/tables inside the database (_shoppingCartUsers
).
D. These are the declarations necessary for using lenses for accessing the table(s) in the database and fields in a particular table (the UserT
table in this case).
E. defaultSettings
implies, amongst other things, that our naming at the DB level and at the program-level is inline with the default naming that beam
expects/handles automatically. In case there is any difference between the expectation and the implementation, we can specify that here.
F. The three declarations here, are used for inserting some mock data and reading back that data from the database.
This is all the essential code we need to access data written to Postgres. In case you’d like to learn more about beam
, I would refer you to the excellent official tutorial.
Tying things together
Now let’s say you’d like to access the data written to Postgres above from a dataframe, in order interpret it (say in the context of some other CSV(s) also read into a dataframe(s)). In more concrete terms, you have code now that returns a list of User
beam-records, and you want to convert this list to a dataframe. A quick glance at the documentation reveals that Frames.InCore
has the following function:
toFrame :: (Foldable f, RecVec rs) => f (Record rs) -> Frame (Record rs)
This function can take in a foldable (e.g. a list) of Record
s. Record
is a type synonym for Rec Identity
taken from vinyl
. Therefore if we can figure out a way to convert our beam-record to a vinyl
record with the column names promoted to the type level (the rs
), then our job is mostly done.
Consider the following code:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeSynonymInstances #-}
module PostgresFrame where
import Data.Text (Text, unpack)
import Data.Vinyl
import qualified Data.Vinyl.Functor as VF
import Frames.Col
import Frames.InCore
import Frames.Rec
import qualified Lens.Micro as LM
import PostgresAccess (User, userDaysInQueue, userEmail,
userFirstName, userIsMember, userLastName)
-- A
type SEmail = "email" :-> String
type SFirstName = "first_name" :-> String
type SLastName = "last_name" :-> String
type SIsMember = "is_member" :-> Bool
type SDaysInQueue = "days_in_queue" :-> Int
-- B
class FromBeam t rs | t -> rs where
createRecId :: t -> Rec VF.Identity rs
-- C
instance FromBeam User '[SEmail, SFirstName, SLastName, SIsMember, SDaysInQueue] where
createRecId u =
VF.Identity (Col $ unpack $ u LM.^. userEmail) :&
VF.Identity (Col $ unpack $ u LM.^. userFirstName) :&
VF.Identity (Col $ unpack $ u LM.^. userLastName) :&
VF.Identity (Col $ u LM.^. userIsMember) :&
VF.Identity (Col $ u LM.^. userDaysInQueue) :&
RNil
Here’s what the above does:
A. Here we declare the column types using the :->
constructor imported from Frames.Col
. TypeOperators
extension helps here by allowing us to use the :->
as infix notation at the type-level. Also since DataKinds
extension is in effect, we are able to promote the column names from the term level (i.e. String
) to the type-level.
B. Here we declare the FromBeam
class that has two parameters (t
and rs
), thus requiring the MultiParamTypeClasses
extension. The only function under this typeclass is createRecId that takes something of type t
(in our case the beam-record t
) and returns a vinyl
record. In fact given a t
(say User
), we are able to uniquely write the typeclass instance for it’s underlying set of column types (rs
). We formalize this fact by making use of the FunctionalDependencies
extension and add the | t -> rs
annotation to the class declaration. (We could probably further constrain the type variable t
to have certain properties, but I haven’t gotten around to thinking about that yet.)
C. This is where we implement an instance of the FromBeam
typeclass for the User
beam-record. Here we are building a heterogenous list (i.e. a list with values of different types) of column values to get a vinyl
record representing a row in our dataframe. Specifically, :&
is like list cons
(i.e. :
) and RNil
is like the empty list ([]
). We instantiate every column with the Col
data constructor (it is the data constructor for the :->
type constructor) that returns column value of the corresponding type. The column types are inferred from using SEmail, SFirstName, SLastName, SIsMember, SDaysInQueue
, in order.
Next we create a database in psql
named shoppingcart1
. And create a table in this DB named cart_users
using the SQL from the sql/cart_users.sql
file.
Once you’ve done the above, all that remains is to build the project, and fire up ghci
:
$ stack build
-- output elided
$ stack ghci
-- some output elided
Ok, five modules loaded.
Collecting type info for 5 module(s) ...
-- some output elided
ghci>:set -XOverloadedStrings
ghci>import Database.PostgreSQL.Simple
ghci>conn <- connectPostgreSQL "host=localhost dbname=shoppingcart1"
ghci>insertUsers conn
INSERT INTO "cart_users"("email", "first_name", "last_name", "is_member", "days_in_queue") VALUES ('james@example.com', 'James', 'Smith', true, 1), ('betty@example.com', 'Betty', 'Jones', false, 42), ('james@pallo.com', 'James', 'Pallo', true, 1), ('betty@sims.com', 'Betty', 'Sims', false, 42), ('james@oreily.com', 'James', 'O''Reily', true, 1), ('sam@sophitz.com', 'Sam', 'Sophitz', false, 42), ('sam@jely.com', 'Sam', 'Jely', true, 1), ('sam@example.com', 'Sam', 'Taylor', false, 42) RETURNING "email", "first_name", "last_name", "is_member", "days_in_queue"
[User {...},...] -- some output elided
ghci>us <- selectAllUsers conn
SELECT "t0"."email" AS "res0", "t0"."first_name" AS "res1", "t0"."last_name" AS "res2", "t0"."is_member" AS "res3", "t0"."days_in_queue" AS "res4" FROM "cart_users" AS "t0"
ghci>let recId = map createRecId us
ghci>import Frames
ghci>let userFrame = toFrame recId
ghci>:t userFrame
userFrame
:: Frame
(Record '[SEmail, SFirstName, SLastName, SIsMember, SDaysInQueue])
ghci>mapM_ print userFrame
{email :-> "james@example.com", first_name :-> "James", last_name :-> "Smith", is_member :-> True, days_in_queue :-> 1}
{email :-> "betty@example.com", first_name :-> "Betty", last_name :-> "Jones", is_member :-> False, days_in_queue :-> 42}
{email :-> "james@pallo.com", first_name :-> "James", last_name :-> "Pallo", is_member :-> True, days_in_queue :-> 1}
{email :-> "betty@sims.com", first_name :-> "Betty", last_name :-> "Sims", is_member :-> False, days_in_queue :-> 42}
{email :-> "james@oreily.com", first_name :-> "James", last_name :-> "O'Reily", is_member :-> True, days_in_queue :-> 1}
{email :-> "sam@sophitz.com", first_name :-> "Sam", last_name :-> "Sophitz", is_member :-> False, days_in_queue :-> 42}
{email :-> "sam@jely.com", first_name :-> "Sam", last_name :-> "Jely", is_member :-> True, days_in_queue :-> 1}
And with that we are now reading data read from Postgres into a dataframe.
What Comes Next?
Over the next few weeks the idea is to generalize the code that reads a beam-record to vinyl
to account for other cases (such as joins, embedded records etc.). This would be followed by figuring out a way to autogenerate the FromBeam
instance for arbitrary beam-records, along with associated type synonyms for column types. This would take out of the picture all sorts of boiler plate (some of which we saw) that the user would have to come up with otherwise.