How AI, OpenAI, and ChatGPT impact business and software.
Template Haskell
1. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Template Haskell
bit.ly/green-th
talk Artyom Kazak (@neongreen)
slides Dmitry Kovanikov (@chshersh)
Hi! I’m Artyom, and this is going to be an hour-long talk about Template
Haskell, which is basically a way to make GHC write boring code for you, which
might sound like a dream come true, but the catch is that first you would have
to write a bunch of awful incomprehensible read-only code that all your
colleagues will hate you for – and you’ll spend about six times as much time
doing that as it would have taken you to write the code you wanted to get.
This is the perfect time to leave this talk and go to the other wing of our
fabulous office, where you can eat free cookies and play kalimba. Those
compelled by social awkwardness to stay will with absolute certainty regret it.
Some basic decency requires me to mention that the slides were actually largely
written by my former colleague Dmitry, who is teaching a Haskell course at the
ITMO university in Saint Petersburg, Russia. You will only have to hear one
talk, but poor Russian students have to hear sixteen and do homework after
each one. If you ever wanted to hire competent Haskellers for cheap, now you
know the place.
3. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Copypaste
instance MonadBaseControl IO IO where
type StM IO a = a
liftBaseWith f = f id
restoreM = return
{-# INLINABLE liftBaseWith #-}
{-# INLINABLE restoreM #-}
instance MonadBaseControl (Either e) (Either e) where
type StM (Either e) a = a
liftBaseWith f = f id
restoreM = return
{-# INLINABLE liftBaseWith #-}
{-# INLINABLE restoreM #-}
instance MonadBaseControl STM STM where
type StM STM a = a
liftBaseWith f = f id
restoreM = return
{-# INLINABLE liftBaseWith #-}
{-# INLINABLE restoreM #-}
The C preprocessor is a magic copypaste removal tool. Let’s say you have this
instance of MonadBaseControl. If you don’t know what MonadBaseControl is,
don’t worry, it doesn’t matter. Although, of course, you should still be
ashamed. The instance for Maybe happens to be similar. Same for lists. Same
for STM. Same for Either. You spend a day writing such instances and
recompiling after each new one, because you are a really careful programmer.
While things are compiling, you are reading Hackernews. Life is good.
4. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
-XCPP
Raw Haskell
instance MonadBaseControl IO IO where
type StM IO a = a
liftBaseWith f = f id
restoreM = return
{-# INLINABLE liftBaseWith #-}
{-# INLINABLE restoreM #-}
With a #define
{-# LANGUAGE CPP #-}
#define BASE(M)
instance MonadBaseControl (M) (M) where {
type StM (M) a = a;
liftBaseWith f = f id;
restoreM = return;
{-# INLINABLE liftBaseWith #-};
{-# INLINABLE restoreM #-}}
Unfortunately, cursed with the knowledge of the C preprocessor, you’ll be able
to write them all in one minute. Here’s how.
We take the Haskell code that we want to duplicate. We enable CPP and
create a #define parametrized by an M. This ”IO” is the only thing that
changes between the instances, so it becomes an ”M”. We also add semicolons
and curly braces because the C preprocessor hates us and will strip away all line
breaks. If this is the first time you see Haskell written with curly braces, you’re
a lucky person. GHC developers see it every day because Simon Peyton-Jones
really likes this style for whatever reason.
5. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Instance generation
BASE(IO)
BASE(Maybe)
BASE(Either e)
BASE([])
BASE((->) r)
BASE(Identity)
BASE(STM)
#if MIN_VERSION_base(4,4,0)
BASE(Strict.ST s)
BASE( ST s)
#endif
#undef BASE
Now we can just give our #define different Ms and it will generate different
instances. This is enough to cover maybe a third of usecases of Template
Haskell. There are some pitfalls – for instance, the default C preprocessor on
macOS and on Linux has different ideas about whether single quotes can occur
in function names, or whether patterns inside quotes should be replaced or not.
I have documented everything I know about CPP at this link, so if you actually
end up using it, please take a look.
6. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Template Haskell
And now we can actually move to Template Haskell. The
aforementioned C preprocessor operates on text; it doesn’t know
anything about Haskell syntax, and in fact it is actively malicious
to our poor Haskell sources sometimes, because it believes them to
be written in C. Template Haskell works in a different way – it
gives us a bunch of giant ADTs that we can use to construct any
piece of Haskell that we need, and then it will run our code and
compile the generated ADT without ever turning it to text. Please
raise a hand if you have never heard about a concept called
”abstract syntax tree”. [An optional explanation.]
7. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
A simple example
fst (x,_) = x
fst3 (x,_,_) = x
fst4 (x,_,_,_) = x
print $ fst3 ("hello world", 1, 2)
print $ fst4 ("hello world", 1, 2, 3)
{-# LANGUAGE TemplateHaskell #-}
print $ $(fstN 3) ("hello world", 1, 2)
print $ $(fstN 4) ("hello world", 1, 2, 3)
import Language.Haskell.TH
fstN :: Int -> Q Exp
fstN n = do
x <- newName "x"
pure $ LamE [TupP $ VarP x : replicate (n - 1) WildP]
(VarE x)
Our first example will be very simple. We just want to do something that all
other languages can do already: get the first element of an arbitrary-length
tuple. We could write different functions for each tuple length, but it gets old
really fast. We’d rather like the compiler to generate this repetitive code for us.
Spoiler: it’s not going to look pretty, but it will work. And I might even show
you how the code generator will look, just in case you’re wondering. I’ll explain
it a couple of slides later.
8. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
A simple example (part 1)
GHC can parse code for us:
ghci> import Language.Haskell.TH
ghci> :set -XTemplateHaskell
ghci> runQ [| (x, _, _) -> x |]
LamE [TupP [VarP x_1,WildP,WildP]] (VarE x_1)
First, you should know that GHC provides a wonderful way to cheat at code
generation: for any piece of code, it can tell you how the syntax tree for that
code looks like. You need to put it into those funny brackets called ”Oxford
brackets” and do runQ. Let’s look at the pieces we have here. VarP and VarE
are constructors of the syntax tree; P stands for Pattern and E stands for
Expression. Those constructors take Names, which are basically strings that
also contain optional information about which module something comes from.
This is needed so that GHC could distinguish between identifiers that are
bound locally and identifiers that come from elsewhere.You can create a new
name with ’mkName’.If we just copy what GHC gave us, we can write our first
generator. Yeah, it will always return the same code, but it’s a start.Before
going further, let’s talk about the rest of the pieces. A lambda takes two
arguments: the left side and the right side. The left side binds some patterns;
the right side is an expression that may refer to the bound variables. Patterns
can be compound; in this example we have a tuple pattern with three elements,
the first one binds something to our ”x”, the rest are wildcards. And on the
right side we just return the bound ”x”.
9. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
A simple example (part 2)
ghci> import Language.Haskell.TH
ghci> :set -XTemplateHaskell
ghci> runQ [| (x, _, _) -> x |]
LamE [TupP [VarP x_1,WildP,WildP]] (VarE x_1)
VarP :: Name -> Pat
VarE :: Name -> Exp
mkName :: String -> Name
fst3 :: Q Exp
fst3 = do
let x = mkName "x"
pure $ LamE [TupP [VarP x, WildP, WildP]] (VarE x)
-- (x, _, _) -> x
10. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Haskell AST
data Exp
= VarE Name -- x
| ConE Name -- Just
| LitE Lit -- 5, 'c', "string"
| AppE Exp Exp -- f x
| AppTypeE Exp Type -- f @Int
| InfixE (Maybe Exp) Exp (Maybe Exp) -- x+y, (x+), (+x)
| LamE [Pat] Exp -- a b c -> ...
| TupE [Exp] -- (a, b, c)
| CondE Exp Exp Exp -- if p then A else B
| ...
data Pat = ...
data Type = ...
data Dec = ...
Here we have a piece of the syntax tree of expressions. Variables and generally
all things that are referred to by name; constructors like ’Just’ are special and
have their own place in the syntax tree; function application; type application;
operator application and operator sections; lambdas; tuples; and ifs.
There are four major syntax types, the other three being patterns, types, and
declarations. At this point you might start feeling that you have become
comfortable with Template Haskell; you are telling yourself that you will just
spend ten minutes looking at Template Haskell docs and you’ll become able to
conjure code out of thin air. However, actually looking at the docs even once
should be enough to disabuse ourselves of this laughable notion.
11. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
A simple example
fst3 :: Q Exp
fst3 = do
let x = mkName "x"
pure $ LamE [TupP [VarP x, WildP, WildP]] (VarE x)
varP :: Name -> Q Pat
varE :: Name -> Q Exp
newName :: String -> Q Name
Those of you familiar with Lisp or Scheme could’ve noticed a tiny
weirdness that everyone else probably missed. What exactly are the
rules for resolving variables? If there was already an ”x” in scope,
will the ”x” on the right side of the lambda refer to the bound one,
or to the one that came from the scope? The answer is ”it will do
the right thing but you shouldn’t rely on that because in some
non-obvious cases it won’t”. We should really just generate a
unique name so that we would be able to sleep at night and not
have nightmares about having mixed up stuff. And yes, we can
generate unique names! The Q monad provides ’newName’ just for
that purpose. And we are going to embrace the Q monad by doing
another change – we’ll use lifted constructors that operate in Q.
This is useful because now, whenever we want to generate unique
names in some subgenerator, or want to use some other thing the
Q monad provides, we won’t have to restructure our code.
12. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
A simple example
fst3 is a macro. Use $(fst3) to expand it. $(...) is called a splice.
ghci> :set -XTemplateHaskell
ghci> :t $(fst3)
$(fst3) :: (t2, t1, t) -> t2
ghci> $(fst3) ("hello", 10, True)
"hello"
[this is the point where I didn’t have enough time to write
extensive speaker notes]
Another name for ”code generator” is ”a macro”. We can ask GHC
to execute that macro by using the dollar.
13. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
A simple example
fst4 :: Q Exp
fst4 = do
x <- newName "x"
lamE [tupP [varP x, wildP, wildP, wildP]] (varE x)
fst5 :: Q Exp
fst5 = do
x <- newName "x"
lamE [tupP [varP x, wildP, wildP, wildP, wildP]] (varE x)
fstN :: Int -- ^ Tuple length
-> Q Exp
fstN n = do
x <- newName "x"
lamE [tupP $ varP x : replicate (n - 1) wildP] (varE x)
Let us generalize the ’fst3’ macro now.
14. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Stage restriction
fst3 :: Q Exp
fst3 = do
x <- newName "x"
lamE [tupP [varP x, wildP, wildP]] (varE x)
main = print ($fst3 (1,2,3))
GHC stage restriction: 'fst3'
is used in a top-level splice or annotation ,
and must be imported, not defined locally
eval :: Exp -> Int
eval expr = $(pure expr)
This is a point where everyone trips. Let’s say you define a macro and want to
use it, like a naive person you are. You will get this error. You must define
macros in a different place from where you use them. Why? Because otherwise
you get code compilation at runtime.
15. .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Declaration order matters :(
import Control.Lens
data Point = Point { _x, _y :: Double }
isNull :: Point -> Bool
isNull p = p ^. x == 0 -- (8)
&& p ^. y == 0 -- (9)
makeLenses ''Point -- (11)
8:17: error: …•
Variable not in scope: x :: Getting Integer Point Integer•‘’
x (splice on line 11) is not in scope before line 11
9:17: error: …•
Variable not in scope: y :: Getting Integer Point Integer•‘’
y (splice on line 11) is not in scope before line 11