http://www.cis.upenn.edu/~cis194/spring13/hw/05-type-classes.pdf

By Elze Hamilton, @elze , elze.hamilton@gmail.com

What follows is the code for the **Parser** and **StackVM** modules. It was written by the authors of CIS194 Homework 5, and provided with the homework. **Parser** parses strings into arithmetic expressions, and **StackVM** runs a "virtual machine". I'm including it here because this presentation is generated from a Jupyter IHaskell notebook, and this code is necessary for the code in the notebook to run. Feel free to ignore the next three slides for the purposes of this presentation. The presentation starts at Slide 5.

The code for Parser.hs could be found at Parser.hs

The code for StackVM.hs could be found at StackVM.hs

In [1]:

```
:extension TypeSynonymInstances
:extension FlexibleInstances
```

In [2]:

```
-- Applicative parser for infix arithmetic expressions without any
-- dependency on hackage. Builds an explicit representation of the
-- syntax tree to fold over using client-supplied semantics.
module Parser (parseExp) where
import Control.Applicative hiding (Const)
import Control.Arrow
import Data.Char
import Data.Monoid
import Data.List (foldl')
-- Building block of a computation with some state of type @s@
-- threaded through it, possibly resulting in a value of type @r@
-- along with some updated state.
newtype State s r = State (s -> Maybe (r, s))
-- Expressions
data Expr = Const Integer
| Add Expr Expr
| Mul Expr Expr
deriving Show
instance Functor (State s) where
fmap f (State g) = State $ fmap (first f) . g
instance Applicative (State s) where
pure x = State $ \s -> Just (x, s)
State f <*> State g = State $ \s ->
case f s of
Nothing -> Nothing
Just (r, s') -> fmap (first r) . g $ s'
instance Alternative (State s) where
empty = State $ const Nothing
State f <|> State g = State $ \s -> maybe (g s) Just (f s)
-- A parser threads some 'String' state through a computation that
-- produces some value of type @a@.
type Parser a = State String a
-- Parse one numerical digit.
digit :: Parser Integer
digit = State $ parseDigit
where parseDigit [] = Nothing
parseDigit s@(c:cs)
| isDigit c = Just (fromIntegral $ digitToInt c, cs)
| otherwise = Nothing
-- Parse an integer. The integer may be prefixed with a negative sign.
num :: Parser Integer
num = maybe id (const negate) <$> optional (char '-') <*> (toInteger <$> some digit)
where toInteger = foldl' ((+) . (* 10)) 0
-- Parse a single white space character.
space :: Parser ()
space = State $ parseSpace
where parseSpace [] = Nothing
parseSpace s@(c:cs)
| isSpace c = Just ((), cs)
| otherwise = Nothing
-- Consume zero or more white space characters.
eatSpace :: Parser ()
eatSpace = const () <$> many space
-- Parse a specific character.
char :: Char -> Parser Char
char c = State parseChar
where parseChar [] = Nothing
parseChar (x:xs) | x == c = Just (c, xs)
| otherwise = Nothing
-- Parse one of our two supported operator symbols.
op :: Parser (Expr -> Expr -> Expr)
op = const Add <$> (char '+') <|> const Mul <$> (char '*')
-- Succeed only if the end of the input has been reached.
eof :: Parser ()
eof = State parseEof
where parseEof [] = Just ((),[])
parseEof _ = Nothing
-- Parse an infix arithmetic expression consisting of integers, plus
-- signs, multiplication signs, and parentheses.
parseExpr :: Parser Expr
parseExpr = eatSpace *>
((buildOp <$> nonOp <*> (eatSpace *> op) <*> parseExpr) <|> nonOp)
where buildOp x op y = x `op` y
nonOp = char '(' *> parseExpr <* char ')' <|> Const <$> num
-- Run a parser over a 'String' returning the parsed value and the
-- remaining 'String' data.
execParser :: Parser a -> String -> Maybe (a, String)
execParser (State f) = f
-- Run a parser over a 'String' returning the parsed value.
evalParser :: Parser a -> String -> Maybe a
evalParser = (fmap fst .) . execParser
-- Parse an arithmetic expression using the supplied semantics for
-- integral constants, addition, and multiplication.
parseExp :: (Integer -> a) -> (a -> a -> a) -> (a -> a -> a) -> String -> Maybe a
parseExp con add mul = (convert <$>) . evalParser (parseExpr <* eof)
where convert (Const x) = con x
convert (Add x y) = add (convert x) (convert y)
convert (Mul x y) = mul (convert x) (convert y)
```

Use <|>

Found:

maybe (g s) Just (f s)

Why Not:

f s <|> g s

Redundant $

Found:

State $ parseDigit

Why Not:

State parseDigit

Redundant $

Found:

State $ parseSpace

Why Not:

State parseSpace

Use void

Found:

const () <$> many space

Why Not:

Control.Monad.void (many space)

Redundant bracket

Found:

const Add <$> (char '+')

Why Not:

const Add <$> char '+'

Redundant bracket

Found:

const Mul <$> (char '*')

Why Not:

const Mul <$> char '*'

In [3]:

```
module StackVM (StackVal(..), StackExp(..), Stack, Program, stackVM) where
-- Values that may appear in the stack. Such a value will also be
-- returned by the stackVM program execution function.
data StackVal = IVal Integer | BVal Bool | Void deriving Show
-- The various expressions our VM understands.
data StackExp = PushI Integer
| PushB Bool
| Add
| Mul
| And
| Or
deriving Show
type Stack = [StackVal]
type Program = [StackExp]
-- Execute the given program. Returns either an error message or the
-- value on top of the stack after execution.
stackVM :: Program -> Either String StackVal
stackVM = execute []
errType :: String -> Either String a
errType op = Left $ "Encountered '" ++ op ++ "' opcode with ill-typed stack."
errUnderflow :: String -> Either String a
errUnderflow op = Left $ "Stack underflow with '" ++ op ++ "' opcode."
-- Execute a program against a given stack.
execute :: Stack -> Program -> Either String StackVal
execute [] [] = Right Void
execute (s:_) [] = Right s
execute s (PushI x : xs) = execute (IVal x : s) xs
execute s (PushB x : xs) = execute (BVal x : s) xs
execute (IVal s1 : IVal s2 : ss) (Add : xs) = execute (s':ss) xs
where s' = IVal (s1 + s2)
execute (_:_:_) (Add:_) = errType "Add"
execute _ (Add:_) = errUnderflow "Add"
execute (IVal s1:IVal s2:ss) (Mul : xs) = execute (s':ss) xs
where s' = IVal (s1 * s2)
execute (_:_:_) (Mul:_) = errType "Mul"
execute _ (Mul:_) = errUnderflow "Mul"
execute (BVal s1:BVal s2:ss) (And : xs) = execute (s':ss) xs
where s' = BVal (s1 && s2)
execute (_:_:_) (And:_) = errType "And"
execute _ (And:_) = errUnderflow "And"
execute (BVal s1 : BVal s2 : ss) (Or : xs) = execute (s':ss) xs
where s' = BVal (s1 || s2)
execute (_:_:_) (Or:_) = errType "Or"
execute _ (Or:_) = errUnderflow "Or"
test = stackVM [PushI 3, PushI 5, Add]
```

Let's say we want to write a calculator in Haskell. All it will do is add and multiply integers.

Maybe we started by modeling the domain with the following data type of arithmetic expressions:

In [4]:

```
data ExprT = Const Integer
| Add ExprT ExprT
| Mul ExprT ExprT
deriving (Show, Eq)
```

This type, ExprT, is capable of representing expressions involving integer constants, addition, and multiplication.

On the right hand side there are three data constructors:

**Const**, which simply wraps an Integer in an ExprT,**Add**, which constructs an ExprT from two other ExprT's, and represents addition**Mul**, which also constructs an ExprT from two other ExprT's, and represents multiplication

For example, the expression (2 + 3) × 4 would be represented by the value

Mul (Add (Const 2) (Const 3)) (Const 4)

But now let's say we want to also make this expression work on Booleans.

Let's say we thought of a way to interpret Integers as Booleans, and we want to construct expressions from them using the same methods.

For example:

- if an Integer is > 0, it is interpreted as True, otherwise False
- An
**Add**for two Booleans is an*or* - A
**Mul**for two Booleans is an*and*

What we want is the expression **Mul (Add (Const (-2)) (Const 3)) (Const 4)** to be interpreted differently, depending on what type the argument represents. If 2, 3, and 4 represent Integers, we want the expression to evaluate to 4 (because (-2 + 3) x 4 = 4). If 2, 3, and 4 represent Booleans, we want this expression to evaluate to True (because (False || True) && True == True).

So, we want methods **Const**, **Add** and **Mul** to have different meanings depending on what the argument type is.

This is where typeclasses help.

A typeclass defines a collection of methods that a type should implement if it wants to have an instance of the typeclass.

In our example, we can write a typeclass for our methods like this:

In [5]:

```
class Expr a where
const :: Integer -> a
add :: a -> a -> a
mul :: a -> a -> a
```

This typeclass, Expr, has methods **const**, **add**, and **mul**.

**const** takes an Integer and converts it to a value of type **a**.

**add** takes two values of type **a** and returns an **a**.

**mul** takes two values of type **a** and returns an **a**.

(The Haskell syntax a -> a -> a can be read as "(a -> a) -> a", or in other words, the argument is a function that takes a value of type **a** and returns a value of type **a**, and the result is a value of type **a**. But since all functions in Haskell are one-argument functions, a two-argument function is a composition of two one-argument functions, a. k. a. curried functions.)

Notice that this syntax only lists method signatures, i.e. the types of the arguments and the result. It says nothing about what the methods should do. That's the job of the types that define the instances of those typeclasses.

To create expressions out of integers, let's write an instance of typeclass Expr for Integer:

In [6]:

```
instance Expr Integer where
const x = x
add x y = x + y
mul x y = x * y
```

In [7]:

```
add (const 2) (const 9)
```

11

To construct boolean expressions out of integers, let's write an instance of typeclass Expr for Bool.

Let's say that a Bool const from an Integer is true if the integer is positive, and false otherwise.

To add two booleans, we OR them, and to multiply them, we AND them.

In [8]:

```
instance Expr Bool where
const x = x > 0
add x y = x || y
mul x y = x && y
```

In [9]:

```
const 5 :: Bool
```

True

In [10]:

```
const (-2) :: Bool
```

False

As we see, we need to specify the type at the end if the expression (as in :: Bool) so the const method would be called as a Bool.

We can define Expr typeclass for even more types. Here is a type MinMax, which is the same as Integer, except that addition for it means taking the maximum of two integers, and multiplication means taking the minimum of the two integers.

In [11]:

```
newtype MinMax = MinMax Integer deriving (Eq, Ord, Show)
instance Expr MinMax where
const x = MinMax x
add x y = max x y
mul x y = min x y
```

In [12]:

```
add (const 5) (const (-2)) :: MinMax
```

MinMax 5

In [13]:

```
mul (const 5) (const (-2)) :: MinMax
```

MinMax (-2)

So, these methods "const", "add" and "mul", implemented differently for each type, can give us a different result (appropriate for that type) from the same expression.

The homework provided us with a Parser module that has a function *parseExp*. This function will parse a string that contains an expression consisting of usual numbers as well as addition and multiplication signs. It will convert it to an expression consisting of **const**, **add** and **mul** methods. Given those methods, it will perform the appropriate operations and give us a result.

E.g. it can take a string "(3 * -4) + 5" and give us a result "7", because that's the result of these operations.

But for that, parseExp needs us to give methods **const**, **add** and **mul** so it would know how to add and multiply those values.

Thus, we call *parseExp* like this:

In [14]:

```
parseExp const add mul "(3 * -4) + 5"
```

Just (-7)

Why "Just"? If we look at the *parseExp* type description

In [15]:

```
:t parseExp
```

parseExp :: forall a. (Integer -> a) -> (a -> a -> a) -> (a -> a -> a) -> String -> Maybe a

we'll see it returns **Maybe a**. The definition of ** Maybe a ** is

data Maybe a = Just a | Nothing

Which means that if there is a value, you'll get a ** Just **, otherwise you'll get a ** Nothing **.

For example, if you give *parseExp* an invalid string (multiplication sign out of place)

In [16]:

```
parseExp const add mul "(3 -4 *) + 5"
```

Nothing

Just like we got an Integer from *parseExp*, we can get a Bool or any other type for which we defined an instance of the Expr typeclass, because we have provided the methods **const**, **add** and **mul**.

In [17]:

```
parseExp const add mul "(3 * -4) + 5" :: Maybe Bool
```

Just True

(because **const 3 && const (-4) == True && False == False** and **False || const 5 == False || True == True**)

In [18]:

```
parseExp const add mul "(3 * -4) + 5" :: Maybe MinMax
```

Just (MinMax 5)

What does this give us?

If we want to implement a function that takes a string of arithmetic expressions and produces the answer as an Integer, a Boolean, or a MinMax, we don't need to write 3 different functions.

We only need a function like *parseExp* that will replace * and + with appropriate methods. And we supply the meaning of those methods via typeclass instances.

You can implement addition and multiplication for even more complex types this way.

A more complex type for which we can write an instance of the **Expr** typeclass (i.e. implement addition and multiplication), is **Program**.

A **Program** consists of a list of **StackExp**'s, or stack expressions.

A **StackExp** represents an instruction for the virtual machine.

It is defined in StackVM.h, which was written by the homework authors and provided to us.

data StackExp = PushI Integer

| PushB Bool

| Add

| Mul

| And

| Or

deriving Show

type Stack = [StackVal] type Program = [StackExp]

(**StackVal** is a type representing a "stack value", but we don't need to know much about it, because it's for the StackVM internal usage.)

Program is a synonym for [StackExp], or a list of stack expressions.

An example of a Program would be

[PushI 3, PushI 4, Add]

It pushes 3 on the stack, pushes 4 on the stack, and adds them.

A longer example would be

[PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]

It does the following:

- Pushes 3 on the stack,
- Pushes 4 on the stack,
- Adds them,
- Pushes 2 on the stack,
- Multiplies the result of the previous addition (7) by 2,
- Pushes 1 on the stack,
- Adds the result of the previous multiplication (14) to 1.

What does it mean to add two **Program**s?

You concatenate their lists and append an **Add** at the end.

What does it mean to multiply two **Program**s?

You concatenate their lists and append a **Mul** at the end.

What is a **const** for a **Program**?

(Let's consider only the simplest Programs that consist of PushI, Add and Mul StackExps, omitting the boolean StackExps for the sake of simplicity.)

A **const** for a **Program** is a list consisting of just **PushI x**

In [34]:

```
instance Expr Program where
const x = [PushI x]
add x y = x ++ y ++ [Add]
mul x y = x ++ y ++ [Mul]
```

In [35]:

```
add [PushI 3, PushI 4, Add] [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]
```

[PushI 3,PushI 4,Add,PushI 1,PushI 2,PushI 3,PushI 4,Add,Mul,Add,Add]

We can execute Programs with **stackVM** (provided by the homework authors) to make sure our typeclass methods do the right thing.

(**Right** means the result is an **Either** type. An **Either** type wraps the result in either the **Left** or a **Right**, and it is often used to wrap error messages in a **Left** and good values in a **Right**. It is a way for **stackVM** to avoid throwing exceptions and return a nice error message instead.

**IVal** is a **StackVal** type, and it is used internally by **stackVM**. We only see it in the result.)

In [36]:

```
stackVM [PushI 3, PushI 4, Add]
```

Right (IVal 7)

In [37]:

```
stackVM [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]
```

Right (IVal 15)

In [38]:

```
stackVM (add [PushI 3, PushI 4, Add] [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add])
```

Right (IVal 22)

In [44]:

```
stackVM [PushI 5]
```

Right (IVal 5)

Finally, we can write a function **compile**, which will take an arithmetic expression as a string, and return a **Maybe Program**. (**Maybe** is necessary because the expression may be invalid, in which case it will return a **Nothing**.)

In [39]:

```
compile :: String -> Maybe Program
compile s = parseExp const add mul s :: Maybe Program
```

In [40]:

```
compile "(3 * -4) + 5"
```

Just [PushI 3,PushI (-4),Mul,PushI 5,Add]

To sum up, it's pretty neat that an arithmetic expression like "(3 * -4) + 5" can be interpreted as an integer, a boolean, or a stack of instructions for a virtual machine, depending on three simple methods that we implemented. That's the power of typeclasses.

We can run the **Program** generated by **compile** like this.

In [41]:

```
fmap stackVM (compile "(3 * -4) + 5")
```

which is equivalent to this:

In [42]:

```
stackVM <$> compile "(3 * -4) + 5"
```

Just (Right (IVal (-7)))

**fmap** or **<$>** is a method provided by Functor, and **Maybe** is a Functor, so we can't simply call **stackVM** on the result of **compile** -- we need to **fmap** it.