## Arithmetic expressions and a simple virtual machine¶

#### This course is online and is a good tool for teaching yourself Haskell.¶

http://www.cis.upenn.edu/~cis194/spring13/hw/05-type-classes.pdf

By Elze Hamilton, @elze , elze.hamilton@gmail.com

What follows is the code for the Parser and StackVM modules. It was written by the authors of CIS194 Homework 5, and provided with the homework. Parser parses strings into arithmetic expressions, and StackVM runs a "virtual machine". I'm including it here because this presentation is generated from a Jupyter IHaskell notebook, and this code is necessary for the code in the notebook to run. Feel free to ignore the next three slides for the purposes of this presentation. The presentation starts at Slide 5.

The code for Parser.hs could be found at Parser.hs

The code for StackVM.hs could be found at StackVM.hs

In [1]:
:extension TypeSynonymInstances
:extension FlexibleInstances

In [2]:
-- Applicative parser for infix arithmetic expressions without any
-- dependency on hackage. Builds an explicit representation of the
-- syntax tree to fold over using client-supplied semantics.
module Parser (parseExp) where
import Control.Applicative hiding (Const)
import Control.Arrow
import Data.Char
import Data.Monoid
import Data.List (foldl')

-- Building block of a computation with some state of type @s@
-- threaded through it, possibly resulting in a value of type @r@
-- along with some updated state.
newtype State s r = State (s -> Maybe (r, s))

-- Expressions
data Expr = Const Integer
| Mul Expr Expr
deriving Show

instance Functor (State s) where
fmap f (State g) = State $fmap (first f) . g instance Applicative (State s) where pure x = State$ \s -> Just (x, s)
State f <*> State g = State $\s -> case f s of Nothing -> Nothing Just (r, s') -> fmap (first r) . g$ s'

instance Alternative (State s) where
empty = State $const Nothing State f <|> State g = State$ \s -> maybe (g s) Just (f s)

-- A parser threads some 'String' state through a computation that
-- produces some value of type @a@.
type Parser a = State String a

-- Parse one numerical digit.
digit :: Parser Integer
digit = State $parseDigit where parseDigit [] = Nothing parseDigit s@(c:cs) | isDigit c = Just (fromIntegral$ digitToInt c, cs)
| otherwise = Nothing

-- Parse an integer. The integer may be prefixed with a negative sign.
num :: Parser Integer
num = maybe id (const negate) <$> optional (char '-') <*> (toInteger <$> some digit)
where toInteger = foldl' ((+) . (* 10)) 0

-- Parse a single white space character.
space :: Parser ()
space = State $parseSpace where parseSpace [] = Nothing parseSpace s@(c:cs) | isSpace c = Just ((), cs) | otherwise = Nothing -- Consume zero or more white space characters. eatSpace :: Parser () eatSpace = const () <$> many space

-- Parse a specific character.
char :: Char -> Parser Char
char c = State parseChar
where parseChar [] = Nothing
parseChar (x:xs) | x == c = Just (c, xs)
| otherwise = Nothing

-- Parse one of our two supported operator symbols.
op :: Parser (Expr -> Expr -> Expr)
op = const Add <$> (char '+') <|> const Mul <$> (char '*')

-- Succeed only if the end of the input has been reached.
eof :: Parser ()
eof = State parseEof
where parseEof [] = Just ((),[])
parseEof _  = Nothing

-- Parse an infix arithmetic expression consisting of integers, plus
-- signs, multiplication signs, and parentheses.
parseExpr :: Parser Expr
parseExpr = eatSpace *>
((buildOp <$> nonOp <*> (eatSpace *> op) <*> parseExpr) <|> nonOp) where buildOp x op y = x op y nonOp = char '(' *> parseExpr <* char ')' <|> Const <$> num

-- Run a parser over a 'String' returning the parsed value and the
-- remaining 'String' data.
execParser :: Parser a -> String -> Maybe (a, String)
execParser (State f) = f

-- Run a parser over a 'String' returning the parsed value.
evalParser :: Parser a -> String -> Maybe a
evalParser = (fmap fst .) . execParser

-- Parse an arithmetic expression using the supplied semantics for
-- integral constants, addition, and multiplication.
parseExp :: (Integer -> a) -> (a -> a -> a) -> (a -> a -> a) -> String -> Maybe a
parseExp con add mul = (convert <$>) . evalParser (parseExpr <* eof) where convert (Const x) = con x convert (Add x y) = add (convert x) (convert y) convert (Mul x y) = mul (convert x) (convert y)  Use <|> Found: maybe (g s) Just (f s) Why Not: f s <|> g s Redundant$
Found:
State $parseDigit Why Not: State parseDigit Redundant$
Found:
State $parseSpace Why Not: State parseSpace Use void Found: const () <$> many space
Why Not:
Redundant bracket
Found:
const Add <$> (char '+') Why Not: const Add <$> char '+'
Redundant bracket
Found:
const Mul <$> (char '*') Why Not: const Mul <$> char '*'
In [3]:
module StackVM (StackVal(..), StackExp(..), Stack, Program, stackVM) where

-- Values that may appear in the stack. Such a value will also be
-- returned by the stackVM program execution function.
data StackVal = IVal Integer | BVal Bool | Void deriving Show

-- The various expressions our VM understands.
data StackExp = PushI Integer
| PushB Bool
| Mul
| And
| Or
deriving Show

type Stack   = [StackVal]
type Program = [StackExp]

-- Execute the given program. Returns either an error message or the
-- value on top of the stack after execution.
stackVM :: Program -> Either String StackVal
stackVM = execute []

errType :: String -> Either String a
errType op = Left $"Encountered '" ++ op ++ "' opcode with ill-typed stack." errUnderflow :: String -> Either String a errUnderflow op = Left$ "Stack underflow with '" ++ op ++ "' opcode."

-- Execute a program against a given stack.
execute :: Stack -> Program -> Either String StackVal
execute [] []                               = Right Void
execute (s:_) []                            = Right s

execute s (PushI x : xs)                    = execute (IVal x : s) xs
execute s (PushB x : xs)                    = execute (BVal x : s) xs

execute (IVal s1 : IVal s2 : ss) (Add : xs) = execute (s':ss) xs
where s' = IVal (s1 + s2)

execute (IVal s1:IVal s2:ss) (Mul : xs)     = execute (s':ss) xs
where s' = IVal (s1 * s2)
execute (_:_:_) (Mul:_)                     = errType "Mul"
execute _ (Mul:_)                           = errUnderflow "Mul"

execute (BVal s1:BVal s2:ss) (And : xs)     = execute (s':ss) xs
where s' = BVal (s1 && s2)
execute (_:_:_) (And:_)                     = errType "And"
execute _ (And:_)                           = errUnderflow "And"

execute (BVal s1 : BVal s2 : ss) (Or : xs)  = execute (s':ss) xs
where s' = BVal (s1 || s2)
execute (_:_:_) (Or:_)                      = errType "Or"
execute _ (Or:_)                            = errUnderflow "Or"

test = stackVM [PushI 3, PushI 5, Add]


Let's say we want to write a calculator in Haskell. All it will do is add and multiply integers.

Maybe we started by modeling the domain with the following data type of arithmetic expressions:

In [4]:
data ExprT  = Const Integer
| Mul ExprT ExprT
deriving (Show, Eq)


This type, ExprT, is capable of representing expressions involving integer constants, addition, and multiplication.

On the right hand side there are three data constructors:

• Const, which simply wraps an Integer in an ExprT,
• Add, which constructs an ExprT from two other ExprT's, and represents addition
• Mul, which also constructs an ExprT from two other ExprT's, and represents multiplication

For example, the expression (2 + 3) × 4 would be represented by the value

Mul (Add (Const 2) (Const 3)) (Const 4)

But now let's say we want to also make this expression work on Booleans.

Let's say we thought of a way to interpret Integers as Booleans, and we want to construct expressions from them using the same methods.

For example:

• if an Integer is > 0, it is interpreted as True, otherwise False
• An Add for two Booleans is an or
• A Mul for two Booleans is an and

What we want is the expression Mul (Add (Const (-2)) (Const 3)) (Const 4) to be interpreted differently, depending on what type the argument represents. If 2, 3, and 4 represent Integers, we want the expression to evaluate to 4 (because (-2 + 3) x 4 = 4). If 2, 3, and 4 represent Booleans, we want this expression to evaluate to True (because (False || True) && True == True).

So, we want methods Const, Add and Mul to have different meanings depending on what the argument type is.

This is where typeclasses help.

A typeclass defines a collection of methods that a type should implement if it wants to have an instance of the typeclass.

In our example, we can write a typeclass for our methods like this:

In [5]:
class Expr a where
const :: Integer -> a
add :: a -> a -> a
mul :: a -> a -> a


This typeclass, Expr, has methods const, add, and mul.

const takes an Integer and converts it to a value of type a.

add takes two values of type a and returns an a.

mul takes two values of type a and returns an a.

(The Haskell syntax a -> a -> a can be read as "(a -> a) -> a", or in other words, the argument is a function that takes a value of type a and returns a value of type a, and the result is a value of type a. But since all functions in Haskell are one-argument functions, a two-argument function is a composition of two one-argument functions, a. k. a. curried functions.)

Notice that this syntax only lists method signatures, i.e. the types of the arguments and the result. It says nothing about what the methods should do. That's the job of the types that define the instances of those typeclasses.

To create expressions out of integers, let's write an instance of typeclass Expr for Integer:

In [6]:
instance Expr Integer where
const x = x
add x y = x + y
mul x y = x * y

In [7]:
add (const 2) (const 9)

11

To construct boolean expressions out of integers, let's write an instance of typeclass Expr for Bool.

Let's say that a Bool const from an Integer is true if the integer is positive, and false otherwise.

To add two booleans, we OR them, and to multiply them, we AND them.

In [8]:
instance Expr Bool where
const x = x > 0
add x y = x || y
mul x y = x && y

In [9]:
const 5 :: Bool

True
In [10]:
const (-2) :: Bool

False

As we see, we need to specify the type at the end if the expression (as in :: Bool) so the const method would be called as a Bool.

We can define Expr typeclass for even more types. Here is a type MinMax, which is the same as Integer, except that addition for it means taking the maximum of two integers, and multiplication means taking the minimum of the two integers.

In [11]:
newtype MinMax = MinMax Integer deriving (Eq, Ord, Show)

instance Expr MinMax where
const x = MinMax x
add x y = max x y
mul x y = min x y

In [12]:
add (const 5) (const (-2)) :: MinMax

MinMax 5
In [13]:
mul (const 5) (const (-2)) :: MinMax

MinMax (-2)

So, these methods "const", "add" and "mul", implemented differently for each type, can give us a different result (appropriate for that type) from the same expression.

The homework provided us with a Parser module that has a function parseExp. This function will parse a string that contains an expression consisting of usual numbers as well as addition and multiplication signs. It will convert it to an expression consisting of const, add and mul methods. Given those methods, it will perform the appropriate operations and give us a result.

E.g. it can take a string "(3 * -4) + 5" and give us a result "7", because that's the result of these operations.

But for that, parseExp needs us to give methods const, add and mul so it would know how to add and multiply those values.

Thus, we call parseExp like this:

In [14]:
parseExp const add mul "(3 * -4) + 5"

Just (-7)

Why "Just"? If we look at the parseExp type description

In [15]:
:t parseExp

parseExp :: forall a. (Integer -> a) -> (a -> a -> a) -> (a -> a -> a) -> String -> Maybe a

we'll see it returns Maybe a. The definition of Maybe a is

data Maybe a = Just a | Nothing

Which means that if there is a value, you'll get a Just , otherwise you'll get a Nothing .

For example, if you give parseExp an invalid string (multiplication sign out of place)

In [16]:
parseExp const add mul "(3 -4 *) + 5"

Nothing

Just like we got an Integer from parseExp, we can get a Bool or any other type for which we defined an instance of the Expr typeclass, because we have provided the methods const, add and mul.

In [17]:
parseExp const add mul "(3 * -4) + 5" :: Maybe Bool

Just True

(because const 3 && const (-4) == True && False == False and False || const 5 == False || True == True)

In [18]:
parseExp const add mul "(3 * -4) + 5" :: Maybe MinMax

Just (MinMax 5)

What does this give us?

If we want to implement a function that takes a string of arithmetic expressions and produces the answer as an Integer, a Boolean, or a MinMax, we don't need to write 3 different functions.

We only need a function like parseExp that will replace * and + with appropriate methods. And we supply the meaning of those methods via typeclass instances.

You can implement addition and multiplication for even more complex types this way.

A more complex type for which we can write an instance of the Expr typeclass (i.e. implement addition and multiplication), is Program.

A Program consists of a list of StackExp's, or stack expressions.

A StackExp represents an instruction for the virtual machine.

It is defined in StackVM.h, which was written by the homework authors and provided to us.

data StackExp = PushI Integer
| PushB Bool
| Mul
| And
| Or
deriving Show

type Stack = [StackVal] type Program = [StackExp]

(StackVal is a type representing a "stack value", but we don't need to know much about it, because it's for the StackVM internal usage.)

Program is a synonym for [StackExp], or a list of stack expressions.

An example of a Program would be

It pushes 3 on the stack, pushes 4 on the stack, and adds them.

A longer example would be

It does the following:

• Pushes 3 on the stack,
• Pushes 4 on the stack,
• Pushes 2 on the stack,
• Multiplies the result of the previous addition (7) by 2,
• Pushes 1 on the stack,
• Adds the result of the previous multiplication (14) to 1.

What does it mean to add two Programs?

You concatenate their lists and append an Add at the end.

What does it mean to multiply two Programs?

You concatenate their lists and append a Mul at the end.

What is a const for a Program?

(Let's consider only the simplest Programs that consist of PushI, Add and Mul StackExps, omitting the boolean StackExps for the sake of simplicity.)

A const for a Program is a list consisting of just PushI x

In [34]:
instance Expr Program where
const x = [PushI x]
mul x y = x ++ y ++ [Mul]

In [35]:
add [PushI 3, PushI 4, Add] [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]

[PushI 3,PushI 4,Add,PushI 1,PushI 2,PushI 3,PushI 4,Add,Mul,Add,Add]

We can execute Programs with stackVM (provided by the homework authors) to make sure our typeclass methods do the right thing.

(Right means the result is an Either type. An Either type wraps the result in either the Left or a Right, and it is often used to wrap error messages in a Left and good values in a Right. It is a way for stackVM to avoid throwing exceptions and return a nice error message instead.

IVal is a StackVal type, and it is used internally by stackVM. We only see it in the result.)

In [36]:
stackVM [PushI 3, PushI 4, Add]

Right (IVal 7)
In [37]:
stackVM [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]

Right (IVal 15)
In [38]:
stackVM (add [PushI 3, PushI 4, Add] [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add])

Right (IVal 22)
In [44]:
stackVM [PushI 5]

Right (IVal 5)

Finally, we can write a function compile, which will take an arithmetic expression as a string, and return a Maybe Program. (Maybe is necessary because the expression may be invalid, in which case it will return a Nothing.)

In [39]:
compile :: String -> Maybe Program
compile s = parseExp const add mul s :: Maybe Program

In [40]:
compile "(3 * -4) + 5"

Just [PushI 3,PushI (-4),Mul,PushI 5,Add]

To sum up, it's pretty neat that an arithmetic expression like "(3 * -4) + 5" can be interpreted as an integer, a boolean, or a stack of instructions for a virtual machine, depending on three simple methods that we implemented. That's the power of typeclasses.

We can run the Program generated by compile like this.

In [41]:
fmap stackVM (compile "(3 * -4) + 5")


which is equivalent to this:

In [42]:
stackVM <$> compile "(3 * -4) + 5"  Just (Right (IVal (-7))) fmap or <$> is a method provided by Functor, and Maybe is a Functor, so we can't simply call stackVM on the result of compile -- we need to fmap it.