By Elze Hamilton, @elze , firstname.lastname@example.org
What follows is the code for the Parser and StackVM modules. It was written by the authors of CIS194 Homework 5, and provided with the homework. Parser parses strings into arithmetic expressions, and StackVM runs a "virtual machine". I'm including it here because this presentation is generated from a Jupyter IHaskell notebook, and this code is necessary for the code in the notebook to run. Feel free to ignore the next three slides for the purposes of this presentation. The presentation starts at Slide 5.
The code for Parser.hs could be found at Parser.hs
The code for StackVM.hs could be found at StackVM.hs
:extension TypeSynonymInstances :extension FlexibleInstances
-- Applicative parser for infix arithmetic expressions without any -- dependency on hackage. Builds an explicit representation of the -- syntax tree to fold over using client-supplied semantics. module Parser (parseExp) where import Control.Applicative hiding (Const) import Control.Arrow import Data.Char import Data.Monoid import Data.List (foldl') -- Building block of a computation with some state of type @s@ -- threaded through it, possibly resulting in a value of type @r@ -- along with some updated state. newtype State s r = State (s -> Maybe (r, s)) -- Expressions data Expr = Const Integer | Add Expr Expr | Mul Expr Expr deriving Show instance Functor (State s) where fmap f (State g) = State $ fmap (first f) . g instance Applicative (State s) where pure x = State $ \s -> Just (x, s) State f <*> State g = State $ \s -> case f s of Nothing -> Nothing Just (r, s') -> fmap (first r) . g $ s' instance Alternative (State s) where empty = State $ const Nothing State f <|> State g = State $ \s -> maybe (g s) Just (f s) -- A parser threads some 'String' state through a computation that -- produces some value of type @a@. type Parser a = State String a -- Parse one numerical digit. digit :: Parser Integer digit = State $ parseDigit where parseDigit  = Nothing parseDigit s@(c:cs) | isDigit c = Just (fromIntegral $ digitToInt c, cs) | otherwise = Nothing -- Parse an integer. The integer may be prefixed with a negative sign. num :: Parser Integer num = maybe id (const negate) <$> optional (char '-') <*> (toInteger <$> some digit) where toInteger = foldl' ((+) . (* 10)) 0 -- Parse a single white space character. space :: Parser () space = State $ parseSpace where parseSpace  = Nothing parseSpace s@(c:cs) | isSpace c = Just ((), cs) | otherwise = Nothing -- Consume zero or more white space characters. eatSpace :: Parser () eatSpace = const () <$> many space -- Parse a specific character. char :: Char -> Parser Char char c = State parseChar where parseChar  = Nothing parseChar (x:xs) | x == c = Just (c, xs) | otherwise = Nothing -- Parse one of our two supported operator symbols. op :: Parser (Expr -> Expr -> Expr) op = const Add <$> (char '+') <|> const Mul <$> (char '*') -- Succeed only if the end of the input has been reached. eof :: Parser () eof = State parseEof where parseEof  = Just ((),) parseEof _ = Nothing -- Parse an infix arithmetic expression consisting of integers, plus -- signs, multiplication signs, and parentheses. parseExpr :: Parser Expr parseExpr = eatSpace *> ((buildOp <$> nonOp <*> (eatSpace *> op) <*> parseExpr) <|> nonOp) where buildOp x op y = x `op` y nonOp = char '(' *> parseExpr <* char ')' <|> Const <$> num -- Run a parser over a 'String' returning the parsed value and the -- remaining 'String' data. execParser :: Parser a -> String -> Maybe (a, String) execParser (State f) = f -- Run a parser over a 'String' returning the parsed value. evalParser :: Parser a -> String -> Maybe a evalParser = (fmap fst .) . execParser -- Parse an arithmetic expression using the supplied semantics for -- integral constants, addition, and multiplication. parseExp :: (Integer -> a) -> (a -> a -> a) -> (a -> a -> a) -> String -> Maybe a parseExp con add mul = (convert <$>) . evalParser (parseExpr <* eof) where convert (Const x) = con x convert (Add x y) = add (convert x) (convert y) convert (Mul x y) = mul (convert x) (convert y)
module StackVM (StackVal(..), StackExp(..), Stack, Program, stackVM) where -- Values that may appear in the stack. Such a value will also be -- returned by the stackVM program execution function. data StackVal = IVal Integer | BVal Bool | Void deriving Show -- The various expressions our VM understands. data StackExp = PushI Integer | PushB Bool | Add | Mul | And | Or deriving Show type Stack = [StackVal] type Program = [StackExp] -- Execute the given program. Returns either an error message or the -- value on top of the stack after execution. stackVM :: Program -> Either String StackVal stackVM = execute  errType :: String -> Either String a errType op = Left $ "Encountered '" ++ op ++ "' opcode with ill-typed stack." errUnderflow :: String -> Either String a errUnderflow op = Left $ "Stack underflow with '" ++ op ++ "' opcode." -- Execute a program against a given stack. execute :: Stack -> Program -> Either String StackVal execute   = Right Void execute (s:_)  = Right s execute s (PushI x : xs) = execute (IVal x : s) xs execute s (PushB x : xs) = execute (BVal x : s) xs execute (IVal s1 : IVal s2 : ss) (Add : xs) = execute (s':ss) xs where s' = IVal (s1 + s2) execute (_:_:_) (Add:_) = errType "Add" execute _ (Add:_) = errUnderflow "Add" execute (IVal s1:IVal s2:ss) (Mul : xs) = execute (s':ss) xs where s' = IVal (s1 * s2) execute (_:_:_) (Mul:_) = errType "Mul" execute _ (Mul:_) = errUnderflow "Mul" execute (BVal s1:BVal s2:ss) (And : xs) = execute (s':ss) xs where s' = BVal (s1 && s2) execute (_:_:_) (And:_) = errType "And" execute _ (And:_) = errUnderflow "And" execute (BVal s1 : BVal s2 : ss) (Or : xs) = execute (s':ss) xs where s' = BVal (s1 || s2) execute (_:_:_) (Or:_) = errType "Or" execute _ (Or:_) = errUnderflow "Or" test = stackVM [PushI 3, PushI 5, Add]
Let's say we want to write a calculator in Haskell. All it will do is add and multiply integers.
Maybe we started by modeling the domain with the following data type of arithmetic expressions:
data ExprT = Const Integer | Add ExprT ExprT | Mul ExprT ExprT deriving (Show, Eq)
This type, ExprT, is capable of representing expressions involving integer constants, addition, and multiplication.
On the right hand side there are three data constructors:
For example, the expression (2 + 3) × 4 would be represented by the value
Mul (Add (Const 2) (Const 3)) (Const 4)
But now let's say we want to also make this expression work on Booleans.
Let's say we thought of a way to interpret Integers as Booleans, and we want to construct expressions from them using the same methods.
What we want is the expression Mul (Add (Const (-2)) (Const 3)) (Const 4) to be interpreted differently, depending on what type the argument represents. If 2, 3, and 4 represent Integers, we want the expression to evaluate to 4 (because (-2 + 3) x 4 = 4). If 2, 3, and 4 represent Booleans, we want this expression to evaluate to True (because (False || True) && True == True).
So, we want methods Const, Add and Mul to have different meanings depending on what the argument type is.
This is where typeclasses help.
A typeclass defines a collection of methods that a type should implement if it wants to have an instance of the typeclass.
In our example, we can write a typeclass for our methods like this:
class Expr a where const :: Integer -> a add :: a -> a -> a mul :: a -> a -> a
This typeclass, Expr, has methods const, add, and mul.
const takes an Integer and converts it to a value of type a.
add takes two values of type a and returns an a.
mul takes two values of type a and returns an a.
(The Haskell syntax a -> a -> a can be read as "(a -> a) -> a", or in other words, the argument is a function that takes a value of type a and returns a value of type a, and the result is a value of type a. But since all functions in Haskell are one-argument functions, a two-argument function is a composition of two one-argument functions, a. k. a. curried functions.)
Notice that this syntax only lists method signatures, i.e. the types of the arguments and the result. It says nothing about what the methods should do. That's the job of the types that define the instances of those typeclasses.
To create expressions out of integers, let's write an instance of typeclass Expr for Integer:
instance Expr Integer where const x = x add x y = x + y mul x y = x * y
add (const 2) (const 9)
To construct boolean expressions out of integers, let's write an instance of typeclass Expr for Bool.
Let's say that a Bool const from an Integer is true if the integer is positive, and false otherwise.
To add two booleans, we OR them, and to multiply them, we AND them.
instance Expr Bool where const x = x > 0 add x y = x || y mul x y = x && y
const 5 :: Bool
const (-2) :: Bool
As we see, we need to specify the type at the end if the expression (as in :: Bool) so the const method would be called as a Bool.
We can define Expr typeclass for even more types. Here is a type MinMax, which is the same as Integer, except that addition for it means taking the maximum of two integers, and multiplication means taking the minimum of the two integers.
newtype MinMax = MinMax Integer deriving (Eq, Ord, Show) instance Expr MinMax where const x = MinMax x add x y = max x y mul x y = min x y
add (const 5) (const (-2)) :: MinMax
mul (const 5) (const (-2)) :: MinMax
So, these methods "const", "add" and "mul", implemented differently for each type, can give us a different result (appropriate for that type) from the same expression.
The homework provided us with a Parser module that has a function parseExp. This function will parse a string that contains an expression consisting of usual numbers as well as addition and multiplication signs. It will convert it to an expression consisting of const, add and mul methods. Given those methods, it will perform the appropriate operations and give us a result.
E.g. it can take a string "(3 * -4) + 5" and give us a result "7", because that's the result of these operations.
But for that, parseExp needs us to give methods const, add and mul so it would know how to add and multiply those values.
Thus, we call parseExp like this:
parseExp const add mul "(3 * -4) + 5"
Why "Just"? If we look at the parseExp type description
we'll see it returns Maybe a. The definition of Maybe a is
data Maybe a = Just a | Nothing
Which means that if there is a value, you'll get a Just , otherwise you'll get a Nothing .
For example, if you give parseExp an invalid string (multiplication sign out of place)
parseExp const add mul "(3 -4 *) + 5"
Just like we got an Integer from parseExp, we can get a Bool or any other type for which we defined an instance of the Expr typeclass, because we have provided the methods const, add and mul.
parseExp const add mul "(3 * -4) + 5" :: Maybe Bool
(because const 3 && const (-4) == True && False == False and False || const 5 == False || True == True)
parseExp const add mul "(3 * -4) + 5" :: Maybe MinMax
Just (MinMax 5)
What does this give us?
If we want to implement a function that takes a string of arithmetic expressions and produces the answer as an Integer, a Boolean, or a MinMax, we don't need to write 3 different functions.
We only need a function like parseExp that will replace * and + with appropriate methods. And we supply the meaning of those methods via typeclass instances.
You can implement addition and multiplication for even more complex types this way.
A more complex type for which we can write an instance of the Expr typeclass (i.e. implement addition and multiplication), is Program.
A Program consists of a list of StackExp's, or stack expressions.
A StackExp represents an instruction for the virtual machine.
It is defined in StackVM.h, which was written by the homework authors and provided to us.
data StackExp = PushI Integer
| PushB Bool
type Stack = [StackVal] type Program = [StackExp]
(StackVal is a type representing a "stack value", but we don't need to know much about it, because it's for the StackVM internal usage.)
Program is a synonym for [StackExp], or a list of stack expressions.
An example of a Program would be
[PushI 3, PushI 4, Add]
It pushes 3 on the stack, pushes 4 on the stack, and adds them.
A longer example would be
[PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]
It does the following:
What does it mean to add two Programs?
You concatenate their lists and append an Add at the end.
What does it mean to multiply two Programs?
You concatenate their lists and append a Mul at the end.
What is a const for a Program?
(Let's consider only the simplest Programs that consist of PushI, Add and Mul StackExps, omitting the boolean StackExps for the sake of simplicity.)
A const for a Program is a list consisting of just PushI x
instance Expr Program where const x = [PushI x] add x y = x ++ y ++ [Add] mul x y = x ++ y ++ [Mul]
add [PushI 3, PushI 4, Add] [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]
[PushI 3,PushI 4,Add,PushI 1,PushI 2,PushI 3,PushI 4,Add,Mul,Add,Add]
We can execute Programs with stackVM (provided by the homework authors) to make sure our typeclass methods do the right thing.
(Right means the result is an Either type. An Either type wraps the result in either the Left or a Right, and it is often used to wrap error messages in a Left and good values in a Right. It is a way for stackVM to avoid throwing exceptions and return a nice error message instead.
IVal is a StackVal type, and it is used internally by stackVM. We only see it in the result.)
stackVM [PushI 3, PushI 4, Add]
Right (IVal 7)
stackVM [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add]
Right (IVal 15)
stackVM (add [PushI 3, PushI 4, Add] [PushI 1, PushI 2, PushI 3, PushI 4, Add, Mul, Add])
Right (IVal 22)
stackVM [PushI 5]
Right (IVal 5)
Finally, we can write a function compile, which will take an arithmetic expression as a string, and return a Maybe Program. (Maybe is necessary because the expression may be invalid, in which case it will return a Nothing.)
compile :: String -> Maybe Program compile s = parseExp const add mul s :: Maybe Program
compile "(3 * -4) + 5"
Just [PushI 3,PushI (-4),Mul,PushI 5,Add]
To sum up, it's pretty neat that an arithmetic expression like "(3 * -4) + 5" can be interpreted as an integer, a boolean, or a stack of instructions for a virtual machine, depending on three simple methods that we implemented. That's the power of typeclasses.
We can run the Program generated by compile like this.
fmap stackVM (compile "(3 * -4) + 5")
which is equivalent to this:
stackVM <$> compile "(3 * -4) + 5"
Just (Right (IVal (-7)))
fmap or <$> is a method provided by Functor, and Maybe is a Functor, so we can't simply call stackVM on the result of compile -- we need to fmap it.