# [Aaron McDaid](https://aaronmcdaid.github.io/)
Statistics and C++, and other fun projects
`aaron.mcdaid@gmail.com`
[`@aaronmcdaid`](https://twitter.com/aaronmcdaid)

#
***by [Aaron](https://aaronmcdaid.github.io/), 29th October 2017***
[(code for this document)](index.Rmd)

#
(A demo program of this, with all the code, is available as [countdown.hs](https://raw.githubusercontent.com/aaronmcdaid/aaronmcdaid.github.io/master/blog.posts/countdown.in.haskell/countdown.hs) .)
I loved the British game show [*Countdown*](https://en.wikipedia.org/wiki/Countdown_(game_show)) as a child,
which is based on [*Des chiffres et des lettres*](https://en.wikipedia.org/wiki/Des_chiffres_et_des_lettres).
My first attempt at a non-trivial C program was, if I remember correctly, and attempt to solve the "Numbers Round" on Countdown.
Given a target number, and six input numbers, the goal is to combine the inputs using only addition, subtraction, multiplication and division and reach the target value.
For example, given target $562$ and inputs $9,8,2,10,4,4$
$$
562 = 2+10*(4*4-9)*8
$$
Let's assume, for now, that each input must be used once, and only once, in the expression. In reality, it is not necessary to use them all and I will describe that near the end of this post.
I'll call this pair, the target and the inputs, a *game*. A given game may have many solutions, or even zero solutions.
In this post, I'll describe a program written in [Haskell](https://en.wikipedia.org/wiki/Haskell_(programming_language)). I don't claim to be an expert in Haskell; this is just a bit of fun. I've recently become much more interested in it via the [Haskell Dublin Meetup](https://www.meetup.com/Haskell-Dublin-Meetup/).
Any feedback appreciated! `aaron.mcdaid@gmail.com` or Twitter: [`@aaronmcdaid`](https://twitter.com/aaronmcdaid)
## Expressions
At first, let's consider using just two inputs, instead of the six that are allowed in the real game show. That gives us four expressions, where $a$ and $b$ will take the place of the inputs
$$ a + b $$
$$ a - b $$
$$ a \times b $$
$$ a \div b $$
Allowing three inputs, the number of expressions increases dramatically, here are some examples
$$ a + b \times c $$
$$ (a + b) \times c $$
$$ (c \div a) + b $$
The first step in Haskell is to represent the expressions. There are three kinds of expression:
1. `Letter` expressions. Just a `Single` letter in isolation
$$a$$
$$b$$
2. `ExprT` expressions. `T`erm-expressions involving two terms with addition or subtraction
$$a+c$$
$$d-b$$
3. `ExprF` expressions. `F`actor-expressions involving two factors with multiplication and division
$$a \times c$$
$$d \div b$$
```
data Expr = ExprT Terms -- term-expression , i.e. two or more terms separated by (+) or (-)
| ExprF Factors -- factor-expression, i.e. two or more factors separated by (*) or (/)
| Single Letter -- where 'Letter' (see below) represents 'a' through 'f'.
```
with a convenient `Letter` wrapper to store a single character, `'a'`,`'b'`,`'c'`,`'d'`,`'e'`,`'f'`:
```
data Letter = Letter Char -- the Char is between 'a' and 'f' inclusive, for the six allowed variables
letter2offset :: Char -> Int -- 'a'=0, 'b'=1, 'c'=2, ...
letter2offset l = (fromEnum l) - (fromEnum 'a')
```
## Complex expressions
More complex expressions are classified according to the 'top-level' operation, respecting the typical [Order of operations](https://en.wikipedia.org/wiki/Order_of_operations) in arithmetic.
As a result, this is a term-expression:
$$
b \times c + a
$$
because the addition is performed last. This is more clear with parentheses and spacing
$$
(b{\times}c) + a
$$
or with this tree, where `*` represents multiplication:
```
+ -- the top-level operation
/ \
* a
/ \
b c
```
On the other hand, the following is a factor-expression:
$$
b \times (c{+}a)
$$
```
* -- the top-level operation
/ \
b +
/ \
c a
```
## Recursive definition of `Expr`
Focusing on `ExprT`, we need to represent the collection of two, or more, terms which are added or subtracted together. Each term is simply a subexpression
```
data Terms = Terms Expr AddOrSubtract Expr
-- |
-- `---- second sub-Expr-ession
-- |
-- \----------------------- first sub-Expr-ession
```
where `AddOrSubtract` is simply `Add` or `Sub`:
```
data AddOrSubtract = Add | Sub
```
You might notice that this only allows two sub-`Expr`essions, when I said "two or more". To allow more than two terms, we allow that the first sub-Expr-ession might also be an `ExprT` expression. i.e.
$$ d+a+e $$
is represented as
$$ (d+a)+e $$
```
+
/ \
+ +
/ \ \
d a e
```
A longer example is $a+b-c+d-e+f$, which we can represent as $((((a+b)-c)+d)-e)+f$.
To avoid redundantly generating many expressions which are obviously identical to each other, we ban expression such as
$$ a*b+(c+d) $$
which are identical to
$$ (a*b+c)+d $$
To enforce this, we require that the second sub-expression inside a `ExprT` must **_not_** itself be a `T`erm-expression. It may be `Letter` or `ExprF`. The second sub-expression in $a*b+(c+d)$ is $(c+d)$, which is a term-expression and therefore this expression is disallowed
## Constraints
To recap, with more clarity about the restriction we've just explained. There are three kinds of `Expr`:
1. `ExprT` - a term-expression made up of exactly two sub-Expressions, combined by either addition or subtraction, where the second subexpression must not be a `ExprT`.
1. `ExprF` - a factor-expression made up of exactly two sub-Expressions, combined by either multiplication or division, where the second subexpression must not be a `ExprF`.
1. `Letter` - a single letter from the six allowed: $a,b,c,d,e,f$
The other constraint, of course, is that a given `Letter` may not appear more than once, i.e. we don't allow this:
$$ a*b + a $$
```
+
/ \
* a -- 'a' is (Letter 0)
/ \
a b -- 'a' is (Letter 0), 'b' is (Letter 1)
```
$a$ (i.e. `Letter 0`) appears twice in that expression, therefore that expression is invalid.
# Recap the data representation
To recap the data representation:
```
data Expr = ExprT Terms -- additive expression, i.e. two or more terms separated by +-
| ExprF Factors -- multiplicative expression, i.e. two or more factors separated by +-
| Single Letter -- where 'Letter' (see below) represents 'a' through 'f', the first six letters
data Terms = Terms Expr AddOrSubtract Expr -- second Expr must *not* be ExprT
data Factors = Factors Expr MultiplyOrDivide Expr -- second Expr must *not* be ExprF
data AddOrSubtract = Add | Sub
data MultiplyOrDivide = Mul | Div
data Letter = Letter Char -- the Char is between 'a' and 'f' inclusive, for the six allowed variables
letter2offset :: Char -> Int -- 'a'=0, 'b'=1, 'c'=2, ...
letter2offset l = (fromEnum l) - (fromEnum 'a')
```
## Generating all expressions, subject to those constraints
Given three letters $a,b,c$, represented as a list in Haskell with `[Letter 0, Letter 1, Letter 2]`, we want to generate all the valid `Expr`-essions.
We first need to define a useful utility function, `allPartitions`. Given a list, it will return a list of pairs of lists, with every possible way to split the list into two lists. If there are $n$ items in the original list, there will be $2^n$ items in the list of all partitions.
```
allPartitions :: [a] -> [([a],[a])]
allPartitions [] = [([],[])]
allPartitions [x] = [([x],[]),([],[x])]
allPartitions (x:xs) = [ (x:l,r ) | (l,r) <- allPartitions xs ]
++ [ (l ,x:r) | (l,r) <- allPartitions xs ]
```
```{haskell}
let { allPartitions [] = [([],[])] ; allPartitions [x] = [([x],[]),([],[x])] ; allPartitions (x:xs) = [ (x:l,r ) | (l,r) <- allPartitions xs ] ++ [ (l ,x:r) | (l,r) <- allPartitions xs ] }
allPartitions ['a'..'c']
```
## `allExprs`
So, `allExprs` is a function from a list of `Letter`s to a list of `Expr`essions:
```
allExprs :: [Letter] -> [Expr]
```
If the list of letters has zero or one elements, the list of expressions is relatively simple
```
allExprs [] = [] -- no expressions possible
allExprs [l] = [Single l] -- just one expression is possible, made up of a Single Letter
```
But if there are more than two letters in the lists, we have to build a list of all the possible term-expressions and all the possible factor-expressions. Here is the full function before I discuss it a little further:
```
allExprs :: [Letter] -> [Expr]
allExprs [] = []
allExprs [l] = [Single l]
allExprs ls = [ ExprT (Terms e1 addOrSubtract e2)
| (l,r) <- allPartitions ls
, not . null $ l
, not . null $ r
, e2 <- allExprs r
, case e2 of -- to enforce that the second Expr in an ExprT is *not* an ExprT
(ExprT _) -> False
_ -> True
, e1 <- allExprs l
, addOrSubtract <- [Add, Sub]
]
++ [ ExprF (Factors e1 multiplyOrDivide e2)
| (l,r) <- allPartitions ls
, not . null $ l
, not . null $ r
, e2 <- allExprs r
, case e2 of -- to enforce that the second Expr in an ExprF is *not* an ExprF
(ExprF _) -> False
_ -> True
, e1 <- allExprs l
, multiplyOrDivide <- [Mul, Div]
]
```
A *list-comprehension* in Haskell is `[ e | ... ]`, where the `...` are made up of several loop variables and condition expressions separated by commas. For example, to make a list of $x^2$ for each *odd* integer in the range `[0..10]`, you would do
```
squares_of_some_odd_numbers = [ x^2 | x <- [0..10] , x `mod` 2 == 1]
```
Detailed explanation of the above:
1. `(l,r) <- allPartitions ls` - for each possible partition of `ls`, store the two sides into `l` and `r`.
1. `, not . null $ l` - condition ignoring an empty left-hand-side.
1. `, not . null $ r` - condition ignoring an empty right-hand-side.
1. `, e2 <- allExprs r` - recursively call `allExprs` with the right-hand-side.
1. `, case e2 of { (ExprT _) -> False; _ -> True }` - skip (via `False`) when the second sub-expression would be a term-expression.
1. `, e1 <- allExprs l` - recursively call `allExprs` with the left-hand-side.
1. `, addOrSubtract <- [Add, Sub]` - to try both possibilities
Each of those items in the list above is either something to loop over (`e2`,`e1`, or `addOrSubtract`) or a condition to skip over some of those. For every combination that is acceptable, we evaluate
`ExprT (Terms e1 addOrSubtract e2)`
(or
`ExprF (Factors e1 multiplyOrDivide e2)`
when computing all the factor-expressions) and return the entire list.
# Evaluating an expression with a given set of inputs
With an expression $a+b \times c$ and inputs `[10,30,100]` we "fill" the expression to get a "filled-expression"
$$ 10+30 \times 100 $$
with value $3010$.
Bear in mind that we only allow integer division in this game. And therefore, we cannot fill
$a \div b$ with inputs `[5,2]` as $\frac52$ is not an integer and we cannot allow division by zero either.
Therefore, our `eval` function doesn't return `Int` for the value; instead it returns `Maybe Int` (known as Optional Int, or similar, in other programming languages).
```
eval :: [Int] -> Expr -> Maybe Int
-- | | |
-- | | `------ returned Int, or 'Nothing' if there was a division problem
-- | `----------------- expression to evaluate
-- `------------------------- a list of integers to take the place of 'a' to 'f' in the expression
```
We must deal with each of the three kinds of expression. For those not familiar with Haskell's `do` notation, I will not attempt to explain its many varied and wonderful uses! But, in the context of a function returning `Maybe Int`, it allows us to 'see through' the `Maybe` and just directly perform operations on the contained `Int`s:
```
eval is (Single (Letter l)) = do
return (is !! (letter2offset l))
```
Next the Term-expression, simply add or subtract. But, remember that `eval is e1` and `eval is e2` may be `Nothing` (if there a division problem in the sub-expression), in which case the `do` will automatically *short-circuit* as the `<-` will fail to extract and `Int` from the `Maybe Int`, and therefore we simply return `Nothing` without attemping to do any addition or subtraction:
```
eval is (ExprT (Terms e1 addOrSubtract e2)) = do
t1 <- eval is e1
t2 <- eval is e2
case addOrSubtract of
Add -> return (t1+t2)
Sub -> return (t1-t2)
```
Multiplication is straightforward:
```
eval is (ExprF (Factors e1 Mul e2)) = do
f1 <- eval is e1
f2 <- eval is e2
return (f1*f2)
```
However, division must be treated specially. If the second expression (the number to divide by) is zero, or it does not divide evenly into the numerator,
then the `guard` will force the *short-circuit* to occur and return `Nothing`:
```
eval is (ExprF (Factors e1 Div e2)) = do
f1 <- eval is e1
f2 <- eval is e2
guard $ f2 /= 0
guard ( f1 `mod` f2 == 0 )
return (f1 `div` f2)
```
## Bringing it all together
We can now bring everything together, attemping to "fill in" every possible expression with the inputs and testing if the `eval`-uated result equals our target.
Remember I said earlier that we don't actually need to use all six inputs exactly one each? We are allowed to use a subset instead if we wish.
With a target of $4$ and inputs $(1,2,3)$ we are entitled to use $4=1+3$ and ignore the $2$.
Therefore, we do another partitioning, simply dividing the list into those that we will use and those we will not use:
```
allPartialExprs :: [Letter] -> [Expr]
allPartialExprs ls = [ es
| (used, ignored) <- allPartitions ls
, not . null $ used
, es <- allExprs used
]
```
which is then called by `findSolutions`:
```
findSolutions target inputs = do
let range_of_chars = ['a' .. (toEnum . (subtract 1) . (+ (length inputs)) . fromEnum $ 'a')]
-- if (length inputs == 6), then range_of_chars will be ['a'..'f']
e <- allPartialExprs $ map Letter range_of_chars
guard . (== (Just target)) $ (eval inputs e)
return (e, showWithInputs inputs e)
```
For each successful solution, `findSolutions` returns the raw `Expr` object as well as a nice string represenation of the expression "filled-in" with the inputs.
Finally, a nice demo where it's all printed nicely:
```
findSolutionsAndPrintThemNicely target inputs =
findSolutions target inputs &
mapM (\(e,filledIn) ->
do
let showe = show e
let padding = replicate (24 - length showe) ' '
putStrLn (padding ++ showe ++ " " ++ filledIn)
)
main = do
findSolutionsAndPrintThemNicely 562 [9,8,2,4,10,4]
```
This gives hundreds of solutions, here are a few:
```
(a*b*c*d-e-f) (9*8*2*4-10-4)
(a*b*d*c-f-e) (9*8*4*2-4-10)
(a*c*f*b-d-e) (9*2*4*8-4-10)
(b*c*f*a-e-d) (8*2*4*9-10-4)
((e+f)*(a-d)*b+c) ((10+4)*(9-4)*8+2)
(c+(d*f-a)*b*e) (2+(4*4-9)*8*10)
(c+(a-f)*(d+e)*b) (2+(9-4)*(4+10)*8)
(c-(e+f)*b*(d-a)) (2-(10+4)*8*(4-9))
```
***by [Aaron](https://aaronmcdaid.github.io/), 29th October 2017***