## Substitution and Equational Reasoning

- 01/10/2020 -In a lazy functional language where variables are immutable and effects are explicit, new powerful tools for understanding what programs mean become available - substitution and equational reasoning.

In Haskell, when we write the equals sign (`=`

) what we mean is that the two sides in this context are the same, and we can substitute one for another keeping the same behaviour of the program.

So that in:

```
twentyOne =
let
seven = 7
in
seven + seven + 7
```

We can substitute `seven`

and `7`

with one another and still get the same program.

The program above is identical (in behaviour) to:

```
twentyOne =
let
seven = 7
in
7 + 7 + seven
```

and

`twentyOne = 7 + 7 + 7`

This is true for bigger and more complicated expressions as well, including ones that use IO. You can always replace one side with the other and get the same behaviour.

This is useful when you want to understand the behaviour of an expression, and is often called 'applying equational reasoning'.

#### Example: `factorial 3`

For example, what does `factorial 3`

means given this definition?

```
factorial :: Integer -> Integer
factorial =
\n ->
if n == 1
then
1
else
n * factorial (n - 1)
```

(Note that I'm using a lambda instead of regular function definition because it's easier to substitute visually, but it's the same as `factorial n =`

)

We can use substitution to find out:

(1) We start with `factorial 3`

and replace `factorial`

with the body of the function:

```
( \n ->
if n == 1
then
1
else
n * factorial (n - 1)
) 3
```

(2) Application of lambda function means that we can replace `n`

with `3`

in the body of the lambda expression:

```
if 3 == 1
then
1
else
3 * factorial (3 - 1)
```

(3) Next, we can substitute `==`

with it's definition, only that for ints this is a primitive operation (meaning it's not written in haskell but in the haskell runtime), but we know that the implementation means `True`

if the two sides are the same and `False`

if they are not, so we'll do it for them:

```
if False
then
1
else
3 * factorial (3 - 1)
```

(4) Next, again, `if`

is a primitive operation, so we'll just do what we know it expects: the first branch if the test is `True`

, or the second branch if the test is `False`

. So we know that this is the second branch here.

`3 * factorial (3 - 1)`

(5) Now what? we can again replace `*`

, but it's primitive which means multiply the two sides. Nothing to substitute for the left side, but we can work on the other side:

```
3 * ( \n ->
if n == 1
then
1
else
n * factorial (n - 1)
) (3 - 1)
```

Note: Why did I decide to first substitute `factorial`

and not `(3 - 1)`

? Because in Haskell we first try to evaluate/substitute the function being called before we do that to its arguments.

(6) Now, we can again substitute `n`

with `(3 - 1)`

in the body of the lambda expression:

```
3 * ( if (3 - 1) == 1
then
1
else
(3 - 1) * factorial ((3 - 1) - 1)
)
```

(7) And then substitute the `(3 - 1)`

in the `if`

with it's implementation (the result here is `2`

):

```
3 * ( if 2 == 1
then
1
else
2 * factorial (2 - 1)
)
```

(Hold up: did you notice that all instances of `(3 - 1)`

were substituted to `2`

? As we discussed, they mean the same thing so *you don't have to do that* when you are trying to understand the code, but the Haskell runtime is smart and knows in this case that these are the same thing (they are all the `n`

from before), so it calculates it once and applies it to other instances as well to save us time. Thanks Haskell!)

(8) Again, here `2 == 1`

means `False`

and then the `if False`

means substitute for the second branch:

`3 * ( 2 * factorial (2 - 1) )`

(9) Substituting `factorial`

with its implementation again:

```
3 * ( 2 * ( \n ->
if n == 1
then
1
else
n * factorial (n - 1)
) (2 - 1)
)
```

(10) And substitute the `n`

with `(2 - 1)`

in the lambda body:

```
3 * ( 2 * ( if (2 - 1) == 1
then
1
else
(2 - 1) * factorial ((2 - 1) - 1)
)
)
```

(11) And we substitute `(2 - 1)`

with its implementation:

```
3 * ( 2 * ( if 1 == 1
then
1
else
1 * factorial (1 - 1)
)
)
```

(12) And finally, `1 == 1`

is actually `True`

, so we substitute with the *first* branch this time:

`3 * ( 2 * ( 1 ) )`

And voila, we now know what the result is: `3 * 2 * 1 = 6`

!

#### What did we learn from substituting this?

- We learned that we get the expected result when running
`factorial 3`

- We learned that this implementation has to remember the list of numbers it needs to multiply in memory until the end, so it's not very space-efficient.

But what about `factorial 0`

? Does it give the expected result of `1`

? Try to substitute for yourself and see!

Also, what about the following implementation of `factorial`

, does it has the same space issue as our previous implementation? Evaluate `factorial 3`

using substitution and find out:

```
factorial :: Integer -> Integer
factorial = impl 1
impl :: Integer -> Integer -> Integer
impl =
\acc n -> -- acc stands for accumulator
if n == 0
then
acc
else
impl (n * acc) (n - 1)
```

(Is the answer still yes? What can we do about it? (See BangPatterns).)

#### Conclusion and Further Reading

I hope this was helpful to you in some way, I think substitution and equational reasoning are key tools in a Haskeller's belt, and having a good grasp on the subject will make writing and debugging Haskell code that much easier.

If you'd like to learn more about this and related topics, check out: