# Beautiful folds in F# - Part 1

###
Brian Berns
*Updated on *
・5 min read

Beautiful folds in F# (3 Part Series)

# Sequences

F# sequences have some interesting behavior. Because they are "lazy", the elements of a sequence are generated on-demand when you enumerate the sequence. This can be useful for large sequences, because its elements don't all exist in memory at the same time (unlike a true collection type, like a list or an array).

Be aware, however, that if you enumerate a sequence a second time, its elements are generated again from scratch. Thus, if a sequence generator has side-effects, those effects occur each time the sequence is enumerated. This can be a very bad outcome, and so side-effects in sequences should be avoided when possible.

In some cases, multiple enumerations of a sequence don't even produce the same results. For example, consider a sequence that represents a database cursor:

```
use rdr = cmd.ExecuteReader()
seq {
while rdr.Read() do
yield rdr.GetInt32(0)
}
```

This might yield some values the first time through, but subsequent iterations will all be empty because the reader can't be rewound (if it's even still open at all).

For the purposes of this article, we'll conjure artificially unrepeatable sequences that work fine once, but fail if enumerated again:

```
/// Creates a sequence that can only be enumerated once.
let unrepeatable items =
let mutable active = true
seq {
if active then yield! items
else failwith "Closed"
active <- false
}
```

Let's say we want to compute the mean of a sequence of integers (without using F#'s `Seq.average`

library function). Naively, we could simply compute the sum of the sequence and its length and then divide to get the mean:

```
/// Computes mean from sum and length.
let toMean (sum : int) (length : int) =
float sum / float length
/// Computes mean of a sequence by iterating it twice.
let meanNaive items =
let sum = Seq.sum items
let length = Seq.length items // exception thrown here
toMean sum length
let items = seq { 1 .. 10 } |> unrepeatable
printfn "%A" <| meanNaive items // BOOM!
```

This blows up because `meanNaive`

attempts to enumerate the sequence twice. Of course, we could try to avoid this problem by reading the entire sequence into an array and then computing the array's sum and length, but this requires enough memory to hold the entire sequence, which might be prohibitive for large sequences. Is there some way to compute the mean so that we only enumerate the sequence once, without keeping all its elements in memory?

# Fold

One way is to manually track the sum and the length ourselves via a "fold":

```
/// Computes mean of a sequence by iterating it only once.
let meanManual items =
let sum, length =
Seq.fold (fun (sumAcc, lenAcc) item ->
sumAcc + item, lenAcc + 1) (0, 0) items
toMean sum length
```

In this version, we track the sequence's running sum in `sumAcc`

and its running length in `lenAcc`

, so at each step we compute `sumAcc + item`

and `lenAcc + 1`

. The two accumulators are combined in a tuple with initial value `(0, 0)`

, which we deconstruct at the end so we can divide the final sum by the final length in `toMean`

. This version works fine and we can consider the problem solved... if this is the only computation we need to perform.

But what if there are many different computations we might want to perform on an unrepeatable sequence? For example, if we separately want to determine the sequence's minimum and maximum values, we have to hand code a new function that follows the same basic pattern. This seems like a potentially large amount of duplicate (and potentially buggy) code.

# Refactor!

Surely we can refactor this code to avoid duplication. To do this, we need versions of `Seq.sum`

and `Seq.length`

and `Seq.min`

and `Seq.max`

that can be composed arbitrarily and then plugged into a fold.

First, we need something that holds just the information necessary to compute the sum of a sequence. In particular, we need a "step" function that accumulates the running sum, and we need the correct initial value for the running sum, which is `0`

. Let's write some quick-and-dirty code to try this idea out using a plain tuple:

```
let composableSum =
(fun acc item -> acc + item), // accumulate running sum
0 // initial value of running sum
```

Of course, the step function is just `(+)`

, but I've written it out with points to be clearer.

Now let's do the same thing for length, min, and max:

```
let composableLength =
(fun acc item -> acc + 1),
0
let composableMin =
(fun acc item -> min acc item),
Int32.MaxValue
let composableMax =
(fun acc item -> max acc item),
Int32.MinValue
```

For simplicity, I've cheated by using `Int32.MaxValue`

and `Int32.MinValue`

as initial values for min and max. This doesn't work well for empty sequences, but it's good enough to get started. (We should probably be using `Option`

instead. Exercise for the reader.)

As the names suggest, we can compose two of these together to produce something that performs both in a single pass of the sequence:

```
/// Creates a new computation that's composed of the two given computations.
let compose (step1, init1) (step2, init2) =
let step (acc1, acc2) item =
step1 acc1 item, step2 acc2 item
let init = init1, init2
step, init
/// Infix version.
let (<*>) = compose
```

The composed `step`

function assumes that the tracked value are in a tuple, which it updates. Similarly, the composed `init`

function simply gloms the two individual initial values together in a tuple. We can easily compose multiple instances at once using the infix `<*>`

version of the function . So, for example, we can define a computation that calculates a sequence's sum, length, min, and max all in a single pass like this:

```
composableSum <*> composableLength <*> composableMin <*> composableMax
```

Now that we've defined the computation, we need a way to run it using `Seq.fold`

:

```
/// Runs the given computation on the given items.
let run (step, init) items =
Seq.fold step init items
```

We have all the pieces we need, so let's take it out for a test drive:

```
let comp = composableSum <*> composableLength <*> composableMin <*> composableMax
let result = run comp (unrepeatable [3; 1; 4; 1; 9; 2; 6])
```

Success! We've correctly computed the sequence's sum, length, min, and max in a single pass, producing `(((26, 7), 1), 9)`

as the final result. Note, however, that each layer of composition creates a nested tuple in the result, which seems unfortunate. Is there any way to flatten this out so the internal mechanism used in composing a computation isn't exposed to the caller?

We've covered a lot of ground already, so I'm going to pause here, and will address that last question in a follow-up, which is now available: Beautiful folds in F# - Part 2

Beautiful folds in F# (3 Part Series)