# folding-tree?

The term ofI guess`folding-tree`

is not an official or common name. But I'll call the function as`folding-tree`

here.`Tree Merging`

is more accurate term. see more.

~~I lost the original source code because link I have is broken now. So this article is licensed under MIT as I borrow from other's~~

# Credit Update

I found the original link! and What I saw for the first time is this.

I found this function in another function generating prime numbers.

which I hardly understand at the time, even though it looks very interesting. Fortunately I have better understanding about it, so I'd like to share.

## How to use (test)

This source code is written in **literate haskell** and you try to use in `ghci`

like below:

```
shell> curl -sL https://raw.githubusercontent.com/jeongoon/jeongoon.github.io/main/posts/2022-06-05-foldt.lhs -o FoldTree.lhs
shell> ghci
GHCi, version 8.10.7: https://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /home/myoungjin/.config/ghc/ghci.conf
λ= :l FoldTree.lhs
[1 of 1] Compiling FoldTree ( FoldTree.lhs, interpreted )
Ok, one module loaded.
λ= take 7 . foldt $ [[3,6..], [5,10..], [7,14..]]
[3,5,6,7,9,10,12]
```

## Module FoldTree

```
module FoldTree (foldt) where
foldt :: Ord a => [[a]] -> [a]
```

### foldt

`foldt`

takes a list of list of series of numbers and return a merged (union) and sorted list. It is more like making a rope with the strings.

However, there are some assumption about usage:

- Each inner list contains sorted already in ascending order
- All the (outter) groups are sorted by order of the first element of each group.

So If we have multiples of number three and five and seven,

```
λ> make_some_multiples n = [ n, n+n .. 20 ]
λ> three_multiples = make_some_multiples 3
λ> take 5 $ three_multiples
[3,6,9,12,15]
λ> -- ^ note: already sorted
λ> five_multiples = make_some_multiples 5
λ> seven_multiples = make_some_multiples 7
```

We need to apply the list of each multiples in this way:

```
λ> foldt [ three_multiples, five_multiples, seven_multiples ]
```

Note: In general, if your implementation has specific limitation on input value, you might need to write the function name as something imposing the limitation or your own data type.

Another option would be phantom type

^{1}.

Those assumptions help for our *thunk ^{2}s* to care about only the members of list in the future, which goes pretty well with the nature of lazy evaluation!

The original `foldt`

implementation is for the infinite list has less condition, however, I'd like to apply foldt to fixed size of list so has more edge cases:

```
foldt [] = []
foldt ([]:_) = []
```

So, on second condition, we can return empty list if the list of left hand side has empty body. Because each multiples has longer list if the initial number is less. (so if longer list is empty, we don't need to check shorter list)

```
foldt ((x:xs):t) = x : unionSort xs (foldt (pairs t))
```

The last pattern is for general condition. And foldt appears in the end again to make recursive call.

Basically the *first element of leftmost group* has lowest value, so it will be the first element in the result. This is the basic concept of `foldt`

. And rest of list (xs) will be union-ed with rest of `foldt`

-ed list.

`unionSort`

will remove a duplicated member and take the element at lowest value out of the both list.

### unionSort

```
unionSort :: Ord a => [a] -> [a] -> [a]
unionSort [] ys = ys
unionSort xs [] = xs
unionSort xs@(x:xt) ys@(y:yt) =
case x `compare` y of
LT -> x : unionSort xt ys -- x used; do unionSort on rest of them
EQ -> x : unionSort xt yt -- x used; y is duplicated
GT -> y : unionSort xs yt
```

`unionSort`

also has recursive call to finish the `union`

and `sort`

on rest of members which depend on the value chosen for the *head* of result.

### pairs

`pairs`

do the same sort method used in `foldt`

. `foldt`

takes only one group each time, on the other hand, `pairs`

tries to take every two groups each time. If the pair is not available, it returns empty or the *leftmost* group so that `foldt`

will end its job earlier. *Those edge cases are also only required when applied to finite list.*

```
pairs :: Ord a => [[a]] -> [[a]]
-- edge cases ...
pairs [] = []
pairs ([]:_) = [] -- left always has longer list; no need to go further
pairs (ms:[]) = [ms] -- just return leftmost group
```

The second pattern matching will reduce the searching time as well -- as we saw in the pattern matching of `foldt`

.

*Those edge cases could be different when the different types of series of numbers are used.*

```
pairs ((m:ms):ns:t') = (m : unionSort ms ns) : pairs t'
```

If two groups are taken to perform union, `m`

is supposed to be the lowest value and will be the first value of `pairs`

function -- which is actually doing amazing job to wire all the `foldt`

and `pairs`

smoothly because it doesn't need to go further to get first value.

Note: there is no doubt that less evaluation tends to be more efficient in lazy computation.

To organize the rest of them `unionSort`

will be applied on rest of between two groups, and `pairs`

will pursue the tail of the code again to finish the job.

This is one of most beautiful piece of recursive programming.

## foldt examples

### sumOfMultiples

This task is introduced at exercism.org. Even though, we could solve this problem by checking divisibility of all the member numbers which is given as `factor`

list, it was worth to try because `foldt`

is fast enough to solve by *union*-ing the numbers and get only one of common multiples if some are duplicated.

```
sumOfMultiples factors limit =
sum $ foldt [ [n,n+n..(limit -1)] | n <- factors', n > 0 ]
where factors' = sort factors
-- note: sort is required due to the foldt has assumption!
```

```
λ= sumOfMultiples [2,3,5,7,11] 100000
3961010983
```

### the origin of foldt - prime numbers

the original file contains a lot more implementation of prime numbers

^{3}.

Too many recursive call probably makes us confused at first. But if you know how foldt acts in there, it will be easier to figure out how it works!

To be honest, This code is still hard for me to understand. Or even if I could understand it, I don't think I could invent something like this. 😅

The basic idea of the generating is called `Sieve of Eratosthenes`

. But foldt accomplish the task in unique way to perform sieving when compared to imperative pseudo code.

We could think `minus`

function is to used sieve and `foldt`

function used to eliminate the duplication and to sort.

```
module PrimeNumber
( primesTME
) where
primesTME :: [Int]
primesTME = 2 : ([3,5..] `minus` foldt [ [p*p,p*p+2*p..] | p <- primes_ ])
where
primes_ = 3 : ([5,7..] `minus` foldt [ [p*p,p*p+2*p..] | p <- primes_ ])
foldt ~((x:xs):t) = x : union xs (foldt (pairs t))
pairs ~((x:xs):ys:t) = (x : union xs ys) : pairs t
minus :: [Int] -> [Int] -> [Int]
minus xs@(x:xt) ys@(y:yt) = case compare x y of
LT -> x : minus xt ys
EQ -> minus xt yt
GT -> minus xs yt
union :: [Int] -> [Int] -> [Int]
union xs@(x:xt) ys@(y:yt) = case compare x y of
LT -> x : union xt ys
EQ -> x : union xt yt
GT -> y : union xs yt
```

As you can see `foldt`

and `pairs`

are now much simpler because it works on the infinite list. and `~`

prefix will calm down the `ghc`

because it tells `ghc`

to trust that our pattern matching will serve all the cases even though it doesn't look like exhaustive.

I was able to compile this module without

`~`

. I guess that there are some still possibility for older ghc complier to complain about non-exhaustive pattern matching.

## What we can do more?

I was thinking about making business hours by using `foldt`

. A set of trading hours could rely the public holiday and day light saving time if applicable. (because most of times, UTC will be preferred to save time information) or each weekday could have the different schedule like *"shopping day"* (which is Thursday in Australia).

So we could have many groups of time tables and get the final trading hours list by applying union and sort them and also filter them by removing some special occasion or public holidays!

It sounds the usage of `foldt`

is suitable for this tasks, doesn't it?

I hope I could write an article about it! Thank you for reading!

## Top comments (0)