Folding and Unfolding in F# and Linq August, 2009
I'll say up front that there is a lot of good information out there about this from folks who actually know what their doing... :) I'm mainly blogging about this to help me understand and remember it. One of the best resources I found is Matthew Podwysocki's blog. He blogs regularly about FP and has some nice posts on folds here and here.
Folding
So a fold basically takes a sequence and turns it (Or "folds" it) into a single result. So for example if you have an array of integers you want to add together. Folds come in 2 flavors, right and left. Right folds iterate from the last element the the first whereas left folds iterate from the first element to the last. A right fold is not available out of the box for an F# sequence (Which is an IEnumerable) or Enumerable/Queryable since the last element of an enumeration is unknown and the enumeration could possibly be infinite (Although there are ways to make a right fold out of a left fold as seen here). In the case of summation it doesn't really matter which direction we fold since addition is commutative. Folding can be done with the following:
F#
Microsoft.FSharp.Collections.List
val fold : ('State -> 'T -> 'State) -> 'State -> 'T list -> 'State val foldBack : ('T -> 'State -> 'State) -> 'T list -> 'State -> 'State
Microsoft.FSharp.Collections.Seq
val fold : ('State -> 'T -> 'State) -> 'State -> seq<'T> -> 'State
Linq:
public static TResult Aggregate( this IEnumerable source, TAccumulate seed, Func func ) public static TResult Aggregate ( this IQueryable source, TAccumulate seed, Expression<Func > func )
Folds take an accumulator function that operates on each element and a seed value. The function that operates on each element is passed the element and current state and returns the new state. So addition using a left fold would be as follows:
F#
let values = [1; 5; 22; 45] // List.fold [function [state] [item]] [seed] [list] let total = List.fold (fun state item -> state + item) 0 values
Linq/C#
int[] values = { 1, 5, 22, 45 }; // [IEnumerable/IQueryable].Aggregate([seed], [function [state], [item]]) int total = values.Aggregate(0, (state, item) => state + item);
The F# foldBack function, a right fold, is similar to the fold function except that it starts with the last element and works toward the first.
Folds Using the First or Last Element as a Seed
There are also times where you might want to use the first or last element as the seed. For example if you wanted to create a comma separated list of elements in an array you might use the String.Join() method in .NET. This creates a string that contains the elements separated by a specified value. In this case first element is simply a seed and the other elements are concatenated with the specified value (Like a comma). We can do this with the following:
F#:
Microsoft.FSharp.Collections.List
val reduce : ('T -> 'T -> 'T) -> 'T list -> 'T val reduceBack : ('T -> 'T -> 'T) -> 'T list -> 'T
Microsoft.FSharp.Collections.Seq
val reduce : ('T -> 'T -> 'T) -> seq<'T> -> 'T
Linq:
public static TSource Aggregate( this IEnumerable source, Func func ) public static TSource Aggregate ( this IQueryable source, Expression<Func > func )
When using F# the reduce function, the first element is the seed, on the other hand when using the F# reduceBack function the last element is the seed. Note that the Linq version is simply an overload of Aggregate method that does not take a seed. Lets say we have a list of strings we want to join together into a comma separated list:
F#:
let values = ["Alpha";"Beta";"Gamma";"Delta"] // List.reduce [function [state] [item]] [list] let value = values |> List.reduce (fun state item -> state + ", " + item)
string[] values = { "Alpha", "Beta", "Gamma", "Delta" }; // [IEnumerable/IQueryable].Aggregate([function [state], [item]]) string value = values.Aggregate((state, item) => state + ", " + item);
This produces the string Alpha, Beta, Gamma, Delta
The calls to the accumulator looks like this:
1) (fun "Alpha" "Beta" -> "Alpha" + ", " + "Beta") 2) (fun "Alpha, Beta" "Gamma" -> "Alpha, Beta" + ", " + "Gamma") 3) (fun "Alpha, Beta, Gamma" "Delta" -> "Alpha, Beta, Gamma" + ", " + "Delta")
Map/Reduce
Now what if you have a list of one type but want to reduce it to another. For example, what if we have a list of integers we want to reduce to a comma separated string. In that case we can us a map/reduce approach (See a recent discussion about this here on hubFS, mad props to Kha). First we map every element in the array to a string then reduce results:
F#
let primes = [2;3;5;7] // Seq.map [function [item]] [list] // Seq.reduce [function [state] [item]] [list] let output = primes |> Seq.map (fun i -> i.ToString()) |> Seq.reduce (fun state item -> state + ", " + item)
Linq/C#:
int[] primes = { 2, 3, 5, 7 }; // [IEnumerable/IQueryable].Select([function [item]]) // [IEnumerable/IQueryable].Aggregate([function [state], [item]]) string value = primes.Select(i => i.ToString()).Aggregate((state, item) => state + ", " + item);
This first converts the integer sequence to a list of strings. We could also format the strings as desired, for example if we were converting a list of datetimes or real numbers. Then that converted list can be reduced. Note that the reduce/Aggregate function requires that the state value be of the same type as the elements thus the need for the map/Select to convert them first.
One thing to note about folds is that many other functions in F# and Linq are based on folds. For example the F# sum and Linq Sum functions are really just folds, as are others.
Unfolding
Now you may interested in doing the opposite of a fold; taking a single value and turning it into a list, or "unfolding" it. Linq doesn't offer an unfold function out of the box as of .NET 3.5 but Matthew Podwysocki discusses and implementation in C# here. F# offers an unfold method in the Seq module:
Microsoft.FSharp.Collections.Seq
val unfold : ('State -> ('T * 'State) option) -> 'State -> seq<'T>
As a very simple example lets generate the even numbers between zero and ten:
// Seq.unfold [function [currentState]] [initialState] let values = Seq.unfold (fun state -> Some (state, 2 + state)) 0 let output = values |> Seq.takeWhile(fun i -> i <= 10) |> Seq.map (fun i -> i.ToString()) |> Seq.reduce (fun state item -> state + ", " + item)
This produces 0, 2, 4, 6, 8, 10.
So lets take a look at the unfold function. The first parameter is an generator function that accepts the current state. For the first iteration, the value passed in is the initial state parameter, which is the second parameter passed to the unfold function (To get things started) which is zero in the example above. The generator function must return an option type of a two element tuple. The first element of the tuple is the item to be yielded and the second element is the state to pass on the generator function in the next iteration (Which in the example above is the current state plus two). You return Some when there are results or None when there are no more results. In cases where the sequence is infinite you would never pass None (As is the case in the example above).
As more complex example (And one that shows the true power of unfold) we can generate the Fibonacci sequence (As Erik describes here). Lets say we want elements of the Fibonacci sequence that are less than 50.
let fib = Seq.unfold (fun (lastValue, currentValue) -> Some (lastValue, (currentValue, lastValue + currentValue))) (1, 1) let output = fib |> Seq.takeWhile(fun i -> i <= 50) |> Seq.map (fun i -> i.ToString()) |> Seq.reduce (fun state item -> state + ", " + item)
This produces 1, 1, 2, 3, 5, 8, 13, 21, 34
You'll notice in this case that the state is now a two element tuple, the first element being the last value and the second element being the current value.
Folding/unfolding are very powerful concepts. I've found that learning these functional concepts has really helped my imperative programming as these concepts enable you to create very succinct yet expressive code. Chuck Jazdzewski expressed it well on his blog where he quoted the professor teaching his APL course; "If you are using a loop, you're doing it wrong"... Well said.