Saturday, April 25, 2009

From for loop to anamorphism

Introduction

Sometimes you want to generate a sequence of objects. This is often done using a for loop:
[Test]
public void CreateListOfFoos()
{
var xs = new List<Foo>();
for (int i = 0; i < 10; i++)
{
xs.Add(new Foo());
}

Assert.AreEqual(10, xs.Count());
}


In this post, I will show you how this code can be made more general, ultimately turning it into an anamorphism over lists.


Generalizing constructed type



First, what we want to do is to be able to create something other than a Foo. We apply ExtractVariable once and we also extract the constructor call to a lambda:


[Test]
public void ExtractMethod()
{
var times = 10;
Func<Foo>newFoo = () => new Foo();

var xs = CreateFooNumberOfTimes(times, newFoo);

Assert.AreEqual(10, xs.Count());
}

private IEnumerable<Foo> CreateFooNumberOfTimes(int times, Func<Foo> newFoo)
{
var xs = new List<Foo>();
for (int i = 0; i < times; i++)
{
xs.Add(newFoo.Invoke());
}
return xs;
}


From the above code, it’s easy to generalize on the type created. Note that we could have skipped the “constructor lambda” and instead used the “where T : new()” constraint.


[Test]
public void GeneralizeTypeForFunc()
{

var times = 10;
Func<Foo> newFoo = () => new Foo();

var xs = CreateTNumberOfTimes(times, newFoo);

Assert.AreEqual(10, xs.Count());
}

private IEnumerable<T> CreateTNumberOfTimes<T>(int times, Func<T> newFoo)
{
var xs = new List<T>();
for (int i = 0; i < times; i++)
{
xs.Add(newFoo.Invoke());
}
return xs;
}


Generalizing type for accumulator



We have now parametrized Foo to T, but wouldn’t it be possible to parametrize from “int” to A, as well? Let’s begin with breaking out the int-specific code from the for loop:


[Test]
public void ExtractForLoopLogic()
{
Func<Foo> newFoo = () => new Foo();

int i = 0; // Will be modified lots of times
Func<bool> expr = () => i < 10;
Action inc = () => i++;

var xs = CreateTUntil(newFoo, expr, inc);

Assert.AreEqual(10, xs.Count());
}

private IEnumerable<T>CreateTUntil<T>(Func<T> newT, Func<bool> expr, Action inc)
{
var xs = new List<T>();
for (; expr.Invoke(); inc.Invoke())
{
xs.Add(newT.Invoke());
}
return xs;
}


As indicated in the code, the variable “i” will be modified the closure called in CreateTUntil. Bart de Smet calls this a cruel lambda. Except that it’s quite hard to understand a lambda that mutate its outer scope, refactoring code that’s using cruel lambdas can make your code go totally bananas!


Let’s rewrite the code to use a pure lambdas instead. To do this, we need to refactor the for loop to a while loop, since the first and third “parameters” to a for loop are statements (not pure). We also parametrize from “int” to type parameter “A” instead.


Going pure and a more general constructor



[Test]
public void ForLoopToWhileLoop()
{
Func<Foo> newFoo = () => new Foo();
Func<int,bool> expr = a => a < 10;
Func<int,int> inc = i => i + 1;

var xs = CreateTUntilUsingWhile(newFoo, 0, expr, inc);

Assert.AreEqual(10, xs.Count());
}

// Now a pure method
private IEnumerable<T> CreateTUntilUsingWhile<T>
(Func<T> newT, int init, Func<int,bool> expr, Func<int,int> inc)
{

var xs = new List<T>();

int i = init;

while (expr.Invoke(i))
{
xs.Add(newT.Invoke());
i = inc.Invoke(i);
}
return xs;
}


Now we don’t have any concrete types in the method signature, except bool, which I think is ok to have there at this point. But, as the observant reader might have noticed, the constructor can’t be called with a variable argument, i.e. the accumulated value. What we need to do is to “connect” the lambdas that generates values, like this:


[Test]
public void ArgumentForConstructor()
{
Func<int,Result<Foo,int>> gen = n => new Result<Foo,int>(new Foo(), n + 1);
Func<int,bool> expr = a => a < 10;

var xs = GeneralizedCreateTWithArgsUntilUsingWhile(gen, 0, expr);

Assert.AreEqual(10, xs.Count());
}

private IEnumerable<T> GeneralizedCreateTWithArgsUntilUsingWhile<T,A>
(Func<a,Result><T,A>> gen, A init, Func<A,bool> expr)
{
var xs = new List<T>();

var i = init;

while (expr.Invoke(i))
{
var result = gen.Invoke(i);

xs.Add(result.Value);
i = result.Accumulator;
}

return xs;
}


Going recursive



Now it’s up to the generating lambda to pass an argument to the constructor or not. What’s funny with this is that if we replace the while loop to a recursive call, we come pretty close to the definition of an anamorphism over lists in the introduction of Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire by Erik Meijer et. al (link to postscript version).


[Test]
public void WhileLoopToRec()
{

Func<int, Result<Foo, int>> gen = n => new Result<Foo, int>(new Foo(), n + 1);
Func<int, bool> expr = a => a < 10;

var xs = CreateTUntilUsingRec(gen, 0, expr);

Assert.AreEqual(10, xs.Count());
}

private IEnumerable<T> CreateTUntilUsingRec<T,A>
(Func<A, Result<T, A>> gen, A init, Func<A, bool> expr)
{

if (!expr.Invoke(init))
return new List<T>();

var result = gen.Invoke(init);

return (new List<T> {result.Value})
.Concat(CreateTUntilUsingRec(gen, result.Accumulator, expr));

}


Abstracting away from bool



It seems that the only thing left to do is to abstract away the dependency on “bool” in CreateTUsingRec, but then we bump into a small problem,as you will see. What we do is to join the two lambdas into one and performing a null check inside the recursive function.


[Test]
public void MergeOnceMore()
{
int? i = 0;
Func<int, Result<int?, int>> gen = n => new Result<int?, int>(n < 10 ? i : null, n + 1);

var xs = CreateTUntilUsingRecNullCheck(gen, 0);

Assert.AreEqual(10, xs.Count());
}

private IEnumerable<T> CreateTUntilUsingRecNullCheck<T, A>
(Func<A, Result<T, A>> gen, A init)
{

var result = gen.Invoke(init);
if (result.Value == null)
return new List<T>();

return (new List<T> { result.Value })
.Concat(CreateTUntilUsingRecNullCheck(gen, result.Accumulator));

}


Writing it in F# instead



The problem with this solution is that we have lost the ability to generate ordinary non-nullable structs and that’s bad! We could easily solve the problem with writing our own MyNullable<T> which would  allow both classes and structs as instantiators of the type variable, but instead of doing that, I’ll show you something similar: the option type in F#.


let rec anamorphism n f = 
match f n with
| option.None -> []
| option.Some (e,next) -> e::(anamorphism next f)

> anamorphism 0 (fun a -> if a < 5 then option.Some("Bar" + a.ToString(), a + 1) else option.None);;
val it : string list = ["Bar0"; "Bar1"; "Bar2"; "Bar3"; "Bar4"]


Here, we removed the Result class and used a pair instead, together with the Option type, which is essentially the same as Nullable<T>, but without the restriction on type.


Conslusion



Functional programming is perhaps more abstract than imperative programming, but it is also seems to be more general, at least in this case. This post has only showed an anamorphism over lists, which is the simple case. Look here if you want to see a more advanced example.

2 comments:

Christian Genne said...

Nice! Very complicated C# code in the end, F# is truly a better choice when trying to generalize code.

Often, IMO, and as you really see in this post, it's better to have simple and straight forward code (as in the first code example), instead of trying to generalize it just because it's possible. You often end up with unreadable code, which you "aint gonna need" anyway (YAGNE)!

Btw, a little C# tip: you can leave out the ".Invoke" when calling a delegate. I.e. writing "newFoo()" instead of "newFoo.Invoke()".

Anonymous said...

This is too complex without fully using LINQ. You really need to learn about Enumerable.Range(x,y).Select(...) or Enumerable.Repeat