Software engineering blog of Clément Bouillier: Expression Trees and Reflection performances

Thursday, July 2, 2009

Expression Trees and Reflection performances

Context

Lots of people think Reflection is evil for performance, but you can tune your code to avoid some of the problems and, even better, you can write code as quick as "good old-fashioned C# code" (I mean code written without Reflection).

Beware, I am not justifying the use of Reflection in any application to implement the "holy grail of generic code". I think that Reflection can be used in some frameworks, but application code must be strongly typed for readability and maintainability. One example of such a framework is Object/Relational Mapping, and that's why I would like to talk here about Reflection and performance: an ORM can be a performance bottleneck in an enterprise application.

I worked on a project that uses a "homemade" ORM, which uses Reflection to create object instances (entities) and to hydrate them. Working on performance with a memory profiler, I saw that a lot of time was spent in the Type.GetProperty and PropertyInfo.SetValue methods (both in the same proportion). So I had a look at the code and saw that I could first cache the PropertyInfo retrieved by Type.GetProperty, which gave me a 50% performance win relative to the overhead time spent in Reflection.
This is the first lesson on Reflection performance: cache the Reflection objects related to the types you use.
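
To make this first lesson concrete, here is a minimal sketch of such a cache. It is not the code of the ORM mentioned above: the PropertyCache class and its Get method are hypothetical names, and the sketch assumes .NET 4 or later (for Tuple) and a simple lock for thread safety.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper (illustration only): caches PropertyInfo lookups
// so that Type.GetProperty runs once per (type, property name) pair.
public static class PropertyCache
{
    private static readonly Dictionary<Tuple<Type, string>, PropertyInfo> Cache =
        new Dictionary<Tuple<Type, string>, PropertyInfo>();
    private static readonly object Sync = new object();

    public static PropertyInfo Get(Type type, string propertyName)
    {
        Tuple<Type, string> key = Tuple.Create(type, propertyName);
        lock (Sync)
        {
            PropertyInfo property;
            if (!Cache.TryGetValue(key, out property))
            {
                // The Reflection lookup cost is paid on the first request only.
                // A null result (unknown property) is cached as well.
                property = type.GetProperty(propertyName);
                Cache[key] = property;
            }
            return property;
        }
    }
}
```

The mapping code would then call PropertyCache.Get(o.GetType(), "Name") instead of o.GetType().GetProperty("Name"), and keep using SetValue on the returned PropertyInfo.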

Then, I thought about the old days when we only had Reflection.Emit to improve this piece of code further: you can reduce the overhead to zero using it. I let you find it on the web or see the implementation in NHibernate that uses it, because it seems a little bit painful since it is IL generation, i.e. "improved assembler" more or less ;). Second lesson: the time spent in Reflection can be reduced to the time taken to set up your application, with zero overhead due to Reflection after that.
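
For readers who have never seen it, here is a rough sketch of what a Reflection.Emit setter can look like. It is not NHibernate's implementation: EmitSetterFactory is a hypothetical name, and the sketch assumes a public instance property declared on a class (not a struct).

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper (illustration only): emits IL for
// "((TDeclaring)target).Property = (TProperty)value".
public static class EmitSetterFactory
{
    public static Action<object, object> Create(PropertyInfo property)
    {
        MethodInfo setter = property.GetSetMethod();
        if (setter == null || setter.IsStatic || property.DeclaringType.IsValueType)
            throw new ArgumentException("Expected a public instance setter on a class.", "property");

        DynamicMethod method = new DynamicMethod(
            "Set_" + property.Name,
            typeof(void),
            new[] { typeof(object), typeof(object) },
            property.DeclaringType.Module,
            true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);                            // load target
        il.Emit(OpCodes.Castclass, property.DeclaringType);  // cast to the declaring class
        il.Emit(OpCodes.Ldarg_1);                            // load value
        il.Emit(OpCodes.Unbox_Any, property.PropertyType);   // unbox, or cast for reference types
        il.Emit(OpCodes.Callvirt, setter);                   // call the property setter
        il.Emit(OpCodes.Ret);

        return (Action<object, object>)method.CreateDelegate(typeof(Action<object, object>));
    }
}
```

The delegate is built once at setup time, for example with EmitSetterFactory.Create(typeof(SimpleObjectA).GetProperty("Name")), and is then invoked like any other Action<object, object>.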

Finally, I had a look at the new features of C# 3.0 and the .NET 3.5 API around LINQ and lambda expressions, which freed me from using Reflection.Emit. The main class is Expression, found in System.Linq.Expressions. I am not sure, but it is probably less powerful than Reflection.Emit (I am thinking of complex method generation...).

Now let's go into the code with a simple case: setting a property on one million objects.

First try (lesson one learned)

I give you the following simple code: first with GetProperty and SetValue inside the iteration, and second with GetProperty outside the iteration. Here is the code:
    // init, start and end are DateTime variables, prop is a PropertyInfo,
    // and objects holds the one million SimpleObjectA instances.

    // Set using GetProperty + SetValue
    init = DateTime.Now;
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        // The property lookup is repeated for every object.
        prop = o.GetType().GetProperty("Name");
        prop.SetValue(o, "plip", null);
    }
    end = DateTime.Now;
    Console.WriteLine("Set using GetProperty + SetValue : {0} + {1}", start.Subtract(init), end.Subtract(start));

    // Set using GetProperty only once + SetValue
    init = DateTime.Now;
    // The property lookup is done once, before the loop.
    prop = typeof(SimpleObjectA).GetProperty("Name");
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        prop.SetValue(o, "plip", null);
    }
    end = DateTime.Now;
    Console.WriteLine("Set using GetProperty only once + SetValue : {0} + {1}", start.Subtract(init), end.Subtract(start));
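
A side note on the measurement itself: DateTime.Now typically has a coarse resolution on Windows (often in the range of 10 to 16 ms), which matters for the shortest timings. The figures in this post were taken with DateTime.Now; the sketch below only shows an alternative based on System.Diagnostics.Stopwatch, with Timing.Measure being a hypothetical helper name.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (illustration only): times a block of code with
// Stopwatch, which uses a high-resolution counter when one is available.
public static class Timing
{
    public static TimeSpan Measure(Action action)
    {
        if (action == null)
            throw new ArgumentNullException("action");

        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Each loop above could then be wrapped in a lambda, for example Timing.Measure(() => { foreach (SimpleObjectA o in objects) prop.SetValue(o, "plip", null); }).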

Second try

Then, I tried to do it with Linq.Expressions, and it gave me some interesting thoughts about how to prepare the setter at initialization.
First, I thought of creating a strongly typed Expression of type Action<T, PT>, where T is the type of the object which has the property to set and PT is the type of that property. In my example it was very simple, but in an ORM context, for example, it could be a little bit more tricky. Here is the code:
    // Set using typed LambdaExpression (Action<T,I>)
    init = DateTime.Now;
    param = Expression.Parameter(typeof(SimpleObjectA), "x");
    value = Expression.Parameter(typeof(string), "y");
    setter = typeof(SimpleObjectA).GetProperty("Name").GetSetMethod();
    Expression<Action<SimpleObjectA, string>> typedExpression =
        Expression.Lambda<Action<SimpleObjectA, string>>(Expression.Call(param, setter, value), param, value);
    Action<SimpleObjectA, string> set = typedExpression.Compile();
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        set(o, "plip");
    }
    end = DateTime.Now;
    Console.WriteLine("Set using typed LambdaExpression (Action<T,I>) : {0} + {1}", start.Subtract(init), end.Subtract(start));

In fact, when doing mapping, you do not know the T type at compile time, so I thought of using an untyped Action (a Delegate in fact). But then you have to call the Delegate through DynamicInvoke, which is far worse than SetValue, so I searched for another solution. For example, here is the code:
    // Set using LambdaExpression + DynamicInvoke
    init = DateTime.Now;
    param = Expression.Parameter(typeof(SimpleObjectA), "x");
    value = Expression.Parameter(typeof(string), "y");
    setter = typeof(SimpleObjectA).GetProperty("Name").GetSetMethod();
    LambdaExpression dynamicExpression = Expression.Lambda(Expression.Call(param, setter, value), param, value);
    // Named dynamicSet rather than dynamic, which is a contextual keyword since C# 4.
    Delegate dynamicSet = dynamicExpression.Compile();
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        dynamicSet.DynamicInvoke(o, "plip");
    }
    end = DateTime.Now;
    Console.WriteLine("Set using LambdaExpression + DynamicInvoke : {0} + {1}", start.Subtract(init), end.Subtract(start));

So I found this article from Nate Kohari on Late Bound Invocation with Expression Trees. The idea is to use an Action<object, object> and convert to the real types inside the expression tree (whose compiled delegate can be cached). Now, performance is equivalent to using the property setter directly. Here is the code:
    // Set using typed LambdaExpression (Action<object,object>) -> late bound
    init = DateTime.Now;
    param = Expression.Parameter(typeof(object), "x");
    value = Expression.Parameter(typeof(object), "y");
    setter = typeof(SimpleObjectA).GetProperty("Name").GetSetMethod();
    Expression<Action<object, object>> lateBoundTypedExpression =
        Expression.Lambda<Action<object, object>>(
            Expression.Call(
                Expression.Convert(param, setter.DeclaringType),
                setter,
                Expression.Convert(value, setter.GetParameters()[0].ParameterType)),
            param, value);
    Action<object, object> lateBoundSet = lateBoundTypedExpression.Compile();
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        lateBoundSet(o, "plip");
    }
    end = DateTime.Now;
    Console.WriteLine("Set using typed LambdaExpression (Action<object,object>) -> late bound : {0} + {1}", start.Subtract(init), end.Subtract(start));
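
Since the compiled delegate can be cached, a mapper could keep one setter per property. The following is only a sketch under assumptions: LateBoundSetters and its For method are hypothetical names, ConcurrentDictionary requires .NET 4 or later, and the properties are assumed to be public instance properties declared on classes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper (illustration only): builds the late-bound setter
// shown above once per (type, property name) pair and reuses it afterwards.
public static class LateBoundSetters
{
    private static readonly ConcurrentDictionary<Tuple<Type, string>, Action<object, object>> Cache =
        new ConcurrentDictionary<Tuple<Type, string>, Action<object, object>>();

    public static Action<object, object> For(Type type, string propertyName)
    {
        return Cache.GetOrAdd(Tuple.Create(type, propertyName), key => Build(key.Item1, key.Item2));
    }

    private static Action<object, object> Build(Type type, string propertyName)
    {
        PropertyInfo property = type.GetProperty(propertyName);
        MethodInfo setter = property == null ? null : property.GetSetMethod();
        if (setter == null)
            throw new ArgumentException("No public setter found for " + propertyName, "propertyName");

        ParameterExpression target = Expression.Parameter(typeof(object), "x");
        ParameterExpression value = Expression.Parameter(typeof(object), "y");

        // ((TDeclaring)x).set_Property((TProperty)y)
        MethodCallExpression call = Expression.Call(
            Expression.Convert(target, setter.DeclaringType),
            setter,
            Expression.Convert(value, property.PropertyType));

        return Expression.Lambda<Action<object, object>>(call, target, value).Compile();
    }
}
```

With this in place, LateBoundSetters.For(typeof(SimpleObjectA), "Name") pays the compilation cost on the first call only, and later calls return the same delegate.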


In conclusion, here are the times in ms I got on my laptop:

Strategy                                        Initialization time (ms)   One million iterations time (ms)
GetProperty + SetValue inside iteration         0                          2891
GetProperty once & SetValue inside iteration    0                          1953
Delegate + DynamicInvoke                        187                        6234
Latebound Lambda Expressions                    0                          31
Strongly typed Lambda Expressions               0                          31
Property setter directly                        0                          31

I tried with 10 million iterations, and I got approximately 10x the times in the previous table.
With 60 million iterations, I tried only the three best-performing solutions, and we start to see small differences between them (I could have tried a realistic situation with several bound properties, but 10 million iterations with 6 properties should give the same results):
Strategy                             Initialization time (ms)   60 million iterations time (ms)
Latebound Lambda Expressions         0                          2219
Strongly typed Lambda Expressions    0                          1797
Property setter directly             0                          1578

1 comment:

Nestor said...

That's what i was looking for... Calculations means arguments !