Universal PredicateBuilder for Expression

In this new short post, I’ll show you how to create a universal PredicateBuilder for Expression in C# to merge 2 or more expressions with Linq. But first, I want to be sure you have a clear idea of what Func and Expression are.

Func<T> vs. Expression<Func<T>> in LINQ

First, LINQ is fantastic! It provides a consistent syntax to query all sorts of data: in-memory collections, SQL databases, XML files, even external APIs. One of its strengths is that you can write a LINQ provider for any data source that you want to support.

So, most people know about the “not obvious until it’s obvious” difference between IEnumerable<T> and IQueryable<T> – one represents an in-memory collection and one represents a query which will be executed at some point against a data source. Both have an almost identical set of LINQ extensions, except the IEnumerable<T> extensions accept Func<T> and the IQueryable<T> extensions accept Expression<Func<T>>.

Func<T> vs. Expression<Func<T>>

Go ahead, fire up Visual Studio and take a look at the method signatures of the Where LINQ extension for IEnumerable<T> and for IQueryable<T>.

  • The IEnumerable version: Where(Func<T, bool> predicate)
  • The IQueryable version: Where(Expression<Func<T, bool>> predicate)

Don’t worry if you haven’t noticed the difference: you call both with exactly the same syntax.

  • IEnumerable version: .Where(x => x.property == "value")
  • IQueryable version: .Where(x => x.property == "value")

So, what is the difference?

  • Func<T> is a delegate: a reference to an ordinary method that has been compiled down to IL (intermediate language) just like any other C# code that you write. There is nothing special about it.
  • Expression<Func<T>> is a description of a function as an expression tree. It can be compiled to IL at run time to produce a Func<T>, but it can also be translated to other languages.
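
To make the difference concrete, here is a minimal sketch that assigns the same lambda to both types and runs it through both Where overloads. The word list and the variable names are made up for illustration, AsQueryable() stands in for a real provider such as Entity Framework, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// The same lambda text, assigned to two different target types.
Func<string, bool> asDelegate = s => s.Length > 3;          // compiled to IL
Expression<Func<string, bool>> asTree = s => s.Length > 3;  // kept as a data structure

var words = new List<string> { "a", "tree", "of", "nodes" };

// Enumerable.Where takes the delegate and simply calls it for each item.
List<string> fromDelegate = words.Where(asDelegate).ToList();

// Queryable.Where takes the tree; the provider decides how to execute it.
// The in-memory provider behind AsQueryable() compiles the tree for us.
List<string> fromTree = words.AsQueryable().Where(asTree).ToList();

Console.WriteLine(string.Join(", ", fromDelegate));
Console.WriteLine(string.Join(", ", fromTree));

// Unlike the delegate, the tree can be inspected as data.
Console.WriteLine(asTree.Body);
```

Both calls return the same items; only the second one leaves the provider free to translate the predicate instead of running it.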

You need an expression for IQueryable because we don’t know what we’re querying. The specific IQueryable implementation will translate the given expression into whatever language needed to access the data.

So, you don’t need an expression for IEnumerable as it’s just an in-memory collection that understands vanilla IL so we can save a whole bunch of overhead and throw compiled queries at it.

You can convert an Expression<Func<T>> to a Func<T> by calling the Compile method that compiles the expression tree to IL – this is done at run-time so has a performance overhead compared to dealing with Func<T> directly. You cannot convert a Func<T> to an Expression<Func<T>> as you cannot reverse engineer IL to get the original source code back at run time. Not only is it very difficult to reverse engineer but compiling is a lossy process full of performance tricks. So, you’ll never be able to get the exact source code back even if you were super determined.

Expression in C# and Linq

A lambda expression can be assigned to a Func or Action delegate to process in-memory collections. The C# compiler converts a lambda expression assigned to a Func or Action delegate into executable code at compile time.

LINQ introduced a new type called Expression that represents a strongly typed lambda expression. This means a lambda expression can also be assigned to the Expression<TDelegate> type. The C# compiler converts a lambda expression assigned to Expression<TDelegate> into an expression tree instead of executable code. Remote LINQ query providers (such as LINQ-to-SQL, Entity Framework or any other LINQ query provider that implements the IQueryable<T> interface) use this expression tree as a data structure to build a runtime query.

The following figure illustrates the difference between assigning a lambda expression to a Func or Action delegate and assigning it to an Expression in LINQ.

Expression and Func

Define an Expression

Import the System.Linq.Expressions namespace and use the Expression<TDelegate> class to define an Expression. Expression<TDelegate> requires a delegate type such as Func or Action.

For example, you can assign lambda expression to the isTeenAger variable of Func type delegate, as shown below:

public class Student 
{
    public int StudentID { get; set; }
    public string StudentName { get; set; }
    public int Age { get; set; }
}

Func<Student, bool> isTeenAger = s => s.Age > 12 && s.Age < 20;

Now you can turn the Func delegate above into an Expression by wrapping the Func type in Expression, as below:

Expression<Func<Student, bool>> isTeenAgerExpr = s => s.Age > 12 && s.Age < 20;

In the same way, you can also wrap an Action<T> type delegate with Expression if you don’t return a value from the delegate.

Expression<Action<Student>> printStudentName = s => Console.WriteLine(s.StudentName);

Thus, you can define Expression<TDelegate> type. Now, let’s see how to invoke delegate wrapped by an Expression<TDelegate>.

Invoke an Expression

You can invoke the delegate wrapped by an Expression the same way as a delegate, but first you need to compile it using the Compile() method. Compile() returns a delegate of Func or Action type so that you can invoke it like any other delegate.

Expression<Func<Student, bool>> isTeenAgerExpr = s => s.Age > 12 && s.Age < 20;

//compile Expression using Compile method to invoke it as Delegate
Func<Student, bool>  isTeenAger = isTeenAgerExpr.Compile();
            
//Invoke: returns false, because Age = 20 fails the s.Age < 20 check
bool result = isTeenAger(new Student(){ StudentID = 1, StudentName = "Steve", Age = 20});

Expression Tree

You have learned about the Expression in the previous section. Now, let’s learn about the Expression tree here.

An expression tree, as the name suggests, is nothing but expressions arranged in a tree-like data structure. Each node in an expression tree is an expression. For example, an expression tree can represent the comparison x < y, where x, < and y are each represented as an expression and arranged in a tree-like structure.

An expression tree is an in-memory representation of a lambda expression. It holds the actual elements of the query, not the result of the query.

The expression tree makes the structure of the lambda expression transparent and explicit. You can interact with the data in the expression tree just as you can with any other data structure.
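
As a quick illustration of the x < y example, the sketch below lets the compiler build the tree and then reads its three nodes back. The variable names are mine, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq.Expressions;

// The compiler turns this lambda into a tree, not into executable code.
Expression<Func<int, int, bool>> lessThan = (x, y) => x < y;

// The body is a binary node with one child on each side.
var body = (BinaryExpression)lessThan.Body;

Console.WriteLine(body.NodeType); // the "<" node: LessThan
Console.WriteLine(body.Left);     // left child: x
Console.WriteLine(body.Right);    // right child: y

// The tree only describes the comparison; compile it to actually run it.
bool result = lessThan.Compile()(1, 2);
Console.WriteLine(result);
```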

For example, consider the following isTeenAgerExpr expression:

Expression<Func<Student, bool>> isTeenAgerExpr = s => s.Age > 12 && s.Age < 20;

The compiler will translate the above expression into an expression tree roughly equivalent to the following:

// pe is the parameter "s" of the lambda
ParameterExpression pe = Expression.Parameter(typeof(Student), "s");

Expression.Lambda<Func<Student, bool>>(
    Expression.AndAlso(
        Expression.GreaterThan(Expression.Property(pe, "Age"), Expression.Constant(12, typeof(int))),
        Expression.LessThan(Expression.Property(pe, "Age"), Expression.Constant(20, typeof(int)))),
    new[] { pe });

You can also build an expression tree manually. Let’s see how to build an expression tree for the following simple lambda expression:

Func<Student, bool> isAdult = s => s.Age >= 18;

This Func type delegate is similar to the following method:

public bool function(Student s)
{
  return s.Age >= 18;
}

To create the expression tree, first of all, create a parameter expression where Student is the type of the parameter and ‘s’ is the name of the parameter as below:

ParameterExpression pe = Expression.Parameter(typeof(Student), "s");

Now, use Expression.Property() to create s.Age expression where s is the parameter and Age is the property name of Student. (Expression is an abstract class that contains static helper methods to create the Expression tree manually.)

MemberExpression me = Expression.Property(pe, "Age");

Now, create a constant expression for 18:

ConstantExpression constant = Expression.Constant(18, typeof(int));

Use parameters

So far, we have built expression trees for s.Age (member expression) and 18 (constant expression). We now need to check whether the member expression is greater than or equal to the constant expression. For that, use the Expression.GreaterThanOrEqual() method and pass the member expression and constant expression as parameters:

BinaryExpression body = Expression.GreaterThanOrEqual(me, constant);

Thus, we have built an expression tree for the lambda expression body s.Age >= 18. We now need to join the parameter and body expressions. Use Expression.Lambda(body, parameters array) to join the body and parameter parts of the lambda expression s => s.Age >= 18:

var isAdultExprTree = Expression.Lambda<Func<Student, bool>>(body, new[] { pe });

This way you can build an expression tree for simple Func delegates with a lambda expression.

ParameterExpression pe = Expression.Parameter(typeof(Student), "s");

MemberExpression me = Expression.Property(pe, "Age");

ConstantExpression constant = Expression.Constant(18, typeof(int));

BinaryExpression body = Expression.GreaterThanOrEqual(me, constant);

var ExpressionTree = Expression.Lambda<Func<Student, bool>>(body, new[] { pe });

Console.WriteLine("Expression Tree: {0}", ExpressionTree);
		
Console.WriteLine("Expression Tree Body: {0}", ExpressionTree.Body);
		
Console.WriteLine("Number of Parameters in Expression Tree: {0}", 
                                ExpressionTree.Parameters.Count);
		
Console.WriteLine("Parameters in Expression Tree: {0}", ExpressionTree.Parameters[0]);
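
A hand-built tree behaves like a compiler-built one. The following sketch, under the assumption of C# 9 or later (top-level statements) and with made-up sample students, compiles the tree into a delegate and also passes the uncompiled tree to an IQueryable source.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Build s => s.Age >= 18 by hand, following the steps above.
ParameterExpression pe = Expression.Parameter(typeof(Student), "s");
BinaryExpression body = Expression.GreaterThanOrEqual(
    Expression.Property(pe, "Age"),
    Expression.Constant(18, typeof(int)));
var isAdultExprTree = Expression.Lambda<Func<Student, bool>>(body, pe);

// Compile once, then call the result like any other delegate.
Func<Student, bool> isAdult = isAdultExprTree.Compile();
bool steveIsAdult = isAdult(new Student { StudentName = "Steve", Age = 20 });
bool annIsAdult = isAdult(new Student { StudentName = "Ann", Age = 17 });

// The uncompiled tree can be handed directly to an IQueryable source.
var students = new[]
{
    new Student { StudentName = "Ann", Age = 17 },
    new Student { StudentName = "Bob", Age = 18 },
}.AsQueryable();
int adults = students.Count(isAdultExprTree);

Console.WriteLine($"{steveIsAdult} {annIsAdult} {adults}");

public class Student
{
    public int StudentID { get; set; }
    public string StudentName { get; set; } = "";
    public int Age { get; set; }
}
```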

The following image illustrates the whole process of creating an expression tree:

Linq construct expression tree

Why Expression Tree?

We have seen in the previous section that a lambda expression assigned to Func<T> compiles into executable code, while a lambda expression assigned to the Expression<TDelegate> type compiles into an expression tree.

Executable code runs in the same application domain to process in-memory collections. The static Enumerable class contains extension methods for in-memory collections that implement the IEnumerable<T> interface, e.g., List<T>, Dictionary<TKey, TValue>, etc. The extension methods in the Enumerable class accept a predicate parameter of a Func delegate type. For example, the Where extension method accepts a Func<TSource, bool> predicate. The lambda you pass has already been compiled into IL (Intermediate Language), so it runs directly over in-memory collections in the same AppDomain.

The following image shows that the Where extension method in the Enumerable class takes a Func delegate as a parameter:

Func delegate in Where

A Func delegate is raw executable code. So, if you debug the code, you will find that the Func delegate is opaque: you cannot see its parameters, return type and body:

Func delegate in debug mode

A Func delegate is for in-memory collections because it will be processed in the same AppDomain. What about remote LINQ query providers like LINQ-to-SQL, Entity Framework or other third-party products that provide LINQ capabilities?

How would they parse a lambda expression that has been compiled into raw executable code to learn its parameters and return type, and then build a runtime query from it? The answer is the expression tree.

So, the compiler transforms Expression<TDelegate> into a data structure called an expression tree.

ExpressionTree in debug mode

Now you can see the difference between a normal delegate and an Expression. An expression tree is transparent. You can retrieve a parameter, return type and body expression information from the expression, as below:

Expression<Func<Student, bool>> isTeenAgerExpr = s => s.Age > 12 && s.Age < 20;

Console.WriteLine("Expression: {0}", isTeenAgerExpr );
        
Console.WriteLine("Expression Type: {0}", isTeenAgerExpr.NodeType);

var parameters = isTeenAgerExpr.Parameters;

foreach (var param in parameters)
{
    Console.WriteLine("Parameter Name: {0}", param.Name);
    Console.WriteLine("Parameter Type: {0}", param.Type.Name );
}
var bodyExpr = isTeenAgerExpr.Body as BinaryExpression;

Console.WriteLine("Left side of body expression: {0}", bodyExpr.Left);
Console.WriteLine("Binary Expression Type: {0}", bodyExpr.NodeType);
Console.WriteLine("Right side of body expression: {0}", bodyExpr.Right);
Console.WriteLine("Return Type: {0}", isTeenAgerExpr.ReturnType);
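
Because the tree is plain data, a query provider can walk it and emit another language. Real providers are far more involved, so take the following only as a toy sketch: ToSql and Op are hypothetical helpers of mine, they handle just the node types used by isTeenAgerExpr, and the output is a SQL-like string rather than anything a real provider produces. It assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq.Expressions;

Expression<Func<Student, bool>> isTeenAgerExpr = s => s.Age > 12 && s.Age < 20;

// Translate only the body; the parameter "s" disappears in the output.
string whereClause = ToSql(isTeenAgerExpr.Body);
Console.WriteLine(whereClause); // ((Age > 12) AND (Age < 20))

// Hypothetical helper: recursively turns a small subset of nodes into text.
static string ToSql(Expression node) => node switch
{
    BinaryExpression b => $"({ToSql(b.Left)} {Op(b.NodeType)} {ToSql(b.Right)})",
    MemberExpression m => m.Member.Name,                 // s.Age becomes Age
    ConstantExpression c => c.Value?.ToString() ?? "NULL",
    _ => throw new NotSupportedException($"Node type {node.NodeType} is not handled by this sketch")
};

// Hypothetical helper: maps the operators used in this example.
static string Op(ExpressionType type) => type switch
{
    ExpressionType.AndAlso => "AND",
    ExpressionType.OrElse => "OR",
    ExpressionType.GreaterThan => ">",
    ExpressionType.LessThan => "<",
    ExpressionType.Equal => "=",
    _ => throw new NotSupportedException($"Operator {type} is not handled by this sketch")
};

public class Student
{
    public int StudentID { get; set; }
    public string StudentName { get; set; } = "";
    public int Age { get; set; }
}
```

None of this would be possible with a Func delegate, because the compiled IL no longer exposes the property names, operators and constants.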

Merge 2 or more expressions

So, after this long introduction, it is time to see how to merge expressions by creating a universal PredicateBuilder for Expression. What does this mean? Consider the following scenario: you want to apply an expression based on some conditions.

For example, you want to create a function that reads some data from a repository and filters it based on its parameters. So, this is my real function GetData. GetData takes markets as a parameter, and markets is defined as a List<long>. First, we have to define a new Expression to use:

Expression<Func<Invoice, bool>> funz = null;

So, if markets has at least one value, I want to filter on it. If there is only one market it is easy, and I can use one of the expressions I showed above. The problem starts when I have to combine more than one filter. A new filter may have to be appended to an existing one with AND if both must apply together, or with OR in other cases. Sometimes I also need the opposite of a filter (NOT).

public List<IncomeValue> GetData(List<long> markets = null)
{
    List<IncomeValue> rtn = new List<IncomeValue>();

    Expression<Func<Invoice, bool>> funz = null;

    if (markets != null && markets.Count > 0)
    {
        Expression<Func<Invoice, bool>> tmpFunz = null;
        foreach (long market in markets)
        {
            Expression<Func<Invoice, bool>> fnz = r => r.MarketId == market;
            tmpFunz = tmpFunz.Or(fnz);
        }
        funz = funz.And(tmpFunz);
    }

    IQueryable<Invoice> list = _db.Invoices;

    if (funz != null)
        list = list.Where(funz);

    var records = list.GroupBy(r => r.MarketId).Select(group => new
    {
        Country = group.Key,
        Count = group.Count()
    }).OrderBy(x => x.Country); // the projection only has Country and Count, so order by one of them

    foreach (var item in records)
        rtn.Add(new IncomeValue() { Count = item.Count, Score = item.Country});

    return rtn;
}

Natively, there is no simple way to combine expressions. We have to write an Expression extension class that provides those functions.

ExpressionExtensions code

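Here’s an implementation of PredicateBuilder that doesn’t use Invoke. It instead uses a Replace method that replaces all instances of one expression with another. Replace swaps the second expression’s parameter for the first one’s, so that both bodies refer to the same parameter.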
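
Why is any rewriting needed at all? The sketch below, with made-up int predicates and assuming C# 9 or later (top-level statements), shows what happens if you glue two bodies together without touching the parameters, and what the Invoke-based alternative looks like. Both lambdas call their parameter x, but each lambda owns its own ParameterExpression object, and the name plays no role.

```csharp
using System;
using System.Linq.Expressions;

Expression<Func<int, bool>> first = x => x > 1;
Expression<Func<int, bool>> second = x => x < 5;

// Naive merge: the second body still refers to the second lambda's own
// parameter object, which the new lambda never declares.
var naive = Expression.Lambda<Func<int, bool>>(
    Expression.AndAlso(first.Body, second.Body), first.Parameters);

bool naiveFailed = false;
try
{
    naive.Compile();
}
catch (InvalidOperationException)
{
    naiveFailed = true; // the second parameter is referenced but not defined
}

// Invoke-based merge: call the second lambda with the first one's parameter.
var viaInvoke = Expression.Lambda<Func<int, bool>>(
    Expression.AndAlso(first.Body, Expression.Invoke(second, first.Parameters)),
    first.Parameters);

Func<int, bool> between = viaInvoke.Compile();
bool three = between(3);
bool seven = between(7);

Console.WriteLine($"{naiveFailed} {three} {seven}");
```

The Invoke version works in memory, but some query providers cannot translate an Invoke node, which is the reason to prefer replacing the parameter.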

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Text;
using System.Threading.Tasks;

namespace PSC.Extensions
{
    public static class ExpressionExtensions
    {
        public static Expression<Func<T, bool>> And<T>(this Expression<Func<T, bool>> expr1, 
               Expression<Func<T, bool>> expr2)
        {
            // If one side is missing, the result is the other side
            // (null when both are missing).
            if (expr2 == null)
                return expr1;

            if (expr1 == null)
                return expr2;

            var secondBody = expr2.Body.Replace(expr2.Parameters[0], expr1.Parameters[0]);
            return Expression.Lambda<Func<T, bool>>
                  (Expression.AndAlso(expr1.Body, secondBody), expr1.Parameters);
        }

        public static Expression<Func<T, bool>> Or<T>(this Expression<Func<T, bool>> expr1, 
               Expression<Func<T, bool>> expr2)
        {
            // If one side is missing, the result is the other side
            // (null when both are missing).
            if (expr2 == null)
                return expr1;

            if (expr1 == null)
                return expr2;

            var secondBody = expr2.Body.Replace(expr2.Parameters[0], expr1.Parameters[0]);
            return Expression.Lambda<Func<T, bool>>
                  (Expression.OrElse(expr1.Body, secondBody), expr1.Parameters);
        }

        public static Expression<Func<T, bool>> Not<T>(this Expression<Func<T, bool>> expression)
        {
            var negated = Expression.Not(expression.Body);
            return Expression.Lambda<Func<T, bool>>(negated, expression.Parameters);
        }

        public static Expression Replace(this Expression expression, Expression searchEx, 
               Expression replaceEx)
        {
            return new ReplaceVisitor(searchEx, replaceEx).Visit(expression);
        }

        internal class ReplaceVisitor : ExpressionVisitor
        {
            private readonly Expression from, to;
            public ReplaceVisitor(Expression from, Expression to)
            {
                this.from = from;
                this.to = to;
            }
            public override Expression Visit(Expression node)
            {
                return node == from ? to : base.Visit(node);
            }
        }
    }
}
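
To try the idea out without a database, here is a small usage sketch. So that it runs on its own, it carries a condensed copy of the helpers: PredicateSketch, Rebind and ParameterSwap are names I made up for this snippet, the null handling is left out, and the sample students are invented. It assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

Expression<Func<Student, bool>> isTeen = s => s.Age > 12 && s.Age < 20;
Expression<Func<Student, bool>> isSteve = s => s.StudentName == "Steve";

// (teenager AND NOT Steve) OR exactly 40 years old
Expression<Func<Student, bool>> filter =
    isTeen.And(isSteve.Not()).Or(s => s.Age == 40);

var students = new[]
{
    new Student { StudentName = "Steve", Age = 15 },
    new Student { StudentName = "Ann", Age = 15 },
    new Student { StudentName = "Bob", Age = 40 },
    new Student { StudentName = "Eve", Age = 30 },
}.AsQueryable();

string[] names = students.Where(filter).Select(s => s.StudentName).ToArray();
Console.WriteLine(string.Join(", ", names)); // Ann, Bob

public class Student
{
    public string StudentName { get; set; } = "";
    public int Age { get; set; }
}

// Condensed, hypothetical variant of the extension class from this post.
public static class PredicateSketch
{
    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right) =>
        Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, Rebind(right, left)), left.Parameters);

    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right) =>
        Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, Rebind(right, left)), left.Parameters);

    public static Expression<Func<T, bool>> Not<T>(this Expression<Func<T, bool>> e) =>
        Expression.Lambda<Func<T, bool>>(Expression.Not(e.Body), e.Parameters);

    // Returns source's body with its parameter replaced by target's parameter.
    private static Expression Rebind<T>(
        Expression<Func<T, bool>> source, Expression<Func<T, bool>> target) =>
        new ParameterSwap(source.Parameters[0], target.Parameters[0]).Visit(source.Body);

    private sealed class ParameterSwap : ExpressionVisitor
    {
        private readonly ParameterExpression _from, _to;

        public ParameterSwap(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        // Only parameter nodes are touched; everything else is visited as usual.
        protected override Expression VisitParameter(ParameterExpression node) =>
            node == _from ? _to : node;
    }
}
```

Because the merged filter is still an ordinary Expression<Func<Student, bool>>, the same Where call should work against any IQueryable<Student> source whose provider supports the operators involved.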

In conclusion, with a custom extension I created a universal PredicateBuilder for Expression. You can easily merge two or more expressions with AND, OR, or NOT.
