LINQ Your Way to Happiness

August 14, 2014

Rarely does laziness pay off, unless your idea of a payoff is being caught up on that guilty pleasure TV series that everybody is talking about (currently, mine is Orange Is the New Black). Laziness normally ends up with a messy house, unpaid bills, incomplete projects, and a to-do list a mile long. In development, however, the tools we use are designed to be lazy and do as little as possible for as short a time as possible. That fancy Intel i7 processor in your machine? Likely underclocking itself to save on juice.

The tools we use at GeekHive are no different. .NET includes a handy set of language enhancements known as LINQ, or Language-Integrated Query. Let’s take a look at some of the benefits we reap from using the syntax, as well as some of the potential pitfalls in being slightly too lazy for your own good.

Syntax Overview

Before digging deeper, note that there are two ways to leverage the LINQ syntax/architecture. The first uses a series of Extension Methods, whether developed and maintained by Microsoft, supplied by third-party companies, or crafted on your own. The second relies on the compressed syntax known as Query Expressions.
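As a minimal sketch of the "craft your own" option, the hypothetical WhereEven operator below (the class and method names are illustrative and not part of .NET) is an ordinary static method whose first parameter carries the 'this' modifier. Writing it with 'yield return' keeps it lazy, so it chains with the built-in operators.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EvenNumberExtensions
{
    // Hypothetical custom operator: 'this' lets callers write values.WhereEven().
    public static IEnumerable<int> WhereEven(this IEnumerable<int> values)
    {
        foreach (int v in values)
        {
            if ((v % 2) == 0)
            {
                // yield return hands back one value at a time, only when asked.
                yield return v;
            }
        }
    }
}
```

With this in scope, a call such as new[] { 1, 2, 3, 4 }.WhereEven().Select(v => v.ToString()) reads just like a chain of the operators that ship with the framework.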

In the examples below, both syntaxes are shown as well as the more traditional foreach approach. I show how to take a set of integer values, extract all even numbers excluding those less than 10, and finally obtain the string representation of each value. (Note: explicit types and unnecessary verbosity have been used on purpose for clarity, although the ‘var’ keyword is an invaluable and sometimes necessary shorthand.)

public IEnumerable<string> ForEachExample(IEnumerable<int> values)
{
    List<string> result = new List<string>();
    foreach (int v in values)
    {
        // Keep even values only...
        if ((v % 2) == 0)
        {
            // ...and exclude those less than 10.
            if (v >= 10)
            {
                result.Add(v.ToString());
            }
        }
    }
    return result;
}

public IEnumerable<string> ExtensionExample(IEnumerable<int> values)
{
    IEnumerable<string> result = values.Where(v => (v % 2) == 0)
                                       .Where(v => v >= 10)
                                       .Select(v => v.ToString());

    return result;
}

public IEnumerable<string> ExpressionExample(IEnumerable<int> values)
{
    IEnumerable<string> result = from v in values
                                 where (v % 2) == 0
                                 where v >= 10
                                 select v.ToString();

    return result;
}

The syntaxes differ greatly, but the core content is there. We can clearly see the modulus operator to identify even values, the exclusion of values less than 10, and the conversion to a string value. Leaving aside the implicit difference that the ‘foreach’ example has already iterated over the complete “values” collection while the extension and expression examples have not yet done so, we can see a few of the many operations supported by the syntax. A more complete list can be found here, with helpful examples for each.
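That implicit difference can be made visible with a small sketch. The class, the PredicateCalls counter, and the sample values are assumptions added purely for illustration; the counter records how often the first filter actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the "is even" predicate has been invoked.
    public static int PredicateCalls;

    public static IEnumerable<string> BuildQuery(IEnumerable<int> values)
    {
        // Nothing is inspected here; Where and Select only describe the work.
        return values.Where(v => { PredicateCalls++; return (v % 2) == 0; })
                     .Where(v => v >= 10)
                     .Select(v => v.ToString());
    }

    public static string[] Run()
    {
        PredicateCalls = 0;
        IEnumerable<string> query = BuildQuery(new[] { 4, 7, 12, 15, 20 });
        Console.WriteLine(PredicateCalls); // 0: the query has not executed yet

        string[] result = query.ToArray(); // ToArray forces the iteration
        Console.WriteLine(PredicateCalls); // 5: one call per input value

        return result; // contains "12" and "20"
    }
}
```

Building the query costs almost nothing; the work happens only when something, here ToArray, asks for the values.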

The Bob

Using the knowledge that a LINQ expression (using either Extension Methods or Query Expressions) is only ever run when a terminating statement is used, we can craft complex, conditional queries. For instance, we can pass a “query object” between methods to build our final query, skipping and including methods as we go, based on other variables.

Using another example, let’s take a given set of DateTime values. If a date falls on a Saturday or Sunday, let’s exclude it. However, if today’s date is a Saturday or Sunday, let’s keep only the weekend dates and exclude all others. With this filtered set, order the values by the time of day and return the total number of ticks in each value. Once again, the code samples are verbose for clarity and could be condensed multiple ways.

public long[] DateTimeExample(IEnumerable<DateTime> values)
{
    var dayOfWeek = DateTime.Today.DayOfWeek;
    if (dayOfWeek == DayOfWeek.Saturday || dayOfWeek == DayOfWeek.Sunday)
    {
        var filteredResult = IncludeWeekends(values);
        var orderedResult = OrderByTimeOfDay(filteredResult);
        var convertedResult = GetTicks(orderedResult);
        return convertedResult.ToArray();
    }
    else
    {
        var filteredResult = ExcludeWeekends(values);
        var orderedResult = OrderByTimeOfDay(filteredResult);
        var convertedResult = GetTicks(orderedResult);
        return convertedResult.ToArray();
    }
}

public IEnumerable<DateTime> IncludeWeekends(IEnumerable<DateTime> values)
{
    return from v in values
           where v.DayOfWeek == DayOfWeek.Saturday || v.DayOfWeek == DayOfWeek.Sunday
           select v;
}

public IEnumerable<DateTime> ExcludeWeekends(IEnumerable<DateTime> values)
{
    return from v in values
           where v.DayOfWeek != DayOfWeek.Saturday && v.DayOfWeek != DayOfWeek.Sunday
           select v;
}

public IEnumerable<DateTime> OrderByTimeOfDay(IEnumerable<DateTime> values)
{
    return from v in values
           orderby v.TimeOfDay
           select v;
}

public IEnumerable<long> GetTicks(IEnumerable<DateTime> values)
{
    return from v in values
           select v.Ticks;
}

What you can see in the above example is that the initial values are passed through a Restriction Operation; the result of the Restriction Operation is passed through a Sorting Operation; the result of the Sorting Operation is passed through a Projection Operation; and finally the resulting crafted query is forcibly executed using a Conversion Operation and returned to the initial caller as an array.

This example attempts to show that a query can be crafted using methods residing across class, or even assembly, boundaries, building up a final set of instructions to eventually output the most glorious of result sets ever known to mankind!
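One of the many possible condensed forms is sketched here, using extension methods and a single query variable that is reassigned as conditions are evaluated. The class name, the IsWeekend helper, and the 'today' parameter (passed in rather than read from DateTime.Today, so the behavior is repeatable) are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CondensedDateTimeExample
{
    private static bool IsWeekend(DayOfWeek day)
    {
        return day == DayOfWeek.Saturday || day == DayOfWeek.Sunday;
    }

    public static long[] TicksByTimeOfDay(IEnumerable<DateTime> values, DateTime today)
    {
        IEnumerable<DateTime> query = values;

        // Pick the restriction at run time; the query is still not executed.
        if (IsWeekend(today.DayOfWeek))
        {
            query = query.Where(v => IsWeekend(v.DayOfWeek));
        }
        else
        {
            query = query.Where(v => !IsWeekend(v.DayOfWeek));
        }

        // Sorting and projection are appended; ToArray finally runs everything.
        return query.OrderBy(v => v.TimeOfDay)
                    .Select(v => v.Ticks)
                    .ToArray();
    }
}
```

Because each step only extends the pending query, the same variable could just as easily be handed to another method, class, or assembly before the final ToArray call.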

The Repeater

Leveraging the power of deferred execution, the more formal name for the lazy nature of LINQ expressions, we can repeat an operation at a whim. Consider a scenario where a query’s underlying data source is an external resource, such as a database, and the result may change over time. In the example below, the result is obtained from the external resource at the start of the method, followed by a wait operation (simulating some other long-running or delayed task). Once the delay is complete, the operation is executed again on the same IQueryable instance, which may or may not return a different value (depending on which Restriction, Ordering, or other operations have been applied).

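private EventWaitHandle waitHandle = new AutoResetEvent(false);

// int is an illustrative element type; any type with a meaningful != comparison works.
public bool RepeaterExample(IQueryable<int> getValueFromDb)
{
    // First() is a terminating operation, so the query runs here...
    var result1 = getValueFromDb.First();

    waitHandle.WaitOne(); // Or some other long-running operation

    // ...and runs again here, against whatever the data source now holds.
    var result2 = getValueFromDb.First();

    return (result1 != result2);
}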
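The same behavior can be reproduced without a database. The sketch below assumes an in-memory list standing in for the external resource (exposed through AsQueryable) and replaces the wait with a direct change to the data; the class and variable names are illustrative.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RepeaterDemo
{
    public static bool ResultChangesBetweenRuns()
    {
        // In-memory stand-in for a database table (an assumption for this sketch).
        var table = new List<int> { 5, 3, 8 };
        IQueryable<int> smallestFirst = table.AsQueryable().OrderBy(v => v);

        int result1 = smallestFirst.First(); // the query runs now and yields 3

        table.Add(1); // the "external resource" changes in the meantime

        int result2 = smallestFirst.First(); // same query object, run again: 1

        return result1 != result2;
    }
}
```

Nothing about the query object changed between the two calls; only the data underneath it did.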

The Accidental Repeater

Unfortunately, with great power comes great responsibility. An inexperienced developer can easily repeat an entire query multiple times by not fully understanding which operations are deferred and which are not.

In the example below, our input values are ordered and projected into a set of strings. However, as explained earlier, these operations will not be applied until a terminating operation is performed (such as a .ToArray() call).

Likewise, the invocation of the .Any() operator with a filtering parameter means that the Ordering and Projection operations must be executed (in part or in full) so that the .Any() operation can produce its answer. Note that the framework is intelligent enough to recognize that the .Any() operator need not run to completion and can stop once a value matching the criteria is found.

Assuming the .Any() operator returns true, the filtered result is then materialized into an array to be iterated over within the foreach loop. In the end, the Ordering and Projection operations have likely run twice, which could easily be disastrous should the collections be large or the operations complex or reliant on external resources.

public void AccidentalRepeater<T>(IEnumerable<T> values)
{
    var filteredResult = values.OrderByDescending(v => v).Select(v => v.ToString());

    if (filteredResult.Any(v => v.StartsWith("Phil")))
    {
        var filteredResultArray = filteredResult.ToArray();
        foreach (var filteredValue in filteredResultArray)
        {
            Console.WriteLine(filteredValue);
        }
    }
}
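One common way to avoid the repeat, sketched here under the assumption that the full result fits comfortably in memory, is to materialize the query once and run every later operation against the array. The class, the method names, and the CountingSource helper (which reports each fresh pass over the source) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleEnumerationDemo
{
    public static string[] OrderedIfAnyStartsWith(IEnumerable<string> values, string prefix)
    {
        // Materialize once; Any and the caller's loop then reuse the array.
        string[] ordered = values.OrderByDescending(v => v).ToArray();

        return ordered.Any(v => v.StartsWith(prefix, StringComparison.Ordinal))
            ? ordered
            : new string[0];
    }

    // Invokes the callback every time the sequence is enumerated from the start.
    public static IEnumerable<string> CountingSource(string[] items, Action onEnumerated)
    {
        onEnumerated();
        foreach (string item in items)
        {
            yield return item;
        }
    }
}
```

Feeding CountingSource into OrderedIfAnyStartsWith triggers the callback once, whereas the pattern of calling Any on the deferred query and then ToArray would trigger it twice.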

Teamwork, Teamwork!

While we can all agree that software development has both a team-oriented component and an individual component (two people can’t always share a single keyboard), working in parallel has its advantages. Parallel LINQ (PLINQ) is a .NET feature that builds upon the traditional LINQ syntax and allows developers to easily leverage the multiple cores of a modern processor.

As the name suggests, the implementation allows for a LINQ query to be executed in parallel to obtain the result (hopefully) more quickly than a single-threaded execution would. Luckily, the framework is, once again, intelligent enough to briefly analyze the operator combinations to identify whether a parallel or sequential mode of operation would be best suited. This MSDN article explains further the decision-making processes at play.

As with all multi-threaded operations, the developer must weigh the benefits of parallel execution against the associated overhead, something best achieved using representative configurations, loads, and data.

In the next example, the input set consists of multiple streams representing some arbitrary data which must be compressed and written to its final resting place. All the worries of how best to create a new thread or how many threads to leverage can be ignored and instead, only the operation to be performed in parallel must be described.

public void PLINQExample(IEnumerable<Stream> values)
{
    var pValues = values.AsParallel();
    pValues.ForAll(v => ProcessValue(v));
}

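public void ProcessValue(Stream value)
{
    // A GZipStream in Compress mode is write-only, so it wraps the destination.
    // OpenFinalDestination is a hypothetical helper returning the writable target stream.
    using (Stream destination = OpenFinalDestination())
    using (var zipStream = new System.IO.Compression.GZipStream(destination, System.IO.Compression.CompressionMode.Compress))
    {
        value.CopyTo(zipStream);
    }
}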
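To weigh the benefit against the overhead, one rough approach is to time the same query in both modes with data that resembles production. The sketch below is only an illustration: the SumOfDivisors workload, the class, and the method names are made up for the purpose, and a Stopwatch is a coarse instrument compared with a proper benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class ParallelComparison
{
    // Deliberately CPU-bound work, so there is something worth parallelizing.
    public static long SumOfDivisors(int n)
    {
        long sum = 0;
        for (int d = 1; d <= n; d++)
        {
            if (n % d == 0)
            {
                sum += d;
            }
        }
        return sum;
    }

    public static long RunSequential(int[] inputs)
    {
        return inputs.Select(n => SumOfDivisors(n)).Sum();
    }

    public static long RunParallel(int[] inputs)
    {
        // AsParallel is the only change needed to opt in to PLINQ.
        return inputs.AsParallel().Select(n => SumOfDivisors(n)).Sum();
    }

    // Returns the wall-clock time taken by a single run of the given action.
    public static TimeSpan Time(Func<long> action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        return watch.Elapsed;
    }
}
```

Comparing Time(() => ParallelComparison.RunSequential(inputs)) with Time(() => ParallelComparison.RunParallel(inputs)) over several runs gives a first impression; for small inputs or trivial work, the sequential version may well win.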

Happiness

While the number of operations and the power of LINQ cannot be easily summarized in a few pages, as a .NET developer, I am grateful to have them at my disposal. When faced with almost any task, the need to filter, sort, project, or convert a collection is almost always present. Having a robust set of libraries that have been thoroughly tested for accuracy and throughput is a great and powerful ally.

Got any tips, tricks, or gotchas relating to LINQ, the sequential or parallel kind? Let us know @GeekHive.

Phil Azzi, Developer, GeekHive
