Parallel LINQ (PLINQ) is a concurrent execution engine for Language-Integrated Query (LINQ) queries. PLINQ is part of the Parallel Extensions library (previously known as Parallel Framework Extensions, or PFX), a managed concurrency library that comprises two parts: the Task Parallel Library (TPL) and PLINQ. The former is a task parallelism component, and the latter is a query execution engine built on top of the CLR. This article takes a look at PLINQ and its features. For more information about LINQ, see "LINQed & Layered" and "Understanding the LinqDataSource Control."

PLINQ Prerequisites

To work with PLINQ, you should have one of the following installed on your system:

  • Visual Studio 2008 with the Parallel Extensions Library
  • Visual Studio 2010 Beta 1 or later

Also, you should have a good understanding of LINQ and how to use LINQ queries.

What Is PLINQ?

Simply put, PLINQ is a parallel execution engine for running your LINQ queries on multicore systems. The MSDN article "Parallel LINQ: Running Queries On Multi-Core Processors" states: "PLINQ is a query execution engine that accepts any LINQ-to-Objects or LINQ-to-XML query and automatically utilizes multiple processors or cores for execution when they are available."

PLINQ is a programming model for building applications that take advantage of parallel hardware for improved performance and scalability, without requiring you to master the details of data parallelism. The key to PLINQ is parallel execution: the work of a query is spread across multiple threads that run concurrently. (A thread is a path of execution within a process and is also the smallest unit of execution within a process.) PLINQ is exposed through extension methods, so you choose which queries run in parallel.

Parallelizing Your LINQ Queries

To parallelize a LINQ query, you need to do two things: reference the assembly that contains PLINQ at compilation time (System.Concurrency.dll in the early Parallel Extensions preview; in .NET Framework 4, PLINQ is part of System.Core.dll), and call the System.Linq.ParallelEnumerable.AsParallel extension method on your data.

Consider the following code:

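// Requires: using System; using System.Linq;
var integerList = Enumerable.Range(1, 100);

// AsParallel() turns the source into a parallel query.
var data = from x in integerList.AsParallel()
           where x <= 25
           select x;

foreach (var v in data)
{
    Console.WriteLine(v);
}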

Notice the call to the AsParallel() method. It returns an object of type ParallelQuery<int>, and the query operators that follow it run in parallel.

The AsParallel extension method is defined as shown in the following example:

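// Simplified declaration (signature only), as shipped in .NET Framework 4.
// Early Parallel Extensions previews returned IParallelEnumerable<T> instead.
namespace System.Linq
{
    public static class ParallelEnumerable
    {
        public static ParallelQuery<TSource> AsParallel<TSource>(
            this IEnumerable<TSource> source);

        // Other standard query operators
    }
}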

Note that in the Parallel Extensions preview, the AsParallel method is overloaded: it can accept an integer argument and also a ParallelQueryOptions enumeration value. The integer argument denotes the degree of parallelism, that is, the number of threads used to execute the query. The ParallelQueryOptions enumeration has two values: None and PreserveOrdering. The PreserveOrdering value keeps the output elements in the same order as the source. In the released .NET Framework 4, AsParallel no longer takes these arguments; the same settings are applied with the WithDegreeOfParallelism and AsOrdered methods.
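As a minimal sketch of how the degree of parallelism and ordering can be set, the following example assumes the released .NET Framework 4 API (WithDegreeOfParallelism and AsOrdered) rather than the preview overloads of AsParallel. The class name PlinqOptionsSketch and the helper FilterOrdered are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;

public static class PlinqOptionsSketch
{
    // Hypothetical helper: returns the values from 1..100 that are <= limit,
    // in source order, using at most maxThreads concurrent workers.
    public static int[] FilterOrdered(int limit, int maxThreads)
    {
        return Enumerable.Range(1, 100)
            .AsParallel()
            .AsOrdered()                          // keep results in source order
            .WithDegreeOfParallelism(maxThreads)  // upper bound on concurrency
            .Where(x => x <= limit)
            .ToArray();
    }

    public static void Main()
    {
        // Prints 1 through 25, one per line, in ascending order.
        foreach (int v in FilterOrdered(25, 2))
            Console.WriteLine(v);
    }
}
```

Without AsOrdered, the same query would still return the values 1 through 25, but not necessarily in ascending order.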

Under the Covers

Note that any PLINQ query that can be parallelized is based on partitioning. PLINQ breaks the input data into pieces (partitions) and distributes them to the processing cores on your system. Partitioning is of the following types:

  • range partitioning
  • chunk partitioning
  • striped partitioning
  • hash partitioning
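To make two of these schemes concrete, here is a hand-rolled illustration of how range and striped partitioning could assign element indices to workers. It is not PLINQ's internal implementation; the class PartitioningSketch and its methods are hypothetical and only show the idea.

```csharp
using System;
using System.Collections.Generic;

public static class PartitioningSketch
{
    // Range partitioning: each worker gets one contiguous block of indices.
    public static List<int>[] RangePartition(int count, int workers)
    {
        var parts = new List<int>[workers];
        int blockSize = (count + workers - 1) / workers; // ceiling division
        for (int w = 0; w < workers; w++)
        {
            parts[w] = new List<int>();
            int start = w * blockSize;
            int end = Math.Min(start + blockSize, count);
            for (int i = start; i < end; i++)
                parts[w].Add(i);
        }
        return parts;
    }

    // Striped partitioning: worker w takes indices w, w + workers, w + 2 * workers, ...
    public static List<int>[] StripedPartition(int count, int workers)
    {
        var parts = new List<int>[workers];
        for (int w = 0; w < workers; w++)
        {
            parts[w] = new List<int>();
            for (int i = w; i < count; i += workers)
                parts[w].Add(i);
        }
        return parts;
    }

    public static void Main()
    {
        // 10 elements, 3 workers.
        foreach (var part in RangePartition(10, 3))
            Console.WriteLine("range:   " + string.Join(",", part));
        foreach (var part in StripedPartition(10, 3))
            Console.WriteLine("striped: " + string.Join(",", part));
    }
}
```

With 10 elements and 3 workers, the range scheme yields the blocks 0-3, 4-7, and 8-9, while the striped scheme yields 0,3,6,9 then 1,4,7 then 2,5,8.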

Processing the Data in Parallel

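PLINQ allows you to process the results of a query in parallel using the ForAll operator. (The Parallel.For and Parallel.ForEach loops serve a similar purpose, but they belong to the TPL rather than to PLINQ.) Here is an example that illustrates how you can use the ForAll() method to process items: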

IEnumerable<int> integerList = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

var data = from i in integerList.AsParallel()
           where i <= 5
           select i;

// ForAll runs the delegate on the worker threads, so the output order is not guaranteed.
data.ForAll(i => Console.WriteLine(i));
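Because ForAll invokes its delegate on several threads at once, any shared state that the delegate touches must be thread-safe. The following sketch, which assumes .NET Framework 4 and uses a hypothetical helper named SquaresUpTo, collects results in a ConcurrentBag<int> and sorts them afterward because the insertion order is arbitrary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class ForAllSketch
{
    // Hypothetical helper: squares every value in 1..9 that is <= limit.
    public static int[] SquaresUpTo(int limit)
    {
        var bag = new ConcurrentBag<int>(); // safe for concurrent Add calls

        Enumerable.Range(1, 9)
            .AsParallel()
            .Where(i => i <= limit)
            .ForAll(i => bag.Add(i * i));   // runs on multiple threads

        // The bag has no meaningful order, so sort before returning.
        return bag.OrderBy(v => v).ToArray();
    }

    public static void Main()
    {
        // Prints: 1,4,9,16,25
        Console.WriteLine(string.Join(",", SquaresUpTo(5)));
    }
}
```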

Here is another example that measures the elapsed time of a query that uses the AsParallel() method.

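// Requires: using System; using System.Diagnostics; using System.Linq;
int[] myList = new int[90000];
Random randomListInstance = new Random();

for (int i = 0; i < myList.Length; i++)
    myList[i] = randomListInstance.Next(90000);

Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();

// PLINQ queries are deferred; ToArray() forces the query to run
// so that the stopwatch measures actual work.
var results = (from n in myList.AsParallel() select n).ToArray();

stopWatch.Stop();

// ElapsedMilliseconds is the total elapsed time; Elapsed.Milliseconds
// is only the milliseconds component of the TimeSpan.
Console.WriteLine("Elapsed time: " + stopWatch.ElapsedMilliseconds + " milliseconds");
Console.Read();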
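A timing like this is more informative when the same work is also measured sequentially. The sketch below, assuming .NET Framework 4, times a deliberately CPU-heavy predicate both ways. The names TimingSketch, IsPrime, and CountPrimes are hypothetical, and the measured times depend entirely on your hardware; for small or cheap workloads the parallel version may well be slower.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Naive trial-division primality test, used only as CPU-bound work.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Counts the primes in data, sequentially or with PLINQ.
    public static int CountPrimes(int[] data, bool parallel)
    {
        return parallel
            ? data.AsParallel().Count(n => IsPrime(n))
            : data.Count(n => IsPrime(n));
    }

    public static void Main()
    {
        int[] data = Enumerable.Range(1, 2000000).ToArray();

        Stopwatch sw = Stopwatch.StartNew();
        int sequential = CountPrimes(data, false);
        sw.Stop();
        Console.WriteLine("Sequential: " + sequential + " primes in " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        int parallel = CountPrimes(data, true);
        sw.Stop();
        Console.WriteLine("Parallel:   " + parallel + " primes in " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Count() executes the query immediately, so no extra call is needed to force evaluation before the stopwatch stops.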

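You can also handle exceptions thrown by your PLINQ queries. PLINQ wraps exceptions thrown on its worker threads in an AggregateException (System.Threading.AggregateException in the Parallel Extensions preview, System.AggregateException in .NET Framework 4), which is raised when the query executes. You can retrieve the details of the actual exceptions from the InnerExceptions property of the AggregateException class.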
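The following is a minimal sketch of that pattern, assuming the .NET Framework 4 System.AggregateException type. The class PlinqExceptionSketch and the helpers Check and CountFailures are hypothetical and exist only to trigger and observe a failure.

```csharp
using System;
using System.Linq;

public static class PlinqExceptionSketch
{
    // Hypothetical helper: fails for one specific input value.
    static int Check(int n)
    {
        if (n == 5)
            throw new InvalidOperationException("Bad value: " + n);
        return n;
    }

    // Runs a query that fails and returns the number of inner exceptions observed.
    public static int CountFailures()
    {
        try
        {
            // ToArray() forces execution; this is where the exception surfaces.
            Enumerable.Range(1, 10)
                .AsParallel()
                .Select(n => Check(n))
                .ToArray();
            return 0;
        }
        catch (AggregateException ex)
        {
            // Flatten() unwraps any nested AggregateException instances.
            var inner = ex.Flatten().InnerExceptions;
            foreach (Exception e in inner)
                Console.WriteLine(e.GetType().Name + ": " + e.Message);
            return inner.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Failures observed: " + CountFailures());
    }
}
```

Catching AggregateException around the point where the query is enumerated, rather than where it is defined, matters because the query does not run until then.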