[PRL] Parallel Performance: Optimize Managed Code For Multi-Core Machines -- MSDN Magazine, October 2007

Mitchell Wand wand at ccs.neu.edu
Mon Sep 24 20:22:29 EDT 2007


shared http://msdn.microsoft.com/msdnmag/issues/07/10/Futures/default.aspx



Multi-core machines are now becoming standard while the speed increases
of single processors have slowed down. The key to further performance
improvements is therefore to run a program on multiple processors in
parallel. Unfortunately, it is still very hard to write algorithms that
actually take advantage of those multiple processors. In fact, most
applications use just a single core and see no speed improvement when run
on a multi-core machine. We need to write our programs in a new way.

Introducing TPL

The Task Parallel Library (TPL) is designed to make it much easier to write
managed code that can automatically use multiple processors. Using the
library, you can conveniently express potential parallelism in existing
sequential code, where the exposed parallel tasks will be run concurrently
on all available processors. Usually this results in significant speedups.

TPL is being created as a collaborative effort by Microsoft® Research, the
Microsoft Common Language Runtime (CLR) team, and the Parallel Computing
Platform team. TPL is a major component of the Parallel FX library, the next
generation of concurrency support for the Microsoft .NET Framework. Though
it has not yet reached version 1.0, the first Parallel FX Community Tech
Preview (CTP) will be available from MSDN® in Fall '07. Watch
http://blogs.msdn.com/somasegar for details. TPL does not require any
language extensions and works with the .NET Framework 3.5 and higher.

Visual Studio® 2008 is fully supported, and all parallelism is expressed
using normal method calls. For example, suppose you have the following for
loop that squares the elements of an array:

for (int i = 0; i < 100; i++) {
  a[i] = a[i]*a[i];
}

Since the iterations are independent of each other, that is, subsequent
iterations do not read state updates made by prior iterations, you can use
TPL to express the potential parallelism with a call to the Parallel.For
method, like this:

Parallel.For(0, 100, delegate(int i) {
  a[i] = a[i]*a[i];
});

Note that Parallel.For is just a normal static method with three arguments,
where the last argument is a delegate expression. This delegate captures the
unchanged loop body of the previous example, which makes it particularly
easy to experiment with introducing concurrency into a program.
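With the C# 3.0 lambda syntax available in Visual Studio 2008, the delegate argument shrinks to a single expression. The following is a minimal, self-contained sketch (not from the article); note that in the Parallel FX CTP the Parallel class lived in the System.Threading namespace, while later .NET releases ship it in System.Threading.Tasks, which this sketch uses so that it compiles today:

```csharp
using System;
using System.Threading.Tasks; // CTP: System.Threading; later releases: System.Threading.Tasks

class SquareDemo
{
    static void Main()
    {
        int[] a = new int[100];
        for (int i = 0; i < 100; i++) a[i] = i;

        // Each iteration touches only a[i], so the loop is safely parallel.
        // The lambda i => ... is shorthand for the delegate(int i) { ... } form.
        Parallel.For(0, 100, i => a[i] = a[i] * a[i]);

        Console.WriteLine(a[5]); // prints 25
    }
}
```

The lambda compiles to the same delegate as the anonymous-method form, so either spelling works with the library.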

The library contains sophisticated algorithms for dynamic work distribution
and automatically adapts to the workload and the particular machine. Note
that the primitives of the library only express potential parallelism; they
do not guarantee it. For example, on a single-processor machine, parallel
for loops are executed sequentially, closely matching the performance of
strictly sequential code. On a dual-core machine, however, the library may
use two worker threads to execute the loop in parallel, depending on the
workload and configuration. This means you can introduce parallelism into
your code today, and your applications will automatically use multiple
processors when they are available, while still performing well on older
single-processor machines.

Unfortunately, the library does not help to correctly synchronize parallel
code that uses shared memory. It is still the programmer's responsibility to
ensure that certain code can be safely executed in parallel. Other
mechanisms, such as locks, are still needed to protect concurrent
modifications to shared memory. TPL does offer some abstractions, though,
that help with synchronization (as we will show you in a moment).
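To illustrate the point about shared memory, here is a small sketch (my own example, not from the article) of a parallel loop that sums into a shared variable. The update sum += a[i] is a read-modify-write on shared state, so concurrent iterations could lose updates; protecting it with an ordinary lock restores correctness, at the cost of serializing the additions:

```csharp
using System;
using System.Threading.Tasks; // CTP: System.Threading; later releases: System.Threading.Tasks

class SumDemo
{
    static void Main()
    {
        int[] a = new int[1000];
        for (int i = 0; i < a.Length; i++) a[i] = 1;

        int sum = 0;
        object gate = new object();

        // Without the lock, sum += a[i] would race and could drop updates.
        Parallel.For(0, a.Length, delegate(int i) {
            lock (gate) { sum += a[i]; }
        });

        Console.WriteLine(sum); // prints 1000
    }
}
```

A coarse lock like this can erase the parallel speedup; the usual remedy is to accumulate into per-thread partial sums and combine them at the end, locking only once per worker.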

...more details in the article (luckily it's not long).

<http://msdn.microsoft.com/msdnmag/issues/07/10/Futures/default.aspx#contents>