Have you tried anything from the Task Parallel Library (TPL) yet?
You could parallelize one of the loops and measure whether performance improves or drops; see "In a nested loop, should Parallel.For be used on the outer or inner loop?"
For example, with this code:
// Runs as a LINQPad script; Dump() is a LINQPad extension method.
var width = 3000;
var height = 2000;
var arr = new int[width, height];

// Sequential: iterate over the lazily generated points.
var sw = Stopwatch.StartNew();
foreach (var p in PointGenerator.GeneratePoints(height, width))
{
    arr[p.X, p.Y] = 3;
}
sw.Elapsed.Dump();

// Parallel: the outer loop is split across threads inside Process.
sw = Stopwatch.StartNew();
PointGenerator.Process(height, width, p =>
{
    arr[p.X, p.Y] = 4;
});
sw.Elapsed.Dump();
where PointGenerator is defined as:
class PointGenerator
{
    // Lazily yields every (x, y) coordinate, one at a time.
    public static IEnumerable<Point> GeneratePoints(int height, int width)
    {
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
            {
                yield return new Point(x, y);
            }
        }
    }

    // Parallelizes the outer loop over x; the inner loop over y stays sequential.
    public static void Process(int height, int width, Action<Point> doSomething)
    {
        Parallel.For(0, width, new ParallelOptions { MaxDegreeOfParallelism = 4 }, x =>
        {
            for (int y = 0; y < height; y++)
            {
                doSomething(new Point(x, y));
            }
        });
    }
}
On my machine, the Process method is about 10x faster than the foreach loop over GeneratePoints (tested in LINQPad in release mode).
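Some of that difference may come from the iterator overhead in GeneratePoints rather than from parallelism alone. To separate the two, you could also time plain nested loops against a parallel outer loop. The following is only a minimal sketch as a standalone console program: the names LoopComparison, FillSequential and FillParallel are made up for illustration, and the timings will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper class, not part of the code above.
static class LoopComparison
{
    // Sequential baseline: plain nested loops, no iterator and no delegate call.
    public static void FillSequential(int[,] arr, int value)
    {
        int width = arr.GetLength(0);
        int height = arr.GetLength(1);
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
            {
                arr[x, y] = value;
            }
        }
    }

    // Parallel outer loop: each x writes only to its own elements,
    // so no locking is needed.
    public static void FillParallel(int[,] arr, int value)
    {
        int width = arr.GetLength(0);
        int height = arr.GetLength(1);
        Parallel.For(0, width, x =>
        {
            for (int y = 0; y < height; y++)
            {
                arr[x, y] = value;
            }
        });
    }

    public static void Main()
    {
        var arr = new int[3000, 2000];

        var sw = Stopwatch.StartNew();
        FillSequential(arr, 3);
        Console.WriteLine($"Sequential: {sw.Elapsed}");

        sw.Restart();
        FillParallel(arr, 4);
        Console.WriteLine($"Parallel:   {sw.Elapsed}");
    }
}
```

If the sequential baseline here turns out much faster than the foreach over GeneratePoints, the iterator accounts for a good part of the gap, and the gain from Parallel.For itself is smaller than 10x. For work this cheap per element, it is worth measuring both before committing to the parallel version.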