NativeArray.SortJob() is fast… or is it?

While I was making a framework of rendering sprites using Compute Buffers, I was thinking of what sorting algorithm to use to sort sprites (say render from top to bottom). I’m using a Burst compiled Quicksort in my mesh based sprite rendering system. This is already good but maybe I could do better. Since Unity is making it easier to write multithreaded code, there must be multithreaded sorting algorithms out there that I could use. I found lots of them but I found them hard to translate into jobs code. What I realized is that maybe Unity already has one and they do albeit it’s just partly multithreaded. I found NativeArrayExtensions.SortJob() in the DOTS forum. Like any curious programmer, I compared it with my Quicksort implementation.

Comparison code

My lazy approach for comparison is to just use one system to run both sorting algorithms and see their performance through the profiler. Here’s what it looks like:

// This is the test item that we will sort
public readonly struct TestSortEntry : IComparable<TestSortEntry> {
    public readonly int value;

    public TestSortEntry(int value) {
        this.value = value;
    }

    public int CompareTo(TestSortEntry other) {
        return this.value.CompareTo(other.value);
    }
}

// This is the system that runs both sorting algorithms
[AlwaysUpdateSystem]
public class NativeSortTestSystem : SystemBase {
    private const int ENTRIES_COUNT = 20000;

    protected override void OnUpdate() {
        this.Dependency = OnUpdate(this.Dependency);
    }

    private JobHandle OnUpdate(JobHandle inputDeps) {
        // Create the array to sort
        NativeArray<TestSortEntry> entries = new NativeArray<TestSortEntry>(ENTRIES_COUNT, Allocator.TempJob);
        
        // Populate
        JobHandle populateHandle = new PopulateJob() {
            entries = entries, random = new Random((uint) UnityEngine.Random.Range(0, uint.MaxValue))
        }.Schedule(entries.Length, 64);
        
        // We complete population because we don't want this in the sorting chain
        populateHandle.Complete();
        
        // SortJob()
        JobHandle chainHandle = ExecuteSortJob(inputDeps, entries);
        
        // Quicksort
        chainHandle = ExecuteQuickSort(chainHandle, entries);

        entries.Dispose();

        return chainHandle;
    }

    private static JobHandle ExecuteSortJob(JobHandle inputDeps, NativeArray<TestSortEntry> entries) {
        // Create copy of array to sort
        NativeArray<TestSortEntry> copy = new NativeArray<TestSortEntry>(ENTRIES_COUNT, Allocator.TempJob);
        copy.CopyFrom(entries);

        JobHandle handle = inputDeps;

        // Sort using SortJob()
        handle = NativeSortExtension.SortJob(copy, handle); 

        // Dispose
        handle = new DisposeNativeArrayJob<TestSortEntry>(copy).Schedule(handle);

        return handle;
    }

    private static JobHandle ExecuteQuickSort(JobHandle inputDeps, NativeArray<TestSortEntry> entries) {
        // Create copy of array to sort
        NativeArray<TestSortEntry> copy = new NativeArray<TestSortEntry>(ENTRIES_COUNT, Allocator.TempJob);
        copy.CopyFrom(entries);

        JobHandle handle = inputDeps;
        
        // Quicksort
        handle = new QuickSortJob() {
            entries = copy
        }.Schedule(handle);

        // Dispose
        handle = new DisposeNativeArrayJob<TestSortEntry>(copy).Schedule(handle);

        return handle;
    }

    [BurstCompile]
    private struct PopulateJob : IJobParallelFor {
        [NativeDisableParallelForRestriction]
        public NativeArray<TestSortEntry> entries;

        public Random random;
        
        public void Execute(int index) {
            this.entries[index] = new TestSortEntry(this.random.NextInt(0, 100000)); 
        }
    }
    
    [BurstCompile]
    private struct QuickSortJob : IJob {
        public NativeArray<TestSortEntry> entries;

        public void Execute() {
            if (this.entries.Length > 0) {
                Quicksort(0, this.entries.Length - 1);
            }
        }

        private void Quicksort(int left, int right) {
            int i = left;
            int j = right;
            TestSortEntry pivot = this.entries[(left + right) / 2];

            while (i <= j) {
                // Lesser
                while (Compare(this.entries[i], ref pivot) < 0) {
                    ++i;
                }

                // Greater
                while (Compare(this.entries[j], ref pivot) > 0) {
                    --j;
                }

                if (i <= j) {
                    // Swap
                    TestSortEntry temp = this.entries[i];
                    this.entries[i] = this.entries[j];
                    this.entries[j] = temp;

                    ++i;
                    --j;
                }
            }

            // Recurse
            if (left < j) {
                Quicksort(left, j);
            }

            if (i < right) {
                Quicksort(i, right);
            }
        }

        private int Compare(TestSortEntry a, ref TestSortEntry b) {
            return a.CompareTo(b);
        }
    }
}

What I do here is in every update, I generate an array with 20,000 elements of random values then sort them using SortJob() followed by Quicksort. This code automatically runs in an empty scene. This is what the profiler looks like in the editor:

The one I highlighted with total running time of 1.026ms is the SortJob(). Notice that it’s just partly multithreaded (SegmentSort). Quicksort is run in just a single thread at 1.73ms. SortJob() is indeed faster here.

However…

I made a development build to see the real deal. I wanted to see how faster SortJob() is when it’s in the build. The results are surprising:

In the sample profile here, SortJob() executes a total of 5.069ms while Quicksort runs faster at 1.28ms compared to its running time in the editor (1.73).

How could this be? I don’t know the answer yet. I’m happy to use Quicksort for our game for now. If you know a faster multithreaded sorting algorithm that’s implemented in DOTS, let me know.

That’s it for this post.

2 thoughts on “NativeArray.SortJob() is fast… or is it?

    1. Yeah. Burst is enabled as evidenced by the Burst compiled quicksort. Someone from the forum pointed out that using a generic struct in a generic method can’t be Burst compiled. This is bad UX, though, so I’m still waiting for resolution from Unity devs.

      Liked by 1 person

Leave a comment