4
\$\begingroup\$

(This post has a continuation post.)

This one is my attempt at LSD radix sort:

Code

com.github.coderodde.util.LSDRadixsort.java:

package com.github.coderodde.util;

import java.util.Arrays;

/**
 * This class provides the method for sorting {@code int} arrays using 
 * least-significant digit (LSD) radix sort.
 */
public final class LSDRadixsort {
    
    /**
     * The number of counters in the counter array.
     */
    private static final int NUMBER_OF_COUNTERS = 256;
    
    /**
     * Sorts the entire {@code int} array into ascending order.
     * 
     * @param array the array to sort. 
     */
    public static void sort(int[] array) {
        sort(array, 0, array.length);
    }
    
    /**
     * Sorts the range {@code array[fromIndex ... toIndex - 1]} into ascending
     * order.
     * 
     * @param array     the array holding the range.
     * @param fromIndex the starting, inclusive index of the sorting range.
     * @param toIndex   the ending, exclusive index of the sorting range. 
     */
    public static void sort(int[] array, int fromIndex, int toIndex) {
        checkRangeIndices(array.length,
                          fromIndex,
                          toIndex);
        
        int rangeLength = toIndex - fromIndex;
        
        if (rangeLength < 2) {
            // Trivially sorted:
            return;
        }
        
        // buffer and counterMap are allocated only once for the sake of 
        // performance:
        int[] buffer = new int[rangeLength];
        int[] counterMap = new int[NUMBER_OF_COUNTERS];
        
        // Spawn sorting:
        sortImpl(array,
                 buffer, 
                 counterMap,
                 fromIndex,
                 toIndex);
    }
    
    private static void sortImpl(int[] array,
                                 int[] buffer, 
                                 int[] counterMap,
                                 int fromIndex,
                                 int toIndex) {
        // Sort first by least-significant bytes, then by second 
        // least-significant, and finally by third least-signficant byte:
        for (int byteIndex = 0; byteIndex != 3; byteIndex++) {
            countingSortImpl(array, 
                             buffer, 
                             counterMap, 
                             byteIndex,
                             fromIndex,
                             toIndex);
        }
        
        // Deal with the signed data:
        countingSortImplSigned(array, 
                               buffer, 
                               counterMap,
                               fromIndex,
                               toIndex);
    }
    
    /**
     * Performs the counting sort on {@code array[fromIndex ... toIndex - 1]}.
     * 
     * @param array      the array to sort.
     * @param buffer     the buffer array.
     * @param counterMap the counter array. We reuse this in order not to
     *                   allocate it everytime we call this method.
     * @param byteIndex  the index of the byte that serves as the sorting key.
     * @param fromIndex  the starting, inclusive index of the sorting range.
     * @param toIndex    the ending, exclusive index of the sorting range.
     */
    private static void countingSortImpl(int[] array,
                                         int[] buffer,
                                         int[] counterMap,
                                         int byteIndex,
                                         int fromIndex,
                                         int toIndex) {
        Arrays.fill(counterMap, 0);
        
        // Count the elements:
        for (int i = fromIndex; i != toIndex; i++) {
            counterMap[extractCounterIndex(array[i], byteIndex)]++;
        }

        // Make the counter map accummulative:
        for (int i = 1; i != NUMBER_OF_COUNTERS; i++) {
            counterMap[i] += counterMap[i - 1];
        }
        
        // Build the buffer array (which will end up sorted):
        for (int i = toIndex - 1; i >= fromIndex; i--) {
            int index = extractCounterIndex(array[i], byteIndex);
            buffer[counterMap[index]-- - 1] = array[i];
        }
        
        // Just copy the buffer to the array:
        System.arraycopy(buffer, 
                         0, 
                         array,
                         fromIndex, 
                         buffer.length);
    }
    
    /**
     * Sorts the {@code array[fromIndex ... toIndex - 1]} by most significant 
     * bytes that contain the sign bits.
     * 
     * @param array      the array to sort.
     * @param buffer     the buffer array.
     * @param counterMap the counter map. We pass this array in order not to 
     *                   create it in this method.
     * @param fromIndex  the starting, inclusive index of the sorting range.
     * @param toIndex    the ending, exclusive index of the sorting range.
     */
    private static void countingSortImplSigned(int[] array,
                                               int[] buffer,
                                               int[] counterMap,
                                               int fromIndex,
                                               int toIndex) {
        Arrays.fill(counterMap, 0);
        
        // Count the elements:
        for (int i = fromIndex; i != toIndex; i++) {
            counterMap[extractCounterIndexSigned(array[i])]++;
        }

        // Make the counter map accummulative:
        for (int i = 1; i != NUMBER_OF_COUNTERS; i++) {
            counterMap[i] += counterMap[i - 1];
        }
        
        // Build the output array:
        for (int i = toIndex - 1; i >= fromIndex; i--) {
            int index = extractCounterIndexSigned(array[i]);
            buffer[counterMap[index]-- - 1] = array[i];
        }
        
        // Just copy the buffer to the array:
        System.arraycopy(buffer, 
                         0, 
                         array,
                         fromIndex, 
                         buffer.length);
    }
    
    /**
     * Extracts the counter array index from the integer datum.
     * 
     * @param datum     the integer key.
     * @param byteIndex the index of the byte of the key to consider.
     * @return          the index into counter array.
     */
    private static int extractCounterIndex(int datum, int byteIndex) {
        // Shift so that the target byte is the leftmost byte and set to zero
        // all the remaining bits:
        return (datum >>> (byteIndex * 8)) & 0xff;
    }
    
    /**
     * Extracts the counter array index from the integer datum. Considers only
     * the most significant byte that contains the sign bit. The sign bit is 
     * flipped in order to put the datum in correct location in the counter 
     * array.
     * 
     * @param datum the integer key.
     * @return      the index into counter array. 
     */
    private static int extractCounterIndexSigned(int datum) {
        // We use xor ^ operator in order to flip the bit index 7 (8th bit from
        // the least significant end):
        return (datum >>> 24) ^ 0b1000_0000;
    }
    
    /**
     * Checks that the specified sorting range is reasonable.
     * 
     * @param arrayLength the total length of the target array.
     * @param fromIndex   the starting, inclusive index of the sorting range.
     * @param toIndex     the ending, exclusive index of the sorting range.
     */
    private static void checkRangeIndices(int arrayLength, 
                                          int fromIndex,
                                          int toIndex) {
        if (fromIndex < 0) {
            throw new IllegalArgumentException(
                    String.format(
                            "fromIndex(%d) is negative. Must be at least 0.", 
                            fromIndex));
        }
        
        if (toIndex > arrayLength) {
            throw new IllegalArgumentException(
                    String.format(
                            "toIndex(%d) is too large. Must be at most %d.",
                            toIndex, 
                            arrayLength));
        }
        
        if (fromIndex > toIndex) {
            throw new IllegalArgumentException(
                    String.format(
                            "toIndex(%d) > fromIndex(%d).", 
                            toIndex,
                            fromIndex));
        }
    }
}

com.github.coderodde.util.demo.LSDRadixsortDemo.java:

package com.github.coderodde.util.demo;

import com.github.coderodde.util.LSDRadixsort;
import java.util.Arrays;
import java.util.Random;

public class LSDRadixsortDemo {
    
    private static final int LENGTH = 100_000_000;
    private static final int PREFIX_SUFFIX_EXCLUSION_RANGE_LENGTH = 50;
    
    public static void main(String[] args) {
        long seed = System.currentTimeMillis();
        Random random = new Random(seed);
        
        System.out.printf("Seed = %d.\n", seed);
        
        long startTime = System.currentTimeMillis();
        int[] array1 = createRandomIntegerArray(LENGTH, random);
        int[] array2 = array1.clone();
        long endTime = System.currentTimeMillis();
        
        int fromIndex = 
                random.nextInt(PREFIX_SUFFIX_EXCLUSION_RANGE_LENGTH + 1);
        
        int toIndex = LENGTH - random.nextInt(
                        PREFIX_SUFFIX_EXCLUSION_RANGE_LENGTH + 1);
        
        System.out.printf("Built demo arrays in %d milliseconds.\n",
                          endTime - startTime);
        
        startTime = System.currentTimeMillis();
        LSDRadixsort.sort(array1, fromIndex, toIndex);
        endTime = System.currentTimeMillis();
        long durationRadixsort = endTime - startTime;
        
        System.out.printf("LSDRadixsort took %d milliseconds.\n", 
                          durationRadixsort);
        
        startTime = System.currentTimeMillis();
        Arrays.sort(array2, fromIndex, toIndex);
        endTime = System.currentTimeMillis();
        long durationArraysSort = endTime - startTime;
        
        System.out.printf("Arrays.sort took %d milliseconds.\n", 
                          durationArraysSort);
        
        System.out.printf("Arrays agree: %b.\n",
                          Arrays.equals(array1,
                                        array2));
        
        float ratio = (float) durationRadixsort / (float) durationArraysSort;
        
        System.out.println(
                String.format(
                        "Time ratio: %.3f.\n", ratio)
                        .replace(',', '.'));
    }
    
    private static int[] createRandomIntegerArray(int length, Random random) {
        int[] array = new int[length];
        
        for (int i = 0; i < length; i++) {
            array[i] = random.nextInt();
        }
        
        return array;
    }
}

Typical output

Seed = 1710160736231.
Built demo arrays in 1437 milliseconds.
LSDRadixsort took 2885 milliseconds.
Arrays.sort took 16079 milliseconds.
Arrays agree: true.
Time ratio: 0.179.

Critique request

So how am I doing here? Anything to improve? Naming? Javadoc?

\$\endgroup\$
4
  • 1
    \$\begingroup\$ Interesting speedup, by the way, regardless of any remark on the code itself :) \$\endgroup\$ Commented Mar 12, 2024 at 12:18
  • 1
    \$\begingroup\$ Nope, I don't think I have anything to add to the code. :) Documenting the O(N) space requirement could be useful as well as some hints towards the cutoffs where other algorithms become more efficient. \$\endgroup\$ Commented Mar 14, 2024 at 5:15
  • \$\begingroup\$ @TorbenPutkonen Suomi Perkele! :D \$\endgroup\$ Commented Mar 14, 2024 at 7:09
  • \$\begingroup\$ @TorbenPutkonen On longs, the radix sort in question takes about 44% of time taken by Arrays.sort. However, on shorts, Arrays.sort is significantly faster. \$\endgroup\$ Commented Mar 14, 2024 at 8:37

1 Answer 1

3
\$\begingroup\$

In general the code uses the right spacing and variable naming.


 * This class provides the method for sorting {@code int} arrays using 
 * least-significant digit (LSD) radix sort.

No it doesn't, unless you call a byte a digit; this should be mentioned at least to future developers that need to look at the code (this might be less important to users as the functionality should remain the same even if digits are used).

I'd also indicate that this is a helper or utility class, to save users from looking for constructors or factory methods.

Possibly it would also be a good idea to indicate the order of operations required for a sort. That's the only reason to prefer it over other options it seems to me.


LSDRadixsort

It depends, but nowadays I see often that the acronyms are also in CamelCase: LsdRadixSort. LSDRadixsort is fine of course but I see the point of the proponents of using smaller caps for anything but the first letter.


private static final int NUMBER_OF_COUNTERS = 256;

To me this was just the size of a buffer, but it it seems it is the number of values within a byte. In that case you could use BYTE_VALUES or something similar. Or assign it Byte.MAX_VALUE - Byte.MIN_VALUE + 1 to indicate the bytes.


checkRangeIndices(array.length,
                      fromIndex,
                      toIndex);

The disadvantage of such a check method is that the exception appears to be coming from another method than the one requiring the parameters. This may obscure the error somewhat.

There are ways to alter the stack trace, but they are rather ugly so I won't go into them.


// Trivially sorted:


// Sort first by least-significant bytes, then by second 
// least-significant, and finally by third least-signficant byte:

To me the use of capitalization and the colon is somewhat annoying and unnecessary.


    int[] counterMap = new int[NUMBER_OF_COUNTERS];

A map is generally not an array, here I would have expected a comment about how the map is being used.


    // Spawn sorting:

Generally we associate spawning with the spawning of a process or a thread. I would not use any comment just to indicate a call.


for (int byteIndex = 0; byteIndex != 3; byteIndex++) {

Generally byteIndex < 3 is used, and I would keep to that if simply to avoid confusion. I'm seeing this at multiple places in the code.


extractCounterIndex

This is a badly named method. It is the byte / digit value that is being extracted. That it is stored in the also badly named "map" in inconsequential for the method itself.


buffer[counterMap[index]-- - 1] = array[i];

All those comments, and this hard to read piece of code that performs the magic is not explained.


There is a lot of code which is duplicated just to get around the issue of a signed byte. It should be possible to refactor that and get rid of the spurious code (mapping from [-128, -127] to [0, 255] and back again when required). As it is, if there was a bug or other fix then it might end up in one method and not in the other.

Maybe it would be an interesting idea to also be able to treat integers as unsigned.

\$\endgroup\$
1
  • 1
    \$\begingroup\$ The usual way to implement utility classes is to provide a private no-args constructor. I don't think it's something that confuses users and requires extra documentation (especially since the naming and method signatures are identical to java.util.Arrays.sort(...). \$\endgroup\$ Commented Mar 13, 2024 at 10:58

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.