What would Helman-JaJa listrank pseudocode be like? I tried looking around but all I found were "prosecode" descriptions (eg pp. 18-19 here) which I find kinda hard to follow.
-
$\begingroup$ This resource might be easier. dev.to/downey/the-helman-and-jaja-list-ranking-algorithm-4486 $\endgroup$user16034– user160342023-06-08 13:44:52 +00:00Commented Jun 8, 2023 at 13:44
-
$\begingroup$ Thanks, the illustrations are very helpful. $\endgroup$jon doyle– jon doyle2023-06-15 01:30:18 +00:00Commented Jun 15, 2023 at 1:30
1 Answer
Helman-JáJá
Helman-JáJá's is a beautiful parallel algorithm, but the original paper (if I may say so) does not do a particularly great job of communicating some of that beauty to non-experts (this, incidentally, is also my only criticism of JáJá's wonderful text on parallel algorithms). Technically, it is a prefix computation (a.k.a. running totals) algorithm, and list ranking happens to be just one such prefix computation where it might be useful.
I'll quote the original paper (pp. 43-44) and give rough, very high-level pseudocode with comments, and a super brief summary in my own words of what the step aims to do. The paper assumes that it is operating on two types, viz. List
and Sublist
. If you're planning to implement this (a great concurrent programming exercise), these could be structs or classes, depending on the language you use (I recommend Cilk, but feel free to use what you like).
Compute Processor Sums
(1) Processor $P_i (0\leq i\leq p - 1)$ visits the list elements with array indices $\frac{in}{p}$ through $\frac{(i + 1)n}{p} - 1$ in order of increasing index and computes the sum of the successor indices. [...] [The] negative successor index [that] denotes the terminal list element [...] is replaced by the value ($-s$) for future convenience. Additionally, as each element of
List
is read, the value in the successor field is preserved by copying it to an identically indexed location in the arraySucc
. The resulting sum of processor indices is stored in location $i$ of the array $Z$.
Summary: Partition the list evenly among the processors and let each processor compute the sum of successors. The summation is basically a reduction; tons of scope for exploiting parallelism if you know the right algorithms. The following part (ignoring the negative successor index and replacing it with $-s$, the array $Z$) is something you may find useful as implementation detail.
Pseudocode:
parforeach processor i operating on its partition of elements j:
Z[i] <- Σj
copy the successor of j into Succ, aligned with List by index
Find the Head
(2) Processor $P_0$ computes the sum $T$ of the $p$ values in the array $Z$. The index of the head of the list is then $h = (\frac{1}{2}n(n - 1)) - T$
Summary: Reduce the processor-sums array to a single sum. The HJ paper treats this as a serial step executed only by the master thread/processor 0, but this needn't be. The formula finds the head, given the (+)reduction of the entire array (the paper covers why on p. 42, but the short version is that the head of the entire list is no one's successor, so it will be the only element 'missing' from the sum of all successors from $0$ to $n - 1$, which is given by $\frac{1}{2}n(n - 1)$).
Pseudocode:
processor 0:
T <- ΣZ[i]
h <- n(n - 1)/2 - T
Partition the List
(3) For $j=\frac{is}{p}$ up to $\frac{(i + 1)s}{p} - 1$, processor $P_i$ randomly chooses a location $x$ from the block of list elements with indices $(j - 1)\frac{n}{(s - 1)}$ through $j\frac{n}{(s - 1)} - 1$ as a splitter which defines the head of a sublist in
List
(processor $P_0$ chooses the head of the list as its first splitter). This is recorded by settingSublists[j].head
to $x$. Additionally, the value ofList[x].successor
is copied toSublists[j].scratch
, after whichList[x].successor
is replaced with the value ($-j$) to denote both the beginning of a new sublist and the index of the record inSublists
which corresponds to its sublist.
Summary: Pick random indices to split the list into $s$ sublists. Everything else in the paragraph is either mathematical formulations for the indices in each partition, or how you could encode the information you would need for the subsequent steps.
Pseudocode:
parforeach processor i:
if (i = 0):
x <- head -- x is the splitter
else:
x <- random index in the processor's partition
Record the head of the sublist and its immediate successor
Compute Local Prefixes
(4) For $j=\frac{is}{p}$ up to $\frac{(i + 1)s}{p} - 1$, processor $P_i$ traverses the elements in the sublist which begins with
Sublists[j].head
and ends at the next element which has been chosen as a splitter (as evidenced by a negative value in the successor field). For each element traversed with index $x$ and predecessor $pre$ (excluding the first element in the sublist), we setList[x].successor = -j
to record the index of the record inSublists
which corresponds to that sublist. Additionally, we record the prefix value of that element within its sublist by settingList[x].prefix_data = List[x]prefix_data ⊗ List[pre].prefix_data
. Finally, if $x$ is also the last element in the sublist (but not the last element in the list) and $k$ is the index of the record inSublists
which corresponds to the successor of $x$, then we also setSublists[j].successor = k
andSublists[k].prefix_data = List[x].prefix_data
. Finally, the prefix_data field ofSublists[0]
, which corresponds to the sublist at the head of the list is set to the prefix operator identity.
Summary: Compute the local running totals within each sublist. The rest of this long paragraph is bookkeeping to record the sublist prefix sums and track how the sublists are arranged (we're going to use it momentarily).
Pseudocode:
parforeach processor i:
Compute the local prefix sum within i's partition of the list
Store the local prefix at the next head
Mark the next sublist as the ith partition's successor
Order the Heads
(5) Beginning at the head, processor $P_0$ traverses the records in the array
Sublists
by following the successor pointers from the head atSublists[0]
. For each record traversed with index $j$ and predecessor $pre$, we compute the prefix value by settingSublists[j].prefix_data = Sublists[j].prefix_data ⊗ Sublists[pre].prefix_data
.
Summary: Traverse from head to head, computing the prefix sums of the stored prefix sums. In effect, this gives a relative prefix of the heads. The bookkeeping we'd done before helps us identify 'successor-heads' for this step.
Pseudocode:
processor 0:
j <- the head of the second sublist
pre <- head -- Sublists[0] is the head of the overall list
while(sublists remain):
Sublists[j].prefix_data <- Sublists[pre].prefix_data ⊗ Sublists[j].prefix_data
pre <- j
j <- next head after j
Compute All Prefix Sums
(6) Processor $P_i$ visits the list elements with array indices $\frac{in}{p}$ through $\frac{(i + 1)n}{p} - 1$ in order of increasing index and completes the prefix computation for each list element $x$ by setting
List[x].prefix_data = List[x].prefix_data ⊗ Sublists[-(List[x].successor)].prefix_data
. Additionally, as each element ofList
is read, the value in thesuccessor
field is replaced with the identically indexed element in the arraySucc
.
Summary: Revisit all the sublists, adding the relative prefix of the head to each element. To use the list ranking example, if something was ranked 2 in its sublist, and its head (0 locally) got a relative rank of 5, add 5 to every element of this sublist to get their 'global' ranks (in the big list). Any of our bookkeeping that didn't help us in step (5) become useful in step (6), with -(List[x].successor)
serving as an indicator of sublist membership. The last line is about restoring the 'true' successors, since we mutated List[i].successor
.
Pseudocode:
parforeach processor i:
foreach element x in its partition:
List[x].prefix_data <- List[x].prefix_data ⊗ Sublists[-(List[x].successor)].prefix_data -- Add the head's relative rank
List[x].successor <- Succ[x] -- restore the true successors
If you're reading this carefully, and especially if you've done concurrent programming before, you might have spotted an excellent opportunity to execute the last step as a massively-parallel operation.