Parallel Algorithm Pseudocode: Helman-JaJa Listrank

Question

What would Helman-JaJa listrank pseudocode be like? I tried looking around but all I found were "prosecode" descriptions (eg pp. 18-19 here) which I find kinda hard to follow.

This resource might be easier. dev.to/downey/the-helman-and-jaja-list-ranking-algorithm-4486 — user16034
– user16034, Commented Jun 8, 2023 at 13:44

Nathan Davis · Accepted Answer · 2023-06-12 05:30:46Z

Helman-JáJá

Helman-JáJá's is a beautiful parallel algorithm, but the original paper (if I may say so) does not do a particularly great job of communicating some of that beauty to non-experts (this, incidentally, is also my only criticism of JáJá's wonderful text on parallel algorithms). Technically, it is a prefix computation (a.k.a. running totals) algorithm, and list ranking happens to be just one such prefix computation where it might be useful.

I'll quote the original paper (pp. 43-44) and give rough, very high-level pseudocode with comments, and a super brief summary in my own words of what the step aims to do. The paper assumes that it is operating on two types, viz. List and Sublist. If you're planning to implement this (a great concurrent programming exercise), these could be structs or classes, depending on the language you use (I recommend Cilk, but feel free to use what you like).

Compute Processor Sums

(1) Processor $P_i (0\leq i\leq p - 1)$ visits the list elements with array indices $\frac{in}{p}$ through $\frac{(i + 1)n}{p} - 1$ in order of increasing index and computes the sum of the successor indices. [...] [The] negative successor index [that] denotes the terminal list element [...] is replaced by the value ($-s$) for future convenience. Additionally, as each element of List is read, the value in the successor field is preserved by copying it to an identically indexed location in the array Succ. The resulting sum of processor indices is stored in location $i$ of the array $Z$.

Summary: Partition the list evenly among the processors and let each processor compute the sum of successors. The summation is basically a reduction; tons of scope for exploiting parallelism if you know the right algorithms. The following part (ignoring the negative successor index and replacing it with $-s$, the array $Z$) is something you may find useful as implementation detail.

Pseudocode:

parforeach processor i operating on its partition of elements j:
  Z[i] <- Σj
  copy the successor of j into Succ, aligned with List by index

Find the Head

(2) Processor $P_0$ computes the sum $T$ of the $p$ values in the array $Z$. The index of the head of the list is then $h = (\frac{1}{2}n(n - 1)) - T$

Summary: Reduce the processor-sums array to a single sum. The HJ paper treats this as a serial step executed only by the master thread/processor 0, but this needn't be. The formula finds the head, given the (+)reduction of the entire array (the paper covers why on p. 42, but the short version is that the head of the entire list is no one's successor, so it will be the only element 'missing' from the sum of all successors from $0$ to $n - 1$, which is given by $\frac{1}{2}n(n - 1)$).

Pseudocode:

processor 0:
  T <- ΣZ[i]
  h <- n(n - 1)/2 - T

Partition the List

(3) For $j=\frac{is}{p}$ up to $\frac{(i + 1)s}{p} - 1$, processor $P_i$ randomly chooses a location $x$ from the block of list elements with indices $(j - 1)\frac{n}{(s - 1)}$ through $j\frac{n}{(s - 1)} - 1$ as a splitter which defines the head of a sublist in List (processor $P_0$ chooses the head of the list as its first splitter). This is recorded by setting Sublists[j].head to $x$. Additionally, the value of List[x].successor is copied to Sublists[j].scratch, after which List[x].successor is replaced with the value ($-j$) to denote both the beginning of a new sublist and the index of the record in Sublists which corresponds to its sublist.

Summary: Pick random indices to split the list into $s$ sublists. Everything else in the paragraph is either mathematical formulations for the indices in each partition, or how you could encode the information you would need for the subsequent steps.

Pseudocode:

parforeach processor i:
  if (i = 0):
    x <- head -- x is the splitter
  else:
    x <- random index in the processor's partition
  Record the head of the sublist and its immediate successor

Compute Local Prefixes

(4) For $j=\frac{is}{p}$ up to $\frac{(i + 1)s}{p} - 1$, processor $P_i$ traverses the elements in the sublist which begins with Sublists[j].head and ends at the next element which has been chosen as a splitter (as evidenced by a negative value in the successor field). For each element traversed with index $x$ and predecessor $pre$ (excluding the first element in the sublist), we set List[x].successor = -j to record the index of the record in Sublists which corresponds to that sublist. Additionally, we record the prefix value of that element within its sublist by setting List[x].prefix_data = List[x]prefix_data ⊗ List[pre].prefix_data. Finally, if $x$ is also the last element in the sublist (but not the last element in the list) and $k$ is the index of the record in Sublists which corresponds to the successor of $x$, then we also set Sublists[j].successor = k and Sublists[k].prefix_data = List[x].prefix_data. Finally, the prefix_data field of Sublists[0], which corresponds to the sublist at the head of the list is set to the prefix operator identity.

Summary: Compute the local running totals within each sublist. The rest of this long paragraph is bookkeeping to record the sublist prefix sums and track how the sublists are arranged (we're going to use it momentarily).

Pseudocode:

parforeach processor i:
  Compute the local prefix sum within i's partition of the list
  Store the local prefix at the next head
  Mark the next sublist as the ith partition's successor

Order the Heads

(5) Beginning at the head, processor $P_0$ traverses the records in the array Sublists by following the successor pointers from the head at Sublists[0]. For each record traversed with index $j$ and predecessor $pre$, we compute the prefix value by setting Sublists[j].prefix_data = Sublists[j].prefix_data ⊗ Sublists[pre].prefix_data.

Summary: Traverse from head to head, computing the prefix sums of the stored prefix sums. In effect, this gives a relative prefix of the heads. The bookkeeping we'd done before helps us identify 'successor-heads' for this step.

Pseudocode:

processor 0:
  j <- the head of the second sublist
  pre <- head -- Sublists[0] is the head of the overall list
  while(sublists remain):
    Sublists[j].prefix_data <- Sublists[pre].prefix_data ⊗ Sublists[j].prefix_data
    pre <- j
    j <- next head after j

Compute All Prefix Sums

(6) Processor $P_i$ visits the list elements with array indices $\frac{in}{p}$ through $\frac{(i + 1)n}{p} - 1$ in order of increasing index and completes the prefix computation for each list element $x$ by setting List[x].prefix_data = List[x].prefix_data ⊗ Sublists[-(List[x].successor)].prefix_data. Additionally, as each element of List is read, the value in the successor field is replaced with the identically indexed element in the array Succ.

Summary: Revisit all the sublists, adding the relative prefix of the head to each element. To use the list ranking example, if something was ranked 2 in its sublist, and its head (0 locally) got a relative rank of 5, add 5 to every element of this sublist to get their 'global' ranks (in the big list). Any of our bookkeeping that didn't help us in step (5) become useful in step (6), with -(List[x].successor) serving as an indicator of sublist membership. The last line is about restoring the 'true' successors, since we mutated List[i].successor.

Pseudocode:

parforeach processor i:
  foreach element x in its partition:
    List[x].prefix_data <- List[x].prefix_data ⊗ Sublists[-(List[x].successor)].prefix_data -- Add the head's relative rank
    List[x].successor <- Succ[x] -- restore the true successors

If you're reading this carefully, and especially if you've done concurrent programming before, you might have spotted an excellent opportunity to execute the last step as a massively-parallel operation.

Stack Exchange Network

Parallel Algorithm Pseudocode: Helman-JaJa Listrank

1 Answer 1

Helman-JáJá

Compute Processor Sums

Find the Head

Partition the List

Compute Local Prefixes

Order the Heads

Compute All Prefix Sums

Hot Network Questions

Parallel Algorithm Pseudocode: Helman-JaJa Listrank

1 Answer 1

Helman-JáJá

Compute Processor Sums

Find the Head

Partition the List

Compute Local Prefixes

Order the Heads

Compute All Prefix Sums

Related

Hot Network Questions