9
\$\begingroup\$

Newbie here, presenting the code for close public scrutiny.

Couple remarks regarding the context (taken from Princeton's COS 226 course):

  • In this problem I model a percolation system using an N-by-N grid of sites. Each site could be either open or blocked (all of them are blocked initially). To open the site, client should call open(row, col) method.
  • There is a notion of full site. "A full site is an open site that can be connected to an open site in the top row via a chain of neighboring (left, right, up, down) open sites [1]"
  • The system percolates, when there is a full site in the bottom row.

Below you can find an image of 2 percolation systems:

  • Now the goal is to perform efficient percolates() queries (boolean method that checks whether system percolates or not), using the UnionFind algorithm (Brute-force solution will take quadratic time, which is not scalable for huge systems).

Additional materials that might be useful:

[1] Reference: https://www.cs.princeton.edu/courses/archive/fall15/cos226/assignments/percolation.html

Below goes my implementation of Percolation:

Percolation.java

package percolation;

import edu.princeton.cs.algs4.WeightedQuickUnionUF;

/**
 * Modeling a percolation system using an N-by-N grid of sites. 
 * Each site is either open or blocked. A full site is an open site 
 * that can be connected to an open site in the top row via a chain of 
 * neighboring (left, right, up, down) open sites. 
 * The system percolates if there is a full site in the bottom row. 
 * In other words, a system percolates if we fill all open sites 
 * connected to the top row and that process fills some open site on the bottom row.
 * <p>
 * Application in practice: <a href="http://www.geoffreylandis.com/percolation.htp">The Fermi Paradox: An Approach Based on Percolation Theory</a>  </p>
 * <p> 
 * Code for Percolation Visualization can be found at: <a href="http://www.cs.princeton.edu/courses/archive/fall16/cos226/checklist/percolation.html">Princeton COS 226</a>
 * </p>
 */
public class Percolation {
    private final boolean[][] open;
    private final boolean[][] full;
    private int n;
    private final int N;
    private final WeightedQuickUnionUF uf;

    private final int top;      // virtual top-site
    private final int bottom;   // virtual bottom-site

    /* Rep Invariant
     *      open, full, uf != null
     *      open.length == full.length
     *      n >= 0      
     *      N > 0
     *      top != bottom 
     * Abstraction Function
     *      AF(N) represents a percolation system with N^2 sites in it (N-by-N grid).
     * Safety Exposure Argument
     *      All fields are private. All fields except n are final. 
     *      All fields except open, full and uf are immutable; 
     *      however they are not being shared with clients. 
     */

    /**
     * Creates N-by-N grid, with all sites initially blocked.
     * @param N dimension of the grid. Should be greater than 0.
     * @throws IllegalArgumentException if N <= 0
     */
    public Percolation(int N) {
        if (N <= 0) throw new IllegalArgumentException("N must be more than zero");

        this.open = new boolean[N][N];
        this.full = new boolean[N][N];
        this.n = 0;
        this.N = N;
        this.uf = new WeightedQuickUnionUF(N*N + 2);    // to allow top & bottom virtual sites
        this.top = N*N;                                 // top virtual site
        this.bottom = N*N + 1;                          // bottom virtual site

        for (int i = 0 ; i < N; i++) {      
            if (i != N * (N - 1) + i) {         // avoids bug in 1x1 case
                uf.union(top, i);
                uf.union(bottom, N * (N - 1) + i);  
            }
        }
        checkRep();         // checking rep invariant, assertions should be enabled!
    }

    /**
     * Opens the site (row, col) on the grid. This method has no effect
     * if the site is already open.
     * @param row row of the site to be opened
     * @param col column of the site to be opened
     * @throws IllegalArgumentException if row or col is < 0 or >= N
     */
    public void open(int row, int col) {
        if (row < 0 || row >= N) throw new IllegalArgumentException("Illegal value for row: must be between 0 and N - 1");
        if (col < 0 || col >= N) throw new IllegalArgumentException("Illegal value for col: must be between 0 and N - 1");

        if (!isOpen(row, col)) {
            open[row][col] = true;
            n++;
        }
        else return;        // to avoid unnecessary computations

        if (row == 0) {
            checkAdjacent(full, open, row, col);
            full[row][col] = true;
        }

        // filling in
        if (row + 1 < N  && isFull(row + 1, col)) checkAdjacent(full, open, row, col);
        if (col + 1 < N  && isFull(row, col + 1)) checkAdjacent(full, open, row, col);
        if (col - 1 >= 0 && isFull(row, col - 1)) checkAdjacent(full, open, row, col);
        if (row - 1 >= 0 && isFull(row - 1, col)) checkAdjacent(full, open, row, col);

        // merging
        if (row + 1 < N  && isOpen(row + 1, col)) uf.union(row * N + col, (row + 1) * N + col);
        if (col + 1 < N  && isOpen(row, col + 1)) uf.union(row * N + col, row * N + col + 1);
        if (col - 1 >= 0 && isOpen(row, col - 1)) uf.union(row * N + col, row * N + col - 1);
        if (row - 1 >= 0 && isOpen(row - 1, col)) uf.union(row * N + col, (row - 1) * N + col);
        if (N == 1)  {              // for 1x1 case
            uf.union(bottom, 0);
            uf.union(top, 0);
        }
        checkRep();   // checking rep invariant, assertions should be enabled!
    }

    // Recursively checks adjacent sites to be filled
    private void checkAdjacent(boolean[][] full, boolean[][] open, int row, int col) {
        if (row < 0 || row >= N) return;
        if (col < 0 || col >= N) return;
        if (!isOpen(row, col)) return;
        if (isFull(row, col)) return;

        full[row][col] = true;

        checkAdjacent(full, open, row + 1, col);
        checkAdjacent(full, open, row, col + 1);
        checkAdjacent(full, open, row, col - 1);
        checkAdjacent(full, open, row - 1, col);
    }

    /**
     * Checks whether the site (row, col) is open.
     * @param row row of the site
     * @param col column of the site
     * @return true iff, the site is open
     * @throws IllegalArgumentException if row or col is < 0 or >= N
     */
    public boolean isOpen(int row, int col) {
        if (row < 0 || row >= N) throw new IllegalArgumentException("Illegal value for row: must be between 0 and N - 1");
        if (col < 0 || col >= N) throw new IllegalArgumentException("Illegal value for col: must be between 0 and N - 1");
        return open[row][col];
    }

    /**
     * Checks whether the site (row, col) is full.
     * @param row row of the site
     * @param col column of the site
     * @return true iff, the site is full
     * @throws IllegalArgumentException if row or col is < 0 or >= N
     */
    public boolean isFull(int row, int col) {
        if (row < 0 || row >= N) throw new IllegalArgumentException("Illegal value for row: must be between 0 and N - 1");
        if (col < 0 || col >= N) throw new IllegalArgumentException("Illegal value for col: must be between 0 and N - 1");
        return full[row][col];
    }

    /**
     * @return the number of opened sites.
     */
    public int numberOfOpenSites() {
         return n;
    }

    /**
     * Checks whether system percolates or not. This method takes constant time.
     * @return true iff the system percolates 
     * (i.e possible to reach bottom row from top row, 
     * with the help of opened sites).
     */
    public boolean percolates() {
        return uf.connected(top, bottom);
    }

    // Note to self: checkRep ought to be called at every operation
    // that creates or mutates rep
    /**
     * Checks whether the rep invariant is being preserved. 
     * Assertions should be enabled.
     */
    private void checkRep() {
        assert open != null;
        assert full != null;
        assert uf != null;
        assert open.length == full.length;
        assert n >= 0;
        assert N > 0;
        assert top != bottom;
    }
}

And below goes the implementation of JUnit test case for Percolation

PercolationTest.java

package percolation;

import static org.junit.Assert.*;

import org.junit.Test;

public class PercolationTest {

    // check if assertions are on
    @Test(expected = AssertionError.class)
    public void testAssertionsEnabled() {
        assert false;
    }

    @Test
    public void emptyPercolationTest() {
        Percolation perc = new Percolation(5);
        for (int i = 0; i < 5; i++) {
            for (int j = 0; j < 5; j++) {
                assertFalse(perc.isOpen(i, j));
                assertFalse(perc.isFull(i, j));
            }
        }
        assertFalse(perc.percolates());
        assertEquals(0, perc.numberOfOpenSites());
    }

    @Test
    public void tinyPercolationTest() {
        Percolation perc = new Percolation(1);

        assertFalse(perc.isOpen(0, 0));
        assertFalse(perc.isFull(0, 0));
        assertFalse(perc.percolates());
        assertEquals(0, perc.numberOfOpenSites());

        perc.open(0, 0);
        assertTrue(perc.isOpen(0, 0));
        assertTrue(perc.isFull(0, 0));
        assertTrue(perc.percolates());
        assertEquals(1, perc.numberOfOpenSites());
    }

    @Test
    public void basicPercolationTest() {
        Percolation perc = new Percolation(3);

        perc.open(0, 0);
        assertTrue(perc.isOpen(0, 0));
        assertTrue(perc.isFull(0, 0));
        assertFalse(perc.percolates());

        perc.open(1, 0);
        assertTrue(perc.isOpen(1, 0));
        assertTrue(perc.isFull(1, 0));
        assertFalse(perc.percolates());

        perc.open(1, 1);
        assertTrue(perc.isOpen(1, 1));
        assertTrue(perc.isFull(1, 1));
        assertFalse(perc.percolates());

        perc.open(2, 2);
        assertTrue(perc.isOpen(2, 2));
        assertFalse(perc.isFull(2, 2));
        assertFalse(perc.percolates());

        perc.open(2, 0);
        assertTrue(perc.isOpen(2, 0));
        assertTrue(perc.isFull(2, 0));
        assertTrue(perc.percolates());

        assertEquals(5, perc.numberOfOpenSites());
    }

    // common problem
    @Test
    public void backwashPercolationTest() {
        Percolation perc = new Percolation(4);

        perc.open(0, 0);
        perc.open(1, 0);
        perc.open(2, 0);

        // sites not intended to be full, added before model percolates
        perc.open(2, 2);
        perc.open(3, 2);

        perc.open(3, 0);
        assertTrue(perc.percolates());

        assertFalse(perc.isFull(2, 2));
        assertFalse(perc.isFull(3, 2));

        // sites not intended to be full, added after model percolates
        perc.open(1, 2);
        perc.open(1, 3);
        perc.open(3, 3);

        assertFalse(perc.isFull(1, 2));
        assertFalse(perc.isFull(1, 3));
        assertFalse(perc.isFull(3, 3));
    }

    @Test
    public void sameNumberOfOpenSitesTest() {
        Percolation perc = new Percolation(3);
        perc.open(0, 0);
        perc.open(0, 0);
        assertEquals(1, perc.numberOfOpenSites());
    }
}

Any constructive feedback is very much welcome!

\$\endgroup\$
14
  • \$\begingroup\$ If "all of them are blocked initially", what causes them to become open? Is my assumption correct that each system to be looked at is basically a dataset that defines which sites are open? How to traverse a graph of nodes ("sites" in your code) is a problem often also referred to as [tag: pathfinding], which is why I suggested an edit to add this tag. It's somewhat well studied in the fields of game development and robotics, where the term "percolation" might not be associated with it. \$\endgroup\$ Commented Sep 25, 2016 at 11:03
  • \$\begingroup\$ @I'lladdcommentstomorrow well, not exactly (my bad). There is an open(row, col) method that opens specified site at position row, col. Think of it like this: there is a fluid (blue boxes) that flows from top to bottom via open sites (starts from 0th row and can only flow through the open sites). As soon as fluid reaches the bottom site(s) we say that system percolates. There is a video that will help you visualize the process, and note that he opens sites randomly: youtube.com/watch?v=xUWuZjadbbQ So my question to you is: can this still be considered as a pathfinding problem? \$\endgroup\$ Commented Sep 25, 2016 at 11:24
  • \$\begingroup\$ @I'lladdcommentstomorrow btw, you might find this page useful: cs.princeton.edu/courses/archive/fall15/cos226/assignments/… \$\endgroup\$ Commented Sep 25, 2016 at 11:45
  • 3
    \$\begingroup\$ Thanks for the clarification. I think it can still be considered a pathfinding problem. Advanced questions in this field deal with changing environments, too. Say you want to walk through a busy place like times square: gaps between the people do open and close all the time. In your case, the gaps can only open. At the very least, it's a conceptually related problem. \$\endgroup\$ Commented Sep 25, 2016 at 13:14
  • 1
    \$\begingroup\$ PS: connected() and hence percolates() take time proportional to log2 N + log2 M where N and M are the number of connected nodes to top and bottom nodes \$\endgroup\$ Commented Sep 26, 2016 at 16:23

1 Answer 1

1
\$\begingroup\$

I won't go into the logic, just the format and basic Java & OOP issues.

First, good defensive practices, throwing exceptions when input is illegal. Good reflex, but don't overdo it.


Let's look at the class definition:

public class Percolation {
    private final boolean[][] open;
    private final boolean[][] full;
    private int n;
    private final int N;
  1. Having a open and a full variable is:
    • confusing: It opens the possibility of having a cell be full, but NOT open. You don't have 4 possibilites, only 3, and your code should reflect that.
    • error-prone: You might mistakenly access open[i-1, j] and then full[i, j-1] and not see the error.
    • Wastefull: You will make two aray accesses every time

This is an ideal use-case for an enum:

public class Percolation {
    private static enum Cell {CLOSED, EMPTY, FULL}
    private final Cell[][] field;
  1. Next bit of class definition is this:
    private int n;
    private final int N;

The two "n" definitions are highly confusing. "n" is accepted naming for a size variable or a counter, but if you use two, you defeat the consensus. Make it clear the non-final one is a counter:

private int count;
private final int N;
  1. Finally, for the naming:
    private final WeightedQuickUnionUF uf;

"uf" is a badly choosen name. It doesn't take much to make it unionFind. Keystrokes don't count, readability matters!


Now let's look at the open method.

It should at least return at least a success result! To make sense to the user, and allow the reader to understand the 'why' without even needing to comment on stuff:

public boolean open(int row, int col) {

Now this bit of code isn't very idiomatic:

if (!isOpen(row, col)) {
    open[row][col] = true;
    n++;
}
else return;        // to avoid unnecessary computations

It should be (without needing to explain in comments):

if (isOpen(row, col)) {
    return false;
}       
open[row][col] = true;
n++;

Make life easy when throwing exceptions (and use braces) by repeating the input: the caller might not be conscious he/she sent the wrong value:

if (row < 0 || row >= N) {
    throw new IllegalArgumentException("Illegal value for row: must be between 0 and " + (N - 1) + ", but was " + row);
}

The checkAdjacentmethod doesn't need all this parameter passing, because it is a non-static method it already has the fulland openfields accessible to it:

private void checkAdjacent(int row, int col) {

Also it is badly named for a method with side-effects. Consider naming it propagate because it changes the full status as it goes!


Too much checking going on

isOpen and isFull already check range. So don't check it anew before calling it:

    if (row + 1 < N  && isFull(row + 1, col)) checkAdjacent(full, open, row, col);

Becomes:

if (isFull(row, col + 1)) {
     checkAdjacent(full, open, row, col);
}

BUT you'd need to let that method return false when out of bounds isntead of raising the exception. Then make it private and make it be isEmptyOrOOB. I strongly encourage you to do something similar.

Also the whole 4-way repetition is calling the same method passing the same parameters, so feels like duplicate (well, four-plicate):

    if (isFull(row + 1, col)) checkAdjacent(full, open, row, col);
    if (isFull(row, col + 1)) checkAdjacent(full, open, row, col);
    if (isFull(row, col - 1)) checkAdjacent(full, open, row, col);
    if (isFull(row - 1, col)) checkAdjacent(full, open, row, col);

Could probably be replaced by:

if (isFull(row + 1, col) || isFull(row, col + 1) || isFull(row, col - 1)) || isFull(row - 1, col)){
    checkAdjacent(full, open, row, col);
}

I have no idea what uf.union does (didn't read external link) but it might benefit frm the same treatment. At least the multi bound-checking before it must go.


The getter function numberOfOpenSites() should be named getNumberOfOpenSites()


CheckRep is very wrong:

private void checkRep() {
    assert open != null;
    assert full != null;
    assert uf != null;
    assert open.length == full.length;
    assert n >= 0;
    assert N > 0;
    assert top != bottom;
}

Assertions must be put in place, not grouped in a method. When they will be disabled (no one bothers enabling it) they cost nothing, but the empty method call becomes useless.

All assertions are useless: your open, full, N and uf variables are final and can never be null (N can never be negative). In general no need to check final variables non-nullity for invariance.

I don't understand what topand bottomstand for, but they are final as well, and can never become equal.

In general, invariance checking is not checking that stuff that are final ever change, it is that some construct emerging from changing variables, still holds. I don't immediately see any invariant worth checking.


I did not review the Unit tests.

\$\endgroup\$
3
  • \$\begingroup\$ Thanks a lot for your insights, I have revised my code. Few remarks: isEmpty() and isFull() should remain public, otherwise it would break several clients that depend on them. Do you think representing full[][] and open[][] as 2d byte array instead of enums worth the effort (readability vs memory consumption)? Thanks! \$\endgroup\$ Commented Oct 27, 2016 at 19:26
  • 1
    \$\begingroup\$ Sure, keep isFull and isEmpty in the public API, but make sure to use a better version internally. For performance, you already chose Java for extendability and ease of use over C. The same applied, the perf will be largely unchanged, but the added expressiveness will let you leverage some kickass External APIs much more easily. Often, chosing slower but more precise implementations usually lets you implement large scale optimisations that make up for it. Plus memory is dirt cheap these days. \$\endgroup\$ Commented Oct 27, 2016 at 21:54
  • \$\begingroup\$ Finally IIRC enums are implemented as int so it might just be a win-win \$\endgroup\$ Commented Oct 27, 2016 at 21:56

You must log in to answer this question.