0

when I solve the Subset Sum problem or "P = NP" it takes 5 minutes with the following code. I am really curious to know how much faster it would be if I were using .parallelStream. however I don't understand how to convert the code.

public class MainActivity {
final static Integer[] POPS = {8897109, 12828837, 9461105, 6371773, 5965343, 5946800, 5582170, 5564635, 5268860, 4552402, 4335391, 4296250, 4224851, 4192887, 3439809, 3279833, 3095313, 2812896, 2783243, 2710489, 2543482, 2356285, 2226009, 2149127, 2142508, 2134411};
final static int TOTAL = 100000000;
/**
 * @param args
 */
public static void main(String[] args) {
    Combinations c = new Combinations(POPS, TOTAL);
    c.chooser();
}

}

import java.util.ArrayList;
import org.paukov.combinatorics.Factory;
import org.paukov.combinatorics.Generator;
import org.paukov.combinatorics.ICombinatoricsVector;


public class Combinations {
private Integer[] POPS;
private int TOTAL;

public combinations(Integer[] pops, int total){
    this.POPS = pops;
    this.TOTAL = total;
}

public void chooser(){
    for(int i = 1; i<=POPS.length; i++){
        System.out.println(i);
        ICombinatoricsVector<Integer> initialVector = Factory.createVector(POPS);
        Generator<Integer> gen = Factory.createSimpleCombinationGenerator(initialVector, i);
        for (ICombinatoricsVector<Integer> combination : gen) {
            String temp = combination.toString();
            int size = temp.indexOf("size");
            temp = temp.substring(22, size-3);
            int sum = Adder(temp);
            if (sum == TOTAL){
                System.out.println(temp);
            }
        }
    }
}

public int adder(String combos){
    int total = 0;
    String[] parts = combos.split(", ");
    ArrayList<Integer> nums = new ArrayList<>();
    for(int i = 0; i<parts.length; i++){
        nums.add(Integer.parseInt(parts[i]));
    }
    for(int temp : nums){
        total += temp;
    }

    return total;
}
}

Here is the code with the string stuff taken out. It only takes about 15 seconds now. I realize that .parallelStream() will not reduce its time much, but could someone still at least give me some hints on how to do it.

import java.util.ArrayList;
import java.util.List;

import org.paukov.combinatorics.Factory;
import org.paukov.combinatorics.Generator;
import org.paukov.combinatorics.ICombinatoricsVector;


public class Combinations {
private Integer[] POPS;
private int TOTAL;

public Combinations(Integer[] pops, int total){
    this.POPS = pops;
    this.TOTAL = total;
}

public void chooser(){
    for(int i = 1; i<=POPS.length; i++){
        System.out.println(i);
        ICombinatoricsVector<Integer> initialVector = Factory.createVector(POPS);
        Generator<Integer> gen = Factory.createSimpleCombinationGenerator(initialVector, i);
        for (ICombinatoricsVector<Integer> combination : gen) {
            List<Integer> temp = combination.getVector();
            int sum = adder(temp);
            if (sum == TOTAL){
                System.out.println(temp);
            }
        }
    }
}

public int adder(List<Integer> combos){
    int total = 0;
    for(Integer temp : combos){
        total+=temp;
    }
    return total;
}
}
9
  • 2
    Maybe you should try to get rid of all these conversions from and to String first. Then it might turn out that the actual operation you are performing (summing some integers) is way too simple to be accelerated by parallel computing. Commented Aug 18, 2014 at 17:10
  • you might be right since I didn't find all the combinations myself. I used this object ICombinatoricsVector<Integer> combination if I could code that part with parallelStream it would probably make a big difference. Commented Aug 18, 2014 at 17:17
  • 1
    It would also be helpful if you included code that invoked the algorithm shown here, so we could run it ourselves and see what's taking 5 minutes. Also, please follow Java code conventions of capitalizing class names and starting method names with initial lower case. Commented Aug 18, 2014 at 23:21
  • @StuartMarks I believe I have made all the convention changes and added the main method. I would have private messaged you this if there was a way. Commented Aug 19, 2014 at 2:20
  • No need for a private message. I think it's good to have a public comment thread about how to improve the question; it helps others. In that vein, it would have been convenient if you had linked to the library you're using. Is it Combinatoricslib? More importantly, you should take Holger's advice and avoid string processing. That library provides the ability to get values from the vectors; use that instead of stringifying the vector and parsing the result. You'll still have boxing/unboxing overhead though. Commented Aug 19, 2014 at 9:26

1 Answer 1

2

This runs about three times as fast on my box (i7-2600 with hyper-threading, 8 virtual cores):

public class Combinations {
    private Integer[] POPS;
    private int TOTAL;

    public Combinations(Integer[] pops, int total) {
        this.POPS = pops;
        this.TOTAL = total;
    }

    public void chooser() {
        ICombinatoricsVector<Integer> initialVector = Factory.createVector(POPS);

        IntStream.range(1, POPS.length + 1).parallel()
                .peek(System.out::println)
                .mapToObj(i -> Factory.createSimpleCombinationGenerator(initialVector, i))
                .flatMap(gen -> genToStream(gen, false)
                        .map(ICombinatoricsVector::getVector)
                        .filter(v -> adder(v) == TOTAL))
                .forEach(System.out::println);
    }

    public static int adder(Iterable<Integer> combos) {
        int total = 0;
        for (Integer temp : combos) {
            total += temp;
        }
        return total;
    }

    public static <E> Stream<ICombinatoricsVector<E>> genToStream(Generator<E> gen, boolean parallel) {
        return StreamSupport.stream(Spliterators.spliterator(gen.iterator(),
                gen.getNumberOfGeneratedObjects(), Spliterator.ORDERED), parallel);
    }
}

This uses a parallel stream for the outer loop, a regular stream for the inner loop, and avoids using a stream to sum the list (for speed). You can try a parallel inner stream with genToStream(gen, true), but I didn't see any difference in speed.

Also, if you want a List<List<Integer>> of matches, just change the forEach line to .collect(Collectors.toList());.

Sign up to request clarification or add additional context in comments.

4 Comments

@SeanVanGorder this code is awesome and it only took about 5 seconds. I would like to ask you all sorts of questions of how it works since I can't understand it but I am sure you don't have the time. Could you please direct me to a guide so that I may teach myself.
Well, for a quick overview, you can think of a stream like a conveyor belt or assembly line, and each map/filter/etc. call adds another machine that operates on each item as it passes. The map operation processes the item, the filter operation tests the item and dumps rejected items off the line, the peek operation inspects the item without changing it (usually for logging/debugging), and the line ends at a terminal operation like collect or forEach. A parallel stream means items are sent to multiple identical assembly lines which all join up before the terminal operation.
The lambdas here, like x -> func(x), take each item as input (x is just a parameter name) and return the result of the statement or block after the ->. For map this means the result replaces the item, for filter the result is a boolean that decides which items are rejected, etc. The notation Cls::func is basically shorthand for the lambda x -> Cls.func(x), or possibly x.func() (if x is an instance of Cls). This genToStream implementation is the standard way to make a stream from an Iterator, because it just wouldn't be Java without some hideous unnecessary boilerplate.
For more details, see the tutorials on lambdas, method references, and streams.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.