If the 5th Fibonacci number has to be computed, I'm trying to split the 2 recursive trees into 2 threads:
5
4 3
3 2 2 1
2 1 1 0 1 0
Thread 1 Thread 2
My attempt:
public class ParallelFibbonaci
{
static ExecutorService executor = Executors.newFixedThreadPool(2);
// Memoized map.
private static Map<Integer, Long> mem = new ConcurrentHashMap<>();
public static void main(String[] args) throws ExecutionException, InterruptedException{
final long then = System.nanoTime();
System.out.println(fib_wrapper(Integer.parseInt(args[0])));
final long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - then);
System.out.println("Time(ms): " + millis);
executor.shutdown();
}
public static int fib_wrapper(int n) throws ExecutionException, InterruptedException {
Future<Integer> future = executor.submit(() -> fib(n-1));
Future<Integer> future2 = executor.submit(() -> fib(n-2));
return future2.get() + future.get();
}
public static int fib(int n) throws ExecutionException, InterruptedException {
if (n == 1)
return 1;
if (n == 0)
return 0;
if (mem.containsKey(n)) {
return mem.get(n);
}
long x = fib(n-1) + fib(n-2);
mem.put(n, x);
return x;
}
}
I couldn't prove the parallel version to run faster than the simple version, in fact, all my profiling (done with System.nanoTime() which I believe is not the most reliable) revealed it to be 20-30 times slower:
Sid:Programs siddhartha$ java -Xms2G -Xmx4G parallelfibbonaci 1000
817770325994397771
Time(ms): 60
Sid:Programs siddhartha$ java -Xms2G -Xmx4G simplefibbonaci 1000
817770325994397771
Time(ms): 4
Sid:Programs siddhartha$ java -Xms2G -Xmx4G parallelfibbonaci 500
2171430676560690477
Time(ms): 59
Sid:Programs siddhartha$ java -Xms2G -Xmx4G simplefibbonaci 500
2171430676560690477
Time(ms): 2
Any higher n's crashed my program with a stackoverflow error.
In theory, I'd expect using 2 threads on a tree-recursive algorithm to work faster than a single-threaded implementation, however it seems this is clearly not the case. What am I missing here?