0

I was just starting out with dynamic programming and tried solving factorial problem using the same, i used binary tree for underlying data structure, but when i thought of comparing it with normal recursion, the recursion solution is giving the result faster. I am using lenovo Ideapad3 with arch and using chrono to calculate time. Following is the code

#include<iostream>
#include <chrono>
using namespace std;
using namespace chrono;

class Fact{
    struct Node {
        unsigned long long data, value;
        Node* Rptr;
        Node* Lptr;
        Node(): data(0), value(0), Rptr(nullptr), Lptr(nullptr) {}
        Node(int d, int v): data(d), value(v), Rptr(nullptr), Lptr(nullptr) {}
    };
    Node* root;
    public:
        Fact(): root(nullptr) {}
        Node* inTree(Fact::Node* ptr, int data);
        Node* insertInTree(Fact::Node* ptr, int data, int value);
        unsigned long long findFact(int num);
        void deleteTree(Node* root) {
            if (root == NULL) {
                return;
            }
            deleteTree(root->Lptr);
            deleteTree(root->Rptr);
            delete root; 
        }

        ~Fact() {
            deleteTree(root);
        }
};

unsigned long long factorial(int n) {
    return (n < 1) ? 1 : n*factorial(n-1);
}

int main() {
    Fact fact;

    //DP solution
    auto start = high_resolution_clock::now();
    auto answer = fact.findFact(20);
    auto stop = high_resolution_clock::now();
    cout << "Factorial of 20 is:" << answer << endl;
    auto duration = duration_cast<microseconds>(stop - start);
    cout << "Time taken: " << duration.count() / 1000.0 << " milliseconds" << endl;

    //Recursion
    answer = 0;
    start = high_resolution_clock::now();
    answer = factorial(20);
    stop = high_resolution_clock::now();
    cout << "Factorial of 20 is:" << answer<< endl;
    duration = duration_cast<microseconds>(stop - start);
    cout << "Time taken: " << duration.count() / 1000.0 << " milliseconds" << endl;
    return 0;
}

unsigned long long Fact::findFact(int num) {
    Node* temp;
    if(num <= 1) return 1;
    else if((temp = inTree(root,num))) return temp->value;
    else {
        long long res = num * findFact(num - 1);
        insertInTree(Fact::root,num, res);
        return res;
    }
}

Fact::Node* Fact::insertInTree(Fact::Node* ptr, int data, int value) {
    if (ptr == NULL) {
        return new Node(data, value);
    }

    if (data < ptr->data) {
        ptr->Lptr = insertInTree(ptr->Lptr, data, value);
    } else if (data > ptr->data) {
        ptr->Rptr = insertInTree(ptr->Rptr, data, value);
    }
    return ptr;
}

Fact::Node* Fact::inTree(Fact::Node* ptr, int data) {
    if(!ptr) return nullptr;
    else {
        if(ptr->data == data) return ptr;
        else {
            if(data > ptr->data) return inTree(ptr->Rptr, data);
            else return inTree(ptr->Lptr, data);
        }
    }
}

Following is the output after using O3 flag:

[firefly@machine]$ g++ -O3 factorial.cpp -o fact
[firefly@machine]$ ./fact 
Factorial of 20 is:2432902008176640000  
Time taken: 0.001 milliseconds          //Time for DP-sol
Factorial of 20 is:2432902008176640000  
Time taken: 0 milliseconds              //Time for recursion

what could be the possible explanation for the same?

7
  • 1
    Your tree lookups never succeed because no intermediate value is never used twice. Note that in this code your binary tree is unbalanced (it's essentially a linked list, which cause O(n^2) performance when looking up values), and the results are stored as ints, which (if you're using a standard compiler/runtime) will not be large enough to store factorials above 12. Commented Dec 26, 2024 at 9:38
  • 1
    Also, your benchmarking approach is faulty (although it happens to give appropriate results in this case); for programs that run in essentially no time, you can't get an accurate result with a single sample. Commented Dec 26, 2024 at 9:40
  • 1
    And C++ has reasonably efficient maps in the standard library. For example you can use std::unordered_map as a cache, replacing your binary tree implementation, although no cache is better in this case. Commented Dec 26, 2024 at 9:44
  • 1
    Dynamic programming is only useful if you are able to reuse intermediate solutions, either because of the structure of the overall problem (Fibonacci or an exhaustive tree traversal) or because you do multiple queries. Factorial has neither, so you're doing extra work for nothing. Commented Dec 26, 2024 at 9:44
  • 1
    Also, your tree implementation is simply wrong. The intention of insertInTree is intended to store the key/value pair in the tree, but when root is nullptr (which it is at the start), the implementation allocates a Node, which is returned but never used, and no value is ever inserted into the tree. Commented Dec 26, 2024 at 9:48

1 Answer 1

2

Your tree-building solution builds the tree and does exactly the same calculations as the simple recursive solution. So the extra time you observe is the (at least) the time to build the tree.

Plus, the simple solution can be (and probably is) optimized away completely by the compiler, which is unlikely to happen with the tree-building solution.

Plus, your tree is degenerate because you insert the nodes in the descending order and it is not self-balancing, so extremely inefficient.

Plus, the implementation does the tree traversal twice, one time when checking inTree and the other time when inserting insertInTree.

Plus, your tree implementation is somewhat buggy.

In order to see any hint of performance improvement with this task, you probably should:

  1. Replace the tree with some other data structure. A simple array will work just fine in this case. You are not going to calculat factorials of numbers more than 20, so an array of length 21 will suffice. A self-balancing tree is theoretically nice, but has a super highg set up cost and constant performance factor, so you may or may not be able to see any improvement. Same goes for a full blown hash table.
  2. Calculate the factorials of random numbers to prevent the compiler from folding your simple factorial call to a constant.
  3. Do the calculation many times for different numbers, not just once, and measure the total time. This will reuse earlier calculations over and over again, which is the whole point of DP. Calculating just one factorial uses the result of each intermediate calculation once.
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.