4
\$\begingroup\$

I'm trying to solve a programming challenge where I have to find the median of given subarrays of an array. I have used std::vector's n_th element for this. But the evaluator fails some test cases as "time limit exceeded".

#include <iostream>
#include <bits/stdc++.h>

using namespace std;

/*
In
6
2 4 5 3 1 6
3
1 6
2 4
3 3

Out
3
4
5

*/

int get_ceil(int length)
{
    if (length % 2 == 0)
    {
        return length/2;
    }
    else
    {
        return (length+1)/2;
    }

}

int get_median_of_subarray(std::vector<int> & full_array, int lo, int hi)
{

    std::vector<int>::const_iterator begin = full_array.begin() + lo;
    std::vector<int>::const_iterator end = full_array.begin() + hi + 1;
    std::vector<int> sub_array(begin, end);

    int median_pos = get_ceil(hi - lo + 1);
    std::nth_element (sub_array.begin(), sub_array.begin() + median_pos -1, sub_array.end());
    return sub_array[median_pos - 1];
}



int main()
{
    ios_base::sync_with_stdio(false);

    int array_size = 0;
    std::vector<int> input_array;

    cin >> array_size;

    for(int i = 0; i < array_size; i++)
    {
        int x;
        cin >> x;
        input_array.push_back(x);
    }

    int query_size = 0;
    cin >> query_size;

    int lo = 0;
    int hi = 0;

    for(int i = 0; i < query_size; i++)
    {
        cin >> lo >> hi;
        std::cout << get_median_of_subarray(input_array, --lo, --hi) << std::endl;
    }

    return 0;
}
\$\endgroup\$
4
  • 1
    \$\begingroup\$ full_array should be a reference-to-const, it isn't modified inside your function. This just clarifies your code a bit. The median of [2, 10] is 6, which your algorithm ignores. Since you get a timeout and not a test failure, I guess that all this is just about odd-sized sequences. \$\endgroup\$ Commented Feb 16, 2020 at 13:20
  • 1
    \$\begingroup\$ Also not performance-related: get_ceil() is a bit overcomplicated. If length = 3, (length + 1) / 2 = 2. If length = 4, (length + 1) / 2 = 2, due to integer division. C++ doesn't round but truncate the result. Therefore, you can always return (length + 1) / 2;. I'd also call this get_half_ceil(), because that's closer to what it does. \$\endgroup\$ Commented Feb 16, 2020 at 13:43
  • 1
    \$\begingroup\$ Could you please copy the question from the code challenge and add a link to it as well. \$\endgroup\$ Commented Feb 16, 2020 at 18:13
  • \$\begingroup\$ Actually, you have used <algorithm>'s std::nth_element() on a vector. The distinction is important, because the standard algorithms are written to be generic, and usable on all kinds of container (and, in some cases on things that aren't containers). \$\endgroup\$ Commented Feb 13, 2022 at 10:24

2 Answers 2

5
\$\begingroup\$
  • One performance problem is the overhead of allocating and resizing a vector. Since you read the size up front, why not simply reserve() enough space?
    cin >> array_size;
    input_array.reserve(array_size);
  • If you sorted the sequence, computing the median would become dead simple and very fast (O(1)). This may be a better approach. However, you can't sort the sequence once and then run different queries, because the subsequence applies to the unsorted sequence. I haven't thought this through completely, but when removing two elements from the sequence, there are several possible ways it can affect the median:

    • If both elements are larger, the median becomes the previous value in the sorted sequence.
    • If both elements are smaller, the median becomes the next value in the sorted sequence.
    • If one element element is smaller and one is larger, the median remains the same.
    • If one element is equal to the median, it moves opposite to the other value.

    The problem here is that the "new" median can easily be one that's should be removed already, so this simple approach is not yet enough, but it might serve as a start to a better algorithm. In any case, what you need to keep in mind is the complexity and the variables that affect it. For your task, you have the number of values n and the number of queries m. Your algorithm is almost optimal for the case that m = 1. In order to improve the overall performance, you can actually optimize the performance for the case of m > 1, even at the expense of the the m = 1 case! Often, you can improve the performance by first sorting the data and then switching to a different (faster) algorithm for the actual work.

\$\endgroup\$
8
\$\begingroup\$

Not related to performance, but never ever do this:

#include <bits/stdc++.h>

using namespace std;

For reference see these Q&A at Stack Overflow:

\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.