Preprocessing
First, preprocess A once, before any queries, to build a table (say times_of) such that, given a number n, the number of times n appears in A can be obtained efficiently through an expression such as times_of[n]. The following example assumes A is of type int[N] and uses an std::map to implement the table; constructing it costs O(N log N) time.
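#include <cstddef>
#include <map>

// Counts how many times each value occurs in [begin, end).
// The deduced return type (auto) requires C++14 or later.
auto preprocess(int *begin, int *end)
{
    std::map<int, std::size_t> times_of;
    while (begin != end)
    {
        ++times_of[*begin];  // operator[] value-initializes a missing count to 0
        ++begin;
    }
    return times_of;
}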
Let min and max be the minimum and maximum elements of A respectively. Then the following lemma applies:
The minimum number of distinct sorted subsequences is equal to max{0, times_of[min] - times_of[min-1]} + ... + max{0, times_of[max] - times_of[max-1]}.
A rigorous proof is a bit technical, so I omit it from this answer. Roughly speaking, consider the numbers from smallest to largest. If n appears more times than n-1 does, it has to start times_of[n]-times_of[n-1] extra subsequences; otherwise it adds none, which is what the max{0, ...} expresses.
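To make the lemma concrete, here is a minimal sketch that evaluates the sum literally, one term per integer from min to max. The helper names count_of and lemma_sum are made up for this illustration, and the loop is only sensible when max - min is small; it is a sanity check, not the O(N) method.

```cpp
#include <cstddef>
#include <map>

// Hypothetical helper: occurrences of n, treating an absent key as 0.
std::size_t count_of(const std::map<int, std::size_t> &times_of, int n)
{
    auto it = times_of.find(n);
    return it == times_of.end() ? 0 : it->second;
}

// Evaluates the lemma's sum term by term over [min, max].
// Assumes times_of is non-empty.
std::size_t lemma_sum(const std::map<int, std::size_t> &times_of)
{
    const int min = times_of.begin()->first;   // smallest key
    const int max = times_of.rbegin()->first;  // largest key
    std::size_t total = 0;
    for (int n = min; ; ++n)
    {
        const std::size_t cur = count_of(times_of, n);
        // Nothing is smaller than min, so times_of[min-1] is 0.
        const std::size_t prev = (n == min) ? 0 : count_of(times_of, n - 1);
        if (cur > prev) total += cur - prev;  // max{0, cur - prev}
        if (n == max) break;                  // avoids incrementing past max
    }
    return total;
}
```

For A = {1, 2, 2, 3, 5} the counts are 1:1, 2:2, 3:1, 5:1, so the terms for n = 1..5 are 1, 1, 0, 0, 1 and the sum is 3.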
With this lemma, the initial value of result (the minimum number of distinct sorted subsequences) can be computed in O(N) time by iterating through the entries of times_of rather than through every integer from min to max. The following is sample code:
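// min is the smallest element of A, i.e. the first key of times_of.
std::size_t result = 0;
// Sentinel "previous entry": key min - 1 with count 0.
auto prev = std::make_pair(min - 1, static_cast<std::size_t>(0));
for (auto &cur : times_of)
{
    // Keys are not consecutive, so times_of[cur.first-1] == 0
    if (cur.first != prev.first + 1) result += cur.second;
    // Keys are consecutive, so times_of[cur.first-1] == prev.second
    else if (cur.second > prev.second) result += cur.second - prev.second;
    prev = cur;
}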
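The same loop can also be written without the min - 1 sentinel, which sidesteps the overflow that would occur if min were INT_MIN. The following is a sketch under that motivation; the function name initial_result is an assumption of this example, not part of the code above.

```cpp
#include <cstddef>
#include <map>

// Sums max{0, times_of[n] - times_of[n-1]} over the keys of times_of.
// Returns 0 for an empty table.
std::size_t initial_result(const std::map<int, std::size_t> &times_of)
{
    std::size_t result = 0;
    auto prev = times_of.end();  // end() marks "no previous entry yet"
    for (auto cur = times_of.begin(); cur != times_of.end(); prev = cur, ++cur)
    {
        // The previous entry matters only if its key is exactly cur->first - 1.
        const bool adjacent =
            prev != times_of.end() && prev->first + 1 == cur->first;
        const std::size_t below = adjacent ? prev->second : 0;
        if (cur->second > below) result += cur->second - below;
    }
    return result;
}
```

Because prev->first is always smaller than cur->first, the expression prev->first + 1 cannot overflow.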
Queries
To handle a query A[u] = v, first update times_of[A[u]] (for the old value of A[u]) and times_of[v], which costs O(log N) time. By the lemma, only a constant number of terms need to be recomputed to update result: at most 4, namely those for A[u], A[u]+1, v and v+1. Each recomputation costs O(log N) time (to find the previous or next element in times_of), so a query takes O(log N) time in total.
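One possible way to organize the update is sketched below. The Tracker struct and its member names are hypothetical, chosen for this illustration. It looks up n-1 and n+1 with find instead of stepping iterators (still O(log N) each), and it assumes all values lie strictly between INT_MIN and INT_MAX so that n - 1 and n + 1 do not overflow. The query is split into "remove the old value" and "add the new value", so each step touches two distinct terms and nothing is counted twice when the old and new values are adjacent.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct Tracker  // hypothetical name
{
    std::vector<int> a;
    std::map<int, std::size_t> times_of;
    std::size_t result = 0;

    explicit Tracker(std::vector<int> init) : a(std::move(init))
    {
        for (int x : a) ++times_of[x];
        // Terms for values absent from A are 0, so summing over keys suffices.
        for (const auto &entry : times_of) result += term(entry.first);
    }

    std::size_t count_of(int n) const
    {
        auto it = times_of.find(n);
        return it == times_of.end() ? 0 : it->second;
    }

    // term(n) = max{0, times_of[n] - times_of[n-1]}
    std::size_t term(int n) const
    {
        const std::size_t cur = count_of(n), prev = count_of(n - 1);
        return cur > prev ? cur - prev : 0;
    }

    // Changes the count of x by one; only term(x) and term(x+1) depend on it.
    void adjust(int x, bool add)
    {
        result -= term(x) + term(x + 1);
        if (add) ++times_of[x];
        else if (--times_of[x] == 0) times_of.erase(x);
        result += term(x) + term(x + 1);
    }

    // Handles the query A[u] = v in O(log N).
    void assign(std::size_t u, int v)
    {
        if (a[u] == v) return;
        adjust(a[u], false);  // remove the old value
        adjust(v, true);      // add the new value
        a[u] = v;
    }
};
```

Starting from A = {1, 2, 2, 3, 5}, result is 3; after assign(4, 4) the array is {1, 2, 2, 3, 4} and result becomes 2.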