1

We got this problem in our course that no one who I had talked to solved it. I would like for some help. So here's the problem:

Let A be array of length n which contains n digits (digit is between 0-9). A numeral sub-sequence of A is a sequence of positive numbers which their digits compose a sub-sequence of A, when all digits of a certain number in the sequence appear in a row in A.

For example: the sequence 13,1,345,89,23 is a numeral sub-sequence of input array A: [1,3,5,1,2,3,4,5,8,9,4,5,2,3]

Length of a numeral sub-sequence is the amount of numbers which appear in it (in the example above: 5) Numeral sub-sequence is increasing if every number in the sequence is bigger than the number before it.

The request is to find an algorithm in dynamic programming approach (based on recursive formula) that finds the longest increasing numeral sub-sequence of an input array A.

Thanks in advance for all helpers!

6
  • I think this is similar to the longest increasing subsequence problem with an additional constraint of choosing only those numbers which have all of their digits present in array A. So, firstly, you remove all the numbers in the sequence 13,1,345,89,23, 400 which at least one digit not present in array A. In this case we will drop 400. (Now any sequence will form a numeral sub-sequence). Secondly, you just follow the usual longest increasing subsequence approach on the remaining numbers(13,1,345,89,23) to find the longest increasing numeral subsequence. Commented Dec 5, 2017 at 14:33
  • I think wasn't clear enough: I don't have any sequence at all. All I get is the array - what I need to find is a numeral increasing sub-sequence of the input array in a maximal length. I agree the idea is similar to the LIS problem but I still can't see how can I use it. Commented Dec 5, 2017 at 14:42
  • Oh! You did not say so. So, I just assumed that the sequence was also provided. Commented Dec 5, 2017 at 14:45
  • So, all the numbers in the final sequence should have their digits from array A. Am I right? And do you need to find the length of the longest sequence or the sequence itself? And will it be valid if I say 91 is a numeral number to A? Also, is this (31, 1, 543, 98, 32) a valid numeral subsequence to A? Commented Dec 5, 2017 at 14:51
  • I need to find the sequence itself. There could be more than one option for that but what matter is the it will in a maximal length. 91 is not a numeral sub sequence because you dont have 9 and then 1 in A. (31, 1, 543, 98, 32) is a valid numeral subsequence to A - but no increasing.... Commented Dec 5, 2017 at 15:06

1 Answer 1

1

Look at the first digit in the array. Either this digit is not part of a number in your number sequence or it is. If it is, the number could have 1, 2, ..., n digits. For each guess, return:

  • not in a number: return f(array[2...n], -1)
  • 1st digit of 1-digit number: return array[1] union f(array[2...n], number(array[1]))
  • 1st digit of 2-digit number: return array[1...2] union f(array[3...n], number(array[1...2]))
  • 1st digit of 3-digit number: return array[1...3] union f(array[4...n], number(array[1...3]))
  • ...
  • 1st digit of n-digit number: return array[1...n]

There are some optimizations you can do here to skip some steps along the way.

  • f(array[1...k], x) = f(array[1...k], y) if the smallest choice for the next number in the sequence given hypothetical last numbers x and y is the same. So, if the smallest choice for the next number in array[1...k] is the same for x and y, and we already computed the value of f for x, we can reuse that value for y.

  • f(array[1...k], x) = c + f(array[2...k], x) whenever array[1] = 0, where c = 1 if x < 0 and c = 0 if x >= 0. That is, we can ignore leading zeroes except possibly a leading zero at the beginning of the array which should always be chosen as our first one-digit number.

  • when deciding whether a digit will be the first digit of a k-digit number, if you never choose leading zeroes, you know an upper bound on the number of remaining numbers in your sequence is given by n/k, since any numbers chosen after this one will need to be at least k digits long. If you remember the longest sequence you've seen so far, you can recognize paths that have no hope of doing better than what you've seen and ignore them.

  • if an array has at least k(k+1)/2 non-zero digits in it, there is a number sequence of length at least k obtained by taking numbers with 1, 2, ..., k non-zero digits sequentially left to right. So, if you pre-compute this value, you can potentially avoid some paths right off the bat.

Here's rough pseudocode with the optimizations discussed:

solve(array[1...n])

    z = number of non-zero entries in array
    last_number = -1
    min_soln = floor((sqrt(1 + 8z) - 1) / 2)
    return solve_internal(array[1...n], min_soln, last_number)



memo = {}

solve_internal(array[1...n], min_soln, last_number)

    // ignore potentially leading zeroes except the first one
    if array[1] = 0 then
        if last_number < 0 then
            return {0} union solve_internal(array[2...n], min_soln - 1, 0)
        else then
            return solve_internal(array[2...n], min_soln, last_number)

    // abort since we don't have enough digits left to get a solution
    if floor(n / #digits(last_number)) < min_soln return []

    // look up current situation in previous partial solutions
    z = smallest number formable in array greater than last_number
    if memo contains (n, z) then
        return memo[n, z]

    soln = {}
    for k = 1 to n do
        soln_k = solve_internal(array[k+1...n], min_soln - 1, array[1...k])
        if |soln_k| > |soln| then
            soln = soln_k
            min_soln = |soln|

    memo[n, z] = soln
    return soln
Sign up to request clarification or add additional context in comments.

8 Comments

hi, can you please just clarify a few things just to make sure i got it right: 1. what does exactly min_soln and memo represent? 2. why in the beginning we compute min_soln in this way : floor((sqrt(1 + 8z) - 1) / 2) ?
@OriNetanelBen-Zaken min_soln is the minimum length number sequence must return to be viable; if we detect we cannot achieve this, we abort. memo is a record of partial solutions already computed so we can reuse branches we've already been down. The initial calculation for min_soln comes from the observation that any array with k(k+1)/2 non-zero digits has a number sequence of length at least k achieved by taking 1 digit, 2 digits, ..., k digits in turn. Perhaps min_soln should be set to one less than this to account for the worst case.
alright. Definitely clearer, and brilliant actually. My I ask just one more thing? If I got it right in every run of the recursive function (which no stops at the stop conditions) we compute z (this will take probably O(n) time). If so, then why in the for loop we ignore it and call every time to the recursively with array[1...k] as the last number? I am a bit confused now. All is good until we start the for loop. Can you explain that please?
@OriNetanelBen-Zaken The for-loop handles the case where we don't prematurely abort or return based on one of the optimizations we discussed. In fact, the rest of the algorithm could be removed and this, by itself, would guarantee a correct result. It assumes that the current digit is the first digit of a new k-digit number and works out the length of the resulting maximal sequence under each assumption, remembering the assumption that gave the best result. Note: it looks like I neglect to "guess" that the digit isn't part of any number; this is an error to correct in my code.
(to fix the error mentioned, replace soln = {} with soln = solve_internal(array[2...n], min_soln, last_number) as this is the longest number sequence that doesn't use the first digit in the array).
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.