-2

Please be merciful - I've never asked a question here (I've answered a few) and am a total Python noob.

I developed an algorithm in SQL Server to compute the Longest Possible Common Subsequence (LPCS) between two strings and am struggling to rewrite it in Python.

The Algorithm

Consider these two strings: "DDB089A3D8014E" and "83B8FC624F774A". The LCS between the two strings is "8384", the LCS length is 4. The length of the longest possible subsequence between the two is 6. To prove this I can sort the characters in each string alphabetically and compute the LCS again: LCS("00134889ABDDDE","234467788ABCFF") = "3488AB"

With K as the length of the alphabet, my algorithm computes the Longest Possible Common Subsequence in O(K) time. It's been an invaluable NLP tool in my SQL work, but now I need it in Python (Go Lang works too BTW).

UPDATE...

My original version of the question was closed and I have since figured out to do it in Python. The ultimate goal was to apply this my logic to three strings. Thank you to @Unmitigated for your help, this was my first Python function.

from collections import Counter

def lpcs_optimized(string1: str, string2: str, string3: str, return_sequence: bool = False) -> tuple[int, str] | int:
    """
    Compute the LPCS3 length for three strings, optimized using frequency maps.
    
    Args:
        string1 (str): First string.
        string2 (str): Second string.
        string3 (str): Third string.
        return_sequence (bool): If True, return (length, sequence); else return length.
    
    Returns:
        int or Tuple[int, str]: LPCS3 length, or (length, sequence) if return_sequence=True.
    
    Time Complexity: O(m + n + o), where m, n, o are string lengths.
    """
    # Precompute frequency maps
    freq1 = Counter(string1)
    freq2 = Counter(string2)
    freq3 = Counter(string3)
    
    cs = 0
    sequence = []
    
    # Iterate over characters in first string
    for char in freq1:
        if char in freq2 and char in freq3:
            min_freq = min(freq1[char], freq2[char], freq3[char])
            cs += min_freq
            if return_sequence:
                sequence.extend([char] * min_freq)
    
    if return_sequence:
        return cs, ''.join(sequence)
    return cs

# Example usage
if __name__ == "__main__":
    s1 = "aaabcff"
    s2 = "aabbceff"
    s3 = "aaabbcdf"
    print(f"LPCS3 length: {lpcs_optimized(s1, s2, s3)}")  # Output: 5
    print(f"LPCS3 sequence: {lpcs_optimized(s1, s2, s3, return_sequence=True)[1]}")  # Output: aabbc

1 Answer 1

2

Create Counters from both strings, then sum the minimum of the number of occurrences of each character in both strings.

from collections import Counter


def solve(str1: str, str2: str):
    counter1 = Counter(str1)
    return sum(min(freq, counter1[ch]) for ch, freq in Counter(str2).items())

As Kelly Bundy suggests, this can be simplified to:

def solve(str1: str, str2: str):
    return (Counter(str1) & Counter(str2)).total()
Sign up to request clarification or add additional context in comments.

5 Comments

Or just return (Counter(str1) & Counter(str2)).total()
Good point; that's more elegant.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.