Python Forum
longest palindromic subsequence DP solution bug
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
longest palindromic subsequence DP solution bug
#1
Hey,

I am trying to code the following problem: given a string find the length of the longest palindromic subsequence and for the input "GATTACAGAT" I get ouput 5 instead of the expected 6. I cannot find the flaw in logic, can somebody help?

def solve(w,i,j,dp):
    
    #if i > j then the maximum palindromic subsequence is 0
    if i > j:
        return 0

    # if i == j then the string is a palindrome of length 1
    if i == j:
        return 1

    #if dp[i][j] was solved already, return the value
    if dp[i][j] != -1:
        return dp[i][j]

    #if the characters at i and j are the same we recurse and add 2 to the result
    if w[i] == w[j]:
        dp[i][j] = 2 + solve(w,i+1,j-1,dp)
        return dp[i][j]

    #if the characters at i and j are not identical the longest palindromic subsequence is equal to that found at either i+1 and j or i and j-1
    if w[i] != w[j]:
        dp[i][j] = max(solve(w,i+1,j,dp),solve(w,i,j-1,dp))
        return dp[i][j]

def driver(w):
    dp = [[-1]*len(w)]*len(w)
    solve(w,0,len(w)-1,dp)
    print(dp[0][len(w)-1])

w = "GATTACAGAT"
driver(w)
Reply
#2
The primary logic flaw is thinking your code is readable. Some comments would be nice.
Reply
#3
(Feb-02-2022, 05:38 PM)deanhystad Wrote: The primary logic flaw is thinking your code is readable. Some comments would be nice.

Sorry about that, I added comments.
Reply
#4
And the longest palindrome I see in "GATTACAGAT" is "ATTA". I think your output should be 4, not 6 (or 5).
Reply
#5
(Feb-02-2022, 07:20 PM)deanhystad Wrote: And the longest palindrome I see in "GATTACAGAT" is "ATTA". I think your output should be 4, not 6 (or 5).

The question asks for a sequence not substring, the longest subsequence is GATTAG.
Reply
#6
So a subsequence can skip letters? I think of a subsequence as being a slice, not a sequential sample. I'm sure your code will make more sense to me now.

...much later...
Nope. I am still very confused about your algorithm.
Reply
#7
(Feb-02-2022, 03:52 PM)dvdlly Wrote:     dp = [[-1]*len(w)]*len(w)

The * operator makes multiple references to an object. It doesn't make multiple independent objects. For the inner portion [-1]*len(w) this is fine. The integers are immutable. But for the outer portion, you just have multiple references to the same inner list.

I think you need this to be independent list objects.
    dp = [[-1]*len(w) for _ in range(len(w))]]
Reply
#8
I added some prints so I could watch execution.
def solve(w,i,j,dp):
    print(w, i, j, w[i:j+1])
    #if i > j then the maximum palindromic subsequence is 0
    if i > j:
        print("i > j")
        return 0
 
    # if i == j then the string is a palindrome of length 1
    if i == j:
        print("i == j")
        return 1
 
    #if dp[i][j] was solved already, return the value
    if dp[i][j] != -1:
        return dp[i][j]
 
    #if the characters at i and j are the same we recurse and add 2 to the result
    if w[i] == w[j]:
        print(f"w[{i}] == w[{j}]", dp[i][j])
        dp[i][j] = 2 + solve(w,i+1,j-1,dp)
        return dp[i][j]
 
    #if the characters at i and j are not identical the longest palindromic subsequence is equal to that found at either i+1 and j or i and j-1
    if w[i] != w[j]:
        print(f"w[{i}] != w[{j}]", dp[i][j])
        dp[i][j] = max(solve(w,i+1,j,dp),solve(w,i,j-1,dp))
        return dp[i][j]
Finally think I have this figured. You start at the outside and work your way inward until there are no more matches. Then you start unwinding the recursion and begin again moving inward form different starting points. Neat!

bowlofred nailed the problem. When I changed the dp initializer to this:
dp = [[-1]*len(w) for _ in range(len(w))]
I saw this line printed out for the very first time:
GATTACAGAT 0 7 GATTACAG
i = 0 and j = 7 are the starting points you need to find "GATTAG".

To make sure I understood it I rewrote it to less C-like and more Python-like.
def solve(word, cache=None):
    """
    Return length of longest palindrome sequence contained in word.
    Letters in the palindrome sequence are sampled in order from the original word.  "STATS" is
    a valid palindrome of "STARTS" because it only removes letters. 
    """
    if cache is None:
        # cache is dictionary of solved sequences.  It eliminates recomputing
        # palindrome lengths for subsequences that were already solved.
        cache = {}

    if len(word) < 2:
        # End of recursion.  Cannot shrink word any more
        return len(word)
 
    if word in cache:
        # End of recursion.  Already solved this sequence
        return cache[word]

    # This is where the work is done.
    if word[0] == word[-1]:
        # If the first and last letters of word match we have a palindrome
        # of length 2 plus the palindrome length of word with the first and last
        # letters removed.
        cache[word] = 2 + solve(word[1:-1], cache)
    else:
        # If the letters do not match compute the the palindrome length of word
        # with the first letter removed and word with the last letter removed.
        # Use the max length as the length of word.
        cache[word] = max(solve(word[1:], cache), solve(word[:-1], cache))
    return cache[word]
 
print(solve("GATTACAGAT"))
I don't know if it is easier to read because it is Python or if it is just easier for me to read because I know what is happening now.

I also tried a brute force solution.
import itertools

def palindromes(word):
    """Return palindrome sequences contained in word.  The sequence
    can skip letters between the start and end characters.
    """
    # For all possible lengths
    for count in range(2, len(word)+1):
        even = count % 2 == 0
        # for all combinations of this length
        for combo in itertools.combinations(word, r=count):
            a = count // 2 - 1
            b = a + 1 if even else a + 2
            # yield sequence if it is a palindrome
            if combo[:a+1] == combo[b:][::-1]:
                yield ''.join(combo)

# Get all the palindromes.  Sort by length (reverse) and then alphabetically
words = list(palindromes("GATTACAGAT"))
words.sort(key=lambda x: (-len(x), x))
print(words)
The brute force method is painfully inefficient. The time required to solve doubling with each additional letter, Solving "ABLE WAS I ERE I SAW ELBA" takes less than a millisecond using dvdlly's method. Using the brute force method it takes 10 seconds.
ibreeden likes this post
Reply
#9
Thanks to both of you, it seems to work now.
Reply
#10
Knowing the length without knowing the actual sequence is maddening! This is your code modified to return the longest sequence, not the length.
def solve(word, cache=None):
    """
    Return longest palindrome sequence contained in word.
    A palendrome sequence can skip letters in the original word.
    """
    if cache is None:
        cache = {}   # Cache of found palindromes
 
    if len(word) < 2:  # Cannot find a shorter palindrome.  Return word
        return word
  
    if word in cache:
        return cache[word]  # Sequence already solved.  Return palindrome
 
    if word[0] == word[-1]:
          # start and end are part of palindrome
        cache[word] = f"{word[0]}{solve(word[1:-1], cache)}{word[0]}"
    else:
        # Use the longer palindrome of word[1:] and word[:-1]
        start = solve(word[1:], cache)  # remove starting letter
        end = solve(word[:-1], cache) # remove ending letter
        cache[word] = start if len(start) > len(end) else end
    return cache[word]
  
print(solve(print(solve("INTERGALLACTIC ARCTIC ETERNITY"))))
Output:
INTERACTCARETNI
My 23 letter test sequence only needed to test 390 sequences to solve. A brute force approach would test 33,554,405 different combinations.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Python code for Longest Common Subsequence Bolt 3 965 Sep-22-2023, 08:09 AM
Last Post: Bolt
  Python implementation of Longest Common Substring problem Bolt 0 569 Sep-17-2023, 08:31 PM
Last Post: Bolt
  Longest sequence of repeating integers in a numpy array Cricri 5 5,810 Jun-08-2020, 06:48 AM
Last Post: Cricri
  Keep the longest chains Amniote 11 4,203 Jul-03-2019, 05:07 PM
Last Post: nilamo

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020