one dimensional dynamic programming

10 min read

Regular Expression Matching

Given an input string (s) and a pattern (p), implement regular expression matching with support for ’.’ and ’*‘.

’.’ Matches any single character. ’*’ Matches zero or more of the preceding element. The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z. p could be empty and contains only lowercase letters a-z, and characters like . or *. Example 1:

Input: s = “aa” p = “a” Output: false Explanation: “a” does not match the entire string “aa”. Example 2:

Input: s = “aa” p = “a” Output: true Explanation: ’’ means zero or more of the precedeng element, ‘a’. Therefore, by repeating ‘a’ once, it becomes “aa”. Example 3:

Input: s = “ab” p = “.” Output: true Explanation: “.” means “zero or more (*) of any character (.)“. Example 4:

Input: s = “aab” p = “cab” Output: true Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches “aab”. Example 5:

Input: s = “mississippi” p = “misisp*.” Output: false

solution

Approach 1: Recursion

  • Intuition

If there were no Kleene stars (the * wildcard character for regular expressions), the problem would be easier - we simply check from left to right if each character of the text matches the pattern.

When a star is present, we may need to check many different suffixes of the text and see if they match the rest of the pattern. A recursive solution is a straightforward way to represent this relationship.

  • Algorithm

Without a Kleene star, our solution would look like this:

def match(text, pattern):
    if not pattern: return not text
    first_match = bool(text) and pattern[0] in {text[0], '.'}
    return first_match and match(text[1:], pattern[1:])

If a star is present in the pattern, it will be in the second position pattern[1]. Then, we may ignore this part of the pattern, or delete a matching character in the text. If we have a match on the remaining strings after any of these operations, then the initial inputs matched.

class Solution(object):
    def isMatch(self, text, pattern):
        if not pattern:
            return not text

        first_match = bool(text) and pattern[0] in {text[0], '.'}

        if len(pattern) >= 2 and pattern[1] == '*':
            return (self.isMatch(text, pattern[2:]) or
                    first_match and self.isMatch(text[1:], pattern))
        else:
            return first_match and self.isMatch(text[1:], pattern[1:])

Approach 2: Dynamic Programming

  • Intuition

As the problem has an optimal substructure, it is natural to cache intermediate results. We ask the question dp(i, j): does text[i:] and pattern[j:] match? We can describe our answer in terms of answers to questions involving smaller strings.

  • Algorithm

We proceed with the same recursion as in Approach 1, except because calls will only ever be made to match(text[i:], pattern[j:]), we use dp(i, j) to handle those calls instead, saving us expensive string-building operations and allowing us to cache the intermediate results.

  • Top-Down Variation

    class Solution(object):
    def isMatch(self, text, pattern):
        memo = {}
        def dp(i, j):
            if (i, j) not in memo:
                if j == len(pattern):
                    ans = i == len(text)
                else:
                    first_match = i < len(text) and pattern[j] in {text[i], '.'}
                    if j+1 < len(pattern) and pattern[j+1] == '*':
                        ans = dp(i, j+2) or first_match and dp(i+1, j)
                    else:
                        ans = first_match and dp(i+1, j+1)
    
                memo[i, j] = ans
            return memo[i, j]
    
        return dp(0, 0)
  • Bottom-Up Variation

    class Solution:
    """
    @param s: A string 
    @param p: A string includes "." and "*"
    @return: A boolean
    """
    def isMatch(self, s, p):
        # write your code here
        m, n = len(s), len(p)
        # Initialize the table with False. The first row is satisfied.
        dp = [[False for _ in range(n+1)] for _ in range(m+1)]
        dp[0][0] = True
        # Update the corner case of when s is an empty string but p is not.
        # Since each '*' can eliminate the character before it, the table is
        # vertically updated by the one before previous.
        for j in range(2, n + 1):
            dp[0][j] = dp[0][j - 2] and p[j - 1] == '*'
    
    
        for i in range(1,m+1):
            for j in range(1, n+1):
                if p[j-1] != '*':
                    # refer to condition 1 and 2
                    dp[i][j] = dp[i-1][j-1] and (s[i-1] == p[j-1] or p[j-1] == '.')
                else:
                    # refer to condition 3
                    # * eliminate the previous or count one of the previous.
                    dp[i][j] = dp[i][j-1] or (j >= 2 and dp[i][j-2])
                    # * count multiple of the previous.
                    if s[i-1] == p[j-2] or p[j-2] == '.':
                        dp[i][j] |= dp[i-1][j]
        return dp[-1][-1]
  • Complexity Analysis
  • Time Complexity: Let T,P be the lengths of the text and the pattern respectively. The work for every call to dp(i, j) for i=0, … ,T; j=0, … ,P is done once, and it is O(1) work. Hence, the time complexity is O(TP).
  • Space Complexity: The only memory we use is the O(TP) boolean entries in our cache. Hence, the space complexity is O(TP).

Wildcard Matching

Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for ’?’ and ’*‘.

’?’ Matches any single character. ’*’ Matches any sequence of characters (including the empty sequence). The matching should cover the entire input string (not partial).

Note: s could be empty and contains only lowercase letters a-z. p could be empty and contains only lowercase letters a-z, and characters like ? or *.

solution

almost the same as the last question, let`s see the code

class Solution:
    """
    @param s: A string 
    @param p: A string includes "?" and "*"
    @return: is Match?
    """
    def isMatch(self, s, p):
        # write your code here
        s,p = 'e' + s, 'e' + p    # e is for empty string
        n = len(s)
        m = len(p)
        dp = [[False] * m for _ in range(n)]
        dp[0][0] = True
        
        # initialize the first row.
        for j in range(1, m):
            if p[j] == '*':
                dp[0][j] = dp[0][j-1] 
                
        # First column is already all False 
        
        for i in range(1,n):
            for j in range(1,m):
                if s[i] == p[j] or p[j] == '?':
                    dp[i][j] = dp[i-1][j-1]
                elif p[j] == '*':
                    dp[i][j] = dp[i-1][j-1] or dp[i][j-1] or dp[i-1][j]
        return dp[-1][-1]

Burst Balloons

Given n balloons, indexed from 0 to n-1. Each balloon is painted with a number on it represented by array nums. You are asked to burst all the balloons. If the you burst balloon i you will get nums[left] * nums[i] * nums[right] coins. Here left and right are adjacent indices of i. After the burst, the left and right then becomes adjacent.

Find the maximum coins you can collect by bursting the balloons wisely.

Note: You may imagine nums[-1] = nums[n] = 1. They are not real therefore you can not burst them. 0 ≤ n ≤ 500, 0 ≤ nums[i] ≤ 100 Example: Input: [3,1,5,8] Output: 167 Explanation: nums = [3,1,5,8] –> [3,5,8] –> [3,8] –> [8] –> [] coins = 315 + 358 + 138 + 181 = 167

solution

  • Be Naive First When I first get this problem, it is far from dynamic programming to me. I started with the most naive idea the backtracking.

We have n balloons to burst, which mean we have n steps in the game. In the i th step we have n-i balloons to burst, i = 0~n-1. Therefore we are looking at an algorithm of O(n!). Well, it is slow, probably works for n < 12 only.

Of course this is not the point to implement it. We need to identify the redundant works we did in it and try to optimize.

Well, we can find that for any balloons left the maxCoins does not depends on the balloons already bursted. This indicate that we can use memorization (top down) or dynamic programming (bottom up) for all the cases from small numbers of balloon until n balloons. How many cases are there? For k balloons there are C(n, k) cases and for each case it need to scan the k balloons to compare. The sum is quite big still. It is better than O(n!) but worse than O(2^n).

  • Better idea We then think can we apply the divide and conquer technique? After all there seems to be many self similar sub problems from the previous analysis.

Well, the nature way to divide the problem is burst one balloon and separate the balloons into 2 sub sections one on the left and one one the right. However, in this problem the left and right become adjacent and have effects on the maxCoins in the future.

Then another interesting idea come up. Which is quite often seen in dp problem analysis. That is reverse thinking. Like I said the coins you get for a balloon does not depend on the balloons already burst. Therefore instead of divide the problem by the first balloon to burst, we divide the problem by the last balloon to burst. Why is that? Because only the first and last balloons we are sure of their adjacent balloons before hand! For the first we have nums[i-1]nums[i]nums[i+1] for the last we have nums[-1]nums[i]nums[n]. OK. Think about n balloons if i is the last one to burst, what now? We can see that the balloons is again separated into 2 sections. But this time since the balloon i is the last balloon of all to burst, the left and right section now has well defined boundary and do not affect each other! Therefore we can do either recursive method with memoization or dp.

  • Final Here comes the final solutions. Note that we put 2 balloons with 1 as boundaries and also burst all the zero balloons in the first round since they won’t give any coins. The algorithm runs in O(n^3) which can be easily seen from the 3 loops in dp solution.

    class Solution:
    """
    @param nums: A list of integer
    @return: An integer, maximum coins
    """
    def maxCoins(self, nums):
        # write your code here
        n = len(nums)
        A = [1]*(n+2)
        for i in range(1,n+1):
            A[i] = nums[i-1]
        
        f = [[0]*(n+2) for _ in range(n+2)]
        
        for i in range(n+1,-1,-1):
            for j in range(i,n+2):
                for k in range(i+1,j):
                    f[i][j] = max(f[i][j], f[i][k] + f[k][j] + A[i]*A[k]*A[j])
         
        return f[0][n+1]
                
## Word Break
Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:

The same word in the dictionary may be reused multiple times in the segmentation.
You may assume the dictionary does not contain duplicate words.

Example 1:
Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:
Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
             Note that you are allowed to reuse a dictionary word.

Example 3:
Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false

dp[i] is True if there is a word in the dictionary that ends at ith index of s AND d is also True at the beginning of the word
```python{numberLines: true}
class Solution(object):
    def wordBreak(self, s, wordDict):
        n = len(s)
        dp = [False] * (n+1)
        dp[0] = True
        for i in range(1,n+1):
            for j in range(0, i):
                if dp[j] and s[j:i] in wordDict:
                    dp[i] = True
                    break
        return dp[-1]  # or dp[n]

also for the second layer loop can be the wordDict loop, it can be faster when s is relatively longer and wordDict is relatively shorter

class Solution(object):
    def wordBreak(self, s, wordDict):
        n = len(s)
        dp = [False for i in range(n+1)]
        dp[0] = True
        for i in range(1,n+1):
            for w in wordDict:
                if dp[i-len(w)] and s[i-len(w):i]==w:
                    dp[i]=True
        return dp[-1]

Word Break II

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, add spaces in s to construct a sentence where each word is a valid dictionary word. Return all such possible sentences.

Note: The same word in the dictionary may be reused multiple times in the segmentation. You may assume the dictionary does not contain duplicate words.

Example 1: Input: s = “catsanddog” wordDict = [“cat”, “cats”, “and”, “sand”, “dog”] Output: [ “cats and dog”, “cat sand dog” ]

Example 2: Input: s = “pineapplepenapple” wordDict = [“apple”, “pen”, “applepen”, “pine”, “pineapple”] Output: [ “pine apple pen apple”, “pineapple pen apple”, “pine applepen apple” ] Explanation: Note that you are allowed to reuse a dictionary word.

Example 3: Input: s = “catsandog” wordDict = [“cats”, “dog”, “sand”, “and”, “cat”] Output: []

class Solution:
    """
    @param: s: A string
    @param: wordDict: A set of words.
    @return: All possible sentences.
    """
    def wordBreak(self, s, wordDict):
        # write your code here
        valid_words = set(wordDict)
        mem = collections.defaultdict(list)

        def helper(s):
            """
            Recursive helper returning all possible segmentations of s
            """
            # If we've computed all the word breaks for s, return it.
            if s in mem:
                return mem[s]

            # Add s to the list if it works too.
            if s in valid_words:
                mem[s].append(s)

            # Now, iterate through s, checking and saving prefix-suffix combos.
            for i in range(1, len(s)):
                prefix = s[:i]
                if prefix in valid_words:
                    suffix = s[i:]
                    segmented_suffixes = helper(suffix)
                    """
                    the following append can be change with extend to save time:
                    if segmented_suffixes:
                        mem[s].extend([prefix + " " + segmented_suffix for segmented_suffix in segmented_suffixes]) 
                    """
                    if segmented_suffixes:
                        for segmented_suffix in segmented_suffixes:
                            mem[s].append(prefix + " " + segmented_suffix)
            
            # Return all the valid word breaks for this query
            #print s,mem[s]
            return mem[s]
        
        return helper(s)