DEV Community

hackem
hackem

Posted on

Word Break

Problem Description

Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

Example:
Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example:
Input: s = "catsandog", wordDict =["cats","dog","sand","and","cat"]
Output: false

Concept

It is rather straightforward or 'human-like' to brute force this problem. To segment s, we simply check if we can create a joint of words in wordDict to create s. A simple approach would be to check whether for each segment in s there is a match in wordDict. If the segment is a match, it also has to have a connected path of matched segments until the end of s . Example: It is not enough that 'cats' and 'and' (from wordDict) are matches in 'catsandog' if we do not find a match for segment 'og'.

A dynamic programming solution is applicable here as we are creating a new subproblem each time we find a segment matching in s, the subproblem is s minus the matched segment. Our base case being an empty s.

I did a recursive approach by starting from the beginning of s. The function then breaks off into all subproblems created by all matchings in s and returns True if one subproblem reaches an empty s, i.e. all parts of s matched. I tried this solution however I got a Time Limit Exceeded. I then chanced upon a clever trick in the leetcode discussions section to save computational time.

For this solution, we create a dp array for all indices of s plus one for the base case. Set all value of the array to False except our base case. Then we iterate from the end of s checking for each element whether it is a match and whether that match is connected to the end by checking dp[i + len(matched_word)]. The advantage of this method is that we save computational time by caching our results.

Implementation

My initial solution, recursion

class Solution:
    def wordBreak(self,s, wordDict, i=0):
        res = []
        for w in wordDict:
            if (i+len(w))<=len(s) and (s[i:i+len(w)]==w):
                if i+len(w) == len(s): return True
                else:
                    res.append(self.wordBreak(s[i+len(w):],wordDict))
        if True in res:
            return True
        return False


Enter fullscreen mode Exit fullscreen mode

The bottom-up dp caching solution

class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        dp = [False] * (len(s) + 1)
        dp[len(s)] = True
        for i in range(len(s)-1,-1,-1):
            for w in wordDict:
                if ((i+len(w))<=len(s)) and (s[i:i+len(w)]==w):
                    dp[i] = dp[i+len(w)]
                if dp[i]:
                    break
        return dp[0]
Enter fullscreen mode Exit fullscreen mode

Remarks

This problem appeared uncomplicated but had an interesting elusiveness that made it different from other dp problems. The solutions remind us to consider computational time beyond just time complexity. Both solutions were

O(n2m)O(n^2*m)

where n is the length of s and m the length of wordDict. Yet, only the caching solution was accepted. Remarkception, still trying to figure out dev.to's mathematical text support, currently using katex.

Top comments (0)