Longest Increasing Subsequence

#leetcode #algorithms #competitiveprogramming #computerscience

Problem Description

Given an integer array nums, return the length of the longest strictly increasing subsequence.

A subsequence is a sequence that can be derived from an array by deleting some or no elements without changing the order of the remaining elements. For example, [3,6,2,7] is a subsequence of the array [0,3,1,6,2,2,7].

Concept

A brute force method is applied by simply making all possible increasing subsequences from the zeroth index then keeping all of them in memory and eventually returning the longest among the candidate lists. This approach has an O(n^2) time complexity where n is the length of the array. This is due to n number of permutations from the zeroth index with each permutation requiring at worst n number of checks (to form the subsequence).

From the brute force method it is obvious that there are potential improvements if we take into account that we want to keep the elements in the subsequence as small as we can before we increment another element. For example: we would obviously discard the 3 in [0,3] for our next element if it is smaller than 3 as it increases the range of numbers we can get for our next larger element, i.e., [0,1] is better than [0,3]. If we apply the same concept not only to the last number in our current subsequence but to all the numbers in the subsequence we will then be able to increase the range of possible numbers in all positions in the subsequence. Hence, we will be creating and maintaining the longest increasing subsequence throughout the loop. Our algorithm would work like so: 1. If the next element is larger than the last element in our subsequence, append it to subsequence. 2. Else, find the biggest number in the subsequence the element can replace such that the number is smaller than the element and replacing the number with the element maintains the incremental condition of the subsequence. I know this sounds confusing, the example will illustrate the concept better.

Notice a couple of things from this example. Firstly, the final longest increasing subsequence is not an existent valid subsequence in nums. Fortunately, the problem requires only to return the length of the subsequence. Secondly, we guarantee the past existence of a valid subsequence after we substituted an element in the subsequence with the next element so long as the next element is smaller than the past element. Example: The 5th iteration produces subsequence [1,2,6] although this is not derivable from nums, we certainly know such a a valid subsequence existed, [4,5,6] from the third iteration. Hence, it also stands to reason that the existence of [1,2,6,7,8], a non-existent subsequence, guaranteed a past valid subsequence that fulfils our condition of incrementation, in this example is [4,5,6,7,8]. These two observations show us that the algorithm does not produce false subsequences. To illustrate the power of the algorithm see the following example.

Notice that by finding the smallest possible number to occupy each index we are maximising the range of numbers we can find. Although we could have created the longest increasing subsequence similarly by our elements in the 3rd iteration, [4,5,6], by minimising the elements such as in the 6th iteration we increase the potential of our subsequence as the next element needs to only be bigger than 3, not 6. Understand that we can only place 3 at the 6th iteration because the element before 3 is 2 and not 5, highlighting the importance of minimising the number in all indices due to the domino effect.

The algorithm keeps a current longest increasing subsequence at each iteration and iterates once throughout the list. To find the position of our next element in the current subsequence we do a binary search. Hence, the time complexity of the algorithm is O(n*log(n)).

Implementation

We create a variable tmp to keep track of the current longest increasing subsequence. We then loop over each element in nums. In each iteration we check whether, upon insertion, the element will be the largest element or not. This check is done by bisect_left which accept the sorted list tmp and a number x and returns the position x occupies in tmp so that tmp maintains sorted. The function bisect_left uses a binary search. If the element is the biggest then bisect_left will return the last index of the newly appended a which is which is equals to len(temp). As we established, if the element, n, is the biggest we append it, else, we find the smallest element that is bigger than n in tmp and replace it with n, bisect_left returns the index to be replaced. Finally we return the variable.

def lengthOfLIS(self, nums: List[int]) -> int:
        N = len(nums)
        tmp = [nums[0]]
        for n in nums:
            x = bisect_left(tmp,n)
            if x == len(tmp):
                tmp.append(n)
            else:
                tmp[x] = n
        return len(tmp)

Real-life application of the Longest Increasing Subsequence Algorithm

Part of Maximum Unique Match Finder for aligning entire genomes.

Remarks

Surprisingly, the algorithm to find the longest increasing subsequence, not merely its length, also has a time complexity of O(n*log(n)). The algorithm is inspired by the card game Patience or Klondike. Here's a great resource to learn about Patience Sort. Funnily enough, during my Police Officer Basic Course I sat beside my friend who would play Klondike throughout class, it was one of the few games available on our issued laptops.

DEV Community