DEV Community

Madhu0212

Longest Increasing Subsequence in Python

The Longest Increasing Subsequence (LIS) problem is a well-known problem in computer science: given a sequence, find the longest subsequence whose elements are in strictly increasing order. A subsequence keeps the original order of the elements but does not have to be contiguous.

For example, consider the sequence [3, 10, 2, 1, 20]. A possible longest increasing subsequence of this sequence is [3, 10, 20], which has length 3.
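For short sequences, the definition can be checked directly by trying every subsequence. The sketch below is only an illustration of the definition, not an efficient solution, and brute_force_lis is a hypothetical helper name that does not appear elsewhere in this post:

```python
from itertools import combinations

def brute_force_lis(seq):
    """Return one longest strictly increasing subsequence of seq (exponential time)."""
    # Try lengths from longest to shortest; combinations() keeps the original order.
    for length in range(len(seq), 0, -1):
        for candidate in combinations(seq, length):
            if all(a < b for a, b in zip(candidate, candidate[1:])):
                return list(candidate)
    return []  # only reached for an empty input

print(brute_force_lis([3, 10, 2, 1, 20]))  # [3, 10, 20]
```

Because it examines up to 2^n subsequences, this check is only practical for very small inputs.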

The LIS problem has a number of important applications, including data compression, gene sequence analysis, and natural language processing. It is a classic problem that has been studied extensively and has several efficient algorithms for solving it.

In this blog, we will discuss two approaches for solving the LIS problem: a recursive approach and a dynamic programming (DP) approach. We will also implement these solutions in Python and compare the time complexity of the two approaches.

Understanding the LIS problem with an example

To understand the LIS problem better, let's consider the following example. Suppose we are given the sequence [5, 6, 7, 1, 2, 8]. The longest increasing subsequence of this sequence is [5, 6, 7, 8], which has length 4. To find it, we can go through the sequence step by step and track the longest increasing subsequence that ends at each element:

  1. Consider the first element of the sequence, which is 5. Nothing comes before it, so the longest increasing subsequence ending here is [5], which has length 1.
  2. Consider the second element of the sequence, which is 6. It is greater than 5, so it extends [5] to [5, 6], which has length 2.
  3. Consider the third element of the sequence, which is 7. It extends [5, 6] to [5, 6, 7], which has length 3.
  4. Consider the fourth element of the sequence, which is 1. No earlier element is smaller, so the longest increasing subsequence ending here is [1], which has length 1.
  5. Consider the fifth element of the sequence, which is 2. The only smaller earlier element is 1, so the longest increasing subsequence ending here is [1, 2], which has length 2.
  6. Consider the sixth element of the sequence, which is 8. It is greater than every earlier element, so it can extend any of the subsequences above. Extending the longest one, [5, 6, 7], gives [5, 6, 7, 8], which has length 4.

Now, we take the longest of the subsequences found in the previous steps. [1, 2, 8] is also increasing, but it has only length 3, so the longest increasing subsequence is [5, 6, 7, 8], which has length 4.

This is the idea behind solving the LIS problem. In the following sections, we will turn it into code using two approaches: a recursive approach and a dynamic programming approach.

Recursive approach to solving the LIS problem

One way to solve the LIS problem is by using a recursive approach. In this approach, we find the length of the longest increasing subsequence ending at each element of the sequence, and then return the largest of these lengths.

To do this, we can define a recursive function lis(arr, n) that takes in a sequence arr and an integer n, which is an index into the sequence (not its length). The function returns the length of the longest increasing subsequence ending at index n.

The function can be implemented as follows:

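def lis(arr, n):
  # Returns the length of the longest increasing subsequence ending at index n.
  if n == 0:
    return 1  # the first element alone is a subsequence of length 1
  max_ending_here = 1
  for i in range(n):
    # Length of the longest increasing subsequence ending at index i.
    res = lis(arr, i)
    # arr[n] can extend that subsequence only if arr[i] is smaller.
    if arr[i] < arr[n] and res + 1 > max_ending_here:
      max_ending_here = res + 1
  return max_ending_here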

Let's understand how this function works. The base case of the function is when n is equal to 0, in which case we return 1. This is because a single element is always a valid increasing subsequence.

For the recursive case, we consider each index i from 0 to n-1 and call lis recursively to get the length of the longest increasing subsequence ending at index i. If arr[i] is less than arr[n], the element at index n can extend that subsequence, so res + 1 is a candidate length. We keep the largest candidate in max_ending_here.

Finally, we return max_ending_here as the length of the longest increasing subsequence ending at index n.

Note that lis(arr, n) only measures subsequences that end at index n. To find the longest increasing subsequence of the entire sequence, we have to call lis for every index and take the largest result, for example max(lis(arr, i) for i in range(len(arr))), because the longest subsequence does not have to end at the last element. For the sequence [5, 6, 7, 1, 2, 8], calling lis([5, 6, 7, 1, 2, 8], 5) returns 4, the length of [5, 6, 7, 8]. Here this is also the answer for the whole sequence, because the last element, 8, is the largest one.
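Since the recursive function only looks at subsequences ending at one index, a small wrapper that takes the maximum over all indices gives the length for the whole sequence. The following is a minimal sketch; lis_ending_at and lis_length are hypothetical names chosen for illustration, and lis_ending_at is a compact variant of the recurrence above that only recurses when arr[i] is smaller than arr[n]:

```python
def lis_ending_at(arr, n):
    """Length of the longest increasing subsequence that ends at index n."""
    best = 1  # arr[n] on its own
    for i in range(n):
        if arr[i] < arr[n]:
            best = max(best, lis_ending_at(arr, i) + 1)
    return best

def lis_length(arr):
    """Length of the longest increasing subsequence anywhere in arr."""
    # default=0 covers the empty sequence
    return max((lis_ending_at(arr, i) for i in range(len(arr))), default=0)

arr = [1, 2, 3, 0]
print(lis_ending_at(arr, len(arr) - 1))  # 1
print(lis_length(arr))                   # 3
```

In [1, 2, 3, 0] the last element cannot extend anything, so checking only the last index would report 1 instead of 3.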

This recursive approach is simple to implement, but it has a time complexity of O(2^n) because it solves the same subproblems over and over, which makes it impractical for large sequences. In the next section, we will discuss a more efficient approach called dynamic programming, which has a time complexity of O(n^2).

Dynamic Programming approach to solving the LIS problem

The dynamic programming (DP) approach to solving the LIS problem involves breaking the problem down into smaller subproblems and storing the solutions to these subproblems in a table. This allows us to avoid recalculating the solutions to subproblems and reduces the overall time complexity of the solution.
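The idea of storing subproblem solutions can also be applied top-down, by caching the results of the recursive calls. The sketch below is one possible way to do that with functools.lru_cache from the standard library; lis_memo and ending_at are hypothetical names used only for this illustration:

```python
from functools import lru_cache

def lis_memo(arr):
    """Length of the longest increasing subsequence of arr, top-down with caching."""
    arr = tuple(arr)  # freeze the input so cached results stay valid

    @lru_cache(maxsize=None)
    def ending_at(n):
        # Each index is solved once; later calls reuse the cached value.
        best = 1
        for i in range(n):
            if arr[i] < arr[n]:
                best = max(best, ending_at(i) + 1)
        return best

    # Asking for the indices in increasing order keeps the recursion shallow.
    return max((ending_at(n) for n in range(len(arr))), default=0)

print(lis_memo([3, 10, 2, 1, 20]))  # 3
```

With the cache in place, each of the n subproblems is computed once and scans at most n earlier indices, which gives the same O(n^2) bound as a table filled in bottom-up.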

To implement the DP approach, we can define a function lis(arr, n) that takes in a sequence arr and an integer n, which represents the length of the sequence. The function returns the length of the longest increasing subsequence of the whole sequence.

The function can be implemented as follows:

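def lis(arr, n):
  if n == 0:
    return 0  # an empty sequence has no increasing subsequence
  # dp[i] is the length of the longest increasing subsequence ending at index i.
  dp = [1] * n
  for i in range(1, n):
    for j in range(i):
      # arr[i] can extend a subsequence that ends in a smaller arr[j].
      if arr[i] > arr[j] and dp[i] < dp[j] + 1:
        dp[i] = dp[j] + 1
  return max(dp)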

Let's understand how this function works. We start by initializing a list dp with n elements, all set to 1. This list will store the lengths of the longest increasing subsequences ending at each index.

Next, we iterate over the elements of the sequence, starting from the second element (index 1). For each index i, we look at every earlier index j (from 0 to i-1). If arr[i] is greater than arr[j], the element at index i can extend the subsequence ending at index j, so we update dp[i] to dp[j] + 1 whenever that is larger than the current value of dp[i].

Finally, we return the maximum value in the dp list, which is the length of the longest increasing subsequence ending at any index.
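To see what the dp list looks like in practice, it can help to print it for a small input. The sketch below repeats the same two loops but returns the whole table instead of its maximum; lis_table is a hypothetical name used only for this illustration:

```python
def lis_table(arr):
    """Return the dp table: dp[i] is the LIS length ending at index i."""
    dp = [1] * len(arr)
    for i in range(1, len(arr)):
        for j in range(i):
            if arr[i] > arr[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
    return dp

print(lis_table([3, 10, 2, 1, 20]))  # [1, 2, 1, 1, 3]
```

The largest entry, 3, belongs to the last element, 20, which ends the subsequence [3, 10, 20].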

To find the length of the longest increasing subsequence of the entire sequence, we call the lis function with the sequence and its length. For example, for the sequence [5, 6, 7, 1, 2, 8], we can call lis([5, 6, 7, 1, 2, 8], 6), which returns 4, the length of [5, 6, 7, 8].

The time complexity of this DP approach is O(n^2), which is much more efficient than the recursive approach for large sequences.
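The DP function only reports a length. If the subsequence itself is needed, one common extension is to remember, for each index, which earlier index it extended, and then walk those links backwards. The sketch below assumes that extension; lis_sequence and the prev list are hypothetical names that are not part of the code in this post:

```python
def lis_sequence(arr):
    """Return one longest increasing subsequence of arr in O(n^2) time."""
    n = len(arr)
    if n == 0:
        return []
    dp = [1] * n      # dp[i]: LIS length ending at index i
    prev = [-1] * n   # prev[i]: index of the element before arr[i], or -1
    for i in range(1, n):
        for j in range(i):
            if arr[i] > arr[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
                prev[i] = j
    # Walk back from the index where the longest subsequence ends.
    i = max(range(n), key=dp.__getitem__)
    result = []
    while i != -1:
        result.append(arr[i])
        i = prev[i]
    return result[::-1]

print(lis_sequence([3, 10, 2, 1, 20]))  # [3, 10, 20]
```

When several subsequences share the maximum length, this sketch returns only one of them.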

Implementing the LIS solution in Python

Here is the complete Python code for implementing the LIS solution using the dynamic programming approach:

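def lis(arr, n):
  if n == 0:
    return 0  # an empty sequence has no increasing subsequence
  # dp[i] is the length of the longest increasing subsequence ending at index i.
  dp = [1] * n
  for i in range(1, n):
    for j in range(i):
      # arr[i] can extend a subsequence that ends in a smaller arr[j].
      if arr[i] > arr[j] and dp[i] < dp[j] + 1:
        dp[i] = dp[j] + 1
  return max(dp)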

# Test the function
arr = [5, 6, 7, 1, 2, 8]
n = len(arr)
print(lis(arr, n))  # Output: 4

This code first defines the lis function, which takes in a sequence arr and an integer n and returns the length of the longest increasing subsequence ending at any index. The function uses the DP approach, as described in the previous section.

Next, the code tests the function by calling it with the sequence [5, 6, 7, 1, 2, 8] and the length of the sequence as the arguments. The output is 4, which is the correct length of the longest increasing subsequence of the given sequence ([5, 6, 7, 8]). You can try the function with other sequences as well to test its correctness.
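One way to try many other sequences at once is to compare the DP function against a slow but obviously correct reference on small random inputs. The sketch below is only a possible test harness; brute_force_length and check_random are hypothetical helpers, and the trial count, seed, and value ranges are arbitrary choices:

```python
import random
from itertools import combinations

def lis(arr, n):
    dp = [1] * n
    for i in range(1, n):
        for j in range(i):
            if arr[i] > arr[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
    return max(dp)

def brute_force_length(arr):
    """Reference answer: try every subsequence, longest first."""
    for length in range(len(arr), 0, -1):
        for cand in combinations(arr, length):
            if all(a < b for a, b in zip(cand, cand[1:])):
                return length
    return 0

def check_random(trials=200, seed=0):
    """Compare lis with the reference on small random non-empty inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        arr = [rng.randint(0, 9) for _ in range(rng.randint(1, 8))]
        if lis(arr, len(arr)) != brute_force_length(arr):
            return arr  # a counterexample
    return None

print(check_random())  # None means no mismatch was found
```

The inputs are kept short on purpose, because the reference check takes exponential time.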

Comparison of the time complexity of the recursive and DP approaches

The time complexity of the recursive approach to solving the LIS problem is O(2^n), while the time complexity of the dynamic programming (DP) approach is O(n^2). This means that the DP approach is much more efficient for large sequences, as it has a lower time complexity.

To understand why the DP approach has a lower time complexity, let's consider an example. Suppose we are given the sequence [5, 6, 7, 1, 2, 8] and we want to find the length of the longest increasing subsequence.

Using the recursive approach, computing lis(arr, n) calls lis(arr, i) for every index i below n, and each of those calls does the same for the indices below i. The same subproblems are therefore solved again and again, and the number of calls doubles with each additional index: a single call lis(arr, n) triggers 2^n calls in total. This is why the time complexity of the recursive approach is O(2^n).
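The growth in the number of calls can be checked with a small counter. The sketch below does not run the LIS code itself; it only mirrors the call pattern of the recursive lis function from earlier in this post, and count_calls is a hypothetical helper name:

```python
def count_calls(n):
    """Number of calls made by the recursive lis(arr, n), including the first."""
    # lis(arr, n) calls lis(arr, i) once for every i < n, whatever arr contains.
    return 1 + sum(count_calls(i) for i in range(n))

for n in range(6):
    print(n, count_calls(n))  # second column: 1, 2, 4, 8, 16, 32
```

For the six-element example, the call on the last index (n = 5) already makes 32 calls, and every extra element doubles that number.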

On the other hand, using the DP approach, we store the solution to each subproblem in a table and never recalculate it. There are n subproblems, one per index, and each one looks at no more than n earlier entries, so the time complexity becomes O(n^2), which is much more efficient for large sequences.
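The amount of work done by the DP approach can be counted in the same spirit. The sketch below simply counts how many (i, j) pairs the two nested loops visit; count_dp_comparisons is a hypothetical helper name:

```python
def count_dp_comparisons(n):
    """Count the (i, j) pairs visited by the nested DP loops for length n."""
    count = 0
    for i in range(1, n):
        for j in range(i):
            count += 1
    return count

for n in (6, 20, 100):
    # The count matches the closed form n * (n - 1) / 2.
    print(n, count_dp_comparisons(n), n * (n - 1) // 2)
```

For n = 100 this is 4950 comparisons, whereas 2^100 recursive calls would be far out of reach.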

In general, the DP approach is preferred over the recursive approach for solving problems that involve overlapping subproblems, as it is more efficient in terms of time complexity. However, the recursive approach may be easier to implement and may be preferred in some cases.

Applications of the LIS problem in real-world scenarios

The Longest Increasing Subsequence (LIS) problem has a number of important applications in real-world scenarios. Some of the applications of the LIS problem are:

  1. Data compression: The LIS problem can be used to reduce data by finding the longest increasing subsequence of a sequence of data and storing only the subsequence instead of the entire sequence. This can save storage space and reduce the time required to transmit or process the data, but it is lossy, because the elements outside the subsequence are discarded.
  2. Gene sequence analysis: The LIS problem can be used to analyze gene sequences and find patterns in the data. For example, it can be used to find the longest increasing subsequences of gene expression data, which can help identify genes that are related to certain traits or conditions.
  3. Natural language processing: The LIS problem can be used in natural language processing (NLP) tasks, such as machine translation and text summarization. For example, it can be used to find the longest increasing subsequences of words in a sentence, which can help identify the main ideas or themes in the text.
  4. Scheduling: The LIS problem can be used in scheduling tasks, such as scheduling jobs or classes. For example, it can be used to find the longest increasing subsequence of time slots that are available for scheduling, which can help identify the best time slots to schedule tasks.

Overall, the LIS problem has many important applications in a variety of fields, including computer science, biology, and natural language processing.

Conclusion

In this blog, we discussed the Longest Increasing Subsequence (LIS) problem, which is the problem of finding the longest subsequence of a given sequence in which the elements are in sorted order, from lowest to highest. We also discussed two approaches for solving the LIS problem: a recursive approach and a dynamic programming (DP) approach. We implemented these solutions in Python and compared the time complexity of the two approaches.

We also discussed the applications of the LIS problem in real-world scenarios, such as data compression, gene sequence analysis, and natural language processing.

To learn more about the LIS problem, you can refer to the following resources:
