Every data structure supports a set of operations used when implementing an algorithm. Key operations include iterating over a collection, inserting an item at a given position, and deleting, updating, or copying an item or the entire collection. The choice of data structure matters in programming because it affects the performance of the application: different data structures perform these operations with different time and space complexities.
Space Complexity is a measure of how much memory an algorithm uses with respect to the input size. It includes both auxiliary space and the space used by the input itself.
Auxiliary Space is the extra or temporary space an algorithm uses beyond the input.
Time Complexity is the measure of how long it takes an algorithm to compute the required operation. Time complexity is expressed using Big-O notation.
Big-O notation is a way to describe how the performance of an operation scales with the input size, n.
Run-time Complexity Types (BIG-O Notation Types)
Constant time O(1)
An algorithm is said to run in constant time when its run-time does not depend on the size of the input data (n). No matter how big your collection is, the time it takes to perform an operation is constant: the algorithm/operation always takes the same amount of time regardless of the number of elements we’re working with. For example, accessing the first element of a list is always O(1), no matter how big the list is.
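A minimal sketch of a constant-time operation (the `data` list here is a hypothetical example):

```python
data = [3, 1, 4, 1, 5, 9]

def first_element(items):
    # Indexing into a Python list is O(1): the cost does not
    # grow with len(items).
    return items[0]

print(first_element(data))  # 3
```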
Logarithmic time O(log n)
Algorithms with logarithmic time complexity reduce the size of the input data at each step of the operation. Binary search and operations on binary trees typically have O(log n) time complexity.
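As an illustration, a standard iterative binary search halves the search range on every iteration, which is where the O(log n) bound comes from:

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent.

    Each loop iteration halves the remaining range, so the loop
    runs O(log n) times for a list of n elements.
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```

Note that binary search requires the input to be sorted, unlike the linear search discussed below.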
Linear time O(n)
An algorithm is said to have linear time complexity when its run-time is directly proportional to the size of the input data. This is the best possible time complexity when the algorithm must examine every item in the input. For example:
for value in data:
    print(value)
Linear search is an example of such an operation, since the iteration over the list is O(n).
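A simple linear search can be sketched like this; in the worst case it inspects every element, hence O(n):

```python
def linear_search(items, target):
    # Scans the list front to back: O(n) in the worst case.
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1

print(linear_search([10, 20, 30, 40], 30))  # 2
```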
Quasilinear time O(n log n)
Here a logarithmic-time operation is performed for each item in the input data. This complexity is commonly seen in optimized sorting algorithms such as mergesort, timsort, and heapsort.
In mergesort, the input data is repeatedly split into sublists until each sublist consists of a single element, and the sublists are then merged back into a sorted list. This gives us a time complexity of O(n log n).
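The mergesort described above can be sketched as follows: the recursion depth is O(log n) and each level merges n elements in linear time, giving O(n log n) overall.

```python
def merge_sort(items):
    """Sort a list by recursively splitting and merging: O(n log n)."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves in linear time.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```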
Quadratic time O(n^2)
An algorithm is said to have quadratic time complexity when the time it takes to perform an operation is proportional to the square of the number of items in the collection. This occurs when the algorithm needs to perform a linear-time operation for each item in the input data. Bubble sort has O(n^2) time complexity. For example, a loop within a loop:
for x in data:
    for y in data:
        print(x, y)
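Bubble sort, mentioned above, is a classic example of this nested-loop pattern:

```python
def bubble_sort(items):
    """Repeatedly swap adjacent out-of-order pairs.

    The nested loops compare roughly n * n pairs, giving O(n^2).
    """
    items = list(items)  # sort a copy, leaving the input unchanged
    n = len(items)
    for i in range(n):
        for j in range(n - i - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

print(bubble_sort([4, 2, 7, 1]))  # [1, 2, 4, 7]
```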
Exponential time O(2^n)
An algorithm is said to have exponential time complexity when its run-time doubles with each addition to the input data set. This kind of time complexity is usually seen in brute-force algorithms. For example, the naive recursive Fibonacci algorithm has O(2^n) time complexity.
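The naive recursive Fibonacci mentioned above looks like this; each call spawns two more calls, so the call tree grows exponentially:

```python
def fibonacci(n):
    # Without memoization, fibonacci(n - 1) and fibonacci(n - 2)
    # recompute the same subproblems, giving O(2^n) calls.
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10))  # 55
```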
Factorial time O(n!)
An algorithm is said to have factorial time complexity when it computes every permutation of a collection, so the time it takes to perform an operation is the factorial of the number of items in the collection. The Travelling Salesman Problem (by brute force) and Heap’s algorithm (generating all possible permutations of n objects) have O(n!) time complexity.
Its main disadvantage is that it is very slow.
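A sketch of Heap’s algorithm, which generates all n! permutations and therefore runs in O(n!) time:

```python
def heaps_permutations(items):
    """Generate all n! orderings of items using Heap's algorithm."""
    items = list(items)
    result = []

    def generate(k):
        if k == 1:
            result.append(tuple(items))
            return
        for i in range(k):
            generate(k - 1)
            # Heap's algorithm: the element swapped with the last
            # position depends on the parity of k.
            if k % 2 == 0:
                items[i], items[k - 1] = items[k - 1], items[i]
            else:
                items[0], items[k - 1] = items[k - 1], items[0]

    generate(len(items))
    return result

perms = heaps_permutations([1, 2, 3])
print(len(perms))  # 6, since 3! = 6
```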
Big-O notation Summary Graph
Best, Average and Worst Cases
Best case scenario
In best-case analysis, we calculate the lower bound on the running time of an algorithm. The best case occurs when the data structure, the items in the collection, and the parameters are in their optimal state, which causes the minimum number of operations to be executed. For example, in the linear search problem, the best case occurs when x (the item we are searching for) is present at the beginning of the list. The number of operations in the best case is constant (not dependent on n), so the best-case time complexity is O(1).
Average case scenario
This occurs when we define the complexity based on a uniform distribution of the input values.
In average-case analysis, we take all possible inputs, calculate the computing time for each, sum the results, and divide the sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are uniformly distributed (including the case where x is not present in the array). We then sum the cost of all cases and divide the sum by (n + 1).
Worst Case Scenario
A worst-case scenario could be an operation that must find an item positioned last in a large collection, such as a list, while the algorithm iterates from the very first item. For example, linear search’s worst case occurs when x is not present in the list, so the iteration compares x with all the elements. This gives us a run-time of O(n).
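The best and worst cases of a linear scan can be made concrete by counting comparisons (a small illustrative helper, not a standard library function):

```python
def count_comparisons(items, target):
    # Counts how many elements a linear scan inspects before
    # finding target (or exhausting the list).
    comparisons = 0
    for value in items:
        comparisons += 1
        if value == target:
            break
    return comparisons

data = [10, 20, 30, 40, 50]
print(count_comparisons(data, 10))  # best case: 1 comparison -> O(1)
print(count_comparisons(data, 99))  # worst case: 5 comparisons -> O(n)
```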
Time Complexities of Python Data Structures
List
Get Item: O(1)
Get Length: O(1)
Dictionary
Get Item: O(1)
Set Item: O(1)
Delete Item: O(1)
Iterate Over Dictionary: O(n)
Set
Check for item in set: O(1)
Difference of set A from B: O(length of A)
Intersection of set A and B: O(minimum of the lengths of A and B)
Union of set A and B: O(length of A + length of B)
Tuples support all operations that do not mutate the data structure, and those operations have the same complexity classes as their list counterparts.
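The set complexities listed above map directly onto Python’s built-in set operators, as in this short sketch:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(3 in a)   # membership test: O(1) on average
print(a - b)    # difference: O(len(a))
print(a & b)    # intersection: O(min(len(a), len(b)))
print(a | b)    # union: O(len(a) + len(b))
```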