Optimised code is essential because it directly impacts the efficiency, performance, and scalability of software. Well-written code runs faster, consumes fewer resources, and is more maintainable, making it better suited for handling larger workloads and improving user experience. It also reduces operational costs, as efficient code requires less processing power and memory, which is particularly crucial in environments with limited resources, such as embedded systems or large-scale cloud applications.
Poorly written code, on the other hand, can lead to slow execution times, increased energy consumption, and higher infrastructure costs. For example, in a web application, inefficient code can slow down page loads, leading to a poor user experience and potentially driving users away. In data processing tasks, inefficient algorithms can significantly increase the time it takes to process large datasets, delaying critical insights and decisions.
Moreover, optimised code is often more straightforward to maintain and extend. By adhering to optimisation best practices, developers can ensure that their codebase remains clean and modular, making it easier to update or scale the application as needed. This becomes increasingly important as software projects grow in complexity and as the demands on the system increase.
Let’s explore 10 Python programming optimisation techniques that can help you write more efficient and performant code. These techniques are crucial for developing robust applications that meet performance requirements while remaining scalable and maintainable over time. Most of them also carry over to other programming languages.
1. Variable Packing
Variable packing minimises memory usage by grouping multiple data items into a single structure. This technique is critical in scenarios where memory access times significantly impact performance, such as in large-scale data processing. When related data is packed together, it allows for more efficient use of CPU cache, leading to faster data retrieval.
Example:
```python
import struct

# Pack two integers into a compact binary format
packed_data = struct.pack('ii', 10, 20)

# Unpack the packed binary data back into Python ints
a, b = struct.unpack('ii', packed_data)
```
In this example, the `struct` module packs integers into a compact binary format, making data processing more efficient.
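As a further sketch of the same `struct` API: the format string also controls byte order and lets you compute the packed size up front (the `'<'` prefix below forces little-endian with standard sizes):

```python
import struct

fmt = '<ii'  # little-endian, two 4-byte ints, no padding
print(struct.calcsize(fmt))  # → 8 bytes per packed record

packed = struct.pack(fmt, 10, 20)
a, b = struct.unpack(fmt, packed)
print(a, b)  # → 10 20
```

Knowing the record size up front is handy when packing many records into one buffer or reading them back from a file.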
2. Storage vs. Memory
Understanding the difference between storage (disk) and memory (RAM) is crucial. Memory operations are faster but volatile, while storage is persistent but slower. In performance-critical applications, keeping frequently accessed data in memory and minimising storage I/O is essential for speed.
Example:
```python
import mmap

# Memory-map a file so the OS pages data in on demand
with open("data.txt", "r+b") as f:
    mmapped_file = mmap.mmap(f.fileno(), 0)
    print(mmapped_file.readline())
    mmapped_file.close()
```
Memory-mapped files let you treat disk storage as if it were memory. The data is still read from disk, but the OS pages it in on demand and avoids extra copying, which can speed up access to large files.
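A self-contained sketch of the same idea (using a temporary file rather than the `data.txt` above, so it runs anywhere): `mmap` lets you search a large file without first reading it wholesale into a Python object.

```python
import mmap
import os
import tempfile

# Write some sample bytes to a temporary file
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"header\nneedle\nfooter\n")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:
        # Search without loading the whole file into a bytes object;
        # the OS pages data in on demand
        offset = mm.find(b"needle")
        print(offset)  # → 7 (byte offset of "needle")

os.remove(path)
```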
3. Fixed-Length vs. Variable-Length Variables
Fixed-length variables are stored in a contiguous block of memory, making access and manipulation faster. Variable-length variables, on the other hand, require additional overhead to manage dynamic memory allocation, which can slow down operations, particularly in real-time systems.
Example:
```python
import array

# Fixed-length array of C ints: contiguous and predictable
fixed_array = array.array('i', [1, 2, 3, 4, 5])

# Dynamic list (variable-length): stores pointers to boxed objects
dynamic_list = [1, 2, 3, 4, 5]
```
Here, `array.array` provides a fixed-length array, offering more predictable performance than dynamic lists.
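One way to see the difference is `sys.getsizeof` (an illustrative sketch; note that for the list this does not even count the boxed `int` objects it points to):

```python
import array
import sys

numbers = range(1000)

fixed_array = array.array('i', numbers)  # packed 4-byte C ints
dynamic_list = list(numbers)             # per-element object pointers

print(sys.getsizeof(fixed_array))   # compact contiguous buffer
print(sys.getsizeof(dynamic_list))  # larger, before even counting
                                    # the int objects themselves
```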
4. Internal vs. Public Functions
Internal functions are those intended to be used only within the module where they are defined. In Python, the leading-underscore name is only a convention and does not prevent external calls, but the pattern still helps performance: internal functions can skip validation, while public functions take on the error handling and logging that external callers need.
Example:
```python
def _private_function(data):
    # Optimised for internal use, with minimal error handling
    return data ** 2

def public_function(data):
    # Includes additional checks for external use
    if isinstance(data, int):
        return _private_function(data)
    raise ValueError("Input must be an integer")
```
By validating once at the public boundary and keeping the unchecked computation in a private helper, you avoid repeating checks on internal calls while reserving the public function for external safety and usability.
5. Function Modifiers
In Python, decorators serve as function modifiers, allowing you to add functionality before or after the function's main execution. This is useful for tasks like caching, access control, or logging, which can optimise resource usage across multiple function calls.
Example:
```python
from functools import lru_cache

@lru_cache(maxsize=100)
def compute_heavy_function(x):
    # A computationally expensive operation
    return x ** x
```
Using `lru_cache` as a decorator caches the results of expensive function calls, improving performance by avoiding redundant computations.
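You can confirm the cache is doing its job with `cache_info()`, which `lru_cache` attaches to the wrapped function:

```python
from functools import lru_cache

@lru_cache(maxsize=100)
def compute_heavy_function(x):
    return x ** x

compute_heavy_function(10)  # computed: one miss
compute_heavy_function(10)  # served from the cache: one hit
print(compute_heavy_function.cache_info())
# → CacheInfo(hits=1, misses=1, maxsize=100, currsize=1)
```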
6. Use Libraries
Leveraging libraries allows you to avoid reinventing the wheel. Libraries like NumPy are written in C and built for performance, making them far more efficient for heavy numerical computations compared to pure Python implementations.
Example:
```python
import numpy as np

# Efficient matrix multiplication using NumPy
matrix_a = np.random.rand(1000, 1000)
matrix_b = np.random.rand(1000, 1000)
result = np.dot(matrix_a, matrix_b)
```
Here, NumPy's `dot` function is optimised for matrix operations, far outperforming nested loops in pure Python.
7. Short-Circuiting Conditionals
Short-circuiting reduces unnecessary evaluations, which is particularly valuable in complex condition checks or when involving resource-intensive operations. It prevents execution of conditions that don't need to be checked, saving both time and computational power.
Since evaluation stops as soon as the outcome is decided, put the operand most likely to decide the condition first. In `or` conditions, put the operand most likely to be true first; in `and` conditions, put the operand most likely to be false first. As soon as that operand is checked, the conditional can exit without evaluating the rest.
Example:
```python
def complex_condition(x, y):
    return x != 0 and y / x > 2  # Stops evaluating if x is 0
```
In this example, Python’s logical operators ensure that the division only executes if `x` is non-zero, preventing a runtime error and unnecessary computation.
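The ordering advice can be made concrete with a counter; `expensive_check` here is a hypothetical stand-in for a slow operation such as a network or disk lookup:

```python
calls = {"expensive": 0}

def expensive_check(x):
    calls["expensive"] += 1  # count how often the slow path runs
    return x > 100

def cheap_check(x):
    return x % 2 == 0  # true for half the inputs

# Cheap, often-false check first: the expensive check only runs
# for values that survive it
result = [v for v in range(1000) if cheap_check(v) and expensive_check(v)]

print(calls["expensive"])  # → 500, not 1000
```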
8. Free Up Memory
In long-running applications, especially those dealing with large datasets, it’s essential to free up memory once it’s no longer needed. This can be done using `del`, `gc.collect()`, or by allowing objects to go out of scope.
Example:
```python
import gc

# Drop the reference, then force a collection pass
large_data = [i for i in range(1000000)]
del large_data
gc.collect()  # Forces garbage collection
```
Calling `gc.collect()` after `del` ensures that memory held by reference cycles is also reclaimed promptly, which is critical in memory-constrained environments.
9. Short Error Messages
In systems where memory or bandwidth is limited, such as embedded systems or logging in distributed applications, short error messages can reduce overhead. This practice also applies to scenarios where large-scale error logging is necessary.
Example:
```python
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Err: Div/0")  # Short, concise error message
```
Short error messages are useful in environments where resource efficiency is crucial, such as IoT devices or high-frequency trading systems.
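If you control the output format, the standard `logging` module gives you short messages without hard-coding terse strings; the one-letter level below (`%(levelname).1s`) is just one possible format:

```python
import io
import logging

stream = io.StringIO()  # stand-in for stderr or a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname).1s:%(message)s"))

log = logging.getLogger("app")
log.addHandler(handler)

try:
    result = 10 / 0
except ZeroDivisionError:
    log.error("Div/0")

print(stream.getvalue().strip())  # → E:Div/0
```

This keeps each record compact while still letting you raise or lower the log level centrally.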
10. Optimize Loops
Loops are a common source of inefficiency, especially when processing large datasets. Optimising loops by reducing iterations, simplifying the logic, or using vectorised operations can significantly improve performance.
Example:
```python
import numpy as np

# Vectorised operation with NumPy
array = np.array([1, 2, 3, 4, 5])

# Instead of looping through elements
result = array * 2  # Efficient, vectorised operation
```
NumPy (Numerical Python) is a popular Python library for numerical and scientific computing. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them. It can be installed with pip: `pip install numpy`.
Vectorisation eliminates the need for explicit loops, leveraging low-level optimisations for faster execution.
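Even without NumPy, moving the loop body into C helps; a rough sketch comparing a manual accumulation with the built-in `sum` (absolute timings will vary by machine):

```python
import timeit

data = list(range(100_000))

def manual_sum():
    total = 0
    for x in data:  # every iteration runs Python bytecode
        total += x
    return total

loop_time = timeit.timeit(manual_sum, number=100)
builtin_time = timeit.timeit(lambda: sum(data), number=100)  # C-level loop

print(f"manual loop: {loop_time:.3f}s, built-in sum: {builtin_time:.3f}s")
```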
By applying these techniques, you can help ensure your programs, in Python or other languages, run faster, use less memory, and scale better, which is especially important for applications in data science, web, and systems programming.
PS: you can use https://perfpy.com/#/ to check python code efficiency.
Top comments (28)
While it's certainly good practice to keep some things "private" that are not intended to be used by external callers, this does nothing for speed or efficiency. Prefixing with an underscore as in the example is really just a convention anyway and doesn't prevent calling the function.
Thanks for the feedback.
Actually, optimized code is harder to maintain. That's why it is much better to use an appropriate algorithm than to use, say, arrays instead of lists. Moreover, the use of these techniques before improving the algorithm in use would be an example of premature optimization.
Consider lazy evaluation (`map`, `filter`, `reduce`, and the entire `itertools` module). I have often seen folks comprehend a large list into memory for a loop only to break the loop early, meaning all members of the comprehended list with indices higher than the break point were wasted CPU time and memory. I have also seen folks comprehend an original dataset into a second, separate variable (possibly for clarity's sake?) to use once and then never again, but the GC never collects it because a variable still refers to it. It would have been better to either assign the second variable as a map of the original data variable, or simply iterate directly over the comprehended list without assigning a variable name to it, so that the GC cleans it up immediately after the loop completes rather than having it stick around in memory for no reason.

tl;dr: there is a time and place for mappings vs comprehensions. Both have pros and cons. Please consider why and how you are using comprehended variables!
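A quick sketch of the early-break point above, with a hypothetical `expensive` transform; the generator expression produces values on demand, so breaking early skips the rest entirely:

```python
calls = 0

def expensive(x):
    global calls
    calls += 1  # count how many items are actually computed
    return x * 2

# Lazy: nothing is computed until the loop asks for it
for value in (expensive(x) for x in range(1_000_000)):
    if value > 10:
        break

print(calls)  # → 7, not 1_000_000
```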
Yes that’s right, thanks for the insight.
Loops are the most unoptimized part of many Python codebases; many devs overlook this.
great read james!
I can tell you that the loop problem is dev specific, not Python. You need to design good logic or program flow to have optimized loops. I've seen 2-3 loops in places where changing the flow will leave just 1 in place
Agreed!
@roshan_khan_28 yes that’s right, thanks for the feedback.
Interesting, and similar advice was given over 20 years for other languages. In any case, keep in mind Knuth's advice:
“The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.”
stackify.com/premature-optimizatio...
Section 10, Optimizing Loops: you introduce NumPy without a single mention of what it is. I find that a tad surprising and would fix it. NumPy is indeed great for such operations as you've illustrated, but it is not a small library, nor the only way to do that (JAX is another example). What you are introducing here is literally a suggestion to use NumPy for efficient processing of large datasets, which is great advice, but that is how it should be introduced IMHO; it's not a Python thing so much as a NumPy thing.
Thanks for the notice, I have added some explanation about numpy.
While short logs could seem like a good idea, from the PoV of an SRE it would be better to use a logger with a configurable format, so you have control not only over the format but also over the log level.
This increases application observability, improves log readability, and shortens debug times!
@juanitomint Thank you for the addition.
Thanks a lot 🪴
I guess I would have to bookmark this now, because some of my code is so memory-intensive and would immensely benefit from these tips.
Good Article! Insightful and loved it!
Thanks @ddebajyati, I’m glad you enjoyed it.
I think the memory-mapped file does not speed up the code. It just lets you treat a file as a bytearray in memory. It still needs disk I/O to read the data in the file.
Yeah, maybe it's a good strategy to pay that cost at startup for frequently accessed files 🤔
Is Python going to be the future? Or does Google not need Python developers?
Nice!
Great list, James! Avoiding unnecessary work is another great one ;)
Very helpful