When we need to run a task automatically, Python offers tools for the job. Whether the task is file reads and writes or API calls, there are many use cases where this is necessary.
Let's start with the what: what is a subprocess in Python?
A subprocess in Python is a task that a python script delegates to the Operative system (OS) - Daniel Diaz, GeekFlare
In short, a subprocess lets a Python script hand work to the operating system, such as running another program, which gives us a new level of integration with the OS.
By the end of this tutorial you should:
- Understand the concept of a subprocess
- Know the basics of Python's multiprocessing and time libraries
- Have meaningful examples and use cases where you can apply the subprocess and multiprocessing libraries
This tutorial assumes you have a basic foundation in Python programming and data structures.
The nice part about the subprocess module is that it is part of Python's standard library. This means there is no need to install it as a dependency using pip or Pipenv; it's there for us already!
# import the subprocess module into our environment
import subprocess
Quick use cases:
- Program checker
- Setting up virtual environments
- Running other programming languages
- Opening external programs
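To make the use cases above a little more concrete, here is a minimal sketch of running another program from Python with subprocess.run(). The helper name run_command is hypothetical and exists only for this illustration. The child program is simply the current Python interpreter, chosen so the example runs on any machine.

```python
import subprocess
import sys


def run_command(args):
    """Run a command (given as a list of strings) and return (exit code, stdout)."""
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output bytes into str
    )
    return completed.returncode, completed.stdout.strip()


if __name__ == "__main__":
    # Ask a child Python interpreter to print a message for us.
    exit_code, output = run_command(
        [sys.executable, "-c", "print('hello from a child process')"]
    )
    print(exit_code, output)
```

An exit code of 0 conventionally means the external program finished without error, which is the basic idea behind a "program checker".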
In this tutorial we will use the Process.start() method to run background computations on Excel data.
Now that you know at a high level what the subprocess package does and what importing it looks like, you're ready for the next part of the tutorial.
A quick distinction before we go on: we will be using the Process.start() API from Python's multiprocessing library.
So, why cover subprocesses?
I like to explain things backwards. When the material is more reading-intensive than visual, this helps me understand the full context of what I am learning.
Recap: we now know what subprocesses are. Think of multiprocessing as a close relative of subprocess: where a subprocess hands a task to the operating system as an external program, multiprocessing lets a single script run several of its own Python functions at once, each in its own process. Instead of one task running in the background, we can bring the work to the "foreground" by managing multiple processes ourselves.
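As a rough illustration of "multiple processes from one script", the sketch below starts one process per batch of numbers and collects each total through a multiprocessing Queue. The function names (summarize, run_in_parallel) and the sample batches are assumptions made up for this example; they are not part of the tutorial's data.

```python
from multiprocessing import Process, Queue


def summarize(label, values, results):
    """Sum the values and put (label, total) on the shared results queue."""
    results.put((label, sum(values)))


def run_in_parallel(batches):
    """Start one process per batch and return a dict of label -> total."""
    results = Queue()
    workers = [
        Process(target=summarize, args=(label, values, results))
        for label, values in batches.items()
    ]
    for worker in workers:
        worker.start()
    # Read one result per worker before joining, so no child blocks on a full queue.
    totals = dict(results.get() for _ in workers)
    for worker in workers:
        worker.join()
    return totals


if __name__ == "__main__":
    # Hypothetical sample batches, for illustration only.
    print(run_in_parallel({"first": [1, 2, 3], "second": [10, 20]}))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start child processes by re-importing the script, it prevents each child from launching processes of its own.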
Here we are going to take our data set and, using some basic arithmetic, sort and sum all the entries in our list of data. Once this is complete, we will schedule the task to run every 60 seconds to simulate a batched process in code. To draw on a real-world scenario, this data set could be healthcare or financial data for a company, and your job has been to extract and transform this data into a list. The company then needs a summary of all the data in this list to make biweekly decisions.
# let's assume this data is updated every half minute
data = [3, 4, 7, 10, 2, 32, 15, 8]
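from multiprocessing import Process
from file_location import data

def process_manager(data):
    results = sum(sorted(data))  # sort the entries, then sum them
    print(results)

if __name__ == '__main__':
    p = Process(target=process_manager, args=(data,))  # pass the function itself; args must be a tuple
    p.start()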
To explain, we import multiprocessing and our near-real-time data object into our code environment. We define a function to govern the process that we want to run every minute.
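Here you can see how Process takes two arguments: target, the function to run, and args, a tuple of the arguments to pass to that function. Note that target receives the function itself, not the result of calling it. Calling start() on the resulting object then launches a new process that runs alongside our application.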
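The "run it every minute" part can be sketched with a simple loop and time.sleep(). This is only one possible approach, under a few assumptions: the data list is defined inline rather than imported from file_location so the example is self-contained, the helper names summarize and run_batches are hypothetical, and the demo uses a one-second interval so it finishes quickly (pass 60 to match the tutorial's schedule).

```python
import time
from multiprocessing import Process

# Same sample values as the tutorial's data list, inlined for a self-contained example.
data = [3, 4, 7, 10, 2, 32, 15, 8]


def summarize(values):
    """Return the sorted values together with their sum."""
    ordered = sorted(values)
    return ordered, sum(ordered)


def process_manager(values):
    """The work each child process performs: summarize and report."""
    ordered, total = summarize(values)
    print(f"sorted={ordered} total={total}")


def run_batches(values, interval_seconds, batches):
    """Run process_manager in a fresh process `batches` times, pausing in between."""
    for batch in range(batches):
        p = Process(target=process_manager, args=(values,))
        p.start()
        p.join()  # wait for this batch to finish before scheduling the next
        if batch < batches - 1:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Short interval for the demo; the tutorial's scenario would use 60 seconds.
    run_batches(data, interval_seconds=1, batches=2)
```

A plain loop is the simplest way to simulate a batched job. A production system would more likely rely on a scheduler such as cron or a task queue, which is outside the scope of this sketch.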