Adrian Brown

Automation and Multi-processing w/ Python

TL;DR

Python gives us tools for automating tasks. Whether the work is file reads and writes or API calls, there are plenty of use cases where this comes in handy.

Let's start with understanding the what. So, what's a subprocess in Python?

A subprocess in Python is a task that a python script delegates to the Operative system (OS) - Daniel Diaz, GeekFlare

In short, a subprocess lets a Python script start other programs through the operating system, pass them input, and read back their output, which gives our code a new level of cohesiveness with the OS.

Introduction

By the end of this tutorial, you will:

  • Understand the concept of a subprocess
  • Have learned the basics of Python's multiprocessing and time modules
  • Have seen meaningful examples and use cases where you can apply the subprocess and multiprocessing modules

Core Concepts

In this tutorial I am assuming you have some basic foundations in Python programming and data structures.

The nice part about the subprocess module is that it is part of Python's standard library. This means there is no need to install it as a dependency using pip or Pipenv. It's there for us already!

# import the subprocess module (part of the standard library)
import subprocess

Quick use cases:

  1. Program checker
  2. Setting up virtual environments
  3. Running programs written in other languages
  4. Opening external programs
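
As a rough illustration of the "program checker" idea, the sketch below asks the running interpreter for its own version through subprocess.run. The helper name python_version is made up for this example, and sys.executable is used only because that command is guaranteed to exist wherever the script itself runs.

```python
import subprocess
import sys


def python_version() -> str:
    """Ask the current Python interpreter for its version via a child process."""
    completed = subprocess.run(
        [sys.executable, "--version"],  # command and its arguments, as a list
        capture_output=True,            # collect stdout/stderr instead of printing them
        text=True,                      # decode the output bytes to str
        check=True,                     # raise CalledProcessError on a non-zero exit code
    )
    # Very old interpreters wrote the version to stderr, so fall back to it.
    return (completed.stdout or completed.stderr).strip()


if __name__ == "__main__":
    print(python_version())
```

Swapping the command list for another program (for example a compiler or a shell tool) is how the same call would cover the other use cases, assuming that program is installed and on the PATH.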

Our Application

In this tutorial we will use the Process.start() method from the multiprocessing module to run background computations on Excel data.

Tutorial

Now that you know at a high level what the subprocess module does and what importing it looks like, you're ready for the main part of the tutorial.

One quick distinction first: from here on we will be using the Process.start() API from Python's multiprocessing library rather than the subprocess module itself.

So, why cover subprocesses?

I like to explain things backwards. When the material is more reading-intensive than visual, understanding the full context first helps me learn.

Recap: we now know what subprocesses are. Think of multiprocessing as the stepchild (really the parent) of what a subprocess allows you to do. Instead of handing a single task to the OS to run in the background, we can bring the work to the "foreground" and manage multiple processes ourselves from one script.
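
To make "multiple processes from one script" a little more concrete, here is a minimal sketch that splits a list into chunks and sums each chunk in its own worker process. It uses multiprocessing.Pool, which is a different entry point from the Process class used later in this tutorial; the helpers split and chunk_total are names invented for the illustration.

```python
from multiprocessing import Pool


def chunk_total(chunk):
    """Work done inside each worker process: sum one chunk."""
    return sum(chunk)


def split(entries, parts):
    """Split entries into at most `parts` slices of roughly equal size."""
    size = -(-len(entries) // parts)  # ceiling division
    return [entries[i:i + size] for i in range(0, len(entries), size)]


if __name__ == "__main__":
    numbers = list(range(1, 101))
    # Four worker processes each receive one chunk of the list.
    with Pool(processes=4) as pool:
        partial_totals = pool.map(chunk_total, split(numbers, 4))
    print(sum(partial_totals))  # 5050
```

For a list this small the process startup cost outweighs any gain, so treat it as a shape to follow rather than a speed-up.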

Task

Here we are going to take our data set and, using some basic arithmetic, sort and sum all the entries in our list of data. Once this is complete, we will schedule the task to run every 60 seconds to simulate a batch process in code. To draw on a real-world scenario, this data set could be healthcare or financial data for a company, and your job has been to extract and transform this data into a list. The company then needs a summary of all the data in this list to make biweekly decisions.

Our data set
# let's assume this data is updated every half minute
data = [3, 4, 7, 10, 2, 32, 15, 8]
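
Before adding any processes, it can help to see the arithmetic on its own. The sketch below sorts and sums the sample list in a single pass; the summarize helper and the keys of the dictionary it returns are illustrative choices, not part of the original code.

```python
data = [3, 4, 7, 10, 2, 32, 15, 8]


def summarize(entries):
    """Return the entries in ascending order together with their total and count."""
    ordered = sorted(entries)
    return {"sorted": ordered, "total": sum(ordered), "count": len(ordered)}


summary = summarize(data)
print(summary["sorted"])  # [2, 3, 4, 7, 8, 10, 15, 32]
print(summary["total"])   # 81
```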
Code
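import time
from multiprocessing import Process

from file_location import data  # the module that holds the data list shown above

def process_manager(data):
    # Summarize the data once a minute, like a batch job.
    while True:
        results = sum(data)
        print(f"Sum of entries: {results}")  # a Process discards return values, so report here
        time.sleep(60)

if __name__ == '__main__':
    # Pass the function itself (no parentheses); args must be a tuple, hence the trailing comma.
    p = Process(target=process_manager, args=(data,))
    p.start()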

To explain: we import Process from the multiprocessing module along with our near-real-time data object. We then define a function that governs the work we want to run every minute.

Here you can see that Process takes two arguments: target, the function to run, and args, a tuple of arguments to pass to that function. Together they tell Python to start a new process that runs the function alongside the rest of our application.
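
One thing worth knowing about target functions is that whatever they return is discarded by Process. A common way to get a value back to the parent is a multiprocessing.Queue, sketched below under the assumption that a single sum is all we need; the function name sum_into_queue is hypothetical.

```python
from multiprocessing import Process, Queue


def sum_into_queue(data, out_queue):
    """Compute the total in the child process and hand it back through the queue."""
    out_queue.put(sum(data))


if __name__ == "__main__":
    data = [3, 4, 7, 10, 2, 32, 15, 8]
    results = Queue()
    # target is the function object; args is a tuple of its arguments.
    p = Process(target=sum_into_queue, args=(data, results))
    p.start()
    total = results.get()  # blocks until the child has put its result
    p.join()               # wait for the child process to exit
    print(total)           # 81
```

Reading from the queue before calling join() is deliberate: it makes sure the child's result has been consumed before we wait for it to exit.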

NOTE: Python is known to struggle with concurrency compared to languages like Rust. In standard CPython builds, the global interpreter lock keeps threads from running Python code in parallel, and multiprocessing works around that at the cost of extra startup time and memory per process, so be sure to weigh the trade-offs when making time-cost decisions. Feel free to copy/paste and try this in your own IDE. If you run into issues, spend some time debugging and getting your code to run. This should be a safe exercise for practicing your debugging skills, and you will gain more experience with and understanding of Python!
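
One issue you may hit when experimenting is that a while True loop in a child process keeps running until the process is killed. A possible way around that, sketched here with a multiprocessing.Event, is to let the parent signal when the worker should stop. The function name periodic_sum and the short interval are assumptions made so the demo finishes quickly; the tutorial's own interval is 60 seconds.

```python
import time
from multiprocessing import Event, Process


def periodic_sum(data, stop_event, interval_seconds):
    """Sum data repeatedly until stop_event is set; return how many batches ran."""
    runs = 0
    while not stop_event.is_set():
        print(f"batch {runs + 1}: total = {sum(data)}")
        runs += 1
        # wait() acts as an interruptible sleep: it returns early once the event is set.
        stop_event.wait(interval_seconds)
    return runs


if __name__ == "__main__":
    stop = Event()
    p = Process(target=periodic_sum, args=([3, 4, 7, 10, 2, 32, 15, 8], stop, 0.2))
    p.start()
    time.sleep(0.5)  # let the worker run for a moment
    stop.set()       # ask the worker to finish its loop
    p.join()         # wait for it to exit cleanly
```

Calling p.terminate() would also end the worker, but it stops the process abruptly, so the event-based approach is usually the gentler option.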

Link to multiprocess
