CLI apps with Python

#python #beginners #ebook #exercises

This chapter shows a few examples of processing CLI arguments using the sys and argparse modules. It also introduces the fileinput module, which is handy for in-place file editing.

sys.argv

Command line arguments passed when executing a Python program can be accessed as a list of strings via sys.argv. The first element (index 0) contains the name of the Python script, -c, or an empty string, depending upon how the Python interpreter was called. The rest of the elements hold the command line arguments, if any were passed along with the script to be executed. See docs.python: sys.argv for more details.
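
As a minimal sketch of that layout, the snippet below separates the first element from the remaining arguments. The split_argv() helper is a hypothetical name used only for this illustration; it is not part of the sys module.

```python
import sys

def split_argv(argv):
    """Return (script_name, remaining_arguments) for an argv-style list."""
    # index 0 is the script name (or '-c' or ''), the rest are the arguments
    return argv[0], argv[1:]

# sample list, mimicking: python3 show_args.py 2 3.14
name, cli_args = split_argv(['show_args.py', '2', '3.14'])
print(name)      # show_args.py
print(cli_args)  # ['2', '3.14']

# the same helper applied to the real argument list of this process
print(split_argv(sys.argv))
```

Note that every argument arrives as a string, so numeric input has to be converted before use.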

Here's a program that accepts two numbers passed as CLI arguments and displays the sum only if the input was passed correctly.

# sum_two_nums.py
import ast
import sys

try:
    num1, num2 = sys.argv[1:]
    total = ast.literal_eval(num1) + ast.literal_eval(num2)
except ValueError:
    sys.exit('Error: Please provide exactly two numbers as arguments')
else:
    print(f'{num1} + {num2} = {total}')

The ast.literal_eval() function is handy for converting a string value to built-in literals, especially for collection data types. If you wanted to use int() and float() for the above program, you'd first have to add logic for deciding whether each input is an integer or a floating-point number. A string passed to sys.exit() is printed to the stderr stream and the exit status is set to 1, in addition to terminating the script.

Here's a sample run:

$ python3.9 sum_two_nums.py 2 3.14
2 + 3.14 = 5.140000000000001
$ echo $?
0

$ python3.9 sum_two_nums.py 2 3.14 7
Error: Please provide exactly two numbers as arguments
$ echo $?
1
$ python3.9 sum_two_nums.py 2 abc
Error: Please provide exactly two numbers as arguments
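
To illustrate why ast.literal_eval() was chosen over int() and float(), here's a small sketch that converts a few sample strings. The sample values and the converted dictionary are made up for this illustration.

```python
import ast

samples = ('2', '3.14', '[1, 2]', "{'a': 1}")

# literal_eval picks the matching built-in type for each string
converted = {text: ast.literal_eval(text) for text in samples}
for text, value in converted.items():
    print(f'{text!r} -> {value!r} ({type(value).__name__})')

# int() on its own cannot handle a floating-point string
try:
    int('3.14')
except ValueError as e:
    int_error = str(e)
    print(f'int() failed: {int_error}')
```

Since list and dict literals are converted too, a program that expects only numbers still has to check the resulting types.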

As an exercise, modify the sum_two_nums.py program to handle TypeError exceptions. Instead of the traceback shown below, inform the user about the error using the sys.exit() function.

$ python3.9 sum_two_nums.py 2 [1]
Traceback (most recent call last):
  File "/home/learnbyexample/Python/programs/sum_two_nums.py", line 6, in <module>
    total = ast.literal_eval(num1) + ast.literal_eval(num2)
TypeError: unsupported operand type(s) for +: 'int' and 'list'

As another exercise, accept one or more numbers as input arguments. Calculate and display the following details about the input — sum, product and average.

In-place editing with fileinput

To edit a file in-place, the fileinput module comes in handy. Here's a program that loops over filenames passed as CLI arguments (i.e. sys.argv[1:]), does some processing and writes back the changes to the original input files. You can also provide one or more filenames to the files keyword argument, if you do not wish to pass them as CLI arguments.

# inplace_edit.py
import fileinput

with fileinput.input(inplace=True) as f:
    for ip_line in f:
        op_line = ip_line.rstrip('\n').capitalize() + '.'
        print(op_line)

Note that unlike open(), the FileInput object doesn't support the write() method. However, using print() is enough, since standard output is redirected to the input file while inplace=True is active. Here's a sample run:

$ python3.9 inplace_edit.py [io]p.txt

$ # check if files have changed
$ cat ip.txt
Hi there.
Today is sunny.
Have a nice day.
$ cat op.txt
This is a sample line of text.
Yet another line.

$ # if stdin is passed as input, inplace gets disabled
$ echo 'GooD moRNiNg' | python3.9 inplace_edit.py
Good morning.

As inplace=True permanently modifies your input files, it is always a good idea to check your logic on sample files first. That way your data won't be lost because of an error in your program. You can also ask fileinput to create backups if you need to recover the original files later. For example, backup='.bkp' will create backups by adding .bkp as the suffix to the original filenames.
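
Here's a minimal sketch that combines the files and backup keyword arguments. It works on a throwaway file created in a temporary directory, so the greeting.txt name and its content are assumptions made purely for illustration.

```python
import fileinput
import os
import tempfile

# create a throwaway sample file so that no real data is touched
tmp_dir = tempfile.mkdtemp()
sample = os.path.join(tmp_dir, 'greeting.txt')
with open(sample, 'w') as f:
    f.write('hi there\nhave a nice day\n')

# files= is used instead of sys.argv[1:]
# backup= keeps a copy of the original content as greeting.txt.bkp
with fileinput.input(files=[sample], inplace=True, backup='.bkp') as lines:
    for ip_line in lines:
        print(ip_line.rstrip('\n').capitalize() + '.')

with open(sample) as f:
    edited = f.read()
with open(sample + '.bkp') as f:
    original = f.read()

print(edited, end='')
print(original, end='')
```

After the run, greeting.txt holds the modified lines and greeting.txt.bkp holds the untouched input.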

argparse

sys.argv is good enough for simple use cases. If you wish to create a CLI application with various kinds of flags and arguments (some of which may be optional/mandatory) and so on, use a module such as the built-in argparse or a third-party solution like click.

Quoting from docs.python: argparse:

The argparse module makes it easy to write user-friendly command-line interfaces. The program defines what arguments it requires, and argparse will figure out how to parse those out of sys.argv. The argparse module also automatically generates help and usage messages and issues errors when users give the program invalid arguments.

Here's a CLI application that accepts a file containing a list of filenames that are to be sorted by their extension. Files with the same extension are further sorted in ascending order. The program also implements an optional flag to remove duplicate entries.

# sort_ext.py
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file', required=True,
                    help="input file to be sorted")
parser.add_argument('-u', '--unique', action='store_true',
                    help="sort uniquely")
args = parser.parse_args()

# the with statement closes the file once the lines are read
with open(args.file) as f:
    ip_lines = f.readlines()
if args.unique:
    ip_lines = set(ip_lines)

op_lines = sorted(ip_lines, key=lambda s: (s.rsplit('.', 1)[-1], s))
for line in op_lines:
    print(line, end='')

The documentation for the CLI application is generated automatically based on the information passed to the parser. You can use the help options (which are added automatically too) to view the documentation, as shown below:

$ python3.9 sort_ext.py -h
usage: sort_ext.py [-h] -f FILE [-u]

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  input file to be sorted
  -u, --unique          sort uniquely

$ python3.9 sort_ext.py
usage: sort_ext.py [-h] -f FILE [-u]
sort_ext.py: error: the following arguments are required: -f/--file

The add_argument() method allows you to add details about an option/argument for the CLI application. The first parameter names either a positional argument or an option (a name that starts with -). The help keyword argument lets you add documentation for that particular option/argument. See docs.python: add_argument for documentation and details about other keyword arguments.

The above program adds two options, one to store the filename to be sorted and the other to act as a flag for sorting uniquely. Here's a sample text file that needs to be sorted based on the extension.

$ cat sample.txt
input.log
basic.test
input.log
out.put.txt
sync.py
input.log
async.txt

Here's the output with both types of sorting supported by the program.

# default sort
$ python3.9 sort_ext.py -f sample.txt
input.log
input.log
input.log
sync.py
basic.test
async.txt
out.put.txt

# unique sort
$ python3.9 sort_ext.py -uf sample.txt
input.log
sync.py
basic.test
async.txt
out.put.txt
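
The ordering seen in these sample runs comes from the tuple returned by the sort key. As a rough sketch, the snippet below applies the same key to the unique names from sample.txt. The ext_key() function name is made up for this illustration, and the names are given without trailing newlines for simplicity.

```python
names = ['input.log', 'basic.test', 'out.put.txt', 'sync.py', 'async.txt']

def ext_key(s):
    # split only at the last '.', so 'out.put.txt' gives the extension 'txt'
    # the full name is the tie-breaker for files with the same extension
    return (s.rsplit('.', 1)[-1], s)

ordered = sorted(names, key=ext_key)
for name in ordered:
    print(f'{name:12} {ext_key(name)}')
```

Tuples are compared element by element, which is why async.txt comes before out.put.txt even though both share the txt extension.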

See docs.python HOWTOs: Argparse Tutorial for a more detailed introduction.
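
As a small taste of the other add_argument() keyword arguments, here's a sketch that uses type, default and choices. The path, --count and --mode names are hypothetical and exist only for this illustration. A list is passed to parse_args() so that the snippet doesn't depend on the actual command line.

```python
import argparse

parser = argparse.ArgumentParser()
# no leading '-', so this is a mandatory positional argument
parser.add_argument('path', help="file to process")
# type converts the string input, default applies when the option is absent
parser.add_argument('-n', '--count', type=int, default=1,
                    help="number of repetitions")
# choices restricts the accepted values
parser.add_argument('-m', '--mode', choices=['asc', 'desc'], default='asc',
                    help="sorting order")

args = parser.parse_args(['-n', '3', 'sample.txt'])
print(args.path)   # sample.txt
print(args.count)  # 3
print(args.mode)   # asc
```

Passing a value outside of choices, or a non-integer value for --count, makes argparse print an error message and exit.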

Accepting stdin

CLI tools like grep, sed, awk and many others can accept data from stdin as well as accept filenames as arguments. Shown below is the previous program, modified to also read from stdin. args.file is now a positional argument instead of an option. nargs='?' indicates that this argument is optional. type=argparse.FileType('r') allows you to automatically get a filehandle in read mode for the filename supplied as an argument. If a filename isn't provided, default=sys.stdin kicks in and you get a filehandle for the stdin data.

# sort_ext_stdin.py
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('file', nargs='?',
                    type=argparse.FileType('r'), default=sys.stdin,
                    help="input file to be sorted")
parser.add_argument('-u', '--unique', action='store_true',
                    help="sort uniquely")
args = parser.parse_args()

ip_lines = args.file.readlines()
if args.unique:
    ip_lines = set(ip_lines)

op_lines = sorted(ip_lines, key=lambda s: (s.rsplit('.', 1)[-1], s))
for line in op_lines:
    print(line, end='')

Here's the help for the modified program:

$ python3.9 sort_ext_stdin.py -h
usage: sort_ext_stdin.py [-h] [-u] [file]

positional arguments:
  file          input file to be sorted

optional arguments:
  -h, --help    show this help message and exit
  -u, --unique  sort uniquely

Here's a sample run showing both stdin and filename argument functionality.

# 'cat' is used here for illustration purposes only
$ cat sample.txt | python3.9 sort_ext_stdin.py
input.log
input.log
input.log
sync.py
basic.test
async.txt
out.put.txt
$ python3.9 sort_ext_stdin.py -u sample.txt
input.log
sync.py
basic.test
async.txt
out.put.txt
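
To check both code paths without a shell, you can pass an explicit list to parse_args(). The sketch below assumes a throwaway names.txt file created in a temporary directory. The build_parser() helper is a hypothetical name used only for this illustration.

```python
import argparse
import os
import sys
import tempfile

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('file', nargs='?',
                        type=argparse.FileType('r'), default=sys.stdin,
                        help="input file to be sorted")
    return parser

parser = build_parser()

# no positional argument: the default (sys.stdin) is used as is
args = parser.parse_args([])
uses_stdin = args.file is sys.stdin
print(uses_stdin)

# filename given: FileType('r') opens it and hands back a filehandle
tmp_path = os.path.join(tempfile.mkdtemp(), 'names.txt')
with open(tmp_path, 'w') as f:
    f.write('b.txt\na.py\n')
args = parser.parse_args([tmp_path])
with args.file as fh:
    lines = fh.readlines()
print(lines)
```

Either way, the rest of the program only sees a filehandle and doesn't need to know where the data came from.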