DEV Community

TaiKedz


Argument parsing and subparsers in Python

(Image (C) Tai Kedzierski)

Python ArgParse Quick-reference

A quick reference for argument parsing - and a suggestion for a sub-command implementation model.

Parsing - a quick reference

General argument definition (from the standard documentation), annotated.

Note that the type= argument is actually a callable, and can be any function or constructor that takes the raw string and returns the converted value. This way, an argument is converted as soon as it is parsed; anything that uses it afterwards receives a ready-to-use value, with no interpretation deferred to later code.

parser.add_argument(
    'integers', # Positional argument name

    metavar='N', # display name used in the usage and help text only;
                 # the value is still accessed as "parsed_args.integers"

    type=int,    # actually a type converter - the callable must take 1 argument,
                 #  the value, and return the value coerced to the correct type

    nargs='+',   # Requires one or more values, which are collected into a list

    help='an integer for the accumulator'  # A help string for "--help" option
    )
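
As a minimal sketch of the converter idea described above: port_number is a hypothetical helper invented for this illustration, and pathlib.Path stands in for "a constructor used as a converter". The argument list is passed explicitly so the snippet does not depend on the real command line.

```python
import argparse
from pathlib import Path


def port_number(raw):
    """Hypothetical converter: turn the raw string into a validated port."""
    value = int(raw)  # a ValueError raised here is reported by argparse as a usage error
    if not 0 < value < 65536:
        raise argparse.ArgumentTypeError(f"{raw!r} is not a valid port number")
    return value


parser = argparse.ArgumentParser()
parser.add_argument("--port", type=port_number, default=8080)
parser.add_argument("--config", type=Path)  # a constructor works as a converter too
parser.add_argument("integers", metavar="N", type=int, nargs="+")

args = parser.parse_args(["--port", "9000", "--config", "settings.ini", "1", "2", "3"])

print(args.port)            # already an int, no later conversion needed
print(args.config.suffix)   # already a Path object
print(sum(args.integers))   # read via the argument name, not via the metavar "N"
```

Code further down the line can then rely on args.port being a valid integer and args.config being a Path, instead of re-checking strings.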

If you want an option to actually be a flag, you can specify an action instead:

parser.add_argument(
    # Optional argument name
    "--say-hi",

    # An action spec. See also store_const
    # this populates the parsed_args.say_hi with True
    #  if specified, but defaulting to False otherwise
    action="store_true"
    )


# If no action spec is provided, the optional argument
#  specifies that it needs to take a value
parser.add_argument(
    "--config", # optional argument
    "-c", # shorthand
    )
# Essentially expects `... --config SOME_VALUE ...`
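
To see what the two definitions above produce, here is a small sketch that parses one argument list with the options supplied and one without. The file name app.ini is just an illustrative value.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--say-hi", action="store_true")  # flag: no value expected
parser.add_argument("--config", "-c")                 # option: expects a value

given = parser.parse_args(["--say-hi", "-c", "app.ini"])
omitted = parser.parse_args([])

# Dashes in option names become underscores in attribute names
print(given.say_hi, given.config)
print(omitted.say_hi, omitted.config)
```

With nothing supplied, the flag falls back to False and the value-taking option to None, unless a default= is given.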

Subparsers

Your script might take subcommands: the caller picks one of several actions, each with its own argument tree. Subcommands can, in turn, have subcommands of their own.

For example, the awscli tool:

  • top level command is aws
  • for S3 interaction, we use the s3 subcommand: aws s3
  • we may then want to copy a file to or from a bucket: aws s3 cp

A basic example of implementing an argument parsing function with subparsers (for sub commands):


import argparse


def parse_app_args(arguments=None):
    # Create a top-level parser
    parser = argparse.ArgumentParser()

    # Immediately create a subparser holder for a sub-command.
    # The sub-command's string will be stored in "parsed_args.cmd"
    #   of the parsed arguments' namespace object
    subparsers = parser.add_subparsers(dest="cmd")

    # Now we have the holder, add actual parsers to it

    # The first subcommand. When arguments are parsed, the value "status"
    #   would be found in "parsed_args.cmd" , because it is a named parser for the
    #   subparser collection with `dest="cmd"` from above
    subparsers.add_parser("status")

    # We add a second subcommand, whose value is an alternative
    # available to `cmd`
    subp_power = subparsers.add_parser("power")

    # Sub-arguments - uses the same function signature as described
    # in the documentation for top-level "add_argument"
    # Note the use of `choices=` to restrict the accepted raw values
    subp_power.add_argument("state", choices=["on", "off"])

    # parse_args implicitly parses sys.argv[1:]
    #   if arguments is None
    # but you could also parse a custom list of args as well...
    return parser.parse_args(arguments)


parsed_args = parse_app_args()
print(parsed_args)

# print_status() and set_power() stand in for your application's own functions
if parsed_args.cmd == "status":
    print_status()
elif parsed_args.cmd == "power":
    set_power(parsed_args.state)
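
Since sub-commands can have sub-commands of their own, here is a minimal sketch of two levels of nesting. The cloud, storage, copy and list names are made up for this illustration and do not mirror any real tool; required=True on add_subparsers assumes Python 3.7 or later.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="cloud")

    # First level: which service to talk to
    services = parser.add_subparsers(dest="service", required=True)
    storage = services.add_parser("storage")

    # Second level: a subparser holder attached to the "storage" parser
    storage_actions = storage.add_subparsers(dest="action", required=True)

    copy = storage_actions.add_parser("copy")
    copy.add_argument("source")
    copy.add_argument("destination")

    listing = storage_actions.add_parser("list")
    listing.add_argument("path", nargs="?", default="/")

    return parser


args = build_parser().parse_args(["storage", "copy", "a.txt", "b.txt"])
print(args.service, args.action, args.source, args.destination)

listed = build_parser().parse_args(["storage", "list"])
print(listed.service, listed.action, listed.path)
```

Each level stores its chosen name under its own dest, so the dispatch code can branch first on service and then on action.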

A suggestion for subcommand-to-module implementation

Often you will find you need multiple actions for your application - that's when you can implement subcommands, and use sub-parsers.

At this point, it is worth considering that unless you like large, unwieldy files, or hopping around trying to marry argument parsers in one file with their corresponding modules in another, then you might want to be a bit more strategic about your implementation.

In the following example, we implement

  • the main logic and base parser in the main.py file
  • but specifically let the modules be in charge both of
    • the arguments they expect for their subcommands
    • their actual implementation
  • with all modules providing both setup_args(subparsers) and run(args) functions

main.py

#!/usr/bin/env python3

import argparse

# Some fictional machine API - split the logic into modules
import power
import engine


def parse_app_args(args=None):
    parser = argparse.ArgumentParser()

    # required=True (Python 3.7+) makes argparse reject a call with no
    #  subcommand, so "cmd" is never None after parsing
    subparsers = parser.add_subparsers(dest="cmd", required=True)

    # Farming out the subparser definitions to their respective modules
    # So each module can define its parsing options itself
    power.setup_args(subparsers)
    engine.setup_args(subparsers)

    return parser.parse_args(args)


def main():
    parsed_args = parse_app_args()

    # We make a point of moving subcommand implementations to their own files,
    #  to declutter this main file
    command_map = {
        "power": power.run,
        "engine": engine.run,
    }

    # Because the parser only accepts the names of registered subparsers,
    #  and one is required, we can consider the check already done for us :-)
    command_map[parsed_args.cmd](parsed_args)


if __name__ == "__main__":
    main()


engine.py

def setup_args(subparsers):
    subp_engine = subparsers.add_parser("engine")
    subp_engine.add_argument("speed", type=int)


def run(args):
    print("Setting engine speed: {}".format(args.speed))

power.py

def setup_args(subparsers):
    subp_power = subparsers.add_parser("power")
    subp_power.add_argument("state", choices=["on", "off"])


def run(args):
    print("Setting power state: {}".format(args.state))
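
A possible variation on the command_map lookup, sketched here rather than taken from the example above, is to let each subparser register its own handler with set_defaults(func=...). To keep the sketch self-contained, everything lives in one file; the run_power and run_engine names are hypothetical stand-ins for power.run and engine.run, and they return their message instead of printing it.

```python
import argparse


def run_power(args):
    return "Setting power state: {}".format(args.state)


def run_engine(args):
    return "Setting engine speed: {}".format(args.speed)


def build_parser():
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest="cmd", required=True)

    subp_power = subparsers.add_parser("power")
    subp_power.add_argument("state", choices=["on", "off"])
    # The handler travels with the subparser that defines the arguments
    subp_power.set_defaults(func=run_power)

    subp_engine = subparsers.add_parser("engine")
    subp_engine.add_argument("speed", type=int)
    subp_engine.set_defaults(func=run_engine)

    return parser


args = build_parser().parse_args(["engine", "42"])
message = args.func(args)  # no name-to-function map needed in main
print(message)
```

In the modular layout, each module's setup_args would call set_defaults(func=run) itself, so main.py would no longer need to list the subcommand names at all.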

Making one argument dependent on another

Subparsers are useful when mutually-exclusive sub-commands are to be specified.

However, in some cases a secondary argument is required only when its primary argument was supplied, and several such primary arguments are expected to co-exist.

For example, I could have a demo-server standup tool:

standup.py [--server-1 --config-1 CONFIG1] [--server-2 --config-2 CONFIG2]

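Following a StackOverflow response on the problem, one way to do this is to parse in two steps: a first parser detects the primary flags, and the main parser then uses them to decide which arguments are required.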

import argparse

# Add a checker to set the required flags.
# add_help=False is needed on any parser used in `parents=`, otherwise
#   its "-h/--help" option conflicts with the main parser's own
checker = argparse.ArgumentParser(add_help=False)
checker.add_argument("--server-1", action="store_true")
checker.add_argument("--server-2", action="store_true")
checks, _ = checker.parse_known_args()

# Create a new main parser, inheriting the argument definitions
#   from the checker, to not duplicate definitions.
parser = argparse.ArgumentParser(parents=[checker])

# Use the flags from `checker` to flip the requirement state
#   as needed
parser.add_argument("--config-1", required=checks.server_1)
parser.add_argument("--config-2", required=checks.server_2)

parsed_args = parser.parse_args()
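
A single-pass alternative, offered as a sketch and not taken from the StackOverflow response, is to parse everything as optional and then check the dependencies by hand, reporting problems through parser.error(). The parse_standup_args name and the .ini file name are assumptions made for this illustration.

```python
import argparse


def parse_standup_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--server-1", action="store_true")
    parser.add_argument("--config-1")
    parser.add_argument("--server-2", action="store_true")
    parser.add_argument("--config-2")
    args = parser.parse_args(argv)

    # parser.error() prints the usage line plus the message, then exits
    if args.server_1 and args.config_1 is None:
        parser.error("--server-1 requires --config-1")
    if args.server_2 and args.config_2 is None:
        parser.error("--server-2 requires --config-2")
    return args


ok = parse_standup_args(["--server-1", "--config-1", "one.ini"])
print(ok.server_1, ok.config_1)

# A primary flag without its config is rejected with a normal usage error
try:
    parse_standup_args(["--server-2"])
    rejected = False
except SystemExit:
    rejected = True
print(rejected)
```

The trade-off is that the requirement no longer shows up in the generated usage line, so it is worth mentioning in each argument's help= text.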

Top comments (3)

xtofl

I love the modularity this brings! The nice thing about this approach is that it doesn't rely on anything but standard libraries.

The downside is that the subcommand run functions receive the whole args namespace, which invites the implementer to use information not intended for them.

(btw, parser.parse_args() uses sys.argv[1:] by default, no need to provide them)

I recently gave up on my attachment to 'as pure-python as possible', and started using click.group. click reduces argument parsing boilerplate to the bare minimum. On top of that, it pulls you towards separation of parsing and business logic by offering convenient decorators.

The pattern in click becomes:

# cli.py
import click  # sigh.  a third party lib.  o well, let's give it a chance.
import power
import engine

@click.group()
def cli():
  pass

cli.add_command(engine.cli, "engine")
cli.add_command(power.cli, "power")

if __name__=="__main__":
    cli()

# power.py
import click

@click.command()
@click.option('--state', type=click.Choice(("on", "off")))
def cli(state):
  print("Setting power state: {}".format(state))
TaiKedz • Edited

Ah yes, when there are multiple other parts of the command, the subcommands can get in a bit of a tizzy, or even clash.... I guess each submodule's implementor needs to be aware of each other. I'd actually imagine this were possible with a bit of __eq__() hackery

if parsed_args.cmd == "power":
    power.run(parsed_args.cmd)

where parsed_args.cmd is actually a namespace - but in an equality check, uses its string value... we explored that kind of overriding didn't we 😅

click looks interesting.... yes, I do also have a (strong!) preference to try and stay with the default batteries and only look askance if it's really getting too hairy....

With click though, I dread a cascading chain of decorators when we start dealing with a multitude of arguments .... I'm also reeling from some of the syntax that it seems to introduce...

TaiKedz • Edited

Also, I tried the use-case of multiple-commands and, of course, only one subcommand can be passed.

Subcommands can co-exist with other options at the same level as the subcommand so the problem persists.... then again, only one subparser can exist on one same level. Getting several to co-exist might in fact be beyond the scope of argparse and doing so might be a rather fraught exercise...

Doing more than script.py subcommand $SUB_CMD_ARGS (like, some ill-fated script.py sub1 $SUB_CMD_ARGS ... sub2 $SUB2_ARGS ... - where do the arguments of sub1 end and where does sub2 start ??) likely warrants a revised approach ... EDIT: this is nonsensical. Only one subcommand should run at a time. What I was thinking of was, each option having required supplementary arguments.

Like (--activate-engine1 requiring as a result --engine1-config CONFIG1) alongside (--activate-engine2 requiring as a result --engine2-config CONFIG2). Making some arguments dependent on the existence of others.