Kshitij Chawla

Posted on Mar 23, 2019 • Edited on Apr 6, 2019

Code. Changing it without breaking it, using a Decorator.

#python #decorators #softwaredevelopment #programming

How does one go from knowing the functionality to using it to solve problems?

The answer is by using it in a personal low impact, side project. Reading a post like this might help as well.

I recently started working as a Python Developer for Nilearn (http://nilearn.github.io/) & Nistats (http://nistats.github.io/), two open source python libraries used in neuroscience for analyzing functional MRI data.

After merging a recent pull-request in Nilearn, it occurred to me after the fact, that it would have been useful to document the entire process for other people like me who are new to development work and did not write commercial code in their previous occupation.

I decided to retrace and reconstruct my steps and my thinking at the time, adding code and text to illustrate them. This has the unfortunate side-effect that the depth of detail is only as good as my memory and the actual code samples will only be as good as the commits I made. There will also be code samples that I wrote that I never committed or never wrote and simply thought of as possibilities in my mind.

That should not take away from the utility of this post.

Note that I (eventually) use decorators and **kwargs but do not explain these, as that is not the purpose with which I am writing this post, there are plenty of tutorials online to understand those two ideas.

Additionally, the libraries are currently Python2 compatible, so our choices can be slightly limited sometimes.

That being said, Geronimo!

Prologue

This is the function we had in our library. I have removed the docstring for brevity, and the functionality is not important for this post:

def view_connectome(adjacency_matrix, 
                    coords, 
                    threshold=None,
                    cmap=cm.cyan_orange, 
                    symmetric_cmap=True,
                    linewidth=6., 
                    marker_size=3.,
                    ):
    connectome_info = _get_connectome(
        adjacency_matrix, coords, threshold=threshold, cmap=cmap,
        symmetric_cmap=symmetric_cmap)
    connectome_info["line_width"] = linewidth
    connectome_info["marker_size"] = marker_size
    return _make_connectome_html(connectome_info)

The task was to make the view_connectome() consistent with another, older function in the library, plot_connectome(). We wanted them to have the same function signature.

Why? Consistency, which, when handled properly and reasonably, can assist with ease of use. In essence once a user understands one of our <>_connectome() function, they will know how to call all of them.

We wanted to change the names of certain parameters in the function signature:

coords > node_coords
cmap > edge_cmap
threshold > edge_threshold
marker_size > node_size

Zeroth Iteration:

If I was the ONLY user of this library, then the change is extremely simple:

def view_connectome(adjacency_matrix, 
                    node_coords,  # change param name here,
                    edge_threshold=None,  # here,
                    edge_cmap=cm.cyan_orange, # here,
                    symmetric_cmap=True,
                    linewidth=6., 
                    node_size=3.,  # and here.
                    ):
    connectome_info = _get_connectome(
    # rename the variables in this line, ...
        adjacency_matrix, node_coords, threshold=edge_threshold, cmap=edge_cmap,
        symmetric_cmap=symmetric_cmap)
    connectome_info["line_width"] = linewidth
    connectome_info["marker_size"] = node_size  # ...and here.
    return _make_connectome_html(connectome_info)

Rename the parameters in the function signature, rename the variables inside the function body, make the appropriate changes in the docstring aaand...
Boom! DONE!

Not.

The Considerations:

I am NOT the only user of this library. If I made the changes this way, it would break things for every other user's existing code if they used keyworded arguments (which they should).

Do notice that I didn't change the names everywhere, only in the parameters and the associated variables. I didn't change it in the key name in connectome_info["marker_size"] = node_size, and I didn't change the cmap parameter in _get_connectome().

This is because:

That is not what the PR set out to do, and PRs should be as compact as possible to make it easy to see, track, debug, and review changes. Any additional issue or potential improvement or code clean up is a new issue, a new PR, that one may file simultaneously, and separately from current PR.
Changing anything else may trigger downstream problems that will need to be attended to.
_get_connectome() is a private function in Python, not a user facing one, so making changes to that is considerably simpler in terms of consequences to user interface.

What we needed to do was figure out a way to:

Make sure the positional args don't break.
Make both the new and old names of parameters work for a time.
Show deprecation warning to users during this transition phase.

For Nilearn, that transition period typically means 2 releases (roughly 8-11 months), including point/minor releases.

The above implementation only satisfies the first consideration (1. Make sure the positional args don't break.).

First Iteration

It's easy enough to do.
We replace the original params with new ones to preserve positional arguments, then add the replaced params at the end so the old keyworded args will still work.

Then we add the code for handing over the args to new parameters and generating the appropriate deprecation warnings, within this function's body.

import warnings


def view_connectome(adjacency_matrix, 
                    node_coords, 
                    edge_threshold=None,
                    edge_cmap=cm.cyan_orange, 
                    symmetric_cmap=True,
                    linewidth=6., 
                    node_size=3.,
                    # placing old params here to preserve old keyworded args.
                    coords=None, threshold=None, cmap=None, marker_size=None,
                    ):
    # code for the deprecation
    param_deprecation_msg_template = ('The param {} is being deprecated and will be
replaced by the param {} in future. Please use {}.')
    if marker_size:
        node_size = marker_size
        warnings.warn(param_deprecation_msg_template.format('marker_size',
                                                            'node_size',
                                                            'node_size',
                                                            )
    if threshold is not None:  # numpy arrays don't have unambiguous truthiness.
        edge_threshold = threshold
        warnings.warn(param_deprecation_msg_template.format('threshold',
                                                            'edge_threshold',
                                                            'edge_threshold',
                                                            )
    if cmap is not None:  # numpy arrays don't have unambiguous truthiness.
        edge_cmap = cmap
        warnings.warn(param_deprecation_msg_template.format('cmap',
                                                            'edge_cmap',
                                                            'edge_cmap',
                                                            )
    if coords is not None:  # numpy arrays don't have unambiguous truthiness.
        node_coords = coords
        warnings.warn(param_deprecation_msg_template.format('coords',
                                                            'node_coords',
                                                            'node_coords',
                                                            )    
    # original functionality of the function,
    connectome_info = _get_connectome(adjacency_matrix, 
                                      node_coords,
                                      edge_threshold=threshold,
                                      cmap=edge_cmap,
                                      symmetric_cmap=symmetric_cmap,
                                      )
    connectome_info["line_width"] = linewidth
    connectome_info["marker_size"] = node_size
    return _make_connectome_html(connectome_info)

Okay, that works, but (hu)man! does that look unclean or what?!
I have made the code longer, harder to read, unpleasant to look at, plus the function body is now doing more than one thing: parsing and handling the args and what it was originally doing.

I have deliberately omitted the warnings.filter() part out here (DeprecationWarnings are not displayed to end-users by default) for brevity.

Second Iteration

I decided to refactor the verbose warnings part into its own private function and calling that from within the principal function.


def view_connectome(adjacency_matrix, 
                    node_coords, 
                    edge_threshold=None,
                    edge_cmap=cm.cyan_orange, 
                    symmetric_cmap=True,
                    linewidth=6., 
                    node_size=3.,
                    **kwargs,
                    ):
    # generate the deprecation warnings
    kwargs = _param_deprecation_view_connectome(kwargs)

    # handover the args from old params to new ones.
    if kwargs['marker_size']:
        node_size = marker_size
    if kwargs['threshold'] is not None:
        edge_threshold = threshold
    if kwargs['cmap'] is not None:
        edge_cmap = cmap
    if kwargs['coords'] is not None:
        node_coords = coords

    connectome_info = _get_connectome(
        adjacency_matrix, node_coords, threshold=edge_threshold, cmap=edge_cmap,
        symmetric_cmap=symmetric_cmap)
    connectome_info["line_width"] = linewidth
    connectome_info["marker_size"] = node_size
    return _make_connectome_html(connectome_info)


def _param_deprecation_view_connectome(kwargs):
    kwargs.setdefault('marker_size', None)
    kwargs.setdefault('threshold', None)
    kwargs.setdefault('cmap', None)
    kwargs.setdefault('coords', None)


    param_deprecation_msg_template = ('The param {} is being deprecated and will be
                                       replaced by the param {} in future. Please
                                     use {}.')
    if kwargs['marker_size']:
        warnings.warn(param_deprecation_msg_template.format('marker_size',
                                                             'node_size',
                                                             'node_size',
                                                            )
    if kwargs['threshold'] is not None:  # numpy arrays don't have unambiguous truthiness.
        warnings.warn(param_deprecation_msg_template.format('threshold',
                                                            'edge_threshold',
                                                            'edge_threshold',
                                                            )
    if kwargs['cmap'] is not None:  # numpy arrays don't have unambiguous truthiness.
        warnings.warn(param_deprecation_msg_template.format('cmap',
                                                            'edge_cmap',
                                                            'edge_cmap',
                                                            )
    if kwargs['coords'] is not None:  # numpy arrays don't have unambiguous truthiness.
        warnings.warn(param_deprecation_msg_template.format('coords',
                                                            'node_coords',
                                                            'node_coords',
                                                           )    
    return kwargs

Much better. Here's what I thought about it:

Less number of code lines added to the original function.
Easier to write a unit test to check raised warnings.
Function signature hardly changed more than minimum necessity.

and...

Several lines of code had to be added to the original function, nevertheless.
No unit test to test the correct handover of values from old parameters to new ones.
Function signature did change beyond minimum necessity.

Third Iteration

Since essentially what I wanted to do was modify the behaviour of an existing function, or decorate it, I decided to employ a Python feature dedicated to this task, Decorators.

So I wrote one.


def _deprecate_params_view_connectome(func):
    """ Decorator to deprecate specific parameters in view_connectome()
     without modifying view_connectome().
     """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        _warn_deprecated_params_view_connectome(kwargs)
        kwargs = _transfer_deprecated_param_vals_view_connectome(kwargs)
        return func(*args, **kwargs)

    return wrapper


@\_deprecate_params_view\_connectome
def view_connectome(adjacency_matrix, node_coords, edge_threshold=None,
                    edge_cmap=cm.bwr, symmetric_cmap=True,
                    linewidth=6., node_size=3.,
                    **kwargs):
    connectome_info = _get_connectome(
        adjacency_matrix, node_coords, threshold=edge_threshold, cmap=edge_cmap,
        symmetric_cmap=symmetric_cmap)
    connectome_info["line_width"] = linewidth
    connectome_info["marker_size"] = node_size
    return _make_connectome_html(connectome_info)


def _warn_deprecated_params_view_connectome(kwargs):
    """ For view_connectome(), raises warnings about deprecated parameters.
    """
    all_deprecated_params = {'coords': 'node_coords',
                             'threshold': 'edge_threshold',
                             'cmap': 'edge_cmap',
                             'marker_size': 'node_size',
                             }
    used_deprecated_params = set(kwargs).intersection(all_deprecated_params)
    for deprecated_param_ in used_deprecated_params:
        replacement_param = all_deprecated_params[deprecated_param_]
        param_deprecation_msg = (
            'The parameter "{}" will be removed in Nilearn version 0.6.0. '
            'Please use the parameter "{}" instead.'.format(deprecated_param_,
                                                            replacement_param,
                                                            )
        )
        warnings.filterwarnings('always', message=param_deprecation_msg)
        warnings.warn(category=DeprecationWarning,
                  message=param_deprecation_msg,
                  stacklevel=3)


def _transfer_deprecated_param_vals_view_connectome(kwargs):
    """ For view_connectome(), reassigns new parameters the values passed
    to their corresponding deprecated parameters.
    """
    coords = kwargs.get('coords', None)
    threshold = kwargs.get('threshold', None)
    cmap = kwargs.get('cmap', None)
    marker_size = kwargs.get('marker_size', None)

    if coords is not None:
        kwargs['node_coords'] = coords
    if threshold:
        kwargs['edge_threshold'] = threshold
    if cmap:
        kwargs['edge_cmap'] = cmap
    if marker_size:
        kwargs['node_size'] = marker_size
    return kwargs

Now this was much better!

I didn't have to change the function body at all, and the function signature change was acceptable to me. **kwargs was the only thing added to the signature, beyond changing the parameter name.

When the time came to remove this Deprecation entirely, the original function will not have to be meddled with at all.

I added a test (I think it is an integration test, but the semantics are not important for this post IMO).

def test_params_deprecation_view_connectome():
    deprecated_params = {'coords': 'node_coords',
                         'threshold': 'edge_threshold',
                         'cmap': 'edge_cmap',
                         'marker_size': 'node_size',
                         }
    deprecation_msg = (
        'The parameter "{}" will be removed in Nilearn version 0.6.0. '
        'Please use the parameter "{}" instead.'
    )
    warning_msgs = {old_: deprecation_msg.format(old_, new_)
                    for old_, new_ in deprecated_params.items()
                    }

    adj, coord = _make_connectome()
    with warnings.catch_warnings(record=True) as raised_warnings:
        html_connectome.view_connectome(adjacency_matrix=adj,
                                        coords=coord,
                                        edge_threshold='85.3%',
                                        edge_cmap=cm.cyan_orange,
                                        linewidth=8.5, node_size=4.2,
                                        )

        html_connectome.view_connectome(adjacency_matrix=adj,
                                        node_coords=coord,
                                        threshold='85.3%',
                                        edge_cmap=cm.cyan_orange,
                                        linewidth=8.5,
                                        node_size=4.2,
                                        )

        html_connectome.view_connectome(adjacency_matrix=adj,
                                        node_coords=coord,
                                        edge_threshold='85.3%',
                                        cmap=cm.cyan_orange,
                                        linewidth=8.5,
                                        node_size=4.2,
                                        )

        html_connectome.view_connectome(adjacency_matrix=adj,
                                        node_coords=coord,
                                        edge_threshold='85.3%',
                                        edge_cmap=cm.cyan_orange,
                                        linewidth=8.5,
                                        marker_size=4.2,
                                        )

        html_connectome.view_connectome(adjacency_matrix=adj,
                                        node_coords=coord,
                                        edge_threshold='85.3%',
                                        edge_cmap=cm.cyan_orange,
                                        linewidth=8.5,
                                        node_size=4.2,
                                        )

        html_connectome.view_connectome(adj,
                                        coord,
                                        '85.3%',
                                        cm.cyan_orange,
                                        8.5,
                                        4.2,
                                        )
    old_params = ['coords', 'threshold', 'cmap', 'marker_size']

    assert len(raised_warnings) == 4
    for old_param_, raised_warning_ in zip(old_params, raised_warnings):
        assert warning_msgs[old_param_] == str(raised_warning_.message)
        assert raised_warning_.category is DeprecationWarning

I thought that was that and this is good.

Fourth Iteration

Turns out few weeks later, I had to do the same thing with a another function in Nilearn, with its own set of parameters to be changed, and the same parameter in three different functions in Nistats.

That's when I decided to make this a general utility function, by enabling the decorator to accept arguments.

I also decided to reuse this code I was about to write for Nilearn to serve the same purpose in Nistats, because duh, and also, we are working on merging the Nistats library into Nilearn in the near future, so this was fine.

We didn't look for an external library that may already do this because:

We don't want to introduce any more dependencies in the code than we have to.
We didn't want to install a library for one small functionality.
Generally, a smaller (less used/popular) library might be more likely go out of maintenance. I haven't looked for any data to back this up, but seems sensical to me.
Generally, a lot of external contributors for Nilearn & Nistats libraries are scientists who happen to code, and while they are pretty great coders, introducing new things and libraries in the mix raises the contribution barrier.

I decided to create a general decorator for the purpose called replace_parameters and add it to our nilearn/_utils/helpers.py module.

Here's the code for the final thing:

def replace_parameters(replacement_params,
                       end_version='future',
                       lib_name='Nilearn',
                       ):
    """
    Decorator to deprecate & replace specified parameters
    in the decorated functions and methods
    without changing function definition or signature.

    Parameters
    ----------
    replacement_params : Dict[string, string]
        Dict where the key-value pairs represent the old parameters
        and their corresponding new parameters.
        Example: {old_param1: new_param1, old_param2: new_param2,...}

    end_version : str (optional) {'future' (default) | 'next' | <version>}
        Version when using the deprecated parameters will raise an error.
        For informational purpose in the warning text.

    lib_name: str (optional) (Default: 'Nilearn')
        Name of the library to which the decoratee belongs.
        For informational purpose in the warning text.
    """

    def _replace_params(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            _warn_deprecated_params(replacement_params, end_version, lib_name,
                                    kwargs
                                    )
            kwargs = _transfer_deprecated_param_vals(replacement_params,
                                                     kwargs
                                                     )
            return func(*args, **kwargs)

        return wrapper
    return _replace_params


def _warn_deprecated_params(replacement_params, end_version, lib_name, kwargs):
    """ For the decorator replace_parameters(),
        raises warnings about deprecated parameters.

    Parameters
    ----------
    replacement_params: Dict[str, str]
    Dictionary of old_parameters as keys with replacement parameters
    as their corresponding values.

    end_version: str
    The version where use of the deprecated parameters will raise an error.
    For informational purpose in the warning text.

    lib_name: str
    Name of the library. For informational purpose in the warning text.

    kwargs: Dict[str, any]
    Dictionary of all the keyword args passed on the decorated function.

    """
    used_deprecated_params = set(kwargs).intersection(replacement_params)
    for deprecated_param_ in used_deprecated_params:
        replacement_param = replacement_params[deprecated_param_]
        param_deprecation_msg = (
            'The parameter "{}" will be removed in {} release of {}. '
            'Please use the parameter "{}" instead.'.format(deprecated_param_,
                                                            end_version,
                                                            lib_name,
                                                            replacement_param,
                                                            )
        )
        warnings.filterwarnings('always', message=param_deprecation_msg)
        warnings.warn(category=DeprecationWarning,
                      message=param_deprecation_msg,
                      stacklevel=3)


def _transfer_deprecated_param_vals(replacement_params, kwargs):
    """ For the decorator replace_parameters(), reassigns new parameters
    the values passed to their corresponding deprecated parameters.

    Parameters
    ----------
    replacement_params: Dict[str, str]
    Dictionary of old_parameters as keys with replacement parameters
    as their corresponding values.

    kwargs: Dict[str, any]
    Dictionary of all the keyword args passed on the decorated function.

    Returns
    -------
    kwargs: Dict[str, any]
    Dictionary of all the keyword args to be passed on
    to the decorated function, with old parameter names
    replaced by new parameters, with their values intact.
    """
    for old_param, new_param in replacement_params.items():
        old_param_val = kwargs.setdefault(old_param, None)
        if old_param_val is not None:
            kwargs[new_param] = old_param_val
        kwargs.pop(old_param)
    return kwargs

Now did I need to write such extensive docstrings for the private functions here? Some may say no, I say yes.

Detailed docstring for internal/private functions eases understanding the code base for the next developer joining the team, especially when the original author may not be around.
It made it way easier for me to write unit tests, since I knew what was expected to come out. The docstring clarified all this in my mind.

And the principal function itself was this:


def _replacement_params_view_connectome():
    """ Returns a dict containing deprecated & replacement parameters
        as key-value pair for view_connectome().
        Avoids cluttering the global namespace.
    """
    return {
        'coords': 'node_coords',
        'threshold': 'edge_threshold',
        'cmap': 'edge_cmap',
        'marker_size': 'node_size',
        }


@replace_parameters(replacement_params=_replacement_params_view_connectome(),
                    end_version='0.6.0',
                    lib_name='Nilearn',
                    )
def view_connectome(adjacency_matrix, node_coords, edge_threshold=None,
                    edge_cmap=cm.bwr, symmetric_cmap=True,
                    linewidth=6., node_size=3.,
                    ):

    connectome_info = _get_connectome(
        adjacency_matrix, node_coords, threshold=edge_threshold, cmap=edge_cmap,
        symmetric_cmap=symmetric_cmap, marker_size=node_size)
    return _make_connectome_html(connectome_info)

Woohoo!

The function body didn't have to be changed beyond renaming the things we were trying to rename.
The function signature didn't have to be changed beyond renaming the parameters that we set out to rename. Turns out, with a decorator, I don't need an explicit **kwargs.
The functionality is easily reusable, clean, testable (and tested).

I decided to keep the integration test I had already written and added unit and integration tests for this new code.

from nilearn._utils.helpers import replace_parameters


def _mock_args_for_testing_replace_parameter():
    """
    :return: Creates mock deprecated & replacement parameters for use with
    testing functions related to replace_parameters().
    """
    mock_kwargs_with_deprecated_params_used = {
        'unchanged_param_0': 'unchanged_param_0_val',
        'deprecated_param_0': 'deprecated_param_0_val',
        'deprecated_param_1': 'deprecated_param_1_val',
        'unchanged_param_1': 'unchanged_param_1_val',
        }
    replacement_params = {
        'deprecated_param_0': 'replacement_param_0',
        'deprecated_param_1': 'replacement_param_1',
        }
    return mock_kwargs_with_deprecated_params_used, replacement_params


def test_replace_parameters():
    """ Integration tests that deprecates mock parameters in a mock function
    and checks that the deprecated parameters transfer their values correctly
    to replacement parameters and all deprecation warning are raised as
    expected.
    """
    mock_input, replacement_params = _mock_args_for_testing_replace_parameter()
    expected_output = ('dp0', 'dp1', 'up0', 'up1')
    expected_warnings = [
        ('The parameter "deprecated_param_0" will be removed in 0.6.1rc '
         'release of other_lib. Please use the parameter "replacement_param_0"'
         ' instead.'
         ),
        ('The parameter "deprecated_param_1" will be removed in 0.6.1rc '
         'release of other_lib. Please use the parameter "replacement_param_1"'
         ' instead.'
         ),
        ]

    @replace_parameters(replacement_params, '0.6.1rc', 'other_lib', )
    def mock_function(replacement_param_0, replacement_param_1,
                      unchanged_param_0, unchanged_param_1):
        return (replacement_param_0, replacement_param_1, unchanged_param_0,
                unchanged_param_1
                )

    with warnings.catch_warnings(record=True) as raised_warnings:
        actual_output = mock_function(deprecated_param_0='dp0',
                                      deprecated_param_1='dp1',
                                      unchanged_param_0='up0',
                                      unchanged_param_1='up1',
                                      )

    assert actual_output == expected_output

    expected_warnings.sort()
    raised_warnings.sort(key=lambda mem: str(mem.message))
    for raised_warning_, expected_warning_ in zip(raised_warnings,
                                                  expected_warnings):
        assert str(raised_warning_.message) == expected_warning_


def test_transfer_deprecated_param_vals():
    """ Unit test to check that values assigned to deprecated parameters are
    correctly reassigned to the replacement parameters.
    """
    mock_input, replacement_params = _mock_args_for_testing_replace_parameter()
    expected_output = {
        'unchanged_param_0': 'unchanged_param_0_val',
        'replacement_param_0': 'deprecated_param_0_val',
        'replacement_param_1': 'deprecated_param_1_val',
        'unchanged_param_1': 'unchanged_param_1_val',
        }
    actual_ouput = helpers._transfer_deprecated_param_vals(
            replacement_params,
            mock_input,
            )
    assert actual_ouput == expected_output


def test_future_warn_deprecated_params():
    """ Unit test to check that the correct warning is displayed.
    """
    mock_input, replacement_params = _mock_args_for_testing_replace_parameter()
    expected_warnings = [
        ('The parameter "deprecated_param_0" will be removed in sometime '
         'release of somelib. Please use the parameter "replacement_param_0" '
         'instead.'
         ),
        ('The parameter "deprecated_param_1" will be removed in sometime '
         'release of somelib. Please use the parameter "replacement_param_1" '
         'instead.'
         ),
        ]
    with warnings.catch_warnings(record=True) as raised_warnings:
        helpers._warn_deprecated_params(
                replacement_params,
                end_version='sometime',
                lib_name='somelib',
                kwargs=mock_input,
                )
    expected_warnings.sort()
    raised_warnings.sort(key=lambda mem: str(mem.message))
    for raised_warning_, expected_warning_ in zip(raised_warnings,
                                                  expected_warnings
                                                  ):
        assert str(raised_warning_.message) == expected_warning_

This was how I solved the task, and the step-by-step thinking that led me to the solution.
I have added the final code and tests as a gist here:
https://gist.github.com/kchawla-pi/40a08a18dc04f39cd338a4cdc15eb6a9

Update: Corrected a few incorrectly named parameters in the code samples and several typos, which I expect will always be ongoing work.
Added the missing _mock_args_for_testing_replace_parameter() in the final tests.