Bruno Oliveira

Posted on Aug 28, 2019 • Edited on Nov 4, 2019

Going Functional 3 - Filter and Reduce

#goingfunctional #haskell #python

Welcome to part 3 of the #goingfunctional series.

Following up where we left off, we will look into two more operations that are both important (in the sense they can be seen as building blocks for writing functional code) and useful (since they translate very well into business requirements) for industry programmers. They are reduce and filter.

Together with map and lambda functions, they provide almost a "framework" to functionalize your codebase, and that's well-worth knowing about, since it makes your codebase more readable, more modern and overall easier to work with.

Filter

Just like the name indicates, filter is a construct which allows one to filter on an iterable based on a specific filter. Like so:

  filter (\x -> x>3) ([1,2,3,4,5])
=> [4,5]

We can see it receives a filter function, and an iterable and returns a new iterable after the application of the filtering function.

In Python (2 or lower only), the same code as above can be written as:

>>> filter(lambda x: x>3, [1,2,3,4,5])
[4, 5]
>>>

In Python 3+, operations such as map, filter, reduce, etc, are defined as returning an iterator (See here to learn about it) which means that to get the same list as a result you need to do:

Python 3.7.4 (v3.7.4:e09359112e, Jul  8 2019, 14:54:52) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> list(filter(lambda x: x>3, [1,2,3,4,5]))
[4, 5]
>>>

By wrapping the resulting iterator in a list constructor call, the iterator will be materialized into a list and we get the same behavior as in Python 2.

The idea of filtering, is then to test each element of the list against a function and keep only the elements that pass the test.

This function is called, in the jargon, a predicate , which is a specialized name for a function which must return a boolean.

This makes sense for our case, since only the elements which pass the predicate test are kept in the list, so the return time of the function being used in the filter construct is required to be a boolean.

This includes all the possible combinations of filters as well as logical operators which result and/or combination end in a boolean: and, or, not, xor, any combination of these operators, as well as more specialized methods, for instance, checking if a property of an object matches a filtering criteria (getCarColor().contains("yellow") ) or for example checking if a list of Objects match a particular instance type. Anything goes, as long as we get a boolean as a result.

This is the power of mathematical abstraction at work.

Reduce

The next concept we'll talk about is reduce. It seems to me that people struggle a lot with this particular concept, for example, a fellow, dev.to member spent an entire day with it, so it's worhtwhile to explore it in depth. Let's start with the definition from the One True Source ( :D ) the HaskellWiki:

In functional programming, fold (or reduce) is a family of higher order functions that process a data structure in some order and build a return value. This is as opposed to the family of unfold functions which take a starting value and apply it to a function to generate a data structure.
Typically, a fold deals with two things: a combining function, and a data structure, typically a list of elements. The fold then proceeds to combine elements of the data structure using the function in some systematic way.

It's called reduce, because the idea is to systematically apply a function starting from a seed value, until the iterable object is exhausted, or, reduced to a single value.

Typical examples include summing all elements in a list, and for example, multiplying all elements in a list.

Summing elements

When accumulating elements with a summing operation, if we want to get the sum of all the elements in an iterable, we need to provide 0 as the identity value to start our reduction.

Identity needs to be equal for every single individual element, and, it's clear that for any integer X, we have that:

0+X = X+0 = X

We say that 0 is the identity value for our accumulator function which is also associative.

We can combine this with the filter operation we just learned about to sum all the even values in a list, which in Python is very neatly:

>>> reduce(lambda x,y: x+y, filter(lambda x: x%2==0,[1,2,3,4,5,6,7,8,9,10]), 0)
30
>>>

Reduce receives a lambda function, a list (which in our example is the result of the filter operation) and a start value (which if omitted will default to the first value of the list). Let's look at multiplication.

Multiplying elements

When accumulating elements with a multiplication operation, if we want to get the sum of all the elements in an iterable, we need to provide 1 as the identity value to start our reduction. More interestingly, let's see what happens when we do not provide a default value when multiplying:

>>> reduce(lambda x,y: x*y, [2,4,6])
48
>>>

Since we didn't provide a default value, unfolding the reduction looks like:

2*(4*(6))

When providing a default value as identity we get the same effect, except that the identity multiplies the first element to "kick off" the reduction:

>>> reduce(lambda x,y: x*y, [2,4,6],2)
96
>>>

2*(2*(4*(6)))

Applications in the real world

These operations are very useful and, like the previous ones we looked at, they compose very well with lots of different business domains, examples include:

Summing all salaries of employees who have been with a company for more than a year
Counting the number of orders which have been dispatched on a certain day
Number of nodes in a graph that hold a certain value
and many more examples...

Conclusions

Hope you liked to read and learn more about reducing and filtering your data and your objects to make your code more readable and maintainable.

Next time we'll look at the implications of reduction operations in more detail, about infinite(!!) data structures, lazy evaluation and how it changes the way you can look at your data.

DEV Community

Going Functional 3 - Filter and Reduce

Top comments (0)

Read next

Building Maintainable Python Applications with Hexagonal Architecture and Domain-Driven Design

Create API’s in 5 minutes with Python

Setting Up and Exploring Django's Admin Panel

Datasets for Computer Vision (3)