Sylvain Kerkour

Posted on • Originally published at kerkour.com

Learning Rust: Combinators

Combinators are a very interesting way to make your code cleaner and more functional. Almost all the definitions you'll find on the internet will make your head explode 🤯 because they raise more questions than they answer.

Thus, here is my empirical definition: Combinators are methods that ease the manipulation of some type T. They favor a functional (method chaining) style of code.

let sum: u64 = vec![1, 2, 3].into_iter().map(|x| x * x).sum();

This section will be pure how-to and real-world patterns about how combinators make your code easier to read or refactor.

This post is an excerpt from my book Black Hat Rust
Get 42% off until Thursday, November 11 with the coupon 1311B892

Iterators

Let's start with iterators because this is certainly the situation where combinators are the most used.

Obtaining an iterator

An Iterator is an object that enables developers to traverse collections.

Iterators can be obtained from most of the collections of the standard library.

First, into_iter which provides an owned iterator: the collection is moved, and you can no longer use the original variable.

ch_03/snippets/combinators/src/main.rs

fn vector() {
    let v = vec![
        1, 2, 3,
    ];

    for x in v.into_iter() {
        println!("{}", x);
    }

    // you can no longer use v
}

Then, iter which provides a borrowed iterator. Here key and value variables are references (&String in this case).

use std::collections::HashMap;

fn hashmap() {
    let mut h = HashMap::new();
    h.insert(String::from("Hello"), String::from("World"));

    for (key, value) in h.iter() {
        println!("{}: {}", key, value);
    }
}

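Since version 1.53 (released on June 17, 2021), arrays implement IntoIterator by value, so they can be iterated directly (for x in a). The borrowed iter used below also works on older versions: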

ch_03/snippets/combinators/src/main.rs

fn array() {
    let a = [
        1, 2, 3,
    ];

    for x in a.iter() {
        println!("{}", x);
    }
}
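To make the difference between owned and borrowed iteration concrete, here is a minimal sketch. The function names (sum_borrowed, sum_owned, sum_array) are hypothetical and only serve the illustration:

```rust
// Borrowed iteration: the caller keeps ownership of the collection.
fn sum_borrowed(v: &[u64]) -> u64 {
    v.iter().sum()
}

// Owned iteration: the vector is moved into the iterator and consumed.
fn sum_owned(v: Vec<u64>) -> u64 {
    v.into_iter().sum()
}

// Arrays can be iterated by value directly (Rust 1.53 and later).
fn sum_array(a: [u64; 3]) -> u64 {
    let mut total = 0;
    for x in a {
        total += x;
    }
    total
}

fn main() {
    let v = vec![1, 2, 3];

    let borrowed = sum_borrowed(&v);
    // v is still usable here because iter only borrowed it
    let owned = sum_owned(v);
    // v has been moved: using it from here on would be a compile error

    assert_eq!(borrowed, owned);
    assert_eq!(sum_array([1, 2, 3]), 6);
}
```

As a rule of thumb, reach for iter when you still need the collection afterwards, and for into_iter when you are done with it.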

Consuming iterators

Iterators are lazy: they won't do anything if they are not consumed.
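Here is a small sketch of what lazy means in practice. It counts how many times the closure given to map runs before and after the iterator is consumed. The helper name count_calls is an assumption made for this example:

```rust
use std::cell::Cell;

// Returns how many times the closure ran (before consuming, after consuming).
fn count_calls() -> (u32, u32) {
    let calls = Cell::new(0);
    let values = vec![1, 2, 3];

    // Building the chain does not run the closure: map is lazy.
    let doubled = values.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let before = calls.get();

    // Consuming the iterator (here with collect) is what drives the closure.
    let _result: Vec<i32> = doubled.collect();
    let after = calls.get();

    (before, after)
}

fn main() {
    // The closure did not run until collect was called.
    assert_eq!(count_calls(), (0, 3));
}
```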

As we have just seen, iterators can be consumed with for x in loops. But this is not where they are the most used. Idiomatic Rust favors functional programming, which is a better fit for its ownership model.

for_each is the functional equivalent of for .. in .. loops:

ch_03/snippets/combinators/src/main.rs

fn for_each() {
    let v = vec!["Hello", "World", "!"].into_iter();

    v.for_each(|word| {
        println!("{}", word);
    });
}

collect can be used to transform an iterator into a collection:

ch_03/snippets/combinators/src/main.rs

fn collect() {
    let x = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10].into_iter();

    let _: Vec<u64> = x.collect();
}

Similarly, you can obtain a HashMap (or a BTreeMap, or another collection, see https://doc.rust-lang.org/std/iter/trait.FromIterator.html#implementors) using from_iter:

ch_03/snippets/combinators/src/main.rs

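use std::collections::HashMap;
// Before the 2021 edition, FromIterator is not in the prelude:
// you also need `use std::iter::FromIterator;`

fn from_iter() {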
    let x = vec![(1, 2), (3, 4), (5, 6)].into_iter();

    let _: HashMap<u64, u64> = HashMap::from_iter(x);
}

reduce accumulates over an iterator by applying a closure:

ch_03/snippets/combinators/src/main.rs

fn reduce() {
    let values = vec![1, 2, 3, 4, 5].into_iter();

    let _sum = values.reduce(|acc, x| acc + x);
}

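Here _sum = Some(1 + 2 + 3 + 4 + 5) = Some(15). reduce returns an Option because the iterator may be empty, in which case the result is None.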
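Because reduce uses the first item as the initial accumulator, it has nothing to return for an empty iterator. A minimal sketch of both cases, with a hypothetical sum_with_reduce helper:

```rust
// Sums the items, returning None when there is nothing to sum.
fn sum_with_reduce(values: Vec<u64>) -> Option<u64> {
    values.into_iter().reduce(|acc, x| acc + x)
}

fn main() {
    assert_eq!(sum_with_reduce(vec![1, 2, 3, 4, 5]), Some(15));

    // An empty iterator has no first item to start from.
    assert_eq!(sum_with_reduce(vec![]), None);

    // unwrap_or supplies a value for the empty case.
    assert_eq!(sum_with_reduce(vec![]).unwrap_or(0), 0);
}
```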

fold is like reduce but takes an initial value and can return an accumulator of a different type than the items of the iterator:

ch_03/snippets/combinators/src/main.rs

fn fold() {
    let values = vec!["Hello", "World", "!"].into_iter();

    let _sentence = values.fold(String::new(), |acc, x| acc + x);
}

Here _sentence is a String, while the items of the iterator are of type &str.
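As another sketch of the same idea, the accumulator can be any type you like. Below, the items are &str while the accumulators are a usize and a String. Both function names are made up for the example:

```rust
// Total length in bytes of all the words: items are &str, the accumulator is a usize.
fn total_len(words: &[&str]) -> usize {
    words.iter().fold(0, |acc, word| acc + word.len())
}

// Joins the words with spaces: items are &str, the accumulator is a String.
fn join_words(words: &[&str]) -> String {
    words.iter().fold(String::new(), |mut acc, word| {
        if !acc.is_empty() {
            acc.push(' ');
        }
        acc.push_str(word);
        acc
    })
}

fn main() {
    let words = ["Hello", "World", "!"];

    assert_eq!(total_len(&words), 11);
    assert_eq!(join_words(&words), "Hello World !");
}
```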

Combinators

First, one of the most famous, and available in almost all languages: filter:

ch_03/snippets/combinators/src/main.rs

fn filter() {
    let v = vec![-1, 2, -3, 4, 5].into_iter();

    let _positive_numbers: Vec<i32> = v.filter(|x: &i32| x.is_positive()).collect();
}

inspect can be used to... inspect the values flowing through an iterator:

ch_03/snippets/combinators/src/main.rs

fn inspect() {
    let v = vec![-1, 2, -3, 4, 5].into_iter();

    let _positive_numbers: Vec<i32> = v
        .inspect(|x| println!("Before filter: {}", x))
        .filter(|x: &i32| x.is_positive())
        .inspect(|x| println!("After filter: {}", x))
        .collect();
}

map is used to convert the items of an iterator from one type to another:

ch_03/snippets/combinators/src/main.rs

fn map() {
    let v = vec!["Hello", "World", "!"].into_iter();

    let w: Vec<String> = v.map(String::from).collect();
}

Here, the items are converted from &str to String.

filter_map is kind of like chaining map and filter. It has the advantage of dealing with Option instead of bool:

ch_03/snippets/combinators/src/main.rs

fn filter_map() {
    let v = vec!["Hello", "World", "!"].into_iter();

    let w: Vec<String> = v
        .filter_map(|x| {
            if x.len() > 2 {
                Some(String::from(x))
            } else {
                None
            }
        })
        .collect();

    assert_eq!(w, vec!["Hello".to_string(), "World".to_string()]);
}
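To see why filter_map is "kind of like chaining map and filter", here is a sketch that writes the same logic both ways and checks that the results match. The two function names are hypothetical:

```rust
// With filter_map: decide and convert in a single closure.
fn long_words_filter_map(words: &[&str]) -> Vec<String> {
    words
        .iter()
        .filter_map(|w| {
            if w.len() > 2 {
                Some(String::from(*w))
            } else {
                None
            }
        })
        .collect()
}

// With filter then map: one closure to decide, one to convert.
fn long_words_filter_then_map(words: &[&str]) -> Vec<String> {
    words
        .iter()
        .filter(|w| w.len() > 2)
        .map(|w| String::from(*w))
        .collect()
}

fn main() {
    let words = ["Hello", "World", "!"];

    let a = long_words_filter_map(&words);
    let b = long_words_filter_then_map(&words);

    assert_eq!(a, b);
    assert_eq!(a, vec!["Hello".to_string(), "World".to_string()]);
}
```

filter_map becomes really convenient when the conversion itself returns an Option (or a Result turned into an Option with ok), because there is then no separate condition to write.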

chain concatenates two iterators, yielding all the items of the first one, then all the items of the second one:

ch_03/snippets/combinators/src/main.rs

fn chain() {
    let x = vec![1, 2, 3, 4, 5].into_iter();
    let y = vec![6, 7, 8, 9, 10].into_iter();

    let z: Vec<u64> = x.chain(y).collect();
    assert_eq!(z.len(), 10);
}

flatten can be used to flatten collections of collections:

ch_03/snippets/combinators/src/main.rs

fn flatten() {
    let x = vec![vec![1, 2, 3, 4, 5], vec![6, 7, 8, 9, 10]].into_iter();

    let z: Vec<u64> = x.flatten().collect();
    assert_eq!(z.len(), 10);
}

Now z = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
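Both chain and flatten preserve the order of the items, which the length checks above don't show. A short sketch with hypothetical function names that asserts on the actual content:

```rust
// chain: all the items of x, then all the items of y.
fn chained() -> Vec<u64> {
    let x = vec![1, 2, 3];
    let y = vec![4, 5, 6];

    x.into_iter().chain(y).collect()
}

// flatten: the inner collections are emptied one after the other.
// Empty inner collections simply contribute nothing.
fn flattened() -> Vec<u64> {
    let nested = vec![vec![1, 2], vec![], vec![3, 4, 5, 6]];

    nested.into_iter().flatten().collect()
}

fn main() {
    assert_eq!(chained(), vec![1, 2, 3, 4, 5, 6]);
    assert_eq!(flattened(), vec![1, 2, 3, 4, 5, 6]);
}
```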

Composing combinators

This is where combinators shine: they make your code more elegant and (most of the time) easier to read because they are closer to how humans think than to how computers work.

ch_03/snippets/combinators/src/main.rs

#[test]
fn combinators() {
    let a = vec![
        "1",
        "2",
        "-1",
        "4",
        "-4",
        "100",
        "invalid",
        "Not a number",
        "",
    ];

    let _only_positive_numbers: Vec<i64> = a
        .into_iter()
        .filter_map(|x| x.parse::<i64>().ok())
        .filter(|x| x > &0)
        .collect();
}

For example, the code snippet above replaces a big loop with complex logic. Instead, in a few lines, we do the following:

  • Try to parse a collection of strings into numbers
  • filter out invalid results
  • filter out numbers that are not strictly positive (less than or equal to 0)
  • collect everything in a new vector

It has the advantage of working with immutable data and thus reduces the probability of bugs.
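For comparison, here is a sketch of the same logic written once with combinators and once as a loop with a mutable vector. The function names are assumptions for the example, and both versions are checked against each other:

```rust
// Combinator version: no mutable state, each step reads as a sentence.
fn positive_numbers_combinators(input: &[&str]) -> Vec<i64> {
    input
        .iter()
        .filter_map(|x| x.parse::<i64>().ok())
        .filter(|x| *x > 0)
        .collect()
}

// Loop version: same result, but with a mutable vector and nested conditions.
fn positive_numbers_loop(input: &[&str]) -> Vec<i64> {
    let mut result = Vec::new();

    for x in input {
        if let Ok(n) = x.parse::<i64>() {
            if n > 0 {
                result.push(n);
            }
        }
    }

    result
}

fn main() {
    let a = ["1", "2", "-1", "4", "-4", "100", "invalid", "Not a number", ""];

    assert_eq!(positive_numbers_combinators(&a), vec![1, 2, 4, 100]);
    assert_eq!(positive_numbers_combinators(&a), positive_numbers_loop(&a));
}
```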


Option

Use a default value: unwrap_or

fn option_unwrap_or() {
    let _port = std::env::var("PORT").ok().unwrap_or(String::from("8080"));
}

Use a default Option value: or

// config.port is an Option<String>
let _port = config.port.or(std::env::var("PORT").ok());
// _port is an Option<String>

Call a function if Option is Some: and_then

// and_then passes the value contained in the Option to the function
fn port_to_address(port: String) -> Option<String> {
    // ...
}

let _address = std::env::var("PORT").ok().and_then(port_to_address);

Call a function if Option is None: or_else

fn get_default_port() -> Option<String> {
    // ...
}

let _port = std::env::var("PORT").ok().or_else(get_default_port);
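These combinators are most useful when chained. The sketch below is only an illustration: the Config struct, the bodies of port_to_address and get_default_port, and the resolve_address function are all assumptions, since the originals are elided. The environment value is passed as a parameter to keep the function easy to test:

```rust
// Hypothetical configuration type, used only for this illustration.
struct Config {
    port: Option<String>,
}

// Hypothetical body: builds an address, rejecting ports that are not valid numbers.
fn port_to_address(port: String) -> Option<String> {
    port.parse::<u16>()
        .ok()
        .map(|p| format!("127.0.0.1:{}", p))
}

// Hypothetical body: a fixed default.
fn get_default_port() -> Option<String> {
    Some(String::from("8080"))
}

fn resolve_address(config: Config, env_port: Option<String>) -> Option<String> {
    config
        .port
        .or(env_port) // otherwise, the value from the environment
        .or_else(get_default_port) // otherwise, the default (only computed if needed)
        .and_then(port_to_address) // convert only if we have a port
}

fn main() {
    let config = Config { port: None };
    let address = resolve_address(config, std::env::var("PORT").ok());

    println!("{:?}", address);
}
```

Note that or evaluates its argument eagerly while or_else only calls its function when the Option is None, which matters when computing the fallback is expensive.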

And the last two extremely useful functions for the Option type:
is_some and is_none

is_some returns true if an Option is Some (contains a value):

let a: Option<u32> = Some(1);

if a.is_some() {
    println!("will be printed");
}

let b: Option<u32> = None;

if b.is_some() {
    println!("will NOT be printed");
}

is_none returns true if an Option is None (does not contain a value):

let a: Option<u32> = Some(1);

if a.is_none() {
    println!("will NOT be printed");
}


let b: Option<u32> = None;

if b.is_none() {
    println!("will be printed");
}

You can find the other (and in my experience, less commonly used) combinators for the Option type online: https://doc.rust-lang.org/std/option/enum.Option.html.

Result

Convert a Result to an Option with ok:

ch_03/snippets/combinators/src/main.rs

fn result_ok() {
    let _port: Option<String> = std::env::var("PORT").ok();
}

Use a default Result if Result is Err with or:

ch_03/snippets/combinators/src/main.rs

fn result_or() {
    let _port: Result<String, std::env::VarError> =
        std::env::var("PORT").or(Ok(String::from("8080")));
}

map_err converts a Result<T, E> to a Result<T, F> by calling a function:

fn convert_error(err: ErrorType1) -> ErrorType2 {
    // ...
}


let _port: Result<String, ErrorType2> = std::env::var("PORT").map_err(convert_error);

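Call a function if Result is Ok: and_then.

// and_then passes the Ok value to the function,
// which must return a Result with the same error type
fn port_to_address(port: String) -> Result<String, std::env::VarError> {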
    // ...
}

let _address = std::env::var("PORT").and_then(port_to_address);

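Call a function if Result is Ok, or use a default value otherwise: map_or

// the default value and the closure must return the same type,
// here Result<u16, std::num::ParseIntError>
let http_port = std::env::var("PORT")
    .map_or(Ok(8080), |env_val| env_val.parse::<u16>())?;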

Chain a function if Result is Ok: map

let master_key = std::env::var("MASTER_KEY")
    .map_err(|_| env_not_found("MASTER_KEY"))
    .map(base64::decode)??;
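The Result snippets above are fragments, so here is a self-contained sketch combining map_or, map_err and and_then. The ConfigError type and the two function names are hypothetical. The functions take the Result returned by std::env::var as a parameter so that they don't depend on the real environment:

```rust
use std::env::VarError;
use std::num::ParseIntError;

// Hypothetical error type, used only for this illustration.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing,
    Invalid(String),
}

// map_or: use 8080 if the variable is not set, otherwise try to parse it.
fn port_or_default(var: Result<String, VarError>) -> Result<u16, ParseIntError> {
    var.map_or(Ok(8080), |value| value.parse::<u16>())
}

// map_err + and_then: convert the error type, then continue only if Ok.
fn required_port(var: Result<String, VarError>) -> Result<u16, ConfigError> {
    var.map_err(|_| ConfigError::Missing)
        .and_then(|value| value.parse::<u16>().map_err(|_| ConfigError::Invalid(value)))
}

fn main() {
    println!("{:?}", port_or_default(std::env::var("PORT")));
    println!("{:?}", required_port(std::env::var("PORT")));
}
```

and_then requires the function to return a Result with the same error type, which is why map_err comes first in required_port.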

And the last two extremely useful functions for the Result type:
is_ok and is_err

is_ok returns true if a Result is Ok:

if std::env::var("DOES_EXIST").is_ok() {
    println!("will be printed");
}

if std::env::var("DOES_NOT_EXIST").is_ok() {
    println!("will NOT be printed");
}

is_err returns true if a Result is Err:

if std::env::var("DOES_NOT_EXIST").is_err() {
    println!("will be printed");
}

if std::env::var("DOES_EXIST").is_err() {
    println!("will NOT be printed");
}

You can find the other (and in my experience, less commonly used) combinators for the Result type online: https://doc.rust-lang.org/std/result/enum.Result.html.

When to use .unwrap() and .expect()

unwrap and expect can be used on both Option and Result. They panic when called on None or Err and thus have the potential to crash your program, so use them sparingly.

I see 2 situations where it's legitimate to use them:

  • When exploring or writing quick script-like programs, where you don't want to bother with handling all the edge cases.
  • When you are sure they will never panic. In that case, they should be accompanied by a comment explaining why it's safe to use them and why they won't crash the program.
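As an illustration of the second situation, here is a minimal sketch: expect on a hard-coded value, with the comment that justifies it, next to a function that returns the Result for input you don't control. Both function names are made up for the example:

```rust
use std::net::{AddrParseError, IpAddr};

fn default_bind_address() -> IpAddr {
    // Safe: the literal below is a valid IPv4 address, so parsing cannot fail.
    "127.0.0.1"
        .parse()
        .expect("hard-coded IP address is valid")
}

// For input you do not control, return the Result and let the caller decide.
fn parse_bind_address(input: &str) -> Result<IpAddr, AddrParseError> {
    input.parse()
}

fn main() {
    println!("{}", default_bind_address());

    assert!(parse_bind_address("::1").is_ok());
    assert!(parse_bind_address("not an ip").is_err());
}
```

Preferring expect over unwrap has the small benefit that the message shows up in the panic output if your assumption ever turns out to be wrong.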
