Rust filter-map feature and Functional Programming

Another post discussing Rust today. Two of the more convenient features within Rust are the Option and Result enum types. Writing code that aggressively uses these in place of “bad thing happened!” sentinel values unlocks a lot of cool features in Rust. I’d like to take a moment to discuss the Result.ok() method and the Iterator.filter_map() method.

Result and Option

Both of these types wrap a value that can report either a “success”, carrying a result, or a “failure”, carrying (in the case of Result) an explanation of the failure, if desired. They provide more elegant alternatives to the following types of programming patterns:

Golang:

result, err := MyFunction(arg1, arg2)
if err == nil {
  // Do stuff on success
} else {
  // Handle failure
}

C/C++ and similar languages:

result_code = some_function(&input_data, &result_struct);
if (result_code == ERROR_SUCCESS) {
  // Do stuff on success
} else {
  // Handle failure
}
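
For comparison, here is a minimal sketch of how the same success/failure split might look in Rust. The function half_of_even and its error message are hypothetical, made up purely for illustration:

```rust
// Hypothetical helper: halves an even number, or explains why it could not.
fn half_of_even(n: u32) -> Result<u32, String> {
    if n % 2 == 0 {
        Ok(n / 2)
    } else {
        Err(format!("{} is not even", n))
    }
}

fn main() {
    // The value and the failure reason travel in one return value,
    // and match makes us deal with both cases.
    match half_of_even(10) {
        Ok(half) => println!("Success: {}", half),
        Err(reason) => println!("Failure: {}", reason),
    }
}
```

The main difference from the Go and C versions is that the result cannot be read without first going through the Ok/Err split.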

Result.ok(): Convert a Result Into an Option

An extremely useful feature I’ve found has been using the ok() method on a Result to effectively convert it into an Option. This throws away the failure context, and is useful in situations where you don’t really care why something failed, but you just want to ignore failures.
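
As a small sketch of that conversion on its own, the snippet below uses std::str::from_utf8(), which returns a Result. The helper name text_or_none is hypothetical and exists only for this illustration:

```rust
// Hypothetical helper: decode bytes as UTF-8 text, discarding the
// reason for any failure by converting the Result into an Option.
fn text_or_none(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // Valid UTF-8: Ok("hello") becomes Some("hello")
    println!("{:?}", text_or_none(b"hello"));
    // Invalid UTF-8: the Err(..) and its details become None
    println!("{:?}", text_or_none(&[0xff, 0xfe]));
}
```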

A great example of this would be using the str.parse() method to read a string of text from an input, in order to perform some sort of operation on it. For example, the following loop would read a number in from a user, square the input number, and print the result back to the user:

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines() {
        if let Ok(data) = line {
            match data.parse::<u32>() {
                Ok(val) => {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
                Err(_) => {
                    println!("Did not provide a number")
                }
            }
        } else {
            // Break the loop on a read error (EOF just ends the iterator)
            break;
        }
        println!("Enter a number, and I'll square it for you:");
    }
}

You may note that for both the line variable and the match on parse(), I have chosen not to care about the specific error, and instead just note that something unexpected happened and take the appropriate action. Rust offers a more expressive way of saying this than pattern-matching on Ok in an if let, or writing Err(_) to declare that I don’t want to use the error data: the ok() method. The above could be rewritten as:

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines() {
        if let Some(data) = line.ok() {
            match data.parse::<u32>().ok() {
                Some(val) => {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
                None => {
                    println!("Did not provide a number")
                }
            }
        } else {
            // Break the loop on a read error (EOF just ends the iterator)
            break;
        }
        println!("Enter a number, and I'll square it for you:");
    }
}

Going further, you may also opt not to report the failure of .parse() to the user, and instead just prompt them again for a number, as the failure to parse might be obvious. In this case, you’d replace the match with an if let similar to the one used on line.ok(). This makes the code more concise, at the expense of a visible indicator for the user:

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines() {
        if let Some(data) = line.ok() {
            if let Some(val) = data.parse::<u32>().ok() {
                let square = val * val;
                println!("{} squared is {}", val, square);
            }
        } else {
            // Break the loop on a read error (EOF just ends the iterator)
            break;
        }
        println!("Enter a number, and I'll square it for you:");
    }
}

Ultimately, what you have to work with here is an input iterator (stdin().lines()), a terminating condition (when stdin().lines() reports an Err or the input runs out), and a down-select filter (the parse::<u32>() call on the string value).

Rust Iterator Filtering

This is a prime candidate for the filter() method of the Iterator trait. The above code could be rewritten as follows:

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines()
        .map(|l| l.ok())
        .filter(|data| data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok()))
    {
        if let Some(data) = line {
            if let Some(val) = data.parse::<u32>().ok() {
                let square = val * val;
                println!("{} squared is {}", val, square);
            }
        }
        println!("Enter a number, and I'll square it for you:");
    }
}

This version prompts you to give a number, and will just ignore every line you enter until you give it what it is asking for. Similar to the prior examples, it will still break out of the loop if you CTRL-D to send it an EOF. However, two challenges are presented with the above version:

  1. This didn’t end up eliminating much code
  2. You end up calling the str.parse() method twice

The really long filter() call is deconstructed and explained below:

.filter(|data|                         // Take each Option input and map it to variable "data"
    data.as_ref()                      // Operate on data as a reference, so we don't consume it
    .map_or(false,                     // If data == None, return false (fail) to filter()
        |d|                            // Otherwise, data is Some(d), so map the inner value to d
           d.parse::<u32>()            // and then try to parse d
            .is_ok()))                 // If it is an Ok() Result, return true, otherwise false
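
To check that the predicate behaves as described, it can be pulled out into a standalone function. This is only a sketch, and the name is_number_line is hypothetical:

```rust
// Hypothetical name for the predicate passed to filter() above.
// filter() hands the closure a reference to each item, hence &Option<String>.
fn is_number_line(data: &Option<String>) -> bool {
    data.as_ref()                                      // Option<&String>, so nothing is consumed
        .map_or(false, |d| d.parse::<u32>().is_ok())   // None => false, Some(d) => did d parse?
}

fn main() {
    println!("{}", is_number_line(&Some(String::from("12"))));
    println!("{}", is_number_line(&Some(String::from("abc"))));
    println!("{}", is_number_line(&None));
}
```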

For Loop as a Map

What is ultimately occurring here is that the code within the block is effectively a map() function, or, where we don’t care about carrying the results forward, a for_each().

In fact, the above code could be written in a more functional style, as below. Note that it is necessary to call .collect() on the result of .map() in order to force the iteration to execute, because iterator adapters such as map() are lazy and do nothing until they are consumed.

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    stdin().lines()
        .map(|l| l.ok())
        .filter(|data| data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok()))
        .map(|line| {
            if let Some(data) = line {
                if let Some(val) = data.parse::<u32>().ok() {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
            }
            println!("Enter a number, and I'll square it for you:");
        }).collect::<()>();
}
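
The reason the .collect() matters can be seen in a small sketch that counts how often the map() closure runs. The helper run_map and its consume flag are hypothetical, used only to compare the two cases:

```rust
// Hypothetical helper: reports how many times the map() closure ran.
fn run_map(consume: bool) -> u32 {
    let items = [1, 2, 3];
    let mut calls = 0;
    {
        // Building the iterator does not run the closure yet.
        let iter = items.iter().map(|_| calls += 1);
        if consume {
            // Same effect as .collect::<()>(): drives the iterator to the end.
            let _: () = iter.collect();
        }
    }
    calls
}

fn main() {
    println!("without collect: {} calls", run_map(false));
    println!("with collect: {} calls", run_map(true));
}
```

Without the consuming call, the closure never runs at all, which is why a bare .map() with side effects appears to do nothing.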

In cases like ours, Rust offers the .for_each() operation as a substitute for the .map()+.collect::<()>() pair, in order to provide a cleaner representation of the operation being performed:

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    stdin().lines()
        .map(|l| l.ok())
        .filter(|data| data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok()))
        .for_each(|line| {
            if let Some(data) = line {
                if let Some(val) = data.parse::<u32>().ok() {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
            }
            println!("Enter a number, and I'll square it for you:");
        });
}

The filter_map Shortcut

Similar to how Rust offers .for_each() as a special-case shorthand for .map()+.collect::<()>(), it also offers a shorthand named filter_map() for cases where we’re cascading filter() and map() operations. In our case, it can be used to collapse the above .filter(), .for_each(), and both of the if let statements, because .filter_map() filters and transforms in a single streaming pass. Rather than passing an input value through verbatim when a test function yields true, it expects a function that returns an Option. It drops every input that evaluated to None and unwraps each Some(_), leaving just the inner values, which can then be collected or processed further.

Using this feature, the above code can be further reduced to the following:

use std::io::stdin;

fn main() {
    println!("Enter a number, and I'll square it for you:");
    stdin().lines()
        .map(|l| l.ok())
        .filter_map(|data| data
            .and_then(|d| d.parse::<u32>().ok())
            .and_then(|val| {
                let square = val * val;
                println!("{} squared is {}", val, square);
                println!("Enter a number, and I'll square it for you:");
                Some(())
            })).collect::<()>();
}
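
To see filter_map() in isolation, here is a small sketch that parses a slice of strings both ways. The function names and the sample input are hypothetical. Note how the filter()+map() version has to call parse() twice, which is exactly the duplication filter_map() removes:

```rust
// Two-step version: test with filter(), then parse again in map().
fn parse_with_filter_then_map(lines: &[&str]) -> Vec<u32> {
    lines.iter()
        .filter(|l| l.parse::<u32>().is_ok())
        .map(|l| l.parse::<u32>().unwrap()) // safe: filter() already checked this
        .collect()
}

// One-step version: parse once; None is dropped, Some(v) yields v.
fn parse_with_filter_map(lines: &[&str]) -> Vec<u32> {
    lines.iter()
        .filter_map(|l| l.parse::<u32>().ok())
        .collect()
}

fn main() {
    let lines = ["44", "asdf", "12"];
    println!("{:?}", parse_with_filter_then_map(&lines));
    println!("{:?}", parse_with_filter_map(&lines));
}
```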

Making It Into a Streaming Program (for CLI pipe usage)

Often, when writing programs like these, the common use-case might not be for user-interaction, but rather for reading streaming data from a buffered source, such as a pipe or a file. If we just want to take a bunch of lines from the input, ignore anything that’s not a number, and then return the list of squares, in order, back to the user, we can condense the above code into the following:

use std::io::stdin;

fn main() {
    stdin().lines()                                 // Iterate over lines of input text
        .map(|l| l.ok())                            // Convert the Result into Option
        .filter_map(|data| data                     // For each Option of input, bind it to "data"
            .and_then(|d| d.parse::<u32>().ok()))   // Try parsing as u32, and convert this Result to Option
        .for_each(|val| println!("{}", val*val));   // For every u32 coming from this sequence, square it
                                                    //  and print the square to stdout
}

The above, then, can be given the following input data via stdin:

sdfasdfas
asdf
adsfdfsa
44
sadfasdfas
12
dsafasdfsadfsqae
96
sdfsadf
asdfadsf
55

This can be accomplished, in the crate’s folder, by doing:

cargo run < testdata.txt

And the following result should be displayed, corresponding to the squares for 44, 12, 96, and 55:

1936
144
9216
3025

Taking this another step further, you’re likely to want to collect all the significant items out of the input data, and store the list of computed results in a Vec, for use later on. Note that, in this example, some additional whitespace has been inserted in order to make room for more code documentation.

use std::io::stdin;

fn main() {
    let squares: Vec<u32> = stdin().lines()      // Iterate over lines of input text
        .map(|l| l.ok())                         // Convert the Result into Option
        .filter_map(|data| data                  // For each Option of input, bind it to "data"
            .and_then(|d| d.parse::<u32>().ok()) // Try parsing as u32, yielding a Result, convert to Option
            .and_then(|v| Some(v*v)))            // If it's a Some(_), put the square in a new Some(_)
        .collect();                              // collect() all the squares into a Vec

    // .... do something with squares ....

}
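
One caveat with v*v is that it overflows a u32 for inputs above 65535, which panics in a debug build. Since everything here is already an Option chain, one possible fix is to let checked_mul() turn an overflow into a None, so that it gets filtered out like any other bad line. The sketch below is a variation, not the code above: the helper collect_squares is hypothetical, and it reads from a &str instead of stdin so that it can be exercised without a pipe:

```rust
// Hypothetical helper: squares every line that parses as a u32,
// skipping non-numbers and any square that would not fit in a u32.
fn collect_squares(input: &str) -> Vec<u32> {
    input.lines()
        .filter_map(|line| line
            .parse::<u32>().ok()              // Result -> Option, None for non-numbers
            .and_then(|v| v.checked_mul(v)))  // None if the square overflows u32
        .collect()
}

fn main() {
    let sample = "sdfasdfas\n44\nasdf\n12\n96\n70000\n55";
    // 70000 squared does not fit in a u32, so it is dropped
    // prints [1936, 144, 9216, 3025]
    println!("{:?}", collect_squares(sample));
}
```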