Rust filter-map feature and Functional Programming
Another post discussing Rust today. Two of the more convenient features within Rust are the Option and Result enum types. Writing code that aggressively uses these in place of “bad thing happened!” sentinel values unlocks a lot of cool features in Rust. I’d like to take a moment to discuss the Result.ok() method, and also the Iterator.filter_map() method.
Result and Option
Both of these types encapsulate a datatype that can report a “success”, with results, or “failure”, with (in the case of Result) a failure explanation, if desired. Both of these help provide more elegant alternatives to the following types of programming patterns:
Golang:
result, err := MyFunction(arg1, arg2)
if err == nil {
    // Do stuff on success
} else {
    // Handle failure
}
C/C++ and similar languages:
result_code = some_function(&input_data, &result_struct);
if (result_code == ERROR_SUCCESS) {
    // Do stuff on success
} else {
    // Handle failure
}
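For comparison, here is a minimal sketch of how the same success/failure branching tends to look with Result in Rust. The function parse_port and its error message are hypothetical, invented purely for this illustration:

```rust
// Hypothetical helper for illustration: parse a TCP port number from text.
fn parse_port(text: &str) -> Result<u16, String> {
    text.trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {:?}: {}", text, e))
}

fn main() {
    // Success and failure are two variants of a single value, so the
    // payload is only reachable after choosing which variant we hold.
    match parse_port("8080") {
        Ok(port) => println!("Listening on port {}", port),
        Err(why) => println!("Failure: {}", why),
    }
}
```

Unlike the patterns above, there is no separate status value to compare against, so the success and failure branches cannot be swapped by accident.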
Result.ok(): Convert a Result Into an Option
An extremely useful feature I’ve found has been using the ok() method on a Result to effectively convert it into an Option. This throws away the failure context, and is useful in situations where you don’t really care why something failed, but you just want to ignore failures.
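As a small sketch of that conversion in isolation (the helper name to_number is hypothetical, chosen just for this example):

```rust
// Hypothetical helper: parse text as a u32, discarding the reason for any failure.
fn to_number(text: &str) -> Option<u32> {
    // parse() yields Result<u32, ParseIntError>; ok() maps
    // Ok(v) to Some(v) and any Err(_) to None.
    text.parse::<u32>().ok()
}

fn main() {
    println!("{:?}", to_number("42"));        // prints Some(42)
    println!("{:?}", to_number("forty-two")); // prints None
}
```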
A great example of this would be using the str.parse() method to read a string of text from an input, in order to perform some sort of operation on it. For example, the following loop would read a number in from a user, square the input number, and print the result back to the user:
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines() {
        if let Ok(data) = line {
            match data.parse::<u32>() {
                Ok(val) => {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
                Err(_) => {
                    println!("Did not provide a number")
                }
            }
        } else {
            // Break the loop on failure (like EOF)
            break;
        }
        println!("Enter a number, and I'll square it for you:");
    }
}
You may note that for both the line variable and the match on the parse result, I have chosen not to care about the specific context of the error, and instead just report that something unexpected happened and take the appropriate action. Rust offers an expressive way of handling this situation, without the omissions I performed with my if let or the Err(_) pattern declaring that I don’t want to use the error data: the ok() method. The above could be rewritten as:
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines() {
        if let Some(data) = line.ok() {
            match data.parse::<u32>().ok() {
                Some(val) => {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
                None => {
                    println!("Did not provide a number")
                }
            }
        } else {
            // Break the loop on failure (like EOF)
            break;
        }
        println!("Enter a number, and I'll square it for you:");
    }
}
Going further, you may also opt to not report the failure of .parse() to the user, and instead just prompt them again for a number, as the failure to parse might be obvious. In this case, you’d replace the match with an if let similar to the one on line.ok(). This makes the code more concise, at the expense of a visible failure indicator for the user:
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines() {
        if let Some(data) = line.ok() {
            if let Some(val) = data.parse::<u32>().ok() {
                let square = val * val;
                println!("{} squared is {}", val, square);
            }
        } else {
            // Break the loop on failure (like EOF)
            break;
        }
        println!("Enter a number, and I'll square it for you:");
    }
}
Ultimately, what you have to work with here is an input iterator (stdin().lines()), a terminating condition (when stdin().lines() reports an Err), and a down-select filter (the parse::<u32>() call on the string value).
Rust Iterator Filtering
This is a prime candidate for the filter() method of the Iterator trait. The above code could be rewritten as follows:
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    for line in stdin().lines()
        .map(|l| l.ok())
        .filter(|data| data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok()))
    {
        if let Some(data) = line {
            if let Some(val) = data.parse::<u32>().ok() {
                let square = val * val;
                println!("{} squared is {}", val, square);
            }
        }
        println!("Enter a number, and I'll square it for you:");
    }
}
This version prompts you to give a number, and will just ignore every line you enter until you give it what it is asking for. Similar to the prior examples, it will still break out of the loop if you CTRL-D to send it an EOF. However, two challenges are presented with the above version:
- This didn’t end up eliminating much code
- You end up calling the str.parse() method twice
The really long filter() call is deconstructed and explained below:
.filter(|data|               // Take each Option input and map it to variable "data"
    data.as_ref()            // Operate on data as a reference, so we don't consume it
        .map_or(false,       // If data == None, return false (fail) to filter()
            |d|              // Otherwise, data is Some(d), so map the inner value to d
            d.parse::<u32>() // and then try to parse d
                .is_ok()))   // If it is an Ok() Result, return true, otherwise false
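To see the predicate behave on its own, here is a small sketch that pulls the closure out into a named function. The name is_number is hypothetical, and the inputs are made up for the example:

```rust
// Hypothetical name for the closure that was passed to filter() above.
// filter() hands its predicate a reference to each item, hence &Option<String>.
fn is_number(data: &Option<String>) -> bool {
    data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok())
}

fn main() {
    let inputs = vec![
        Some(String::from("44")),   // parses as u32, so it is kept
        Some(String::from("asdf")), // fails to parse, so it is dropped
        None,                       // a failed read, also dropped
    ];
    for data in &inputs {
        println!("{:?} -> {}", data, is_number(data));
    }
}
```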
For Loop as a Map
What is ultimately occurring here is that the code within the block is effectively a map() function, or, where we don’t care about carrying the results forward, a for_each(). In fact, the above code could be written in a more functional style as below. Note that it is necessary to perform the .collect() operation on the .map() value, in order to force the iteration to execute, because Rust iterators are lazy.
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    stdin().lines()
        .map(|l| l.ok())
        .filter(|data| data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok()))
        .map(|line| {
            if let Some(data) = line {
                if let Some(val) = data.parse::<u32>().ok() {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
            }
            println!("Enter a number, and I'll square it for you:");
        }).collect::<()>();
}
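The need to force the iteration can be observed in isolation. The following is only a sketch: the function count_calls and its call counter are hypothetical, introduced to make the laziness visible:

```rust
use std::cell::Cell;

// Hypothetical helper: returns how many times the map() closure had run
// before and after the iterator was consumed.
fn count_calls() -> (u32, u32) {
    let calls = Cell::new(0u32);
    let values = [1u32, 2, 3];

    // Building the adapter runs nothing yet: iterators are lazy.
    let pending = values.iter().map(|v| {
        calls.set(calls.get() + 1);
        v * v
    });
    let before = calls.get();

    // Consuming the adapter drives the closure once per element.
    let _squares: Vec<u32> = pending.collect();
    let after = calls.get();

    (before, after)
}

fn main() {
    println!("{:?}", count_calls()); // prints (0, 3)
}
```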
In cases like ours, Rust offers the .for_each() operation as a substitute for the .map() + .collect::<()>() combination, in order to provide a cleaner representation of the operation being performed:
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    stdin().lines()
        .map(|l| l.ok())
        .filter(|data| data.as_ref().map_or(false, |d| d.parse::<u32>().is_ok()))
        .for_each(|line| {
            if let Some(data) = line {
                if let Some(val) = data.parse::<u32>().ok() {
                    let square = val * val;
                    println!("{} squared is {}", val, square);
                }
            }
            println!("Enter a number, and I'll square it for you:");
        });
}
The filter_map Shortcut
Similar to how Rust offers .for_each() as a shorthand for the special-case .map() + .collect::<()>() combination, it also offers a shorthand named filter_map() for cases where we’re cascading filter() and map() operations. In our case, it can be used to collapse the above .filter, .for_each, and also both of the if let statements, as .filter_map() operates as a streaming translator. Rather than passing along the verbatim input values when a corresponding test function yields true, it expects a function that yields an Option as a return value, and filters out any data that evaluated to None, leaving just the contents of the Some(_) values, which can then be collected or processed further.
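A side-by-side sketch on a fixed list of strings may make the difference concrete. Both function names here are hypothetical, and the input is invented for the example:

```rust
// Hypothetical helper: cascaded filter() and map(), parsing each string twice.
fn with_filter_then_map(words: &[&str]) -> Vec<u32> {
    words.iter()
        .filter(|w| w.parse::<u32>().is_ok()) // parse once to test...
        .map(|w| w.parse::<u32>().unwrap())   // ...and again to convert
        .collect()
}

// Hypothetical helper: the same result with a single parse per string.
fn with_filter_map(words: &[&str]) -> Vec<u32> {
    words.iter()
        .filter_map(|w| w.parse::<u32>().ok()) // None is dropped, Some(v) yields v
        .collect()
}

fn main() {
    let words = ["44", "asdf", "12"];
    println!("{:?}", with_filter_then_map(&words)); // prints [44, 12]
    println!("{:?}", with_filter_map(&words));      // prints [44, 12]
}
```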
Using this feature, the above code can be further reduced to the following:
use std::io::stdin;
fn main() {
    println!("Enter a number, and I'll square it for you:");
    stdin().lines()
        .map(|l| l.ok())
        .filter_map(|data| data
            .and_then(|d| d.parse::<u32>().ok())
            .and_then(|val| {
                let square = val * val;
                println!("{} squared is {}", val, square);
                println!("Enter a number, and I'll square it for you:");
                Some(())
            })).collect::<()>();
}
Making It Into a Streaming Program (for CLI pipe usage)
Often, when writing programs like these, the common use-case might not be for user-interaction, but rather for reading streaming data from a buffered source, such as a pipe or a file. If we just want to take a bunch of lines from the input, ignore anything that’s not a number, and then return the list of squares, in order, back to the user, we can condense the above code into the following:
use std::io::stdin;
fn main() {
    stdin().lines()                               // Iterate over lines of input text
        .map(|l| l.ok())                          // Convert the Result into Option
        .filter_map(|data| data                   // For each Option of input, map it to "data"
            .and_then(|d| d.parse::<u32>().ok())) // Try parsing as u32, and convert this Result to Option
        .for_each(|val| println!("{}", val*val)); // For every u32 coming from this sequence, square it
                                                  // and print the square to stdout
}
The above, then, can be given the following input data via stdin:
sdfasdfas
asdf
adsfdfsa
44
sadfasdfas
12
dsafasdfsadfsqae
96
sdfsadf
asdfadsf
55
This can be accomplished by saving the input above as testdata.txt in the crate’s folder and running:
cargo run < testdata.txt
And the following result should be displayed, corresponding to the squares for 44, 12, 96, and 55:
1936
144
9216
3025
Taking this another step further, you’re likely to want to collect all the significant items out of the input data, and store the list of computed results into a Vec, for use later on. Note that, in this example, some additional line breaks have been inserted in order to make room for more code comments.
use std::io::stdin;
fn main() {
    let squares: Vec<u32> = stdin().lines()       // Iterate over lines of input text
        .map(|l| l.ok())                          // Convert the Result into Option
        .filter_map(|data| data                   // For each Option of input, map it to "data"
            .and_then(|d| d.parse::<u32>().ok())  // Try parsing as u32, yielding a Result, convert to Option
            .and_then(|v| Some(v*v)))             // If it's a Some(_), put the square in a new Some(_)
        .collect();                               // collect() all the squares into a Vec
    // .... do something with squares ....
}
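One way to exercise this pipeline without a terminal or pipe is to run it over any buffered reader instead of stdin. The sketch below assumes a hypothetical helper named squares_from and feeds it an in-memory Cursor holding a shortened version of the sample input:

```rust
use std::io::{BufRead, Cursor};

// Hypothetical helper: the same chain as above, generic over any BufRead
// source so that it can be driven from memory as well as from stdin.
fn squares_from<R: BufRead>(reader: R) -> Vec<u32> {
    reader.lines()
        .map(|l| l.ok())
        .filter_map(|data| data
            .and_then(|d| d.parse::<u32>().ok())
            .and_then(|v| Some(v * v)))
        .collect()
}

fn main() {
    let input = Cursor::new("asdf\n44\nsadfasdfas\n12\n96\nsdfsadf\n55\n");
    println!("{:?}", squares_from(input)); // prints [1936, 144, 9216, 3025]
}
```

Passing stdin().lock() as the reader should give the same behavior as the version above. Note that, as in the original, inputs above 65535 overflow a u32 when squared.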
Permanent Link: https://blog.malware.re/2022/12/23/Rust-filter-map/index.html