Chapter 14. Functions, closures and iterators
If you always do what you always did, you'll always get what you always got. - Henry Ford
Functions versus methods
Rust provides us a multitude of ways to structure our code, and one of the most important ones is how we use functions, methods, and the anonymous relative of functions called closures.
These give us freedom to structure our code in a way that is easy to read, maintain, and understand, depending on the context in which our software will be employed.
Until now, you've seen all of these used but never explained in detail. In this chapter we'll do just that.
Functions
Functions are the simplest way to structure our code. They always have a name and do not capture their environment. As you've seen, they're defined using the fn keyword followed by their name.
Typically, you need to define the parameters, their types and a return type. The body is a simple { } block, and it is standard to omit the return statement, because the last line is an expression whose value is returned.
fn cuber(x: i32) -> i32 {
x * x * x
}
Unlike closures, functions never have their signatures inferred: the compiler requires you to annotate every parameter type and any return type other than (). This keeps each function's contract explicit.
No overloading
Rust does not support function overloading, something you might have seen in other languages like C++ or Java. That means that you cannot define multiple functions with the same name in the same scope, or you'll receive a big ol' compiler error.
Trying this:
fn cuber(x: i32) -> i32 {
x * x * x
}
fn cuber(x: f32) -> f32 {
x * x * x
}
Gives us the following error:
error[E0428]: the name `cuber` is defined multiple times
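One common way to get a single cuber that works for several numeric types is a generic function with a trait bound. The following is a minimal sketch, not the only approach: the bound `Mul<Output = T> + Copy` is an assumption chosen for this example and covers the built-in numeric types.

```rust
use std::ops::Mul;

// Hypothetical generic replacement for the two `cuber` definitions above.
// `T` can be any copyable type whose product with itself is again a `T`.
fn cuber<T>(x: T) -> T
where
    T: Mul<Output = T> + Copy,
{
    x * x * x
}

fn main() {
    println!("{}", cuber(3)); // called with an i32
    println!("{}", cuber(2.0_f32)); // called with an f32
}
```

The compiler generates a separate version of cuber for each concrete type it is called with, so there is still only one name to remember.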
return not necessary
In Rust, the last expression in a function is automatically returned, so you don't need to use the return keyword. This is different from other languages like C++ or Java, where you need to explicitly use return to return a value.
Notice how we don't have a semicolon ; at the end of the last expression in the function. Adding one would turn the expression into a statement, and the function would no longer return its value.
fn cuber(x: f32) -> f32 {
x * x * x
}
That means that the above function is equivalent to this:
fn cuber(x: f32) -> f32 {
return x * x * x;
}
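The return keyword still earns its place when you want to leave a function early. Here is a small sketch; the function name `cube_non_negative` and its rule of mapping negative input to 0 are made up for illustration.

```rust
// Hypothetical helper: cubes non-negative input and maps negative input to 0.
fn cube_non_negative(x: i32) -> i32 {
    if x < 0 {
        return 0; // early exit: explicit `return`, written as a statement
    }
    x * x * x // final expression: returned implicitly, no semicolon
}

fn main() {
    println!("{}", cube_non_negative(3));
    println!("{}", cube_non_negative(-2));
}
```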
Methods
Methods, like functions, let us define a block of code that can be reused. Unlike functions, however, methods are always associated with a type. This means that they can access the data of the type they're associated with.
For a type (such as a struct or an enum), we define methods inside an impl block. impl is short for "implementation" and is used to define the methods of a type.
Following our previous example, let's look at how we might implement cuber as a method of a struct called Arithmetic.
struct Arithmetic {
x: i32,
}
impl Arithmetic {
fn cuber(&self) -> i32 {
self.x * self.x * self.x
}
}
Methods are usually called using familiar dot notation. Here's how we'd use the cuber method of the Arithmetic struct:
fn main() {
let arithmetic_container = Arithmetic { x: 3 };
let result = arithmetic_container.cuber(); // dot notation
println!("The cube of 3 is: {}", result); // The cube of 3 is: 27
}
Note that the &self parameter is always passed to type methods. arithmetic_container.cuber() is the same as Arithmetic::cuber(&arithmetic_container).
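To make that equivalence concrete, the sketch below calls the same method both ways. It repeats the Arithmetic definition from above so that it compiles on its own.

```rust
struct Arithmetic {
    x: i32,
}

impl Arithmetic {
    fn cuber(&self) -> i32 {
        self.x * self.x * self.x
    }
}

fn main() {
    let arithmetic_container = Arithmetic { x: 3 };

    // Dot notation: the compiler borrows `arithmetic_container` for us.
    let via_dot = arithmetic_container.cuber();

    // Path notation: we pass the `&self` argument explicitly.
    let via_path = Arithmetic::cuber(&arithmetic_container);

    assert_eq!(via_dot, via_path);
    println!("{} {}", via_dot, via_path);
}
```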
Privacy first
Just like fields, methods are private by default, even if the struct itself is marked pub. A method that is not marked pub can only be accessed from within the module where it is defined.
If you want a method to be accessible from outside the module, you need to mark it as pub:
pub struct Arithmetic {
x: i32,
}
impl Arithmetic {
pub fn cuber(&self) -> i32 {
self.x * self.x * self.x
}
}
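Privacy only becomes visible once a module boundary is involved. The sketch below wraps the struct in a hypothetical module named `math` and adds two illustrative methods that are not in the original example: a public constructor `new` (needed because the field x is private) and a private helper `square`.

```rust
mod math {
    pub struct Arithmetic {
        x: i32, // private field: only code inside `math` can read it
    }

    impl Arithmetic {
        // Public constructor, since callers outside `math` cannot set `x` directly.
        pub fn new(x: i32) -> Self {
            Arithmetic { x }
        }

        // Public method: callable from anywhere that can see `Arithmetic`.
        pub fn cuber(&self) -> i32 {
            self.square() * self.x
        }

        // Private helper: callable only from inside `math`.
        fn square(&self) -> i32 {
            self.x * self.x
        }
    }
}

fn main() {
    let arithmetic_container = math::Arithmetic::new(3);
    println!("{}", arithmetic_container.cuber());

    // Neither of these lines would compile here, outside of `math`:
    // arithmetic_container.square();
    // arithmetic_container.x;
}
```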
Environment reusability
Because methods are attached to a type and implicitly pass the self parameter, they can access the data of the type they're associated with. For simplicity, let's think of that data as their captured environment.
This is a very useful feature, because it allows us to reuse the same method for different instances of the same type, without having to pass the data explicitly. This is different from functions, which do not have access to the data of their environment or a type.
In the next section, we'll look at closures, which also capture their environment but are not associated with a type.
Overview of closures
Closures are functions that can capture their environment. They are called anonymous because we developers don't need to define a name for them.
They can be stored in variables and called like standalone functions, or passed to other functions as arguments. It is very important that you pay attention to the context in which you are using them, as the environment they capture can change in different ways.
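As a quick illustration of what capturing the environment means, the sketch below uses a closure that reads a local variable it was never passed as a parameter. The function name `cube_with_captured_exponent` is invented for this example.

```rust
// Hypothetical helper showing a closure that captures a local variable.
fn cube_with_captured_exponent(base: i32) -> i32 {
    let exponent = 3;

    // `exponent` is not a parameter of the closure; it is captured
    // from the surrounding scope by shared reference.
    let power = |x: i32| x.pow(exponent);

    // A nested `fn` could not do this, because functions cannot capture:
    // fn power(x: i32) -> i32 { x.pow(exponent) } // would not compile

    power(base)
}

fn main() {
    println!("{}", cube_with_captured_exponent(3));
}
```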
Basic syntax
To define a closure, we use the following, which is similar to a function definition but with the addition of the || syntax to indicate the parameters:
let cuber = |x: i32| x * x * x;
let result = cuber(3);
println!("The cube of 3 is: {}", result); // Output: The cube of 3 is: 27
Another difference, specific to cuber, is the lack of curly braces, which are not needed when the closure body is a single expression. In addition, the compiler infers the return type.
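The following sketch puts the syntax variations side by side. The wrapper function `closure_results` and the closures `verbose_cuber` and `add_one` are illustrative names only.

```rust
// Hypothetical wrapper that builds three closures and calls each one.
fn closure_results(n: i32) -> (i32, i32, i32) {
    // Single expression: no braces needed, return type inferred as i32.
    let cuber = |x: i32| x * x * x;

    // Several statements: braces are required, just like a function body.
    let verbose_cuber = |x: i32| {
        let squared = x * x;
        squared * x
    };

    // No annotations at all: the parameter type is inferred from the call below.
    let add_one = |x| x + 1;

    (cuber(n), verbose_cuber(n), add_one(n))
}

fn main() {
    println!("{:?}", closure_results(3));
}
```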
Functions with closure input parameters
When we define a function that takes a closure as an input parameter, we need to specify the type of the closure. This is done using function traits, which are traits that represent a function or closure.
Function trait bounds
By default, all closures implement FnOnce, because every closure can be called at least once.
FnOnce is actually the least restrictive function trait - it requires only that the closure can be called one time, even if it consumes captured variables.
But if a closure doesn't consume its captures, it may also implement FnMut and Fn. Let's look at each trait in more detail.
Fn(): Use when the closure is read-only and non-consuming.
fn use_fn<F>(f: F, num: i32) -> i32 where F: Fn(i32) -> i32 {
f(num)
}
fn main() {
let cuber = |x: i32| x * x * x;
let result = use_fn(cuber, 3);
println!("The cube of 3 is: {:?}", result);
}
Follow along with this carefully. We're defining a function called use_fn that takes a closure and an i32 argument as input parameters and returns the closure's result. The closure is bounded by the Fn trait, which means it can be called like a function inside of use_fn.
Obviously, this redirection is contrived for example's sake. Don't try this in prod.
FnMut(): Use when the closure mutates state but does not consume it. This does not allow the closure to take ownership of the captured variables.
We use the FnMut trait when we want to allow the closure to modify its captured variables. This is useful when we want to change the state of the closure, but still want to be able to call it multiple times.
fn use_fn_mut<F>(mut f: F, num: i32) where F: FnMut(i32) -> i32 {
f(num);
}
fn main() {
let mut result = 0;
let cuber = |x: i32| {
result = x * x * x;
result
};
use_fn_mut(cuber, 3);
}
Notice how we are able to mutate result inside of the closure, but we are not able to take ownership of it. We've marked f as mut, which allows us to modify the closure's state, given the closure trait is defined as FnMut.
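The effect of an FnMut closure is easiest to see when the mutated variable is read again afterwards. Below is a small sketch with invented names (`call_three_times`, `sum_of_cubes`) that calls the closure several times and then inspects the captured state.

```rust
// Hypothetical helper: calls the closure with 1, 2 and 3.
fn call_three_times<F>(mut f: F)
where
    F: FnMut(i32),
{
    f(1);
    f(2);
    f(3);
}

// Hypothetical helper: accumulates cubes through a mutating closure.
fn sum_of_cubes() -> i32 {
    let mut total = 0;

    // The closure borrows `total` mutably and updates it on every call.
    call_three_times(|x| total += x * x * x);

    // The closure is gone by now, so the mutable borrow has ended
    // and `total` can be read again.
    total
}

fn main() {
    println!("{}", sum_of_cubes());
}
```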
FnOnce(): Use when the closure consumes the state, or rather, takes ownership of captured variables.
We use the FnOnce trait when we want to allow the closure to take ownership of its captured variables. This is useful when we want to consume the closure and call it at most once.
Let's add an arbitrary string to the environment that's captured by the closure to see how this works.
fn use_fn_once<F>(f: F, num: i32) -> i32
where
F: FnOnce(i32) -> i32
{
f(num)
}
fn main() {
let my_string = String::from("I'm owned.");
let cuber = move |x: i32| {
println!("Captured: {}", my_string);
x * x * x
};
let result = use_fn_once(cuber, 3);
println!("The cube of 3 is: {:?}", result);
// The cuber closure has been consumed and cannot be used again
// println!("The cube of 4 is: {:?}", cuber(4)); // This will not compile
}
If you uncomment that println! statement you'll see something like the following:
error[E0382]: borrow of moved value: `cuber`
Here we use the move keyword to indicate that the closure should take ownership of the my_string variable. The string is moved into the closure when the closure is created, so my_string is no longer available in main from that point on. Passing cuber to use_fn_once then moves the closure itself, which is why the commented-out call fails.
One thing to keep in mind about function traits and closures is that if a closure captures nothing, or only captures variables by shared reference, it can implement Fn. This means it can be called multiple times without mutation or consumption.
Rust automatically infers which of these traits (Fn, FnMut, or FnOnce) a closure implements from how its body uses the captured variables, so you only need to name a trait explicitly when you accept a closure as a parameter and want to constrain its behavior (e.g., allow mutation or ownership consumption).
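To see that inference at work, the sketch below passes three closures to three helpers, one per trait. All names here (`call_fn`, `call_fn_mut`, `call_fn_once`, `trait_demo`) are made up for illustration.

```rust
// Hypothetical helpers, one per function trait.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // may be called repeatedly without any mutation
}

fn call_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f(); // each call may change the captured state
    f()
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f() // may be called only once
}

// Hypothetical wrapper that exercises all three helpers.
fn trait_demo() -> (usize, usize, String) {
    let name = String::from("ad");

    // Only reads `name`, so this closure implements Fn.
    let doubled_len = call_fn(|| name.len());

    // Mutates `calls`, so this closure implements FnMut but not Fn.
    let mut calls = 0;
    let call_count = call_fn_mut(|| {
        calls += 1;
        calls
    });

    // Moves `name` out of itself when called, so it implements only FnOnce.
    let owned = call_fn_once(move || name);

    (doubled_len, call_count, owned)
}

fn main() {
    println!("{:?}", trait_demo());
}
```

A closure that implements Fn would also be accepted by call_fn_mut and call_fn_once, because every Fn closure implements the other two traits as well.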
Idiomatic iteration in Rust
Iterators are one of the power tools we Rust developers can leverage to make custom types and built-in collections vastly easier to work with.
Making your custom Stack type function just like a Vec is incredibly simple and comes with all the normal built-in iteration methods you're accustomed to. We'll review those and how to implement custom iterators in this section.
The Iterator trait
To make a custom type iterable all you need to do is implement the Iterator trait. Iterator is built-in and comes with all the methods you're used to using with built-in collections like Vec and HashMap.
So let's first define a custom AdStack type that will hold a vector of ids for ads. We'll then implement the Iterator trait for it.
#[derive(Debug)]
struct AdStack {
ids: Vec<i32>,
}
And the iterator implementation with the Iterator trait:
impl Iterator for AdStack {
type Item = i32;
fn next(&mut self) -> Option<Self::Item> {
if self.ids.is_empty() {
None
} else {
Some(self.ids.remove(0))
}
}
}
And that's it! Now we can do things like:
fn main() {
let mut ad_stack = AdStack { ids: vec![1, 2, 3, 4, 5] };
while let Some(ad_id) = ad_stack.next() {
println!("Ad ID: {}", ad_id);
}
}
The next method
You might have noticed the single method that we implemented in the Iterator above, next. It's the only one you need to implement to make your custom type iterable.
It allows you to call .next() on your type and returns an Option<T> type (in this case Option<i32>).
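Calling next by hand shows exactly what the Option carries. The sketch below repeats the AdStack definition from above so that it compiles on its own, and also shows that a for loop accepts the type directly.

```rust
struct AdStack {
    ids: Vec<i32>,
}

impl Iterator for AdStack {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.ids.is_empty() {
            None
        } else {
            Some(self.ids.remove(0))
        }
    }
}

fn main() {
    let mut ad_stack = AdStack { ids: vec![1, 2] };

    // Each call hands out one element wrapped in Some...
    assert_eq!(ad_stack.next(), Some(1));
    assert_eq!(ad_stack.next(), Some(2));
    // ...and None signals that the iterator is exhausted.
    assert_eq!(ad_stack.next(), None);

    // Any type that implements Iterator can be used in a for loop.
    for ad_id in (AdStack { ids: vec![3, 4] }) {
        println!("Ad ID: {}", ad_id);
    }
}
```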
Here's yet another usage example, leveraging built-ins:
fn main() {
let ad_stack = AdStack { ids: vec![1, 2, 3, 4, 5] };
ad_stack.for_each(|ad_id| {
println!("Ad ID: {}", ad_id);
});
}
Built-in iteration methods
The collections built into Rust come with a host of iteration methods that are incredibly useful. These methods are defined on the Iterator trait and can be used with any type that implements the Iterator trait.
Let's review just a handful you'll see throughout the Rust ecosystem.
Any
The any Iterator method returns a boolean indicating if any element matches the predicate.
It returns true as soon as it finds an element for which the predicate returns true, and false otherwise. It stops processing as soon as it finds a match.
Examine how we're simply checking whether the ad_ids vector below contains an odd number.
let ad_ids = vec![1, 2, 3, 4, 5];
let has_odd = ad_ids.iter().any(|&x| x % 2 != 0);
println!("{}", has_odd); // true
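The short-circuiting can be observed by counting how often the predicate runs. This is a sketch with an invented helper, `any_with_count`, that returns both the result of any and the number of elements it inspected.

```rust
// Hypothetical helper: reports whether any id is odd and how many ids were checked.
fn any_with_count(ids: &[i32]) -> (bool, usize) {
    let mut checked = 0;
    let has_odd = ids.iter().any(|&x| {
        checked += 1; // count every element the predicate actually sees
        x % 2 != 0
    });
    (has_odd, checked)
}

fn main() {
    // The first odd number is the third element, so `any` stops there.
    println!("{:?}", any_with_count(&[2, 4, 5, 6, 7]));
    // Without a match, every element is checked.
    println!("{:?}", any_with_count(&[2, 4]));
}
```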
Filter
The filter method serves a different purpose than any and has a different behavior.
Its purpose is to create a new iterator that only includes elements that satisfy a given predicate.
It does this by processing all elements of the iterator and returning a new iterator containing only the elements for which the predicate returns true.
As an example, the following keeps only the odd values from the ad_ids vector.
fn main() {
let ad_ids = vec![1, 2, 3, 4, 5];
let odd_ad_ids: Vec<i32> = ad_ids.into_iter().filter(|&x| x % 2 != 0).collect();
println!("{:?}", odd_ad_ids); // [1, 3, 5]
}
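Note that into_iter consumes the vector, so ad_ids cannot be used afterwards. When the original collection is still needed, one option is to iterate over references and copy the elements out, as in this sketch (the helper name `odd_ids` is illustrative).

```rust
// Hypothetical helper: returns the odd ids without consuming the input.
fn odd_ids(ids: &[i32]) -> Vec<i32> {
    ids.iter() // yields &i32, borrowing from `ids`
        .copied() // turns each &i32 into an i32
        .filter(|x| x % 2 != 0) // the predicate receives a reference to each item
        .collect()
}

fn main() {
    let ad_ids = vec![1, 2, 3, 4, 5];
    let odd_ad_ids = odd_ids(&ad_ids);
    println!("{:?}", odd_ad_ids);
    // `ad_ids` was only borrowed, so it is still available here.
    println!("{:?}", ad_ids);
}
```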
Map
The map method is used to transform each element of an iterator into a new value. It takes a closure as an argument and applies it to each element of the iterator, returning a new iterator with the transformed values.
Using our same ad_ids vector, we can use map to double each element:
fn main() {
let ad_ids = vec![1, 2, 3, 4, 5];
let doubled_ad_ids: Vec<i32> = ad_ids.into_iter().map(|x| x * 2).collect();
println!("{:?}", doubled_ad_ids); // [2, 4, 6, 8, 10]
}
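The closure passed to map does not have to return the same type it receives. As a sketch, the invented helper `ad_labels` below turns each id into a String; the "ad-" prefix is made up for the example.

```rust
// Hypothetical helper: maps each i32 id to a String label.
fn ad_labels(ids: &[i32]) -> Vec<String> {
    ids.iter().map(|id| format!("ad-{}", id)).collect()
}

fn main() {
    let ad_ids = vec![1, 2, 3];
    println!("{:?}", ad_labels(&ad_ids));
}
```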
filter_map
So what if we only want to double those ad_ids that are odd? We can use the filter_map method, which combines the functionality of filter and map.
fn main() {
let ad_ids = vec![1, 2, 3, 4, 5];
let doubled_odd_ad_ids: Vec<i32> = ad_ids.into_iter().filter_map(|x| {
if x % 2 != 0 {
Some(x * 2)
} else {
None
}
}).collect();
println!("{:?}", doubled_odd_ad_ids); // [2, 6, 10]
}
How gorgeous is that?
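filter_map is especially handy when the transformation itself can fail. The sketch below, using an invented helper `parse_ad_ids`, keeps only the strings that parse as an i32 and silently drops the rest.

```rust
// Hypothetical helper: parses the entries that are valid i32 values.
fn parse_ad_ids(raw: &[&str]) -> Vec<i32> {
    raw.iter()
        // `parse` returns a Result; `ok()` converts it to an Option,
        // so failed parses become None and are filtered out.
        .filter_map(|s| s.parse::<i32>().ok())
        .collect()
}

fn main() {
    let raw_ids = ["1", "two", "3"];
    println!("{:?}", parse_ad_ids(&raw_ids));
}
```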
Zero-cost abstractions
Built-in iteration methods are called zero-cost abstractions because they allow us to simplify our code without introducing performance overhead. When you rock things like filter, map, and fold, the compiler is usually able to optimize the abstraction away, so they don't add extra cost to your code versus you writing the same thing manually with for loops.
fn main() {
let ad_ids = vec![1, 2, 3, 4, 5];
let filtered_ad_ids: Vec<i32> = ad_ids.iter().filter(|&x| x > &3).map(|&x| x * 2).collect();
println!("{:?}", filtered_ad_ids); // [8, 10]
}
The code just above is equivalent to the following:
fn main() {
let ad_ids = vec![1, 2, 3, 4, 5];
let mut filtered_ad_ids = Vec::new();
for &x in ad_ids.iter() {
if x > 3 {
filtered_ad_ids.push(x * 2);
}
}
println!("{:?}", filtered_ad_ids); // [8, 10]
}
As you can see, the filter and map version is easier to read, and it typically incurs the same performance cost as the for loop.
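The same comparison can be made for fold, which was mentioned earlier but not shown. The two functions below are a sketch with invented names; they compute the same value, once with fold and once with a hand-written loop. No benchmark is implied here, only that the two forms are equivalent in what they compute.

```rust
// Hypothetical helper: sums the doubled ids using fold.
fn sum_of_doubles_fold(ids: &[i32]) -> i32 {
    // `acc` starts at 0 and carries the running total from element to element.
    ids.iter().fold(0, |acc, &x| acc + x * 2)
}

// Hypothetical helper: the same computation written as a manual loop.
fn sum_of_doubles_loop(ids: &[i32]) -> i32 {
    let mut acc = 0;
    for &x in ids {
        acc += x * 2;
    }
    acc
}

fn main() {
    let ad_ids = vec![1, 2, 3, 4, 5];
    println!("{}", sum_of_doubles_fold(&ad_ids));
    println!("{}", sum_of_doubles_loop(&ad_ids));
}
```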
Iteration methods like these are also lazy, meaning they do no work until the iterator is consumed. For example, calling filter_map on an iterator does not run your closure at all; it only builds a new iterator, and the closure runs once collect or some other consuming method asks for elements.
Below, we'll save an iterator and consume it later.
fn main() {
let ad_ids = vec![1, 2, 3, 4, 5];
let iter = ad_ids.iter().copied().filter(|&x| x > 3);
let filtered_ad_ids: Vec<i32> = iter.collect();
println!("{:?}", filtered_ad_ids); // [4, 5]
}
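Laziness can be made visible by counting how often the predicate runs before and after the iterator is consumed. The sketch below uses a Cell as the counter so that the count can be read while the iterator still exists; the function name `lazy_demo` is invented for this example.

```rust
use std::cell::Cell;

// Hypothetical helper: returns the predicate call count before and after
// consumption, together with the filtered ids.
fn lazy_demo() -> (i32, i32, Vec<i32>) {
    let ad_ids = vec![1, 2, 3, 4, 5];
    let calls = Cell::new(0);

    // Building the iterator does not run the predicate.
    let iter = ad_ids.iter().copied().filter(|&x| {
        calls.set(calls.get() + 1);
        x > 3
    });
    let calls_before = calls.get();

    // Consuming the iterator runs the predicate once per element.
    let filtered_ad_ids: Vec<i32> = iter.collect();
    let calls_after = calls.get();

    (calls_before, calls_after, filtered_ad_ids)
}

fn main() {
    println!("{:?}", lazy_demo());
}
```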
Laziness brings both flexibility and performance benefits: iteration methods can be chained, saved in variables for later use and passed into other functions, as you saw above, and no intermediate collections are built along the way.
This is a powerful feature of Rust, and it allows us to write code that is efficient, easy to read and easy to maintain.