Chapter 10. Reference lifetimes, generics and traits

Reference lifetimes, generics and traits are three language features that make Rust extremely extensible, as well as uniquely safe, for a low-level language.

Code reuse is a fundamental principle of quality software engineering. It reduces the error surface, speeds up debugging and allows others to better understand the code they're reading. Generics and traits are critical to accomplishing this in Rust.

Lastly, describing how long a reference lives through annotations on function parameters and return values is incredibly useful, though likely new to most developers, as few other languages have a comparable feature.

The point is that you don’t know the benefits in advance. You need to prepare your mind with the foundations. Many practical techniques of today will be obsolete in the future. – Alexander Stepanov

Defining reference lifetimes

Put simply, a reference lifetime is the scope in which a reference is valid. As an amazing feature, Rust lets developers describe how long references to some address in memory must stay valid. And when there's ambiguity, Rust actually requires developers to annotate reference lifetimes.

Remember when we claimed Rust was made for safety? Lifetimes are one of its defining features in achieving that objective, because they prevent dangling references. Let's dig into an example of a dangler.

// will not compile
fn main() {
    let first;

    {
        let first_second = "Hello";
        first = &first_second;
    }
    println!("{}, world!", first);
}

You should see an error in your editor that says something similar to the following, when hovering over the first = &first_second; line:

`first_second` does not live long enough
borrowed value does not live long enough (rustc E0597)
──────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597

And the compiler is even more helpful when hovering over the last bracket defining the inner scope, which tells us exactly what's going on.

`first_second` dropped here while still borrowed (rustc E0597)
──────────────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597

Ultimately the code won't compile because we attempt to use, in the outer scope, a reference to a value that only lives in the inner scope.

Fixing this is simple; remove the inner scope.

fn main() {
    let first;

    let first_second = "Hello";

    first = &first_second;

    println!("{}, world!", first);
}
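If the inner scope has to stay, another option is to move the owner out of it so the value outlives the reference. Here is a minimal sketch of that idea; the helper name build_greeting is made up for this illustration and is not part of the chapter's code.

```rust
// Hypothetical helper, used only to illustrate the alternative fix.
fn build_greeting() -> String {
    // The owner now lives in the outer scope...
    let first_second = "Hello";
    let first;

    {
        // ...so borrowing it from inside the inner scope is fine.
        first = &first_second;
    }

    // `first_second` is still alive here, so `first` is still valid.
    format!("{}, world!", first)
}

fn main() {
    println!("{}", build_greeting());
}
```

The inner scope ends before the reference is used, but nothing the reference points to is dropped there, so the borrow checker has no complaint.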

Lifetime annotation

Annotating lifetimes simply tells the Rust compiler how multiple references relate to one another.

An annotation starts with an apostrophe ' and is by convention very short, e.g. 'a. You can use annotations in function signatures, on both parameter types and return types.

For example, the function below takes two vectors of integers and returns a reference to whichever has the greater sum over its elements.

fn bigger_sum<'a>(first: &'a Vec<i32>, second: &'a Vec<i32>) -> &'a Vec<i32> {
    let sum_first: i32 = first.iter().sum();
    let sum_second: i32 = second.iter().sum();

    if sum_first > sum_second {
        first
    } else {
        second
    }
}

The function above says a few things to the compiler. It says that for some lifetime 'a, defined by <'a>:

  • the parameters first and second each must live at least as long as 'a, and
  • the reference returned from the function will live at least as long as 'a.

Exploring the lifetime contract

Using our function from above, let's explore how this can be used and the types of errors to expect when misused.

fn main() {
    let first: Vec<i32> = vec![1, 2, 3, 4];
    let second: Vec<i32> = vec![-1, 2, 3, 4];

    bigger_sum(&first, &second);
}

The above works without a hitch, and it's obvious why; after all, first and second clearly have the same lifespan.

fn main() {
    let second: Vec<i32> = vec![-1, 2, 3, 4];

    {
        let third: Vec<i32> = vec![2, 3, 4, 5];
        bigger_sum(&second, &third);
    }
}

In the above we clearly have different lifetimes, i.e. third has an inner scope that is clearly smaller than second. The compiler substitutes the smaller of the lifetimes necessary into 'a.
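To make that concrete, the returned reference can be used freely inside the inner scope, since both vectors are still alive there. The sketch below repeats bigger_sum so it stands on its own; the winner binding is an addition for this illustration.

```rust
fn bigger_sum<'a>(first: &'a Vec<i32>, second: &'a Vec<i32>) -> &'a Vec<i32> {
    let sum_first: i32 = first.iter().sum();
    let sum_second: i32 = second.iter().sum();

    if sum_first > sum_second {
        first
    } else {
        second
    }
}

fn main() {
    let second: Vec<i32> = vec![-1, 2, 3, 4];

    {
        let third: Vec<i32> = vec![2, 3, 4, 5];
        // 'a only needs to cover this inner scope, where both vectors are alive
        let winner = bigger_sum(&second, &third);
        // the sums are 8 and 14, so `winner` refers to `third`
        println!("{:?}", winner);
    }
}
```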

Now let's cause some breakage.

fn main() {
    let second: Vec<i32> = vec![-1, 2, 3, 4];

    let should_be_second: &Vec<i32>;

    {
        let forth: Vec<i32> = vec![2, 3, 4];
        should_be_second = bigger_sum(&second, &forth);
    }
    println!("{:?}", should_be_second);
}

In the function above, compiler errors start on the declaration of forth:

binding `forth` declared here (rustc E0597)
────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597

They happen again on the usage of &forth in the call to bigger_sum:

`forth` does not live long enough
borrowed value does not live long enough (rustc E0597)
──────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597

And lastly the compiler flags the closing brace of the inner scope:

`forth` dropped here while still borrowed (rustc E0597)
───────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597
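When a function only ever hands back a borrow of one argument, giving each parameter its own lifetime loosens the contract. The sketch below uses a hypothetical function, first_with_report, that is not part of the chapter's code. Its result is tied to 'a only, so the second argument is free to be dropped earlier.

```rust
// Hypothetical variant: it compares the sums, but only ever returns `first`.
fn first_with_report<'a, 'b>(first: &'a Vec<i32>, second: &'b Vec<i32>) -> &'a Vec<i32> {
    let sum_first: i32 = first.iter().sum();
    let sum_second: i32 = second.iter().sum();
    println!("first beats second: {}", sum_first > sum_second);
    first
}

fn main() {
    let second: Vec<i32> = vec![-1, 2, 3, 4];
    let kept: &Vec<i32>;

    {
        let forth: Vec<i32> = vec![2, 3, 4];
        // `forth` only has to outlive the call itself, because the return
        // type mentions 'a and never 'b
        kept = first_with_report(&second, &forth);
    }

    println!("{:?}", kept);
}
```

With a single shared 'a, as in bigger_sum, the same main would be rejected, because the result might borrow from forth.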

Guidelines

In general, a function cannot return a reference whose lifetime has nothing to do with the lifetimes of its parameters. The below will fail miserably:

fn failing_lifetime_function<'a>(x: &i32) -> &'a i32 {
    let result: i32 = 42;
    &result
}

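Your compiler should complain about the &result reference: result is owned by the function and dropped when the function returns, so the reference cannot satisfy 'a, which in turn has nothing to do with the lifetime of parameter x.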

cannot return reference to local variable `result`
returns a reference to data owned by the current function (rustc E0515)
───────────────────────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0515
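There are two common ways out of this error, sketched below under the assumption that the caller is happy with either one. The function names owned_answer and bigger_of are hypothetical and exist only for this illustration.

```rust
// Option 1: return an owned value, so nothing is borrowed at all.
fn owned_answer(_x: &i32) -> i32 {
    let result: i32 = 42;
    result
}

// Option 2: return a reference that really does come from the parameters.
fn bigger_of<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x > y {
        x
    } else {
        y
    }
}

fn main() {
    let a = 7;
    let b = 42;
    println!("{}", owned_answer(&a));
    println!("{}", bigger_of(&a, &b));
}
```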

Getting formal

Lifetime annotations on function or method parameters are called input lifetimes, and those on return values are called output lifetimes. Importantly, a developer does not always have to write lifetimes out, because Rust performs lifetime elision for common, deterministic cases.

The rules Rust uses are as follows:

  1. input lifetimes - each elided lifetime in the parameters gets its own lifetime, e.g. fn ltime(x: &i32, y: &str) becomes fn ltime<'a, 'b>(x: &'a i32, y: &'b str).
  2. output lifetimes - if there is exactly one input lifetime, it is assigned to every elided output lifetime, i.e. fn ltime(x: &i32) -> &i32 becomes fn ltime<'a>(x: &'a i32) -> &'a i32.
  3. output lifetimes - if one of the inputs is &self or &mut self, the lifetime of self is assigned to every elided output lifetime. If you sit and consider this rule, it makes a lot of sense: a method usually hands back something borrowed from self, so the result can be valid only as long as the reference to self.
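The rules above can be seen at work in a small sketch. The names first_word, first_word_explicit and Label are made up for this illustration; the two free functions have the same signature once elision is applied.

```rust
// Rules 1 and 2: one reference parameter, so its lifetime flows to the output.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// The same signature with nothing elided.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

struct Label {
    name: String,
}

impl Label {
    // Rule 3: the output borrows from `&self`, not from `_prefix`.
    fn name(&self, _prefix: &str) -> &str {
        &self.name
    }
}

fn main() {
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));

    let label = Label { name: String::from("pint") };
    let name;
    {
        let prefix = String::from("short-lived");
        // fine: the result is tied to `label`, so `prefix` may be dropped
        name = label.name(&prefix);
    }
    println!("{}", name);
}
```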

Leveraging generic types

Generic types allow us to build structs, enums and function signatures that work with many concrete types. This greatly reduces boilerplate and helps developers adhere to the DRY (don't repeat yourself) paradigm.

This makes code safer, easier to understand, maintain and debug.

Type parameters

Type parameter names in Rust are flexible; they can be anything you want. That said, by convention we'll start with T. Let's see this in action with our bigger_sum function from the earlier section on reference lifetimes.

In particular, let's refactor that function to take Vectors of T. Here's what it looked like before.

fn bigger_sum<'a>(first: &'a Vec<i32>, second: &'a Vec<i32>) -> &'a Vec<i32> {
    // find sums of each
    let sum_first: i32 = first.iter().sum();
    let sum_second: i32 = second.iter().sum();
    // return vector with larger sum
    if sum_first > sum_second {
        println!("The first vector has a larger sum: {}", sum_first);
        first
    } else {
        println!("The second vector has a larger sum: {}", sum_second);
        second
    }
}

Now we're going to add some trait bounds, which we'll cover in detail over the next section, and refactor the signature and the two sum lines to make this bad boy work with pretty much any primitive numeric type (floats included, since f32 and f64 implement all three traits).

fn bigger_sum<'a, T>(first: &'a Vec<T>, second: &'a Vec<T>) -> &'a Vec<T>
where
    T: 'a + std::iter::Sum<&'a T> + std::cmp::PartialOrd + std::fmt::Display,
{
    // find sums of each
    let sum_first: T = first.iter().sum();
    let sum_second: T = second.iter().sum();
    // return vector with larger sum
    if sum_first > sum_second {
        println!("The first vector has a larger sum: {}", sum_first);
        first
    } else {
        println!("The second vector has a larger sum: {}", sum_second);
        second
    }
}

What we need to focus on here is T. The first and second input parameters are now references to a Vec<T> rather than a Vec<i32>, the sum_first and sum_second local variables have type T, and we return a reference to a Vec<T>.

We did this by declaring T in the angle brackets of the signature, right after the reference lifetime 'a and separated from it by a comma, and then replacing i32 with T in the parameters, the return type and the function body. If you're familiar with C++ or other languages that use generics, this syntax should look somewhat familiar.
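The same syntax in a smaller setting may make the moving parts easier to see. Here is a sketch with a hypothetical helper, first_and_last, which works for any element type and therefore needs no extra restrictions on T.

```rust
// Hypothetical helper: T is declared in the angle brackets and then used
// wherever a concrete type would otherwise appear.
fn first_and_last<T>(items: &[T]) -> Option<(&T, &T)> {
    // `?` returns None early when the slice is empty
    Some((items.first()?, items.last()?))
}

fn main() {
    let prices = vec![5, 6, -1, 12];
    let beers = ["IPA", "Stout", "Kolsch"];

    // the compiler picks T = i32 for the first call and T = &str for the second
    println!("{:?}", first_and_last(&prices));
    println!("{:?}", first_and_last(&beers));
}
```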

Other items changed

In addition to making the types generic, we added bounds that restrict T to types with certain capabilities. Those bounds, three traits plus the lifetime bound 'a, are listed after T: in the where clause:

  • 'a
  • std::iter::Sum<&'a T>
  • std::cmp::PartialOrd
  • std::fmt::Display

In the next section we'll dig into these in more detail.

Using our newly generic bigger_sum

Now we can do things like the below, where we compare Vectors that store types like u16 and usize, as well as i32 from before.

We can use any type that satisfies all of the bounds above!

fn main() {
    let first: Vec<u16> = vec![1, 2, 3, 4];
    let second: Vec<u16> = vec![1, 2, 3, 4, 5];

    bigger_sum(&first, &second);

    {
        let third: Vec<u16> = vec![2, 3, 4, 5];
        bigger_sum(&second, &third);
    }
    {
        let third: Vec<usize> = vec![2, 3, 4, 5];
        let forth: Vec<usize> = vec![2, 3, 4, 5];
        bigger_sum(&third, &forth);
    }
}

Generic types and structs

Structs in Rust can use generic types so that they are more flexible. Let's say we're making an interactive video game ad for a draft beer restaurant.

First, for funsies, let's define our types of beer:

enum BeerType {
    IPA,
    Kolsch,
    Lager,
    Stout,
    Sour,
}

Not exhaustive but that lineup should please pretty much any beer lover.

Next we'll use this in a struct to model a pint glass, a.k.a. the thing you drink beer from.

struct PintGlass<T> where T: std::cmp::PartialOrd {
    beer: BeerType,
    price: T,
    is_empty: bool,
}

We've used the generic T to allow the instantiator of the PintGlass struct flexibility in using pretty much any integer type to model the price.

If you've ever built accounting logic into software, you know it's a bad idea to use floating point numbers for prices, because rounding errors creep in. Watch this famous movie and consider why.

Now we can use it like so, allowing for even the strangest pricing models. Let's assume the below drinks are from a place called "VC Bar".

fn main() {
    let first_pint = PintGlass {
        beer: BeerType::IPA,
        price: 5,
        is_empty: true,
    };

    let second_pint = PintGlass {
        beer: BeerType::Stout,
        price: 6,
        is_empty: true,
    };

    // there's a deal with the third pint that the restaurant pays the customer
    // 1 unit of currency
    let third_pint = PintGlass {
        beer: BeerType::Kolsch,
        price: -1,
        is_empty: true,
    };

    // then because the customer is drunk they double charge them
    // let's call this establishment "VC Bar"
    let forth_pint = PintGlass {
        beer: BeerType::Lager,
        price: 12,
        is_empty: false,
    };
}

Assuming the above is modeling one individual sitting at this miserably misleading establishment, the PintGlass struct proves shockingly flexible, thanks to generic types.
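In the example above the compiler settles on i32 for every price because nothing says otherwise. As a sketch of the flexibility, the price type can also be spelled out per glass; the variable names in_cents and rebate are made up for this illustration, and the definitions are repeated so the example stands alone.

```rust
#[allow(dead_code)]
enum BeerType {
    IPA,
    Kolsch,
    Lager,
    Stout,
    Sour,
}

#[allow(dead_code)]
struct PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    beer: BeerType,
    price: T,
    is_empty: bool,
}

fn main() {
    // price in whole cents, stored in an unsigned integer
    let in_cents: PintGlass<u32> = PintGlass {
        beer: BeerType::IPA,
        price: 500,
        is_empty: false,
    };

    // a signed type leaves room for the "bar pays you" deal
    let rebate: PintGlass<i64> = PintGlass {
        beer: BeerType::Kolsch,
        price: -1,
        is_empty: true,
    };

    println!("{} and {}", in_cents.price, rebate.price);
}
```

Note that PintGlass<u32> and PintGlass<i64> are two distinct types, so they could not be stored together in one Vec.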

Generics on methods

Now we can add a method or two to make the PintGlass struct even more powerful.

Let's add a set_to_empty method on the struct.

impl<T> PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    fn set_to_empty(&mut self) {
        self.is_empty = true;
    }
}

Notice how we need to add the generic type to the impl<T> and restate the trait bounds. Now we can use it like so (slightly modifying the above to make forth_pint mutable).

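fn main() {
    // forth_pint from before, now declared with `let mut` so it can be changed
    let mut forth_pint = PintGlass {
        beer: BeerType::Lager,
        price: 12,
        is_empty: false,
    };
    forth_pint.set_to_empty();

    // though shady, the business model obviously works
    let fifth_pint = PintGlass {
        beer: BeerType::IPA,
        price: 12,
        is_empty: false,
    };
}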
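The set_to_empty method never touches price, so it doesn't actually exercise the PartialOrd bound. Here is a sketch of a method that does; is_pricier_than is a hypothetical addition, not part of the chapter's code, and the definitions are repeated so the example stands alone.

```rust
#[allow(dead_code)]
enum BeerType {
    IPA,
    Kolsch,
    Lager,
    Stout,
    Sour,
}

#[allow(dead_code)]
struct PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    beer: BeerType,
    price: T,
    is_empty: bool,
}

impl<T> PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    // Hypothetical method: PartialOrd is what makes `>` available on T.
    fn is_pricier_than(&self, other: &PintGlass<T>) -> bool {
        self.price > other.price
    }
}

fn main() {
    let first_pint = PintGlass {
        beer: BeerType::IPA,
        price: 5,
        is_empty: true,
    };
    let forth_pint = PintGlass {
        beer: BeerType::Lager,
        price: 12,
        is_empty: false,
    };

    println!("{}", forth_pint.is_pricier_than(&first_pint));
}
```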

Now we're able to fully capture a misleading business model that makes a ton of money while endangering its customers, while maintaining code safety ourselves.

Isn't that why you're learning Rust?

Making generics useful with traits

Now we can finally dive into how to restrict generic types so that they're actually useful. Traits in Rust simply outline what a type can do: functionality that every type implementing the trait provides.

We saw this when we implemented the PintGlass struct and specified the std::cmp::PartialOrd bound for type T, which guarantees that prices can be compared.

Every pint we created ended up with an i32 price, so we can also do things such as the following.

fn main() {
    let pints = vec![first_pint, second_pint, third_pint, forth_pint, fifth_pint];
    let mut total_sales: i32 = 0;
    for pint in pints.iter() {
        total_sales += pint.price;
    }
    println!("The customer has paid {} to get black out drunk", total_sales);
}
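The loop above adds prices because they are concrete i32 values; PartialOrd on its own only provides comparison. To total prices for any T, the function needs bounds that supply addition. Below is a sketch under that assumption, where total_sales and the small pint constructor are hypothetical helpers and the bounds Copy and std::iter::Sum<T> are one possible choice.

```rust
#[allow(dead_code)]
enum BeerType {
    IPA,
    Kolsch,
    Lager,
    Stout,
    Sour,
}

#[allow(dead_code)]
struct PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    beer: BeerType,
    price: T,
    is_empty: bool,
}

// Hypothetical helper: Sum<T> supplies the addition, and Copy lets us read
// the price out of a borrowed glass.
fn total_sales<T>(pints: &[PintGlass<T>]) -> T
where
    T: std::cmp::PartialOrd + Copy + std::iter::Sum<T>,
{
    pints.iter().map(|pint| pint.price).sum()
}

// Small constructor to keep the example short (also hypothetical).
fn pint(beer: BeerType, price: i32) -> PintGlass<i32> {
    PintGlass { beer, price, is_empty: true }
}

fn main() {
    let pints = vec![
        pint(BeerType::IPA, 5),
        pint(BeerType::Stout, 6),
        pint(BeerType::Kolsch, -1),
        pint(BeerType::Lager, 12),
        pint(BeerType::IPA, 12),
    ];
    println!("The customer has paid {}", total_sales(&pints));
}
```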

Now thus far, we've only used traits, not defined them. Let's do the latter now by adding a trait Display that will define a print method so that we can output the contents of a PintGlass.

Based on our earlier example, maybe it should be named "puke"!

Implementing custom traits

To implement a trait you need to first define it and then write an impl block that implements it for the struct you want.

Define it like this.

pub trait Display {
    fn print(&self);
}

Then we'll implement it using impl plus for, for example.

impl<T> Display for PintGlass<T>
where
    T: std::cmp::PartialOrd + std::fmt::Display,
{
    fn print(&self) {
        println!(
            "Beer {:?}, price {}, is empty? {}",
            self.beer, self.price, self.is_empty
        );
    }
}

// add the Debug trait to your BeerType so it can be printed...
#[derive(Debug)]
enum BeerType {
...
}

Now you can use the print method in calling code.

fn main() {
    pints[4].print();
}

Now if we want to also make a WineGlass struct, we can implement the Display trait for each type, so that each type has its own print method that represents that particular struct the way we'd like.

Note: In real life, we'd want to implement the standard library Display trait, not our own!
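For reference, here is a sketch of what implementing the standard library's std::fmt::Display could look like for PintGlass. The output format mirrors the print method above; the struct and enum are repeated so the example stands alone.

```rust
use std::fmt;

#[allow(dead_code)]
#[derive(Debug)]
enum BeerType {
    IPA,
    Kolsch,
    Lager,
    Stout,
    Sour,
}

struct PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    beer: BeerType,
    price: T,
    is_empty: bool,
}

impl<T> fmt::Display for PintGlass<T>
where
    T: std::cmp::PartialOrd + fmt::Display,
{
    // write into the formatter instead of printing directly
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Beer {:?}, price {}, is empty? {}",
            self.beer, self.price, self.is_empty
        )
    }
}

fn main() {
    let glass = PintGlass {
        beer: BeerType::IPA,
        price: 12,
        is_empty: false,
    };
    // Display unlocks `{}` in println! and the to_string method
    println!("{}", glass);
}
```

The payoff is that the type then works with println!, format! and to_string like any built-in type.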

Adding a default implementation

We probably want to have a basic default that at least says what struct we're printing. Here's how to do that, by adding to the trait definition.

pub trait Display {
    fn print(&self) {
        println!("Some type of glass");
    }
}

Now every struct that implements the Display trait without defining its own print method gets this default.
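Here is a small sketch of a default next to an override. To keep the result easy to check, the trait here is called Describe and returns a String rather than printing, and PintGlass is cut down to a single field; all of those choices are assumptions made for this illustration.

```rust
trait Describe {
    // default body, used by any implementor that does not override it
    fn describe(&self) -> String {
        String::from("Some type of glass")
    }
}

struct WineGlass;

struct PintGlass {
    price: i32,
}

// empty impl block: WineGlass keeps the default
impl Describe for WineGlass {}

// PintGlass supplies its own version, which replaces the default
impl Describe for PintGlass {
    fn describe(&self) -> String {
        format!("Pint glass, price {}", self.price)
    }
}

fn main() {
    println!("{}", WineGlass.describe());
    println!("{}", PintGlass { price: 12 }.describe());
}
```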

where clauses - Trait Bounds

We saw earlier in our implementation of the PintGlass struct the usage of a where clause, which is one way to specify trait bounds in Rust. A trait bound states which traits a type must implement in order to be allowed.

In particular, only those types that implement every trait named in the bound are accepted.

struct PintGlass<T>
where
    T: std::cmp::PartialOrd,
{
    beer: BeerType,
    price: T,
    is_empty: bool,
}

In the PintGlass struct, type T must have the std::cmp::PartialOrd trait.
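A where clause is not the only place a bound can go. As a quick sketch, the two hypothetical functions below carry the same bound, once written inline next to the type parameter and once in a where clause.

```rust
// Inline form: the bound sits right next to the type parameter.
fn cheaper_inline<T: PartialOrd>(a: T, b: T) -> T {
    if a < b {
        a
    } else {
        b
    }
}

// where form: the same bound, moved after the signature.
fn cheaper_where<T>(a: T, b: T) -> T
where
    T: PartialOrd,
{
    if a < b {
        a
    } else {
        b
    }
}

fn main() {
    println!("{}", cheaper_inline(5, 12));
    println!("{}", cheaper_where(6.5, 2.25));
}
```

The inline form reads well for a single short bound, while the where clause tends to stay more readable once several bounds pile up, as they did in bigger_sum.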

We can also specify a trait bound for a return type; however, we'll cover this in more detail in a later chapter.

For now, returning a simple type looks like the below, for example.

fn return_something_with_display_trait() -> impl Display {
    PintGlass {
        beer: BeerType::IPA,
        price: 12,
        is_empty: false,
    }
}
