Chapter 16. Concurrency

Now that you've had a taste of concurrency using Arc<T>, we'll dig into the topic a bit more.

You wanted a banana but what you got was a gorilla holding the banana and the entire jungle. - Joe Armstrong (creator of Erlang)

What is concurrency?

In short, concurrency allows us to build software that handles multiple tasks at once. Concurrency itself includes threads, processes, synchronization mechanisms and algorithms for parallelism.

For our purposes, we need to focus on threads and, following Rust's standard library implementation, some pre-defined defaults and scope as to what is included. Our objective here is not exhaustive treatment of threading, rather a simple overview to get you productive quickly.

Basic threading in Rust

Modern computer processors have multiple-core processors and can execute multiple threads in parallel. Some basic computer architecture will help us conceptualize what exactly we're dealing with.

So what's a thread?

A modern computer has a CPU to handle calculations and memory, where things are stored on the stack and heap. Operating systems typically allow programs to run inside a process, which contains all the things a program needs to run, including memory allocation and access to the computer hardware (which we OS manages). Inside these processes modern operating systems allow threads, execution arms that can share and manage the data within their process.

According to Edward Lee and Sanjit Seshia in Introduction to Embedded Systems:

Threads are imperative programs that run concurrently and share a memory space. They can access each others’ variables. Many practitioners in the field use the term “threads” more narrowly to refer to particular ways of constructing programs that share memory, [others] to broadly refer to any mechanism where imperative programs run concurrently and share memory. In this broad sense, threads exist in the form of interrupts on almost all microprocessors, even without any operating system at all (bare iron).

Many bugs throughout software history were caused by accessing, sharing, mutating and otherwise attempting to correctly program multi-threaded applications. But with Rust, we have concurrency paradigms that get all of the Rust safety mechanisms by default, vastly improving our ability to correctly program concurrent applications.

Launching a thread of execution

Actually "spawning" a thread is quite easy. Import the standard library's thread module, spawn it, and you're rocking.

use std::thread;

fn main() {
    thread::spawn(|| {
        for index in 1..10 {
            println!("index from thread: {}", index);
            thread::sleep(Duration::from_millis(150));
        }
    });

}

But what if we wanted to do something like the below?

use std::thread;

fn main() {
    let the_kid_wont_stop = String::from("Are we there yet?");
    thread::spawn(|| {
        for index in 1..10 {
            println!("Number of times {} has been asked: {}", the_kid_wont_stop, index);
            thread::sleep(Duration::from_millis(150));
        }
    });

}

In the above, we want to print the string the_kid_wont_stop but the compiler screams that the_kid_wont_stop might outlive its scope (main). Why is that?

Well when we run a program with

fn main() {
    println!("Hello, world!");
}

we actually use one thread of execution which most of us call the main thread.

When we launch another thread, like the the_kid_wont_stop example above, we're actually launching a separate thread in addition the the main thread running. When we want to use the_kid_wont_stop, owned by the main thread, we have to explicitly move it to the thread in which we'd like to use it.

Here's a working example:

fn main() {
    let the_kid_wont_stop = String::from("Are we there yet?");
    thread::spawn(move || {
        for index in 1..10 {
            println!(
                "Number of times {} has been asked: {}",
                the_kid_wont_stop, index
            );
            thread::sleep(Duration::from_millis(150));
        }
    });
}

Sounds like someone's on A Christmas Vacation.

Timing with `join` handles

There's a whole lot more we can do with basic threads. Let's look at a more robust example to demonstrate how execution runs in parallel, rather than serially. In particular, we'll use a modified Config struct from the minigrep project in the Rust book, so that you can affect execution timing of the threads when running the program.

pub struct Config {
    pub main_thread_wait_in_seconds: u64,
    pub other_thread_wait_in_seconds: u64,
}
impl Config {
    pub fn build(mut args: impl Iterator<Item = String>) -> Result<Config, &'static str> {
        args.next();

        let main_thread_wait_in_seconds = match args.next() {
            Some(arg) => arg.parse().map_err(|_| "Invalid main thread wait time")?,
            None => return Err("Didn't get a main thread wait time"),
        };

        let other_thread_wait_in_seconds = match args.next() {
            Some(arg) => arg.parse().map_err(|_| "Invalid other thread wait time")?,
            None => return Err("Didn't get an other thread wait time"),
        };

        Ok(Config {
            main_thread_wait_in_seconds,
            other_thread_wait_in_seconds,
        })
    }
}

We'll use the above to take in two integers when running the program. We'll run it like this: cargo run 2 1, which tells the program that the main thread should take 2 seconds to execute in its loops and the other thread to take 1 second to execute its loop operations.

fn main() {
    let config = Config::build(env::args()).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });
    println!(
        "Main thread will wait for {} seconds, other thread will wait for {} seconds",
        config.main_thread_wait_in_seconds, config.other_thread_wait_in_seconds
    );
    thread::spawn(move || {
        for index in 1..10 {
            println!("index from thread: {}", index);
            thread::sleep(Duration::from_secs(config.other_thread_wait_in_seconds));
        }
    });

    for index in 1..5 {
        println!("index from main thread: {}", index);
        thread::sleep(Duration::from_secs(config.main_thread_wait_in_seconds));
    }
}

When running the above, you'll get output that's not completely in order, demonstrating that the threads really are running simultaneously.

Now let's use join to tell the main thread when to wait for the second thread to stop:


fn main() {
    // Let's control the ending, now
    println!("Main thread initial execution is done. Now we'll do it again and use join to wait for the other thread to finish");
    let other_thread = thread::spawn(move || {
        for index in 1..10 {
            println!("index from second other_thread: {}", index);
            thread::sleep(Duration::from_secs(config.other_thread_wait_in_seconds));
        }
    });
    for index in 1..2 {
        println!("index from second main thread execution block: {}", index);
        thread::sleep(Duration::from_millis(100));
    }
    other_thread.join().unwrap();
}

Even though the main thread finishes really quickly, it waits for the other_thread to complete before exiting.

So let's return to the Arc<T> smart pointer from the previous chapter and examine how we might use threads with it to actually ensure protection when mutating data inside the Arc<T>.

Mutexes and `Arc<T>`

As we previously saw, Arc<T> provides multiple-ownership to the same data in a thread-safe manner. But what if we want to actually modify the data inside the Arc<T>? We need to use a mutex to ensure that only one thread can access the data at a time.

For that purpose we'll use Mutex<T> and Arc<T> together, with the Mutex<T> actually holding our inner data to protect against multiple-thread access.

In short, we will use the Arc<T> to guarantee safety with multiple owners, Mutex<T> to protect against multiple access and Arc<Mutext<T>> for shared mutable access to the data across multiple threads.

An example

So let's quickly look at a contrived example before returning to our previous DisplayAd setup with the AdBehavior trait.

use std::{sync::{Arc, Mutex}, thread, time::{Duration, SystemTime, UNIX_EPOCH}};

fn current_timestamp_utc_i32() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("Error: Time went backwards")
        .as_nanos() as i64
}

fn main() {
    let data = Arc::new(Mutex::new(vec![]));

    let mut handles = vec![];

    for i in 0..5 {
        let data_ref = Arc::clone(&data);
        let handle = thread::spawn(move || {
            let mut vec = data_ref.lock().unwrap();
            thread::sleep(Duration::from_millis(50));
            vec.push(current_timestamp_utc_i32());
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final data: {:?}", *data.lock().unwrap());
}

In the above we instantiate an empty vector - we don't know its size ahead of time and are going to push in some timestamps.

We then create a vector of thread handles, and for each thread we clone the Arc<Mutex<Vec<i64>>> reference. Then we lock the mutex to get access to the inner data, push a timestamp into the vector, and unlock the mutex when the thread exits.

Finally, we join all the threads and print out what's happening. You'll get something like the following:

Final data: [1747611054934278079, 1747611054984366732, 1747611055034455975, 1747611055084547488, 1747611055135756838]

Modifying our `DisplayAd` example

Now let's modify our DisplayAd example to use Arc<Mutex<T>> to protect the data inside the AdBehavior trait. We'll create a single vector to store the ads, then spawn multiple threads to add ads to the vector. Each thread will create a new DisplayAd and push it into the vector.

You'll see console output that prints these ads in a random order, demonstrating that the threads are running concurrently and modifying the same data.

use std::{
    sync::{Arc, Mutex},
    thread,
    time::{SystemTime, UNIX_EPOCH},
};

#[derive(Debug)]
struct DisplayAd {
    start_timestamp: i64,
    budget: u32,
    title: String,
    copy: String,
    call_to_action: String,
    media_asset_urls: Vec<String>,
    button_text: String,
    target_url: String,
}

fn current_timestamp_utc_millis_i64() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("Time went backwards")
        .as_millis() as i64
}

fn main() {
    let ads: Arc<Mutex<Vec<Arc<Mutex<DisplayAd>>>>> = Arc::new(Mutex::new(vec![]));
    let mut handles = vec![];

    for i in 0..5 {
        let ads_ref = Arc::clone(&ads);
        let handle = thread::spawn(move || {
            let ad = DisplayAd {
                start_timestamp: current_timestamp_utc_millis_i64(),
                budget: 1000 + i * 500,
                title: format!("Thread {i} launch"),
                copy: "Run campaigns that work.".into(),
                call_to_action: "Get started".into(),
                media_asset_urls: vec!["https://assets.b00st.com/ad.png".into()],
                button_text: "Start now".into(),
                target_url: "https://b00st.com/app".into(),
            };

            let wrapped = Arc::new(Mutex::new(ad));
            ads_ref.lock().unwrap().push(wrapped);
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    for (i, ad_arc) in ads.lock().unwrap().iter().enumerate() {
        let ad = ad_arc.lock().unwrap();
        println!("Ad {i}:\n{:#?}", *ad);
    }
}

You should have output that looks something like:

Ad 0:
DisplayAd {
    start_timestamp: 1747611492653,
    budget: 1500,
    title: "Thread 1 launch",
    copy: "Run campaigns that work.",
    call_to_action: "Get started",
    media_asset_urls: [
        "https://assets.b00st.com/ad.png",
    ],
    button_text: "Start now",
    target_url: "https://b00st.com/app",
}
Ad 1:
DisplayAd {
    start_timestamp: 1747611492653,
    budget: 2500,
    title: "Thread 3 launch",
    copy: "Run campaigns that work.",
    call_to_action: "Get started",
    media_asset_urls: [
        "https://assets.b00st.com/ad.png",
    ],
    button_text: "Start now",
    target_url: "https://b00st.com/app",
}
Ad 2:
DisplayAd {
    start_timestamp: 1747611492653,
    budget: 1000,
    title: "Thread 0 launch",
    copy: "Run campaigns that work.",
    call_to_action: "Get started",
    media_asset_urls: [
        "https://assets.b00st.com/ad.png",
    ],
    button_text: "Start now",
    target_url: "https://b00st.com/app",
}
Ad 3:
DisplayAd {
    start_timestamp: 1747611492653,
    budget: 2000,
    title: "Thread 2 launch",
    copy: "Run campaigns that work.",
    call_to_action: "Get started",
    media_asset_urls: [
        "https://assets.b00st.com/ad.png",
    ],
    button_text: "Start now",
    target_url: "https://b00st.com/app",
}
Ad 4:
DisplayAd {
    start_timestamp: 1747611492654,
    budget: 3000,
    title: "Thread 4 launch",
    copy: "Run campaigns that work.",
    call_to_action: "Get started",
    media_asset_urls: [
        "https://assets.b00st.com/ad.png",
    ],
    button_text: "Start now",
    target_url: "https://b00st.com/app",
}

It's like magic but it's just Rust.

Rust With Jason

Rust With Jason

Chapter 16. Concurrency

What is concurrency?

Basic threading in Rust

So what's a thread?

Launching a thread of execution

Timing with `join` handles

Mutexes and `Arc<T>`

An example

Modifying our `DisplayAd` example

Chapter 16. Concurrency

What is concurrency?

Basic threading in Rust

So what's a thread?

Launching a thread of execution

Timing with join handles

Mutexes and Arc<T>

An example

Modifying our DisplayAd example

Timing with `join` handles

Mutexes and `Arc<T>`

Modifying our `DisplayAd` example