Understanding Lifetime in Rust – Part II

In Part I we discussed the motivation behind lifetime management in Rust and how it works for a function. In this installment we will explore how lifetimes help us model containment relationships (that is, when an object contains a reference to another object).

The Business Requirement

We will design a Person type. A person may own a Car. A person should be able to buy and sell cars. Two persons should be able to exchange (or trade) their cars.

Design the Types

The Car type is easy. We will keep it simple.

struct Car {
    model : String
}

How should the Person type contain a Car? A full containment will look like this:

struct Person {
    car : Option<Car>
}

This is easy from a memory management point of view. But it has several problems. A car is not an integral part of a person. Buying and trading cars will copy Car values from one Person to another (a move in Rust is a shallow copy of the struct). For performance reasons we would like to avoid that. Instead, we would like Person to store an optional reference to a Car. But that’s when things begin to get complicated.

Containing a Reference Pointer

In Rust a reference must always point to a valid memory area. This means that the container object must not outlive the object that it holds a reference to. Rust requires us to model this rule using lifetime parameters. Our Person type will look like this.

struct Person<'a> {
    car:Option<&'a Car>
}

In Person<'a> we are saying that there is a lifetime named 'a.

In Option<&'a Car> we are saying that the reference to the Car is valid for the lifetime 'a.

After this, the compiler will ensure that a Person object is never used beyond the lifetime 'a. This prevents a Person object from holding an invalid Car reference.

What we did here is basic common sense. Unfortunately, the Rust compiler does not let us elide the lifetime parameters in this situation. At least not right now.
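As a minimal sketch of what the annotation buys us, consider the helper car_model below. It is a hypothetical function, not part of the design above. Its return type is tied to 'a, the lifetime of the Car, rather than to the Person it was read through, so the result stays usable after the Person is gone.

```rust
struct Car {
    model: String,
}

struct Person<'a> {
    car: Option<&'a Car>,
}

// Hypothetical helper: the result borrows from the Car ('a),
// not from the Person that happens to hold the reference.
fn car_model<'a>(p: &Person<'a>) -> Option<&'a str> {
    p.car.map(|c| c.model.as_str())
}

fn main() {
    let car = Car { model: "Honda Civic".to_string() };

    let model = {
        let bob = Person { car: Some(&car) };
        let m = car_model(&bob);
        m
    }; // bob is gone here, but car is still alive

    println!("{:?}", model); // prints Some("Honda Civic")
}
```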

Write the Implementation

Now that our type is designed, we can go ahead and implement the methods.

impl <'a> Person<'a> {
    fn new() -> Person<'a> {
        Person{
            car: None
        }
    }

    fn buy_car(&mut self, c : &'a Car) {
        self.car = Some(c);
    }

    fn sell_car(&mut self) {
        self.car = None;
    }    
}

This has everything we need except for trading cars between persons. We will develop that later.

As you may have noticed by now, lifetime parameters are built into the data type using the same syntax as generics (the full type name is: Person<'a>). That is why we start the implementation using impl <'a> Person<'a> just like we would if the Person type used generics. Technically this lets us write different implementations for different lifetimes. But this concept is beyond my understanding for now.
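For what it is worth, here is a small sketch of an implementation that applies to one particular lifetime only. The method permanent_car is a made-up name for illustration. It exists only on Person<'static>, that is, on a Person whose Car reference is valid for the rest of the program. The example uses Box::leak to obtain such a reference.

```rust
struct Car {
    model: String,
}

struct Person<'a> {
    car: Option<&'a Car>,
}

// Available for every lifetime 'a.
impl<'a> Person<'a> {
    fn new() -> Person<'a> {
        Person { car: None }
    }

    fn buy_car(&mut self, c: &'a Car) {
        self.car = Some(c);
    }
}

// Available only when the Car reference never expires.
impl Person<'static> {
    // Hypothetical method, named here for illustration only.
    fn permanent_car(&self) -> Option<&'static Car> {
        self.car
    }
}

fn main() {
    // Box::leak gives up ownership of the heap allocation and
    // hands back a reference that is valid for the whole program.
    let car: &'static Car =
        Box::leak(Box::new(Car { model: "Honda Civic".to_string() }));

    let mut bob = Person::new();
    bob.buy_car(car);

    if let Some(c) = bob.permanent_car() {
        println!("{}", c.model);
    }
}
```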

Using the Contained Reference

Alright. Let’s take our code for a spin. This will compile just fine.

fn main() {
    let car = Car{model: "Honda Civic".to_string()};
    let mut bob = Person::new();

    bob.buy_car(&car);
}

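Swap the two declarations and older compilers complain:

fn main() {
    let mut bob = Person::new();
    let car = Car{model: "Honda Civic".to_string()};

    bob.buy_car(&car); //Error on compilers without non-lexical lifetimes
}

Objects are destroyed in the reverse order of their creation. That means the car variable gets destroyed before bob, so bob will briefly point to an invalid car, even if only for a moment. Compilers that check lifetimes lexically reject this. Compilers with non-lexical lifetimes (introduced with the 2018 edition) accept it, because bob is never used after car is destroyed and Person has no destructor that could read the stale reference. Creating the Car first works everywhere.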

The lifetime system doesn’t exactly predict when a reference falls out of use. Consider the example below.

fn main() {
    let ghibli = Car{model: "Maserati Ghibli".to_string()};
    let mut bob = Person::new();

    { 
        //Inner scope

        let civic = Car{model: "Honda Civic".to_string()};

        bob.buy_car(&civic); //Error!
        bob.buy_car(&ghibli);
    }

    println!("Bob owns a car: {}", bob.car.is_some()); //bob is still in use
}

The code does not compile because the civic variable doesn’t live for as long as bob is in use. This is unfortunate, because the code is clearly safe from a human developer’s point of view: at the end of the inner scope bob no longer holds any reference to civic. The system goes strictly by the way we declared the buy_car(&mut self, c : &'a Car) method. All it knows is that every supplied Car has the lifetime 'a and that the container object must not be used beyond it. (Without the final println! line, compilers with non-lexical lifetimes accept the code, because bob is then never used after civic is destroyed.)

Implement Trading

Trading will involve swapping cars between two Person instances. We add this method to the implementation:

fn trade_with(&mut self, other : &mut Person<'a>) {
    let tmp = other.car;

    other.car = self.car;
    self.car = tmp;
} 

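Nothing special here, except to note that the lifetime is a part of the data type. The other variable has a full data type of &mut Person<'a>. Also note that let tmp = other.car copies the Option<&Car> rather than moving it out of other, because shared references are Copy.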

We can now use the trade_with() method like this.

fn main() {
    let civic = Car{model: "Honda Civic".to_string()};
    let ghibli = Car{model: "Maserati Ghibli".to_string()};

    let mut bob = Person::new();
    let mut alice = Person::new();

    bob.buy_car(&civic);
    alice.buy_car(&ghibli);

    bob.trade_with(&mut alice);
}

As before, the Car objects are created before the Person objects that refer to them. But the order in which bob and alice are created makes no difference.
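The temporary variable in trade_with() is not strictly necessary. As an alternative sketch, the same exchange can be written with std::mem::swap from the standard library, which swaps the two Option<&Car> values in place. The types are repeated here so that the example stands on its own.

```rust
use std::mem;

struct Car {
    model: String,
}

struct Person<'a> {
    car: Option<&'a Car>,
}

impl<'a> Person<'a> {
    fn new() -> Person<'a> {
        Person { car: None }
    }

    fn buy_car(&mut self, c: &'a Car) {
        self.car = Some(c);
    }

    // Exchanges the two car references without a temporary.
    fn trade_with(&mut self, other: &mut Person<'a>) {
        mem::swap(&mut self.car, &mut other.car);
    }
}

fn main() {
    let civic = Car { model: "Honda Civic".to_string() };
    let ghibli = Car { model: "Maserati Ghibli".to_string() };

    let mut bob = Person::new();
    let mut alice = Person::new();

    bob.buy_car(&civic);
    alice.buy_car(&ghibli);
    bob.trade_with(&mut alice);

    // bob now holds the Ghibli and alice the Civic.
    println!("bob: {}", bob.car.map_or("none", |c| c.model.as_str()));
    println!("alice: {}", alice.car.map_or("none", |c| c.model.as_str()));
}
```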

Borrow Rules Apply

When an object holds a reference to another it borrows the reference and all standard borrowing rules apply. This is expected. I mention them here just to reinforce the concepts.

You can have as many immutable borrows as you want as long as there are no mutable borrows.

let ghibli = Car{model: "Maserati Ghibli".to_string()};
let mut bob = Person::new();

bob.buy_car(&ghibli); //bob borrows ghibli immutably

let p1 = &ghibli; //More immutable borrows are OK
let p2 = &ghibli;

A mutable borrow is exclusive and is only permitted if there is no other borrow of any kind still in use.

let mut ghibli = Car{model: "Maserati Ghibli".to_string()};
let mut bob = Person::new();

bob.buy_car(&ghibli); //bob borrows ghibli immutably

let p1 = &mut ghibli; //Can't do this.

println!("Bob owns a car: {}", bob.car.is_some()); //bob is still in use

You cannot move an object while someone still holds a borrowed reference to it, because doing so would make the borrowed reference point to an invalid memory area.

let mut ghibli = Car{model: "Maserati Ghibli".to_string()};
let mut bob = Person::new();

bob.buy_car(&ghibli); //bob borrows ghibli

let g = ghibli; //Can't move

println!("Bob owns a car: {}", bob.car.is_some()); //bob is still in use

This is, however, somewhat unexpected.

let civic = Car{model: "Honda Civic".to_string()};
let mut ghibli = Car{model: "Maserati Ghibli".to_string()};
let mut bob = Person::new();

bob.buy_car(&ghibli);
bob.buy_car(&civic);

let p1 = &mut ghibli; //Still Can't do this

println!("Bob owns a car: {}", bob.car.is_some()); //bob is still in use

From a human developer’s point of view the code above is safe. bob no longer holds any reference to ghibli at the time when we are trying to borrow ghibli mutably. But the compiler is unable to deduce that. Both calls to buy_car() share the single lifetime 'a, so the borrow of ghibli lasts for as long as bob is in use.
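One way to end the borrow explicitly, sketched here with the same types, is to confine the Person to an inner scope. Once bob is gone, every borrow it held is gone with it and the mutable borrow goes through.

```rust
struct Car {
    model: String,
}

struct Person<'a> {
    car: Option<&'a Car>,
}

impl<'a> Person<'a> {
    fn new() -> Person<'a> {
        Person { car: None }
    }

    fn buy_car(&mut self, c: &'a Car) {
        self.car = Some(c);
    }
}

fn main() {
    let civic = Car { model: "Honda Civic".to_string() };
    let mut ghibli = Car { model: "Maserati Ghibli".to_string() };

    {
        let mut bob = Person::new();

        bob.buy_car(&ghibli);
        bob.buy_car(&civic);

        println!("Bob owns a car: {}", bob.car.is_some());
    } // bob, and every borrow it held, ends here

    let p1 = &mut ghibli; // OK now
    p1.model.push_str(" S");

    println!("{}", ghibli.model);
}
```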

Not All is Well

Well, as we have seen above it’s not that hard to model containment of reference pointers. The bad news is that this model does not always work. Consider this function. It will not compile.

fn shop_for_car(p : &mut Person) {
    let car = Car{model: "Mercedes GLK350".to_string()};

    p.buy_car(&car); //Error! car doesn't live long enough
}

That is because the car object simply doesn’t live as long as the Person buying it. So how can we keep a reference to an object that is created in an inner scope like a function? The answer lies in heap allocation, which in Rust is achieved via Box::new. We will explore that in Part III.
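Staying with plain references for a moment, the function does compile if the caller owns the cars. The sketch below is only an illustration under that assumption; the showroom parameter is hypothetical. The signature ties the cars and the Person to the same lifetime 'a, so the compiler can see that the cars outlive the buyer.

```rust
struct Car {
    model: String,
}

struct Person<'a> {
    car: Option<&'a Car>,
}

impl<'a> Person<'a> {
    fn new() -> Person<'a> {
        Person { car: None }
    }

    fn buy_car(&mut self, c: &'a Car) {
        self.car = Some(c);
    }
}

// Hypothetical variation: the cars are owned by the caller and
// only lent to the function for the lifetime 'a.
fn shop_for_car<'a>(p: &mut Person<'a>, showroom: &'a [Car]) {
    if let Some(car) = showroom.first() {
        p.buy_car(car);
    }
}

fn main() {
    let showroom = vec![Car { model: "Mercedes GLK350".to_string() }];
    let mut bob = Person::new();

    shop_for_car(&mut bob, &showroom);

    if let Some(c) = bob.car {
        println!("Bob bought a {}", c.model);
    }
}
```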

Easier libc in Rust

The libc crate gives us access to the UNIX system calls. But calling these functions from Rust can be difficult. You will find yourself constantly converting between String/&str and char* and other C data types. Rust provides several wrappers that use idiomatic Rust data types and error handling. But they are sometimes hard to find. Here’s an index of some of the most commonly used libc calls and their higher level wrappers.

chdir

Use std::env::set_current_dir. Example:

std::env::set_current_dir("/tmp")
    .expect("Failed to change directory");

unlink

Use std::fs::remove_file. Example:

std::fs::remove_file("file.txt").expect("Failed to remove file");

rmdir

Use std::fs::remove_dir. Example:

if let Err(error) = std::fs::remove_dir("a_dir") {
    println!("There was a problem: {}", error);
}

rename

Use std::fs::rename.

if let Err(e) = std::fs::rename(
    "a_dir/file.txt", 
    "a_dir/new.txt") {
    println!("Failed to rename file: {}", e); 
} 
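To see these calls working together, here is a small sketch that creates a file, renames it and removes it again. The helper name rename_round_trip and the temporary directory name are made up for this example.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: create a file, rename it, then delete it.
// Returns true if only the new name existed after the rename.
fn rename_round_trip(dir: &Path) -> io::Result<bool> {
    let old = dir.join("file.txt");
    let new = dir.join("new.txt");

    fs::write(&old, b"hello")?;
    fs::rename(&old, &new)?;

    let renamed = !old.exists() && new.exists();

    fs::remove_file(&new)?;
    Ok(renamed)
}

fn main() -> io::Result<()> {
    // A scratch directory under the system temp dir.
    let dir = std::env::temp_dir()
        .join(format!("libc_demo_{}", std::process::id()));

    fs::create_dir_all(&dir)?;
    println!("Renamed: {}", rename_round_trip(&dir)?);
    fs::remove_dir(&dir)
}
```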

getcwd

Use std::env::current_dir. In the example below we try to convert the current directory name to a UTF-8 encoded &str:

if let Ok(path_buf) = std::env::current_dir() {
    if let Some(path_str) = path_buf.to_str() {
        println!("Current dir: {}", path_str);
    } else {
        println!("Could not convert path name to UTF-8.");
    }
} else {
    println!("Could not get current directory.");
}
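If the path only needs to be displayed, a lossy conversion avoids the extra branch. In this sketch the helper name printable is made up. Path::to_string_lossy replaces any byte sequence that is not valid UTF-8 with the Unicode replacement character instead of failing.

```rust
use std::path::Path;

// Hypothetical helper: always yields a String, replacing any
// invalid UTF-8 in the path with U+FFFD.
fn printable(path: &Path) -> String {
    path.to_string_lossy().into_owned()
}

fn main() {
    match std::env::current_dir() {
        Ok(dir) => println!("Current dir: {}", printable(&dir)),
        Err(e) => println!("Could not get current directory: {}", e),
    }
}
```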

stat

Use std::fs::metadata. UNIX specific data that is available from C’s struct stat can be accessed using the std::os::unix::fs::MetadataExt trait.

use std::os::unix::fs::MetadataExt;

fn main() {
    if let Ok(m) = std::fs::metadata("a_dir/file.txt") {
        println!("File size: {}", m.size());
        println!("Status change time: {}", m.ctime());
        println!("Access time: {}", m.atime());  
    } else {
        println!("Failed to get stat.");
    }
}
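The most commonly needed fields are also available without the UNIX specific trait. The sketch below uses only the portable Metadata methods len() and is_file(). The helper name size_and_kind is made up for this example.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: returns (size in bytes, is it a regular file?).
fn size_and_kind(path: &Path) -> io::Result<(u64, bool)> {
    let m = fs::metadata(path)?;

    Ok((m.len(), m.is_file()))
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir()
        .join(format!("stat_demo_{}.txt", std::process::id()));

    fs::write(&path, b"hello")?;

    let (size, is_file) = size_and_kind(&path)?;
    println!("Size: {} bytes, regular file: {}", size, is_file);

    fs::remove_file(&path)
}
```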

chmod

The wrapper approach involves calling std::fs::set_permissions. But to construct a std::fs::Permissions using the UNIX mode flags you will need to bring the std::os::unix::prelude::PermissionsExt trait into scope.

use std::os::unix::prelude::PermissionsExt;

fn main() {
    //chmod to 666
    if let Err(e) = std::fs::set_permissions(
        "a_dir/file.txt",
        std::fs::Permissions::from_mode(0o666)) {
        println!("Failed to chmod: {}", e);
    }
}
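To confirm that the change took effect, the mode can be read back through the same trait. This is a UNIX-only sketch and the helper name chmod_and_check is made up. PermissionsExt::mode() returns the file type bits as well, so the result is masked down to the permission bits.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

// Hypothetical helper: set the mode, then read it back.
fn chmod_and_check(path: &Path, mode: u32) -> io::Result<u32> {
    fs::set_permissions(path, fs::Permissions::from_mode(mode))?;

    let actual = fs::metadata(path)?.permissions().mode();

    // Keep only the rwx bits for user, group and others.
    Ok(actual & 0o777)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir()
        .join(format!("chmod_demo_{}.txt", std::process::id()));

    fs::write(&path, b"hello")?;
    println!("Mode is now {:o}", chmod_and_check(&path, 0o600)?);

    fs::remove_file(&path)
}
```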

You can also call chmod directly from the libc crate. Early releases of the crate kept the function in a strange module (libc::funcs::posix88::stat_). From version 0.2 onward it is exported from the crate root.

extern crate libc;
use libc::{chmod, mode_t};

fn easy_chmod(path: &str, mode: u64) -> bool  {
    if let Ok(c_str) = std::ffi::CString::new(path) {
        unsafe {
            let result = chmod(
                c_str.as_ptr(), 
                mode as mode_t);

            return result == 0;
        }   
    }

    return false;
}   

fn main() {
    if !easy_chmod("a_dir/file.txt", 0o666) {
        println!("Failed to chmod.");
    }
}

mkdir

Use std::fs::create_dir. I find this utility method useful.

fn make_dir(path: &str) -> std::io::Result<()> {
    let err = match std::fs::create_dir(path) {
        Ok(_) => return Ok(()),
        Err(e) => e
    };

    match err.kind() {
        std::io::ErrorKind::AlreadyExists => return Ok(()),
        _ => {
            println!("Failed to create directory: {}", path);

            return Err(err);
        }
    };
}
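If creating missing parent directories is also acceptable, std::fs::create_dir_all gives similar behavior in one call. It succeeds when the directory already exists. Note that this is not an exact replacement for make_dir above, because it creates the whole path. The helper name ensure_dir is made up for this sketch.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: succeeds whether or not the directory
// already exists, creating parent directories as needed.
fn ensure_dir(path: &Path) -> io::Result<()> {
    fs::create_dir_all(path)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir()
        .join(format!("mkdir_demo_{}", std::process::id()));

    ensure_dir(&dir)?;
    ensure_dir(&dir)?; // second call is not an error

    println!("Directory exists: {}", dir.is_dir());

    fs::remove_dir(&dir)
}
```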

getenv

Use std::env::var.

if let Ok(val) = std::env::var("HOME") {
    println!("Got variable: {}", val);
} else {
    println!("Failed to get env variable");
}
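The error returned by std::env::var tells us why the lookup failed. A sketch that separates a missing variable from one whose value is not valid Unicode could look like this. The helper name describe_var is made up.

```rust
use std::env::{self, VarError};

// Hypothetical helper: turns the lookup result into a message.
fn describe_var(name: &str) -> String {
    match env::var(name) {
        Ok(val) => format!("{}={}", name, val),
        Err(VarError::NotPresent) => format!("{} is not set", name),
        Err(VarError::NotUnicode(_)) => {
            format!("{} is not valid Unicode", name)
        }
    }
}

fn main() {
    println!("{}", describe_var("HOME"));
    println!("{}", describe_var("NO_SUCH_VARIABLE_FOR_THIS_DEMO"));
}
```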

setenv

Use std::env::set_var.

std::env::set_var("GREET", "Hola");

link

Use std::fs::hard_link.

if let Err(e) = std::fs::hard_link(
    "a_dir/file.txt", "a_dir/another.txt") {
    println!("Failed to create hard link: {}", e);
}

symlink

Use std::os::unix::fs::symlink.

if let Err(e) = std::os::unix::fs::symlink(
    "a_dir/file.txt", "a_dir/another.txt") {
    println!("Failed to create soft link: {}", e);
}
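The counterpart of readlink is std::fs::read_link, which returns the path stored in the link. The UNIX-only sketch below creates a soft link and reads it back. The helper name link_and_resolve and the file names are made up for this example.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

// Hypothetical helper: create a soft link, then return the
// path that the link points to.
fn link_and_resolve(target: &Path, link: &Path) -> io::Result<PathBuf> {
    symlink(target, link)?;

    fs::read_link(link)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir()
        .join(format!("symlink_demo_{}", std::process::id()));
    fs::create_dir_all(&dir)?;

    let target = dir.join("file.txt");
    let link = dir.join("another.txt");
    fs::write(&target, b"hello")?;

    let resolved = link_and_resolve(&target, &link)?;
    println!("Link points to: {}", resolved.display());

    fs::remove_dir_all(&dir)
}
```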