Rust - Getting Started (May. 2021. updating continuously)


What’s this page?

As of Jan. 2021, I work as a DevOps (or infrastructure) engineer, but I like to solve problems with code (front end, back end, whatever; it depends on the purpose). Over the past year, my motivation to learn Rust has grown enough to get started. This page is a memo I wrote while learning from the official Rust documentation, so that I can easily remind myself of the key features of Rust. Most of this post consists of quotes from the documentation, but I also leave my own opinions (which could be wrong).

I have experiences on

  • Python,
  • C++,
  • Java,
  • Go,
  • and Fortran (!!)

1. Getting started

1.1. Installation

  • Official installation goes well.
  • I describe my installation process on another page.

1.2 Hello, World!

  • Rust files always end with .rs extension.
  • The main function is special: it is always the first code that runs in every executable Rust program.
  • Rust style is to indent with four spaces, not a tab.
  • Using a ! means that you’re calling a macro instead of a normal function.
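
(My note) A minimal sketch of the program these points describe. The greeting helper is my own addition (not in the official document) so that the text can be checked separately from printing; the canonical version just calls println! inside main.

```rust
// Hypothetical helper: builds the text that main prints.
fn greeting() -> String {
    String::from("Hello, world!")
}

// main is the first code that runs in an executable Rust program.
fn main() {
    // println! ends with !, so it is a macro call, not a normal function call.
    println!("{}", greeting());
}
```

Saved as main.rs, this should compile with rustc main.rs.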

1.3 Hello, Cargo!

  • In Rust, packages of code are referred to as crates.
  • cargo new hello_cargo
  • Cargo expects your source files to live inside the src directory.
  • cargo build command creates an executable file in target/debug/hello_cargo.
  • Cargo.lock: This file keeps track of the exact versions of dependencies in your project.
  • cargo check: command quickly checks your code to make sure it compiles but doesn’t produce an executable.
  • When your project is finally ready for release, you can use cargo build --release to compile it with optimizations.

2. Programming a Guessing Game

  • A comment line starts with //.

  • Create variables.

    let foo = 5; // immutable
    let mut foo = 5; // mutable
    
  • let mut guess = String::new(); The :: syntax indicates that new is an associated function of the String type. An associated function is implemented on a type, in this case String, rather than on a particular instance of a String.

  • User input.

    use std::io;
    let mut guess = String::new();
    io::stdin()
          .read_line(&mut guess)
          .expect("Failed to read line");
    

    The code stores a line from standard input in the variable guess as a String.

  • std::io::stdin function returns an instance of std::io::Stdin, which is a type that represents a handle to the standard input for your terminal.

  • The job of read_line is to take whatever the user types into standard input and place that into a string, so it takes that string as an argument.

  • The & indicates that this argument is a reference, which gives you a way to let multiple parts of your code access one piece of data without needing to copy that data into memory multiple times.

  • References are immutable by default. Hence, you need to write &mut guess rather than &guess to make it mutable.

  • .expect() handles a potential failure: if the Result is an Err, the program crashes and displays the given message.

  • It’s often wise to introduce a newline and other whitespace to help break up long lines.

  • read_line returns io::Result

  • Rust has a number of types named Result in its standard library

  • The Result types are enumerations (enums). An enum is a type that can have a fixed set of values, called variants.

  • For Result, the variants are Ok or Err.

  • An instance of io::Result has an expect method.

  • If you don’t call expect, the program will compile, but you’ll get a warning:

    $ cargo build
       Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
    warning: unused `std::result::Result` that must be used
      --> src/main.rs:10:5
       |
    10 |     io::stdin().read_line(&mut guess);
       |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       |
       = note: `#[warn(unused_must_use)]` on by default
       = note: this `Result` may be an `Err` variant, which should be handled
    
        Finished dev [unoptimized + debuginfo] target(s) in 0.59s
    
  • The set of curly brackets, {}, is a placeholder.

    let x = 5;
    let y = 10;
    
    println!("x = {} and y = {}", x, y);
    
  • Rust doesn’t yet include random number functionality in its standard library. However, the Rust team does provide a rand crate.

  • Using crate in Cargo.toml.

    [dependencies]
    rand = "0.5.5"
    
    • The number 0.5.5 is actually shorthand for ^0.5.5, which means “any version that has a public API compatible with version 0.5.5.”
    • (My note): With this line in place, the next build downloads the crate and compiles it. See under /target/debug/deps/.
    • When you build a project for the first time, Cargo figures out all the versions of the dependencies that fit the criteria and then writes them to the Cargo.lock file. When you build your project in the future, Cargo will see that the Cargo.lock file exists and use the versions specified there rather than doing all the work of figuring out versions again.
    • When you do want to update a crate, Cargo provides another command, cargo update, which will ignore the Cargo.lock file and figure out all the latest versions that fit your specifications in Cargo.toml. If that works, Cargo will write those versions to the Cargo.lock file.
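    • By default, Cargo will only look for versions that are at least 0.5.5 and less than 0.6.0.
  • (My note, reminder) In rand::Rng, Rng is a trait defined in the rand crate, not an associated function. The use line brings the trait into scope so that methods such as gen_range can be called.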

    use rand::Rng;
    ...
    let secret_number = rand::thread_rng().gen_range(1, 101);
    
  • Note: A trait is a collection of methods defined for an unknown type: Self. They can access other methods declared in the same trait. https://doc.rust-lang.org/rust-by-example/trait.html

  • Simple match example.

    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => println!("You win!"),
    }
    
    • std::cmp::Ordering is another enum, but the variants for Ordering are Less, Greater, and Equal.
    • The cmp method compares two values and can be called on anything that can be compared.
    • A match expression is made up of arms. An arm (=>) consists of a pattern and the code that should be run if the value given to the beginning of the match expression fits that arm’s pattern.
  • Rust has a strong, static type system. However, it also has type inference.

  • Integer type examples: i32, u32, i64.

  • Read the following code, paying attention to the type of guess.

    let mut guess = String::new();
    
    io::stdin()
        .read_line(&mut guess)
        .expect("Failed to read line");
    
    let guess: u32 = guess.trim().parse().expect("Please type a number!");
    
    • Rust allows us to shadow the previous value of guess with a new one. This feature is often used in situations in which you want to convert a value from one type to another type.
      • (My note): I wanted to understand shadowing in more detail, so I read this link.
    • trim method on a String instance will eliminate any whitespace at the beginning and end.
    • When the user presses enter, a newline character is added to the string.
    • The parse method on strings parses a string into some kind of number, and it could easily cause an error (if the string contained A👍%, there would be no way to convert that to a number).
    • The colon : after guess tells Rust we’ll annotate the variable’s type.
  • Make above code more Rust-like.

    // from
    //let guess: u32 = guess.trim().parse().expect("Please type a number!");
    //
    // to
    let guess: u32 = match guess.trim().parse() {
        Ok(num) => num,
        Err(_) => continue,
    };
    
    • Switching from an expect call to a match expression is how you generally move from crashing on an error to handling the error.
    • The underscore, _, is a catchall value; in this example, we’re saying we want to match all Err values, no matter what information they have inside them.
  • loop repeats its body forever unless a break; is reached.
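
(My note) To tie the parse, match, and cmp points together, here is a small sketch that uses only the standard library. check_guess and describe are names I made up for illustration, and the secret number is fixed instead of coming from the rand crate, so this is not the guessing game from the official document itself.

```rust
use std::cmp::Ordering;

// Hypothetical helper: parse one line of input and compare it with the secret number.
// Returns None when the input is not a valid u32.
fn check_guess(input: &str, secret_number: u32) -> Option<Ordering> {
    // trim removes the trailing newline that read_line would leave in the string.
    let guess: u32 = match input.trim().parse() {
        Ok(num) => num,
        Err(_) => return None, // handle the error instead of crashing with expect
    };
    Some(guess.cmp(&secret_number))
}

// Hypothetical helper: turn the comparison result into a message.
fn describe(result: Option<Ordering>) -> &'static str {
    match result {
        Some(Ordering::Less) => "Too small!",
        Some(Ordering::Greater) => "Too big!",
        Some(Ordering::Equal) => "You win!",
        None => "Please type a number!",
    }
}

fn main() {
    let secret_number = 42; // fixed for this sketch; the real game draws it at random
    for input in ["10\n", "99\n", "abc\n", "42\n"] {
        println!("{:?} -> {}", input, describe(check_guess(input, secret_number)));
    }
}
```

In the real game, the None case would correspond to continue inside the loop.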

3. Common Programming Concepts

3.1 Variables and Mutability

  • By default, variables are immutable, which lets you take advantage of the safety and easy concurrency that Rust offers.

  • Why does Rust encourage you to favor immutability?

    • It’s important that we get compile-time errors when we attempt to change a value that we previously designated as immutable because this very situation can lead to bugs.
    • But mutability can be very useful.
  • Like immutable variables, constants are values that are bound to a name and are not allowed to change, but there are a few differences between constants and variables.

    • First, you aren’t allowed to use mut with constants.
    • Constants can be declared in any scope, including the global scope, which makes them useful for values that many parts of code need to know about.
    • The last difference is that constants may be set only to a constant expression, not the result of a function call or any other value that could only be computed at runtime.
  • An example of constants.

    const MAX_POINTS: u32 = 100_000;
    
    • Use all uppercase with underscores between words.
    • Underscores can be inserted in numeric literals to improve readability.
  • Shadowing

    fn main() {
        let x = 5;
        let x = x + 1;
        let x = x * 2;
    
        println!("The value of x is: {}", x);
    }
    
  • Shadowing is different from marking a variable as mut, because we’ll get a compile-time error if we accidentally try to reassign to this variable without using the let keyword.

  • The other difference between mut and shadowing is that because we’re effectively creating a new variable when we use the let keyword again, we can change the type of the value but reuse the same name.

  • Shadowing thus spares us from having to come up with different names, such as spaces_str and spaces_num; instead, we can reuse the simpler spaces name.

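  • My summary on shadowing: shadowing creates a new variable with let, so it can change both the value and the type while reusing the name. A mut variable can change its value, but not its type.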
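
(My note) A short sketch contrasting shadowing with mut. The function names count_spaces and doubled_after_increment are my own; the spaces example follows the spaces_str / spaces_num idea mentioned above.

```rust
const MAX_POINTS: u32 = 100_000;

// Shadowing: each let creates a new variable, so the type may change.
fn count_spaces() -> usize {
    let spaces = "   ";        // &str
    let spaces = spaces.len(); // usize, reusing the same name
    spaces
}

// mut: the value may change, but the type must stay the same.
fn doubled_after_increment(start: u32) -> u32 {
    let mut x = start;
    x = x + 1;
    x = x * 2;
    x
}

fn main() {
    println!("spaces = {}", count_spaces());
    println!("x = {}", doubled_after_increment(5));
    println!("MAX_POINTS = {}", MAX_POINTS);
}
```

Writing let mut spaces = "   "; followed by spaces = spaces.len(); would be rejected by the compiler, because a mut variable cannot change its type.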

3.2 Data Types

  • A scalar type represents a single value.

    • Examples) integers, floating-point numbers, booleans, and characters.
  • Integer types.

    Length   Signed  Unsigned
    8-bit    i8      u8
    16-bit   i16     u16
    32-bit   i32     u32
    64-bit   i64     u64
    128-bit  i128    u128
    arch     isize   usize
  • Signed numbers are stored using two’s complement representation.

  • Integer literals in Rust.

    Number literals  Example
    Decimal          98_222
    Hex              0xff
    Octal            0o77
    Binary           0b1111_0000
    Byte (u8 only)   b'A'
    • We can use underscores in decimals.
  • Integer types default to i32: this type is generally the fastest, even on 64-bit systems.

  • When you’re compiling in debug mode, Rust includes checks for integer overflow that cause your program to panic at runtime if this behavior occurs.

  • Rust uses the term panicking when a program exits with an error.

  • When you’re compiling in release mode with the --release flag, Rust does not include checks for integer overflow that cause panics.

  • Rust’s floating-point types are f32 and f64.

  • The default type is f64 because on modern CPUs it’s roughly the same speed as f32 but is capable of more precision.

  • Floating-point numbers are represented according to the IEEE-754 standard.

  • Booleans are one byte in size.

  • Rust’s char type is four bytes in size and represents a Unicode Scalar Value, which means it can represent a lot more than just ASCII. …. your human intuition for what a “character” is may not match up with what a char is in Rust.

  • Compound types: tuple and array.
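
(My note) Regarding the integer overflow points above: instead of relying on the debug/release difference, the standard library has explicit methods on integer types. The sketch below uses checked_add, wrapping_add, and saturating_add on u8; add_one_variants is a helper name I invented for illustration.

```rust
// Hypothetical helper: add one to a u8 in three explicit ways.
fn add_one_variants(n: u8) -> (Option<u8>, u8, u8) {
    (
        n.checked_add(1),    // None if the addition overflows
        n.wrapping_add(1),   // wraps around on overflow (255 + 1 becomes 0)
        n.saturating_add(1), // stays at the maximum value on overflow
    )
}

fn main() {
    println!("{:?}", add_one_variants(1));
    println!("{:?}", add_one_variants(u8::MAX));
    // char is four bytes because it holds a Unicode scalar value.
    println!("char is {} bytes", std::mem::size_of::<char>());
}
```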

Tuple

  • Tuples have a fixed length: once declared, they cannot grow or shrink in size.
  • Example: let tup: (i32, f64, u8) = (500, 6.4, 1);
fn main() {
    let tup = (500, 6.4, 1);

    let (x, y, z) = tup;

    println!("The value of y is: {}", y);
}
// The value of y is: 6.4
  • We can access a tuple element directly by using a period (.) followed by the index of the value we want to access.
let x: (i32, f64, u8) = (500, 6.4, 1);
let five_hundred = x.0;

Array

  • Every element of an array must have the same type.
  • We can define like let a: [i32; 5] = [1, 2, 3, 4, 5];
  • If you want to create an array that contains the same value for each element, you can specify the initial value, followed by a semicolon, and then the length of the array in square brackets,
let a = [3; 5];
  • You can access elements of an array using indexing, let first = a[0];.
  • What happens if you try to access an element of an array that is past the end of the array? … The compilation didn’t produce any errors, but the program resulted in a runtime error and didn’t exit successfully (panic).
  • In many low-level languages, this kind of check is not done, and when you provide an incorrect index, invalid memory can be accessed.
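
(My note) If a panic on an out-of-range index is not what you want, the get method returns an Option instead. This is a sketch of that idea; element_or_default and the fallback value -1 are arbitrary choices of mine.

```rust
// Hypothetical helper: look up an element without risking a panic.
fn element_or_default(a: &[i32], index: usize) -> i32 {
    match a.get(index) {
        Some(value) => *value,
        None => -1, // arbitrary fallback chosen for this sketch
    }
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let same = [3; 5]; // five elements, all set to 3

    println!("a[0] = {}", a[0]);
    println!("index 10 -> {}", element_or_default(&a, 10));
    println!("same has {} elements", same.len());

    // Tuple access by destructuring and by index.
    let tup: (i32, f64, u8) = (500, 6.4, 1);
    let (_x, y, _z) = tup;
    println!("y = {}, first = {}", y, tup.0);
}
```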

3.3 Functions

  • Function definitions in Rust start with fn and have a set of parentheses after the function name. The curly brackets tell the compiler where the function body begins and ends.
  • Rust code uses snake case as the conventional style for function and variable names.
  • Rust doesn’t care where you define your functions, only that they’re defined somewhere.
  • Rust is an expression-based language, so the distinction between statements and expressions is important to understand.
    • Creating a variable and assigning a value to it with the let keyword is a statement.
    • Function definitions are also statements.
    • Statements do not return values.
    • 5 + 6 is an expression that evaluates to the value 11.
    • Calling a function is an expression. Calling a macro is an expression. The block that we use to create new scopes, {}, is an expression:
    {
        let x = 3;
        x + 1
    }
    
    • The x + 1 line has no semicolon at the end, unlike most of the lines you’ve seen so far. Expressions do not include ending semicolons; adding one turns the expression into a statement, which does not return a value.
fn main() {
    let y = {
        let x = 3;
        x + 1
    };

    println!("The value of y is: {}", y);
    // The value of y is: 4
}

Functions with Return Values

fn main() {
    let x = plus_one(5);

    println!("The value of x is: {}", x);
}

fn plus_one(x: i32) -> i32 {
    x + 1
}
  • We don’t name return values, but we do declare their type after an arrow (->).
  • The return value of the function is synonymous with the value of the final expression in the block of the body of a function. You can return early from a function by using the return keyword and specifying a value, but most functions return the last expression implicitly.
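
(My note) A small sketch of both ways to return a value: an early return with the return keyword, and the implicit return of the final expression. absolute_difference and block_value are names I made up.

```rust
// Returns |a - b| for inputs where the subtraction does not overflow.
fn absolute_difference(a: i32, b: i32) -> i32 {
    if a < b {
        return b - a; // early return with the return keyword
    }
    a - b // no semicolon: the final expression is the return value
}

// A block is an expression, so its value can be bound with let.
fn block_value() -> i32 {
    let y = {
        let x = 3;
        x + 1
    };
    y
}

fn main() {
    println!("{}", absolute_difference(3, 10));
    println!("{}", block_value());
}
```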

3.4 Comments

Pass ;)

3.5 Control Flow

if number < 5 {
    println!("condition was true");
} else {
    println!("condition was false");
}
  • Blocks of code associated with the conditions in if expressions are sometimes called arms, just like the arms in match expressions.
  • It’s also worth noting that the condition in this code must be a bool. If the condition isn’t a bool, we’ll get an error.
  • You can have multiple conditions by combining if and else in an else if expression.
  • Because if is an expression, we can use it on the right side of a let statement.
    let condition = true;
    let number = if condition { 5 } else { 6 };
    
  • The values that have the potential to be results from each arm of the if must be the same type.
    • Decided at compile time.
    • The compiler would be more complex and would make fewer guarantees about the code if it had to keep track of multiple hypothetical types for any variable.
  • loop repeats a block forever until a break; is reached.
  • You can add the value you want returned after the break expression you use to stop the loop; that value will be returned out of the loop so you can use it
let mut counter = 0;

let result = loop {
    counter += 1;

    if counter == 10 {
        break counter * 2;
    }
};

println!("The result is {}", result);
//The result is 20
  • while -> The loop runs as long as the condition is true and exits once the condition becomes false.
fn main() {
    let mut number = 3;

    while number != 0 {
        println!("{}!", number);

        number -= 1;
    }

    println!("LIFTOFF!!!");
}
//3!
//2!
//1!
//LIFTOFF!!!
  • You could use the while construct to loop over the elements of a collection, such as an array.
fn main() {
    let a = [10, 20, 30, 40, 50];
    let mut index = 0;

    while index < 5 {
        println!("the value is: {}", a[index]);

        index += 1;
    }
}
the value is: 10
the value is: 20
the value is: 30
the value is: 40
the value is: 50
  • And there is for also.
fn main() {
    let a = [10, 20, 30, 40, 50];

    for element in a.iter() {
        println!("the value is: {}", element);
    }
}
  • Prefer for when looping over an array: it is safer, because there is no index that could go past the end of the array.
fn main() {
    for number in (1..4).rev() {
        println!("{}!", number);
    }
    println!("LIFTOFF!!!");
}
  • rev reverses the iteration.
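
(My note) A sketch that combines else if with a for loop over a reversed range. classify and countdown are my own names, and countdown collects its lines into a Vec (a growable list type that I have not covered yet) only so that the result can be inspected.

```rust
// if / else if / else is an expression, so it can be the function's return value.
fn classify(number: i32) -> &'static str {
    if number % 4 == 0 {
        "divisible by 4"
    } else if number % 3 == 0 {
        "divisible by 3"
    } else if number % 2 == 0 {
        "divisible by 2"
    } else {
        "not divisible by 4, 3, or 2"
    }
}

// Counts down from the given number using a reversed inclusive range.
fn countdown(from: u32) -> Vec<String> {
    let mut lines = Vec::new();
    for number in (1..=from).rev() {
        lines.push(format!("{}!", number));
    }
    lines.push(String::from("LIFTOFF!!!"));
    lines
}

fn main() {
    println!("6 is {}", classify(6));
    for line in countdown(3) {
        println!("{}", line);
    }
}
```

Only the first matching arm runs, so 6 is reported as divisible by 3 even though it is also divisible by 2.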

4. Understanding Ownership

  • Rust has no garbage collector (GC); it manages memory through ownership instead.

4.1 What Is Ownership?

General IT knowledge: stack and heap

  • The stack stores values in the order it gets them and removes the values in the opposite order. LIFO = FILO.
  • All data stored on the stack must have a known, fixed size. Data with an unknown size at compile time or a size that might change must be stored on the heap instead.
    • cf. In computer science, a heap is also a tree with a special property: the value of each node must be >= (or <=) the values of its children. In the context of programming languages, however, you can think of the heap as a free memory area that is assigned to a program (process) at execution time.
  • The heap is less organized: when you put data on the heap, you request a certain amount of space. The memory allocator finds an empty spot in the heap that is big enough, marks it as being in use, and returns a pointer, which is the address of that location. This process is called allocating on the heap and is sometimes abbreviated as just allocating. Pushing values onto the stack is not considered allocating. Because the pointer is a known, fixed size, you can store the pointer on the stack, but when you want the actual data, you must follow the pointer.
  • Pushing to the stack is faster than allocating on the heap because the allocator never has to search for a place to store new data; that location is always at the top of the stack.
  • When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack.

Ownership addresses the problems,

  1. Keeping track of what parts of code are using what data on the heap,
  2. minimizing the amount of duplicate data on the heap,
  3. and cleaning up unused data on the heap so you don’t run out of space

Once you understand ownership, you won’t need to think about the stack and the heap very often, but knowing that managing heap data is why ownership exists can help explain why it works the way it does.

In Rust, memory is managed through a system of ownership with a set of rules that the compiler checks at compile time. None of the ownership features slow down your program while it’s running.

Side note: Heap fragmentation in Rust

https://internals.rust-lang.org/t/jemalloc-was-just-removed-from-the-standard-library/8759

… the std::alloc::System type to represent the system’s default allocator.

https://stackoverflow.com/questions/40658045/does-rusts-memory-management-result-in-fragmented-memory

Ownership Rules

  1. Each value in Rust has a variable that’s called its owner.
  2. There can only be one owner at a time.
  3. When the owner goes out of scope, the value will be dropped.
  • The types covered previously are all stored on the stack and popped off the stack when their scope is over, but we want to look at data that is stored on the heap and explore how Rust knows when to clean up that data.
let s = String::from("hello");
  • The String type is allocated on the heap and as such is able to store an amount of text that is unknown to us at compile time.
  • In the case of a string literal (like, let literal = "I'm a string literal"), we know the contents at compile time, so the text is hardcoded directly into the final executable. This is why string literals are fast and efficient. But these properties only come from the string literal’s immutability.
  • With the String type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:
    • The memory must be requested from the memory allocator at runtime.
    • We need a way of returning this memory to the allocator when we’re done with our String.
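    • That first part is done by us: when we call String::from, its implementation requests the memory it needs. However, the second part is different: in languages with a garbage collector (GC), the GC cleans up memory that is no longer used, while without a GC it is our responsibility to return it.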
  • Rust takes a different path: the memory is automatically returned (~free) once the variable that owns it goes out of scope.
  • When a variable goes out of scope, Rust calls a special function for us. This function is called drop, and it’s where the author of String can put the code to return the memory. Rust calls drop automatically at the closing curly bracket.
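
(My note) To see drop being called at the closing curly bracket, we can implement the Drop trait on our own type. This is only a sketch: Tracked, DROPS, and drops_in_inner_scope are names I invented, and the atomic counter exists just to make the drop observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Tracked values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type used only to observe when drop runs.
struct Tracked {
    name: String,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        println!("dropping {}", self.name);
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many drops happened because of the inner scope.
fn drops_in_inner_scope() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    {
        let _s = Tracked { name: String::from("hello") };
        // _s is valid from here until the end of this block
    } // the owner goes out of scope here, so drop is called automatically
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    println!("drops in inner scope: {}", drops_in_inner_scope());
}
```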

Example.1: Stack

let x = 5;
let y = x;
  1. The value 5 is bound to x and stored on the stack.
  2. Make a copy of the value in x and bind it to y.
  • Integers are simple values with a known, fixed size, and these two 5 values are pushed onto the stack.
  • My note: variable names like x and y have no meaning in the compiled code (assembly). Only the values 5 are stored on the actual stack, and the Rust compiler keeps track of the location of each of these variables.

Example.2: Heap

let s1 = String::from("hello");
let s2 = s1;
  • A String is made up of three parts:
    • A pointer to the memory that holds the contents of the string,
    • The length is how much memory, in bytes, the contents of the String is currently using, and
    • The capacity is the total amount of memory, in bytes, that the String has received from the allocator.
    • When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to.


  • (My note) ptr, len, and capacity are stored on the stack.
  • (My note) Core concept: a String contains a pointer.

The following code fails with an error at compile time.

let s1 = String::from("hello");
let s2 = s1;
println!("{}, world!", s1);
  • Note: shallow copy and deep copy: from the Python documentation.
    • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
    • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
    • … OK, a deep copy makes its own copy of the object in memory, and a shallow copy just refers to the original values.
  • The concept of copying the pointer, length, and capacity without copying the data probably sounds like making a shallow copy. But because Rust also invalidates the first variable, instead of being called a shallow copy, it’s known as a move.
  • Only s2 is valid, so when it goes out of scope, it alone frees the memory.
  • Rust will never automatically create “deep” copies of your data. Therefore, any automatic copying can be assumed to be inexpensive in terms of runtime performance.
  • If we do want to deeply copy the heap data of the String, not just the stack data, we can use a common method called clone.
fn main() {
    let s1 = String::from("hello");
    let s2 = s1.clone();

    println!("s1 = {}, s2 = {}", s1, s2);
}
// s1 = hello, s2 = hello
let x = 5;
let y = x;

println!("x = {}, y = {}", x, y);
  • The code above compiles without error because types such as integers that have a known size at compile time are stored entirely on the stack, so copies of the actual values are quick to make.
  • Rust has a special annotation called the Copy trait that we can place on types like integers that are stored on the stack.
  • As a general rule, any group of simple scalar values can be Copy, and nothing that requires allocation or is some form of resource is Copy.
    • Examples: u32, bool, f64, char, and tuples (if they only contain types that are also Copy).

Ownership and Functions

  • The following code fails at compile time at the line println!("{}", s).
fn main() {
    let s = String::from("hello");  // s comes into scope

    takes_ownership(s);             // s's value moves into the function...
                                    // ... and so is no longer valid here
    println!("{}", s)
}

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
  // memory is freed.

Return Values and Scope

  • Returning values can also transfer ownership.
  • When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless the data has been moved to be owned by another variable.

Example:

fn main() {
    let s1 = gives_ownership();         // gives_ownership moves its return
                                        // value into s1

    let s2 = String::from("hello");     // s2 comes into scope

    let s3 = takes_and_gives_back(s2);  // s2 is moved into
                                        // takes_and_gives_back, which also
                                        // moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 goes out of scope but was
  // moved, so nothing happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String {             // gives_ownership will move its
                                             // return value into the function
                                             // that calls it

    let some_string = String::from("hello"); // some_string comes into scope

    some_string                              // some_string is returned and
                                             // moves out to the calling
                                             // function
}

// takes_and_gives_back will take a String and return one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into
                                                      // scope

    a_string  // a_string is returned and moves out to the calling function
}
  • What if we want to let a function use a value but not take ownership? It’s quite annoying that anything we pass in also needs to be passed back if we want to use it again -> The solution is references.
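
(My note) Without references, the workaround is to hand the value back together with the result, for example as a tuple. This sketch shows why that gets tedious; the ownership-taking version of calculate_length here is for illustration only.

```rust
// Takes ownership of the String and gives it back along with its length.
fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length in bytes
    (s, length)
}

fn main() {
    let s1 = String::from("hello");

    // s1 is moved into the function; we get the String back as s2.
    let (s2, len) = calculate_length(s1);

    println!("The length of '{}' is {}.", s2, len);
}
```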

4.2 References

let s1 = String::from("hello");
let len = calculate_length(&s1);

fn calculate_length(s: &String) -> usize {
    s.len()
}


  • &s1 is a reference. It refers to the value of s1 but does not own it.
  • We call having references as function parameters borrowing.
  • So what happens if we try to modify something we’re borrowing? -> We can, if the reference is mutable (&mut), but with one restriction: you can have only one mutable reference to a particular piece of data in a particular scope. So the following code fails to compile.
fn main() {
    let mut s = String::from("hello");

    let r1 = &mut s;
    let r2 = &mut s;

    println!("{}, {}", r1, r2);
}
  • Mutable and immutable references cannot be mixed: you cannot have a mutable reference while an immutable reference to the same value is still in use.
  • Multiple immutable references are okay.
  • Note that a reference’s scope starts from where it is introduced and continues through the last time that reference is used.
fn main() {
    let mut s = String::from("hello");

    let r1 = &s; // no problem
    let r2 = &s; // no problem (multiple immutable references.)
    println!("{} and {}", r1, r2);
    // r1 and r2 are no longer used after this point

    let r3 = &mut s; // no problem
    println!("{}", r3);
}
  • In Rust, the compiler guarantees that references will never be dangling references.
  • The two rules of references
    • At any given time, you can have either one mutable reference or any number of immutable references.
    • References must always be valid.
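
(My note) A sketch of a mutable borrow that does compile. change follows the usual pattern of taking &mut String; change_twice is my own helper showing that a second mutable reference is fine once the first one is no longer in scope.

```rust
// Borrows the String mutably and appends to it without taking ownership.
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

fn change_twice() -> String {
    let mut s = String::from("hello");

    {
        let r1 = &mut s;
        change(r1);
    } // r1 goes out of scope here, so a new mutable reference is allowed

    let r2 = &mut s;
    r2.push('!');

    s // r2 is no longer used, so s can be returned
}

fn main() {
    println!("{}", change_twice());
}
```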

4.3 The Slice Type

  • The slice is another data type that does not have ownership.
  • Slices let you reference (so doesn’t have ownership) a contiguous sequence of elements in a collection rather than the whole collection.
  • For example, let bytes = s.as_bytes();: s is a String and bytes is a slice of bytes (&[u8]).
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' { //search for the byte that represents the space by using the byte literal syntax.
            return i;  //If we find a space, we return the position
        }
    }

    s.len() // Otherwise, we return the length of the string by using s.len()
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s); // word will get the value 5

    s.clear(); // this empties the String, making it equal to ""

    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally "invalid"!
}
  • Because we get a reference to the element from .iter().enumerate(), we use & in the pattern.
  • Because word isn’t connected to the state of s at all, word still contains the value 5.

String Slice

fn main() {
    let s = String::from("hello world");

    let hello = &s[0..5];
    let world = &s[6..11];
}


  • world contains a pointer to the byte at index 6 of s and a length of 5 (a slice is a reference).
  • Rust’s range syntax is .. (two dots).
  • String slice range indices must occur at valid UTF-8 character boundaries. If you attempt to create a string slice in the middle of a multibyte character, your program will exit with an error. For the purposes of introducing string slices, we are assuming ASCII only in this section;
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s); // immutable reference

    s.clear(); // error! Because clear needs to truncate the String, it needs to get a mutable reference.

    println!("the first word is: {}", word);
}
  • My note: Because word holds an immutable borrow of s, s.clear() is rejected: emptying the String (making it "") would require a mutable borrow while word is still in use.

Example: frequently used slice

fn main() {
    let a = [1, 2, 3, 4, 5];

    let slice = &a[1..3];
}

This slice has the type &[i32].
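
(My note) A sketch of first_word that takes &str instead of &String, so it works with both String values and string literals. safe_prefix is a helper I made up to show that the get method returns None, instead of panicking, when a range does not fall on a UTF-8 character boundary.

```rust
// Accepts any string slice: a slice of a String or a string literal.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

// Hypothetical helper: a prefix that is None when the range is invalid.
fn safe_prefix(s: &str, end: usize) -> Option<&str> {
    s.get(0..end)
}

fn main() {
    let my_string = String::from("hello world");
    println!("{}", first_word(&my_string[..])); // slice of a String
    println!("{}", first_word("single"));       // string literal

    // 'é' takes two bytes in UTF-8, so byte index 1 is not a character boundary.
    println!("{:?}", safe_prefix("é", 1));
}
```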

5. Using Structs to Structure Related Data

A struct, or structure, is a custom data type that lets you name and package together multiple related values that make up a meaningful group.

5.1 Defining and Instantiating Structs

  • The pieces of a struct can be different types.
  • Unlike with tuples, you’ll name each piece of data so it’s clear what the values mean.
  • you don’t have to rely on the order of the data to specify or access the values of an instance.
  • To use a struct after we’ve defined it, we create an instance of that struct by specifying concrete values for each of the fields, with key: value pairs.
struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}

fn main() {
    let user1 = User {
        email: String::from("someone@example.com"),
        username: String::from("someusername123"),
        active: true,
        sign_in_count: 1,
    };
}
  • To get a specific value from a struct, we can use dot notation.
  • If the instance is mutable, we can change a value by using the dot notation and assigning into a particular field.
  • user1.email = String::from("anotheremail@example.com");
  • The entire instance must be mutable; Rust doesn’t allow us to mark only certain fields as mutable.
  • A sample function that creates an instance:
fn build_user(email: String, username: String) -> User {
    User {
        email: email,
        username: username,
        active: true,
        sign_in_count: 1,
    }
}
  • Because the parameter names and the struct field names are exactly the same, we can use the field init shorthand syntax to rewrite build_user so that it behaves exactly the same but doesn’t have the repetition of email and username.
fn build_user(email: String, username: String) -> User {
    User {
        email,
        username,
        active: true,
        sign_in_count: 1,
    }
}
  • The syntax .. specifies that the remaining fields not explicitly set should have the same value as the fields in the given instance.
let user2 = User {
    email: String::from("another@example.com"),
    username: String::from("anotherusername567"),
    ..user1
};

user2 has a different value for email and username but has the same values for the active and sign_in_count fields from user1.

  • You can also define structs that look similar to tuples, called tuple structs. Tuple structs have the added meaning the struct name provides but don’t have names associated with their fields;
struct Color(i32, i32, i32);
let black = Color(0, 0, 0);
  • We can’t simply use &str instead of String for the fields of a struct. The compiler returns an error because a reference stored in a struct needs a lifetime specifier. &str is a “string slice”, so it is a reference. A struct can hold references, but only with lifetime annotations.
  • (From chapter 10): Every reference in Rust has a lifetime, which is the scope for which that reference is valid.
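My sketch (not from chapter 5 of the document): a struct can hold a &str once it declares a lifetime parameter. UserRef and describe are hypothetical names I made up for illustration.

```rust
// Hypothetical struct for illustration: it borrows its username
// instead of owning a String, so it needs the lifetime parameter 'a.
struct UserRef<'a> {
    username: &'a str,
    active: bool,
}

// Builds a short description; the struct only borrows the name,
// so the original String stays usable by the caller.
fn describe(user: &UserRef) -> String {
    let status = if user.active { "active" } else { "inactive" };
    format!("{} ({})", user.username, status)
}
```

The instance can’t outlive the string it borrows from, which is exactly what the lifetime 'a expresses.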

5.2 An Example Program Using Structs

  • Practical tips
  • We use structs to add meaning by labeling the data.
struct Rectangle {
    width: u32,
    height: u32,
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!(
        "The area of the rectangle is {} square pixels.",
        area(&rect1)
    );
}

fn area(rectangle: &Rectangle) -> u32 {
    rectangle.width * rectangle.height
}
  • We want to borrow the struct rather than take ownership of it. This way, main retains its ownership and can continue using rect1, which is the reason we use the & in the function signature and where we call the function.

  • By default, the curly brackets {} tell println! to use formatting known as Display: output intended for direct end user consumption. For a struct, the right format is ambiguous (commas or not? braces or not? all fields?), so Rust doesn’t try to guess what we want, and structs don’t have a provided implementation of Display.

  • Use {:?} for debug output or {:#?} for pretty-printed debug output. Both require #[derive(Debug)] just before the struct definition, as shown below.

#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!("rect1 is {:?}", rect1);
    // rect1 is Rectangle { width: 30, height: 50 }
}

Here I add the annotation to derive the Debug trait and print the Rectangle instance using debug formatting. Rust has provided a number of traits for us to use with the derive annotation that can add useful behavior to our custom types.

#[derive(Debug)] is called an attribute. See https://doc.rust-lang.org/rust-by-example/attribute.html
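My sketch: since structs get no Display for free, we can write one by hand. The "30x50" format and the render helper are my own choices for illustration, not something the Book prescribes.

```rust
use std::fmt;

#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

// Hand-written Display: we decide the user-facing format ourselves.
impl fmt::Display for Rectangle {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}x{}", self.width, self.height)
    }
}

// Returns the Display and Debug renderings so they can be compared.
fn render(rect: &Rectangle) -> (String, String) {
    (format!("{}", rect), format!("{:?}", rect))
}
```

With this in place, {} and {:?} both work on a Rectangle, and they can show different text.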

5.3 Method syntax

  • Methods are different from functions in that they’re defined within the context of a struct (or an enum or a trait object),
  • and their first parameter is always self, which represents the instance of the struct the method is being called on.
  • How to add a method to a struct? -> impl
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
}
  • How to call a method? -> dot notation (e.g. rect1.area()).
  • Methods can take ownership of self, borrow self immutably as we’ve done above, or borrow self mutably, just as they can any other parameter.
  • In C and C++: if object is a pointer, object->something() is similar to (*object).something().
  • Rust doesn’t have an equivalent to the -> operator; instead, Rust has a feature called automatic referencing and dereferencing.
  • When you call a method with object.something(), Rust automatically adds in &, &mut, or * so object matches the signature of the method.
impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }

    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}
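My sketch of automatic referencing with can_hold: the two calls below mean the same thing, because Rust inserts the & on the receiver for us. The helper both_call_styles is a hypothetical name used only for this comparison.

```rust
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}

// Calls can_hold in two equivalent ways and returns both results.
fn both_call_styles() -> (bool, bool) {
    let rect1 = Rectangle { width: 30, height: 50 };
    let rect2 = Rectangle { width: 10, height: 40 };

    let automatic = rect1.can_hold(&rect2); // Rust borrows rect1 as &rect1 for us
    let explicit = (&rect1).can_hold(&rect2); // the same call written out by hand
    (automatic, explicit)
}
```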
  • We’re allowed to define functions within impl blocks that don’t take self as a parameter. These are called associated functions because they’re associated with the struct.
  • Associated functions are often used for constructors that will return a new instance of the struct.
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn square(size: u32) -> Rectangle {
        Rectangle {
            width: size,
            height: size,
        }
    }
}

fn main() {
    let sq = Rectangle::square(3);
}
  • To call this associated function, we use the :: syntax with the struct name; let sq = Rectangle::square(3); is an example. This function is namespaced by the struct.
  • Each struct is allowed to have multiple impl blocks. (My memo) I can add new functions later.
  • My note: Why do we need associated functions? -> My answer: First, the main benefit of methods over plain functions is code organization, because we can put everything related to a struct in one place. With free functions, we would have to search the whole code base to find what can be done with the struct. Second, some functions work on an actual instance of the struct, while others relate to the struct itself and don’t need an instance to work with, like String::from.

6 Enums and Pattern Matching

6.1 Defining an Enum

An enum definition is a kind of custom data type.

  • Example 1. IP
enum IpAddrKind {
    V4,
    V6,
}

My note:

  • The custom data type is IpAddrKind.

  • A value of type IpAddrKind is either V4 or V6.

  • V4 and V6 are variants of the enum, not separate types.

  • The following code creates an instance of each variant. The variants of the enum are namespaced under its identifier, and we use a double colon to separate the two:

let four = IpAddrKind::V4;
let six = IpAddrKind::V6;
  • The reason this is useful is that both values IpAddrKind::V4 and IpAddrKind::V6 are of the same type: IpAddrKind. We can then, for instance, define a function that takes any IpAddrKind:
fn route(ip_kind: IpAddrKind) {}

And we can call this function with either variant:

route(IpAddrKind::V4);
route(IpAddrKind::V6);

We can attach data to each variant directly in the enum definition.

enum IpAddr {
    V4(String),
    V6(String),
}

let home = IpAddr::V4(String::from("127.0.0.1"));
let loopback = IpAddr::V6(String::from("::1"));
  • There’s another advantage to using an enum rather than a struct: each variant can have different types and amounts of associated data.
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

let home = IpAddr::V4(127, 0, 0, 1);
let loopback = IpAddr::V6(String::from("::1"));
  • My note: enums let us write code at a more abstract level, and they can also reduce the number of lines.
  • The following code is how the Rust standard library actually defines IpAddr (because wanting to store IP addresses and encode which kind they are is so common).
struct Ipv4Addr {
    // --snip--
}

struct Ipv6Addr {
    // --snip--
}

enum IpAddr {
    V4(Ipv4Addr),
    V6(Ipv6Addr),
}
  • Another example.
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}
  • If we used the different structs, which each have their own type, we couldn’t as easily define a function to take any of these kinds of messages as we could with the Message enum defined above, which is a single type.
    • My note: When the specification changes (adding a new kind of message), with structs we would have to copy and paste another struct; with an enum we only add a variant.
  • We can also define methods on an enum with impl.
impl Message {
    fn call(&self) {
        // method body would be defined here
    }
}

let m = Message::Write(String::from("hello"));
m.call();
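My sketch of what a method body on Message could look like. The describe method and its output strings are hypothetical; the point is that a method on an enum usually matches on self to handle each variant.

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

impl Message {
    // Hypothetical method: returns a text description of the variant.
    fn describe(&self) -> String {
        match self {
            Message::Quit => String::from("quit"),
            Message::Move { x, y } => format!("move to ({}, {})", x, y),
            Message::Write(text) => format!("write: {}", text),
            Message::ChangeColor(r, g, b) => format!("color ({}, {}, {})", r, g, b),
        }
    }
}
```

One function handles every kind of message, which would be harder with four separate structs.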

The Option Enum and Its Advantages Over Null Values

Please watch this video at least once, even if you don’t understand Option at first. https://www.youtube.com/watch?v=JKmkKae-EhM

  • Option is another enum defined by the standard library.
  • The Option type is used in many places because it encodes the very common scenario in which a value could be something or it could be nothing. Expressing this concept in terms of the type system means the compiler can check whether you’ve handled all the cases you should be handling; this functionality can prevent bugs that are extremely common in other programming languages.
    • Rust doesn’t have the null feature. … In languages with null, variables can always be in one of two states: null or not-null.
    • The problem with null values is that if you try to use a null value as a not-null value, you’ll get an error of some kind.
    • However, the concept that null is trying to express is still a useful one: a null is a value that is currently invalid or absent for some reason.
    • The problem isn’t really with the concept but with the particular implementation.
  • Rust does not have nulls, but it does have an enum that can encode the concept of a value being present or absent. This enum is Option<T>, and it is defined by the standard library as follows:
enum Option<T> {
    Some(T),
    None,
}
  • You can use Some and None directly without the Option:: prefix.
  • For now, all you need to know is that <T> means the Some variant of the Option enum can hold one piece of data of any type.
let some_number = Some(5);
let some_string = Some("a string");

let absent_number: Option<i32> = None;
  • If we use None rather than Some, we need to tell Rust what type of Option<T> we have.

Why is having Option<T> any better than having null?

Because Option<T> and T (where T can be any type) are different types, the compiler won’t let us use an Option<T> value as if it were definitely a valid value.

In the following code, the line let sum = x + y; causes a compile error because Rust doesn’t understand how to add an i8 and an Option<i8>.

fn main() {
    let x: i8 = 5;
    let y: Option<i8> = Some(5);

    let sum = x + y;
}

This means, when we have a value of a type like i8 in Rust, the compiler will ensure that we always have a valid value. In other words, you have to convert an Option<T> to a T before you can perform T operations with it. Generally, this helps catch one of the most common issues with null.
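My sketch of converting an Option<i8> into an i8 before adding. Treating None as 0 is my assumption for this example; a real program has to decide what the missing case means.

```rust
// Adds x and y, handling the missing case explicitly with match.
fn add_optional(x: i8, y: Option<i8>) -> i8 {
    match y {
        Some(value) => x + value,
        None => x, // assumption: a missing y contributes nothing
    }
}

// The same idea using the standard unwrap_or helper.
fn add_with_default(x: i8, y: Option<i8>) -> i8 {
    x + y.unwrap_or(0)
}
```

Either way, the compiler forces us to say what happens when there is no value.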

6.2 The match Control Flow Operator

Here is an example. (Tips. From 1999 through 2008, the United States minted quarters with different designs for each of the 50 states on one side.)

#[derive(Debug)] // so we can inspect the state in a minute
enum UsState {
    Alabama,
    Alaska,
    // --snip--
}

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsState),
}

fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter(state) => {
            println!("State quarter from {:?}!", state);
            25
        }
    }
}

Matching with Option<T>

  • Especially in the case of Option<T>, when Rust prevents us from forgetting to explicitly handle the None case, it protects us from assuming that we have a value when we might have null, thus making the billion-dollar mistake discussed earlier impossible.
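A small example of matching on Option<T>, along the lines of the plus_one function in the Book. If we delete the None arm, the code no longer compiles, which is the protection described above.

```rust
// Adds one to the inner value if there is one; None stays None.
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1),
    }
}
```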

The _ Placeholder

The _ will match all the possible cases that aren’t specified before it.
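My sketch of the _ placeholder. describe_number is a hypothetical function; without the last arm, the match on a u8 would have to list all 256 values.

```rust
// Names a few numbers and lets _ cover every other u8 value.
fn describe_number(n: u8) -> &'static str {
    match n {
        1 => "one",
        3 => "three",
        5 => "five",
        7 => "seven",
        _ => "something else",
    }
}
```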

6.3 Concise Control Flow with if let

The if let syntax lets you combine if and let into a less verbose way to handle values that match one pattern while ignoring the rest.

let some_u8_value = Some(0u8);

// no if let syntax
match some_u8_value {
    Some(3) => println!("three"),
    _ => (),
}

// same as above (with if let syntax)
// note: the pattern is its first arm.
if let Some(3) = some_u8_value {
    println!("three");
}
  • We can include an else with an if let.
match coin {
    Coin::Quarter(state) => println!("State quarter from {:?}!", state),
    _ => count += 1,
}

// same as
if let Coin::Quarter(state) = coin {
        println!("State quarter from {:?}!", state);
} else {
    count += 1;
}

When do we use if let?

Using if let means less typing, less indentation, and less boilerplate code. However, you lose the exhaustive checking that match enforces. Choosing between match and if let depends on what you’re doing in your particular situation and whether gaining conciseness is an appropriate trade-off for losing exhaustive checking.

With if let, we don’t need to write the _ arm that match requires. The difference from a plain if is that, in place of a condition expression, if let expects the keyword let followed by a pattern, an = and a scrutinee expression.

7. Managing Growing Projects with Packages, Crates, and Modules

  • Packages: A Cargo feature that lets you build, test, and share crates
  • Crates: A tree of modules that produces a library or executable
  • Modules and use: Let you control the organization, scope, and privacy of paths
  • Paths: A way of naming an item, such as a struct, function, or module

A package can contain multiple binary crates and optionally one library crate.

7.1 Packages and Crates

Packages

  • A package is one or more crates that provide a set of functionality.
  • A package contains a Cargo.toml file that describes how to build those crates.
  • A package must contain zero or one library crates, and no more.
  • When you enter cargo new hello_cargo, it creates the package hello_cargo, and this is described in Cargo.toml file.
    • We have a package that only contains src/main.rs, meaning it only contains a binary crate named hello_cargo.

Crates

  • A crate is a binary or library.
  • The crate root is a source file that the Rust compiler starts from and makes up the root module of your crate.

Sample: Cargo.toml

[package]
name = "hello_cargo"
version = "0.1.0"
authors = ["atlex <itsme@myemail.com>"]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

Conventions

  • src/main.rs is the crate root of a binary crate with the same name as the package.
  • If the package directory contains src/lib.rs, the package contains a library crate with the same name as the package, and src/lib.rs is its crate root.
  • If a package contains src/main.rs and src/lib.rs, it has two crates: a library and a binary, both with the same name as the package.
  • A package can have multiple binary crates by placing files in the src/bin directory.
  • A crate will group related functionality together in a scope so the functionality is easy to share between multiple projects.

My note: within one package, the library crate and the binary crate can be regarded as separate crates.

7.2 Defining Modules to Control Scope and Privacy

  • By using modules, we can group related definitions together and name why they’re related.

  • The use keyword brings a path into scope. It is written in the code that uses the item (the parent side).

  • The pub keyword makes items public. It is written where the item is defined (the child side).

  • Privacy of an item is whether item can be used by outside code (public) or is an internal implementation detail and not available for outside use (private).

Sample

cargo new --lib restaurant

restaurant
├── Cargo.toml
└── src
    └── lib.rs

Write lib.rs as follows.

mod front_of_house {
    mod hosting {
        fn add_to_waitlist() {}
        fn seat_at_table() {}
    }

    mod serving {
        fn take_order() {}
        fn serve_order() {}
        fn take_payment() {}
    }
}

Module tree in this example

crate
 └── front_of_house
     ├── hosting
     │   ├── add_to_waitlist
     │   └── seat_at_table
     └── serving
         ├── take_order
         ├── serve_order
         └── take_payment
  • If module A is contained inside module B, we say that module A is the child of module B and that module B is the parent of module A.
  • The module tree might remind you of the filesystem’s directory tree on your computer; this is a very apt comparison! Just like directories in a filesystem, you use modules to organize your code. And just like files in a directory, we need a way to find our modules.

7.3 Paths for Referring to an Item in the Module Tree

  • If we want to call a function, we need to know its path.
  • A path can take two forms:
    • An absolute path starts from a crate root by using a crate name or a literal crate.
    • A relative path starts from the current module and uses self, super, or an identifier in the current module.
  • Both absolute and relative paths are followed by one or more identifiers separated by double colons (::).
  • Our preference is to specify absolute paths because it’s more likely to move code definitions and item calls independently of each other.
  • Rust’s privacy boundary: the line that encapsulates the implementation details external code isn’t allowed to know about, call, or rely on. So, if you want to make an item like a function or struct private, you put it in a module.
  • The way privacy works in Rust is that all items (functions, methods, structs, enums, modules, and constants) are private by default.
  • Items in a parent module can’t use the private items inside child modules, but items in child modules can use the items in their ancestor modules.
  • Making the module public doesn’t make its contents public.
Sample

src/lib.rs

mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}

pub fn eat_at_restaurant() {
    // Absolute path
    crate::front_of_house::hosting::add_to_waitlist();

    // Relative path
    front_of_house::hosting::add_to_waitlist();
}
  • We can also construct relative paths that begin in the parent module by using super at the start of the path. This is like starting a filesystem path with the .. syntax.
fn serve_order() {}

mod back_of_house {
    fn fix_incorrect_order() {
        cook_order();
        super::serve_order();
    }

    fn cook_order() {}
}
  • We think the back_of_house module and the serve_order function are likely to stay in the same relationship to each other and get moved together should we decide to reorganize the crate’s module tree. Therefore, we used super so we’ll have fewer places to update code in the future if this code gets moved to a different module.

  • If we use pub before a struct definition, we make the struct public, but the struct’s fields will still be private. We can make each field public or not on a case-by-case basis.

mod back_of_house {
    pub struct Breakfast {
        pub toast: String,
        seasonal_fruit: String,
    }

    impl Breakfast {
        pub fn summer(toast: &str) -> Breakfast {
            Breakfast {
                toast: String::from(toast),
                seasonal_fruit: String::from("peaches"),
            }
        }
    }
}
pub fn eat_at_restaurant() {
    let mut meal = back_of_house::Breakfast::summer("Rye");
    meal.toast = String::from("Wheat");
    println!("I'd like {} toast please", meal.toast);

}

We’ve defined a public back_of_house::Breakfast struct with a public toast field but a private seasonal_fruit field. This models the case in a restaurant where the customer can pick the type of bread that comes with a meal, but the chef decides which fruit accompanies the meal based on what’s in season and in stock. The available fruit changes quickly, so customers can’t choose the fruit or even see which fruit they’ll get.

  • In contrast, if we make an enum public, all of its variants are then public. We only need the pub before the enum keyword.
mod back_of_house {
    pub enum Appetizer {
        Soup,
        Salad,
    }
}

pub fn eat_at_restaurant() {
    let order1 = back_of_house::Appetizer::Soup;
    let order2 = back_of_house::Appetizer::Salad;
}

7.4 Bringing Paths into Scope with the use Keyword

  • We can bring a path into a scope once and then call the items in that path as if they’re local items with the use keyword.
mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}

use crate::front_of_house::hosting;
//or
//use self::front_of_house::hosting;

pub fn eat_at_restaurant() {
    hosting::add_to_waitlist();
    hosting::add_to_waitlist();
    hosting::add_to_waitlist();
}

Creating Idiomatic use Paths convention

The following use is bad (unidiomatic).

use crate::front_of_house::hosting::add_to_waitlist;

pub fn eat_at_restaurant() {
    add_to_waitlist();
    add_to_waitlist();
    add_to_waitlist();
}

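With this style, it is unclear where add_to_waitlist is defined.

A related naming problem: std::fmt::Result and std::io::Result share a name, so bringing both into scope as-is would clash. Renaming one with as avoids the clash.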

use std::fmt::Result;
use std::io::Result as IoResult;
  • When we bring a name into scope with the use keyword, the name available in the new scope is private. With pub use (called re-exporting), external code can also use that name.

  • Note that the standard library (std) is also a crate that’s external to our package. Because the standard library is shipped with the Rust language, we don’t need to change Cargo.toml to include std. But we do need to refer to it with use to bring items from there into our package’s scope.

  • Nested paths make long use lists shorter.

// old
//use std::cmp::Ordering;
//use std::io;

// New!
use std::{cmp::Ordering, io};

// How about this?
//use std::io;
//use std::io::Write;

// Here!
use std::io::{self, Write};
  • If we want to bring all public items defined in a path into scope, we can specify that path followed by *, the glob operator:
use std::collections::*;

The glob operator is often used when testing to bring everything under test into the tests module.

7.5 Separating Modules into Different Files

src/lib.rs

mod front_of_house;
pub use crate::front_of_house::hosting;
// --snip--

src/front_of_house.rs

pub mod hosting {
    pub fn add_to_waitlist() {
        // --snip--
    }
}

Using a semicolon after mod front_of_house rather than using a block tells Rust to load the contents of the module from another file with the same name as the module.

My note: a sample of a deeper file structure that also works

  1. src/lib.rs: pub use crate::front_of_house::hosting
  2. src/front_of_house.rs: pub mod hosting;
  3. src/front_of_house/hosting.rs: pub fn add_to_waitlist() {}

8. Common Collections

  • Collections: a number of very useful data structures included in Rust’s standard library.
  • The data these collections point to is stored on the heap, which means the amount of data does not need to be known at compile time and can grow or shrink as the program runs.
  • Three main collections: vector, string, hashmap

8.1 Storing Lists of Values with Vectors

Vector

  • Vec<T>
  • Vectors can only store values of the same type.
  • How to create a new empty vector:
    let v: Vec<i32> = Vec::new();
    
  • Rust can infer the type.
  • Rust provides the vec! macro for convenience. The macro will create a new vector that holds the values you give it.
    let v = vec![1, 2, 3];
    
  • Updating a vector (input a value to a vector) -> push
    let mut v = Vec::new();
    
    v.push(5);
    
  • A vector is freed when it goes out of scope. When the vector gets dropped, all of its contents are also dropped, meaning those integers it holds will be cleaned up.
  • There are two ways to read an element. &v[2] and v.get(2).
  • &v[2] gives a reference to the element (&T), and v.get(2) returns Option<&T>.
  • &v[100] will cause the program to panic when it references a nonexistent element (i.e. there is no 100th element in v). When the get method is passed an index that is outside the vector, it returns None without panicking.
  • You would use get method if accessing an element beyond the range of the vector happens occasionally under normal circumstances.

Sample code of v.get():

let v = vec![1, 2, 3, 4, 5];

let third: &i32 = &v[2];
println!("The third element is {}", third);

match v.get(2) {
    Some(third) => println!("The third element is {}", third),
    None => println!("There is no third element."),
}
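My sketch of the out-of-range case with get. describe_element is a hypothetical helper; it shows that an invalid index becomes the None arm instead of a panic.

```rust
// Returns a message instead of panicking when idx is out of range.
fn describe_element(v: &[i32], idx: usize) -> String {
    match v.get(idx) {
        Some(value) => format!("element {} is {}", idx, value),
        None => format!("there is no element {}", idx),
    }
}
```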
  • Mutability of elements: The following code returns a compile error at the line v.push(6);.
    let mut v = vec![1, 2, 3, 4, 5];
    let first = &v[0]; //immutable borrow
    v.push(6); //mutable borrow
    println!("The first element is: {}", first); // immutable borrow
    
    • Details about the error: Adding a new element onto the end of the vector might require allocating new memory and copying the old elements to the new space, if there isn’t enough room to put all the elements next to each other where the vector currently is. In that case, the reference to the first element would be pointing to deallocated memory. The borrowing rules prevent programs from ending up in that situation.
  • Note: the push and pop methods operate on the last element of the vector.
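My sketch of one way around the borrow error: copy the first element out before pushing, so no reference into the vector is alive when push takes its mutable borrow. first_then_push is a hypothetical helper, and this only works because i32 is cheap to copy.

```rust
// Copies the first element out, then pushes. The copy is an owned
// value, so it does not keep v borrowed while push runs.
fn first_then_push(v: &mut Vec<i32>, new_value: i32) -> Option<i32> {
    let first = v.first().copied(); // Option<i32>, not Option<&i32>
    v.push(new_value);
    first
}
```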

Iterating over the Values in a Vector

// Just referencing
let v = vec![100, 32, 57];
for i in &v {
    println!("{}", i);
}

// Change elements
let mut v = vec![100, 32, 57];
for i in &mut v {
    *i += 50;
}

The * in *i is the dereference operator. (Details are in Chapter 15)

There are definitely use cases for needing to store a list of items of different types. -> enum!!

enum SpreadsheetCell {
    Int(i32),
    Float(f64),
    Text(String),
}

let row = vec![
    SpreadsheetCell::Int(3),
    SpreadsheetCell::Text(String::from("blue")),
    SpreadsheetCell::Float(10.12),
];
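My sketch of reading such a row back: because every element is a SpreadsheetCell, a match inside the loop handles each variant. sum_numeric is a hypothetical function that adds up the numeric cells and skips text.

```rust
enum SpreadsheetCell {
    Int(i32),
    Float(f64),
    Text(String),
}

// Sums Int and Float cells as f64; Text cells are ignored.
fn sum_numeric(row: &[SpreadsheetCell]) -> f64 {
    let mut total = 0.0;
    for cell in row {
        match cell {
            SpreadsheetCell::Int(i) => total += f64::from(*i),
            SpreadsheetCell::Float(f) => total += *f,
            SpreadsheetCell::Text(_) => {}
        }
    }
    total
}
```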

8.2 Storing UTF-8 Encoded Text with Strings

  • Rust has only one string type in the core language, which is the string slice str that is usually seen in its borrowed form &str.
  • When Rustaceans refer to “strings” in Rust, they usually mean the String and the string slice &str types, not just one of those types.
  • Both String and a string slice &str are UTF-8 encoded.
  • We use the to_string method, which is available on any type that implements the Display trait, as string literals do.
  • Using the to_string method to create a String from a string literal.
    // the method works on a literal directly:
    let s = "initial contents".to_string();
    // same as
    let s = String::from("initial contents");
    
  • We can grow a String by using the push_str method to append a string.
    let mut s = String::from("foo");
    s.push_str("bar");
    // s ~ "foobar"
    
  • The push_str method takes a string slice because we don’t necessarily want to take ownership of the parameter. Therefore, the following code prints s2 is bar rather than failing to compile.
    let mut s1 = String::from("foo");
    let s2 = "bar";
    s1.push_str(s2); // push_str() doesn't take ownership of s2
    println!("s2 is {}", s2);
    

Concatenation with the + Operator or the format! Macro

The following short code packs in a lot of knowledge.

let s1 = String::from("Hello, ");
let s2 = String::from("world!");
let s3 = s1 + &s2;

Before discussing the code above, we should know that the + operator uses the add method, whose “signature” looks something like this (but isn’t exact):

fn add(self, s: &str) -> String {

Two points about let s3 = s1 + &s2;

  1. s1 is moved into add as self, so s1 can’t be used afterwards; s3 owns the resulting String.
  2. The + operator uses the add method, whose input is &str, not &String. The reason we’re able to use &s2 in the call to add is that the compiler can coerce the &String argument into a &str. When we call the add method, Rust uses a deref coercion, which here turns &s2 into &s2[..].
  • Tip: Append multiple Strings. With format! macro.
    let s1 = String::from("tic");
    let s2 = String::from("tac");
    let s3 = String::from("toe");
    
    let s = format!("{}-{}-{}", s1, s2, s3);
    

    The version of the code using format! is much easier to read and doesn’t take ownership of any of its parameters. format! macro works in the same way as println!, but instead of printing the output to the screen, it returns a String with the contents.
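My sketch comparing the two approaches in terms of ownership. concat_plus and concat_format are hypothetical helpers: the first takes its left argument by value (as + does), the second only borrows.

```rust
// Concatenates with +: `left` is moved in, `right` is only borrowed.
// Callers can pass a &String for `right` thanks to deref coercion.
fn concat_plus(left: String, right: &str) -> String {
    left + right
}

// Concatenates with format!: both inputs are only borrowed.
fn concat_format(left: &str, right: &str) -> String {
    format!("{}{}", left, right)
}
```

After calling concat_plus(s1, &s2), s1 is gone but s2 is still usable; after concat_format(&s1, &s2), both remain usable.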

Indexing into Strings

Rust doesn’t allow us to get the n-th character with an index. The following code returns a compile error.

let s1 = String::from("hello");
let h = s1[0];

The reason is…?

  • A String is a wrapper over a Vec<u8>.
    • Both String and a string slice &str are UTF-8 encoded.
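  • In UTF-8, one character can take several bytes, and in some scripts what a reader sees as one letter is made of several Unicode scalar values, so a byte index does not map to “the n-th character”. For example, the word “नमस्ते”:
    // The u8 values of the String
    [224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164, 224, 165, 135]
    // the same data as Unicode scalar values (char)
    ['न', 'म', 'स', '्', 'त',  'े']
    // the same data as letters (grapheme clusters)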
    ["न", "म", "स्", "ते"]
    

Slicing Strings

Example 1. Specifying a range in bytes.

let hello = "Здравствуйте";
let s = &hello[0..4];
// s will be Зд
// &hello[0..1] panics at runtime:
// thread 'main' panicked at 'byte index 1 is not a char boundary;

Example 2. Iterating by characters.

for c in "नमस्ते".chars() {
    println!("{}", c);
}
// न
// म
// स
// ्
// त
// े

Example 3. Iterating by bytes.

for b in "नमस्ते".bytes() {
    println!("{}", b);
}
//224
//164
// --snip--
//165
//135

Be sure to remember that valid Unicode scalar values may be made up of more than 1 byte.
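My sketch putting the three examples together. byte_and_char_counts and safe_slice are hypothetical helpers; safe_slice uses str::get, which returns None instead of panicking when a byte range does not fall on char boundaries.

```rust
// Returns (number of bytes, number of Unicode scalar values).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

// Slices by byte range without panicking: None if the range
// is out of bounds or cuts a character in the middle.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

For “Здравствуйте” each letter takes 2 bytes, so the range 0..4 is valid but 0..1 is not.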

8.3 Storing Keys with Associated Values in Hash Maps

  • Terminology: hash ~ map ~ hash table ~ dictionary ~ associative array
  • Hashmap ~ key-value

Example:

use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
  • The type HashMap<K, V> stores a mapping of keys of type K to values of type V.
  • Just like vectors, hash maps store their data on the heap.
  • Like vectors, hash maps are homogeneous: all of the keys must have the same type, and all of the values must have the same type.
  • insert takes ownership of owned values such as String (types that implement Copy, like i32, are copied instead).

Example: Combining two Vec into a HashMap.

use std::collections::HashMap;

let teams = vec![String::from("Blue"), String::from("Yellow")];
let initial_scores = vec![10, 50];

let mut scores: HashMap<_, _> =
    teams.into_iter().zip(initial_scores.into_iter()).collect();

Accessing Values in a Hash Map

Done with the get method.

use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

let team_name = String::from("Blue");
let score = scores.get(&team_name);

Note that the result of scores.get(&team_name) is Some(&10) because get returns an Option<&V>; if there’s no value for that key in the hash map, get will return None.
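My sketch of turning the Option<&V> from get into a plain value. score_of is a hypothetical helper, and using 0 for an unknown team is my assumption for the example.

```rust
use std::collections::HashMap;

// Looks up a team's score; returns 0 when the team is not in the map.
fn score_of(scores: &HashMap<String, i32>, team: &str) -> i32 {
    // get gives Option<&i32>; copied turns it into Option<i32>.
    scores.get(team).copied().unwrap_or(0)
}
```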

Iteration

use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

for (key, value) in &scores {
    println!("{}: {}", key, value);
}

Update a value (3 types)

Case 1. Overwriting a value. Use insert again: each key is unique in a HashMap, so inserting with an existing key replaces the old value.

use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Blue"), 25);

println!("{:?}", scores); // {"Blue": 25}

Case 2. Only inserting a value if the key has no value. or_insert method.

use std::collections::HashMap;

let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);

scores.entry(String::from("Yellow")).or_insert(50);
scores.entry(String::from("Blue")).or_insert(50);

println!("{:?}", scores); // {"Yellow": 50, "Blue": 10}

entry method returns an enum called Entry that represents a value that might or might not exist.

Case 3. Updating a value based on the old value. This uses the dereference operator * (details are in Chap. 15; for now, just notice it).

use std::collections::HashMap;

let text = "hello world wonderful world";

let mut map = HashMap::new();

for word in text.split_whitespace() {
    let count = map.entry(word).or_insert(0);
    *count += 1;
}

println!("{:?}", map);

The or_insert method actually returns a mutable reference (&mut V) to the value for this key. Here we store that mutable reference in the count variable, so in order to assign to that value, we must first dereference count using the asterisk (*).
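The same word-count loop wrapped in a function, so the result can be checked. word_counts is a name I picked for illustration.

```rust
use std::collections::HashMap;

// Counts how many times each whitespace-separated word appears.
fn word_counts(text: &str) -> HashMap<&str, i32> {
    let mut map = HashMap::new();
    for word in text.split_whitespace() {
        // or_insert returns &mut i32 for this key (inserting 0 first if needed).
        let count = map.entry(word).or_insert(0);
        *count += 1;
    }
    map
}
```

For the text above this gives hello: 1, world: 2, wonderful: 1 (the printing order of a HashMap is not fixed).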

Hashing Functions

By default, HashMap uses a hashing function called SipHash (as of Apr. 2021).

My note: a slide about SipHash.

https://de.slideshare.net/ASF-WS/asfws2012-jean-philippeaumassonmartinbosslethashfloodingdosreloaded1

9. Error Handling

Rust groups errors into two major categories: recoverable and unrecoverable errors.

  • For a recoverable error, such as a file not found error, it’s reasonable to report the problem to the user and retry the operation.
  • Unrecoverable errors are always symptoms of bugs, like trying to access a location beyond the end of an array.

Rust doesn’t have exceptions. Instead, it has the type Result<T, E> for recoverable errors and the panic! macro that stops execution when the program encounters an unrecoverable error.

9.1 Unrecoverable Errors with panic!

  • When the panic! macro executes, your program will print a failure message, unwind and clean up the stack, and then quit.

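There are two ways Rust can respond to a panic: unwinding and aborting.

  • Unwinding (the default): Rust walks back up the stack and cleans up the data from each function it encounters.
  • Abort: The program ends immediately without cleaning up. Memory that the program was using will then need to be cleaned up by the operating system.

Walking back up the stack and cleaning up is a lot of work. Aborting is the alternative; you can switch to it by adding panic = 'abort' to the appropriate [profile] section in Cargo.toml.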

Panic example: Buffer overread

fn main() {
    let v = vec![1, 2, 3];

    v[99];
}

To see a backtrace, run the program with RUST_BACKTRACE=1 cargo run. The key to reading the backtrace is to start from the top and read until you see files you wrote.
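When an out-of-range index is a possibility rather than a bug, the get method returns an Option instead of panicking. A small sketch; the helper name element_or_default and the fallback value -1 are arbitrary choices for this example:

```rust
// Hypothetical helper: read an element without risking a panic.
fn element_or_default(v: &[i32], index: usize) -> i32 {
    // get returns None instead of panicking when index is out of range.
    match v.get(index) {
        Some(value) => *value,
        None => -1, // arbitrary fallback chosen for this sketch
    }
}

fn main() {
    let v = vec![1, 2, 3];
    println!("{}", element_or_default(&v, 1));
    println!("{}", element_or_default(&v, 99));
}
```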

9.2 Recoverable Errors with Result

Recall Result enum.

enum Result<T, E> {
    Ok(T),
    Err(E),
}
  • <T, E> means “T and E are generic type parameters”.

A good error handling example: Open file.

use std::fs::File;

fn main() {
    let f = File::open("hello.txt");

    let f = match f {
        Ok(file) => file,
        Err(error) => panic!("Problem opening the file: {:?}", error),
    };
}

Run without the file hello.txt.

$ cargo run
... (warning about _f)
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/panic`
thread 'main' panicked at 'Problem opening the file: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:8:23
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Example: handling different kinds of errors in different ways.

use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let f = File::open("hello.txt");

    let f = match f {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound => match File::create("hello.txt") {
                Ok(fc) => fc,
                Err(e) => panic!("Problem creating the file: {:?}", e),
            },
            other_error => {
                panic!("Problem opening the file: {:?}", other_error)
            }
        },
    };
}

The following version is more sophisticated because it has no match expressions.

unwrap_or_else: Implemented on Result<T, E> (and also on Option<T>). On a Result, it returns the contained Ok value or computes a value from a closure that receives the error.

use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let f = File::open("hello.txt").unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            File::create("hello.txt").unwrap_or_else(|error| {
                panic!("Problem creating the file: {:?}", error);
            })
        } else {
            panic!("Problem opening the file: {:?}", error);
        }
    });
}

unwrap <- Used frequently (IMO)

The Result<T, E> type has many helper methods defined on it to do various tasks. One of those methods, called unwrap, is a shortcut method that is implemented just like the match expression. If the Result value is the Ok variant, unwrap will return the value inside the Ok. If the Result is the Err variant, unwrap will call the panic! macro for us.

use std::fs::File;

fn main() {
    let f = File::open("hello.txt").unwrap();
}

expect

Similar to unwrap, but it lets us also choose the panic! error message.

use std::fs::File;

fn main() {
    let f = File::open("hello.txt").expect("Failed to open hello.txt");
}

? operator

use std::fs::File;
use std::io;
use std::io::Read;

fn read_username_from_file() -> Result<String, io::Error> {
    let mut f = File::open("hello.txt")?;
    let mut s = String::new();
    f.read_to_string(&mut s)?;
    Ok(s)
}

The ? placed after a Result value works as follows:

  • If the value of the Result is an Ok, the value inside the Ok will get returned from this expression, and the program will continue.
  • If the value is an Err, the Err will be returned from the whole function so the error value gets propagated to the calling code.

Error values that have the ? operator called on them go through the from function, defined in the From trait in the standard library, which is used to convert errors from one type into another.
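To make the role of From concrete, here is a sketch in which one function uses ? on two different error types and both are converted into a single custom error. The names ConfigError, read_number, and number.txt are assumptions invented for this example:

```rust
use std::fs;
use std::io;
use std::num::ParseIntError;

// Hypothetical error type that wraps the two errors we can hit.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::Io(e)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

// Hypothetical function: read a file that should contain one integer.
fn read_number(path: &str) -> Result<i32, ConfigError> {
    // io::Error is converted to ConfigError through From.
    let text = fs::read_to_string(path)?;
    // ParseIntError is converted to ConfigError through From.
    let n = text.trim().parse::<i32>()?;
    Ok(n)
}

fn main() {
    match read_number("number.txt") {
        Ok(n) => println!("number = {}", n),
        Err(e) => println!("error: {:?}", e),
    }
}
```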

We’re only allowed to use the ? operator in a function that returns Result or Option or another type that implements std::ops::Try. When you’re writing code in a function that doesn’t return one of these types, and you want to use ? when you call other functions that return Result<T, E>, one technique is to change the return type of your function to be Result<T, E> if you have no restrictions preventing that.
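The ? operator behaves the same way on Option: a None is returned early from the function, and a Some is unwrapped. A small sketch of this; the function name is just an illustrative choice:

```rust
// Returns the last character of the first line, if there is one.
fn last_char_of_first_line(text: &str) -> Option<char> {
    // lines().next() is None for empty text, and ? returns that None early.
    text.lines().next()?.chars().last()
}

fn main() {
    println!("{:?}", last_char_of_first_line("Hello\nworld"));
    println!("{:?}", last_char_of_first_line(""));
}
```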

The main function is special, and there are restrictions on what its return type must be. One valid return type for main is (), and conveniently, another valid return type is Result<T, E>.

use std::error::Error;
use std::fs::File;

fn main() -> Result<(), Box<dyn Error>> {
    let f = File::open("hello.txt")?;

    Ok(())
}

For now, you can read Box<dyn Error> to mean “any kind of error.”

Tip: Reading a file into a string

Rust provides the convenient fs::read_to_string function that opens the file, creates a new String, reads the contents of the file, puts the contents into that String, and returns it.

use std::fs;
use std::io;

fn read_username_from_file() -> Result<String, io::Error> {
    fs::read_to_string("hello.txt")
}

9.3 To panic! or Not to panic!

Returning Result is a good default choice when you’re defining a function that might fail. (My note: the caller can then decide how to handle the error, whereas panic! stops the program!)

The unwrap and expect methods are very handy when prototyping, before you’re ready to decide how to handle errors.

In tests, panic! is how a test is marked as a failure. (My note: a single panic fails the whole test.)

panic! is often appropriate if you’re calling external code that is out of your control and it returns an invalid state that you have no way of fixing. However, when failure is expected, it’s more appropriate to return a Result than to make a panic! call.

Functions often have contracts: their behavior is only guaranteed if the inputs meet particular requirements. Panicking when the contract is violated makes sense because a contract violation always indicates a caller-side bug and it’s not a kind of error you want the calling code to have to explicitly handle. … Contracts for a function, especially when a violation will cause a panic, should be explained in the API documentation for the function.

My note: for validation, use Rust’s type system.

Creating Custom Types for Validation

We can make a new type and put the validations in a function to create an instance of the type rather than repeating the validations everywhere. That way, it’s safe for functions to use the new type in their signatures and confidently use the values they receive.

Example:

pub struct Guess {
    value: i32,
}

impl Guess {
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {}.", value);
        }

        Guess { value }
    }

    pub fn value(&self) -> i32 {
        self.value
    }
}

pub fn value(&self) -> i32 is called a getter. This public method is necessary because the value field of the Guess struct is private.
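If an out-of-range value is expected input rather than a caller bug, the same validation can return a Result instead of panicking. The constructor name try_new and the String error type below are assumptions for this sketch, not part of the original example:

```rust
pub struct Guess {
    value: i32,
}

impl Guess {
    // Hypothetical alternative to new: report invalid input instead of panicking.
    pub fn try_new(value: i32) -> Result<Guess, String> {
        if value < 1 || value > 100 {
            return Err(format!(
                "Guess value must be between 1 and 100, got {}.",
                value
            ));
        }
        Ok(Guess { value })
    }

    pub fn value(&self) -> i32 {
        self.value
    }
}

fn main() {
    match Guess::try_new(200) {
        Ok(g) => println!("valid guess: {}", g.value()),
        Err(msg) => println!("invalid guess: {}", msg),
    }
}
```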


Should be reviewed from here.

10. Generic Types, Traits, and Lifetimes

Generics are abstract stand-ins for concrete types or other properties.

Similar to the way a function takes parameters with unknown values to run the same code on multiple concrete values, functions can take parameters of some generic type instead of a concrete type, like i32 or String.

The core concept is “Removing Duplication by Extracting a Function.”

In case of a function:

  1. Identify duplicate code.
  2. Extract the duplicate code into the body of the function and specify the inputs and return values of that code in the function signature.
  3. Update the two instances of duplicated code to call the function instead.

10.1 Generic Data Types

Tips: By convention, type parameter names in Rust are short, often just a letter, and Rust’s type-naming convention is CamelCase. Short for “type,” T is the default choice of most Rust programmers.

Motivation

Practice: We combine the two functions below.

fn largest_i32(list: &[i32]) -> &i32 {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn largest_char(list: &[char]) -> &char {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

First, define a generic function.

fn largest<T>(list: &[T]) -> &T {
  • To define a generic function, place type name declarations inside angle brackets, <>
  • This function has one parameter named list.
  • The list is a slice of values of type T

Example

fn largest<T>(list: &[T]) -> &T {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    let result = largest(&number_list);
    println!("The largest number is {}", result);

    let char_list = vec!['y', 'm', 'a', 'q'];

    let result = largest(&char_list);
    println!("The largest char is {}", result);
}

It looks fine, but unfortunately, it fails to compile.

error[E0369]: binary operation `>` cannot be applied to type `&T`
 --> src/main.rs:5:17
  |
5 |         if item > largest {
  |            ---- ^ ------- &T
  |            |
  |            &T
  |
help: consider restricting type parameter `T`
  |
1 | fn largest<T: std::cmp::PartialOrd>(list: &[T]) -> &T {
  |             ^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

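The root cause is that the body of largest uses the > operator, which works only on types that implement the trait std::cmp::PartialOrd. Because T could be any type, the compiler can’t assume the comparison is possible.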

The final answer would be as follows, which is covered in the next section.

fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
    let mut largest = list[0];

    for &item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

In Struct Definitions

We can define structs to use a generic type parameter in one or more fields using the <> syntax.

struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}

To define a Point struct where x and y are both generics but could have different types…

struct Point<T, U> {
    x: T,
    y: U,
}

In Enum Definitions

Recall Option from Chapter 6.

enum Option<T> {
    Some(T),
    None,
}

Recall Result from Chapter 9.

enum Result<T, E> {
    Ok(T),
    Err(E),
}

When we use generic types

When you recognize situations in your code with multiple struct or enum definitions that differ only in the types of the values they hold, you can avoid duplication by using generic types instead.

Implementation (In Method Definitions)

impl<T>. By declaring T as a generic type after impl, Rust can identify that the type in the angle brackets in Point is a generic type rather than a concrete type.

struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

fn main() {
    let p = Point { x: 5, y: 10 };

    println!("p.x = {}", p.x());
}

This defines a method named x on Point<T> that returns a reference to the data in the field x.

When we write impl Point<f32> instead, the methods are implemented only for Point<f32>, not for Point<T> with any other type T.
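A sketch of such a type-specific impl block: the method distance_from_origin exists only on Point<f32>, because it relies on floating-point operations that an arbitrary T does not have.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available only on Point<f32>, not on Point<i32> or other instances.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let p = Point { x: 3.0_f32, y: 4.0_f32 };
    println!("distance = {}", p.distance_from_origin());
}
```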

Performance of Code Using Generics

The good news is that Rust implements generics in such a way that your code doesn’t run any slower using generic types than it would with concrete types.

Monomorphization

Monomorphization is the process of turning generic code into specific code by filling in the concrete types that are used when compiled.

For example, when Rust compiles the following code, it performs monomorphization.

let integer = Some(5);
let float = Some(5.0);
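Conceptually, the compiler replaces the generic Option<T> with one specialized definition per concrete type that is used. The sketch below imitates that by hand; the names Option_i32 and Option_f64 are purely illustrative and are not what the compiler actually emits:

```rust
// Illustrative only: roughly what monomorphization produces for
// Option<i32> and Option<f64>. The real generated names differ.
#[allow(non_camel_case_types, dead_code)]
#[derive(Debug)]
enum Option_i32 {
    Some(i32),
    None,
}

#[allow(non_camel_case_types, dead_code)]
#[derive(Debug)]
enum Option_f64 {
    Some(f64),
    None,
}

fn main() {
    let integer = Option_i32::Some(5);
    let float = Option_f64::Some(5.0);
    println!("{:?} {:?}", integer, float);
}
```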

10.2 Traits: Defining Shared Behavior

A trait tells the Rust compiler about functionality a particular type has and can share with other types.

pub trait Summary {
    fn summarize(&self) -> String;
}

Interpret as “any type that has the Summary trait will have the method summarize.”

Implementing the trait on a type

pub struct NewsArticle {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}

impl Summary for NewsArticle {
    fn summarize(&self) -> String {
        format!("{}, by {} ({})", self.headline, self.author, self.location)
    }
}

How to use traits to define functions that accept many different types.

pub fn notify(item: &impl Summary) {
    println!("Breaking news! {}", item.summarize());
}

Instead of a concrete type for the item parameter, we specify the impl keyword and the trait name. This parameter accepts any type that implements the specified trait.

Trait Bound Syntax

The above is actually syntax sugar for a longer form,

pub fn notify<T: Summary>(item: &T) {
    println!("Breaking news! {}", item.summarize());
}

Multiple parameters.

// item1 and item2 can be different types
pub fn notify(item1: &impl Summary, item2: &impl Summary)
// item1 and item2 must be the same type
pub fn notify<T: Summary>(item1: &T, item2: &T)

Specifying Multiple Trait Bounds with the + Syntax

We specify in the notify definition that item must implement both Display and Summary. We can do so using the + syntax:

pub fn notify(item: &(impl Summary + Display)) {...
//or
pub fn notify<T: Summary + Display>(item: &T) {...

where clause

More readable, less cluttered.

fn some_function<T, U>(t: &T, u: &U) -> i32
    where T: Display + Clone,
          U: Clone + Debug
{

// is equal to
fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {

Returning Types that Implement Traits

fn returns_summarizable() -> impl Summary {
    Tweet {
        username: String::from("horse_ebooks"),
        content: String::from(
            "of course, as you probably already know, people",
        ),
        reply: false,
        retweet: false,
    }
}

By using impl Summary for the return type, we specify that the returns_summarizable function returns some type that implements the Summary trait without naming the concrete type.

However, you can only use impl Trait if you’re returning a single type.
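To illustrate the restriction: a function whose branches return two different concrete types cannot use impl Summary as its return type. One common workaround, covered later in the book under trait objects, is to return Box<dyn Summary>. The types Article and Post and the function pick below are assumptions made up for this sketch:

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

// Two hypothetical types that both implement Summary.
struct Article {
    headline: String,
}

struct Post {
    content: String,
}

impl Summary for Article {
    fn summarize(&self) -> String {
        format!("Article: {}", self.headline)
    }
}

impl Summary for Post {
    fn summarize(&self) -> String {
        format!("Post: {}", self.content)
    }
}

// With `-> impl Summary` this would not compile, because the two branches
// return different concrete types. A boxed trait object allows it.
fn pick(long_form: bool) -> Box<dyn Summary> {
    if long_form {
        Box::new(Article {
            headline: String::from("Penguins win!"),
        })
    } else {
        Box::new(Post {
            content: String::from("hello"),
        })
    }
}

fn main() {
    println!("{}", pick(true).summarize());
    println!("{}", pick(false).summarize());
}
```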

A simple example of trait

Here is the solution to the problem that arose in section 10.1.

fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
    let mut largest = list[0];

    for &item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    let result = largest(&number_list);
    println!("The largest number is {}", result);

    let char_list = vec!['y', 'm', 'a', 'q'];

    let result = largest(&char_list);
    println!("The largest char is {}", result);
}
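The Copy bound is needed only because this version returns a T by value. An alternative sketch returns a reference instead, so PartialOrd alone is enough and the function also works for non-Copy types such as String. Like the original, it panics on an empty slice:

```rust
// Alternative sketch: return a reference, so T needs PartialOrd but not Copy.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let words = vec![String::from("pear"), String::from("apple")];
    println!("The largest word is {}", largest(&words));
}
```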

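Implementations of a trait on any type (blanket implementations)

We can implement a trait for any type that satisfies a trait bound, including custom types (a struct, an enum, and so on). For example, the standard library implements ToString for every type T that implements Display.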

impl<T: Display> ToString for T {
    // --snip--
}
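The same pattern works for our own traits. In the sketch below, the trait Shout is a made-up example; its blanket implementation gives the shout method to every type that implements Display:

```rust
use std::fmt::Display;

// Hypothetical trait for this sketch.
trait Shout {
    fn shout(&self) -> String;
}

// Blanket implementation: every type that implements Display gets shout.
impl<T: Display> Shout for T {
    fn shout(&self) -> String {
        format!("{}!", self.to_string().to_uppercase())
    }
}

fn main() {
    println!("{}", 42_i32.shout());
    println!("{}", String::from("hello").shout());
}
```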

My note: Trait, associated function, method

#![allow(unused)]
fn main() {
    struct Example {
        number: i32,
    }

    impl Example {
        fn boo() {
            println!("boo! Example::boo() was called!");
        }

        fn add_number(&mut self) {
            self.number += 1;
        }

        fn get_number(&self) -> i32 {
            self.number
        }
    }

    trait Thingy {
        fn do_thingy(&self);
    }

    impl Thingy for Example {
        fn do_thingy(&self) {
            println!("doing a thing! also, number is {}!", self.number);
        }
    }

    // Test it
    let mut dummy = Example{number: 2};
    Example::boo(); // boo! Example::boo() was called!
    println!("A number of the instance dummy is {:?}",dummy.get_number()); // A number of the instance dummy is 2
    dummy.do_thingy(); // doing a thing! also, number is 2!
    //dummy.boo(); // error! boo is an associated function, not a method, so it can't be called on an instance
}

Traits provide us total abstraction and loose coupling.

10.3 Validating References with Lifetimes

Every reference in Rust has a lifetime, which is the scope for which that reference is valid.

Dangling reference: a reference to an object that no longer exists.

The simplest example: in the code below, r is a dangling reference by the time println!("r: {}", r); runs.

fn main() {
    {
        let r;                // ---------+-- 'a
                              //          |
        {                     //          |
            let x = 5;        // -+-- 'b  |
            r = &x;           //  |       |
        }                     // -+       |
                              //          |
        println!("r: {}", r); //          |
    }                         // ---------+
}

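'a and 'b are the lifetimes of r and x, respectively. Because the scope of r ('a) is larger, we say that it “lives longer.” The compiler rejects this code because r refers to x, whose lifetime 'b ends before r is used.

The following code fails to compile.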

fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The longest function could return x or y. If you use it like let result = longest(string1, string2);, the compiler can’t tell whether result borrows from string1 or string2, so it can’t determine how long result stays valid.

The reason is, the Rust compiler has a borrow checker that compares scopes to determine whether all borrows are valid. The borrow checker doesn’t know how the lifetimes of x and y relate to the lifetime of the return value of the function longest.

How can we fix it?

Lifetime Annotation Syntax

The names of lifetime parameters must start with an apostrophe (') and are usually all lowercase and very short. Most people use the name 'a. We place lifetime parameter annotations after the & of a reference,

&i32        // a reference
&'a i32     // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime

The annotations are meant to tell Rust how generic lifetime parameters of multiple references relate to each other. (Multiple references are the point: a single annotation by itself says very little.) With this notation, we can specify that x, y, and the return value share the same lifetime 'a, as follows.

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The change means “all the references in the parameters and the return value must have the same lifetime.” In practice, it means that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the references passed in. Remember, when we specify the lifetime parameters in this function signature, we’re not changing the lifetimes of any values passed in or returned. Rather, we’re specifying that the borrow checker should reject any values that don’t adhere to these constraints.

Ultimately, lifetime syntax is about connecting the lifetimes of various parameters and return values of functions. Once they’re connected, Rust has enough information to allow memory-safe operations and disallow operations that would create dangling pointers or otherwise violate memory safety.
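A usage sketch of the annotated longest function, with the two arguments living in different scopes. The result can be used only while both inputs are still valid, which here means inside the inner block:

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("long string is long");

    {
        let string2 = String::from("xyz");
        // result is valid only while both string1 and string2 are valid.
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {}", result);
    }
    // Using result here would not compile: string2 has already been dropped.
}
```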

Structs that hold references always need lifetime annotations. Functions that use references need them whenever the compiler can’t infer the lifetimes through the elision rules.

Lifetime Elision

Some reference patterns are so common and predictable that the Rust developers programmed them into the compiler’s code, so the borrow checker can infer the lifetimes in these situations without explicit annotations. The patterns programmed into Rust’s analysis of references are called the lifetime elision rules.

Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.

The 3 rules of the elision:

  1. Each parameter that is a reference gets its own lifetime parameter. A function with one parameter gets one lifetime parameter, and a function with two parameters gets two separate lifetime parameters.
  2. If there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters.
  3. If there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, the lifetime of self is assigned to all output lifetime parameters.

Example of rules 1 and 2:

fn first_word(s: &str) -> &str {
// Apply rule 1. Same as
fn first_word<'a>(s: &'a str) -> &str {
// Apply rule 2. Same as
fn first_word<'a>(s: &'a str) -> &'a str {

When we implement methods on a struct with lifetimes, we use the same syntax as that of generic type parameters.

My example

src/main.rs:

struct ImportantExcerpt<'a> {
    part: &'a str,
}

impl<'a> ImportantExcerpt<'a> {
    fn level(&self) -> i32 {
        3
    }
}

impl<'a> ImportantExcerpt<'a> {
    fn announce_and_return_part(&self, announcement: &str) -> &str {
        println!("Attention please: {}", announcement);
        self.part
    }
}

fn main () {
    let s1 = String::from("test1");
    let mut s2 = String::from("test2");
    let a = ImportantExcerpt{
            part: s1.as_str()
        };

    println!("{}",a.part);
    a.announce_and_return_part(s2.as_str());
    s2 = String::from("new test2");
    println!("{}",s2);
    a.announce_and_return_part(s2.as_str());
}

And result:

$ cargo run
   Compiling panic v0.1.0 (/home/atle00/rust-projects/panic)
warning: associated function is never used: `level`
 --> src/main.rs:6:8
  |
6 |     fn level(&self) -> i32 {
  |        ^^^^^
  |
  = note: `#[warn(dead_code)]` on by default

warning: 1 warning emitted

    Finished dev [unoptimized + debuginfo] target(s) in 0.26s
     Running `target/debug/panic`
test1
Attention please: test2
new test2
Attention please: new test2

The Static Lifetime

One special lifetime we need to discuss is 'static, which means that this reference can live for the entire duration of the program. All string literals have the 'static lifetime,

let s: &'static str = "I have a static lifetime.";
// Same as
let s = "I have a static lifetime.";

The text of this string is stored directly in the program’s binary, which is always available. Therefore, the lifetime of all string literals is 'static.

Generic Type Parameters, Trait Bounds, and Lifetimes Together

Just an example:

use std::fmt::Display;

fn longest_with_an_announcement<'a, T>(
    x: &'a str,
    y: &'a str,
    ann: T,
) -> &'a str
where
    T: Display,
{
    println!("Announcement! {}", ann);
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

11. Writing Automated Tests

11.1 How to Write Tests

The body of a test function typically performs these three actions:

  1. Set up any needed data or state.
  2. Run the code you want to test.
  3. Assert the results are what you expect.

Attribute

Attributes are metadata about pieces of Rust code. For example, derive is one of the attributes.

#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

To change a function into a test function, add #[test] on the line before fn. To test, run cargo test. When we make a new library project with Cargo, a test module with a test function in it is automatically generated for us.

#[test] annotation

This is the default test file.

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        assert_eq!(2 + 2, 4);
    }
}

#[test] attribute indicates fn it_works is a test function.

Run the test:

$ cargo test
    Finished test [unoptimized + debuginfo] target(s) in 0.00s
     Running target/debug/deps/adder-6f6d09e2972de52b

running 1 test
test tests::it_works ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests adder

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

tests::it_works is the name of the generated test function. Note that measured counts benchmark tests.

Side note: Benchmarks in Rust

Because the benchmark feature isn’t available in the stable channel, you need the nightly toolchain if you want to use it.

https://doc.rust-lang.org/unstable-book/library-features/test.html

rustup install nightly

You need to install the nightly channel; otherwise you’ll get an error like,

$ cargo bench
   Compiling adder v0.1.0 (/home/atlex00/rust-projects/adder)
error[E0554]: `#![feature]` may not be used on the stable release channel
 --> src/lib.rs:1:1
  |
1 | #![feature(test)]
  | ^^^^^^^^^^^^^^^^^

error: aborting due to previous error

src/lib.rs

#![feature(test)]

extern crate test;

pub fn add_two(a: i32) -> i32 {
    a + 2
}

#[cfg(test)]
mod tests {
    use super::*;
    use test::Bencher;

    #[test]
    fn it_works() {
        assert_eq!(4, add_two(2));
    }

    #[bench]
    fn bench_add_two(b: &mut Bencher) {
        b.iter(|| add_two(2));
    }
}

Run a benchmark:

$ cargo +nightly bench
   Compiling adder v0.1.0 (/home/atlex/rust-projects/adder)
    Finished bench [optimized] target(s) in 0.60s
     Running unittests (target/release/deps/adder-8d2056bd46123ee2)

running 2 tests
test tests::it_works ... ignored
test tests::bench_add_two ... bench:           0 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 1 ignored; 1 measured; 0 filtered out; finished in 1.08s

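Rust runs our benchmark a number of times, and then takes the average. A result of 0 ns/iter likely means the call was too cheap to measure after optimization; add_two(2) with a constant argument can be computed at compile time.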

About Doc-tests

We’ll learn about it in Chapter 14, but in a nutshell,

  • Triple slash /// is a special comment, called a documentation comment.
  • /// supports Markdown notation.
  • Code examples in documentation comments are run automatically as tests by cargo test.

assert! macro

We give the assert! macro an argument that evaluates to a Boolean. If the value is true, assert! does nothing and the test passes. If the value is false, the assert! macro calls the panic! macro, which causes the test to fail. You can pass a custom failure message as a second argument.
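A sketch of assert! with a custom message. Everything after the condition is passed to format!, so the message can include the actual value. The function greeting is a hypothetical function under test:

```rust
// Hypothetical function under test.
pub fn greeting(name: &str) -> String {
    format!("Hello {}!", name)
}

fn main() {
    println!("{}", greeting("Carol"));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn greeting_contains_name() {
        let result = greeting("Carol");
        // The arguments after the condition are shown only when the test fails.
        assert!(
            result.contains("Carol"),
            "Greeting did not contain name, value was `{}`",
            result
        );
    }
}
```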

assert_eq! and assert_ne!

Under the surface, the assert_eq! and assert_ne! macros use the operators == and !=, respectively. The values being compared must implement the PartialEq and Debug traits.
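For our own structs and enums, both traits can usually be derived. A minimal sketch, in which the helper square is an assumption made up for the example:

```rust
// PartialEq is needed for the == comparison, Debug for printing the
// values when the assertion fails.
#[derive(Debug, PartialEq)]
struct Rectangle {
    width: u32,
    height: u32,
}

// Hypothetical helper used as the code under test.
fn square(side: u32) -> Rectangle {
    Rectangle {
        width: side,
        height: side,
    }
}

fn main() {
    println!("{:?}", square(3));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn square_has_equal_sides() {
        assert_eq!(square(3), Rectangle { width: 3, height: 3 });
        assert_ne!(square(3), square(4));
    }
}
```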

Derivable Traits

https://doc.rust-lang.org/book/appendix-03-derivable-traits.html

The derive attribute generates code that will implement a trait with its own default implementation on the type you’ve annotated with the derive syntax.

should_panic attribute

This attribute makes a test pass if the code inside the function panics.

Example:

pub struct Guess {
    value: i32,
}

impl Guess {
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {}.", value);
        }

        Guess { value }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    #[should_panic]
    fn greater_than_100() {
        Guess::new(200);
    }
}

Tests that use should_panic can be imprecise because they only indicate that the code has caused some panic. Adding an expected parameter to the should_panic attribute makes the test more precise: the test passes only if the panic message contains the expected text as a substring.

...
        } else if value > 100 {
            panic!(
                "Guess value must be less than or equal to 100, got {}.",
                value
            );
        }
...
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    #[should_panic(expected = "Guess value must be less than or equal to 100")]
    fn greater_than_100() {
        Guess::new(200);
    }
}

Using Result<T, E> in Tests

We can also write tests that return Result<T, E>. The test below returns Ok(()) when it passes and an Err with a String inside when it fails.

  • Writing tests so they return a Result<T, E> enables you to use the question mark operator in the body of tests.
  • You can’t use the #[should_panic] annotation on tests that use Result<T, E>. Instead, you should return an Err value directly when the test should fail.

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() -> Result<(), String> {
        if 2 + 2 == 4 {
            Ok(())
        } else {
            Err(String::from("two plus two does not equal four"))
        }
    }
}
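The main benefit is the question mark operator. In the sketch below, the test fails automatically if the fallible call returns an Err. The function parse_and_double is a hypothetical function under test:

```rust
use std::num::ParseIntError;

// Hypothetical function under test.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

fn main() {
    println!("{:?}", parse_and_double("21"));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn doubles_a_number() -> Result<(), ParseIntError> {
        // If parsing fails, ? returns the Err and the test is marked as failed.
        let doubled = parse_and_double("21")?;
        assert_eq!(doubled, 42);
        Ok(())
    }
}
```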

11.2 Controlling How Tests Are Run

The default behavior of the binary produced by cargo test is to run all the tests in parallel and capture output generated during test runs, preventing the output from being displayed and making it easier to read the output related to the test results.

Because the tests are running at the same time, make sure your tests don’t depend on each other or on any shared state, including a shared environment, such as the current working directory or environment variables.

If you don’t want to run the tests in parallel, use the --test-threads option, like cargo test -- --test-threads=1. The -- here is a separator: arguments before it go to cargo test, and arguments after it go to the resulting test binary.

If we want to see printed values for passing tests as well, we can tell Rust to also show the output of successful tests at the end with --show-output.

We can pass the name of any test function to cargo test to run only that test: cargo test {{ the name of function }}, but we can’t specify the names of multiple tests in this way. We can specify part of a test name, and any test whose name matches that value will be run.

Sometimes a few specific tests can be very time-consuming to execute, so you might want to exclude them during most runs of cargo test. Use the ignore attribute. src/lib.rs:

#[test]
fn it_works() {
    assert_eq!(2 + 2, 4);
}

#[test]
#[ignore]
fn expensive_test() {
    // code that takes an hour to run
}

If we want to run only the ignored tests, we can use cargo test -- --ignored.

11.3 Test Organization (I should read again when I need it in my project)

The Rust community thinks about tests in terms of two main categories: unit tests and integration tests.

Unit Tests

The convention is to create a module named tests in each file to contain the test functions and to annotate the module with #[cfg(test)]. The #[cfg(test)] annotation tells Rust to compile and run the test code only when you run cargo test, not when you run cargo build, so the tests aren’t included in the compiled result.
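Because the tests module is a child of the module it tests, it can also reach private functions through use super::*. A small sketch along the lines of the book’s adder example; the function names are illustrative:

```rust
pub fn add_two(a: i32) -> i32 {
    internal_adder(a, 2)
}

// Private function: not part of the public API of this file.
fn internal_adder(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    println!("{}", add_two(2));
}

#[cfg(test)]
mod tests {
    // Brings everything from the parent module into scope,
    // including the private internal_adder.
    use super::*;

    #[test]
    fn internal() {
        assert_eq!(4, internal_adder(2, 2));
    }
}
```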

Integration Tests

To create integration tests, we first need a tests directory at the top level of our project directory, next to src. Cargo knows to look for integration test files in this directory. We don’t need to annotate any code in tests/integration_test.rs with #[cfg(test)].

Each file in the tests directory is a separate crate, so we need to bring our library into each test crate’s scope.

tests/integration_test.rs in a project adder.

use adder;

#[test]
fn it_adds_two() {
    assert_eq!(4, adder::add_two(2));
}

12. An I/O Project: Building a Command Line Program

In this tutorial, we write a clone of the grep command.

12.1 Accepting Command Line Arguments

  • The function std::env::args returns an iterator of the command line arguments.
  • We can call the collect method on an iterator to turn it into a collection (such as a vector).
  • Note: std::env::args will panic if any argument contains invalid Unicode. For invalid Unicode, use std::env::args_os instead.
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{:?}", args);
}

Result:

$ cargo run 1starg 2ndarg
   Compiling iptables_viewer v0.1.0 (/path/to/your/project)
    Finished dev [unoptimized + debuginfo] target(s) in 0.25s
     Running `target/debug/project-name 1starg 2ndarg`
["target/debug/project-name", "1starg", "2ndarg"]
  • The first value in the vector is target/debug/project-name, which is the name of our binary.
  • The first argument is referred to as &args[1] in the program.
  • Each element of args is a String, so &args[1] is a &String (which can be used wherever a &str is expected).
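Indexing with &args[1] panics when the program is run without arguments. A sketch that avoids this by using get; the helper name first_arg is an assumption for illustration:

```rust
use std::env;

// Hypothetical helper: return the first user-supplied argument, if any.
fn first_arg(args: &[String]) -> Option<&str> {
    // args[0] is the program name, so the first real argument is at index 1.
    args.get(1).map(|s| s.as_str())
}

fn main() {
    let args: Vec<String> = env::args().collect();

    match first_arg(&args) {
        Some(arg) => println!("first argument: {}", arg),
        None => println!("no arguments given"),
    }
}
```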

12.2 Reading a File

The following snippet will be refactored in section 12.3.

use std::fs;
let contents = fs::read_to_string(filename)
        .expect("Something went wrong reading the file");
println!("With text:\n{}", contents);
  • fs::read_to_string takes the filename, opens that file, and returns an io::Result<String> containing the file’s contents.

New topic from here.

12.3 Refactoring to Improve Modularity and Error Handling

I’ve learned general programming concepts in this chapter.

In a nutshell: main.rs handles running the program, and lib.rs handles all the logic of the task at hand.

Reasons:

  1. If we continue to grow our program inside main, the number of separate tasks the main function handles will increase.
  2. The more variables we have in scope, the harder it will be to keep track of the purpose of each. It’s best to group the configuration variables into one structure to make their purpose clear.
  3. The error message Something went wrong reading the file is not clear.
  4. It would be best if all the error-handling code were in one place so future maintainers had only one place to consult in the code if the error-handling logic needed to change.
  • The Rust community has developed a process to use as a guideline for splitting the separate concerns of a binary program when main starts getting large.

    • Split your program into a main.rs and a lib.rs and move your program’s logic to lib.rs.
    • As long as your command line parsing logic is small, it can remain in main.rs.
    • When the command line parsing logic starts getting complicated, extract it from main.rs and move it to lib.rs.
  • The responsibilities that remain in the main function after this process should be limited to the following:

    • Calling the command line parsing logic with the argument values
    • Setting up any other configuration
    • Calling a run function in lib.rs
    • Handling the error if run returns an error
  • Extracting the Argument Parser

  • Grouping Configuration Values

Note: Using primitive values when a complex type would be more appropriate is an anti-pattern known as primitive obsession.

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = parse_config(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.filename);

    let contents = fs::read_to_string(config.filename)
        .expect("Something went wrong reading the file");

    println!("With text:\n{}", contents);
}

struct Config {
    query: String,
    filename: String,
}

fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let filename = args[2].clone();

    Config { query, filename }
}
  • There’s a tendency among many Rustaceans to avoid using clone to fix ownership problems because of its runtime cost. We will learn a more efficient way in Chapter 13.
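
My note: parse_config above panics with an index error when arguments are missing. A sketch of the direction the refactoring can take, returning a Result from a constructor so that main decides how to report the problem. The error message text and the hard-coded arguments are my own assumptions.

```rust
struct Config {
    query: String,
    filename: String,
}

impl Config {
    // Returns an error message instead of panicking on missing arguments.
    fn new(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        Ok(Config {
            query: args[1].clone(),
            filename: args[2].clone(),
        })
    }
}

fn main() {
    // Hard-coded arguments stand in for env::args().collect() in this sketch.
    let args = vec![
        String::from("minigrep"),
        String::from("the"),
        String::from("poem.txt"),
    ];

    match Config::new(&args) {
        Ok(config) => println!("Searching for {} in {}", config.query, config.filename),
        Err(msg) => eprintln!("Problem parsing arguments: {}", msg),
    }
}
```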

13. Functional Language Features: Iterators and Closures

Programming in a functional style often includes using functions as values by passing them in arguments, returning them from other functions, assigning them to variables for later execution, and so forth.

13.1 Closures: Anonymous Functions that Can Capture Their Environment

An example of a closure.

let expensive_closure = |num| {
    println!("calculating slowly...");
    thread::sleep(Duration::from_secs(2));
    num
};
  • To define a closure, we start with a pair of vertical pipes (|), inside which we specify the parameters to the closure.
  • Unlike functions, closures can capture values from the scope in which they’re defined.

Memoization, lazy evaluation

  • We can create a struct that will hold the closure and the resulting value of calling the closure (not to calculate expensive code multiple times).

  • We need to specify the type of the closure, because a struct definition needs to know the types of each of its fields.

  • Example:

    struct Cacher<T>
    where
        T: Fn(u32) -> u32,
    {
        calculation: T,
        value: Option<u32>,
    }
    
    • The Cacher struct has a calculation field of the generic type T.
    • The trait bounds on T specify that it’s a closure by using the Fn trait.
    • Any closure we want to store in the calculation field must have one u32 parameter (specified within the parentheses after Fn) and must return a u32 (specified after the ->).

Fn Traits

All closures implement at least one of the traits: Fn, FnMut, or FnOnce.

  • FnOnce consumes the variables it captures from its enclosing scope, known as the closure’s environment. To consume the captured variables, the closure must take ownership of these variables and move them into the closure when it is defined. The Once part of the name represents the fact that the closure can’t take ownership of the same variables more than once, so it can be called only once.
  • FnMut can change the environment because it mutably borrows values.
  • Fn borrows values from the environment immutably.
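
My note: a small sketch to see the three traits side by side. The helper names call_twice, call_twice_mut, and call_once are made up for this memo.

```rust
// Accepts any closure that can be called repeatedly without mutating state.
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

// Accepts a closure that may mutate what it captured.
fn call_twice_mut<F: FnMut() -> u32>(mut f: F) -> u32 {
    f();
    f()
}

// Accepts a closure that may consume what it captured, so it runs only once.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let name = String::from("rust");

    // Fn: only reads `name` through a shared borrow.
    let total_len = call_twice(|| name.len());

    // FnMut: mutably borrows `counter` and increments it on each call.
    let mut counter = 0;
    let last = call_twice_mut(|| {
        counter += 1;
        counter
    });

    // FnOnce: moves `name` into the closure and hands it back out.
    let moved = call_once(move || name);

    println!("{} {} {}", total_len, last, moved);
}
```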

Implement the example:

impl<T> Cacher<T>
where
    T: Fn(u32) -> u32,
{
    fn new(calculation: T) -> Cacher<T> {
        Cacher {
            calculation,
            value: None,
        }
    }

    fn value(&mut self, arg: u32) -> u32 {
        match self.value {
            Some(v) => v,
            None => {
                let v = (self.calculation)(arg);
                self.value = Some(v);
                v
            }
        }
    }
}

And use it:

fn generate_workout(intensity: u32, random_number: u32) {
    let mut expensive_result = Cacher::new(|num| {
        println!("calculating slowly...");
        thread::sleep(Duration::from_secs(2));
        num
    });

    if intensity < 25 {
        println!("Today, do {} pushups!", expensive_result.value(intensity));
        println!("Next, do {} situps!", expensive_result.value(intensity));
    } else {
        if random_number == 3 {
            println!("Take a break today! Remember to stay hydrated!");
        } else {
            println!(
                "Today, run for {} minutes!",
                expensive_result.value(intensity)
            );
        }
    }
}
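
My note: the Cacher above stores a single value, so calling value(1) and then value(2) still returns the result for 1. A sketch of one possible fix that caches per argument in a HashMap. The name MapCacher is my own, and this is only one way to do it.

```rust
use std::collections::HashMap;

struct MapCacher<T>
where
    T: Fn(u32) -> u32,
{
    calculation: T,
    values: HashMap<u32, u32>,
}

impl<T> MapCacher<T>
where
    T: Fn(u32) -> u32,
{
    fn new(calculation: T) -> MapCacher<T> {
        MapCacher {
            calculation,
            values: HashMap::new(),
        }
    }

    // Runs the closure only the first time a given argument is seen.
    fn value(&mut self, arg: u32) -> u32 {
        if let Some(&v) = self.values.get(&arg) {
            return v;
        }
        let v = (self.calculation)(arg);
        self.values.insert(arg, v);
        v
    }
}

fn main() {
    let mut cacher = MapCacher::new(|n| n * 2);
    let first = cacher.value(1);
    let second = cacher.value(2);
    let again = cacher.value(1); // served from the cache
    println!("{} {} {}", first, second, again);
}
```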

Closures have an additional capability that functions don’t have: they can capture their environment and access variables from the scope in which they’re defined.

Capturing the Environment with Closures

The following snippet fails to compile because equal_to_x is a function, not a closure, and functions can’t capture variables such as x from the surrounding scope.

fn main() {
    let x = 4;

    fn equal_to_x(z: i32) -> bool {
        z == x
    }

    let y = 4;

    assert!(equal_to_x(y));
}

Here is the closure version:

fn main() {
    let x = 4;

    let equal_to_x = |z| z == x;

    let y = 4;

    assert!(equal_to_x(y));
}

If you want to force the closure to take ownership of the values it uses in the environment, you can use the move keyword before the parameter list.

Here is the move example (returns compile error):

fn main() {
    let x = vec![1, 2, 3];

    let equal_to_x = move |z| z == x;

    println!("can't use x here: {:?}", x);

    let y = vec![1, 2, 3];

    assert!(equal_to_x(y));
}

13.2 Processing a Series of Items with Iterators

In Rust, iterators are lazy, meaning they have no effect until you call methods that consume the iterator to use it up.

We can create an iterator from Vec<T> explicitly:

let v1 = vec![1, 2, 3];
let v1_iter = v1.iter();

The definition of the Iterator trait in the standard library looks like this:

pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    // methods with default implementations elided
}
  • Implementing the Iterator trait requires that you also define an Item type (more details in Chap. 19).
  • This Item type is used in the return type of the next method. In other words, Item is the type the iterator yields.
  • We can call the next method on iterators directly.
  • We don’t need to make an iterator mutable when we use a for loop, because the loop takes ownership of the iterator and makes it mutable behind the scenes.
  • The values we get from the calls to next are immutable references to the values in the vector.
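
My note: a short sketch of the points above, calling next directly and then consuming a lazy map with collect. The helper name doubled is my own.

```rust
// Doubles every element. map is lazy, so nothing runs until collect
// consumes the iterator.
fn doubled(values: &[i32]) -> Vec<i32> {
    values.iter().map(|x| x * 2).collect()
}

fn main() {
    let v1 = vec![1, 2, 3];

    // Calling next directly requires a mutable iterator, because each call
    // advances its internal position.
    let mut v1_iter = v1.iter();
    assert_eq!(v1_iter.next(), Some(&1));
    assert_eq!(v1_iter.next(), Some(&2));
    assert_eq!(v1_iter.next(), Some(&3));
    assert_eq!(v1_iter.next(), None);

    println!("{:?}", doubled(&v1));
}
```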

14. More About Cargo and Crates.io

14.1 Customizing Builds with Release Profiles

Cargo has two main profiles: dev (used by cargo build) and release (used by cargo build --release). You can override profile-specific settings in the Cargo.toml file. Here is an example that sets the optimization level (the values shown are the defaults):

[profile.dev]
opt-level = 0

[profile.release]
opt-level = 3

14.2 Publishing a Crate to Crates.io

Skipped.

14.3 Cargo Workspaces

Workspaces let us split a project into multiple related packages that are developed together.

A workspace is a set of packages that share the same Cargo.lock and output directory.

Here is the sample structure of workspaces

$ tree -I target
.
├── adder
│   ├── Cargo.toml
│   └── src
│       └── main.rs
├── add-one
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
├── Cargo.lock
└── Cargo.toml

4 directories, 6 files

Cargo.toml:

[workspace]

members = [
    "adder",
    "add-one",
]

adder/Cargo.toml:

[package]
name = "adder"
version = "0.1.0"
edition = "2018"

[dependencies]
add-one = { path = "../add-one" }

add-one/Cargo.toml:

[package]
name = "add-one"
version = "0.1.0"
edition = "2018"

[dependencies]
rand = "0.8.3"

add-one/src/lib.rs:

pub fn add_one(x: i32) -> i32 {
    x + 1
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn it_works() {
        assert_eq!(3, add_one(2));
    }
}

adder/src/main.rs:

use add_one;

fn main() {
    let num = 10;
    println!(
        "Hello, world! {} plus one is {}!",
        num,
        add_one::add_one(num)
    );
}

Let’s run cargo commands.

$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/adder`
Hello, world! 10 plus one is 11!
  • The entry point of cargo run is fn main() in adder/src/main.rs, because adder is the only binary crate in the workspace.
  • We declared the rand crate in add-one/Cargo.toml. If you want to use rand in the adder package as well, you have to add it explicitly to adder/Cargo.toml.
  • The workspace can define another scope.

15. Smart Pointers

A pointer is a general concept for a variable that contains an address in memory. The most common kind of pointer in Rust is a reference.

Smart pointers, on the other hand, are data structures that not only act like a pointer but also have additional metadata and capabilities.

One example that we’ll explore in this chapter is the reference counting smart pointer type. This pointer enables you to have multiple owners of data by keeping track of the number of owners and, when no owners remain, cleaning up the data.

In many cases, smart pointers own the data they point to.

Actually, we’ve already encountered a few smart pointers in this book, such as String and Vec<T>.

Smart pointers are usually implemented using structs. The characteristic that distinguishes a smart pointer from an ordinary struct is that smart pointers implement the Deref and Drop traits.

We’ll cover the most common smart pointers in the standard library:

  • Box<T> for allocating values on the heap
  • Rc<T>, a reference counting type that enables multiple ownership
  • Ref<T> and RefMut<T>, accessed through RefCell<T>, a type that enforces the borrowing rules at runtime instead of compile time
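
My note: a quick sketch of the reference counting idea mentioned above, using Rc::strong_count to watch the number of owners change. The variable names are arbitrary.

```rust
use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("shared"));
    println!("count after creating a = {}", Rc::strong_count(&a));

    // Rc::clone bumps the count; it does not copy the String itself.
    let b = Rc::clone(&a);
    println!("count after cloning into b = {}", Rc::strong_count(&a));

    {
        let _c = Rc::clone(&a);
        println!("count inside inner scope = {}", Rc::strong_count(&a));
    } // _c is dropped here, so the count goes down again

    println!("count after inner scope = {}", Rc::strong_count(&a));
    println!("a and b share the same data: {}", Rc::ptr_eq(&a, &b));
}
```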

15.1 Using Box<T> to Point to Data on the Heap

Boxes (Box<T>) allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data.

You’ll use them most often in these situations:

  • When you have a type whose size can’t be known at compile time and you want to use a value of that type in a context that requires an exact size
  • When you have a large amount of data and you want to transfer ownership but ensure the data won’t be copied when you do so
  • When you want to own a value and you care only that it’s a type that implements a particular trait rather than being of a specific type

Side note: memory allocation of Vec

My note: at this point, I wondered how Rust allocates memory when I manipulate a Vec. I found a good post about this topic.

https://markusjais.com/unterstanding-rusts-vec-and-its-capacity-for-fast-and-efficient-programs/

Using a Box<T> to Store Data on the Heap

Boxing a single value like this is not done very often, but it is useful for learning purposes.

fn main() {
    let b = Box::new(5);
    println!("b = {}", b);
}

When a box goes out of scope, as b does at the end of main, it will be deallocated. The deallocation happens for the box (stored on the stack) and the data it points to (stored on the heap).

Example: cons list (construct function)

A construction function constructs a new pair from its two arguments, which usually are a single value and another pair. “To cons x onto y” informally means to construct a new container instance by putting the element x at the start of this new container, followed by the container y.

Each item in a cons list contains two elements: the value of the current item and the next item. The last item in the list contains only a value called Nil without a next item. A cons list is produced by recursively calling the cons function.

Let’s try to implement a list of i32 with cons. The following code returns a compile error.

enum List {
    Cons(i32, List),
    Nil,
}

This fails because Rust doesn’t know how much space it needs to store a List value (List is defined recursively).

To solve this issue, use a Box<T> (pointer), because the size of pointer is known.

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}
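
My note: to check that the boxed list is usable, here is a sketch that walks it recursively. The function name sum is my own addition.

```rust
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Walks the list recursively: each Cons adds its value to the sum of the rest.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));
}
```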

15.2 Treating Smart Pointers Like Regular References with the Deref Trait

The following code fails to compile.

fn main() {
    let x = 5;
    let y = &x;

    assert_eq!(5, x);
    assert_eq!(5, y);
}

The error:

error[E0277]: can't compare `{integer}` with `&{integer}`
 --> src/main.rs:6:5
  |
6 |     assert_eq!(5, y);
  |     ^^^^^^^^^^^^^^^^^ no implementation for `{integer} == &{integer}`
  |

To avoid this error, we should change assert_eq!(5, y); to assert_eq!(5, *y);. This * is called dereference, which means “follow the reference to the value it’s pointing to.”

One more dereference example (one mutable borrow at a time is allowed!). The assertion fails on purpose, showing that writing through *y changed the value:

fn main() {
    let mut x = 5;
    let y = &mut x;

    *y = 4;

    assert_eq!(5, *y);
    // thread 'main' panicked at 'assertion failed: `(left == right)`
    //   left: `5`,
    //  right: `4`', src/main.rs:8:5
}

As in C or C++, we can print the address a reference points to:

fn main() {
    let x = &42;
    let address = format!("{:p}", x);
    print!("{:?}", address) // like "0x560b046ea000"
}

Instead of a reference, write in Box:

fn main() {
    let x = 5;
    let y = Box::new(x);

    assert_eq!(5, x);
    assert_eq!(5, *y);
}

Note that y is an instance of a box pointing to a copied value of x rather than a reference pointing to the value of x.

Let’s try to understand Rust’s Deref trait.

Just a memo from here: when we write *y, behind the scenes Rust actually runs this code:

*(y.deref())

Rust substitutes the * operator with a call to the deref method and then a plain dereference so we don’t have to think about whether or not we need to call the deref method.
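
My note: a sketch of a hand-made box type that implements Deref, so that * works on it the same way. MyBox and hello are illustrative names, and unlike Box<T> this type does not allocate on the heap.

```rust
use std::ops::Deref;

// A tuple struct wrapping one value.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    // Returns a reference to the inner value; `*y` then becomes `*(y.deref())`.
    fn deref(&self) -> &T {
        &self.0
    }
}

fn hello(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let y = MyBox::new(5);
    assert_eq!(5, *y);

    // Deref coercion: &MyBox<String> -> &String -> &str.
    let m = MyBox::new(String::from("Rust"));
    println!("{}", hello(&m));
}
```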

16. Fearless Concurrency

The Rust team discovered that the ownership and type systems are a powerful set of tools to help manage memory safety and concurrency problems!

Caution: In this book, authors refer to many of the problems as concurrent rather than being more precise by saying concurrent and/or parallel.

16.1 Using Threads to Run Code Simultaneously

Many operating systems provide an API for creating new threads. This model where a language calls the operating system APIs to create threads is sometimes called 1:1, meaning one operating system thread per one language thread.

Programming language-provided threads are known as green threads, and languages that use these green threads will execute them in the context of a different number of operating system threads. For this reason, the green-threaded model is called the M:N model: there are M green threads per N operating system threads, where M and N are not necessarily the same number.

In this context, by runtime we mean code that is included by the language in every binary.

The Rust standard library only provides an implementation of 1:1 threading.

Creating a New Thread with spawn

To create a new thread, we call the thread::spawn function and pass it a closure containing the code we want to run in the new thread. The new thread will be stopped when the main thread ends, whether or not it has finished running.

use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }
}

Run (as you can see, the spawned thread never prints 6 to 9, because the main thread ended first):

$ cargo run
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the spawned thread!
hi number 2 from the main thread!
hi number 3 from the spawned thread!
hi number 3 from the main thread!
hi number 4 from the spawned thread!
hi number 4 from the main thread!
hi number 5 from the spawned thread!

The calls to thread::sleep force a thread to stop its execution for a short duration, allowing a different thread to run. How many spawned thread! lines appear between the main thread! lines depends on how your operating system schedules the threads. When I commented out the thread::sleep(Duration::from_millis(1)); lines, the spawned thread produced no output at all, because the main thread finished before it got a chance to run.

$ cargo run
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 3 from the main thread!
hi number 4 from the main thread!

Waiting for All Threads to Finish Using join Handles

The return type of thread::spawn is JoinHandle. A JoinHandle is an owned value that, when we call the join method on it, will wait for its thread to finish.

use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }

    handle.join().unwrap();
}
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the main thread!
hi number 2 from the spawned thread!
hi number 3 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the main thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!

If we put the line handle.join().unwrap(); between the two for loops, the result looks as follows, because the main thread waits for the spawned thread to finish before starting its own loop.

hi number 1 from the spawned thread!
hi number 2 from the spawned thread!
hi number 3 from the spawned thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 3 from the main thread!
hi number 4 from the main thread!

Using move Closures with Threads

If the closure passed to thread::spawn only borrows variables from the main thread, Rust can’t tell how long the spawned thread will run, so it can’t know whether those references will stay valid. By adding the move keyword before the closure, we force the closure to take ownership of the values it’s using rather than allowing Rust to infer that it should borrow the values.

use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(move || {
        println!("Here's a vector: {:?}", v);
    });

    handle.join().unwrap();
}
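
My note: join also hands back the closure’s return value, so move and join combine nicely. A minimal sketch, assuming a helper named sum_in_thread (my own name):

```rust
use std::thread;

// Sums the values on a spawned thread and returns the result via join.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    // `move` transfers ownership of `values` into the new thread.
    let handle = thread::spawn(move || values.iter().sum::<i32>());

    // join blocks until the thread finishes and yields the closure's return value.
    handle.join().unwrap()
}

fn main() {
    println!("total = {}", sum_in_thread(vec![1, 2, 3, 4]));
}
```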

17. Object Oriented Programming Features of Rust

Objects came from Simula in the 1960s. Those objects influenced Alan Kay’s programming architecture in which objects pass messages to each other. He coined the term object-oriented programming in 1967 to describe this architecture.

Hmm…

17.1 Characteristics of Object-Oriented Languages

Objects Contain Data and Behavior

The book Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley Professional, 1994) colloquially referred to as The Gang of Four book, is a catalog of object-oriented design patterns. It defines OOP this way:

Object-oriented programs are made up of objects. An object packages both data and the procedures that operate on that data. The procedures are typically called methods or operations.

Using this definition, Rust is object oriented: structs and enums have data, and impl blocks provide methods on structs and enums.

Encapsulation that Hides Implementation Details

Encapsulation means that the implementation details of an object aren’t accessible to code using that object. Therefore, the only way to interact with an object is through its public API. In Rust, we can use the pub keyword to decide which modules, types, functions, and methods in our code should be public, and by default everything else is private.

Inheritance as a Type System and as Code Sharing

In Rust, there is no way to define a struct that inherits the parent struct’s fields and method implementations.

You choose inheritance for two main reasons.

  1. One is for reuse of code.
  2. The other reason is polymorphism, which means that you can substitute multiple objects for each other at runtime if they share certain characteristics.

Rust uses generics to abstract over different possible types and trait bounds to impose constraints on what those types must provide. This is sometimes called bounded parametric polymorphism.

Rust takes a different approach, using trait objects instead of inheritance.
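
My note: a sketch that puts the two approaches next to each other, generics with a trait bound and trait objects. The trait Shape and the types Square and Rect are invented for this memo.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Bounded parametric polymorphism: resolved at compile time for each concrete T.
fn describe<T: Shape>(shape: &T) -> String {
    format!("area = {}", shape.area())
}

// Trait objects: one collection can hold different concrete types,
// and the method to call is chosen at runtime.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    println!("{}", describe(&Square(2.0)));

    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total = {}", total_area(&shapes));
}
```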

19. Advanced Features

19.1 Unsafe Rust

“Unsafe” means “doesn’t enforce memory safety guarantees”.

Although the code might be okay, if the Rust compiler doesn’t have enough information to be confident, it will reject the code. In these cases, you can use unsafe code to tell the compiler, “Trust me, I know what I’m doing.”

Another reason Rust has an unsafe alter ego is that the underlying computer hardware is inherently unsafe.

??

Unsafe Superpowers

  1. Dereference a raw pointer
  2. Call an unsafe function or method
  3. Access or modify a mutable static variable
  4. Implement an unsafe trait
  5. Access fields of unions

Note:

  • unsafe doesn’t turn off the borrow checker or disable any other of Rust’s safety checks.
  • unsafe does not mean the code inside the block is necessarily dangerous or that it will definitely have memory safety problems.

Parts of the standard library are implemented as safe abstractions over unsafe code that has been audited.

Example: define unsafe function, and use it

unsafe fn dangerous() {}

unsafe {
    dangerous();
}

Example: dereference a raw pointer

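With references, we couldn’t hold an immutable and a mutable borrow of num at the same time. Raw pointers allow that, and creating them is safe; only dereferencing them requires an unsafe block: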

let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;

unsafe {
    println!("r1 is: {}", *r1);
    println!("r2 is: {}", *r2);
}

Sometimes Rust can’t verify that code is safe even though it is. When we know code is okay, but Rust doesn’t, it’s time to reach for unsafe code.
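
My note: a typical case is splitting one mutable slice into two non-overlapping halves, which the borrow checker rejects when written with plain slicing. A sketch of a safe function wrapping the unsafe part (the standard library already offers split_at_mut on slices, so this is only for study):

```rust
use std::slice;

// Safe to call: the assert guarantees the two halves never overlap,
// which is the invariant the unsafe block relies on.
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    assert!(mid <= len);

    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5, 6];
    let (a, b) = split_at_mut(&mut v, 3);
    a[0] = 10;
    b[0] = 40;
    println!("{:?}", v);
}
```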

Using extern Functions to Call External Code

Rust has a keyword, extern, that facilitates the creation and use of a Foreign Function Interface (FFI). Functions declared within extern blocks are always unsafe to call from Rust code.

extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}

Calling Rust Functions from Other Languages

We make the call_from_c function accessible from C code, after it’s compiled to a shared library and linked from C:

#[no_mangle]
pub extern "C" fn call_from_c() {
    println!("Just called a Rust function from C!");
}

20. Final Project: Building a Multithreaded Web Server

But before we get started, we should mention one detail: the method we’ll use won’t be the best way to build a web server with Rust. A number of production-ready crates are available on crates.io that provide more complete web server and thread pool implementations than we’ll build.

20.1 Building a Single-Threaded Web Server

  • HTTP over TCP
  • Using standard library std::net
cargo new hello
cd hello

src/main.rs

use std::net::TcpListener;

fn main() {
    let listener = TcpListener::bind("127.0.0.1:7878").unwrap();

    for stream in listener.incoming() {
        let stream = stream.unwrap();

        println!("Connection established!");
    }
}
  • The bind function returns a Result<T, E>, which indicates that binding might fail.
  • We use unwrap to stop the program if errors happen.
  • The incoming method on TcpListener returns an iterator that gives us a sequence of streams (more specifically, streams of type TcpStream).
    • We’re iterating over connection attempts with incoming method.
    • A single stream represents an open connection between the client and the server.
    • A connection is the name for the full request and response process in which a client connects to the server, the server generates a response, and the server closes the connection.
    • As such, TcpStream will read from itself to see what the client sent and then allow us to write our response to the stream.
    • Overall, this for loop will process each connection in turn and produce a series of streams for us to handle.
    • The handling of the stream consists of calling unwrap to terminate our program if the stream has any errors.

Test

$ cargo run

# In another terminal
$ curl localhost:7878 -vvv
*   Trying 127.0.0.1:7878...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 7878 (#0)
> GET / HTTP/1.1
> Host: localhost:7878
> User-Agent: curl/7.68.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer


# cargo run terminal
$ cargo run
   Compiling hello v0.1.0 (/home/atlex00/rust-projects/hello)
warning: unused variable: `stream`
 --> src/main.rs:7:13
  |
7 |         let stream = stream.unwrap();
  |             ^^^^^^ help: if this is intentional, prefix it with an underscore: `_stream`
  |
  = note: `#[warn(unused_variables)]` on by default

warning: 1 warning emitted

    Finished dev [unoptimized + debuginfo] target(s) in 0.98s
     Running `target/debug/hello`
Connection established!
Connection established!
^C
$
  • The connections are reset because the server isn’t currently sending back any data.

Reading the Request

use std::io::prelude::*;
use std::net::TcpListener;
use std::net::TcpStream;

fn main() {
    let listener = TcpListener::bind("127.0.0.1:7878").unwrap();

    for stream in listener.incoming() {
        let stream = stream.unwrap();

        handle_connection(stream);
    }
}

fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 1024];

    stream.read(&mut buffer).unwrap();

    println!("Request: {}", String::from_utf8_lossy(&buffer[..]));
}
  • In the handle_connection function, we’ve made the stream parameter mutable. The reason is that the TcpStream instance keeps track of what data it returns to us internally. It might read more data than we asked for and save that data for the next time we ask for data. It therefore needs to be mut because its internal state might change; usually, we think of “reading” as not needing mutation, but in this case we need the mut keyword.

  • Reading from the stream takes three steps:

    1. Declare a buffer on the stack to hold the data. It’s 1024 bytes in the example.
    2. Pass the buffer to stream.read, which will read bytes from the TcpStream and put them in the buffer (stream.read(&mut buffer).unwrap();).
    3. Convert the bytes in the buffer to a string and print that string (String::from_utf8_lossy).
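
My note: once the bytes are in the buffer, the first line of the request is usually what we care about. A sketch that extracts it without any network code. The helper name request_line is my own, and a hard-coded byte string stands in for the data read from a TcpStream.

```rust
/// Extracts the first line of an HTTP request (e.g. "GET / HTTP/1.1")
/// from the bytes that were read into the buffer.
fn request_line(buffer: &[u8]) -> Option<String> {
    let text = String::from_utf8_lossy(buffer);
    // lines() splits on "\n" and also strips a trailing "\r".
    text.lines().next().map(|line| line.to_string())
}

fn main() {
    // A hard-coded request stands in for the bytes read from a TcpStream.
    let raw = b"GET / HTTP/1.1\r\nHost: myserver.com\r\n\r\n";
    println!("{:?}", request_line(raw));
}
```

With a real stream, passing only the bytes actually read (the slice up to the count returned by stream.read) avoids the trailing zero bytes of the buffer.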

Test:

$ cargo run

# Other terminal
curl localhost:7878 -vvv -H "Host: myserver.com"
*   Trying 127.0.0.1:7878...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 7878 (#0)
> GET / HTTP/1.1
> Host: myserver.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server

# Cargo run terminal
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/hello`
Request: GET / HTTP/1.1
Host: myserver.com
User-Agent: curl/7.68.0
Accept: */*
^C

Writing a Response

First, we respond with only a status line and no HTTP body. Change the handle_connection function as follows.

fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 1024];

    stream.read(&mut buffer).unwrap();

    let response = "HTTP/1.1 200 OK\r\n\r\n";

    stream.write(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}

Returning Real HTML

hello.html (in the project root, next to the src directory)

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Hello!</title>
  </head>
  <body>
    <h1>Hello!</h1>
    <p>Hi from Rust</p>
  </body>
</html>

Change the handle_connection function as follows.

use std::fs;
fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 1024];
    stream.read(&mut buffer).unwrap();

    let contents = fs::read_to_string("hello.html").unwrap();

    let response = format!(
        "HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}",
        contents.len(),
        contents
    );

    stream.write(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}

Test:

$ curl localhost:7878
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Hello!</title>
  </head>
  <body>
    <h1>Hello!</h1>
    <p>Hi from Rust</p>
  </body>
</html>

Validating the Request and Selectively Responding

Return the page only for GET / requests, and an error response for anything else. Change the handle_connection function as follows.

fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 1024];
    stream.read(&mut buffer).unwrap();

    let get = b"GET / HTTP/1.1\r\n";

    if buffer.starts_with(get) {
        let contents = fs::read_to_string("hello.html").unwrap();

        let response = format!(
            "HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}",
            contents.len(),
            contents
        );

        stream.write(response.as_bytes()).unwrap();
        stream.flush().unwrap();
    } else {
        let contents = String::from("Panic!!");

        let response = format!(
            "HTTP/1.1 404 NOT FOUND\r\nContent-Length: {}\r\n\r\n{}",
            contents.len(),
            contents
        );

        stream.write(response.as_bytes()).unwrap();
        stream.flush().unwrap();

    }
}

Return error page

Change the else part of handle_connection as follows.

else {
    let status_line = "HTTP/1.1 404 NOT FOUND\r\n\r\n";
    let contents = fs::read_to_string("404.html").unwrap();
  
    let response = format!("{}{}", status_line, contents);
  
    stream.write(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}

404.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Hello!</title>
  </head>
  <body>
    <h1>Oops!</h1>
    <p>Sorry, I don't know what you're asking for.</p>
  </body>
</html>

A Touch of Refactoring

Here is the refactored handle_connection function.

fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 1024];
    stream.read(&mut buffer).unwrap();

    let get = b"GET / HTTP/1.1\r\n";

    let (status_line, filename) = if buffer.starts_with(get) {
        ("HTTP/1.1 200 OK\r\n\r\n", "hello.html")
    } else {
        ("HTTP/1.1 404 NOT FOUND\r\n\r\n", "404.html")
    };

    let contents = fs::read_to_string(filename).unwrap();

    let response = format!("{}{}", status_line, contents);

    stream.write(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}
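
My note: the refactored version no longer sends Content-Length, unlike the earlier snippet. A sketch of a small helper that keeps the header. The name build_response is my own, and here the status line is passed without the trailing CRLF.

```rust
/// Builds a full HTTP response. `status_line` has no trailing CRLF here,
/// e.g. "HTTP/1.1 200 OK".
fn build_response(status_line: &str, body: &str) -> String {
    format!(
        "{}\r\nContent-Length: {}\r\n\r\n{}",
        status_line,
        body.len(), // length in bytes, which is what Content-Length expects
        body
    )
}

fn main() {
    let response = build_response("HTTP/1.1 404 NOT FOUND", "<h1>Oops!</h1>");
    print!("{}", response);
}
```

In handle_connection, the returned String would be written to the stream with response.as_bytes(), the same way as in the snippets above.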


Appendixes

A. Naming rule

https://rust-lang.github.io/api-guidelines/naming.html

Note about prelude

https://doc.rust-lang.org/std/prelude/index.html

The prelude is the list of things that Rust automatically imports into every Rust program. It’s kept as small as possible, and is focused on things, particularly traits, which are used in almost every single Rust program.

Note about Rust memory management

The following link was a very good post. https://deepu.tech/memory-management-in-rust/