As of Dec. 2021, I work as a DevOps engineer, but I like to solve problems with code (front end, back end, whatever; it depends on the purpose). These days, my motivation to learn Rust has surged. This page is my memo from working through the official Rust documentation, so that I can easily recall and refer to the key features. Most of this post consists of quotes from the documentation, but I also leave my own opinions (which could be wrong).
I have experiences on
Rust source files use the .rs extension.
A ! means that you’re calling a macro instead of a normal function.
cargo new hello_cargo creates a new project, with the source code in the src directory.
The cargo build command creates an executable file in target/debug/hello_cargo.
Cargo.lock: This file keeps track of the exact versions of dependencies in your project.
cargo check: This command quickly checks your code to make sure it compiles but doesn’t produce an executable.
Use cargo build --release to compile it with optimizations.
We can start a comment line with //.
Create variables.
let foo = 5; // immutable
let mut foo = 5; // mutable
let mut guess = String::new();
The ::
syntax indicates that new
is an associated function of the String
type. An associated function is implemented on a type, in this case String
, rather than on a particular instance of a String.
User input.
use std::io;
let mut guess = String::new();
io::stdin()
.read_line(&mut guess)
.expect("Failed to read line");
This code stores a line from standard input in the variable guess as a String.
std::io::stdin
function returns an instance of std::io::Stdin
, which is a type that represents a handle to the standard input for your terminal.
The job of read_line
is to take whatever the user types into standard input and place that into a string, so it takes that string as an argument.
The &
indicates that this argument is a reference, which gives you a way to let multiple parts of your code access one piece of data without needing to copy that data into memory multiple times.
References are immutable by default. Hence, you need to write &mut guess
rather than &guess
to make it mutable.
.expect() handles a potential failure: if read_line returns an Err, the program crashes and displays the given message.
It’s often wise to introduce a newline and other whitespace to help break up long lines.
read_line
returns io::Result
Rust has a number of types named Result
in its standard library
The Result types are enumerations (enum): an enum is a type that can have a fixed set of values, called variants.
For Result
, the variants are Ok
or Err
.
An instance of io::Result
has an expect method.
If you don’t call expect
, the program will compile, but you’ll get a warning:
$ cargo build
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
warning: unused `std::result::Result` that must be used
--> src/main.rs:10:5
|
10 | io::stdin().read_line(&mut guess);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
Finished dev [unoptimized + debuginfo] target(s) in 0.59s
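As a hedged illustration of handling the Result rather than ignoring it, the sketch below matches on both variants instead of calling expect. The helper name read_trimmed_line is hypothetical (it is not from the Rust book), and it reads from any BufRead source so it can be tried without a terminal; a locked io::stdin() handle could be passed in the same way.

```rust
use std::io::{self, BufRead, Cursor};

// Hypothetical helper: reads one line and returns it without the trailing newline.
// Accepting any `BufRead` is an assumption made so the sketch is easy to try out.
fn read_trimmed_line<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let mut line = String::new();
    // `read_line` returns io::Result<usize>; `?` passes an Err up to the caller.
    reader.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}

fn main() {
    // An in-memory cursor stands in for standard input here.
    let mut input = Cursor::new("42\n");
    // Handling both variants explicitly avoids the `unused_must_use` warning.
    match read_trimmed_line(&mut input) {
        Ok(text) => println!("You typed: {}", text),
        Err(e) => eprintln!("Failed to read line: {}", e),
    }
}
```

Compared with expect, this keeps the program running on failure and lets the caller decide what to do with the error.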
The set of curly brackets, {}
, is a placeholder.
let x = 5;
let y = 10;
println!("x = {} and y = {}", x, y);
From Rust version 1.58, format strings can capture identifiers directly from the surrounding scope!
let x = 5;
let y = 10;
println!("x = {x} and y = {y}");
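To check that both placeholder styles produce the same text, here is a small sketch using format!, which accepts the same syntax as println!. The function name describe is made up for this example, and the captured form assumes Rust 1.58 or newer.

```rust
// Hypothetical helper: builds the same message with both placeholder styles.
fn describe(x: i32, y: i32) -> (String, String) {
    // Positional placeholders: values are passed as extra arguments.
    let positional = format!("x = {} and y = {}", x, y);
    // Captured identifiers: names are taken from the enclosing scope.
    let captured = format!("x = {x} and y = {y}");
    (positional, captured)
}

fn main() {
    let (positional, captured) = describe(5, 10);
    println!("{}", positional);
    println!("{}", captured);
}
```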
Rust doesn’t yet include random number functionality in its standard library. However, the Rust team does provide a rand
crate.
Using crate in Cargo.toml
.
[dependencies]
rand = "0.5.5"
0.5.5 is actually shorthand for ^0.5.5, which means “any version that has a public API compatible with version 0.5.5.”
Compiled dependencies are placed under /target/debug/deps/.
The first build writes the resolved versions to the Cargo.lock file. When you build your project in the future, Cargo will see that the Cargo.lock file exists and use the versions specified there rather than doing all the work of figuring out versions again.
To update a crate, run cargo update, which will ignore the Cargo.lock file and figure out all the latest versions that fit your specifications in Cargo.toml. If that works, Cargo will write those versions to the Cargo.lock file.
By default, Cargo will only look for versions greater than 0.5.5 and less than 0.6.0.
(My note, reminder) In rand::Rng, Rng is a trait defined in the rand crate, not an associated function. It must be brought into scope with use before methods such as gen_range can be called.
use rand::Rng;
...
let secret_number = rand::thread_rng().gen_range(1, 101);
Note: A trait
is a collection of methods defined for an unknown type: Self
. They can access other methods declared in the same trait. https://doc.rust-lang.org/rust-by-example/trait.html
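The note above can be illustrated with a minimal sketch. The trait Greet and the type Robot are hypothetical names invented for this example; the point is only that a default method in a trait can call another method declared in the same trait.

```rust
// A hypothetical trait used only for illustration.
trait Greet {
    // Required method: each implementing type supplies its own name.
    fn name(&self) -> String;

    // Default method: it can call other methods declared in the same trait.
    fn greeting(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct Robot;

impl Greet for Robot {
    fn name(&self) -> String {
        String::from("robot")
    }
}

fn main() {
    // A trait must be in scope before its methods can be called, which is
    // also why `use rand::Rng;` is needed before calling `gen_range`.
    println!("{}", Robot.greeting());
}
```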
Simple match
example.
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => println!("You win!"),
}
std::cmp::Ordering is another enum, but the variants for Ordering are Less, Greater, and Equal.
The cmp method compares two values and can be called on anything that can be compared.
A match expression is made up of arms. An arm (=>) consists of a pattern and the code that should be run if the value given to the beginning of the match expression fits that arm’s pattern.
Rust has a strong, static type system. However, it also has type inference.
Integer type examples: i32
, u32
, i64
.
Read the following code, paying attention to the type of guess.
let mut guess = String::new();
io::stdin()
.read_line(&mut guess)
.expect("Failed to read line");
let guess: u32 = guess.trim().parse().expect("Please type a number!");
The trim method on a String instance will eliminate any whitespace at the beginning and end.
The parse method on strings parses a string into some kind of number, and could easily cause an error (if the string contained A👍%, there would be no way to convert that to a number).
The colon (:) after guess tells Rust we’ll annotate the variable’s type.
Make the above code more Rust-like.
// from
//let guess: u32 = guess.trim().parse().expect("Please type a number!");
//
// to
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
The underscore, _, is a catchall value; in this example, we’re saying we want to match all Err values, no matter what information they have inside them.
loop repeats forever until a break; is reached.
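The match-and-continue pattern can be tried without a terminal. In this sketch the helper parse_guess and the list of simulated inputs are assumptions made for illustration; in the real guessing game the text would come from read_line inside a loop.

```rust
// Hypothetical helper: turns raw input into a number, or None if it is not one.
fn parse_guess(input: &str) -> Option<u32> {
    match input.trim().parse::<u32>() {
        Ok(num) => Some(num),
        Err(_) => None, // `_` ignores the details of the parse error
    }
}

fn main() {
    // Simulated inputs stand in for lines typed by the user.
    let inputs = ["abc", " 42\n", "7"];
    for raw in inputs.iter() {
        let guess = match parse_guess(raw) {
            Some(num) => num,
            None => continue, // skip invalid input and move on to the next one
        };
        println!("You guessed: {}", guess);
        break; // stop after the first valid number
    }
}
```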
By default, variables are immutable. This lets you take advantage of the safety and easy concurrency that Rust offers.
Why does Rust encourage you to favor immutability?
Like immutable variables, constants are values that are bound to a name and are not allowed to change, but there are a few differences between constants and variables.
You aren’t allowed to use mut with constants.
An example of constants.
const MAX_POINTS: u32 = 100_000;
Shadowing
fn main() {
let x = 5;
let x = x + 1;
let x = x * 2;
println!("The value of x is: {}", x);
}
Shadowing is different from marking a variable as mut
, because we’ll get a compile-time error if we accidentally try to reassign to this variable without using the let
keyword.
The other difference between mut
and shadowing is that, because we’re effectively creating a new variable when we use the let keyword again, we can change the type of the value but reuse the same name.
Shadowing thus spares us from having to come up with different names, such as spaces_str
and spaces_num
; instead, we can reuse the simpler spaces
name.
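The spaces example mentioned above can be sketched as follows. The wrapper function count_spaces is a name chosen for this illustration; the shadowing lines themselves follow the description in the text.

```rust
// Hypothetical wrapper around the `spaces` shadowing example.
fn count_spaces() -> usize {
    let spaces = "   "; // first binding: a string slice (&str)
    let spaces = spaces.len(); // shadowed binding: now a usize
    spaces
}

fn main() {
    println!("number of spaces: {}", count_spaces());

    // With `mut` instead of shadowing, the type cannot change:
    // let mut spaces = "   ";
    // spaces = spaces.len(); // compile error: mismatched types
}
```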
This is a good StackOverflow answer: https://stackoverflow.com/a/48696415/9923806
std::mem::drop
fn main() {
let x = 1;
println!("Value: {x} Address: {:p}", &x);
// Save address before shadowing
let addr_first = &x;
let x = 2;
println!("Value: {x} Address: {:p}", &x);
println!("Old value: {}", *addr_first);
}
// Result:
//
// Value: 1 Address: 0x7ffeb2c0bf54
// Value: 2 Address: 0x7ffeb2c0bfc4
// Old value: 1
Unless I put the shadowing lines inside an inner scope ({}), I couldn’t find a way to recover the shadowed variable.
You can find shadowing using scope from this official example.
A scalar type represents a single value.
Integer types.
Length | Signed | Unsigned |
---|---|---|
8-bit | i8 | u8 |
16-bit | i16 | u16 |
32-bit | i32 | u32 |
64-bit | i64 | u64 |
128-bit | i128 | u128 |
arch | isize | usize |
Signed numbers are stored using two’s complement representation.
Integer literals in Rust.
Number literals | Example |
---|---|
Decimal | 98_222 |
Hex | 0xff |
Octal | 0o77 |
Binary | 0b1111_0000 |
Byte (u8 only) | b'A' |
Integer types default to i32
: this type is generally the fastest, even on 64-bit systems.
When you’re compiling in debug mode, Rust includes checks for integer overflow that cause your program to panic at runtime if this behavior occurs.
Rust uses the term panicking when a program exits with an error.
When you’re compiling in release mode with the --release
flag, Rust does not include checks for integer overflow that cause panics.
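Because the overflow behavior differs between debug and release builds, the standard library offers methods that make the intent explicit. The sketch below shows four of them on u8; the function name add_one_variants is made up for this example.

```rust
// Hypothetical helper: adds 1 to `n` using four explicit overflow strategies.
fn add_one_variants(n: u8) -> (Option<u8>, u8, u8, (u8, bool)) {
    (
        n.checked_add(1),     // None if the addition overflows
        n.wrapping_add(1),    // wraps around (two's complement wrapping)
        n.saturating_add(1),  // clamps at the maximum value
        n.overflowing_add(1), // wrapped value plus an overflow flag
    )
}

fn main() {
    println!("{:?}", add_one_variants(1));
    println!("{:?}", add_one_variants(u8::MAX));
}
```

These methods behave the same in debug and release mode, so they are a reasonable choice when overflow is expected and should not panic.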
Rust’s floating-point types are f32
and f64
.
The default type is f64
because on modern CPUs it’s roughly the same speed as f32
but is capable of more precision.
Floating-point numbers are represented according to the IEEE-754 standard.
Booleans are one byte in size.
Rust’s char
type is four bytes in size and represents a Unicode Scalar Value, which means it can represent a lot more than just ASCII. …. your human intuition for what a “character” is may not match up with what a char is in Rust.
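A quick way to see the gap between a char and a human “character” is to compare byte counts and char counts. In this sketch the helper byte_and_char_counts is a hypothetical name; the sample string is the letter e followed by a combining accent, which is usually displayed as a single character.

```rust
use std::mem::size_of;

// Hypothetical helper: returns (bytes in UTF-8, number of chars) for a string.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // A char always occupies four bytes, whatever character it holds.
    println!("size of char: {} bytes", size_of::<char>());

    // 'e' plus U+0301 (combining acute accent): one visible character,
    // but two chars, stored as three bytes in a UTF-8 string.
    let (bytes, chars) = byte_and_char_counts("e\u{301}");
    println!("bytes = {}, chars = {}", bytes, chars);
}
```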
Compound types: tuple
and array
.
let tup: (i32, f64, u8) = (500, 6.4, 1);
fn main() {
let tup = (500, 6.4, 1);
let (x, y, z) = tup;
println!("The value of y is: {}", y);
}
// The value of y is: 6.4
We can access a tuple element directly by using a period (.) followed by the index of the value we want to access.
let x: (i32, f64, u8) = (500, 6.4, 1);
let five_hundred = x.0;
The tuple without any values, ()
, is a special type that has only one value, also written ()
.
The type is called the unit type and the value is called the unit value.
The same idea of carrying no data appears in unit-like structs.
Another common use of the unit value is Ok(()), which signals success without returning any data.
Expressions implicitly return the unit value if they don’t return any other value.
let a: [i32; 5] = [1, 2, 3, 4, 5];
let a = [3; 5];
let first = a[0];
Function definitions start with fn and have a set of parentheses after the function name. The curly brackets tell the compiler where the function body begins and ends.
Consider a simple math operation, such as 5 + 6, which is an expression that evaluates to the value 11.
The block that we use to create new scopes, {}, is an expression,
{
let x = 3;
x + 1
}
Note the x + 1 line without a semicolon at the end, which is unlike most of the lines you’ve seen so far. Expressions do not include ending semicolons.
fn main() {
let y = {
let x = 3;
x + 1
};
println!("The value of y is: {}", y);
// The value of y is: 4
}
fn main() {
let x = plus_one(5);
println!("The value of x is: {}", x);
}
fn plus_one(x: i32) -> i32 {
x + 1
}
The return type is declared after an arrow (->).
Pass ;)
if number < 5 {
println!("condition was true");
} else {
println!("condition was false");
}
Blocks of code associated with the conditions in if expressions are sometimes called arms, just like the arms in match expressions.
The condition must be a bool. If the condition isn’t a bool, we’ll get an error.
You can have multiple conditions by combining if and else in an else if expression.
Because if is an expression, we can use it on the right side of a let statement.
let condition = true;
let number = if condition { 5 } else { 6 };
The values from each arm of the if must be the same type.
loop and break;.
You can add the value you want returned after the break expression you use to stop the loop; that value will be returned out of the loop so you can use it.
let mut counter = 0;
let result = loop {
counter += 1;
if counter == 10 {
break counter * 2;
}
};
println!("The result is {}", result);
//The result is 20
while: the loop runs while the condition is true; when the condition becomes false, the loop exits.
fn main() {
let mut number = 3;
while number != 0 {
println!("{}!", number);
number -= 1;
}
println!("LIFTOFF!!!");
}
//3!
//2!
//1!
//LIFTOFF!!!
You could use the while construct to loop over the elements of a collection, such as an array.
fn main() {
let a = [10, 20, 30, 40, 50];
let mut index = 0;
while index < 5 {
println!("the value is: {}", a[index]);
index += 1;
}
}
the value is: 10
the value is: 20
the value is: 30
the value is: 40
the value is: 50
You can use for also.
fn main() {
let a = [10, 20, 30, 40, 50];
for element in a.iter() {
println!("the value is: {}", element);
}
}
Prefer for because of safety: there is no index that could go out of bounds.
fn main() {
for number in (1..4).rev() {
println!("{}!", number);
}
println!("LIFTOFF!!!");
}
rev reverses the iteration.
As a data structure, a heap is a tree in which each node compares >= or <= to its children. But in the context of a programming language, you can think of the heap as a free memory area that is assigned to a program (process) at execution time.
Ownership addresses the problems,
Once you understand ownership, you won’t need to think about the stack and the heap very often, but knowing that managing heap data is why ownership exists can help explain why it works the way it does.
In Rust, memory is managed through a system of ownership with a set of rules that the compiler checks at compile time. None of the ownership features slow down your program while it’s running.
This forum thread is a good reference about memory allocation in Rust (Rust has no garbage collector):
https://internals.rust-lang.org/t/jemalloc-was-just-removed-from-the-standard-library/8759
… the
std::alloc::System
type to represent the system’s default allocator.
We will learn about Box
laaaater (chapter 15).
let s = String::from("hello");
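String is allocated on the heap and as such is able to store an amount of text that is unknown to us at compile time.
In the case of a string literal (let literal = "I'm a string literal"), we know the contents at compile time, so the text is hardcoded directly into the final executable. This is why string literals are fast and efficient. But these properties only come from the string literal’s immutability.
With the String type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:
The memory must be requested from the memory allocator at runtime.
We need a way of returning this memory to the allocator when we’re done with our String.
The first part is done by us: when we call String::from, its implementation requests the memory it needs. However, the second part is different. Languages with a garbage collector (GC) handle it for us; Rust instead returns the memory once the variable that owns it goes out of scope.
When a variable goes out of scope, Rust calls a special function called drop, and it’s where the author of String can put the code to return the memory. Rust calls drop automatically at the closing curly bracket.
Example.1: Stack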
let x = 5;
let y = x;
5 will be stored on the stack.
Rust makes a copy of the value in x and binds it to y.
Two 5 values are pushed onto the stack.
The names x and y have no meaning in assembly (a.k.a. compiled code). Only the values 5 are stored on the real memory stack, and the Rust compiler remembers the location of each of these variables x and y.
Example.2: Heap
let s1 = String::from("hello");
let s2 = s1;
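A String is made up of three parts:
a pointer to the memory that holds the contents of the string,
a length, which is how much memory, in bytes, the contents of the String is currently using, and
a capacity, which is the total amount of memory, in bytes, that the String has received from the allocator.
When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to.
ptr, len and capacity are stored on the stack.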
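One way to observe that a move copies only the pointer, length, and capacity is to compare heap addresses before and after. This is a sketch under that assumption; the function names move_keeps_buffer and clone_copies_buffer are invented for the example.

```rust
// Hypothetical check: after a move, the new owner points at the same heap buffer.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // raw address of the heap data
    let s2 = s1; // move: only ptr, len and capacity are copied
    before == s2.as_ptr()
}

// Hypothetical check: clone allocates a second buffer with equal contents.
fn clone_copies_buffer() -> bool {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // deep copy of the heap data
    s1.as_ptr() != s2.as_ptr() && s1 == s2
}

fn main() {
    let s = String::from("hello");
    println!("len = {}, capacity = {}", s.len(), s.capacity());
    println!("move keeps buffer: {}", move_keeps_buffer());
    println!("clone copies buffer: {}", clone_copies_buffer());
}
```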
The following code fails at compile time.
let s1 = String::from("hello");
let s2 = s1;
println!("{}, world!", s1);
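This is known as a move: s1 was moved into s2.
Only s2 is valid; when it goes out of scope, it alone will free the memory.
To deeply copy the heap data, use clone.
fn main() {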
let s1 = String::from("hello");
let s2 = s1.clone();
println!("s1 = {}, s2 = {}", s1, s2);
}
// s1 = hello, s2 = hello
let x = 5;
let y = x;
println!("x = {}, y = {}", x, y);
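Rust has a special annotation called the Copy trait that we can place on types like integers that are stored on the stack.
As a general rule, any group of simple scalar values can be Copy, and nothing that requires allocation or is some form of resource is Copy.
Examples: u32, bool, f64, char, or tuples (if they only contain types that are also Copy).
The following code fails to compile at println!("{}", s).
fn main() {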
let s = String::from("hello"); // s comes into scope
takes_ownership(s); // s's value moves into the function...
// ... and so is no longer valid here
println!("{}", s)
}
fn takes_ownership(some_string: String) { // some_string comes into scope
println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
// memory is freed.
When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless the data has been moved to be owned by another variable.
Example:
fn main() {
let s1 = gives_ownership(); // gives_ownership moves its return
// value into s1
let s2 = String::from("hello"); // s2 comes into scope
let s3 = takes_and_gives_back(s2); // s2 is moved into
// takes_and_gives_back, which also
// moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 goes out of scope but was
// moved, so nothing happens. s1 goes out of scope and is dropped.
fn gives_ownership() -> String { // gives_ownership will move its
// return value into the function
// that calls it
let some_string = String::from("hello"); // some_string comes into scope
some_string // some_string is returned and
// moves out to the calling
// function
}
// takes_and_gives_back will take a String and return one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into
// scope
a_string // a_string is returned and moves out to the calling function
}
let s1 = String::from("hello");
let len = calculate_length(&s1);
fn calculate_length(s: &String) -> usize {
s.len()
}
&s1 is a reference. It refers to the value of s1 but does not take ownership of it.
Mutable reference (compiles):
fn main() {
let mut s = String::from("hello");
change(&mut s);
}
fn change(some_string: &mut String) {
some_string.push_str(", world");
}
Mutable reference, but double borrowing (compile error):
fn main() {
let mut s = String::from("hello");
let r1 = &mut s;
let r2 = &mut s;
println!("{}, {}", r1, r2);
}
fn main() {
let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem (multiple immutable references.)
println!("{} and {}", r1, r2);
// r1 and r2 are no longer used after this point
let r3 = &mut s; // no problem
println!("{}", r3);
}
let bytes = s.as_bytes();: s is a String and bytes is a slice of bytes (&[u8]).
fn first_word(s: &String) -> usize {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' { //search for the byte that represents the space by using the byte literal syntax.
return i; //If we find a space, we return the position
}
}
s.len() // Otherwise, we return the length of the string by using s.len()
}
fn main() {
let mut s = String::from("hello world");
let word = first_word(&s); // word will get the value 5
s.clear(); // this empties the String, making it equal to ""
// word still has the value 5 here, but there's no more string that
// we could meaningfully use the value 5 with. word is now totally "invalid"!
}
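Because we get a reference to the element from .iter().enumerate(), we use & in the pattern.
Because word isn’t connected to the state of s at all, word still contains the value 5.
A string slice is a reference to part of a String.
The type of a string slice is written &str.
The difference between String and str: https://stackoverflow.com/a/24159933/9923806
fn main() {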
let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
}
world contains a ptr to the byte at index 6 of s and a length of 5 (a slice is a reference).
Slices use the range syntax ..
fn first_word(s: &String) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
fn main() {
let mut s = String::from("hello world");
let word = first_word(&s); // immutable reference
s.clear(); // error! Because clear needs to truncate the String, it needs to get a mutable reference.
println!("the first word is: {}", word);
}
Because word holds an immutable borrow of s, s cannot be cleared (made ""), since that requires a mutable borrow.
Example: frequently used slice
fn main() {
let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];
}
This slice has the type &[i32]
.
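Going one step beyond the first_word shown earlier, a common refinement is to take a &str parameter instead of &String, so the same function accepts both String values and string literals. This is a sketch of that variation, not the exact code from the section above.

```rust
// Variation on first_word: `&str` accepts slices of Strings and literals alike.
fn first_word(s: &str) -> &str {
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        if byte == b' ' {
            return &s[..i]; // slice up to, but not including, the space
        }
    }
    s // no space found: the whole input is one word
}

fn main() {
    let owned = String::from("hello world");
    // A &String is coerced to &str automatically.
    println!("{}", first_word(&owned));
    // A string literal is already a &str.
    println!("{}", first_word("rust"));
}
```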
A struct, or structure, is a custom data type that lets you name and package together multiple related values that make up a meaningful group.
struct User {
username: String,
email: String,
sign_in_count: u64,
active: bool,
}
fn main() {
let user1 = User {
email: String::from("someone@example.com"),
username: String::from("someusername123"),
active: true,
sign_in_count: 1,
};
}
user1.email = String::from("anotheremail@example.com");
fn build_user(email: String, username: String) -> User {
User {
email: email,
username: username,
active: true,
sign_in_count: 1,
}
}
We can use the field init shorthand to rewrite build_user so that it behaves exactly the same but doesn’t have the repetition of email and username.
fn build_user(email: String, username: String) -> User {
User {
email,
username,
active: true,
sign_in_count: 1,
}
}
The syntax .. specifies that the remaining fields not explicitly set should have the same value as the fields in the given instance.
let user2 = User {
email: String::from("another@example.com"),
username: String::from("anotherusername567"),
..user1
};
user2
has a different value for email
and username
but has the same values for the active
and sign_in_count
fields from user1
.
struct Color(i32, i32, i32);
let black = Color(0, 0, 0);
You can define a struct
without fields:
struct AlwaysEqual;
let subject = AlwaysEqual;
It is called unit-like struct.
Using &str instead of String in a struct field:
It returns a compile error because the reference needs a lifetime specifier.
&str is a “string slice”, so it is a reference.
A struct field can hold a reference, but then a lifetime must be specified.
Every reference has a lifetime, which is the scope for which that reference is valid.
struct Rectangle {
width: u32,
height: u32,
}
fn main() {
let rect1 = Rectangle {
width: 30,
height: 50,
};
println!(
"The area of the rectangle is {} square pixels.",
area(&rect1)
);
}
fn area(rectangle: &Rectangle) -> u32 {
rectangle.width * rectangle.height
}
We want to borrow the struct rather than take ownership of it. This way, main retains its ownership and can continue using rect1
, which is the reason we use the &
in the function signature and where we call the function.
By default, the curly brackets {} tell println! to use formatting known as Display: output intended for direct end user consumption. For a struct, it is ambiguous how the output should look, so Rust doesn’t try to guess what we want, and structs don’t have a provided implementation of Display.
Use {:?} for debug output or {:#?} for pretty-print. Both require #[derive(Debug)] just before the struct definition as shown below.
#[derive(Debug)]
struct Rectangle {
width: u32,
height: u32,
}
fn main() {
let rect1 = Rectangle {
width: 30,
height: 50,
};
println!("rect1 is {:?}", rect1);
// rect1 is Rectangle { width: 30, height: 50 }
}
I added the annotation to derive the Debug trait and printed the Rectangle instance using debug formatting.
Rust has provided a number of traits for us to use with the derive annotation that can add useful behavior to our custom types.
About #[derive(Debug)]
, it’s called an attribute.
https://doc.rust-lang.org/rust-by-example/attribute.html
The first parameter of a method is always self, which represents the instance of the struct the method is being called on.
Methods are defined within an impl block.
#[derive(Debug)]
struct Rectangle {
width: u32,
height: u32,
}
impl Rectangle {
fn area(&self) -> u32 {
self.width * self.height
}
}
Methods can take ownership of self, borrow self immutably as we’ve done above, or borrow self mutably, just as they can any other parameter.
In C and C++, if object is a pointer, object->something() is similar to (*object).something().
Rust doesn’t have an equivalent to the -> operator; instead, Rust has a feature called automatic referencing and dereferencing.
When you call a method with object.something(), Rust automatically adds in &, &mut, or * so object matches the signature of the method.
impl Rectangle {
fn area(&self) -> u32 {
self.width * self.height
}
fn can_hold(&self, other: &Rectangle) -> bool {
self.width > other.width && self.height > other.height
}
}
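To see the two methods in use, here is a small self-contained sketch. The struct and methods repeat the definitions above; the three rectangles and their sizes are sample values chosen for illustration.

```rust
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }

    // True when `other` fits strictly inside `self` in both dimensions.
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}

fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    let rect2 = Rectangle { width: 10, height: 40 };
    let rect3 = Rectangle { width: 60, height: 45 };

    // `rect1.area()` and `(&rect1).area()` are equivalent thanks to
    // automatic referencing.
    println!("area of {:?}: {}", rect1, rect1.area());
    println!("Can rect1 hold rect2? {}", rect1.can_hold(&rect2));
    println!("Can rect1 hold rect3? {}", rect1.can_hold(&rect3));
}
```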
We can define functions within impl blocks that don’t take self as a parameter. These are called associated functions because they’re associated with the struct. The concept is similar to a static method.
#[derive(Debug)]
struct Rectangle {
width: u32,
height: u32,
}
impl Rectangle {
fn square(size: u32) -> Rectangle {
Rectangle {
width: size,
height: size,
}
}
}
fn main() {
let sq = Rectangle::square(3);
}
To call an associated function, we use the :: syntax with the struct name; let sq = Rectangle::square(3); is an example. This function is namespaced by the struct.
Each struct is allowed to have multiple impl blocks. (My memo) I can add new functions later.
We have already used an associated function: String::from.
An enum definition is a kind of custom data type.
This YouTube video explains what an enumeration type is in C (the video is very understandable).
Introduction to Enumerations in C
I can regard enum
as kind of a lookup table in this simplest case.
Example 1: IP (enum
in this example is similar to enum
in C ).
enum IpAddrKind {
V4,
V6,
}
The custom data type is IpAddrKind
.
The variant of type IpAddrKind
could be either V4
or V6
.
The variants of the enum are namespaced under its identifier, and we use a double colon to separate the two:
let four = IpAddrKind::V4;
let six = IpAddrKind::V6;
As in C, each variant can be mapped to an integer (let x = IpAddrKind::V4 as i32;).
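The integer mapping can be checked with a short sketch. IpAddrKind is the enum from above; the second enum, HttpStatus, is a hypothetical example added only to show explicit discriminant values.

```rust
#[derive(Debug)]
enum IpAddrKind {
    V4, // implicit discriminant 0
    V6, // implicit discriminant 1
}

// Hypothetical enum with explicit discriminants, as in C.
#[derive(Debug)]
enum HttpStatus {
    Created = 201,
    NotFound = 404,
}

fn main() {
    println!("{:?} = {}", IpAddrKind::V4, IpAddrKind::V4 as i32);
    println!("{:?} = {}", IpAddrKind::V6, IpAddrKind::V6 as i32);
    println!("{:?} = {}", HttpStatus::Created, HttpStatus::Created as i32);
    println!("{:?} = {}", HttpStatus::NotFound, HttpStatus::NotFound as i32);
}
```

Note that this cast only works for enums whose variants carry no data.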
The reason this is useful is that both values IpAddrKind::V4
and IpAddrKind::V6
are of the same type: IpAddrKind
. We can then, for instance, define a function that takes any IpAddrKind
:
fn route(ip_kind: IpAddrKind) {}
And we can call this function with either variant:
route(IpAddrKind::V4);
route(IpAddrKind::V6);
Example 2: IP with the address.
We can attach data to each enum variant:
enum IpAddr {
V4(String),
V6(String),
}
let home = IpAddr::V4(String::from("127.0.0.1"));
let loopback = IpAddr::V6(String::from("::1"));
enum IpAddr {
V4(u8, u8, u8, u8),
V6(String),
}
let home = IpAddr::V4(127, 0, 0, 1);
let loopback = IpAddr::V6(String::from("::1"));
struct Ipv4Addr {
// --snip--
}
struct Ipv6Addr {
// --snip--
}
enum IpAddr {
V4(Ipv4Addr),
V6(Ipv6Addr),
}
Example 3: Message.
enum Message {
Quit,
Move { x: i32, y: i32 },
Write(String),
ChangeColor(i32, i32, i32),
}
All of these variants are grouped together under the Message enum defined above, which is a single type.
Methods can also be defined on an enum with impl.
impl Message {
fn call(&self) {
// method body would be defined here
}
}
let m = Message::Write(String::from("hello"));
m.call();
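Since the body of call is left empty above, here is one possible sketch of what a method on Message could do. The method name describe and its output strings are assumptions made for this example; it uses match to pull the data out of each variant.

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

impl Message {
    // Hypothetical method: returns a text description of the message.
    fn describe(&self) -> String {
        match self {
            Message::Quit => String::from("quit"),
            Message::Move { x, y } => format!("move to ({}, {})", x, y),
            Message::Write(text) => format!("write: {}", text),
            Message::ChangeColor(r, g, b) => format!("color: {}, {}, {}", r, g, b),
        }
    }
}

fn main() {
    let messages = [
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hello")),
        Message::ChangeColor(0, 128, 255),
    ];
    for m in messages.iter() {
        println!("{}", m.describe());
    }
}
```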
My homework: how does the Rust compiler compile an enum into machine code…?
Keep reading, even if you don’t understand Option the first time.
YouTube video: Rust Programming Tutorial #37 - Option (Enum)
Option is another enum defined by the standard library.
The Option type is used in many places because it encodes the very common scenario in which a value could be something or it could be nothing. Expressing this concept in terms of the type system means the compiler can check whether you’ve handled all the cases you should be handling; this functionality can prevent bugs that are extremely common in other programming languages.
Rust does not have null; instead, the enum that encodes the concept of a value being present or absent is Option<T>, and it is defined by the standard library as follows:
enum Option<T> {
Some(T),
None,
}
You can use Some and None directly without the Option:: prefix.
<T> means the Some variant of the Option enum can hold one piece of data of any type.
let some_number = Some(5);
let some_string = Some("a string");
let absent_number: Option<i32> = None;
If we use None rather than Some, we need to tell Rust what type of Option<T> we have.
Why is having Option<T> any better than having null?
Because Option<T> and T (where T can be any type) are different types, the compiler won’t let us use an Option<T> value as if it were definitely a valid value.
In the following code, the line that computes sum produces a compile error because Rust doesn’t understand how to add an i8 and an Option<i8>.
fn main() {
let x: i8 = 5;
let y: Option<i8> = Some(5);
let sum = x + y;
}
This means, when we have a value of a type like i8
in Rust, the compiler will ensure that we always have a valid value.
In other words, you have to convert an Option<T>
to a T
before you can perform T
operations with it (usually done by match
in the next section).
Generally, this helps catch one of the most common issues with null.
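A minimal sketch of the conversion described above: the Option<i8> is unpacked with match before the addition. The function name add_optional and the choice to treat None as “add nothing” are assumptions made for this example.

```rust
// Hypothetical helper: adds `y` to `x` only when `y` actually holds a value.
fn add_optional(x: i8, y: Option<i8>) -> i8 {
    match y {
        Some(value) => x + value, // a real i8 is available here
        None => x,                // the None case must be handled explicitly
    }
}

fn main() {
    let x: i8 = 5;
    println!("{}", add_optional(x, Some(5)));
    println!("{}", add_optional(x, None));
    // The same idea using a standard library method:
    println!("{}", x + Some(5).unwrap_or(0));
}
```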
The match Control Flow Operator
Here is an example. (Tip: from 1999 through 2008, the United States minted quarters with different designs for each of the 50 states on one side.)
#[derive(Debug)] // so we can inspect the state in a minute
enum UsState {
Alabama,
Alaska,
// --snip--
}
enum Coin {
Penny,
Nickel,
Dime,
Quarter(UsState),
}
fn value_in_cents(coin: Coin) -> u8 {
match coin {
Coin::Penny => 1,
Coin::Nickel => 5,
Coin::Dime => 10,
Coin::Quarter(state) => {
println!("State quarter from {:?}!", state);
25
}
}
}
Matching with Option<T>
Especially in the case of Option<T>, when Rust prevents us from forgetting to explicitly handle the None case, it protects us from assuming that we have a value when we might have null, thus making the billion-dollar mistake impossible.
The _ Placeholder
The _ pattern will match all the possible cases that aren’t specified before it.
if let
The if let
syntax lets you combine if
and let
into a less verbose way to handle values that match one pattern while ignoring the rest.
let some_u8_value = Some(0u8);
// no if let syntax
match some_u8_value {
Some(3) => println!("three"),
_ => (),
}
// same as above (with if let syntax)
// note: the pattern is its first arm.
if let Some(3) = some_u8_value {
println!("three");
}
We can include an else with an if let.
match coin {
Coin::Quarter(state) => println!("State quarter from {:?}!", state),
_ => count += 1,
}
// same as
if let Coin::Quarter(state) = coin {
println!("State quarter from {:?}!", state);
} else {
count += 1;
}
When do we use if let?
Using if let means less typing, less indentation, and less boilerplate code. However, you lose the exhaustive checking that match enforces. Choosing between match and if let depends on what you’re doing in your particular situation and whether gaining conciseness is an appropriate trade-off for losing exhaustive checking.
With if let, we don’t need to write the _ arm that match would require.
The difference from a plain if is that, in place of a condition expression, if let expects the keyword let followed by a pattern, an = and a scrutinee expression.
A package can contain multiple binary crates and optionally one library crate.
A package contains a Cargo.toml file that describes how to build those crates.
When we run cargo new hello_cargo, it creates the package hello_cargo, and this is described in the Cargo.toml file.
The new package only has src/main.rs, meaning it only contains a binary crate named hello_cargo.
Sample: Cargo.toml
[package]
name = "hello_cargo"
version = "0.1.0"
authors = ["atlex <itsme@myemail.com>"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
Conventions
src/main.rs is the crate root of a binary crate with the same name as the package.
If the package directory contains src/lib.rs, the package contains a library crate with the same name as the package, and src/lib.rs is its crate root.
If a package contains src/main.rs and src/lib.rs, it has two crates: a library and a binary, both with the same name as the package.
A package can have multiple binary crates by placing files in the src/bin directory.
My note: in terms of crates, a library and a binary can be regarded as different elements.
The section was too verbose for me when it comes to practice, so I summarized some practical notes.
The entry point is src/main.rs.
src/main.rs handles running the program, and src/lib.rs handles all the logic of the task at hand. We can see how it works in section 12.3.
A module lives in src/{{ name_of_module }}.rs or src/{{ name_of_module }}/mod.rs.
mod {{ name_of_module }}; declares the module and loads its file.
For details:
The use keyword brings a path into scope. Written in the parent code.
The pub keyword makes items public. Written in the child code.
cargo new --lib restaurant
restaurant
├── Cargo.toml
└── src
└── lib.rs
Write lib.rs
as follows.
mod front_of_house {
mod hosting {
fn add_to_waitlist() {}
fn seat_at_table() {}
}
mod serving {
fn take_order() {}
fn serve_order() {}
fn take_payment() {}
}
}
Module tree in this example
crate
└── front_of_house
├── hosting
│ ├── add_to_waitlist
│ └── seat_at_table
└── serving
├── take_order
├── serve_order
└── take_payment
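An absolute path starts from the crate root (src/lib.rs) by using a crate name or a literal crate.
A relative path starts from the current module and uses self, super, or an identifier in the current module.
Both absolute and relative paths are followed by one or more identifiers separated by double colons (::).
src/lib.rs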
mod front_of_house {
pub mod hosting {
pub fn add_to_waitlist() {}
}
}
pub fn eat_at_restaurant() {
// Absolute path
crate::front_of_house::hosting::add_to_waitlist();
// Relative path
front_of_house::hosting::add_to_waitlist();
}
We can also construct relative paths that begin in the parent module by using super at the start of the path. This is like starting a filesystem path with the .. syntax.
fn serve_order() {}
mod back_of_house {
fn fix_incorrect_order() {
cook_order();
super::serve_order();
}
fn cook_order() {}
}
We think the back_of_house
module and the serve_order
function are likely to stay in the same relationship to each other and get moved together should we decide to reorganize the crate’s module tree. Therefore, we used super
so we’ll have fewer places to update code in the future if this code gets moved to a different module.
If we use pub
before a struct definition, we make the struct public, but the struct’s fields will still be private. We can make each field public or not on a case-by-case basis.
mod back_of_house {
pub struct Breakfast {
pub toast: String,
seasonal_fruit: String,
}
impl Breakfast {
pub fn summer(toast: &str) -> Breakfast {
Breakfast {
toast: String::from(toast),
seasonal_fruit: String::from("peaches"),
}
}
}
}
pub fn eat_at_restaurant() {
let mut meal = back_of_house::Breakfast::summer("Rye");
meal.toast = String::from("Wheat");
println!("I'd like {} toast please", meal.toast);
}
We’ve defined a public back_of_house::Breakfast
struct with a public toast
field but a private seasonal_fruit
field. This models the case in a restaurant where the customer can pick the type of bread that comes with a meal, but the chef decides which fruit accompanies the meal based on what’s in season and in stock. The available fruit changes quickly, so customers can’t choose the fruit or even see which fruit they’ll get.
In contrast, if we make an enum public, all of its variants are public. We only need pub before the enum keyword.
mod back_of_house {
pub enum Appetizer {
Soup,
Salad,
}
}
pub fn eat_at_restaurant() {
let order1 = back_of_house::Appetizer::Soup;
let order2 = back_of_house::Appetizer::Salad;
}
The use Keyword
We can bring a path into scope with the use keyword.
mod front_of_house {
pub mod hosting {
pub fn add_to_waitlist() {}
}
}
use crate::front_of_house::hosting;
//or
//use self::front_of_house::hosting;
pub fn eat_at_restaurant() {
hosting::add_to_waitlist();
hosting::add_to_waitlist();
hosting::add_to_waitlist();
}
The following use compiles, but it is unidiomatic (bad).
use crate::front_of_house::hosting::add_to_waitlist;
pub fn eat_at_restaurant() {
add_to_waitlist();
add_to_waitlist();
add_to_waitlist();
}
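With this style, it is unclear at the call site where add_to_waitlist is defined.
A related case: two types with the same name from different modules cannot both be brought into the same scope. Renaming one of them with as avoids the conflict.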
use std::fmt::Result;
use std::io::Result as IoResult;
When we bring a name into scope with the use keyword, the name available in the new scope is private. Writing pub use instead is called re-exporting: with this syntax, external code can also use the name.
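My note: a minimal sketch of re-exporting, assuming the restaurant module layout from above. The return values are an addition of mine so that the result is easy to check, and I use a self:: path here so the snippet does not depend on where it is pasted.

```rust
// Module layout mirroring the restaurant example.
mod front_of_house {
    pub mod hosting {
        // Returns a message instead of doing real work (assumption for this sketch).
        pub fn add_to_waitlist() -> &'static str {
            "added to waitlist"
        }
    }
}

// Re-export: `hosting` becomes part of this crate's public API, so external
// code can reach it as `<crate_name>::hosting::add_to_waitlist()`.
pub use self::front_of_house::hosting;

pub fn eat_at_restaurant() -> &'static str {
    // Inside the crate, the shortened path works as with a plain `use`.
    hosting::add_to_waitlist()
}
```

With a plain use instead of pub use, eat_at_restaurant would still compile, but code outside the crate could not name hosting.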
Note that the standard library (std
) is also a crate that’s external to our package. Because the standard library is shipped with the Rust language, we don’t need to change Cargo.toml
to include std
. But we do need to refer to it with use to bring items from there into our package’s scope.
Here are smart ways to use
.
// old
//use std::cmp::Ordering;
//use std::io;
// New!
use std::{cmp::Ordering, io};
// How about this?
//use std::io;
//use std::io::Write;
// Here!
use std::io::{self, Write};
To bring all public items defined in a path into scope, specify that path followed by *, the glob operator:
use std::collections::*;
The glob operator is often used when testing to bring everything under test into the tests module.
src/lib.rs
mod front_of_house;
pub use crate::front_of_house::hosting;
// --snip--
src/front_of_house.rs
pub mod hosting {
pub fn add_to_waitlist() {
// --snip--
}
}
Using a semicolon after mod front_of_house
rather than using a block tells Rust to load the contents of the module from another file with the same name as the module.
My note: sample of an available depth structure
src/lib.rs
: pub use crate::front_of_house::hosting
src/front_of_house.rs
: pub mod hosting;
src/front_of_house/hosting.rs
: pub fn add_to_waitlist() {}
Vector
Vec<T>
let v: Vec<i32> = Vec::new();
Rust provides the vec! macro for convenience. The macro will create a new vector that holds the values you give it.
let v = vec![1, 2, 3];
push
let mut v = Vec::new();
v.push(5);
There are two ways to reference a value stored in a vector: &v[2] and v.get(2).
&v[2] gives a reference to the element (&T), and v.get(2) returns Option<&T>.
&v[100] will cause the program to panic when it references a nonexistent element (i.e. there is no element at index 100 in v). When the get method is passed an index that is outside the vector, it returns None without panicking.
Use the get method if accessing an element beyond the range of the vector happens occasionally under normal circumstances.
Sample code of &v[2] and v.get():
let v = vec![1, 2, 3, 4, 5];
let third: &i32 = &v[2];
println!("The third element is {}", third);
match v.get(2) {
Some(third) => println!("The third element is {}", third),
None => println!("There is no third element."),
}
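My note: a small sketch of the "use get when the index may be out of range" advice. element_or is a hypothetical helper name, not something from the book.

```rust
/// Returns the element at `index`, or `fallback` when the index is out of range.
/// `element_or` is a hypothetical helper used only for this sketch.
fn element_or(v: &[i32], index: usize, fallback: i32) -> i32 {
    match v.get(index) {
        Some(value) => *value, // in range: copy the i32 out of the reference
        None => fallback,      // out of range: no panic, use the fallback
    }
}
```

For v = vec![1, 2, 3, 4, 5], element_or(&v, 2, 0) gives 3, while element_or(&v, 100, -1) gives -1 instead of panicking.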
v.push(6);
.
let mut v = vec![1, 2, 3, 4, 5];
let first = &v[0]; // immutable borrow starts here
v.push(6); // mutable borrow: rejected (E0502) because first is still used below
println!("The first element is: {}", first); // immutable borrow is used here
The push and pop methods operate on the last element of the vector.
// Just referencing
let v = vec![100, 32, 57];
for i in &v {
println!("{}", i);
}
// Change elements
let mut v = vec![100, 32, 57];
for i in &mut v {
*i += 50;
}
The * in *i is the dereference operator. (Details are in Chapter 15)
There are definitely use cases for needing to store a list of items of different types. In that case, use an enum.
enum SpreadsheetCell {
Int(i32),
Float(f64),
Text(String),
}
let row = vec![
SpreadsheetCell::Int(3),
SpreadsheetCell::Text(String::from("blue")),
SpreadsheetCell::Float(10.12),
];
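Rust has only one string type in the core language: the string slice str that is usually seen in its borrowed form &str.
When Rustaceans refer to “strings”, they usually mean the String and the string slice &str types, not just one of those types.
Both String and a string slice &str are UTF-8 encoded.
To create a String from initial data, we can use the to_string method, which is available on any type that implements the Display trait, as string literals do.
The following code uses the to_string method to create a String from a string literal.
// the method works on a literal directly: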
let s = "initial contents".to_string();
// same as
let s = String::from("initial contents");
We can grow a String by using the push_str method to append a string slice.
let mut s = String::from("foo");
s.push_str("bar");
// s is now "foobar"
The push_str method takes a string slice because we don’t necessarily want to take ownership of the parameter. Therefore, the following code prints s2 is bar instead of failing to compile.
let mut s1 = String::from("foo");
let s2 = "bar";
s1.push_str(s2); // push_str() doesn't take ownership of s2
println!("s2 is {}", s2);
Concatenation with the + Operator or the format! Macro
The following code packs in a lot of concepts.
let s1 = String::from("Hello, ");
let s2 = String::from("world!");
let s3 = s1 + &s2;
Before discussing the code above, we should know that the + operator uses the add method, whose “signature” looks something like this (but isn’t exact):
fn add(self, s: &str) -> String {
Two points about let s3 = s1 + &s2;:
add takes ownership of s1 (s1 becomes self of the add function), so s1 is no longer valid afterwards. The result is bound to s3.
The + operator uses the add method, whose input is &str, not &String. The reason we’re able to use &s2 in the call to add is that the compiler can coerce the &String argument into a &str. When we call the add method, Rust uses a deref coercion, which here turns &s2 into &s2[..].
We can also concatenate multiple Strings with the format! macro.
let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");
let s = format!("{}-{}-{}", s1, s2, s3);
The format! version is much easier to read and doesn’t take ownership of any of its parameters.
The format! macro works in the same way as println!, but instead of printing the output to the screen, it returns a String with the contents.
Rust doesn’t allow us to get the n-th character of a String by index. The following code fails to compile.
let s1 = String::from("hello");
let h = s1[0];
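The reason is…?
String is a wrapper over a Vec<u8>.
Both String and a string slice &str are UTF-8 encoded, so one character may take more than one byte and a byte index does not always correspond to a character.
// The u8 values of the String
[224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164, 224, 165, 135]
// correspond to these Unicode scalar values (char)
['न', 'म', 'स', '्', 'त', 'े']
// which form these grapheme clusters (what a reader would call letters)
["न", "म", "स्", "ते"]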
Example 1. Slicing by byte range.
let hello = "Здравствуйте";
let s = &hello[0..4];
// s will be Зд
// &hello[0..1] panics at run time:
// thread 'main' panicked at 'byte index 1 is not a char boundary;
Example 2. Iterating over characters.
for c in "नमस्ते".chars() {
println!("{}", c);
}
// न
// म
// स
// ्
// त
// े
Example 3. Iterating over bytes.
for b in "नमस्ते".bytes() {
println!("{}", b);
}
//224
//164
// --snip--
//165
//135
Be sure to remember that valid Unicode scalar values may be made up of more than 1 byte.
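My note: a minimal sketch that puts the three examples side by side. byte_and_char_len and safe_slice are hypothetical helper names; str::len, chars, and str::get are standard library methods.

```rust
/// Returns (length in bytes, number of `char`s) for a string slice.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Slices by byte range without panicking: `str::get` returns `None`
/// when the range does not fall on `char` boundaries.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

For "नमस्ते", byte_and_char_len reports 18 bytes but only 6 chars. safe_slice("Здравствуйте", 0, 4) yields Some("Зд"), while the range 0..1 yields None instead of the panic shown in Example 1.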
Example:
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
HashMap<K, V>
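stores a mapping of keys of type K to values of type V.
For owned types such as String, insert moves the values into the hash map, which becomes their owner. Types that implement Copy, such as i32, are copied instead.
Example: Combining two Vec into a HashMap.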
use std::collections::HashMap;
let teams = vec![String::from("Blue"), String::from("Yellow")];
let initial_scores = vec![10, 50];
let mut scores: HashMap<_, _> =
teams.into_iter().zip(initial_scores.into_iter()).collect();
Done by get
method.
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
let team_name = String::from("Blue");
let score = scores.get(&team_name);
Note that the result of scores.get(&team_name)
is Some(&10)
because get
returns an Option<&V>
; if there’s no value for that key in the hash map, get
will return None
.
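My note: since get returns an Option, a common follow-up is to turn a missing key into a default value. A small sketch, where score_or_zero is a hypothetical helper name:

```rust
use std::collections::HashMap;

/// Looks up a team's score, treating a missing team as 0.
/// `score_or_zero` is a hypothetical helper for this sketch.
fn score_or_zero(scores: &HashMap<String, i32>, team: &str) -> i32 {
    // get -> Option<&i32>; copied -> Option<i32>; unwrap_or supplies the default.
    scores.get(team).copied().unwrap_or(0)
}
```

Note that the lookup key can be a plain &str even though the map's keys are String.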
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
for (key, value) in &scores {
println!("{}: {}", key, value);
}
Case 1. Overwriting a value. Calling insert again with the same key replaces the old value, because each key in a HashMap is unique.
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Blue"), 25);
println!("{:?}", scores); // {"Blue": 25}
Case 2. Only inserting a value if the key has no value. or_insert
method.
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.entry(String::from("Yellow")).or_insert(50);
scores.entry(String::from("Blue")).or_insert(50);
println!("{:?}", scores); // {"Yellow": 50, "Blue": 10} (iteration order is arbitrary)
entry
method returns an enum called Entry
that represents a value that might or might not exist.
Case 3. Updating a value based on the old value. This uses the dereference operator * (for now, just notice it; details are in Chapter 15).
use std::collections::HashMap;
fn main(){
let text = "hello world wonderful world";
let mut map = HashMap::new();
for word in text.split_whitespace() {
let count = map.entry(word).or_insert(0);
*count += 1;
}
println!("{:?}", map);
}
The or_insert
method actually returns a mutable reference (&mut V
) to the value for this key. Here we store that mutable reference in the count
variable, so in order to assign to that value, we must first dereference count
using the asterisk (*
).
As its hashing algorithm, HashMap uses SipHash by default (as of Apr. 2021).
My note: a slide about SipHash.
Rust groups errors into two major categories: recoverable and unrecoverable errors.
Rust doesn’t have exceptions.
Instead, it has the type Result<T, E>
for recoverable errors and the panic!
macro that stops execution when the program encounters an unrecoverable error.
panic!
When the panic! macro executes, your program will print a failure message, unwind and clean up the stack, and then quit.
There are two ways a panic can proceed: unwinding and aborting.
Unwinding walks back up the stack and cleans up the data of each function, which is a lot of work. Aborting ends the program immediately without cleaning up and is the alternative.
Panic example: Buffer overread
fn main() {
let v = vec![1, 2, 3];
v[99];
}
The key to reading the backtrace is to start from the top and read until you see files you wrote.
RUST_BACKTRACE=1 cargo run
Recall Result
enum.
enum Result<T, E> {
Ok(T),
Err(E),
}
<T, E>
means “T and E are generic type parameters”.
A good error handling example: opening a file.
use std::fs::File;
fn main() {
let f = File::open("hello.txt");
let f = match f {
Ok(file) => file,
Err(error) => panic!("Problem opening the file: {:?}", error),
};
}
Run without the file hello.txt
.
$ cargo run
... (warning about _f)
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/panic`
thread 'main' panicked at 'Problem opening the file: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:8:23
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Example: handling different kinds of errors in different ways.
use std::fs::File;
use std::io::ErrorKind;
fn main() {
let f = File::open("hello.txt");
let f = match f {
Ok(file) => file,
Err(error) => match error.kind() {
ErrorKind::NotFound => match File::create("hello.txt") {
Ok(fc) => fc,
Err(e) => panic!("Problem creating the file: {:?}", e),
},
other_error => {
panic!("Problem opening the file: {:?}", other_error)
}
},
};
}
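The following version is more concise because it has no nested match expressions.
unwrap_or_else: Implemented on Result<T, E> (Option<T> has a method of the same name). For a Result, it returns the contained Ok value or computes a value from a closure that receives the error.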
use std::fs::File;
use std::io::ErrorKind;
fn main() {
let f = File::open("hello.txt").unwrap_or_else(|error| {
if error.kind() == ErrorKind::NotFound {
File::create("hello.txt").unwrap_or_else(|error| {
panic!("Problem creating the file: {:?}", error);
})
} else {
panic!("Problem opening the file: {:?}", error);
}
});
}
unwrap
<- Used frequently (IMO)
The Result<T, E> type has many helper methods defined on it to do various tasks. One of those methods, called unwrap, is a shortcut method that is implemented just like the match expression.
If the Result
value is the Ok
variant, unwrap
will return the value inside the Ok
. If the Result
is the Err
variant, unwrap
will call the panic!
macro for us.
use std::fs::File;
fn main() {
let f = File::open("hello.txt").unwrap();
}
expect
Similar to unwrap
, but it lets us also choose the panic!
error message.
use std::fs::File;
fn main() {
let f = File::open("hello.txt").expect("Failed to open hello.txt");
}
? operator
use std::fs::File;
use std::io;
use std::io::Read;
fn read_username_from_file() -> Result<String, io::Error> {
let mut f = File::open("hello.txt")?;
let mut s = String::new();
f.read_to_string(&mut s)?;
Ok(s)
}
The ? placed after a Result value is defined to work as follows:
If the value of the Result is an Ok, the value inside the Ok will get returned from this expression, and the program will continue.
If the value is an Err, the Err will be returned from the whole function so the error value gets propagated to the calling code.
Error values that have the ?
operator called on them go through the from
function, defined in the From
trait in the standard library, which is used to convert errors from one type into another.
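My note: a minimal sketch of the From conversion that ? performs. AppError and parse_and_double are hypothetical names made up for this example; ParseIntError and str::parse come from the standard library.

```rust
use std::num::ParseIntError;

// Hypothetical application error type for this sketch.
#[derive(Debug, PartialEq)]
enum AppError {
    Parse(ParseIntError),
}

// This impl is what lets `?` turn a ParseIntError into an AppError.
impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError::Parse(e)
    }
}

fn parse_and_double(input: &str) -> Result<i32, AppError> {
    // parse() yields Result<i32, ParseIntError>. On Err, `?` calls
    // AppError::from(err) and returns early from this function.
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}
```

Without the From impl, the ? line would not compile, because the function's error type is AppError rather than ParseIntError.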
The ?
operator can be used in functions that have a return type of Result
.
We’re only allowed to use the ?
operator in a function that returns Result
or Option
or another type that implements std::ops::Try
.
When you’re writing code in a function that doesn’t return one of these types, and you want to use ?
when you call other functions that return Result<T, E>
, one technique is to change the return type of your function to be Result<T, E>
if you have no restrictions preventing that.
The main
function is special, and there are restrictions on what its return type must be. One valid return type for main
is ()
, and conveniently, another valid return type is Result<T, E>
.
use std::error::Error;
use std::fs::File;
fn main() -> Result<(), Box<dyn Error>> {
let f = File::open("hello.txt")?;
Ok(())
}
For now, you can read Box<dyn Error>
to mean “any kind of error.”
Rust provides the convenient fs::read_to_string
function that opens the file, creates a new String
, reads the contents of the file, puts the contents into that String
, and returns it.
use std::fs;
use std::io;
fn read_username_from_file() -> Result<String, io::Error> {
fs::read_to_string("hello.txt")
}
To panic! or Not to panic!
Returning Result
is a good default choice when you’re defining a function that might fail.
(My note: with Result, the caller can decide how to handle the error, whereas panic! stops the program.)
The unwrap
and expect
methods are very handy when prototyping, before you’re ready to decide how to handle errors.
In tests, panic! is how a test is marked as a failure. (My note: a single panic fails the whole test function)
panic!
is often appropriate if you’re calling external code that is out of your control and it returns an invalid state that you have no way of fixing.
However, when failure is expected, it’s more appropriate to return a Result
than to make a panic!
call.
Functions often have contracts: their behavior is only guaranteed if the inputs meet particular requirements. Panicking when the contract is violated makes sense because a contract violation always indicates a caller-side bug and it’s not a kind of error you want the calling code to have to explicitly handle. … Contracts for a function, especially when a violation will cause a panic, should be explained in the API documentation for the function.
My note: for validation, use Rust’s type system.
We can make a new type and put the validations in a function to create an instance of the type rather than repeating the validations everywhere. That way, it’s safe for functions to use the new type in their signatures and confidently use the values they receive.
Example:
pub struct Guess {
value: i32,
}
impl Guess {
pub fn new(value: i32) -> Guess {
if value < 1 || value > 100 {
panic!("Guess value must be between 1 and 100, got {}.", value);
}
Guess { value }
}
pub fn value(&self) -> i32 {
self.value
}
}
pub fn value(&self) -> i32 is called a getter. This public method is necessary because the value field of the Guess struct is private.
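My note: when an out-of-range value is an expected failure rather than a caller bug, the same validation can return a Result instead of panicking. This is a sketch under that assumption; try_new and the String error type are my own choices, not from the book.

```rust
pub struct Guess {
    value: i32,
}

impl Guess {
    /// Like `new`, but reports an out-of-range value as an `Err`
    /// instead of panicking. `try_new` is a name chosen for this sketch.
    pub fn try_new(value: i32) -> Result<Guess, String> {
        if value < 1 || value > 100 {
            return Err(format!(
                "Guess value must be between 1 and 100, got {}.",
                value
            ));
        }
        Ok(Guess { value })
    }

    /// Getter for the private field.
    pub fn value(&self) -> i32 {
        self.value
    }
}
```

Either way, code that receives a Guess can rely on the value being between 1 and 100, because the private field can only be set through the validating constructor.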
Generics are abstract stand-ins for concrete types or other properties.
Similar to the way a function takes parameters with unknown values to run the same code on multiple concrete values, functions can take parameters of some generic type instead of a concrete type, like i32
or String
.
The core concept is “removing duplication by extracting a function.”
In case of a function:
Tips: By convention, type parameter names in Rust are short, often just a letter, and Rust’s type-naming convention is CamelCase. Short for “type,” T is the default choice of most Rust programmers.
Practice: We combine the two functions below.
fn largest_i32(list: &[i32]) -> &i32 {
let mut largest = &list[0];
for item in list {
if item > largest {
largest = item;
}
}
largest
}
fn largest_char(list: &[char]) -> &char {
let mut largest = &list[0];
for item in list {
if item > largest {
largest = item;
}
}
largest
}
First, define a generic function.
fn largest<T>(list: &[T]) -> &T {
<>
The function has one parameter named list.
list is a slice of values of type T.
fn largest<T>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list {
if item > largest {
largest = item;
}
}
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
let result = largest(&number_list);
println!("The largest number is {}", result);
let char_list = vec!['y', 'm', 'a', 'q'];
let result = largest(&char_list);
println!("The largest char is {}", result);
}
It looks fine, but unfortunately it fails to compile.
error[E0369]: binary operation `>` cannot be applied to type `&T`
--> src/main.rs:5:17
|
5 | if item > largest {
| ---- ^ ------- &T
| |
| &T
|
help: consider restricting type parameter `T`
|
1 | fn largest<T: std::cmp::PartialOrd>(list: &[T]) -> &T {
| ^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to previous error
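The root cause is that the body compares values of type T with >, but nothing in the signature guarantees that every possible T implements the std::cmp::PartialOrd trait, which provides that operator.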
The final answer would be as follows, which is covered in the next section.
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
let mut largest = list[0];
for &item in list {
if item > largest {
largest = item;
}
}
largest
}
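My note: the Copy bound is only needed because that version copies elements out of the slice. A variant that returns a reference needs only PartialOrd, so it also works for non-Copy types such as String. largest_ref is a hypothetical name for this sketch.

```rust
/// Returns a reference to the largest element.
/// Only `PartialOrd` is required because nothing is copied out of the slice.
/// Panics if `list` is empty, because of the `list[0]` indexing.
fn largest_ref<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];
    for item in list {
        if item > largest {
            largest = item;
        }
    }
    largest
}
```

The trade-off is that the caller gets a borrow tied to the slice instead of an owned value.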
We can define structs to use a generic type parameter in one or more fields using the <>
syntax.
struct Point<T> {
x: T,
y: T,
}
fn main() {
let integer = Point { x: 5, y: 10 };
let float = Point { x: 1.0, y: 4.0 };
}
To define a Point
struct where x
and y
are both generics but could have different types…
struct Point<T, U> {
x: T,
y: U,
}
Recall Option from Chapter 6.
enum Option<T> {
Some(T),
None,
}
Recall Result from Chapter 9.
enum Result<T, E> {
Ok(T),
Err(E),
}
When you recognize situations in your code with multiple struct or enum definitions that differ only in the types of the values they hold, you can avoid duplication by using generic types instead.
impl<T>
.
By declaring T
as a generic type after impl
, Rust can identify that the type in the angle brackets in Point
is a generic type rather than a concrete type.
struct Point<T> {
x: T,
y: T,
}
impl<T> Point<T> {
fn x(&self) -> &T {
&self.x
}
}
fn main() {
let p = Point { x: 5, y: 10 };
println!("p.x = {}", p.x());
}
Defined a method named x
on Point<T>
that returns a reference to the data in the field x
.
When we write impl Point<f32>, the methods are implemented only for Point<f32>, not for Point<T> with any other T.
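My note: a short sketch of a method that exists only for one concrete type, following the distance example from the book.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available on every Point<T>.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Available only on Point<f32>; a Point<i32> has no such method.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}
```

Calling distance_from_origin on a Point { x: 5, y: 10 } would be a compile error, because that value is a Point<i32>.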
The good news is that Rust implements generics in such a way that your code doesn’t run any slower using generic types than it would with concrete types.
Monomorphization is the process of turning generic code into specific code by filling in the concrete types that are used when compiled.
For example, when Rust compiles the following code, it performs monomorphization.
let integer = Some(5);
let float = Some(5.0);
A trait tells the Rust compiler about functionality a particular type has and can share with other types.
pub trait Summary {
fn summarize(&self) -> String;
}
Interpret as “any type that has the Summary
trait will have the method summarize
.”
Implementing the trait on a type
pub struct NewsArticle {
pub headline: String,
pub location: String,
pub author: String,
pub content: String,
}
impl Summary for NewsArticle {
fn summarize(&self) -> String {
format!("{}, by {} ({})", self.headline, self.author, self.location)
}
}
How to use traits to define functions that accept many different types.
pub fn notify(item: &impl Summary) {
println!("Breaking news! {}", item.summarize());
}
Instead of a concrete type for the item
parameter, we specify the impl
keyword and the trait name. This parameter accepts any type that implements the specified trait.
The above is actually syntax sugar for a longer form,
pub fn notify<T: Summary>(item: &T) {
println!("Breaking news! {}", item.summarize());
}
Multiple parameters.
// item1 and item2 may be different types
pub fn notify(item1: &impl Summary, item2: &impl Summary)
// item1 and item2 must be the same type
pub fn notify<T: Summary>(item1: &T, item2: &T)
Multiple trait bounds with the + syntax
We can specify in the notify definition that item must implement both Display and Summary, using the + syntax:
pub fn notify(item: &(impl Summary + Display)) {...
//or
pub fn notify<T: Summary + Display>(item: &T) {...
where clause
More readable, less cluttered.
fn some_function<T, U>(t: &T, u: &U) -> i32
where T: Display + Clone,
U: Clone + Debug
{
// is equal to
fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {
fn returns_summarizable() -> impl Summary {
Tweet {
username: String::from("horse_ebooks"),
content: String::from(
"of course, as you probably already know, people",
),
reply: false,
retweet: false,
}
}
By using impl Summary
for the return type, we specify that the returns_summarizable
function returns some type that implements the Summary
trait without naming the concrete type.
However, you can only use impl Trait
if you’re returning a single type.
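My note: when a function has to return one of several types, one way around this limit is a boxed trait object (Box<dyn Trait>, which the book covers later in Chapter 17). This is a sketch of that swapped-in technique, not of impl Trait itself; Headline, Post, and make_summarizable are hypothetical names.

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

// Two minimal hypothetical types for this sketch.
pub struct Headline(pub String);
pub struct Post(pub String);

impl Summary for Headline {
    fn summarize(&self) -> String {
        format!("Headline: {}", self.0)
    }
}

impl Summary for Post {
    fn summarize(&self) -> String {
        format!("Post: {}", self.0)
    }
}

// `-> impl Summary` would be rejected here because the two branches return
// different concrete types; a boxed trait object accepts either one.
fn make_summarizable(is_headline: bool) -> Box<dyn Summary> {
    if is_headline {
        Box::new(Headline(String::from("Rust 1.0 released")))
    } else {
        Box::new(Post(String::from("learning traits")))
    }
}
```

The cost is a heap allocation and dynamic dispatch, whereas impl Trait is resolved at compile time.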
Here is the answer to the problem that arose earlier with the largest function.
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
let mut largest = list[0];
for &item in list {
if item > largest {
largest = item;
}
}
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
let result = largest(&number_list);
println!("The largest number is {}", result);
let char_list = vec!['y', 'm', 'a', 'q'];
let result = largest(&char_list);
println!("The largest char is {}", result);
}
custom type ~ struct
or enum
or etc.
impl<T: Display> ToString for T {
// --snip--
}
In “Rust by example”, there are good examples of associated function & methods.
Associated functions whose first parameter is named self are called methods and may be invoked using the method call operator, for example, x.foo(), as well as the usual function call notation.
cf. Instance methods are also stored in
https://stackoverflow.com/questions/8376953/how-are-instance-methods-stored
https://stackoverflow.com/questions/34149386/are-static-methods-always-held-in-memory
#![allow(unused)]
fn main() {
struct Example {
number: i32,
}
impl Example {
fn boo() {
println!("boo! Example::boo() was called!");
}
fn add_number(&mut self) {
self.number += 1;
}
fn get_number(&self) -> i32 {
self.number
}
}
trait Thingy {
fn do_thingy(&self);
}
impl Thingy for Example {
fn do_thingy(&self) {
println!("doing a thing! also, number is {}!", self.number);
}
}
// Test it
let mut dummy = Example{number: 2};
Example::boo(); // boo! Example::boo() was called!
println!("A number of the instance dummy is {:?}",dummy.get_number()); // A number of the instance dummy is 2
dummy.do_thingy(); // doing a thing! also, number is 2!
//dummy.boo(); // error: boo has no self parameter, so it is not a method
}
Traits provide us total abstraction and loose coupling.
Every reference in Rust has a lifetime, which is the scope for which that reference is valid.
Dangling reference: a reference to an object that no longer exists.
The simplest example: at println!("r: {}", r);, r would be a dangling reference, so the Rust compiler rejects the code:
fn main() {
{
let r; // ---------+-- 'a
// |
{ // |
let x = 5; // -+-- 'b |
r = &x; // | |
} // -+ |
// |
println!("r: {}", r); // |
} // ---------+
}
'a and 'b denote the lifetimes of r and x, respectively.
The scope 'a is larger than 'b, so r lives longer than the x it refers to. That is why the borrow checker rejects the code.
The following function returns a compile error:
fn longest(x: &str, y: &str) -> &str {
if x.len() > y.len() {
x
} else {
y
}
}
The longest function could return x or y.
If you use it like let result = longest(string1, string2);, the compiler can’t tell whether result borrows from string1 or string2, so it can’t determine how long result stays valid.
The reason is, the Rust compiler has a borrow checker that compares scopes to determine whether all borrows are valid.
The borrow checker doesn’t know how the lifetimes of x
and y
relate to the lifetime of the return value of the function longest
.
How can we fix it?
The names of lifetime parameters must start with an apostrophe ('
) and are usually all lowercase and very short. Most people use the name 'a
.
We place lifetime parameter annotations after the &
of a reference,
&i32 // a reference
&'a i32 // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime
The annotations are meant to tell Rust how generic lifetime parameters of multiple references relate to each other. Multi references!!
With this notation, we can tie the lifetimes of x, y, and the return value together as follows.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() {
x
} else {
y
}
}
The change means “all the references in the parameters and the return value must have the same lifetime.”
In practice, it means that the lifetime of the reference returned by the longest
function is the same as the smaller of the lifetimes of the references passed in.
Remember, when we specify the lifetime parameters in this function signature, we’re not changing the lifetimes of any values passed in or returned. Rather, we’re specifying that the borrow checker should reject any values that don’t adhere to these constraints.
Ultimately, lifetime syntax is about connecting the lifetimes of various parameters and return values of functions. Once they’re connected, Rust has enough information to allow memory-safe operations and disallow operations that would create dangling pointers or otherwise violate memory safety.
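My note: a sketch of what "the smaller of the lifetimes" means for a caller. longest is the function from above; longest_owned is a hypothetical caller that I added, which copies the result before the shorter-lived input goes away.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

/// Hypothetical caller: copies the result into an owned String before
/// the shorter-lived input (`inner`) goes out of scope.
fn longest_owned(outer: &str) -> String {
    let result;
    {
        let inner = String::from("xyz");
        // The returned reference is valid only while both `outer` and `inner` are.
        result = longest(outer, inner.as_str()).to_string();
    } // `inner` is dropped here; `result` owns its data, so it stays valid.
    result
}
```

If result stored the &str itself instead of calling to_string, the borrow checker would reject the function, because inner does not live long enough.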
You need to specify lifetime parameters for functions or structs that use references, except in a few common patterns where the compiler can infer them.
The developers programmed these patterns into the compiler’s code so the borrow checker could infer the lifetimes in these situations and wouldn’t need explicit annotations. The patterns programmed into Rust’s analysis of references are called the lifetime elision rules.
Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.
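The 3 rules of the elision:
Rule 1: each parameter that is a reference gets its own lifetime parameter.
Rule 2: if there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters.
Rule 3: if there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, the lifetime of self is assigned to all output lifetime parameters.
Example of the rule 1 and rule 2:
fn first_word(s: &str) -> &str {
// Apply rule 1. Equivalent to
fn first_word<'a>(s: &'a str) -> &str {
// Apply rule 2. Equivalent to
fn first_word<'a>(s: &'a str) -> &'a str {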
When we implement methods on a struct with lifetimes, we use the same syntax as that of generic type parameters.
src/main.rs
:
struct ImportantExcerpt<'a> {
part: &'a str,
}
impl<'a> ImportantExcerpt<'a> {
fn level(&self) -> i32 {
3
}
}
impl<'a> ImportantExcerpt<'a> {
fn announce_and_return_part(&self, announcement: &str) -> &str {
println!("Attention please: {}", announcement);
self.part
}
}
fn main () {
let s1 = String::from("test1");
let mut s2 = String::from("test2");
let a = ImportantExcerpt{
part: s1.as_str()
};
println!("{}",a.part); // test1
a.announce_and_return_part(s2.as_str()); // Attention please: test2
s2 = String::from("new test2");
println!("{}",s2); // new test2
a.announce_and_return_part(s2.as_str()); //Attention please: new test2
println!("{}",a.level()); // 3
}
And result:
➜ cargo run
Compiling te v0.1.0 (/home/atlex00/rust-project/test)
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/test`
test1
Attention please: test2
new test2
Attention please: new test2
3
One special lifetime we need to discuss is 'static
, which means that this reference can live for the entire duration of the program.
All string literals have the 'static
lifetime,
let s: &'static str = "I have a static lifetime.";
// Same as
let s = "I have a static lifetime.";
The text of this string is stored directly in the program’s binary, which is always available. Therefore, the lifetime of all string literals is 'static
.
While learning the tokio framework, I found the article “Common Rust Lifetime Misconceptions” helpful.
I need to tell the difference between static variables and the 'static lifetime.
Well yes, but a type with a 'static lifetime is different from a type bounded by a 'static lifetime. … T: 'static includes all &'static T however it also includes all owned types, like String, Vec, etc. The owner of some data is guaranteed that data will never get invalidated as long as the owner holds onto it, therefore the owner can safely hold onto the data indefinitely long, including up until the end of the program. …
Key Takeaways
- T: 'static should be read as “T is bounded by a 'static lifetime”
- if T: 'static then T can be a borrowed type with a 'static lifetime or an owned type
- since T: 'static includes owned types that means T
  - can be dynamically allocated at run-time
  - does not have to be valid for the entire program
  - can be safely and freely mutated
  - can be dynamically dropped at run-time
  - can have lifetimes of different durations
'static as a trait bound is described in the official Rust by Example.
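My note: a minimal sketch of the takeaway that T: 'static also accepts owned values created at run time. describe_static and build_at_runtime are hypothetical names for illustration.

```rust
use std::fmt::Display;

/// Accepts any value whose type satisfies `T: 'static`.
/// `describe_static` is a hypothetical name for this sketch.
fn describe_static<T: Display + 'static>(value: T) -> String {
    format!("got: {}", value)
}

fn build_at_runtime(n: u32) -> String {
    // An owned String created at run time. It contains no borrowed data,
    // so its type satisfies the 'static bound even though the value is
    // dropped long before the program ends.
    let owned: String = format!("value {}", n);
    describe_static(owned)
}
```

A string literal (&'static str) is accepted too, but passing a reference to a local String such as owned.as_str() would be rejected, because that borrow does not live for 'static.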
Just an example:
fn main() {
let string1 = String::from("abcd");
let string2 = "xyz";
let result = longest_with_an_announcement(
string1.as_str(),
string2,
"Today is someone's birthday!",
);
println!("The longest string is {}", result);
}
use std::fmt::Display;
fn longest_with_an_announcement<'a, T>(
x: &'a str,
y: &'a str,
ann: T,
) -> &'a str
where
T: Display,
{
println!("Announcement! {}", ann);
if x.len() > y.len() {
x
} else {
y
}
}
➜ cargo run
Compiling te v0.1.0 (/home/atlex00/rust-project/test)
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/test`
Announcement! Today is someone's birthday!
The longest string is abcd
A test is done by,
Attributes are metadata about pieces of Rust code.
For example, derive
is one of the attributes.
#[derive(Debug)]
struct Rectangle {
width: u32,
height: u32,
}
To change a function into a test function, add #[test]
on the line before fn
.
To test, run cargo test
.
When we make a new library project with Cargo, a test module with a test function in it is automatically generated for us.
#[test] annotation
This is the default test file.
#[cfg(test)]
mod tests {
#[test]
fn it_works() {
assert_eq!(2 + 2, 4);
}
}
#[test]
attribute indicates fn it_works
is a test function.
Run the test:
$ cargo test
Finished test [unoptimized + debuginfo] target(s) in 0.00s
Running target/debug/deps/adder-6f6d09e2972de52b
running 1 test
test tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests adder
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
tests::it_works
is the name of the generated test function.
Note that measured counts benchmark tests.
Because the benchmark feature isn’t available in the stable channel, you need the nightly channel if you want to use it.
https://doc.rust-lang.org/unstable-book/library-features/test.html
rustup install nightly
Without the nightly channel, you’ll get an error like:
$ cargo bench
Compiling adder v0.1.0 (/home/atlex00/rust-projects/adder)
error[E0554]: `#![feature]` may not be used on the stable release channel
--> src/lib.rs:1:1
|
1 | #![feature(test)]
| ^^^^^^^^^^^^^^^^^
error: aborting due to previous error
src/lib.rs
#![feature(test)]
extern crate test;
pub fn add_two(a: i32) -> i32 {
a + 2
}
#[cfg(test)]
mod tests {
use super::*;
use test::Bencher;
#[test]
fn it_works() {
assert_eq!(4, add_two(2));
}
#[bench]
fn bench_add_two(b: &mut Bencher) {
b.iter(|| add_two(2));
}
}
Run a benchmark:
$ cargo +nightly bench
Compiling adder v0.1.0 (/home/atlex/rust-projects/adder)
Finished bench [optimized] target(s) in 0.60s
Running unittests (target/release/deps/adder-8d2056bd46123ee2)
running 2 tests
test tests::it_works ... ignored
test tests::bench_add_two ... bench: 0 ns/iter (+/- 0)
test result: ok. 0 passed; 0 failed; 1 ignored; 1 measured; 0 filtered out; finished in 1.08s
Rust runs our benchmark a number of times, and then takes the average.
Doc-tests
We’ll learn about it in Chapter 14, but in a nutshell,
/// is a special comment, called a documentation comment.
/// supports Markdown notation.
assert! macro
We give the assert! macro an argument that evaluates to a Boolean.
If the value is true
, assert!
does nothing and the test passes.
If the value is false
, the assert!
macro calls the panic!
macro, which causes the test to fail.
You can pass a custom failure message as the second argument.
assert_eq!
and assert_ne!
Under the surface, the assert_eq!
and assert_ne!
macros use the operators ==
and !=
, respectively.
The values being compared must implement the PartialEq
and Debug
traits.
https://doc.rust-lang.org/book/appendix-03-derivable-traits.html
The derive
attribute generates code that will implement a trait with its own default implementation on the type you’ve annotated with the derive
syntax.
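As a small sketch of the points above (the Point type and shift_right function are hypothetical names made up for this illustration), deriving PartialEq and Debug is enough to use a custom struct with assert_eq! and assert_ne!, and extra arguments to assert! become a custom failure message:

```rust
// Hypothetical type for illustration; it is not part of the original notes.
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical function under test: moves a point along the x axis.
fn shift_right(p: &Point, dx: i32) -> Point {
    Point { x: p.x + dx, y: p.y }
}

fn main() {
    let moved = shift_right(&Point { x: 1, y: 2 }, 3);

    // `==` and `!=` come from PartialEq; Debug is used to print both
    // values if the assertion fails.
    assert_eq!(moved, Point { x: 4, y: 2 });
    assert_ne!(moved, Point { x: 1, y: 2 });

    // Arguments after the condition form a custom failure message,
    // using the same syntax as format!.
    assert!(moved.x > 0, "x should be positive, got {}", moved.x);
}
```

Without the derive line, the assert_eq! call should fail to compile, because the macro needs both traits.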
should_panic attribute
This attribute makes a test pass if the code inside the function panics.
Example:
pub struct Guess {
value: i32,
}
impl Guess {
pub fn new(value: i32) -> Guess {
if value < 1 || value > 100 {
panic!("Guess value must be between 1 and 100, got {}.", value);
}
Guess { value }
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
#[should_panic]
fn greater_than_100() {
Guess::new(200);
}
}
Tests that use should_panic
can be imprecise because they only indicate that the code has caused some panic.
Adding an expected parameter to the should_panic attribute makes the test more precise.
The expected parameter is a substring of the message that the function panics with.
...
} else if value > 100 {
panic!(
"Guess value must be less than or equal to 100, got {}.",
value
);
}
...
#[cfg(test)]
mod tests {
use super::*;
#[test]
#[should_panic(expected = "Guess value must be less than or equal to 100")]
fn greater_than_100() {
Guess::new(200);
}
}
Using Result<T, E> in Tests
A test function can return Result<T, E>. It returns Ok(()) when the test passes and an Err with a String inside when the test fails.
Result<T, E> enables you to use the question mark operator in the body of tests.
You can’t use the #[should_panic] annotation on tests that use Result<T, E>. Instead, you should return an Err value directly when the test should fail.
#[cfg(test)]
mod tests {
#[test]
fn it_works() -> Result<(), String> {
if 2 + 2 == 4 {
Ok(())
} else {
Err(String::from("two plus two does not equal four"))
}
}
}
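To make the question mark point concrete, here is a minimal sketch. The helper parse_and_add and the module name result_tests are assumptions invented for this example. The test returns a Result, so a failed parse fails the test through ? instead of through a panic:

```rust
use std::num::ParseIntError;

// Hypothetical helper used only for this illustration.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    Ok(a.trim().parse::<i32>()? + b.trim().parse::<i32>()?)
}

#[cfg(test)]
mod result_tests {
    use super::*;

    #[test]
    fn adds_two_numbers() -> Result<(), ParseIntError> {
        // `?` returns early with the Err, which makes the test fail.
        let sum = parse_and_add("2", "2")?;
        assert_eq!(sum, 4);
        Ok(())
    }
}

fn main() {
    println!("{:?}", parse_and_add("2", "2"));
}
```

The error type of the test only has to implement Debug, so that the test runner can print it.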
The default behavior of the binary produced by cargo test
is to run all the tests in parallel and capture output generated during test runs, preventing the output from being displayed and making it easier to read the output related to the test results.
Because the tests are running at the same time, make sure your tests don’t depend on each other or on any shared state, including a shared environment, such as the current working directory or environment variables.
If you don’t want to run the tests in parallel, use the --test-threads option, like cargo test -- --test-threads=1.
The -- here is a separator: arguments before it go to cargo test, and arguments after it go to the resulting test binary.
If we want to see printed values for passing tests as well, we can tell Rust to also show the output of successful tests at the end with --show-output
.
We can pass the name of any test function to cargo test to run only that test: cargo test {{ the name of function }}
, but we can’t specify the names of multiple tests in this way.
We can specify part of a test name, and any test whose name matches that value will be run.
Sometimes a few specific tests can be very time-consuming to execute, so you might want to exclude them during most runs of cargo test
. Use ignore
attribute.
src/lib.rs
#[test]
fn it_works() {
assert_eq!(2 + 2, 4);
}
#[test]
#[ignore]
fn expensive_test() {
// code that takes an hour to run
}
If we want to run only the ignored tests, we can use cargo test -- --ignored
.
The Rust community thinks about tests in terms of two main categories: unit tests and integration tests.
The convention is to create a module named tests
in each file to contain the test functions and to annotate the module with cfg(test)
.
You’ll use #[cfg(test)]
to specify that they shouldn’t be included in the compiled result.
To create integration tests, you first need a tests
directory at the top level of our project directory, next to src
. Cargo knows to look for integration test files in this directory.
We don’t need to annotate any code in tests/integration_test.rs
with #[cfg(test)]
.
Each file in the tests
directory is a separate crate, so we need to bring our library into each test crate’s scope.
tests/integration_test.rs
in a project adder
.
use adder;
#[test]
fn it_adds_two() {
assert_eq!(4, adder::add_two(2));
}
In this tutorial, we write a clone of grep
command.
std::env::args() returns an iterator of the command line arguments.
Call the collect method on an iterator to turn it into a collection (such as a vector).
std::env::args() will panic if any argument contains invalid Unicode. For invalid Unicode, use std::env::args_os instead.
use std::env;
fn main() {
let args: Vec<String> = env::args().collect();
println!("{:?}", args);
}
Result:
$ cargo run 1starg 2ndarg
Compiling iptables_viewer v0.1.0 (/path/to/your/project)
Finished dev [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/project-name 1starg 2ndarg`
["target/debug/project-name", "1starg", "2ndarg"]
target/debug/project-name, which is the name of our binary.
&args[1] in the program.
&str.
The following snippet would be refactored in the next section 12.3.
use std::fs;
let contents = fs::read_to_string(filename)
.expect("Something went wrong reading the file");
println!("With text:\n{}", contents);
fs::read_to_string takes the filename, opens that file, and returns a Result<String> of the file’s contents.
I’ve learned general programming concepts in this chapter.
In a nutshell: main.rs
handles running the program, and lib.rs
handles all the logic of the task at hand.
Here are the reasons:
main, the number of separate tasks the main function handles will increase.
Something went wrong reading the file is not clear.
main.rs and a lib.rs and move your program’s logic to lib.rs.
main.rs.
main.rs and move it to lib.rs.
run function in lib.rs
run returns an error
Based on these best practices, we can do
parse_config function)
Config struct)
Note: Using primitive values when a complex type would be more appropriate is an anti-pattern known as primitive obsession.
This is the refactored version:
use std::env;
use std::fs;
fn main() {
let args: Vec<String> = env::args().collect();
let config = parse_config(&args);
println!("Searching for {}", config.query);
println!("In file {}", config.filename);
let contents = fs::read_to_string(config.filename)
.expect("Something went wrong reading the file");
println!("With text:\n{}", contents);
}
struct Config {
query: String,
filename: String,
}
fn parse_config(args: &[String]) -> Config {
let query = args[1].clone();
let filename = args[2].clone();
Config { query, filename }
}
If you create a file foo.txt
:
➜ cargo run ar1 foo.txt
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/minigrep ar1 foo.txt`
Searching for ar1
In file foo.txt
With text:
I'm in foo.txt.
There’s a tendency among many Rustaceans to avoid using clone
to fix ownership problems because of its runtime cost. We will learn a more efficient way in Chapter 13.
The next improvements are:
Change parse_config into a constructor. Making this change will make the code more idiomatic.
Return a Result from the constructor instead of calling panic!, so that the main function can exit the process more cleanly in the error case.
impl Config {
fn new(args: &[String]) -> Result<Config, &str> {
if args.len() < 3 {
return Err("not enough arguments");
}
let query = args[1].clone();
let filename = args[2].clone();
Ok(Config { query, filename })
}
}
// --snip--
let config = Config::new(&args).unwrap_or_else(|err| {
println!("Problem parsing arguments: {}", err);
process::exit(1);
});
If the value is an Err, unwrap_or_else calls the code in the closure, which is an anonymous function we define and pass as an argument to unwrap_or_else. If the value is Ok, it returns the inner value, just like unwrap.
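The behavior of unwrap_or_else can be sketched in isolation. The functions parse_port and port_or_default and the default value 8080 are assumptions chosen for this illustration. Unlike the minigrep code, this version falls back to a default instead of exiting the process:

```rust
// Hypothetical parser for illustration: reads a port number from text.
fn parse_port(input: &str) -> Result<u16, String> {
    input
        .trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {:?}: {}", input, e))
}

fn port_or_default(input: &str) -> u16 {
    // On Ok(v) the inner value is returned and the closure never runs.
    // On Err(e) the closure is called with e and its result is used.
    parse_port(input).unwrap_or_else(|err| {
        eprintln!("Problem parsing port: {}", err);
        8080 // assumed default for this sketch
    })
}

fn main() {
    println!("{}", port_or_default("3000"));
    println!("{}", port_or_default("not a port"));
}
```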
Next, following the next best practice, we’ll create a run function.
Calling a run function in lib.rs
fn run(config: Config) -> Result<(), Box<dyn Error>> {
let contents = fs::read_to_string(config.filename)?;
println!("With text:\n{}", contents);
Ok(())
}
Box<dyn Error>
is called a trait object, and we will review it in Chapter 17.
For now, we can understand that Box<dyn Error>
means the function will return a type that implements the Error trait, but we don’t have to specify what particular type the return value will be.
Recall that ?
returns Err
from the whole function so the error value gets propagated.
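Here is a rough sketch of why Box<dyn Error> is convenient together with ?. The function names first_number_in and first_number_in_file are hypothetical. Three different error types (an io::Error, a plain string message, and a ParseIntError) are all propagated through the same return type:

```rust
use std::error::Error;
use std::fs;

// Hypothetical helper: parses the first line of some text as a number.
fn first_number_in(contents: &str) -> Result<i64, Box<dyn Error>> {
    // A &str message is converted into Box<dyn Error> by `?`.
    let first = contents.lines().next().ok_or("input is empty")?;
    // A ParseIntError is converted into Box<dyn Error> by `?`.
    let n = first.trim().parse::<i64>()?;
    Ok(n)
}

// Hypothetical helper: same as above, but reads the text from a file.
fn first_number_in_file(path: &str) -> Result<i64, Box<dyn Error>> {
    // An io::Error is converted into Box<dyn Error> by `?`.
    let contents = fs::read_to_string(path)?;
    first_number_in(&contents)
}

fn main() {
    println!("{:?}", first_number_in("42\nrest").unwrap());
    if let Err(e) = first_number_in_file("this-file-should-not-exist.txt") {
        eprintln!("Application error: {}", e);
    }
}
```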
This Ok(()) syntax might look a bit strange at first, but using () like this is the idiomatic way to indicate that we’re calling run for its side effects only; it doesn’t return a value we need.
If a function returns () (inside Ok(())) in the success case, and we don’t care about the returned value, we can use if let rather than unwrap_or_else.
The last refactoring is splitting code into a library crate. And here is the final result of the section.
src/lib.rs
:
use std::error::Error;
use std::fs;
pub struct Config {
pub query: String,
pub filename: String,
}
impl Config {
pub fn new(args: &[String]) -> Result<Config, &str> {
if args.len() < 3 {
return Err("not enough arguments");
}
let query = args[1].clone();
let filename = args[2].clone();
Ok(Config { query, filename })
}
}
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
let contents = fs::read_to_string(config.filename)?;
println!("With text:\n{}", contents);
Ok(())
}
src/main.rs
:
use std::env;
use std::process;
use minigrep::Config;
fn main() {
let args: Vec<String> = env::args().collect();
let config = Config::new(&args).unwrap_or_else(|err| {
println!("Problem parsing arguments: {}", err);
process::exit(1);
});
println!("Searching for {}", config.query);
println!("In file {}", config.filename);
if let Err(e) = minigrep::run(config) {
println!("Application error: {}", e);
process::exit(1);
}
}
In this chapter, the TDD process is
Before starting the TDD process, delete the println! lines that are no longer needed.
In src/lib.rs
:
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn one_result() {
let query = "duct";
let contents = "\
Rust:
safe, fast, productive.
Pick three.";
assert_eq!(vec!["safe, fast, productive."], search(query, contents));
}
}
Here, we call the function search, which is not defined yet.
But that’s OK at this step (this is TDD).
In src/lib.rs
:
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut results = Vec::new();
for line in contents.lines() {
if line.contains(query) {
results.push(line);
}
}
results
}
This may be a good time to review the lifetime chapter.
And use it from run()
(this part will be refactored in the chapter 13):
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
let contents = fs::read_to_string(config.filename)?;
for line in search(&config.query, &contents) {
println!("{}", line);
}
Ok(())
}
Add a new test case with search_case_insensitive
function.
We want to use the function when we specify an environment variable.
The way to make the search case insensitive is to convert all related strings to lowercase (for now, we don’t think about general UTF-8 characters).
In mod tests
of src/lib.rs
:
#[test]
fn case_insensitive() {
let query = "rUsT";
let contents = "\
Rust:
safe, fast, productive.
Pick three.
Trust me.";
assert_eq!(
vec!["Rust:", "Trust me."],
search_case_insensitive(query, contents)
);
}
Implement the function:
pub fn search_case_insensitive<'a>(
query: &str,
contents: &'a str,
) -> Vec<&'a str> {
let query = query.to_lowercase();
let mut results = Vec::new();
for line in contents.lines() {
if line.to_lowercase().contains(&query) {
results.push(line);
}
}
results
}
query, and the type of query is String (because of the to_lowercase method).
Now, add an environment variable part.
Change Config
struct:
pub struct Config {
pub query: String,
pub filename: String,
pub case_sensitive: bool,
}
Change run function (control flow):
let results = if config.case_sensitive {
search(&config.query, &contents)
} else {
search_case_insensitive(&config.query, &contents)
};
for line in results {
println!("{}", line);
}
Read an environment variable in the constructor:
let query = args[1].clone();
let filename = args[2].clone();
let case_sensitive = env::var("CASE_INSENSITIVE").is_err();
Ok(Config {
query,
filename,
case_sensitive,
})
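The CASE_INSENSITIVE environment variable could be set to anything.
is_err doesn’t unwrap the Result; it returns true if the Result is an Err (for example, the variable is unset) and false if it is Ok (the variable is set).
The next section is the last section of the chapter, so I’ll paste the final result at the end of the chapter.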
One thing to learn: eprintln! will output a message to stderr instead of stdout.
Here is the final result:
src/main.rs
:
use std::env;
use std::process;
use minigrep::Config;
fn main() {
let args: Vec<String> = env::args().collect();
let config = Config::new(&args).unwrap_or_else(|err| {
eprintln!("Problem parsing arguments: {}", err);
process::exit(1);
});
if let Err(e) = minigrep::run(config) {
eprintln!("Application error: {}", e);
process::exit(1);
}
}
src/lib.rs
:
use std::error::Error;
use std::fs;
use std::env;
pub struct Config {
pub query: String,
pub filename: String,
pub case_sensitive: bool,
}
impl Config {
pub fn new(args: &[String]) -> Result<Config, &str> {
if args.len() < 3 {
return Err("not enough arguments");
}
let query = args[1].clone();
let filename = args[2].clone();
let case_sensitive = env::var("CASE_INSENSITIVE").is_err();
Ok(Config {
query,
filename,
case_sensitive,
})
}
}
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
let contents = fs::read_to_string(config.filename)?;
let results = if config.case_sensitive {
search(&config.query, &contents)
} else {
search_case_insensitive(&config.query, &contents)
};
for line in results {
println!("{}", line);
}
Ok(())
}
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut results = Vec::new();
for line in contents.lines() {
if line.contains(query) {
results.push(line);
}
}
results
}
pub fn search_case_insensitive<'a>(
query: &str,
contents: &'a str,
) -> Vec<&'a str> {
let query = query.to_lowercase();
let mut results = Vec::new();
for line in contents.lines() {
if line.to_lowercase().contains(&query) {
results.push(line);
}
}
results
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn one_result() {
let query = "duct";
let contents = "\
Rust:
safe, fast, productive.
Pick three.";
assert_eq!(vec!["safe, fast, productive."], search(query, contents));
}
#[test]
fn case_insensitive() {
let query = "rUsT";
let contents = "\
Rust:
safe, fast, productive.
Pick three.
Trust me.";
assert_eq!(
vec!["Rust:", "Trust me."],
search_case_insensitive(query, contents)
);
}
}
If you want to do logging properly, use log
crate.
Programming in a functional style often includes using functions as values by passing them in arguments, returning them from other functions, assigning them to variables for later execution, and so forth.
An example of a closure.
let expensive_closure = |num| {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
num
};
A closure definition starts with a pair of vertical pipes (|), inside which we specify the parameters to the closure.
We can use the closure like this:
let i: i32 = 5;
println!("{}", expensive_closure(i));
We don’t need to annotate the types of a closure. The Rust compiler infers its parameter and return types. However, a closure definition will have one concrete type inferred for each of its parameters and for its return value.
But we can also define types explicitly:
let expensive_closure = |num: u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
num
};
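The "one concrete type" rule mentioned above can be seen in a small sketch. The closure names identity and identity_num and the function demo are made up for this example. The first call fixes the closure’s types, so a second call with a different type is rejected:

```rust
fn demo() -> (String, i32) {
    // No annotations: the parameter and return types are inferred
    // from how the closure is used.
    let identity = |x| x;

    // This call fixes the type of x to String.
    let s = identity(String::from("hello"));

    // Uncommenting the next line should fail to compile with a
    // "mismatched types" error, because x is already a String.
    // let n = identity(5);

    // A separate closure definition gets its own inferred types.
    let identity_num = |x| x;
    let n: i32 = identity_num(5);

    (s, n)
}

fn main() {
    println!("{:?}", demo());
}
```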
We can create a struct that will hold the closure and the resulting value of calling the closure (not to calculate expensive code multiple times).
We need to specify the type of the closure, because a struct definition needs to know the types of each of its fields.
Example:
struct Cacher<T>
where
T: Fn(u32) -> u32,
{
calculation: T,
value: Option<u32>,
}
The Cacher struct has a calculation field of the generic type T.
The trait bounds on T specify that it’s a closure by using the Fn trait.
Any closure stored in calculation must have one u32 parameter (specified within the parentheses after Fn) and must return a u32 (specified after the ->).
Fn Traits
All closures implement at least one of the traits: Fn, FnMut, or FnOnce.
FnOnce consumes the variables it captures from its enclosing scope, known as the closure’s environment. To consume the captured variables, the closure must take ownership of these variables and move them into the closure when it is defined. The Once part of the name represents the fact that the closure can’t take ownership of the same variables more than once, so it can be called only once.
FnMut can change the environment because it mutably borrows values.
Fn borrows values from the environment immutably.
Implement the example:
impl<T> Cacher<T>
where
T: Fn(u32) -> u32,
{
fn new(calculation: T) -> Cacher<T> {
Cacher {
calculation,
value: None,
}
}
fn value(&mut self, arg: u32) -> u32 {
match self.value {
Some(v) => v,
None => {
let v = (self.calculation)(arg);
self.value = Some(v);
v
}
}
}
}
And use it:
fn generate_workout(intensity: u32, random_number: u32) {
let mut expensive_result = Cacher::new(|num| {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
num
});
if intensity < 25 {
println!("Today, do {} pushups!", expensive_result.value(intensity));
println!("Next, do {} situps!", expensive_result.value(intensity));
} else {
if random_number == 3 {
println!("Take a break today! Remember to stay hydrated!");
} else {
println!(
"Today, run for {} minutes!",
expensive_result.value(intensity)
);
}
}
}
Closures have an additional capability that functions don’t have: they can capture their environment and access variables from the scope in which they’re defined.
The following snippet fails to compile because equal_to_x is a function, not a closure.
fn main() {
let x = 4;
fn equal_to_x(z: i32) -> bool {
z == x
}
let y = 4;
assert!(equal_to_x(y));
}
error[E0434]: can't capture dynamic environment in a fn item
--> src/main.rs:5:14
|
5 | z == x
| ^
|
= help: use the `|| { ... }` closure form instead
Here is the closure version
fn main() {
let x = 4;
let equal_to_x = |z| z == x;
let y = 4;
assert!(equal_to_x(y));
}
If you want to force the closure to take ownership of the values it uses in the environment, you can use the move
keyword before the parameter list.
Here is the move
example (returns compile error):
fn main() {
let x = vec![1, 2, 3];
let equal_to_x = move |z| z == x;
println!("can't use x here: {:?}", x);
let y = vec![1, 2, 3];
assert!(equal_to_x(y));
}
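To tie the three closure traits and the move keyword together, here is a hedged sketch. The helper functions call_fn, call_fn_mut, and call_fn_once are hypothetical and exist only so that each trait appears as a bound:

```rust
// Fn: may be called many times and only reads its captures.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

// FnMut: may be called many times and may mutate its captures.
fn call_fn_mut<F: FnMut()>(mut f: F) {
    f();
    f();
}

// FnOnce: callable once, because it may move captures out of itself.
fn call_fn_once<F: FnOnce() -> Vec<i32>>(f: F) -> Vec<i32> {
    f()
}

fn demo() -> (usize, i32, Vec<i32>) {
    let words = vec!["a", "b", "c"];
    let total_len = call_fn(|| words.len()); // borrows `words` immutably

    let mut counter = 0;
    call_fn_mut(|| counter += 1); // borrows `counter` mutably

    let data = vec![1, 2, 3];
    // `move` forces ownership of `data` into the closure. Returning `data`
    // from the body moves it out again, so this closure is only FnOnce.
    let moved = call_fn_once(move || data);
    // `data` can no longer be used here.

    (total_len, counter, moved)
}

fn main() {
    println!("{:?}", demo());
}
```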
In Rust, iterators are lazy, meaning they have no effect until you call methods that consume the iterator to use it up.
We can create an iterator from Vec<T>
explicitly:
let v1 = vec![1, 2, 3];
let v1_iter = v1.iter();
The definition of the Iterator
trait in the standard library looks like this:
pub trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
// methods with default implementations elided
}
The Iterator trait requires that you also define an Item type.
type at the line type Item; is called an associated type. Associated types connect a type placeholder with a trait such that the trait method definitions can use these placeholder types in their signatures. We will learn this in the later chapter “19.2. Advanced Traits”.
type keyword.
The Item type is used in the return type of the next method. In other words, the Item type will be the type returned from the iterator.
We can call the next method on iterators directly.
We don’t need to make the iterator mutable when we use a for loop because the loop took ownership of the iterator and made it mutable behind the scenes.
The values we get from the calls to next are immutable references to the values in the vector.
If we want to create an iterator that takes ownership of the vec and returns owned values, we can call into_iter instead of iter. Similarly, if we want to iterate over mutable references, we can call iter_mut instead of iter.
Methods that call next are called consuming adaptors, because calling them uses up the iterator.
An example of the consuming adaptor is sum()
method.
After calling sum(), you can’t reuse the iterator, because sum() takes ownership of it.
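The iterator notes above can be condensed into one runnable sketch. The function name demo and the sample values are assumptions for illustration. It calls next directly, consumes the rest with sum, and contrasts iter_mut and into_iter with iter:

```rust
fn demo() -> (Option<i32>, i32, Vec<i32>, Vec<String>) {
    let v1 = vec![1, 2, 3];

    // Calling next directly needs a mutable binding, because next
    // takes &mut self. iter() yields immutable references.
    let mut it = v1.iter();
    let first = it.next().copied();

    // sum takes ownership of the iterator and consumes what is left (2 + 3).
    let rest_total: i32 = it.sum();
    // `it` cannot be used after this point.

    // iter_mut yields mutable references, so items can be changed in place.
    let mut v2 = vec![1, 2, 3];
    for x in v2.iter_mut() {
        *x *= 10;
    }

    // into_iter takes ownership of the vector and yields owned values.
    let names = vec![String::from("a"), String::from("b")];
    let owned: Vec<String> = names.into_iter().collect();

    (first, rest_total, v2, owned)
}

fn main() {
    println!("{:?}", demo());
}
```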
Methods known as iterator adaptors allow you to change iterators into different kinds of iterators.
The method map
, which takes a closure to call on each item to produce a new iterator, is an example.
But because all iterators are lazy, you have to call one of the consuming adaptor methods to get results from calls to iterator adaptors.
collect()
method consumes the iterator and collects the resulting values into a collection data type.
Here is a good snippet showing how to use iter, map, and collect:
let v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<_> = v1.iter().map(|x| x + 1).collect();
assert_eq!(v2, vec![2, 3, 4]);
Refactor two components using iterators:
struct Config
pub fn search
main function accordingly
Config before:
impl Config {
pub fn new(args: &[String]) -> Result<Config, &str> {
if args.len() < 3 {
return Err("not enough arguments");
}
let query = args[1].clone();
let filename = args[2].clone();
let case_sensitive = env::var("CASE_INSENSITIVE").is_err();
Ok(Config {
query,
filename,
case_sensitive,
})
}
}
Config
after:
impl Config {
pub fn new(mut args: env::Args) -> Result<Config, &'static str> {
args.next();
let query = match args.next() {
Some(arg) => arg,
None => return Err("Didn't get a query string"),
};
let filename = match args.next() {
Some(arg) => arg,
None => return Err("Didn't get a file name"),
};
let case_sensitive = env::var("CASE_INSENSITIVE").is_err();
Ok(Config {
query,
filename,
case_sensitive,
})
}
}
We removed the clones from the constructor.
Without 'static, the compiler returns the error below:
error[E0106]: missing lifetime specifier
--> src/lib.rs:12:55
|
12 | pub fn new(mut args: env::Args) -> Result<Config, &str> {
| ^ expected named lifetime parameter
|
= help: this function's return type contains a borrowed value with an elided lifetime, but the lifetime cannot be derived from the arguments
pub fn search
before:
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut results = Vec::new();
for line in contents.lines() {
if line.contains(query) {
results.push(line);
}
}
results
}
pub fn search
after:
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
contents
.lines()
.filter(|line| line.contains(query))
.collect()
}
and minor main
change:
fn main() {
let config = Config::new(env::args()).unwrap_or_else(|err| {
eprintln!("Problem parsing arguments: {}", err);
process::exit(1);
});
// -- snip --
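The same refactoring should also apply to search_case_insensitive. The notes don’t show that version, so the following is my own sketch of it, under the assumption that the function keeps the signature used earlier:

```rust
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    // Lowercase the query once, outside the closure.
    let query = query.to_lowercase();

    contents
        .lines()
        // Keep only the lines whose lowercase form contains the query.
        .filter(|line| line.to_lowercase().contains(&query))
        .collect()
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\nPick three.\nTrust me.";
    for line in search_case_insensitive("rUsT", contents) {
        println!("{}", line);
    }
}
```

The returned slices still borrow from contents, which is why the 'a lifetime stays on contents and on the return type only.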
Answer to the title: Iterators, although a high-level abstraction, get compiled down to roughly the same code as if you’d written the lower-level code yourself.
TL;DR: The implementations of closures and iterators are such that runtime performance is not affected. This is part of Rust’s goal to strive to provide zero-cost abstractions.
Unrolling is an optimization that removes the overhead of the loop controlling code and instead generates repetitive code for each iteration of the loop. The Rust compiler unrolls some iteration code when it optimizes.
Cargo has two main profiles by default, dev
and release
.
You can define the profile-specific configurations in Cargo.toml
file.
Here is an example of how to change the optimization level in the file (these are the default values):
[profile.dev]
opt-level = 0
[profile.release]
opt-level = 3
You can find other profiles in Cargo book.
Before publishing, we need to leave documentation.
The documentation is written in triple-slash comments /// (the code examples inside them are also run as doc-tests).
cargo doc creates the documentation, and cargo doc --open opens it locally.
//! comments document the item that contains them, rather than the item that follows.
We often use these comments in src/lib.rs, which is the crate root, to describe the entire crate.
pub use
If you use pub use self::{{ your_custom_module }}, such modules are added to the “Re-exports” section of the documentation, and users can use the module easily.
This section isn’t so critical, so I don’t leave a note. If I need to publish an API, I’ll refer to the documentation directly.
Log in with cargo login {{ your_token }}. The command stores your token in $HOME/.cargo/credentials.
Add metadata (package.{name, version, license, description, etc.}) in Cargo.toml.
Run cargo publish. (Done!)
The workspaces feature enables us to split a project into multiple related packages that are developed together.
A workspace is a set of packages that share the same
Cargo.lock
and output directory.
Here is the sample structure of workspaces
$ tree -I target
.
├── adder
│   ├── Cargo.toml
│   └── src
│       └── main.rs
├── add-one
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
├── Cargo.lock
└── Cargo.toml
4 directories, 6 files
Cargo.toml
:
[workspace]
members = [
"adder",
"add-one",
]
adder/Cargo.toml
:
[package]
name = "adder"
version = "0.1.0"
edition = "2018"
[dependencies]
add-one = { path = "../add-one" }
add-one/Cargo.toml
:
[package]
name = "add-one"
version = "0.1.0"
edition = "2018"
[dependencies]
rand = "0.8.3"
add-one/src/lib.rs
:
pub fn add_one(x: i32) -> i32 {
x + 1
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn it_works() {
assert_eq!(3, add_one(2));
}
}
adder/src/main.rs
:
use add_one;
fn main() {
let num = 10;
println!(
"Hello, world! {} plus one is {}!",
num,
add_one::add_one(num)
);
}
Let’s run cargo run
:
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/adder`
Hello, world! 10 plus one is 11!
The entry point of cargo run is fn main() in adder/src/main.rs, because this is the only main function and main.rs.
We added the rand crate in add-one/Cargo.toml. If you want to use the rand crate in the adder package, you have to include it explicitly in adder/Cargo.toml.
cargo install {{ name_of_binary_on_crate.io }}
The default location of the cargo binaries is $HOME/.cargo/bin
.
If a binary in your $PATH is named cargo-something, you can run it as cargo something (an easy way to add subcommands).
A pointer is a general concept for a variable that contains an address in memory. The most common kind of pointer in Rust is a reference.
Smart pointers, on the other hand, are data structures that not only act like a pointer but also have additional metadata and capabilities.
One example that we’ll explore in this chapter is the reference counting smart pointer type (in 15.4). This pointer enables you to have multiple owners of data by keeping track of the number of owners and, when no owners remain, cleaning up the data.
In many cases, smart pointers own the data they point to.
Actually, we’ve already encountered a few smart pointers in this book, such as String
and Vec<T>
.
Smart pointers are usually implemented using structs. The characteristic that distinguishes a smart pointer from an ordinary struct is that smart pointers implement the Deref
and Drop
traits.
We’ll cover the most common smart pointers in the standard library:
Box<T> for allocating values on the heap
Rc<T>, a reference counting type that enables multiple ownership
Ref<T> and RefMut<T>, accessed through RefCell<T>, a type that enforces the borrowing rules at runtime instead of compile time
Rust is designed for memory safety. Suppose your company needs to build its own database system for some reason (say, it doesn’t want to use third-party database services). If you implemented a relational database in Rust, these pointers would probably be used frequently.
Using Box<T> to Point to Data on the Heap
Box<T> allows you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data.
You’ll use them most often in these situations:
Vec
My note: at this point, I wondered how Rust allocates memory when I manipulate a Vec.
I found a good post about this theme.
https://markusjais.com/unterstanding-rusts-vec-and-its-capacity-for-fast-and-efficient-programs/ <- the page was removed somehow…🤔
I found a good criticism on the post.
Quoted from the official documentation of std::vec::Vec:
The capacity of a vector is the amount of space allocated for any future elements that will be added onto the vector. This is not to be confused with the length of a vector, which specifies the number of actual elements within the vector. If a vector’s length exceeds its capacity, its capacity will automatically be increased, but its elements will have to be reallocated. For example, a vector with capacity 10 and length 0 would be an empty vector with space for 10 more elements. Pushing 10 or fewer elements onto the vector will not change its capacity or cause reallocation to occur. However, if the vector’s length is increased to 11, it will have to reallocate, which can be slow. For this reason, it is recommended to use
Vec::with_capacity
whenever possible to specify how big the vector is expected to get.
fn main() {
let v: Vec<i32> = Vec::new();
println!("{:?}",v.capacity()); // 0
println!("{:?}",v.len()); // 0
let v2: Vec<i32> = Vec::with_capacity(5);
println!("{:?}",v2.capacity()); // 5
println!("{:?}",v2.len()); // 0
}
After some googling, I also found a good explanation. (Thank you u/matthieum!)
Using a Box<T> to Store Data on the Heap
This isn’t used this way very often, but it’s educational.
fn main() {
let b = Box::new(5);
println!("b = {}", b);
}
When a box goes out of scope, as b does at the end of main, it will be deallocated. The deallocation happens for the box (stored on the stack) and the data it points to (stored on the heap).
A construction function constructs a new pair from its two arguments, which usually are a single value and another pair.
“To cons x
onto y
” informally means to construct a new container instance by putting the element x
at the start of this new container, followed by the container y
.
Each item in a cons list contains two elements: the value of the current item and the next item. The last item in the list contains only a value called Nil
without a next item. A cons list is produced by recursively calling the cons
function.
A cons list is a kind of linked list.
Let’s try to implement a list of i32
with Cons
.
The following code returns a compile error.
enum List {
Cons(i32, List),
Nil,
}
The reason the Rust compiler can’t compile this is that it doesn’t know how much space it needs to store a List value (List is defined recursively).
To solve this issue, use a Box<T> (a pointer), because the size of a pointer is known.
enum List {
Cons(i32, Box<List>),
Nil,
}
use crate::List::{Cons, Nil};
fn main() {
let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}
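Once the list compiles, it can be walked recursively. The sum function below is a hypothetical helper added for illustration. It shows that the Box in each Cons is followed transparently when matching on the list:

```rust
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper: walks the list recursively and adds up the values.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a &Box<List>; it is coerced to &List for the recursive call.
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));
}
```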
Deref Trait
The following code returns a compile error:
fn main() {
let x = 5;
let y = &x;
assert_eq!(5, x);
assert_eq!(5, y);
}
The error:
error[E0277]: can't compare `{integer}` with `&{integer}`
--> src/main.rs:6:5
|
6 | assert_eq!(5, y);
| ^^^^^^^^^^^^^^^^^ no implementation for `{integer} == &{integer}`
|
To avoid this error, we should change assert_eq!(5, y);
to assert_eq!(5, *y);
.
This *
is called dereference, which means “follow the reference to the value it’s pointing to.”
One more dereference example (one mutable borrow is allowed!):
fn main() {
let mut x = 5;
let y = &mut x;
*y = 4;
assert_eq!(5, *y);
// thread 'main' panicked at 'assertion failed: `(left == right)`
// left: `5`,
// right: `4`', src/main.rs:8:5
}
As in C or C++, you can print the address a reference points to:
fn main() {
let x = &42;
let address = format!("{:p}", x);
print!("{:?}", address) // like "0x560b046ea000"
}
Instead of let y = &mut x;
, write with Box
:
fn main() {
let x = 5;
let y = Box::new(x);
assert_eq!(5, x);
assert_eq!(5, *y);
}
Note that y
is an instance of a box pointing to a copied value of x
rather than a reference pointing to the value of x
.
Box<T>
in the standard library already implements the Deref
trait, so we can use the *
operator on it.
If you want to use the dereference operator on your own type (struct), you have to implement the trait yourself.
Let’s define a sample type MyBox<T>
(tuple struct with one element):
struct MyBox<T>(T);
impl<T> MyBox<T> {
fn new(x: T) -> MyBox<T> {
MyBox(x)
}
}
We didn’t implement Deref
trait for this struct, so the following code returns a compile error:
let x = 5;
let y = MyBox::new(x);
assert_eq!(5, x);
assert_eq!(5, *y);
// error[E0614]: type `MyBox<{integer}>` cannot be dereferenced
// --> src/main.rs:14:19
// |
// 14 | assert_eq!(5, *y);
// | ^^
//
Let’s implement Deref
trait.
The official trait document says, the required method is deref
and the associated type (about associated type, check Chapter 19) is Target
:
use std::ops::Deref;
impl<T> Deref for MyBox<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.0
}
}
*y
: behind the scenes Rust actually ran this code:
*(y.deref())
Rust substitutes the *
operator with a call to the deref
method and then a plain dereference so we don’t have to think about whether or not we need to call the deref
method.
Why is the signature of deref
is fn deref(&self) -> &Self::Target
?
The answer is “Rust’s ownership system”.
If the deref
method returned the value directly instead of a reference to the value, the value would be moved out of self
.
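To make the expansion above concrete, here is a minimal sketch that reads the inner value once through the * operator and once through an explicit call to deref. The helper read_both_ways is a hypothetical name introduced only for this illustration.

```rust
use std::ops::Deref;

// Same tuple struct as in the notes above.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    // Returns a reference, so the inner value is not moved out of `self`.
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Hypothetical helper: reads the inner value both ways and returns the pair.
fn read_both_ways(b: &MyBox<i32>) -> (i32, i32) {
    // The inner `*` removes the plain reference, the outer `*` goes through Deref.
    let via_operator = **b;
    // This is the form the compiler rewrites `*y` into.
    let via_method = *(b.deref());
    (via_operator, via_method)
}

fn main() {
    let y = MyBox::new(5);
    assert_eq!(5, *y);
    assert_eq!(5, *(y.deref()));
    // `y` is still usable here because `deref` only borrowed it.
    assert_eq!((5, 5), read_both_ways(&y));
}
```

Because deref hands back a reference, y can be read as many times as we like; a version that returned T by value would consume the box on first use.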
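Advanced review on String
and str
:
str
doesn’t store data on stack… partially wrong.
str
is a dynamically sized type (its size is not known at compile time), while String
has a known size.
Because str
is unsized, it is always used behind a pointer such as &str
. A &str
is a fat pointer: it holds the address of the string data together with its length.
For a string literal, &str
points to the data segment. It also means that such a &str
is immutable (&'static str
).
String
stores its bytes in a heap-allocated Vec<u8> (the vec field used in the Deref implementation below).
I checked the data segment data in this post.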
How is Deref
implemented for String
in standard library:
#[stable(feature = "rust1", since = "1.0.0")]
impl ops::Deref for String {
type Target = str;
#[inline]
fn deref(&self) -> &str {
unsafe { str::from_utf8_unchecked(&self.vec) }
}
}
When we pass a reference to a particular type’s value as an argument to a function or method, Rust dereferences as many times as necessary to get a reference that matches the parameter’s type. This is called “implicit deref coercion”.
The following code shows a deref coercion chain (&MyBox<String>
→ &String
→ &str
):
use std::ops::Deref;
impl<T> Deref for MyBox<T> {
type Target = T;
fn deref(&self) -> &T {
&self.0
}
}
struct MyBox<T>(T);
impl<T> MyBox<T> {
fn new(x: T) -> MyBox<T> {
MyBox(x)
}
}
fn hello(name: &str) {
println!("Hello, {}!", name);
}
fn main() {
let m = MyBox::new(String::from("Rust"));
hello(&m);
}
I checked the memory allocation of MyBox
in this post.
Rust does deref coercion when it finds types and trait implementations in three cases:
&T
to &U
when T: Deref<Target=U>
&mut T
to &mut U
when T: DerefMut<Target=U>
&mut T
to &U
when T: Deref<Target=U>
Rust will also coerce a mutable reference to an immutable one. But the reverse is not possible: immutable references will never coerce to mutable references.
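The second and third cases can be sketched as follows. This is a minimal example, assuming MyBox also implements DerefMut; the helpers shout and length are hypothetical names that exist only to give the coercions a target type.

```rust
use std::ops::{Deref, DerefMut};

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

// DerefMut requires Deref and reuses its Target type.
impl<T> DerefMut for MyBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

// Hypothetical helper that needs mutable access to a String.
fn shout(s: &mut String) {
    s.push('!');
}

// Hypothetical helper that only needs shared access to a str.
fn length(s: &str) -> usize {
    s.len()
}

fn demo() -> (String, usize) {
    let mut m = MyBox(String::from("Rust"));
    // Case 2: &mut MyBox<String> is coerced to &mut String via DerefMut.
    shout(&mut m);
    // Case 3: a mutable reference is coerced down to an immutable &str.
    let n = length(&mut m);
    (m.0, n)
}

fn main() {
    let (text, n) = demo();
    println!("{} has {} bytes", text, n);
}
```

The reverse direction, passing &m to shout, does not compile, which matches the rule that immutable references never coerce to mutable ones.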
Rc<T>
, the Reference Counted Smart Pointer
We use the Rc<T>
type when we want to allocate some data on the heap for multiple parts of our program to read and we can’t determine at compile time which part will finish using the data last.
Note that Rc<T>
is only for use in single-threaded scenarios.
If you want to share a reference-counted value across multiple threads, you need Arc
and Mutex
like Arc::new(Mutex::new(0));
.
Let’s see the sample code:
enum List {
Cons(i32, Rc<List>),
Nil,
}
use crate::List::{Cons, Nil};
use std::rc::Rc;
fn main() {
let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
let b = Cons(3, Rc::clone(&a));
let c = Cons(4, Rc::clone(&a));
}
This code would be interpreted as follows:
When we create b
, instead of taking ownership of a
, we’ll clone the Rc<List>
that a
is holding, thereby increasing the number of references from one to two and letting a
and b
share ownership of the data in that Rc<List>
.
clone()
makes a clone of the Rc
pointer. This creates another pointer to the same allocation, increasing the strong reference count.
When b
goes out of scope, the reference count decreases automatically.
enum List {
Cons(i32, Rc<List>),
Nil,
}
use crate::List::{Cons, Nil};
use std::rc::Rc;
fn main() {
let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
println!("{}", Rc::strong_count(&a)); // 1
let b = Cons(3, Rc::clone(&a));
println!("{}", Rc::strong_count(&a)); // 2
{
let c = Cons(4, Rc::clone(&a));
println!("{}", Rc::strong_count(&a)); // 3
}
println!("{}", Rc::strong_count(&a)); // 2
}
We’ll see reference cycles later, and that’s why the method is named strong_count
(there is also a weak_count
).
RefCell<T>
and the Interior Mutability Pattern
Suppose a use case like this:
You have a struct MyStruct
, which has a field my_field: &str
.
A trait from a third-party crate declares a method with the signature (&self, foo: &str)
. &self
is an immutable reference.
When you implement the trait on MyStruct
for tests, your implementation of the trait should mutate the value MyStruct.my_field
to foo
.
You can’t change the signature from (&self, foo: &str)
to (&mut self, foo: &str)
because it is a third-party crate. (You can fork the crate, but that is another story.)
In this case, you can use RefCell
like my_field: RefCell<&str>
.
The following methods are basic usages of RefCell
:
RefCell::new()
my_refcell.borrow()
my_refcell.borrow_mut()
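The use case described above can be sketched like this. Recorder is a hypothetical stand-in for the third-party trait, and the field holds a String rather than a &str so that the sketch needs no lifetime annotations; both choices are assumptions made for illustration.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a trait from a third-party crate:
// we cannot change `&self` to `&mut self` here.
trait Recorder {
    fn record(&self, foo: &str);
}

// Test double that remembers the last recorded value.
struct MyStruct {
    my_field: RefCell<String>,
}

impl Recorder for MyStruct {
    fn record(&self, foo: &str) {
        // borrow_mut() hands out a mutable borrow that is checked at runtime.
        *self.my_field.borrow_mut() = foo.to_string();
    }
}

fn demo() -> String {
    let s = MyStruct {
        my_field: RefCell::new(String::new()),
    };
    s.record("hello");
    // borrow() hands out a shared borrow; clone the String to return it.
    let out = s.my_field.borrow().clone();
    out
}

// A second mutable borrow while the first is alive is refused at runtime.
fn double_borrow_is_rejected() -> bool {
    let cell = RefCell::new(1);
    let _first = cell.borrow_mut();
    // try_borrow_mut returns Err instead of panicking like borrow_mut would.
    let rejected = cell.try_borrow_mut().is_err();
    rejected
}

fn main() {
    println!("recorded: {}", demo());
    println!("second borrow rejected: {}", double_borrow_is_rejected());
}
```

Calling borrow_mut a second time in double_borrow_is_rejected would panic, so the sketch uses try_borrow_mut to observe the same rule without crashing.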
RefCell<T>
With references and Box<T>
, the borrowing rules’ invariants are enforced at compile time.
With RefCell<T>
, these invariants are enforced at runtime.
With references, if you break these rules, you’ll get a compiler error.
With RefCell<T>
, if you break these rules, your program will panic and exit.
(Of course, you now have the question “why would we need to get around the compiler’s rules?” Be patient.)
The advantage of checking the borrowing rules at runtime instead is that certain memory-safe scenarios are then allowed, where they would’ve been disallowed by the compile-time checks. Static analysis, like the Rust compiler, is inherently conservative. Some properties of code are impossible to detect by analyzing the code: the most famous example is the Halting Problem.
Because some static analysis is impossible, if the Rust compiler can’t be sure the code complies with the ownership rules, it might reject a correct program; in this way, it’s conservative.
Similar to Rc<T>
, RefCell<T>
is only for use in single-threaded scenarios and will give you a compile-time error if you try using it in a multithreaded context.
Rc<T>
enables multiple owners of the same data; Box<T>
and RefCell<T>
have single owners.
Box<T>
allows immutable or mutable borrows checked at compile time; Rc<T>
allows only immutable borrows checked at compile time; RefCell<T>
allows immutable or mutable borrows checked at runtime.
Because RefCell<T>
allows mutable borrows checked at runtime, you can mutate the value inside the RefCell<T>
even when the RefCell<T>
is immutable.
Interior mutability is a design pattern in Rust that allows you to mutate data even when there are immutable references to that data.
There are situations in which it would be useful for a value to mutate itself in its methods but appear immutable to other code.
Code outside the value’s methods would not be able to mutate the value.
Using RefCell<T>
is one way to get the ability to have interior mutability.
&self
reference.
RefCell<T>
like RefCell<Vec<String>>
so that the .borrow_mut()
method makes the reference mutable.
Rc<T>
and RefCell<T>
(to be reviewed)
Needs to be reviewed.
Rc<T>
lets you have multiple owners of some data, but it only gives immutable access to that data.
If you have an Rc<T>
that holds a RefCell<T>
, you can get a value that can have multiple owners and that you can mutate!
#[derive(Debug)]
enum List {
Cons(Rc<RefCell<i32>>, Rc<List>),
Nil,
}
use crate::List::{Cons, Nil};
use std::cell::RefCell;
use std::rc::Rc;
fn main() {
let value = Rc::new(RefCell::new(5));
let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil)));
let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a));
let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a));
*value.borrow_mut() += 10;
println!("a after = {:?}", a);
println!("b after = {:?}", b);
println!("c after = {:?}", c);
}
a after = Cons(RefCell { value: 15 }, Nil)
b after = Cons(RefCell { value: 3 }, Cons(RefCell { value: 15 }, Nil))
c after = Cons(RefCell { value: 4 }, Cons(RefCell { value: 15 }, Nil))
Mutex<T> is the thread-safe version of RefCell<T>.
https://doc.rust-lang.org/book/ch15-06-reference-cycles.html
Memory leaks are considered memory safe in Rust. Rust allows memory leaks, for example through reference cycles built with Rc<T> and RefCell<T>.
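One common way to avoid such a cycle is to make one direction of the link a Weak<T>, which is where weak_count comes in. The following is a minimal sketch with a hypothetical Node type (children are owned, the parent link is weak); it is an illustration, not the only possible design.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node: children are owned (strong), the parent link is weak.
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

// Returns (strong count of branch, weak count of branch, value seen via the parent link).
fn counts() -> (usize, usize, Option<i32>) {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });
    let branch = Rc::new(Node {
        value: 5,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });
    assert_eq!(branch.children.borrow().len(), 1);

    // Rc::downgrade creates a Weak pointer: weak_count goes up, strong_count does not.
    *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

    // upgrade() returns Some only while the parent is still alive.
    let parent_value = leaf.parent.borrow().upgrade().map(|p| p.value);
    (Rc::strong_count(&branch), Rc::weak_count(&branch), parent_value)
}

fn main() {
    let (strong, weak, parent) = counts();
    println!("strong = {}, weak = {}, parent = {:?}", strong, weak, parent);
}
```

Since only strong references keep a value alive, branch is freed when it goes out of scope even though leaf still points back at it.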
Should be reviewed from here
The Rust team discovered that the ownership and type systems are a powerful set of tools to help manage memory safety and concurrency problems!
Caution: In this book, authors refer to many of the problems as concurrent rather than being more precise by saying concurrent and/or parallel.
Many operating systems provide an API for creating new threads. This model where a language calls the operating system APIs to create threads is sometimes called 1:1, meaning one operating system thread per one language thread.
Programming language-provided threads are known as green threads, and languages that use these green threads will execute them in the context of a different number of operating system threads. For this reason, the green-threaded model is called the M:N model: there are M
green threads per N
operating system threads, where M
and N
are not necessarily the same number.
The Rust standard library only provides an implementation of 1:1 threading.
spawn
To create a new thread, we call the thread::spawn
function and pass it a closure containing the code we want to run in the new thread.
(The new thread is a new OS thread, because Rust standard library provides only 1:1.)
The new thread will be stopped when the main thread ends, whether or not it has finished running.
use std::thread;
use std::time::Duration;
fn main() {
thread::spawn(|| {
for i in 1..10 {
println!("hi number {} from the spawned thread!", i);
thread::sleep(Duration::from_millis(1));
}
});
for i in 1..5 {
println!("hi number {} from the main thread!", i);
thread::sleep(Duration::from_millis(1));
}
}
Run (as you can see, numbers 6 to 9 from the spawned thread are missing):
$ cargo run
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the spawned thread!
hi number 2 from the main thread!
hi number 3 from the spawned thread!
hi number 3 from the main thread!
hi number 4 from the spawned thread!
hi number 4 from the main thread!
hi number 5 from the spawned thread!
The calls to thread::sleep
force a thread to stop its execution for a short duration, allowing a different thread to run.
How many spawned-thread lines appear between the main-thread lines depends on how your OS schedules the threads.
If I comment out the lines thread::sleep(Duration::from_millis(1));
, the main thread finishes before the spawned thread prints anything.
$ cargo run
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 3 from the main thread!
hi number 4 from the main thread!
join
Handles
The return type of thread::spawn
is JoinHandle
. A JoinHandle
is an owned value that, when we call the join method on it, will wait for its thread to finish.
use std::thread;
use std::time::Duration;
fn main() {
let handle = thread::spawn(|| {
for i in 1..10 {
println!("hi number {} from the spawned thread!", i);
thread::sleep(Duration::from_millis(1));
}
});
for i in 1..5 {
println!("hi number {} from the main thread!", i);
thread::sleep(Duration::from_millis(1));
}
handle.join().unwrap();
}
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the main thread!
hi number 2 from the spawned thread!
hi number 3 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the main thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
If we put the line handle.join().unwrap();
between the two for
loops, the result looks like the following, because the main thread waits for the spawned thread to finish first.
hi number 1 from the spawned thread!
hi number 2 from the spawned thread!
hi number 3 from the spawned thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 3 from the main thread!
hi number 4 from the main thread!
move
Closures with Threads
If the closure passed to thread::spawn
only borrows a variable from the enclosing scope, Rust can’t tell how long the spawned thread will run, so it can’t know whether the reference stays valid.
By adding the move
keyword before the closure, we force the closure to take ownership of the values it’s using rather than allowing Rust to infer that it should borrow the values.
use std::thread;
fn main() {
let v = vec![1, 2, 3];
let handle = thread::spawn(move || {
println!("Here's a vector: {:?}", v);
});
handle.join().unwrap();
}
In this example, the move
keyword moves ownership of v to the spawned thread, so the main thread no longer owns v
.
If you try to drop(v)
in the main thread before handle.join()
, the compiler rejects it because v has already been moved.
We deal with this issue in a later section.
To understand move
correctly, please review lifetimes in Rust.
In the standard library, Rust provides message-sending concurrency by channel (mpsc::channel
).
mpsc
stands for multiple producer, single consumer (multiple TX, single RX).
A channel is said to be closed if either the transmitter or receiver half is dropped.
You can create a channel like this:
use std::sync::mpsc;
// -- snip
let (tx, rx) = mpsc::channel();
Let’s see the sample code:
use std::sync::mpsc;
use std::thread;
fn main() {
let (tx, rx) = mpsc::channel();
thread::spawn(move || {
let val = String::from("hi");
tx.send(val).unwrap();
});
let received = rx.recv().unwrap();
println!("Got: {}", received);
}
The receiver (RX) has two useful methods: recv
and try_recv
.
recv
blocks the current thread until a value arrives and returns a Result<T, E>
, while try_recv
doesn’t block and returns Ok
or Err
immediately.
So try_recv
is typically called in a loop.
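A polling loop with try_recv could look like the sketch below. The function name poll_until_closed and the sleep durations are arbitrary choices for illustration; TryRecvError with its Empty and Disconnected variants comes from std::sync::mpsc.

```rust
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;

// Polls the channel without blocking until every sender has been dropped.
fn poll_until_closed() -> Vec<String> {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        for word in vec!["hi", "from", "the", "thread"] {
            tx.send(String::from(word)).unwrap();
            thread::sleep(Duration::from_millis(5));
        }
        // tx is dropped here, which closes the channel.
    });

    let mut received = Vec::new();
    loop {
        match rx.try_recv() {
            Ok(msg) => received.push(msg),
            Err(TryRecvError::Empty) => {
                // Nothing yet: this thread is free to do other work here.
                thread::sleep(Duration::from_millis(1));
            }
            // All senders are gone and the buffer is drained.
            Err(TryRecvError::Disconnected) => break,
        }
    }
    received
}

fn main() {
    for msg in poll_until_closed() {
        println!("Got: {}", msg);
    }
}
```

The Empty arm is the reason to prefer try_recv over recv: the receiving thread decides what to do while it waits instead of being parked.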
In this context, we can think of threads as actors.
Sending a value through a channel transfers ownership of it to the receiver. If you try to use a value after sending it down the channel, you get a compile error (thank you, Rust).
The receiver can be used as an iterator. When every TX has been dropped, the channel is closed and the iteration ends:
use std::sync::mpsc;
use std::thread;
use std::time::Duration;
fn main() {
let (tx, rx) = mpsc::channel();
thread::spawn(move || {
let vals = vec![
String::from("hi"),
String::from("from"),
String::from("the"),
String::from("thread"),
];
for val in vals {
tx.send(val).unwrap();
thread::sleep(Duration::from_secs(1));
}
});
for received in rx {
println!("Got: {}", received);
}
}
You can imagine a channel as a queue. When multiple senders send messages, the order in which messages from different senders arrive at the receiver is not deterministic.
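A minimal sketch with two producers, created by cloning the transmitter, shows what is and is not guaranteed. The function name and the message contents are made up for the illustration.

```rust
use std::sync::mpsc;
use std::thread;

fn collect_from_two_producers() -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    // Cloning the transmitter gives a second producer for the same receiver.
    let tx1 = tx.clone();

    let h1 = thread::spawn(move || {
        for i in 0..3 {
            tx1.send(format!("a{}", i)).unwrap();
        }
    });
    let h2 = thread::spawn(move || {
        for i in 0..3 {
            tx.send(format!("b{}", i)).unwrap();
        }
    });
    h1.join().unwrap();
    h2.join().unwrap();

    // Both senders are dropped now, so the iterator ends after the last message.
    let all: Vec<String> = rx.iter().collect();
    all
}

fn main() {
    // The interleaving of a* and b* messages can differ from run to run.
    println!("{:?}", collect_from_two_producers());
}
```

Messages from one sender keep their relative order (a0 before a1 before a2); only the interleaving between different senders varies.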
To access the data in a mutex, a thread must first signal that it wants access by asking to acquire the mutex’s lock (lock()
method in Rust).
Mutex<T>
Here is 101 of Mutex<T>
:
use std::sync::Mutex;
fn main() {
let m = Mutex::new(5);
{
let mut num = m.lock().unwrap();
*num = 6;
}
println!("m = {:?}", m);
}
Result:
m = Mutex { data: 6, poisoned: false, .. }
The call to lock
would fail if another thread holding the lock panicked.
In that case, no one would ever be able to get the lock, so we’ve chosen to unwrap
and have this thread panic if we’re in that situation.
Mutex<T>
is a smart pointer.
The call to lock
returns a smart pointer called MutexGuard
, wrapped in a LockResult
that we handled with the call to unwrap
.
The MutexGuard
smart pointer implements Deref
to point at our inner data; the smart pointer also has a Drop
implementation that releases the lock automatically when a MutexGuard
goes out of scope.
Note that Rc<T>
points to the data on heap.
Arc<T>
Rc<T>
is not safe to share across threads.
Instead we can use the atomically reference counted type Arc<T>
from std::sync
.
Thread safety comes with a performance penalty that you only want to pay when you really need to.
Here is the proper way to share ownership across multiple threads:
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let counter = Arc::new(Mutex::new(0));
let mut handles = vec![];
for _ in 0..10 {
let counter = Arc::clone(&counter);
let handle = thread::spawn(move || {
let mut num = counter.lock().unwrap();
*num += 1;
});
handles.push(handle);
}
for handle in handles {
handle.join().unwrap();
}
println!("Result: {}", *counter.lock().unwrap()); // Result: 10
}
counter
is immutable, but Mutex<T>
provides interior mutability.
RefCell<T>
/Rc<T>
and Mutex<T>
/Arc<T>
Should be reviewed from here
Objects came from Simula in the 1960s. Those objects influenced Alan Kay’s programming architecture in which objects pass messages to each other. He coined the term object-oriented programming in 1967 to describe this architecture.
Hmm…
The book “Design Patterns: Elements of Reusable Object-Oriented Software” by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley Professional, 1994) colloquially referred to as “The Gang of Four book”, is a catalog of object-oriented design patterns. It defines OOP this way:
Object-oriented programs are made up of objects. An object packages both data and the procedures that operate on that data. The procedures are typically called methods or operations.
Using this definition, Rust is object oriented: structs and enums have data, and impl
blocks provide methods on structs and enums.
Encapsulation means that the implementation details of an object aren’t accessible to code using that object. Therefore, the only way to interact with an object is through its public API.
In Rust, we can use the pub
keyword to decide which modules, types, functions, and methods in our code should be public, and by default everything else is private.
There is no way to define a struct that inherits the parent struct’s fields and method implementations.
You choose inheritance for two main reasons.
Rust uses generics to abstract over different possible types and trait bounds to impose constraints on what those types must provide. This is sometimes called bounded parametric polymorphism.
Polymorphism in practice
Cf. ad hoc polymorphism: suppose a function, Add(x,y)
. The behavior of Add
depends on the type of its inputs (append
in the case of strings, add
in the case of integers, etc.)
This is an example of ad hoc polymorphism.
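Both kinds of polymorphism can be sketched in a few lines of Rust. The trait Combine and the function combine_all are hypothetical names invented for this illustration; they are not part of the standard library.

```rust
// Ad hoc polymorphism: the same operation name, with a separate implementation per type.
trait Combine {
    fn combine(self, other: Self) -> Self;
}

impl Combine for i32 {
    // For integers, combining means numeric addition.
    fn combine(self, other: i32) -> i32 {
        self + other
    }
}

impl Combine for String {
    // For strings, combining means concatenation.
    fn combine(mut self, other: String) -> String {
        self.push_str(&other);
        self
    }
}

// Bounded parametric polymorphism: one generic body, with a trait bound on T.
fn combine_all<T: Combine>(first: T, rest: Vec<T>) -> T {
    rest.into_iter().fold(first, |acc, x| acc.combine(x))
}

fn main() {
    println!("{}", combine_all(1, vec![2, 3]));
    println!(
        "{}",
        combine_all(String::from("a"), vec![String::from("b"), String::from("c")])
    );
}
```

The generic function is written once and works for every type that satisfies the bound, while the per-type impl blocks decide what combine actually does.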
Rust takes a different approach, using trait objects instead of inheritance.
src/lib.rs
:
pub trait Draw {
fn draw(&self);
}
pub struct Screen {
pub components: Vec<Box<dyn Draw>>,
}
impl Screen {
pub fn run(&self) {
for component in self.components.iter() {
component.draw();
}
}
}
pub struct Button {
pub width: u32,
pub height: u32,
pub label: String,
}
impl Draw for Button {
fn draw(&self) {
// code to actually draw a button
}
}
Type Box<dyn Draw>
is a trait object; it’s a stand-in for any type inside a Box
that implements the Draw
trait.
dyn
stands for “dynamic dispatch” (in computer science, dynamic dispatch is the process of selecting which implementation of a polymorphic operation (method or function) to call at run time.).
The official documentation describes the meaning of dynamic dispatch later.
When we use trait objects, Rust must use dynamic dispatch.
We can’t use a generic type parameter <T>
here because a generic type parameter can only be substituted with one concrete type at a time, whereas trait objects allow for multiple concrete types to fill in for the trait object at runtime.
We say that the trait occurs as a trait object at Box<dyn Draw>
.
The dyn
keyword is used to highlight that calls to methods on the associated Trait
are dynamically dispatched. To use the trait this way, it must be “object safe”.
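To see several concrete types living in one vector, here is a small variation on the code above. In this sketch draw returns a String so the result is easy to inspect, and SelectBox is a hypothetical second component; neither change is part of the original listing.

```rust
// Variation on the Draw trait: draw returns a String so the output can be checked.
trait Draw {
    fn draw(&self) -> String;
}

struct Button {
    label: String,
}

// Hypothetical second component type.
struct SelectBox {
    options: Vec<String>,
}

impl Draw for Button {
    fn draw(&self) -> String {
        format!("[{}]", self.label)
    }
}

impl Draw for SelectBox {
    fn draw(&self) -> String {
        format!("<{}>", self.options.join("|"))
    }
}

struct Screen {
    // Each element may be a different concrete type behind the trait object.
    components: Vec<Box<dyn Draw>>,
}

impl Screen {
    fn run(&self) -> Vec<String> {
        // The implementation of draw is looked up at runtime for each element.
        self.components.iter().map(|c| c.draw()).collect()
    }
}

fn demo() -> Vec<String> {
    let screen = Screen {
        components: vec![
            Box::new(SelectBox {
                options: vec![String::from("Yes"), String::from("No")],
            }),
            Box::new(Button {
                label: String::from("OK"),
            }),
        ],
    };
    screen.run()
}

fn main() {
    println!("{:?}", demo());
}
```

With a generic Screen<T: Draw> the vector could hold only buttons or only select boxes, never both at once.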
A trait is object safe if all the methods defined in the trait have the following properties:
The return type isn’t Self
.
There are no generic type parameters.
Trait objects must be object safe because once you’ve used a trait object, Rust no longer knows the concrete type that’s implementing that trait.
The code that results from monomorphization is doing static dispatch, which is when the compiler knows what method you’re calling at compile time. This is opposed to dynamic dispatch, which is when the compiler can’t tell at compile time which method you’re calling.
An example of a trait whose methods are not object safe is the standard library’s Clone trait.
pub trait Clone {
fn clone(&self) -> Self;
}
When you use a generic function, you could encounter &*my_variable
.
https://stackoverflow.com/a/41273406
Dynamic dispatch costs a bit, so consider using enum_dispatch
crate.
https://docs.rs/enum_dispatch/latest/enum_dispatch/
Pattern matching is mandatory when you want to write your own macro.
match
Arms and Conditional if let
Expressions
You can write more complex matches with if let
, but the downside of if let
expressions is that the compiler doesn’t check exhaustiveness.
(Some cases could leak.)
while let
let v = vec!['a', 'b', 'c'];
for (index, value) in v.iter().enumerate() {
println!("{} is at index {}", value, index);
}
let
statement as pattern
let PATTERN = EXPRESSION;
// example
let (x, y, z) = (1, 2, 3);
fn print_coordinates(&(x, y): &(i32, i32)) {
println!("Current location: ({}, {})", x, y);
}
fn main() {
let point = (3, 5);
print_coordinates(&point);
}
irrefutable | refutable |
---|---|
matches any possible value passed | can fail to match for some possible value |
let x = 5; | if let Some(x) = a_value |
In general, you shouldn’t have to worry about the distinction between refutable and irrefutable patterns; however, you do need to be familiar with the concept of refutability so you can respond when you see it in an error message.
This section just contains examples of useful pattern matches.
Value match for a variable:
let x = 1;
match x {
1 => println!("one"),
2 => println!("two"),
3 => println!("three"),
_ => println!("anything"),
}
Variable scope (shadowed):
let x = Some(5);
let y = 10;
match x {
Some(50) => println!("Got 50"),
Some(y) => println!("Matched, y = {:?}", y), // Matched, y = 5
_ => println!("Default case, x = {:?}", x),
}
println!("at the end: x = {:?}, y = {:?}", x, y);
// at the end: x = Some(5), y = 10
let x = 1;
match x {
1 | 2 => println!("one or two"), // match
3 => println!("three"),
_ => println!("anything"),
}
..=
let x = 5;
match x {
1..=5 => println!("one through five"),
_ => println!("something else"),
}
Recall that Rust’s char type is four bytes in size and represents a Unicode Scalar Value.
let x = 'c';
match x {
'a'..='j' => println!("early ASCII letter"),
'k'..='z' => println!("late ASCII letter"),
_ => println!("something else"),
}
struct Point {
x: i32,
y: i32,
}
fn main() {
let p = Point { x: 0, y: 7 };
let Point { x: a, y: b } = p;
assert_eq!(0, a);
assert_eq!(7, b);
// Or
let Point { x, y } = p;
assert_eq!(0, x);
assert_eq!(7, y);
}
You can achieve a partial match:
let p = Point { x: 0, y: 7 };
match p {
Point { x, y: 0 } => println!("On the x axis at {}", x),
Point { x: 0, y } => println!("On the y axis at {}", y), // match
Point { x, y } => println!("On neither axis: ({}, {})", x, y),
}
enum Message {
Quit,
Move { x: i32, y: i32 },
Write(String),
ChangeColor(i32, i32, i32),
}
fn main() {
let msg = Message::ChangeColor(0, 160, 255);
match msg {
Message::Quit => {
println!("The Quit variant has no data to destructure.")
}
Message::Move { x, y } => {
println!(
"Move in the x direction {} and in the y direction {}",
x, y
);
}
Message::Write(text) => println!("Text message: {}", text),
Message::ChangeColor(r, g, b) => println!(
"Change the color to red {}, green {}, and blue {}",
r, g, b
),
}
}
enum Color {
Rgb(i32, i32, i32),
Hsv(i32, i32, i32),
}
enum Message {
Quit,
Move { x: i32, y: i32 },
Write(String),
ChangeColor(Color),
}
fn main() {
let msg = Message::ChangeColor(Color::Hsv(0, 160, 255));
match msg {
Message::ChangeColor(Color::Rgb(r, g, b)) => println!(
"Change the color to red {}, green {}, and blue {}",
r, g, b
),
Message::ChangeColor(Color::Hsv(h, s, v)) => println!(
"Change the color to hue {}, saturation {}, and value {}",
h, s, v
),
_ => (),
}
}
Just use underscore _
as a place holder.
..
struct Point {
x: i32,
y: i32,
z: i32,
}
let origin = Point { x: 0, y: 0, z: 0 };
match origin {
Point { x, .. } => println!("x is {}", x),
}
The use of ..
must be unambiguous (the following code doesn’t compile):
let numbers = (2, 4, 8, 16, 32);
match numbers {
(.., second, ..) => { // Ambiguous
println!("Some numbers: {}", second)
},
}
let x = Some(5);
let y = 10;
match x {
Some(50) => println!("Got 50"),
Some(n) if n == y => println!("Matched, n = {}", n),
_ => println!("Default case, x = {:?}", x),
}
println!("at the end: x = {:?}, y = {}", x, y);
@
Bindings
You can bind a matched value to a variable while also testing it against a pattern:
enum Message {
Hello { id: i32 },
}
let msg = Message::Hello { id: 5 };
match msg {
Message::Hello {
id: id_variable @ 3..=7,
} => println!("Found an id in range: {}", id_variable),
Message::Hello { id: 10..=12 } => {
println!("Found an id in another range")
}
Message::Hello { id } => println!("Found some other id: {}", id),
}
“Unsafe” means “doesn’t enforce memory safety guarantees”.
Although the code might be okay, if the Rust compiler doesn’t have enough information to be confident, it will reject the code. In these cases, you can use unsafe code to tell the compiler, “Trust me, I know what I’m doing.”
Another reason Rust has an unsafe alter ego is that the underlying computer hardware is inherently unsafe.
??
union
s
Notes:
unsafe
doesn’t turn off the borrow checker or disable any other of Rust’s safety checks.
unsafe
does not mean the code inside the block is necessarily dangerous or that it will definitely have memory safety problems.
Parts of the standard library are implemented as safe abstractions over unsafe code that has been audited.
The rest of the section contains examples when to use unsafe.
unsafe fn dangerous() {}
unsafe {
dangerous();
}
Raw pointers can be immutable or mutable and are written as *const T
and *mut T
, respectively.
The asterisk isn’t the dereference operator; it’s part of the type name.
In the context of raw pointers, immutable means that the pointer can’t be directly assigned to after being dereferenced.
let mut num = 5;
let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;
unsafe {
println!("r1 is: {}", *r1);
println!("r2 is: {}", *r2);
}
Cf. the println!
macro takes its arguments by reference under the hood.
With raw pointers, we can create a mutable pointer and an immutable pointer to the same location and change data through the mutable pointer, potentially creating a data race. Be careful!
Sometimes, Rust isn’t smart enough to know safe code.
When we know code is okay, but Rust doesn’t, it’s time to reach for unsafe
code.
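A classic case is splitting one mutable slice into two halves that do not overlap. The sketch below is modeled on the standard library’s split_at_mut, written here for i32 slices only; treat it as an illustration rather than the actual library source.

```rust
use std::slice;

// Safe function wrapping a small unsafe block.
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Without this check the raw-pointer code below could go out of bounds.
    assert!(mid <= len);

    // Safety: the two slices cover [0, mid) and [mid, len), which never overlap,
    // and both stay inside the original allocation.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5, 6];
    let (a, b) = split_at_mut(&mut v, 3);
    a[0] = 10;
    b[0] = 40;
    println!("{:?}", v);
}
```

The borrow checker would reject two mutable borrows of the same slice, even though the halves are disjoint; the unsafe block is where we take responsibility for that fact, and callers only ever see a safe function.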
Rust has a keyword, extern
, that facilitates the creation and use of a Foreign Function Interface (FFI).
Functions declared within extern
blocks are always unsafe
to call from Rust code.
extern "C" {
fn abs(input: i32) -> i32;
}
fn main() {
unsafe {
println!("Absolute value of -3 according to C: {}", abs(-3));
}
}
We make the
call_from_c
function accessible from C code, after it’s compiled to a shared library and linked from C:
#[no_mangle]
pub extern "C" fn call_from_c() {
println!("Just called a Rust function from C!");
}
In Rust, global variables are called static variables. Rust does support global variables, but they can be problematic with Rust’s ownership rules.
Mutating a static mut
variable is unsafe
:
static mut COUNTER: u32 = 0;
fn add_to_count(inc: u32) {
unsafe {
COUNTER += inc;
}
}
fn main() {
add_to_count(3);
unsafe {
println!("COUNTER: {}", COUNTER);
}
}
But why is it unsafe
?
With mutable data that is globally accessible, it’s difficult to ensure there are no data races, which is why Rust considers mutable static variables to be unsafe.
Where possible, it’s preferable to use the concurrency techniques and thread-safe smart pointers we discussed in Chapter 16 so the compiler checks that data accessed from different threads is done safely.
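As one possible alternative to static mut, the counter can be an atomic integer, which needs no unsafe block. This is a sketch under the assumption that a plain counter is all we need; run_threads and the thread count are arbitrary choices for the illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Not `static mut`: the atomic type provides interior mutability that is safe to share.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(inc: u32) {
    // fetch_add is a single atomic read-modify-write, so there is no data race.
    COUNTER.fetch_add(inc, Ordering::SeqCst);
}

// Spawns four threads that each add 3 and returns how much the counter grew.
fn run_threads() -> u32 {
    let before = COUNTER.load(Ordering::SeqCst);
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    COUNTER.load(Ordering::SeqCst) - before
}

fn main() {
    println!("COUNTER grew by {}", run_threads());
}
```

For data that is more complex than a number, a static holding a Mutex<T> plays the same role.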
If you know inheritance in OOP, you can understand the meaning of “super” in supertraits. We can define a trait that can be implemented only for types that already implement a certain other trait.
In the following snippet, OutlinePrint
can be implemented for a struct only when the struct also implements the trait fmt::Display
:
struct Point {
x: i32,
y: i32,
}
use std::fmt;
impl fmt::Display for Point {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "({}, {})", self.x, self.y)
}
}
use std::fmt;
trait OutlinePrint: fmt::Display {
fn outline_print(&self) {
let output = self.to_string();
let len = output.len();
println!("{}", "*".repeat(len + 4));
println!("*{}*", " ".repeat(len + 2));
println!("* {} *", output);
println!("*{}*", " ".repeat(len + 2));
println!("{}", "*".repeat(len + 4));
}
}
fmt::Display
write!
macro (std::write
) writes formatted data into a buffer (for example a heap-allocated one like Vec
).
Formatter
(std::fmt::Formatter
) represents various options related to formatting. Users do not construct Formatters
directly; a mutable reference to one is passed to the fmt
method of all formatting traits, like Debug
and Display
.
In Rust by Example:
https://doc.rust-lang.org/rust-by-example/trait/supertraits.html#supertraits
Note that even though the name says “super” trait, a supertrait is the more basic trait, the one that a subtrait “inherits” from.
For example, when we think about the set of all persons and the set of all students, Person ⊃ Student
holds, so Person
is the supertrait of Student
.
trait Person {
// foo
}
trait Student: Person {
// bar
}
The newtype pattern creates a new type that wraps an existing type: it holds the same data, but the compiler treats it as a distinct type with its own name. This pattern can reduce the number of bugs and provides abstraction. You create the new type with a tuple struct that has a single field.
use std::ops::Add;
struct Millimeters(u32);
struct Meters(u32);
impl Add<Meters> for Millimeters {
type Output = Millimeters;
fn add(self, other: Meters) -> Millimeters {
Millimeters(self.0 + (other.0 * 1000))
}
}
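A short usage sketch for the two newtypes above. The derive line is an addition made only so that the result can be compared and printed; everything else repeats the listing.

```rust
use std::ops::Add;

// Debug and PartialEq are derived here only so the result can be compared and printed.
#[derive(Debug, PartialEq)]
struct Millimeters(u32);
struct Meters(u32);

impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    // Converts the meters to millimeters before adding.
    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}

fn main() {
    let total = Millimeters(500) + Meters(2);
    assert_eq!(total, Millimeters(2500));
    // `Meters(2) + Millimeters(500)` would not compile: no such Add impl exists.
    // `let m: Millimeters = Meters(2);` would not compile either: the types differ.
    println!("{:?}", total);
}
```

Both types wrap a plain u32, yet the compiler keeps them apart, so a unit mix-up becomes a type error instead of a wrong number.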
You can find a plainer explanation at Rust Design Patterns or my note.
By default, generic functions will work only on types that have a known size at compile time. However, you can use the following special syntax to relax this restriction:
fn generic<T: ?Sized>(t: &T) {
// --snip--
}
A trait bound on ?Sized
means “T
may or may not be Sized” and this notation overrides the default that generic types must have a known size at compile time. The ?Trait
syntax with this meaning is only available for Sized
, not any other traits.
Also note that we switched the type of the t
parameter from T
to &T
. Because the type might not be Sized
, we need to use it behind some kind of pointer.
In this case, we’ve chosen a reference.
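A minimal sketch of what the relaxed bound buys us: the same function accepts both sized and unsized types. The function describe and its Debug bound are assumptions added for the illustration.

```rust
use std::fmt::Debug;
use std::mem;

// T may be unsized (str, [u8]) because it is only ever used behind a reference.
fn describe<T: ?Sized + Debug>(t: &T) -> String {
    // size_of_val works for unsized values because it reads the size at runtime.
    format!("{:?} takes {} bytes", t, mem::size_of_val(t))
}

fn main() {
    // T = str, a dynamically sized type.
    println!("{}", describe("hi"));
    // T = [u8], also dynamically sized.
    println!("{}", describe(&[1u8, 2, 3][..]));
    // T = i32, an ordinary sized type.
    println!("{}", describe(&7i32));
}
```

Without ?Sized, the first two calls would be rejected because str and [u8] do not implement Sized.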
The fn
type is called a function pointer.
With a function pointer, you can pass a function as a parameter:
fn add_one(x: i32) -> i32 {
x + 1
}
fn do_twice(f: fn(i32) -> i32, arg: i32) -> i32 {
f(arg) + f(arg)
}
fn main() {
let answer = do_twice(add_one, 5);
println!("The answer is: {}", answer);
}
Function pointers implement all three of the closure traits (Fn
, FnMut
, and FnOnce
), so you can always pass a function pointer as an argument for a function that expects a closure.
fn returns_closure() -> Box<dyn Fn(i32) -> i32> {
Box::new(|x| x + 1)
}
Note that if you change the return type to dyn Fn(i32) -> i32, the compiler returns an error because Rust doesn’t know the size of a closure at compile time.
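For comparison, here is a sketch with the boxed version next to an impl Trait version. The second function name, returns_closure_impl, is hypothetical. With impl Fn in return position the compiler knows the single concrete closure type, so no Box is needed; the boxed form is still required when different branches return different closures.

```rust
// The boxed trait object has a known size (a pointer), so this compiles.
fn returns_closure() -> Box<dyn Fn(i32) -> i32> {
    Box::new(|x| x + 1)
}

// Hypothetical alternative: `impl Trait` in return position.
// It works because exactly one concrete closure type is returned.
fn returns_closure_impl() -> impl Fn(i32) -> i32 {
    |x| x + 1
}

fn main() {
    let boxed = returns_closure();
    let unboxed = returns_closure_impl();
    println!("{} {}", boxed(1), unboxed(1));
}
```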
macro_rules! lets you define your own macros; macros defined this way are called declarative macros.
Besides declarative macros, there are three kinds of procedural macros in Rust:
Custom #[derive] macro | Attribute-like macro | Function-like macro |
---|---|---|
code added with the derive attribute used on structs and enums | define custom attributes usable on any item | look like function calls but operate on the tokens specified as their argument |
#[derive(Debug)] | #[tokio::main] | sql!(SELECT * FROM posts WHERE id=1) (hypothetical macro used as the example in the book) |
Macros can take a variable number of parameters: println!("hello") with one argument or println!("hello {}", name) with two arguments.
Excerpt from “Rust By Example”:
So why are macros useful?
- Don’t repeat yourself. …
- Domain-specific languages.
- Variadic interfaces.
The downside is, macro definitions are generally more difficult to read, understand, and maintain than function definitions.
You must define macros or bring them into scope before you call them in a file, as opposed to functions you can define anywhere and call anywhere.
macro_rules! for General Metaprogramming
Before checking how the vec! macro works, here is a simple macro definition from “Rust by Example”:
// This is a simple macro named `say_hello`.
macro_rules! say_hello {
// `()` indicates that the macro takes no argument.
() => {
// The macro will expand into the contents of this block.
println!("Hello!");
};
}
fn main() {
// This call will expand into `println!("Hello!");`
say_hello!()
}
One argument macro:
fn main() {
// compiles OK
macro_rules! foo {
($l:tt) => {
bar!($l);
};
}
macro_rules! bar {
(3) => {};
}
foo!(3);
}
- tt is an abbreviation of “Token Tree”: a single token, or tokens within matching delimiters (), [], or {}.
- ($l:tt): the parentheses hold the pattern that the macro tries to match. In this case, the macro captures its argument as $l, and $l must be a TokenTree metavariable.
To achieve variadic interfaces, macros in Rust take a pattern inside the first pair of parentheses. To write one, we need some knowledge of expressions and metavariables.
Let’s quickly look at how we can implement a simplified version of the familiar vec! macro:
#[macro_export]
macro_rules! vec {
( $( $x:expr ),* ) => {
{
let mut temp_vec = Vec::new();
$(
temp_vec.push($x);
)*
temp_vec
}
};
}
- #[macro_export] and macro_rules! declare that we are defining an exportable macro.
- ( $( $x:expr ),* ): the input of the macro has the form $ ( MacroMatch+ ) MacroRepSep? MacroRepOp, where MacroMatch+ is the fragment bound to $x, the comma is the MacroRepSep, and * is the MacroRepOp, which indicates how many times the match repeats (in this case, zero or more times).
- $x:expr matches any Rust expression and binds it to the metavariable $x.
- , is a literal comma that separates the repeated expressions.
- * means the preceding pattern repeats zero or more times.
From this macro, vec![1,2,3] will generate the following code:
{
let mut temp_vec = Vec::new();
temp_vec.push(1);
temp_vec.push(2);
temp_vec.push(3);
temp_vec
}
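To try the simplified definition without shadowing the standard vec!, here is a sketch under the hypothetical name my_vec!. The body is the same as above. Note that this version has no rule for a trailing comma, so my_vec![1, 2, 3,] would not match.

```rust
// Same body as the simplified `vec!`, renamed to the hypothetical `my_vec!`
// so it does not shadow the standard macro.
macro_rules! my_vec {
    ( $( $x:expr ),* ) => {
        {
            let mut temp_vec = Vec::new();
            // This block is repeated once per matched `$x`.
            $(
                temp_vec.push($x);
            )*
            temp_vec
        }
    };
}

fn main() {
    let v = my_vec![1, 2, 3];
    println!("{:?}", v);

    // Zero repetitions also match; the element type comes from the annotation.
    let empty: Vec<i32> = my_vec![];
    println!("{:?}", empty);
}
```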
This book is a “getting-started” book, so we don’t go further into how to write macros.
The proc_macro crate is required.
…
OK, the getting-started book doesn’t provide a good tutorial on how to write my own macros. I’ll refer to other learning material when I need it.
Fragment specifiers | Name | Example |
---|---|---|
item | Item | |
block | BlockExpression | { let foo = 2;} |
stmt | Statement | let foo = 2; |
pat_param | PatternNoTopAlt | Refer to the section “Pattern and Matching” |
pat | Pattern (equivalent to pat_param before the 2021 edition) | Refer to the section “Pattern and Matching” |
expr | Expression | |
ty | Type | f64 , MyStruct , MyEnum |
ident | IDENTIFIER_OR_KEYWORD | foo in let foo: i32; |
path | TypePath | ::std::fmt |
tt | TokenTree | |
meta | Attr, the contents of an attribute | allow(unused_variables) in #![allow(unused_variables)] |
lifetime | LIFETIME_TOKEN | 'a in &'a i32 |
vis | Visibility qualifier (possibly empty) | pub in pub bar |
literal | LiteralExpression | "hello" ,r#"hi"# , 12 |
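As a quick sketch of how a few of these fragment specifiers combine, the macro below uses ident, ty, and expr together. The name make_getter is hypothetical and written only for this note. It generates a function with the given name and return type.

```rust
// Hypothetical macro: generates a function `$name` that returns `$value` as `$t`.
macro_rules! make_getter {
    ($name:ident, $t:ty, $value:expr) => {
        fn $name() -> $t {
            $value
        }
    };
}

// `answer` is matched as an ident, `i32` as a ty, `40 + 2` as an expr.
make_getter!(answer, i32, 40 + 2);
make_getter!(greeting, &'static str, "hello");

fn main() {
    println!("{} {}", answer(), greeting());
}
```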
But before we get started, we should mention one detail: the method we’ll use won’t be the best way to build a web server with Rust. A number of production-ready crates are available on crates.io that provide more complete web server and thread pool implementations than we’ll build.
std::net
cargo new hello
cd hello
src/main.rs:
use std::net::TcpListener;
fn main() {
let listener = TcpListener::bind("127.0.0.1:7878").unwrap();
for stream in listener.incoming() {
let stream = stream.unwrap();
println!("Connection established!");
}
}
- The bind function returns a Result<T, E>, which indicates that binding might fail.
- We use unwrap to stop the program if errors happen.
- The incoming method on TcpListener returns an iterator that gives us a sequence of streams (more specifically, streams of type TcpStream).
- Each item comes from one connection attempt seen by the incoming method.
- The TcpStream will read from itself to see what the client sent and then allow us to write our response to the stream.
- The for loop will process each connection in turn and produce a series of streams for us to handle.
- We call unwrap to terminate our program if the stream has any errors.
Test:
$ cargo run
# In another terminal
$ curl localhost:7878 -vvv
* Trying 127.0.0.1:7878...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 7878 (#0)
> GET / HTTP/1.1
> Host: localhost:7878
> User-Agent: curl/7.68.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
# cargo run terminal
$ cargo run
Compiling hello v0.1.0 (/home/atlex00/rust-projects/hello)
warning: unused variable: `stream`
--> src/main.rs:7:13
|
7 | let stream = stream.unwrap();
| ^^^^^^ help: if this is intentional, prefix it with an underscore: `_stream`
|
= note: `#[warn(unused_variables)]` on by default
warning: 1 warning emitted
Finished dev [unoptimized + debuginfo] target(s) in 0.98s
Running `target/debug/hello`
Connection established!
Connection established!
^C
$
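The listener loop above stops the whole server on the first failed connection attempt because of unwrap. As a hedged alternative, the sketch below matches on each Result instead. The helper accept_connections and its limit parameter are hypothetical; they exist so the demo terminates instead of looping forever. The demo binds to port 0 (any free port) and connects to itself, so it does not need curl.

```rust
use std::net::{TcpListener, TcpStream};

// Hypothetical helper: looks at up to `limit` connection attempts and
// returns how many succeeded, logging failures instead of panicking.
fn accept_connections(listener: &TcpListener, limit: usize) -> usize {
    let mut accepted = 0;
    for stream in listener.incoming().take(limit) {
        match stream {
            Ok(stream) => {
                accepted += 1;
                println!("Connection established: {:?}", stream.peer_addr());
            }
            Err(e) => eprintln!("Connection failed: {}", e),
        }
    }
    accepted
}

fn main() -> std::io::Result<()> {
    // Port 0 asks the OS for any free port, so the demo cannot collide with 7878.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Connect to ourselves so that `incoming` has something to yield.
    let _client = TcpStream::connect(addr)?;

    let accepted = accept_connections(&listener, 1);
    println!("Accepted {} connection(s)", accepted);
    Ok(())
}
```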
use std::io::prelude::*;
use std::net::TcpListener;
use std::net::TcpStream;
fn main() {
let listener = TcpListener::bind("127.0.0.1:7878").unwrap();
for stream in listener.incoming() {
let stream = stream.unwrap();
handle_connection(stream);
}
}
fn handle_connection(mut stream: TcpStream) {
let mut buffer = [0; 1024];
stream.read(&mut buffer).unwrap();
println!("Request: {}", String::from_utf8_lossy(&buffer[..]));
}
In the handle_connection function, we’ve made the stream parameter mutable. The reason is that the TcpStream instance keeps track of what data it returns to us internally. It might read more data than we asked for and save that data for the next time we ask for data. It therefore needs to be mut because its internal state might change; usually, we think of “reading” as not needing mutation, but in this case we need the mut keyword.
How to read from the stream, in 3 steps:
- Declare a buffer on the stack to hold the data. It’s 1024 bytes in the example.
- Pass the buffer to stream.read, which will read bytes from the TcpStream and put them in the buffer (stream.read(&mut buffer).unwrap();).
- Convert the bytes in the buffer to a string (String::from_utf8_lossy).
Test:
$ cargo run
# Other terminal
curl localhost:7878 -vvv -H "Host: myserver.com"
* Trying 127.0.0.1:7878...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 7878 (#0)
> GET / HTTP/1.1
> Host: myserver.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
# Cargo run terminal
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/hello`
Request: GET / HTTP/1.1
Host: myserver.com
User-Agent: curl/7.68.0
Accept: */*
^C
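One detail in the reading code above: read returns how many bytes it actually read, while the example converts all 1024 bytes of the buffer, including the unused zero bytes. The sketch below keeps only the bytes that were read. The helper read_request is a hypothetical name, and it takes any Read source (my assumption, so it can be tried with a byte slice instead of a real TcpStream).

```rust
use std::io::{self, Read};

// Hypothetical helper: reads once from any `Read` source (a `TcpStream`
// would work too) and returns only the bytes that were actually read.
fn read_request<R: Read>(mut source: R) -> io::Result<String> {
    // Step 1: a buffer on the stack.
    let mut buffer = [0; 1024];
    // Step 2: `read` fills the buffer and returns the number of bytes read.
    let n = source.read(&mut buffer)?;
    // Step 3: convert just the filled part of the buffer to a string.
    Ok(String::from_utf8_lossy(&buffer[..n]).into_owned())
}

fn main() -> io::Result<()> {
    // A byte slice stands in for the network stream in this demo.
    let fake_request = b"GET / HTTP/1.1\r\nHost: myserver.com\r\n\r\n";
    let text = read_request(&fake_request[..])?;
    print!("Request: {}", text);
    Ok(())
}
```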
First, return a response with only a status line and no HTTP body.
Change the handle_connection function as follows.
fn handle_connection(mut stream: TcpStream) {
let mut buffer = [0; 1024];
stream.read(&mut buffer).unwrap();
let response = "HTTP/1.1 200 OK\r\n\r\n";
stream.write(response.as_bytes()).unwrap();
stream.flush().unwrap();
}
hello.html (in the same directory as src):
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Hello!</title>
</head>
<body>
<h1>Hello!</h1>
<p>Hi from Rust</p>
</body>
</html>
Change the handle_connection function as follows.
use std::fs;
fn handle_connection(mut stream: TcpStream) {
let mut buffer = [0; 1024];
stream.read(&mut buffer).unwrap();
let contents = fs::read_to_string("hello.html").unwrap();
let response = format!(
"HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}",
contents.len(),
contents
);
stream.write(response.as_bytes()).unwrap();
stream.flush().unwrap();
}
Test:
$ curl localhost:7878
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Hello!</title>
</head>
<body>
<h1>Hello!</h1>
<p>Hi from Rust</p>
</body>
</html>
Return the page only for GET / requests.
Change the handle_connection function as follows.
fn handle_connection(mut stream: TcpStream) {
let mut buffer = [0; 1024];
stream.read(&mut buffer).unwrap();
let get = b"GET / HTTP/1.1\r\n";
if buffer.starts_with(get) {
let contents = fs::read_to_string("hello.html").unwrap();
let response = format!(
"HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}",
contents.len(),
contents
);
stream.write(response.as_bytes()).unwrap();
stream.flush().unwrap();
} else {
let contents = String::from("Panic!!");
let response = format!(
"HTTP/1.1 404 NOT FOUND\r\nContent-Length: {}\r\n\r\n{}",
contents.len(),
contents
);
stream.write(response.as_bytes()).unwrap();
stream.flush().unwrap();
}
}
Change the else part as follows.
else {
let status_line = "HTTP/1.1 404 NOT FOUND\r\n\r\n";
let contents = fs::read_to_string("404.html").unwrap();
let response = format!("{}{}", status_line, contents);
stream.write(response.as_bytes()).unwrap();
stream.flush().unwrap();
}
404.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Hello!</title>
</head>
<body>
<h1>Oops!</h1>
<p>Sorry, I don't know what you're asking for.</p>
</body>
</html>
Here is the refactored handle_connection function.
fn handle_connection(mut stream: TcpStream) {
let mut buffer = [0; 1024];
stream.read(&mut buffer).unwrap();
let get = b"GET / HTTP/1.1\r\n";
let (status_line, filename) = if buffer.starts_with(get) {
("HTTP/1.1 200 OK\r\n\r\n", "hello.html")
} else {
("HTTP/1.1 404 NOT FOUND\r\n\r\n", "404.html")
};
let contents = fs::read_to_string(filename).unwrap();
let response = format!("{}{}", status_line, contents);
stream.write(response.as_bytes()).unwrap();
stream.flush().unwrap();
}
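The routing decision in the function above can also be pulled out into a pure function, which is easier to try without a socket. This is only a sketch: build_response and its parameters are hypothetical names, and the bodies are passed in as strings instead of being read from hello.html and 404.html. Unlike the refactored version, it adds Content-Length to both responses.

```rust
// Hypothetical helper: picks a status line and body for a raw request.
// The bodies are passed in so that no file access is needed.
fn build_response(request: &[u8], ok_body: &str, not_found_body: &str) -> String {
    let (status_line, contents) = if request.starts_with(b"GET / HTTP/1.1\r\n") {
        ("HTTP/1.1 200 OK", ok_body)
    } else {
        ("HTTP/1.1 404 NOT FOUND", not_found_body)
    };
    // Content-Length is the body length in bytes.
    format!(
        "{}\r\nContent-Length: {}\r\n\r\n{}",
        status_line,
        contents.len(),
        contents
    )
}

fn main() {
    let ok = build_response(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n", "Hello!", "Oops!");
    let missing = build_response(b"GET /nothing HTTP/1.1\r\n\r\n", "Hello!", "Oops!");
    println!("{}\n---\n{}", ok, missing);
}
```

In handle_connection, the result could then be sent with stream.write_all(response.as_bytes()), which keeps writing until the whole response has been sent.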
The Default trait allows you to create a default value for a type.
Deriving Default implements the default function.
The derived implementation of the default function calls the default function on each part of the type, meaning all fields or values in the type must also implement Default to derive Default.
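A minimal sketch of deriving Default. The Config struct and its fields are hypothetical. Each field type (String, u32, bool) implements Default, so the derive works, and struct update syntax lets us override only some fields.

```rust
// Hypothetical struct; every field type implements `Default`.
#[derive(Debug, Default, PartialEq)]
struct Config {
    name: String,
    retries: u32,
    verbose: bool,
}

fn main() {
    // Calls `default` on each field: empty string, 0, false.
    let config = Config::default();
    println!("{:?}", config);

    // Override one field and take the rest from the default value.
    let custom = Config {
        retries: 3,
        ..Default::default()
    };
    println!("{:?}", custom);
}
```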
https://rust-lang.github.io/api-guidelines/naming.html
https://doc.rust-lang.org/std/prelude/index.html
The prelude is the list of things that Rust automatically imports into every Rust program. It’s kept as small as possible, and is focused on things, particularly traits, which are used in almost every single Rust program.
The following link was a very good introductory post: https://deepu.tech/memory-management-in-rust/
A great reference on how Rust allocates memory: https://www.youtube.com/watch?v=rDoqT-a6UFg