YARR is available on the web and as a PDF. Nicole's post is also helpful.
Introduction
This is a crash course in the Rust programming language. It's written with a specific audience in mind: software engineers who know how to program in a high-level language but aren't familiar with lower-level programming and want to learn to write Rust quickly.
It does not aim to be a comprehensive course, and it's best done with a Rust buddy who can help you through tricky bits. The aim of this course is to teach you just enough to pair program so that you can learn by doing with others after this course.
This course contains exercises which should be done as you go through to check your understanding, and it also contains exercises which may be done independently or in a group. We highly encourage pairing for these with someone who's experienced or a fellow learner. Learning together is highly effective, especially for new, difficult concepts like those Rust may expose you to.
Rust has a notoriously difficult learning curve because it forces you at the language level to deal with concepts that are typically hidden (such as lifetimes). Struggling with these concepts is expected. When you come out of this course, these concepts will still be fuzzy and unclear. That's expected. This course will set you off on your journey, and your understanding and confidence in these concepts will grow over time.
This is just the beginning.
Let's get started.
Contributing, Bug Reports, and License
Please see the repo for contribution and license information.
Bug reports and feedback may be sent to the discussion mailing list.
Setting Up Your Environment
Before we can move forward to learning Rust, we need to have a working local environment. There are a lot of online tools to run Rust code, but since the target here is to start pairing, local is best.
The preferred way to set up your local Rust toolchain is with rustup. Go ahead and install that, then come back here. You can check whether it's installed by asking for its version:
$ rustup --version
You should see something like this as the output:
rustup 1.25.1 (bb60b1e89 2022-07-12)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.68.0-nightly (cc47b0699 2023-01-08)`
If you get an error, try closing your terminal and opening a new one; the installation adds itself to your path and that doesn't get reloaded automatically.
The Rust language moves quickly, so we can benefit from a lot of improvements by using the nightly toolchain instead of stable. Set it as your default like so:
$ rustup default nightly
And then make sure it's up to date:
$ rustup update
That's all you need, and you should be up and running now! Let's go take a look at our first Rust code.
Rust Analyzer
Before you start writing too much Rust, you will also want to install rust-analyzer. This is a tool which provides a lot of nice features in your IDE or editor, like completions and refactoring help. Crucially for me, it can also display inline type annotations for inferred types, which can greatly aid understanding of Rust programs.
It's by no means necessary for working with Rust, but if you're going to do any work with Rust it's an extremely helpful tool.
Instructions on how to configure it are out of scope of this guide, as it will be different for every editor.
Exercise: Look up instructions for your editor of choice and set it up to use Rust Analyzer.
Hello World
In the time-worn tradition, our first Rust program will be getting our computer to greet this wide world of ours.
Let's start by making a directory to work out of [1]:
$ cd ~
$ mkdir yarr
$ cd yarr
For the rest of this course, we'll assume that you're in the ~/yarr directory.
Now let's make our first Rust program. When we installed rustup and the Rust toolchain, it included a tool called cargo. This is the Swiss army knife for Rust. It takes care of a lot of things for you, including:
- Installing and updating packages
- Creating new projects
- Running your tests
- Executing your binary
- Formatting and linting your code
- Publishing your library
We'll learn by doing, but feel free to explore the other functions with cargo help or in the Cargo docs.
To create a new project, you can use cargo new:
$ cargo new hello_world
Created binary (application) `hello_world` package
If you print out the directory that was created, there are a few files:
hello_world/
├── Cargo.toml
└── src
    └── main.rs
Now we can run the program:
$ cd hello_world
$ cargo run
Compiling hello_world v0.1.0 (/home/ntietz/Code/ntietz/rust-crash-course/hello_world)
Finished dev [unoptimized + debuginfo] target(s) in 0.55s
Running `target/debug/hello_world`
Hello, world!
And we printed it out!
Whoops, we cheated our way to "hello world" without actually writing the program! But we know our toolchain works. Let's delete the existing file and write our own:
$ rm src/main.rs
Now open src/main.rs as a new file in your editor of choice, and write this program:
fn main() {
    println!("Ahoy, matey!");
}
Now when you run it, you should see our new, corrected, greeting:
$ cargo run
Compiling hello_world v0.1.0 (/home/ntietz/Code/ntietz/rust-crash-course/hello_world)
Finished dev [unoptimized + debuginfo] target(s) in 0.27s
Running `target/debug/hello_world`
Ahoy, matey!
A few quick notes on the program we wrote, then we'll move on:
- This declares the main function, which has no return value here but can have a return value in some cases; we'll cover that later in Error Handling.
- It does only one thing, which is invoking println!. This looks like a function, but it's a macro. Macro invocations always end in !, so when you see that you know it's a macro. You can think of macros as fancy powerful functions, and use them just like you would use a function. We won't write our own in this course.
Exercises:
- Repeat the hello world program from scratch in a new directory, going by memory as much as possible.
- Find and read the documentation on println!. What do you find surprising? What do you find normal?
And now we're ready to move on! Stretch, shake it out, and then move on to learn about variables, functions, and control flow!
[1] If you prefer a different working directory, feel free to use it instead!
Variables
This chapter has a few concepts to cover:
- Creating variables
- Giving a variable a type
- Primitive types
Creating Variables
In Rust, you can create a variable much like you would in other languages:
#![allow(unused)]
fn main() {
    let x = 10;
}
The base syntax is straightforward: give it a name, give it a value. This creates an immutable variable. If you try to change it after this, you'll get an error:
#![allow(unused)]
fn main() {
    let x = 10;
    x = 11;
}
If you do this in a file and try to compile it, you'll get a pretty helpful error:
error[E0384]: cannot assign twice to immutable variable `x`
 --> src/main.rs:3:5
  |
2 |     let x = 10;
  |         -
  |         |
  |         first assignment to `x`
  |         help: consider making this binding mutable: `mut x`
3 |     x = 11;
  |     ^^^^^^ cannot assign twice to immutable variable

For more information about this error, try `rustc --explain E0384`.
To fix this error, we learn another keyword: mut. This keyword is used to create mutable variables (or references).
If we run this program instead, it will compile:
#![allow(unused)]
fn main() {
    let mut x = 10;
    x = 11;
}
A note on mut
The relation between the mut keyword and the concept of mutability is apparent in its name and in its usage: it allows you to mutate things.
However, this hides another way of thinking about it which is very helpful in the context of references later on. How do you achieve mutability? By enforcing exclusive access: while there is a mutable reference to a variable, there can be no other references to it (but you can use that one reference multiple times). So a mut reference is also an exclusive reference. (We'll talk about references later on.)
More on that later in the ownership and references section!
Type Annotations
Rust is a statically typed language. Everything is given a type at compile time. This does not mean, however, that we need to write types for everything! Like many modern languages, it employs type inference to figure out what types things are if you don't say. That's how Rust was able to compile our code up above without any types on it.
If you use a tool like rust-analyzer, you can see inline type hints. If you use that, you'll be able to see which types are inferred for a given variable, which is invaluable.
Sometimes, though, you just have to write the types down yourself! The compiler can't always figure out what you meant. Writing the type also makes your intent clearer to other programmers and to the compiler: if the inferred type doesn't match what you intended, the mismatch is caught at compile time rather than at run time.
The syntax for type annotations will feel somewhat familiar if you're used to Python or TypeScript, which have similar syntax. Let's say you're declaring a string and a 32-bit unsigned integer. You could write:
#![allow(unused)]
fn main() {
    let name: &str = "Captain Blackbeard";
    let age: u32 = 35;
}
In a variable initializer, the type annotation is the : <type> portion.
There are a few other places you'll see type annotations, like in functions and closures and structs. We'll cover those when we get there, but they're all of this form with a colon and a type.
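To make the "compiler can't always figure out what you meant" case concrete, here is a small sketch. The function name describe_crew and its message are made up for illustration. The parse method can produce many different number types, and nothing else in the function pins one down, so the annotation on size is what tells the compiler which type we want. It uses a function and format!, which works like println! but gives back a string instead of printing it.

```rust
// Hypothetical helper for illustration: turns text like "42" into a message.
fn describe_crew(text: &str) -> String {
    // `parse` can produce many numeric types. Without `: u32` here the
    // compiler reports that type annotations are needed.
    let size: u32 = text.parse().expect("text should be a whole number");
    format!("crew of {size}")
}

fn main() {
    println!("{}", describe_crew("42"));
}
```

Try deleting the : u32 and compiling again to see the error the compiler gives you.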
Primitive Types
Rust has a bunch of primitive types to help you express what you want to write! The primitives are well documented in the Rust docs, which you can look to for more details.
- Unsigned integers: u8, u16, u32, u64, u128, and usize (8-bit through 128-bit numbers, respectively). These can be written with bare literals like 123, or you can append on the integer type, like 123u8. usize is a pointer-sized unsigned integer; on 64-bit systems, this is often a 64-bit integer, and it's the type you use to index into collections.
- Signed integers: i8, i16, i32, i64, i128, and isize. These can be written with bare literals like 456, or you can append on the integer type, like 456i32. isize is a pointer-sized signed integer, which is handy for representing the difference between two indexes.
- Floating point numbers: f32 and f64. These can be written with bare literals like 1.2, or they can be written 1.2f32 or 1.2f64.
- Characters: char is a 4-byte Unicode character. These can be written with single quotes around them, like 'a'.
- Booleans: bool is true or false.
- Strings: str is the primitive type for a string. String handling is complicated in Rust [1], but this is the primitive type, which is a UTF-8 encoded string (or "string slice"). It's usually seen in the reference form, &str, but more on references later. Since strings are UTF-8 encoded, characters have variable-width encoding, which complicates accessing characters at specific indices.
- Unit: () is the "unit type". This can be written with a literal (), and it basically means... nothing. It's an empty, 0-element tuple, and it's the equivalent of void in other languages.
There are also a couple of compound types worth mentioning here, which group values of other types together.
- Tuples: (u8, bool) is the type of a two-tuple of an unsigned 8-bit integer and a boolean. You can make a tuple from as many elements as you want, practically. A tuple value is of the form (123, true), which would have the previous type.
- Arrays: an array in Rust is a fixed-size list of values, with the size as part of its type. The array [0, 1, 2, 3, 4] may have the type [u8; 5], for example.
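Here is a short sketch that puts several of these types side by side. The variable names and the helper first_and_last are invented for this example. It also uses two bits of syntax not covered yet: indexing an array with square brackets, and {:?} to print arrays and tuples.

```rust
use std::mem::size_of;

// Hypothetical helper: returns the first and last elements as a tuple.
fn first_and_last(numbers: [u8; 5]) -> (u8, u8) {
    (numbers[0], numbers[4])
}

fn main() {
    let small = 255u8; // the suffix picks the integer type
    let big: i64 = -456; // or use an annotation instead
    let ratio = 1.5f32;
    let letter = 'a';
    let numbers: [u8; 5] = [0, 1, 2, 3, 4];
    let pair = first_and_last(numbers);

    println!("{small} {big} {ratio} {letter} {:?} {:?}", numbers, pair);
    // size_of reports how many bytes a type takes up.
    println!("char: {} bytes, (): {} bytes", size_of::<char>(), size_of::<()>());
}
```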
Casting Between Types
Sometimes you have a primitive of one type, but you want it in another type! This happens often with numbers, where you'll have (say) an 8-bit integer but need to use a 64-bit integer for something. Rust does not do any implicit casting for you. You must explicitly say that you want it to happen, which helps prevent overflow errors.
Casting with the as keyword is straightforward. You take the variable, and say as <type> for what you want to cast it into. If it can't be done, the compiler will tell you!
You do have to be careful when you cast to a smaller type that the value still fits. If it doesn't, the behavior you get is well-defined but may be unexpected.
Here is an example of a cast:
#![allow(unused)]
fn main() {
    let x: i64 = -13;
    let y: u8 = 10;
    let z = x + (y as i64);
}
In this example, we had to cast y so that we could add it to x. If we didn't, we'd get a compiler error. Try modifying this to remove the cast and see what error you get. Also try casting x to a u8 and see what you get instead.
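As a sketch of what "well-defined but may be unexpected" looks like, here are two small helpers (the names are made up for this example). When an integer doesn't fit in the smaller type, as keeps only the low bits. If you'd rather find out that the value doesn't fit, the standard library's u8::try_from gives you a checked conversion instead; it returns a Result, which we haven't covered yet, so we only ask it whether it succeeded.

```rust
use std::convert::TryFrom;

// `as` never fails: values that don't fit are truncated to the low 8 bits.
fn truncate_to_u8(value: i32) -> u8 {
    value as u8
}

// `try_from` reports whether the value fits instead of silently truncating.
fn fits_in_u8(value: i32) -> bool {
    u8::try_from(value).is_ok()
}

fn main() {
    println!("300 as u8 = {}", truncate_to_u8(300)); // 300 - 256 = 44
    println!("-1 as u8 = {}", truncate_to_u8(-1)); // 255
    println!("does 300 fit in a u8? {}", fits_in_u8(300));
}
```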
Exercises:
- From memory, try to recall the different primitive types in Rust, and write them down. How many did you get right? How many did you miss?
- Write a program that creates three integers of different types and multiplies them together. Print the result.
[1] Or rather, Rust forces you to acknowledge how complicated strings really are. They're among the more confusing bits coming in, I find.
Control Flow
Programs aren't very useful if they just follow a linear path forward. To do useful things, we need conditional statements and loops! Rust can do this, of course.
This section is ostensibly about control flow. It's also about Rust as an expression-oriented language: almost everything in Rust returns a value! This is one of the things that makes Rust really ergonomic, and also one of the most confusing if you haven't used a language like it before.
Also, a note: we'll use println! quite a bit through this section in a few forms. The fmt docs have thorough explanations of all the different ways you can format output with println!. Please refer there if you need any help understanding the println! usage on this page.
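As a quick preview, here is a sketch of the handful of println! forms that show up most often. The names and values are made up, and the describe helper uses format!, which takes the same format strings as println! but returns the text instead of printing it.

```rust
// Hypothetical helper: same format syntax as println!, but returns a String.
fn describe(name: &str, gold: f64) -> String {
    // {gold:.2} prints the number rounded to two decimal places.
    format!("{name} has {gold:.2} doubloons")
}

fn main() {
    let name = "Anne";
    let gold = 3.14159;

    println!("{}", name); // positional argument
    println!("{name}"); // variable captured by name
    println!("{:?}", name); // debug formatting, which adds quotes to strings
    println!("{:>8}|", name); // right-aligned in a field 8 characters wide
    println!("{}", describe(name, gold));
}
```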
Blocks
Blocks aren't really control flow per se, but since every control flow mechanism takes a block, here we are.
A block is just a section of code surrounded by curly braces:
#![allow(unused)]
fn main() {
    {
        println!("I'm in a block!");
    }
}
There are two main notable things about blocks:
- Blocks delimit a scope, so any variables you declare inside a block are out of scope outside of it; this is super handy for temporary variables.
- Blocks return a value, which is precisely the value of the last expression in the block. If the last expression of the block ends with a semicolon, then it's a statement, which returns (), the unit value, so it functionally has no return value.
Let's see a couple of examples of this and how you'd use it.
#![allow(unused)]
fn main() {
    let msg = "Hello, world";
    {
        // we're in the pirate block now
        let msg = "Ahoy, matey!";
        println!("{msg}");
    }
    println!("{msg}");
}
Exercise: What do you think this will print when you run it? Try to figure it out before executing it to test your understanding and intuition!
Since blocks declare a new scope, the msg variable inside the block shadows the msg variable on the outside, and does not change its value.
Note: Examples here are going to be rather contrived because we're avoiding things like structs and methods for now, trying to stick to (largely) just the syntax introduced so far.
Okay, so we saw scope delimiting. Here's an example of a block returning a value:
#![allow(unused)]
fn main() {
    let parrots = 5;
    let shipmates = 10;
    let legs_on_ship = {
        let parrot_legs = 2 * parrots;
        let human_legs = 2 * shipmates;
        parrot_legs + human_legs
    };
}
Here we end up with legs_on_ship having the value 30, and the temporary variables (parrot_legs and human_legs) go out of scope when the block ends.
What's the real-world use for this usage of blocks? A common use is to do what we saw in the previous example and use it to constrain the scope of temporary variables. Doing this inside a block allows you to set things up in a readable manner without polluting the outer scope. Another common use is to release resources; you can lock a mutex at the beginning of a block, and when the block ends it will be released (like with lock: in Python).
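Here's a rough sketch of the mutex case, using std::sync::Mutex from the standard library. The function name add_to_hold and the numbers are invented for illustration, and it uses references (the & and *), which we'll get to later. Calling lock gives back a guard, and the mutex stays locked for as long as that guard is in scope.

```rust
use std::sync::Mutex;

// Hypothetical helper: adds gold to a shared hold and returns the new total.
fn add_to_hold(hold: &Mutex<u32>, amount: u32) -> u32 {
    {
        // The mutex is locked for as long as `gold` (the guard) is in scope.
        let mut gold = hold.lock().unwrap();
        *gold += amount;
    } // the block ends here, so the guard is dropped and the lock is released

    // Locking again works because the first guard is gone. If it were
    // still alive, this second lock would never succeed.
    let gold = hold.lock().unwrap();
    *gold
}

fn main() {
    let hold = Mutex::new(10);
    let total = add_to_hold(&hold, 5);
    println!("gold in the hold: {total}");
}
```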
Ifs
Blocks are neat, but if-else is how we really get stuff done.
It works how you'd expect from other languages, with a few notable things:
- Parentheses around the condition are optional (and usually considered un-idiomatic)
- if-else-expressions return a value! Just like with blocks, this is the value of the last expression in each branch [1].
- The branches must be surrounded by curly braces (yes, even if it's just a single statement [2])
Here's a simple one:
#![allow(unused)]
fn main() {
    let plunder = 7; // example value so the snippet compiles on its own
    if plunder > 5 {
        println!("A good haul");
    } else {
        println!("Just jetsam");
    }
}
And then using it to get back a value. Since Rust doesn't have the conditional/ternary operator, this is the way you set a value conditionally:
#![allow(unused)]
fn main() {
    let is_crew_member = true;
    let greeting = if is_crew_member {
        "Ahoy!!! Welcome aboard!"
    } else {
        "Yarrrrr get off me ship"
    };
    println!("{}", greeting);
}
This block will print a different greeting depending on whether or not you're a crew member.
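You can also chain more branches with else if, and the whole chain is still one expression. Here's a small sketch; the function shares_for and its ranking scheme (0 for the captain, 1 for an officer, anything else for crew) are made up for this example. Every branch has to produce the same type, here a u32.

```rust
// Hypothetical ranks: 0 is the captain, 1 is an officer, anything else is crew.
fn shares_for(rank: u32) -> u32 {
    // The whole if / else if / else chain evaluates to a single value.
    let shares = if rank == 0 {
        4
    } else if rank == 1 {
        2
    } else {
        1
    };
    shares
}

fn main() {
    println!("the captain gets {} shares", shares_for(0));
    println!("a deckhand gets {} share", shares_for(7));
}
```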
Loops
The final basic control flow constructs for getting stuff done are loops. There are a few kinds of loops, so we'll just whirlwind through them.
There are three kinds of loops:
- loop-expressions
- while-expressions
- for-expressions
After those, we'll go through an extra: returning values from a loop.
loop expressions
These are the basic infinite loops. You have to exit out of them manually with the break keyword. Otherwise, they keep going forever.
#![allow(unused)]
fn main() {
    // Don't do this, it will run forever
    loop {
        println!("Wheeeee");
    }
}
To have the loop terminate, you have to break. Usually you want that on a condition:
#![allow(unused)]
fn main() {
    let mut count = 0;
    loop {
        count += 1;
        println!("iteration {count}");
        if count >= 10 {
            break;
        }
    }
}
while expressions
These work like loop-expressions with the added bonus of having a condition to halt, so you don't have to manually break out of them. Otherwise, they're the same: you give a body and it gets run each time until the condition evaluates to false.
#![allow(unused)]
fn main() {
    let mut count = 0;
    while count < 10 {
        count += 1;
        println!("iteration {count}");
    }
}
This should behave the same as the loop above. Note that since we run while the condition is true, we don't have to do an awkward inversion of it to decide when to break. It's much more convenient.
Of course, you can also use break statements in these. There just usually isn't as much need to.
for expressions
The for loop in Rust is one of the ways you iterate over an iterable, which is typically a collection of things or a range ("from 0 to 10"). We'll see an example of both. There will be some syntax you're not familiar with, but we'll come back to the collections later on.
#![allow(unused)]
fn main() {
    for count in 0..10 {
        println!("iteration {}", count + 1);
    }
}
This behaves the same as above. 0..10 gives us the range from 0 to 10, excluding the 10. That means we have to do the awkward +1 to get count the same. We can specify the range as an inclusive range instead to avoid that, using 1..=10, which ranges from 1 to 10, including the 10.
#![allow(unused)]
fn main() {
    for count in 1..=10 {
        println!("iteration {count}");
    }
}
Collections of things are similar. Let's see a basic example with an array. We'll see examples from other collections later on, when we talk about the standard library.
#![allow(unused)]
fn main() {
    for prime in [2, 3, 5, 7, 11] {
        println!("{prime} is prime.");
    }
}
And like with the other types of loops, you can use break in for loops! You usually won't need it, but it comes in handy occasionally.
Loop values
We mentioned before that in Rust, most expressions return values. But if you try to do that with a for loop, you're going to see a compiler error:
#![allow(unused)]
fn main() {
    let x = for count in 0..3 {
        count * 2
    };
}
error[E0308]: mismatched types
 --> src/main.rs:4:5
  |
4 |     count * 2
  |     ^^^^^^^^^ expected `()`, found integer
  |
help: you might have meant to break the loop with this value
  |
4 |     break count * 2;
  |     +++++          +

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` due to previous error
One of the really nice things with Rust is that the compiler messages are often helpful. (Sometimes, they try to be helpful... Trust yourself over the compiler, it cannot know your intent!) In this case, it tells us precisely what we need to do: use the break keyword.
There are two things to unpack here:
- The loop body must result in (), so the last statement needs to end in a semicolon.
- If you want the loop to result in a value, you have to emit that value using the break keyword.
Here's the previous example, but modified to work (and made a little more interesting):
#![allow(unused)]
fn main() {
    let x = for count in 0..3 {
        if count > 1 {
            break count * 2;
        }
    };
}
Oops, another compiler error!
   Compiling playground v0.0.1 (/playground)
error[E0571]: `break` with value from a `for` loop
 --> src/main.rs:5:9
  |
3 | let x = for count in 0..3 {
  |         ----------------- you can't `break` with a value in a `for` loop
4 |     if count > 1 {
5 |         break count * 2;
  |         ^^^^^^^^^^^^^^^ can only break with a value inside `loop` or breakable block
  |
help: use `break` on its own without a value inside this `for` loop
  |
5 |         break;
  |         ~~~~~

For more information about this error, try `rustc --explain E0571`.
error: could not compile `playground` due to previous error
See, the problem is that while- and for-expressions are not going to be guaranteed to have a value, because they may hit their termination before they hit the break statement.
We can do it with loop though. This one will compile, I promise:
#![allow(unused)]
fn main() {
    let mut count = 0;
    let x = loop {
        if count > 1 {
            break count * 2;
        }
        count += 1;
    };
}
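If you do want a for loop, one common workaround (a sketch, not the only way) is to declare a mutable variable before the loop, assign to it inside, and break without a value. The function name and its parameters here are made up. Notice that we have to pick a fallback value up front, which is exactly the "may never hit the break" problem described above.

```rust
// Hypothetical helper: doubles the first count above `threshold`,
// or returns the fallback of 0 if the loop finishes without finding one.
fn double_of_first_above(threshold: u32, limit: u32) -> u32 {
    let mut result = 0; // fallback value for when we never break
    for count in 0..limit {
        if count > threshold {
            result = count * 2;
            break; // a plain break is allowed in a for loop
        }
    }
    result
}

fn main() {
    println!("{}", double_of_first_above(1, 3)); // count reaches 2, so 4
    println!("{}", double_of_first_above(5, 3)); // never breaks, so 0
}
```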
And there you have it: control flow! The first of many things Rust provides to do useful things.
Exercises:
- Write fizz buzz using a while loop.
- Now write fizz buzz using a for loop.
We'll see you in the next section for functions.
[1] If you use it this way, the branches must have matching types. You can't, say, return a string from one branch and an integer from the other.
[2] And thank goodness, too. This helps prevent some infamous bugs with major security implications, like goto fail.
Functions
Just as with control flow, doing useful things is a whole lot easier if you have functions available. They're there in every major high-level language, and they're there in Rust as well.
Functions in Rust are defined with the fn keyword (often pronounced "fun", or if you're feeling edgy, "effin'") followed by the function name, parameter list, return type, and function body.
That's a few words, so let's just see an example and break it down.
Let's pretend that for some totally-not-contrived reason, we need a function which adds 10 to whatever argument we pass in. We give it 5, we want to get back 15. And let's also pretend we only care about 32-bit unsigned integers. That function would look something like this:
#![allow(unused)]
fn main() {
    fn add_10(x: u32) -> u32 {
        let value = x + 10;
        value
    }
}
fn says we're starting a function, and add_10 gives it a name.
Then we have the parameter list. We only have one thing: x: u32, which says we'll accept a parameter named x with the type u32.
The -> u32 is our return type; if you omit it, it's the same as writing -> (), or returning the unit type (kind of like the return type void in TypeScript or C/C++).
Then we have the function body. This is a block like we talked about in control flow. Inside we have two lines. The first creates a variable named value, and the second returns it implicitly.
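To see the pieces side by side, here is a sketch with three small variations. Only add_10 comes from the text above; add_10_explicit and announce are names invented for this example. The first two return the same thing, one implicitly and one with the return keyword, and the third leaves off the return type so it returns ().

```rust
// Implicit return: the last expression (no semicolon) is the return value.
fn add_10(x: u32) -> u32 {
    x + 10
}

// Explicit return: same result, using the return keyword.
fn add_10_explicit(x: u32) -> u32 {
    return x + 10;
}

// No return type written, so this returns (), the unit type.
fn announce(x: u32) {
    println!("the answer is {x}");
}

fn main() {
    announce(add_10(5));
    announce(add_10_explicit(5));
}
```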
Examples
Now we have everything we need to do some more interesting things. We'll walk through one example here, then you can work through the other one on your own.
Let's take the classic Fibonacci function. First we create our main function, as every program needs, and we also create a stub for fibonacci. We'll just return a dummy value for now.
fn main() {
    let x = fibonacci(10);
    println!("{x}");
}

fn fibonacci(n: u32) -> u32 {
    0
}
Now we need to fill in that function body. It's a straightforward translation from the algorithm:
fn main() {
    let x = fibonacci(10);
    println!("{x}");
}

fn fibonacci(n: u32) -> u32 {
    if n < 2 {
        return n;
    }
    fibonacci(n - 1) + fibonacci(n - 2)
}
And now it should print 55!
Now, one to try on your own. I'll give you the main:
fn main() {
    fizzbuzz_up_to(100);
}
Now you should define a function named fizzbuzz_up_to which takes a number as a parameter and does Fizzbuzz. You'll need to use loops and define at least one function to achieve this!
Exercise: Write fizz buzz using a function, taking a parameter for the max.
Good luck. Come back here when you've completed that.
Memory Management
One of the defining characteristics of Rust is that it gives you full control over memory: it doesn't have a runtime garbage collector, and it doesn't do reference counting unless you explicitly ask it to. When things get allocated and deallocated is entirely in your control (but constrained by some safety rules; you can't deallocate it then use it again).
This means to really get Rust, we have to also really get how memory management works. There's only a little bit here that's specific to Rust; it's more of a primer on memory management for those coming from higher level languages who may not have seen this before, or in a while.
Stack and Heap
The first thing to know is that there are two places memory gets allocated: on the stack and on the heap. If you're familiar with the data structures, you're halfway there. The stack is a stack like you're familiar with: things get pushed on and popped off (but also referenced from the middle of the stack). On the other hand, the heap is just a big open field of memory (not a heap data structure) where we can allocate things.
The stack is controlled by the program's execution, comprising call frames (pushed on for function calls) and local variables of those functions. If something is on the stack, it's going to only live as long as the function it's inside. That's why you can't return references to things that are local variables, because they'll get deallocated when the function ends and its call frame is popped off the stack.
The heap is different. It's this big wide open space where variables can live as long as they'd like. Well, until someone comes along and deallocates them. When programs allocate memory on the heap, then it has to be deliberately deallocated later. The heap gives us basically unbounded memory (up to what's made available by your operating system) that can live for a long time. The penalty is that that memory has to be managed (we have to deallocate it sometime, not automatically when call frames get popped) and it is often less efficient than memory on the stack [1].
Managing Memory
Different programming languages have different ways of handling heap-allocated memory.
The base level where the language gives you no help is manual memory management. This is what you get in C and C++, where you have to explicitly allocate and deallocate memory. In C, these functions are malloc to allocate memory, free to release it, and realloc to resize some allocation.
With manual memory management, you have all the power but you also have all the problems. This technique is where we get vulnerabilities like buffer overflows leading to remote code execution, or issues with use-after-free which can wreak all sorts of havoc. With great power comes great responsibility.
The highest amount of help is in languages with automatic memory management. This is what you see in most programming languages. There are a few forms; the two most common are (tracing) garbage collection (as you see in Java and Go, for example) and reference counting (as you see in Python and Swift, for example). These techniques largely handle memory management for you so you don't have to think about it, and you avoid some major problems with manual memory management. Notably, you cannot have use-after-free issues, and it's usually harder to have buffer overflows lead to arbitrary memory access, since the runtime checks for that.
Rust is a special language. It doesn't employ automatic memory management, but it also avoids the pain and problems that manual memory management brings. You get all the power of manual memory management, and usually don't have to worry about it at all. It does this by keeping track of "ownership" of memory (we'll talk about this in the next section) and the lifetimes of memory, and deallocating memory automatically when it won't be used anymore.
Rust also places some restrictions on what you can do to make this feasible.
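To watch that automatic cleanup happen, here's a sketch that records the moment each value is cleaned up. Fair warning: it leans on several things this course hasn't covered yet (structs, the standard library's Drop trait, RefCell, and a lifetime annotation), and the Chest type and drop_order function are made up for illustration. The part to focus on is the order of the entries in the log: each value is cleaned up right where its block ends, without us asking.

```rust
use std::cell::RefCell;

// Hypothetical type that writes to a shared log when it is cleaned up.
struct Chest<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

// Drop is the standard library hook that runs when a value goes away.
impl<'a> Drop for Chest<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Chest { name: "outer", log: &log };
        {
            let _inner = Chest { name: "inner", log: &log };
            log.borrow_mut().push(String::from("inner block ending"));
        } // _inner is cleaned up here, automatically
        log.borrow_mut().push(String::from("outer block ending"));
    } // _outer is cleaned up here
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{line}");
    }
}
```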
References and Pointers
When we talk about memory management, we're talking about how much memory we have, where it is, and (to an extent) what it represents. This leads us to a type we haven't talked about yet: references. (And their cousins, pointers.)
References are found in most languages, but they're not usually called out as such. It's a concept hidden below the surface [2].
A reference is just a way of referring to a variable (or memory) at another location. So if you have a variable x, then a reference to x would be written &x, and is a way of letting you access x. To access the underlying value, you dereference the reference with the * operator: if ref_x holds &x, you write *ref_x.
This is a little abstract, so let's see it in action.
#![allow(unused)]
fn main() {
    // First create a variable, x
    let x: u32 = 10;
    // Then create a reference to it:
    let ref_x: &u32 = &x;
    // Then print out the value behind the reference!
    println!("x = {}", *ref_x);
}
Rust will also automatically dereference references for you, so usually you can omit the * dereference operator:
#![allow(unused)]
fn main() {
    // First create a variable, x
    let x: u32 = 10;
    // Then create a reference to it:
    let ref_x: &u32 = &x;
    // Print out the value without explicitly dereferencing it
    println!("x = {}", ref_x);
}
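References really earn their keep when you pass them to functions. Here's a sketch with two made-up helpers: double only reads through a shared reference, while add_one changes the original through a mut reference (the exclusive kind mentioned back in the note on mut).

```rust
// Reads the value behind a shared reference.
fn double(n: &u32) -> u32 {
    *n * 2
}

// Changes the caller's variable through a mutable (exclusive) reference.
fn add_one(n: &mut u32) {
    *n += 1;
}

fn main() {
    let mut x: u32 = 10;
    let doubled = double(&x);
    add_one(&mut x);
    println!("x = {x}, doubled = {doubled}");
}
```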
Pointers have the same underlying representation as references: both hold an address to another place in memory. But while a reference refers to a variable at another location, a pointer is just an address in memory. This means that unlike with references, Rust doesn't make guarantees about what a pointer can do, so you have to use unsafe Rust to use it and access memory at that location.
You generally don't need to use pointers in Rust unless you're doing something very specific, and using them entails using unsafe Rust. It's good to know of them, but we won't go into the details on how to use them at all.
Allocating on the Stack or on the Heap
To allocate on the stack, you don't do anything special. Any local variable is allocated on the stack, since it's part of the call frame for the function you're in.
To allocate on the heap, you have to use a type that causes heap allocation. We'll talk about that in the next section!
Exercises:
- In your own words, what is the difference between allocating on the heap and on the stack?
- In your own words, what is the difference between a reference and a pointer?
These exercises are particularly challenging due to how abstract the concepts are when you haven't put them in practice. Be kind to yourself!
[1] Why this is so is a bit out of scope, but it's due to memory locality and how things get retrieved. The variables in the stack will likely already be in the CPU cache when the related code is executing, but heap memory often needs a slower fetch from main memory, since the CPU can't predict as well what you're going to need.
[2] This is one of the things I greatly appreciate about Rust: it doesn't hide these fundamental concepts, so while there's more work to understand things up front before you use the language, you can rest confident that you better understand what's happening, and there's less unexpected behavior.
Heap Allocation
Sometimes you need to allocate something on the heap. Maybe you don't know its size at compile time, or you need it to live independently of the scope it's in. When you want to allocate something on the heap, there are a few types you can use to do so [1].
Boxed values
The most basic, fundamental heap allocation is using Box. A boxed value is simply one that lives on the heap instead of the stack. A Box is a boxed value, and it's generic: a Box<u32> is a u32 that's heap-allocated, and a Box<f64> is a 64-bit float that's heap-allocated.
To construct one, you use the Box::new constructor. Often the compiler can infer what type the Box has, but if not, you have to provide type annotations. Here are two ways to do that:
fn main() {
    let x: Box<u32> = Box::new(42);
    let y = Box::<f64>::new(4.2);
}
To move a value back onto the stack, out of the box, you can dereference it:
fn main() {
    let x: Box<u32> = Box::new(42);
    let y = *x; // y is now a u32 on the stack
}
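To make the "live independently of the scope it's in" case concrete, here is a minimal sketch of a function that builds a boxed value and hands it back to its caller. The name make_boxed is made up for this illustration.

```rust
// Hypothetical helper: allocates a u32 on the heap and returns the Box.
// The heap allocation outlives this function's stack frame.
fn make_boxed(n: u32) -> Box<u32> {
    Box::new(n * 2)
}

fn main() {
    let mut boxed = make_boxed(21);
    // Dereference to modify the value in place on the heap.
    *boxed += 1;
    // u32 is Copy, so dereferencing copies the value onto the stack.
    let on_stack: u32 = *boxed;
    println!("boxed = {boxed}, on_stack = {on_stack}");
}
```

When boxed goes out of scope at the end of main, the heap memory is freed for you.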
Vecs
Sometimes you just need to build a list of things. You do this with Vec, which is the type for dynamically sized arrays in Rust. (Rust's arrays are of fixed size.)
fn main() {
    let mut parrots: Vec<&str> = Vec::new();
    parrots.push("Shivers");
    parrots.push("Tweety");
    parrots.push("Dinner");
    println!("parrots: {:?}", parrots);
}
You can also do it inline, using the vec! macro:
fn main() {
    let parrots = vec!["Shivers", "Tweety", "Dinner"];
    println!("parrots: {:?}", parrots);
}
You can also iterate over the elements of a Vec:
fn main() {
    let parrots = vec!["Shivers", "Tweety", "Dinner"];
    for parrot in parrots.iter() {
        println!("{} says hi.", parrot);
    }
}
Other collection types
The standard library has a variety of other collection types, which are helpful for just about anything you need. The std::collections docs have a lot of information about the types available and when you would want to use each.
As a sampling:
- Vec
- HashMap and BTreeMap
- HashSet and BTreeSet
We won't go into the details on this page.
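Still, a quick taste may help. This is a rough sketch using two of the types from the list; the helper count_unique and the parrot data are invented for the example.

```rust
use std::collections::{BTreeMap, HashSet};

// Hypothetical helper: counts how many distinct names appear in a list.
fn count_unique(names: &[&str]) -> usize {
    // Collecting into a HashSet drops the duplicates.
    let unique: HashSet<&str> = names.iter().copied().collect();
    unique.len()
}

fn main() {
    let sightings = ["Shivers", "Tweety", "Shivers", "Dinner"];
    println!("{} distinct parrots", count_unique(&sightings));

    // A BTreeMap keeps its keys sorted, so iteration order is predictable.
    let mut ages: BTreeMap<&str, u32> = BTreeMap::new();
    ages.insert("Tweety", 4);
    ages.insert("Dinner", 7);
    ages.insert("Shivers", 2);
    for (name, age) in &ages {
        println!("{name} is {age} years old");
    }
}
```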
Exercise: Write a small program which constructs a HashMap with pirate ship names as keys and their crew sizes as values. For example, "Black Pearl" could have 3 crew members. Iterate over the elements of the hashmap and print each ship with its crew size.
[1] You can also use raw pointers and unsafe Rust. We're not going to talk about that here, because you should almost always avoid it.
Ownership and Lifetimes
And now we come to one of the things that differentiates Rust from other systems languages. From other languages in general, really. And that is ownership and the borrow checker.
As we went over earlier, Rust does not have a garbage collector. Instead, the compiler is tracking when memory should be allocated and deallocated, and ensuring that your references remain valid.
How it does that is by keeping track of the lifetime of variables as well as their ownership. To unpack those, in short:
- A value has one owner at a time, and ownership is used for tracking when memory is valid and when it is dropped.
- The lifetime of a variable is the time during which references to it are valid.
These two concepts are highly related. The lifetime of a reference is linked to the lifetime of its owner.
To make it concrete, let's look at an example of a function which explains why we need lifetimes. This code example will not compile:
fn main() {
    { // create an outer scope
        // define a reference to a u32, but do not initialize it
        let outer_ref: &u32;

        { // create an inner scope
            // declare and initialize a u32
            let inner_val: u32 = 10;

            // try to assign a reference to the inner val to our ref
            outer_ref = &inner_val; // <-- this line will fail to compile
        } // <-- inner_val goes out of scope here

        // <-- but the compiler needs outer_ref to be valid here!
        println!("outer_ref value: {}", outer_ref);
    }
}
Here's another example, where we try to return a reference to a local variable inside a function.
fn naughty_function() -> &u32 {
    let x: u32 = 10;
    &x
}
Rust rightfully presents us with an error there if we try to compile it. Whoops!
This might be your first interaction with the borrow checker. It's a vital piece of Rust machinery which helps prevent major issues. Notably, you could get both of the previous examples to compile in languages like C or C++, leading to use-after-free errors. With Rust, you can't do that[1].
Lifetimes
Every reference is a borrow, and each borrow has a lifetime. That lifetime spans from when the variable is created to when it is destroyed.
What the borrow checker does is it ensures that each reference's lifetime is wholly contained by the borrowed value's lifetime.
Lifetimes can be explicitly given names. These are typically 'a, 'b, etc. but you can also use longer descriptive names.
fn example<'a>(x: &'a u32) {
    let y: &'a u32 = &x;
}
This example also introduces generics, which we will only use for lifetimes until we cover them in more depth. But basically, the <'a> is for generics, and here it's giving a generic lifetime. This says that we have some lifetime, 'a, and our parameter x is a reference of that lifetime. It doesn't say specifically how long that lifetime is, because we don't know anything about that lifetime until it's filled in as a parameter of the generic function.
The lifetime 'static means "referred to data will live for the duration of the program". This is often used for string constants:
fn main() {
    let msg: &'static str = "hello, world!";
}
Anywhere where you write a type annotation for a reference, you can also include an explicit lifetime. We've seen one example of this above. You'll see it often for structs, enums, and other data structures when they contain references. You will not see it often for functions, because of lifetime elision.
Lifetime elision is when we're allowed to omit the explicit lifetimes and just let the compiler take a guess (based on a few rules). The above example would be better written using implicit lifetimes:
fn example(x: &u32) {
    let y: &u32 = &x;
}
We won't unpack the lifetime elision rules here, but in general you can omit explicit lifetime names in most places. If you can't, the compiler will tell you, and you can try adding them!
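Here's a rough sketch of a case where the compiler does ask for names. The function longer and the struct Label are invented for this illustration: with two reference parameters and a reference return, the compiler can't guess which input the output borrows from, and a struct holding a reference always has to declare its lifetime.

```rust
// Two reference parameters and a reference return: elision can't decide
// which input the output borrows from, so we name the lifetime.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// A struct holding a reference must declare the lifetime of that reference.
struct Label<'a> {
    text: &'a str,
}

fn main() {
    let first = String::from("Blackbeard");
    let second = String::from("Polly");
    let winner = longer(&first, &second);
    let label = Label { text: winner };
    println!("longer name: {}", label.text);
}
```

If you delete the <'a> annotations from longer, the compiler error tells you it's missing a lifetime specifier, which is your cue to add them back.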
Ownership
Ownership is related to lifetimes, and we can see a few examples of it. Each value has an owner, and this ownership can be moved to another place. This happens if you pass something by value: the new place receives the value and promises that it will deallocate it when it needs to. But since the new place owns it, the old place is not allowed to use it anymore!
Here's one example that shows a value moving to a new owner, one that new Rust programmers often encounter:
fn main() {
    let xs = vec![1,2,3];

    for x in xs {
        println!("{x}");
    }

    println!("total len: {}", xs.len());
}
This looks totally reasonable, but there's a problem: when we iterated over xs, since we used the value of xs, we moved it, and we don't own it anymore! This example does not compile.
Instead, we need to use a reference to let the for loop borrow the Vec, and then we can use it in both places:
fn main() {
    let xs = vec![1,2,3];

    for x in &xs {
        println!("{x}");
    }

    println!("total len: {}", xs.len());
}
Now it works, woohoo!
That's the basics of ownership. Just remember that when you pass a value around, it moves the value to the other place.
Well, that's not true. It moves it if it can't copy it.
There are some variables, the simplest ones, which "are Copy", which means they implement the Copy trait. If something is Copy, then it will get copied instead of moved, and you can keep using it.
As an example, we can use a primitive u32 twice:
fn main() {
    let x: u32 = 10;
    println!("x is {x}");
    println!("x*2 is {}", x*2);
}
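For types that aren't Copy, you can ask for a copy explicitly with clone. This is a small sketch contrasting the two; the function total is invented for the example, and it takes its Vec by value on purpose so that calling it moves (or consumes) whatever you pass in.

```rust
// Hypothetical helper: takes ownership of the Vec it is given.
fn total(xs: Vec<i32>) -> i32 {
    xs.iter().sum()
}

fn main() {
    // u32 is Copy: `a` is still usable after being assigned to `b`.
    let a: u32 = 10;
    let b = a;
    println!("a = {a}, b = {b}");

    // Vec<i32> is not Copy: passing `xs` itself would move it.
    let xs = vec![1, 2, 3];
    // Cloning makes an independent copy that we can give away.
    let sum = total(xs.clone());
    // `xs` was not moved, so we can still use it here.
    println!("sum = {sum}, len = {}", xs.len());
}
```

Cloning a Vec allocates and copies every element, so borrowing with a reference is usually the cheaper fix when the callee only needs to read.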
Are you a little confused? If so, you're not alone! This isn't very clear, and the same syntax doing two different things implicitly can be tricky. But rest assured that this is understandable, and the compiler is here to help. If we go back to the first Ownership example and try to compile it, here's part of the error message:
3 | let xs = vec![1,2,3];
| -- move occurs because `xs` has type `Vec<i32>`, which does not implement the `Copy` trait
4 |
5 | for x in xs {
| -- `xs` moved due to this implicit call to `.into_iter()`
...
In here, it clearly tells us that the Vec was moved, why it was moved, and later on in the error message (which I cut off) it gives a suggestion for how to fix it (using a reference). We'll talk more about ownership in the next section on closures.
The borrow checker places a lot of restrictions on you. As a result, it's common to hear people refer to "fighting with the borrow checker." I like to think of it as getting a code review from an eager and extremely pedantic partner. It can be quite frustrating the first few times you run into it, but with time it becomes an invaluable part of your workflow. It's catching legitimate issues, and you should be scared working in languages without it!
Closures
Previously, we looked at named functions and we looked at ownership. In Rust, we also have closures, which give us the ability to make anonymous functions.
You can make a closure with explicit or inferred types:
fn main() {
    let y: u32 = 10;
    let annotated = |x: u32| -> u32 { x + y };
    let inferred = |x| x + y;

    println!("annotated: {}", annotated(32));
    println!("inferred: {}", inferred(32));
}
The basic syntax for a closure is to use pipes around the parameter list followed by an expression for the return value. Sometimes you'll see a no-argument closure, which looks like it's using the or-operator (||) but that's the empty parameter list here, and could be written with spaces for clarity (| |).
Closures can reference values from their outer scope, which is really handy. They can also capture the outer values and use them. This is handy for things like counters:
fn main() {
    let mut count = 0;

    let mut increment = || {
        count += 1;
        count
    };

    println!("count is {}", increment());
    println!("count is {}", increment());
    println!("count is {}", increment());
}
Note that the closure must be declared mut if it mutates a variable it captured. This is because what it captures is part of the closure, so if it's mutating it then it's mutating itself.
You can also return closures from functions! To do this, if you capture variables, you'll need to move the variables into the closure. You do this with the move keyword, which signals to the compiler that the closure should take ownership of the variables it captures, so that they live as long as the closure does.
Here are two examples of that in action. First, a closure which prints a message. This one has to explicitly annotate its lifetime, including on the return type.
fn print_msg<'a>(msg: &'a str) -> impl Fn() + 'a {
    let printer = move || {
        println!("{msg}");
    };
    printer
}
The way you would use it is by calling it to get a function, then calling that function.
fn main() {
    // this line creates a new function, f
    let f = print_msg("hello, world");
    // nothing has been printed yet

    // and this line invokes the function, which will print our message
    f();
}
And one which makes a counter.
fn make_counter() -> impl FnMut() -> u32 {
    let mut count = 0;
    let increment = move || {
        count += 1;
        count
    };
    increment
}
Invoking it is similar, but this time, it has to be mutable.
fn main() {
    let mut counter = make_counter();
    println!("count is {}", counter()); // prints 1
    println!("count is {}", counter()); // prints 2
    println!("count is {}", counter()); // prints 3
}
You'll notice the return types for these functions are different. What they return is an impl of a trait. We'll get to what traits are in a later section; for now, you can think of them like interfaces, so we know what we can do with the thing, but not its specific type. There are three traits for functions: Fn, FnMut, and FnOnce, which provide various restrictions on how the caller of the function can use it.
The other thing you'll notice is the impl keyword, which is new. This says we'll return something which implements this trait (like an interface in other languages), but we don't specify exactly what it is. This is how you return closures, generally, because each closure is its own type.
There are good docs on these traits, as usual. In short, the restrictions are:
- Fn can be called multiple times, and it doesn't modify its underlying state.
- FnMut can be called multiple times, but it may mutate itself when you do (so it needs a mutable reference to itself).
- FnOnce can be called once. It consumes itself in that call, and you can't use it a second time.
If you have an Fn, you can use it as FnMut or FnOnce. And you can use FnMut as FnOnce. But you can't go back up the chain!
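To see that chain in action, here's a sketch with one function per trait. The function names call_twice, call_twice_mut, and call_once are made up for this example; each one accepts a closure through impl in parameter position.

```rust
// Accepts any closure that can be called repeatedly without mutating state.
fn call_twice(f: impl Fn() -> u32) -> u32 {
    f() + f()
}

// Accepts closures that may mutate captured state, so `f` must be `mut`.
fn call_twice_mut(mut f: impl FnMut() -> u32) -> u32 {
    f() + f()
}

// Accepts any closure, but may only call it once.
fn call_once(f: impl FnOnce() -> String) -> String {
    f()
}

fn main() {
    let base: u32 = 5;
    // Only reads `base`, so it is an Fn (and therefore FnMut and FnOnce too).
    let read_only = || base + 1;
    println!("{}", call_twice(read_only));
    println!("{}", call_twice_mut(read_only));

    let mut count: u32 = 0;
    // Mutates `count`, so it is FnMut (and FnOnce), but not Fn.
    let counter = || {
        count += 1;
        count
    };
    // call_twice(counter) would not compile here.
    println!("{}", call_twice_mut(counter));

    let name = String::from("Polly");
    // Gives away the captured String when called, so it is only FnOnce.
    let into_name = move || name;
    println!("{}", call_once(into_name));
}
```

Taking the least restrictive trait you can get away with (FnOnce, then FnMut, then Fn) lets callers pass in the widest range of closures.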
Exercises:
- Write a closure which takes in two numbers and adds them together.
- Write a function which takes in an initial value and an increment and returns a closure which increments the value by that amount each time it's called.
Thanks for following along so far! You've gotten through what I think are the hardest parts of Rust. The rest should be easier, and you should be able to put this into practice. Take a breather, then move on to the next section.
Structs
Structuring your data is a key part of programming, and every language worth its salt gives you a way to do that. In object-oriented languages, this is usually by creating a class. In Rust, you structure your data with structs.
A struct is a named grouping of fields (data belonging to the struct), which also can have methods on it. Let's unpack that.
When you're creating a struct, first you have to give it a name:
struct PirateShip {
}
Then you can also put fields on it, of any type:
struct PirateShip {
    captain: String,
    crew: Vec<String>,
    treasure: f64,
}
And you can have methods on the struct by using an impl block. Those methods can take a reference to self (&self) if they are just reading fields, or they can use a mutable reference (&mut self) if they will be changing any data.
struct PirateShip {
    captain: String,
    crew: Vec<String>,
    treasure: f64,
}

impl PirateShip {
    pub fn count_treasure(&self) -> f64 {
        // some computations probably
        self.treasure
    }

    pub fn mutiny(&mut self) {
        if self.crew.len() > 0 {
            // replace the captain with one of the crew
            self.captain = self.crew.pop().unwrap();
        } else {
            println!("there's no crew to perform mutiny");
        }
    }
}
To create an instance of a struct, you give the name of the struct along with a value for each of the fields, specified by name (like treasure: 64.0). There is also some shorthand to use: if you have a variable in scope with the same name as one of the fields, you can specify that just by name. That's confusing without an example, so let's see it in action.
struct PirateShip {
    captain: String,
    crew: Vec<String>,
    treasure: f64,
}

fn main() {
    let blackbeard = "Blackbeard".to_owned();
    let crew = vec!["Scurvy".to_owned(), "Rat".to_owned(), "Polly".to_owned()];

    let ship = PirateShip {
        captain: blackbeard,
        crew,
        treasure: 64.0,
    };
}
In this example, we can see both forms of specifying fields. The captain and treasure are specified with the <name>: <value> form, while the crew is specified with the shorthand that means crew: crew.
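Once you have an instance, you call its methods with dot syntax. This sketch repeats the struct and impl block from above so that it runs on its own; the only new part is main.

```rust
struct PirateShip {
    captain: String,
    crew: Vec<String>,
    treasure: f64,
}

impl PirateShip {
    pub fn count_treasure(&self) -> f64 {
        self.treasure
    }

    pub fn mutiny(&mut self) {
        if self.crew.len() > 0 {
            // Vec::pop removes and returns the last crew member
            self.captain = self.crew.pop().unwrap();
        } else {
            println!("there's no crew to perform mutiny");
        }
    }
}

fn main() {
    // The binding must be `mut` because mutiny takes `&mut self`.
    let mut ship = PirateShip {
        captain: "Blackbeard".to_owned(),
        crew: vec!["Scurvy".to_owned(), "Rat".to_owned(), "Polly".to_owned()],
        treasure: 64.0,
    };

    println!("treasure: {}", ship.count_treasure());
    ship.mutiny();
    println!("new captain: {}, crew left: {}", ship.captain, ship.crew.len());
}
```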
Note that in this example, we used the method to_owned a few times. This takes a reference to a string (&str) and creates an owned string (String), so that we don't have to worry about lifetimes. The precise details of this aren't particularly relevant in this chapter, but it's a nice thing to keep in mind: if you want to avoid including lifetimes, you can use owned instances by cloning (or a method like to_owned).
There's more complexity with strings in Rust than in other languages due to references and lifetimes, but further treatment of them is beyond the scope of this course.
Exercises:
- Define a struct for a crew member with a name, age, and any other attributes you would like.
- Implement a few methods on this struct, such as one to say who it is.
Enums
Many languages let you create enumeration types. Typically, these are shorthand for constants, so you can have a value that you know comes from a small, fixed set of options. A common example would be days of the week.
In TypeScript, an enum for days of the week would look something like this[1]:
enum WeekDay {
    Monday = 0,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday
}
These are pretty straightforward in most languages, and useful but not earth-shattering.
Rust enums are far more powerful and are a key tool in structuring data and programs well.
You use them almost everywhere (indirectly, through Option and Result) for error handling and type safety.
What makes Rust enums more powerful is that they capture more than just a constant. Each variant of the enum can also have data.
If you're familiar with C, a Rust enum is equivalent to a tagged union. If you're not familiar with C, which is probably more likely if you're reading this, we'll unpack that and explain how enums are represented in memory. But first we'll see some examples of using them.
Defining and Using Enums
An enum has a name and it has variants. Each variant can either be a "unit" variant or it can have data associated with it. Additionally, there are two ways you can add data for a variant: positionally, or with named fields.
Let's define an enum with one of each of these types of variants to see them in action. This will just be a silly made-up example.
enum LanguageResource {
    // This variant has a field by position; in this case, it's probably the URL
    Website(String),

    // This variant has fields by name
    Book { title: String, pages: u64 },

    // This is a unit variant, with no data contained
    SelfTeaching,
}
To create an instance of an enum, you instantiate one of the variants. You refer to it using the enum name and the variant name, separated with ::.
Let's see examples of creating each of the previous variants, and then reading their fields back out. Fields of an enum variant can't be accessed with dot syntax the way struct fields can; you reach them with pattern matching:
enum LanguageResource {
    Website(String),
    Book { title: String, pages: u64 },
    SelfTeaching,
}

fn main() {
    let site = LanguageResource::Website("https://yet-another-rust-resource.pages.dev/".to_owned());
    let book = LanguageResource::Book { title: "The Rust Programming Language".to_owned(), pages: 300 };
    let independent = LanguageResource::SelfTeaching;

    // match on the variant to get at its fields; `_` covers the other variants
    match &site {
        LanguageResource::Website(url) => println!("the site is at {url}"),
        _ => {}
    }
    match &book {
        LanguageResource::Book { title, pages } => println!("the book {title} is {pages} pages long"),
        _ => {}
    }
}
When you have an instance of an enum, you can use it with pattern matching (to directly look at the fields and use them), or you can use methods on the enum.
One of the most common enums you use in Rust is Option, which is defined something like this:
// the `<T>` just says we take a generic type, so the Option can contain anything
// we want instead of having to define a new type by hand for each contained type.
enum MyOption<T> {
    Some(T),
    None,
}
Then to use it you can either use methods (in this case, defined by the standard library) or pattern matching. Here is an example using both ways to accomplish the same thing:
fn main() {
    let x = Some(3);

    if x.is_some() {
        println!("x = {}", x.unwrap());
    } else {
        println!("x is empty :(");
    }

    match x {
        Some(v) => println!("x = {v}"),
        None => println!("x is empty :("),
    }
}
And just like with structs, you can define your own methods on an enum you create. Let's wrap this up by implementing the is_some function for our own Option type.
enum MyOption<T> {
    Some(T),
    None,
}

impl<T> MyOption<T> {
    pub fn is_some(&self) -> bool {
        match self {
            MyOption::Some(_) => true,
            MyOption::None => false,
        }
    }
}
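The same approach works when the variants carry different kinds of data. As a sketch, here is a describe method (a name invented for this example) on the LanguageResource enum from earlier, which matches on every variant and pulls out its fields.

```rust
enum LanguageResource {
    Website(String),
    Book { title: String, pages: u64 },
    SelfTeaching,
}

impl LanguageResource {
    // Hypothetical method: builds a one-line description of the resource.
    pub fn describe(&self) -> String {
        match self {
            // Positional fields are bound by position...
            LanguageResource::Website(url) => format!("website at {url}"),
            // ...and named fields are bound by name.
            LanguageResource::Book { title, pages } => format!("{title} ({pages} pages)"),
            LanguageResource::SelfTeaching => "teaching yourself".to_owned(),
        }
    }
}

fn main() {
    let resources = vec![
        LanguageResource::Website("https://yet-another-rust-resource.pages.dev/".to_owned()),
        LanguageResource::Book { title: "The Rust Programming Language".to_owned(), pages: 300 },
        LanguageResource::SelfTeaching,
    ];
    for resource in &resources {
        println!("{}", resource.describe());
    }
}
```

Because the match has an arm for every variant, the compiler will flag this method if you add a new variant later and forget to handle it.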
Enum Memory Representation
As mentioned above, enums are akin to tagged unions in C. The memory layout of these is pretty straightforward. Each enum has a "tag", which represents which variant it is holding, and then it has enough bytes to hold the largest variant.
The upshot of this is that no matter what variant your instance is, it will be as large as the largest variant, plus a bit extra for the tag.
How large is the tag? It can be as small as one byte, if you have at most 256 variants (if you have more, you need a larger tag) and if the alignment of the type is 1. With a larger alignment, padding can make the tag cost more than one byte. We won't go into how to determine what the alignment of your type is but it's a great topic to explore on your own.
Why not just use a struct?
What's the reason we'd like to have an enum with data, instead of just a struct?
What's special about doing Result as an enum?
There are a few main reasons you want to do it as an enum, and why you cannot always use a struct:
- It allows a more compact memory representation. A struct requires the memory of all its fields, and if you have fields that are present in some arms of the enum but not others, you'd pay for those always. In contrast, an enum only requires as much memory as its largest variant, plus a byte (or word) for its tag. So this can save a serious amount of memory! And for things like Option of a reference, it can even be free, because the compiler can do tricks to reuse memory in the pointer itself.
- It allows you to do pattern matching, which you cannot do on a struct with private members.
- You can enforce things at compile time, like which fields are or are not set, rather than enforcing populating fields through logic at run time.
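You can check the memory claims yourself with std::mem::size_of. This is a small sketch; the Reading enum is invented for the example, and the exact size the compiler picks for it isn't guaranteed, so the program prints it rather than assuming a number.

```rust
use std::mem::size_of;

// Hypothetical enum: the largest variant holds a u64 (8 bytes).
#[allow(dead_code)]
enum Reading {
    Empty,
    Small(u8),
    Large(u64),
}

fn main() {
    println!("u64:          {} bytes", size_of::<u64>());
    // Room for the largest variant, plus the tag and any padding.
    println!("Reading:      {} bytes", size_of::<Reading>());

    println!("&u32:         {} bytes", size_of::<&u32>());
    // A reference is never null, so None can be stored as the null pointer
    // and no separate tag is needed.
    println!("Option<&u32>: {} bytes", size_of::<Option<&u32>>());
}
```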
Exercises:
- Write an enum which represents days of the week.
- Look up the standard library Result type and write a basic implementation of it. Refer to our Option implementation for a similar example.
[1] Weeks start on Mondays, not Sundays. Fight me.
More Control Flow
There are a couple of control flow constructs we skipped over earlier, because they rely on some concepts we didn't have.
Namely, you can combine pattern matching with if and while.
This comes in really handy when you want to do something only for a particular variant.
Here's a contrived example of retrieving something from a map, and doing something different in each case.
fn main() {
    // let's just pretend we got this from a map, okay?
    let value = Some(42);

    if let Some(inner) = value {
        println!("inner was {inner}");
    } else {
        println!("this is the failure case!");
    }
}
The general setup is that you put a pattern (usually a variant) on the left hand side inside an if let, and if the match can be satisfied (if it works), then it'll fill in the placeholder variables. If that match doesn't succeed, then you fall through to the else case.
The same thing works in loops. One common use is with iterators, but you can use it with any enum. (I see this used a lot less than if let.)
fn main() {
    let values = vec![1, 2, 3, 4, 5];
    let mut iter = values.iter();

    while let Some(v) = iter.next() {
        println!("v = {v}");
    }
}
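Another place while let shows up is draining a Vec that you're using as a stack, since Vec::pop returns an Option. A minimal sketch, with the helper drain_stack invented for the example:

```rust
// Hypothetical helper: pops everything off a stack and returns the items
// in the order they were processed.
fn drain_stack(mut stack: Vec<u32>) -> Vec<u32> {
    let mut processed = Vec::new();
    // Vec::pop returns Some(last) until the Vec is empty, then None,
    // which ends the loop.
    while let Some(top) = stack.pop() {
        processed.push(top);
    }
    processed
}

fn main() {
    let order = drain_stack(vec![1, 2, 3]);
    println!("processed in order: {:?}", order);
}
```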
Modules
Modules allow you to structure code in a maintainable way, so you don't end up with everything in one giant file. They also let you hide details so that consumers of your code don't rely on internal implementation details.
Creating modules
You declare modules with the mod keyword, and then you can include them from one of three places:
- Directly with a block delimited by curly braces
- In a file at the same level with the name of the module
- Inside a folder with the module name, in the file mod.rs
Modules can also be either public (pub mod) or private (mod). If they're public, anyone consuming your crate can use the module and its pub members. Otherwise, private modules are accessible only from the module that declares them and that module's descendants.
Let's say you're creating a module called math. Then inside your current module (starting from main.rs or lib.rs, unless you're nested inside another module already) you would create this module. Let's assume you're working in main.rs for the sake of these examples, but it's the same either way.
// in main.rs
pub mod math {
    // look, I didn't say this was a *useful* module
    pub fn add(x: u32, y: u32) -> u32 {
        x + y
    }
}
If you want to break it out into other files, you can do that. You still declare the module in main.rs or your other file:
pub mod math;
And then you put the contents in either math.rs or math/mod.rs, depending on your preference. Usually it's clearer to use the math.rs style over math/mod.rs, because otherwise your editor can be filled with a lot of mod.rs files and it's not clear which one you're working on!
pub fn add(x: u32, y: u32) -> u32 {
    x + y
}
And that's all there is to creating modules!
Using modules
To use what's declared in another module, you use the use keyword. This brings things into scope. (If you want to re-export them, you can pub use them.)
use std::collections::HashMap;
If you refer to super::thing, that gets thing from the parent module. If you refer to crate::thing, that gets thing from the root of the crate you're in.
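Here's a sketch that puts use, super, and crate together in one file. The stats submodule and its functions are invented for the example, and it assumes this file is the root of the crate (like main.rs), since that's where crate:: paths start from.

```rust
// Hypothetical layout: a `math` module with a nested `stats` submodule.
pub mod math {
    pub fn add(x: u32, y: u32) -> u32 {
        x + y
    }

    pub mod stats {
        // `super` is the parent module, which here is `math`.
        use super::add;

        pub fn sum3(a: u32, b: u32, c: u32) -> u32 {
            add(add(a, b), c)
        }

        pub fn double(x: u32) -> u32 {
            // `crate` starts from the root of the crate, wherever we are.
            crate::math::add(x, x)
        }
    }
}

// Bring the nested functions into scope so we can call them by short name.
use math::stats::{double, sum3};

fn main() {
    println!("sum3 = {}", sum3(1, 2, 3));
    println!("double = {}", double(4));
}
```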
Testing
Everyone's favorite topic! Writing tests!
It's actually pretty exciting with Rust, since you have good tools to make testing pretty easy. It's all built-in out of the box.
Unit tests
The typical way to write tests with Rust is to create a child module called tests, import from the parent, and test things. These are written in a way that they're only compiled if a flag is enabled for tests, so they're not included in your release builds. Handy!
Let's see an example.
We'll go back to the same example we had of a super useful addition function. Then we can write a test to make sure it adds things correctly.
pub fn plus(x: i32, y: i32) -> i32 {
    x + y
}

// this cfg says "only compile this if the test compile option is on"
#[cfg(test)]
// by convention we call this "tests" but it doesn't need to be
mod tests {
    // usually you avoid importing all, but it's idiomatic for tests
    use super::*;

    // this is how we mark this function as a test
    #[test]
    fn adds_small_numbers() {
        let x = 10;
        let y = 20;
        let expected = 30;
        assert_eq!(plus(x, y), expected, "should add up correctly");
    }

    // this test WILL FAIL because the sum overflows an i32: test builds
    // panic on overflow (release builds would roll over to a negative number)
    #[test]
    fn adds_big_numbers() {
        let x = 2_000_000_000;
        let y = 2_000_000_000;
        assert!(plus(x, y) > 0, "result should be positive");
    }
}
Then to run the tests, you run cargo test and it should find and execute all your tests (or your compile errors).
Integration tests
Just as with unit tests, you can write integration tests. The main difference is that these tests can only consume the public API that you provide.
Any file you put in the tests/ directory will be treated as an integration test. Let's say you have your plus function in a crate, then you could open tests/my_test.rs and write:
use my_library::plus;

#[test]
fn test_addition() {
    assert_eq!(plus(10, 20), 30);
}
Doc tests
Rust also gives you facilities to write tests directly in your documentation.
You can put a docstring on a function, module, etc. with /// (as opposed to the usual // to start a comment). If you put code blocks in it (with the markdown style ```), they will be compiled and run when you run cargo test.
(This also has the neat side effect of making your documentation code examples automatically break the build if they are out of date!)
Here's an example for, again, our plus function.
/// Adds together two numbers, and doesn't handle rollover.
///
/// ```
/// use playground::plus;
/// assert_eq!(30, plus(10, 20));
/// ```
pub fn plus(x: i32, y: i32) -> i32 {
    x + y
}
Linting and Formatting
There are two canonical tools for formatting and linting Rust code. The first is rustfmt, the formatter. The second is clippy, the linter.
To run the formatter, just run cargo fmt. That's it.
To run the linter, run cargo clippy after installing it one time (with rustup component add clippy).
Clippy is very helpful at suggesting idiomatic code, and it improves over time (read: it will sometimes come back and find things in code that you thought was fine last month).
It's invaluable as both a Rust learner and as an experienced Rust programmer, and it's usually good practice to treat Clippy warnings as errors (with specific exceptions annotated in the code if merited).
There are various options to configure these tools. I suggest using them vanilla for a while, if not forever. It keeps things consistent across the Rust ecosystem, and the suggestions are usually there for a reason. But if you do want to configure things, you can dive in!
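As a sketch of what "warnings as errors with annotated exceptions" can look like: the function sum_first is invented for the example, and needless_range_loop is one of Clippy's lints that would normally suggest rewriting this loop with an iterator.

```rust
// Run the linter with every warning promoted to an error:
//     cargo clippy -- -D warnings

// A specific, deliberate exception, annotated right where it applies.
#[allow(clippy::needless_range_loop)]
fn sum_first(values: &[u32], n: usize) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += values[i];
    }
    total
}

fn main() {
    println!("{}", sum_first(&[1, 2, 3, 4], 2));
}
```

Keeping the allow on the smallest item that needs it means the lint stays active everywhere else.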
Dependency Management
While Rust's ecosystem of libraries is relatively young, it is strong in a variety of domains. The main place you find a crate is on, fittingly, crates.io.
After you search for a crate on there, and you want to add it, you can run a simple command. For example, to add the popular serialization/deserialization crate serde, you would run
cargo add serde
This would add the latest version.
If you want to add it manually, you can specify the version in Cargo.toml with a line like serde = "1.0.154".
The version you specify is matched with semver. To avoid just repeating the docs, I refer you to the official docs if you want to do something more advanced or want to understand how it's handling version matching.
Traits
Traits are much like interfaces in other languages. They give a way of defining shared behavior and a way of using said shared behavior.
At its base level, a trait is a collection of methods. The type that the methods belong to is unknown, because that's part of the implementation: the trait gets implemented for a given type.
The methods can either be abstract (must be implemented in order to implement the trait) or they can be implemented, using only the type information you have. Namely, these would use other methods on that trait.
Defining a trait
Since a trait is a collection of methods, we give those methods as the definition of the trait. For example, we can define a trait for a key-value store.
We know that any key-value store will be able to set a value and get it again. Those are the base primitives, but what if we want another operation, like get-and-set? As long as it's expressible in terms of just other methods on the KeyValueStore, we can do an implementation of that on the trait itself!
trait KeyValueStore {
    // These ones have to be implemented by structs which impl KeyValueStore
    fn set(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn lock(&mut self, key: &str);
    fn unlock(&mut self, key: &str);

    // This one is defined for all KeyValueStores
    fn get_and_set(&mut self, key: &str, value: Vec<u8>) -> Option<Vec<u8>> {
        self.lock(key);
        let old_value = self.get(key);
        self.set(key, value);
        self.unlock(key);

        old_value
    }
}
One thing to note is that the methods on a trait are all public! You don't have to put a visibility modifier, pub, on them because they're by default visible.
Implementing a trait on a type
Now let's look at how to implement a trait on a type.
After you've defined a trait, and a struct, you impl the trait. For example, let's say we make a trait called Printable with a print method. We will also create a struct to implement the trait.
trait Printable {
    fn print(&self);
}

struct Ship {
    name: String
}

impl Printable for Ship {
    fn print(&self) {
        println!("<Ship name=\"{}\">", self.name);
    }
}
There is one other way you can impl a trait, and it's pretty incredible. You can tell the compiler to derive the implementation. This is only doable for traits that implement some auto derive functionality.
A lot of the built-in traits, like Debug and PartialEq, can be derived. Here's how you derive those for a simple struct containing a string and a number:
#[derive(Debug, PartialEq)]
struct PirateShip {
    name: String,
    masts: f32, // float since you can have part of a mast
}
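To see what those derives buy you, here's a short sketch that uses both: Debug gives you the {:?} format specifier, and PartialEq gives you == and !=. The ship values are made up for the example.

```rust
#[derive(Debug, PartialEq)]
struct PirateShip {
    name: String,
    masts: f32,
}

fn main() {
    let a = PirateShip { name: "Black Pearl".to_owned(), masts: 3.0 };
    let b = PirateShip { name: "Black Pearl".to_owned(), masts: 3.0 };
    let c = PirateShip { name: "Dinghy".to_owned(), masts: 0.5 };

    // Debug: print the whole struct without writing any formatting code.
    println!("{:?}", a);

    // PartialEq: the derived impl compares the structs field by field.
    println!("a == b? {}", a == b);
    println!("a == c? {}", a == c);
}
```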
Note: You can impl any trait on a type that you define. And you can impl a trait that you define on any type. But you cannot impl a trait that you didn't define on a type that you didn't define. This prevents having multiple impls of the same trait on the same type if different crates both impl it.
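Here is a small sketch of the two allowed combinations from the note, plus the disallowed one left as a comment. The Shout trait and Ship struct are hypothetical names invented for this example; String and Display come from the standard library.

```rust
use std::fmt;

// A trait we define...
trait Shout {
    fn shout(&self) -> String;
}

// ...can be implemented on a type we did not define (String is from std).
impl Shout for String {
    fn shout(&self) -> String {
        self.to_uppercase() + "!"
    }
}

// A type we define...
struct Ship {
    name: String,
}

// ...can implement a trait we did not define (Display is from std).
impl fmt::Display for Ship {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "the good ship {}", self.name)
    }
}

// Not allowed: neither the trait nor the type is ours.
// impl fmt::Display for Vec<u8> { /* ... */ }

fn main() {
    println!("{}", "ahoy".to_string().shout());
    println!("{}", Ship { name: "Revenge".to_string() });
}
```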
Using traits
There are a few ways you can use a trait. You can use methods from the trait on a struct that impls the trait. Or you can accept the trait as a parameter to a function, or return it as the return type. (You can also use them as part of generics; we'll cover that with generics.)
Calling methods from traits
To call a method that's on a trait, you have to use that trait to make it visible.
This is mostly so that you don't have collisions from traits with the same methods on the same type.
As an example, there's a trait in the standard library called Read.
It lets you read bytes from a source.
Lots of types impl Read, and one that does is &[u8], but you can't call its methods unless you use the trait.
This example won't compile:
fn main() {
    let s: Vec<u8> = "sad example".into();
    let mut buf: [u8; 32] = [0; 32];
    (&s[..]).read(&mut buf);
}
But by bringing std::io::Read into scope, it does compile!
fn main() {
    use std::io::Read;
    let s: Vec<u8> = "sad example".into();
    let mut buf: [u8; 32] = [0; 32];
    (&s[..]).read(&mut buf);
}
If you run into a situation where a method seems like it should be on a type but your tooling or the compiler says it isn't, check whether you're missing a use somewhere.
Traits as parameters
If we know that a function only needs something that implements a trait, we can pass it in with impl.
Let's say we're writing a function which needs a key-value store, but we don't care which one.
Then we can write this function to accept any key-value store:
fn save_record(kv: &impl KeyValueStore) {
    // use the key value store somehow
}
Code that passes in a type that doesn't impl KeyValueStore will not compile, so inside the function you can rely on the trait's methods being there. At compile time, the type is resolved to be the concrete type.
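A small self-contained sketch of the same idea follows. The Named trait, the Ship and Port structs, and the greet function are all hypothetical, chosen so the example can run on its own.

```rust
// Hypothetical trait for the example.
trait Named {
    fn name(&self) -> String;
}

struct Ship {
    name: String,
}

struct Port {
    city: String,
}

impl Named for Ship {
    fn name(&self) -> String {
        format!("the ship {}", self.name)
    }
}

impl Named for Port {
    fn name(&self) -> String {
        format!("the port of {}", self.city)
    }
}

// Accepts a reference to any type that impls Named.
fn greet(thing: &impl Named) -> String {
    format!("Ahoy, {}!", thing.name())
}

fn main() {
    let ship = Ship { name: "Revenge".to_string() };
    let port = Port { city: "Nassau".to_string() };
    println!("{}", greet(&ship));
    println!("{}", greet(&port));
    // greet(&42); // would not compile: i32 does not impl Named
}
```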
Traits as return types
You can also use the impl keyword for return types to specify that you're returning a value that impls the trait. For example, you could define something that returns a KeyValueStore:
fn create_in_memory_kvstore(config: Config) -> impl KeyValueStore {
    // create and return the KeyValueStore
    todo!()
}
The thing to note is that the function can only return one type.
If you have multiple implementations of the trait, you cannot have one branch which returns type A and one which returns type B.
At compile time, the compiler needs to be able to swap out impl KeyValueStore for the one specific type which you're going to return.
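The one-type restriction can be sketched like this. The Describe trait and the Sloop and Galleon types are hypothetical; the function that mixes return types is left commented out because it would be rejected by the compiler.

```rust
// Hypothetical trait and types for the example.
trait Describe {
    fn describe(&self) -> String;
}

struct Sloop;
struct Galleon;

impl Describe for Sloop {
    fn describe(&self) -> String {
        "a nimble sloop".to_string()
    }
}

impl Describe for Galleon {
    fn describe(&self) -> String {
        "a heavy galleon".to_string()
    }
}

// Fine: every path returns the same concrete type (Sloop).
fn default_vessel() -> impl Describe {
    Sloop
}

// Would not compile: the two branches return different concrete types.
// fn pick_vessel(big: bool) -> impl Describe {
//     if big { Galleon } else { Sloop }
// }

fn main() {
    println!("default: {}", default_vessel().describe());
    println!("other: {}", Galleon.describe());
}
```

Callers of default_vessel only know that they got something that impls Describe, even though the compiler knows it is always a Sloop.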
Exercises:
- Go back to our pirate ship struct. Based on this, define a trait for any sort of Vessel, which has one method: mutiny.
- Implement Vessel for PirateShip.
- Create a new kind of vessel (NavalShip?) and implement Vessel for it.
Generics
Generics are an essential feature of Rust. They let you write reusable code that works with a variety of types without duplicating code or sacrificing performance.
What are Generics?
Generics are a way to write code that accepts one or more type parameters, which can then be used within the code as actual types. This allows the code to work with different types, while still being type-safe and efficient.
In Rust, generics are similar to templates in C++ or generics in Java, TypeScript, or other languages.
Using Generics
To illustrate how generics work in Rust, let's start with a simple example. Suppose you have a function that takes two arguments and returns the larger of the two. Without generics, you'd need to write a separate function for each type you want to support, e.g., one for integers and one for floating-point numbers.
However, using generics, you can write a single function that works with any type that implements the PartialOrd trait. Here's an example:
fn max<T: PartialOrd>(x: T, y: T) -> T {
    if x > y {
        x
    } else {
        y
    }
}

fn main() {
    let a = 5;
    let b = 10;
    let c = 3.14;
    let d = 6.28;
    println!("Larger of {} and {}: {}", a, b, max(a, b));
    println!("Larger of {} and {}: {}", c, d, max(c, d));
}
In the max function definition, we introduce a generic type parameter T using angle brackets (<>). We also specify the trait bound PartialOrd for T using the colon syntax (:).
This constraint ensures that the max function only works with types that implement the PartialOrd trait, which is necessary for comparing values using the > operator.
Tip: It can be hard to know what trait bound you need, especially when new to Rust. One of the things I like to do is leave it out entirely, then let the compiler tell me which trait bound it thinks is missing. This works a surprising amount of the time, especially for simple cases.
Now, the max function works with both integers and floating-point numbers. As an added bonus, you can call it with any two values of the same type that implement the PartialOrd trait. So it will even work for strings, or types that you don't even know about! Pretty neat and pretty powerful.
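As a quick sketch of that claim, the same max function is called below with string slices, owned Strings, and a user-defined type. The Doubloons type is hypothetical; it gets its PartialOrd impl from a derive.

```rust
fn max<T: PartialOrd>(x: T, y: T) -> T {
    if x > y {
        x
    } else {
        y
    }
}

// Hypothetical type; deriving PartialOrd compares the inner u32.
#[derive(Debug, PartialEq, PartialOrd)]
struct Doubloons(u32);

fn main() {
    // Strings compare lexicographically.
    println!("{}", max("anchor", "barnacle"));
    println!("{}", max(String::from("port"), String::from("starboard")));
    // A type the author of max never knew about.
    println!("{:?}", max(Doubloons(3), Doubloons(7)));
}
```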
Generic Structs
Structs can take generic type parameters too:
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer_point = Point { x: 5, y: 10 };
    let float_point = Point { x: 3.14, y: 6.28 };
    println!("Integer point: ({}, {})", integer_point.x, integer_point.y);
    println!("Floating point: ({}, {})", float_point.x, float_point.y);
}
In the Point struct definition, we introduce a generic type parameter T using angle brackets (<>). This allows us to use the same Point struct with different types for the x and y coordinates.
Note that in this example, both coordinates must have the same type. If you want to allow different types for x and y, you can introduce multiple generic type parameters:
struct Point2<X, Y> {
    x: X,
    y: Y,
}

fn main() {
    let mixed_point = Point2 { x: 5, y: 6.28 };
    println!("Mixed point: ({}, {})", mixed_point.x, mixed_point.y);
}
Here we have left the types unbounded, but you would likely want some trait bounds for these generic parameters. PartialOrd, PartialEq, and Debug are common choices.
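One way this could look is to leave the struct itself unbounded and put the bounds on an impl block, so the methods only exist for types that satisfy them. This is a sketch; the larger_coordinate and describe methods are hypothetical names.

```rust
use std::fmt::Debug;

#[derive(Debug, PartialEq)]
struct Point<T> {
    x: T,
    y: T,
}

// These methods exist only when T can be compared and debug-printed.
impl<T: PartialOrd + Debug> Point<T> {
    // Needs PartialOrd for the `>` comparison.
    fn larger_coordinate(&self) -> &T {
        if self.x > self.y {
            &self.x
        } else {
            &self.y
        }
    }

    // Needs Debug for the {:?} formatting.
    fn describe(&self) -> String {
        format!("({:?}, {:?})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 5, y: 10 };
    println!("{} has larger coordinate {}", p.describe(), p.larger_coordinate());

    let q = Point { x: 3.5, y: 1.25 };
    println!("{} has larger coordinate {}", q.describe(), q.larger_coordinate());
}
```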
Generic Enums and Traits
You can use generics with enums and traits in a similar way as with structs and functions.
Here's an example of a generic Result enum that can be used to represent the success or failure of a computation (this mirrors how the standard library type is defined):
enum Result<T, E> {
    Ok(T),
    Err(E),
}

fn divide(x: f64, y: f64) -> Result<f64, String> {
    if y == 0.0 {
        Result::Err("Cannot divide by zero.".to_string())
    } else {
        Result::Ok(x / y)
    }
}

fn main() {
    let result = divide(5.0, 2.0);
    match result {
        Result::Ok(value) => println!("Result: {}", value),
        Result::Err(error) => println!("Error: {}", error),
    }

    let result = divide(5.0, 0.0);
    match result {
        Result::Ok(value) => println!("Result: {}", value),
        Result::Err(error) => println!("Error: {}", error),
    }
}
In this example, we define a generic Result enum with two type parameters: T for the success value and E for the error value.
The Result enum has two variants: Ok(T) for success and Err(E) for failure.
We then define a divide function that returns a Result<f64, String>.
The function takes two f64 arguments and either returns the result of the division or an error message if the divisor is zero.
In the main function, we call the divide function and pattern match on the returned Result to handle both the success and error cases.
This Result enum is a simplified version of the Result type that is part of the Rust standard library, which is used extensively for error handling.
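The section heading also mentions traits, so here is a minimal sketch of a generic trait. The Container trait and the CargoHold struct are hypothetical names made up for this illustration.

```rust
// Hypothetical generic trait: T is the type of item being stored.
trait Container<T> {
    fn put(&mut self, item: T);
    fn take(&mut self) -> Option<T>;
}

// Hypothetical generic struct that stores its items in a Vec.
struct CargoHold<T> {
    items: Vec<T>,
}

// One impl covers CargoHold<T> for every T.
impl<T> Container<T> for CargoHold<T> {
    fn put(&mut self, item: T) {
        self.items.push(item);
    }

    // Takes the most recently stored item, if there is one.
    fn take(&mut self) -> Option<T> {
        self.items.pop()
    }
}

fn main() {
    let mut hold = CargoHold { items: Vec::new() };
    hold.put("rum");
    hold.put("sugar");
    println!("{:?}", hold.take());
    println!("{:?}", hold.take());
    println!("{:?}", hold.take());
}
```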
Exercise: Write a generic divide function. (Hint: look up the std::ops::Div trait.)
Error Handling
Error handling is an essential aspect of any programming language, and Rust is no exception.
Rust provides robust error handling mechanisms, like the Result type and the ? operator, which allow you to deal with errors in a clean and idiomatic way.
And you can also bail out for unrecoverable errors with panics.
The Result Type
Rust has a built-in Result enum for handling errors in a type-safe manner.
The Result type is defined as follows:
enum Result<T, E> {
    Ok(T),
    Err(E),
}
The Result enum has two variants:
- Ok, which represents a successful operation and contains the result value
- Err, which represents a failed operation and contains the error value
The Result type encourages you to handle errors explicitly and provides a clear separation between the success and error cases.
This is one of the things that makes Rust so powerful for writing robust code: you can't forget to handle errors, because the type system requires that you address them.
You can use pattern matching to handle the different variants of the Result type:
fn main() {
    let result = some_function();
    match result {
        Ok(value) => println!("Success: {}", value),
        Err(error) => println!("Error: {}", error),
    }
}

fn some_function() -> Result<String, String> {
    // ...
    todo!()
}
You can also use unwrap or expect if you know the returned value is an Ok.
But you have to be very sure: if you unwrap an Err, your program will panic!
This is okay in instances like missing configuration where you don't want the program to run at all if it isn't there, but you have to be deliberate.
As a general rule, don't use unwrap or expect unless you want the program to crash if it fails.
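A small sketch of that trade-off: the first function uses expect and will panic on bad input, while the second falls back to a default with unwrap_or, a standard Result method that never panics. The function names and the 8080 default are hypothetical.

```rust
// Panics with the given message if `raw` is not a valid u16.
// Reasonable only when the program should not run without a valid port.
fn parse_port(raw: &str) -> u16 {
    raw.parse().expect("port must be a number between 0 and 65535")
}

// Never panics: falls back to a default value on any parse error.
fn parse_port_or_default(raw: &str) -> u16 {
    raw.parse().unwrap_or(8080)
}

fn main() {
    println!("{}", parse_port("443"));
    println!("{}", parse_port_or_default("not a port"));
    // parse_port("not a port"); // would panic with the expect message
}
```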
The ? Operator
While pattern matching on Result values is powerful, it can become verbose when dealing with multiple operations that may return errors.
To simplify error handling in these situations, Rust provides the ? operator.
It's only usable inside a function that itself returns a Result (or an Option), and in those cases is tremendously helpful.
The ? operator, when placed after an expression that returns a Result, automatically handles the error case.
If the expression returns an Err, the function immediately returns the error value.
If the expression returns an Ok, the ? operator unwraps the Ok value and continues executing the function.
Here's an example of using the ? operator:
fn main() {
    match read_config() {
        Ok(config) => println!("Config: {:?}", config),
        Err(error) => println!("Error: {}", error),
    }
}

fn read_config() -> Result<String, String> {
    let file = read_file("config.txt")?;
    let parsed = parse_config(&file)?;
    Ok(parsed)
}

fn read_file(path: &str) -> Result<String, String> {
    // ...
    todo!()
}

fn parse_config(file_contents: &str) -> Result<String, String> {
    // ...
    todo!()
}
In this example, the read_config function calls two other functions that return Result values.
Instead of using pattern matching to handle the errors, the ? operator is used to simplify the code.
It elevates the "happy path" so that you can read through that directly, and you know that errors will trickle up.
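To see what ? saves you, here is a runnable sketch that writes the same function twice: once with ? and once with the explicit match it roughly corresponds to. The function names are hypothetical; str::parse and ParseIntError are from the standard library.

```rust
use std::num::ParseIntError;

// With ?: each parse either yields the number or returns the error early.
fn sum_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let a: i32 = a.parse()?;
    let b: i32 = b.parse()?;
    Ok(a + b)
}

// Roughly the same logic written out by hand with match.
fn sum_with_match(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let a: i32 = match a.parse() {
        Ok(value) => value,
        Err(error) => return Err(error),
    };
    let b: i32 = match b.parse() {
        Ok(value) => value,
        Err(error) => return Err(error),
    };
    Ok(a + b)
}

fn main() {
    println!("{:?}", sum_with_question_mark("2", "3"));
    println!("{:?}", sum_with_question_mark("2", "three"));
    println!("{:?}", sum_with_match("2", "3"));
}
```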
Panics
Rust has another mechanism for error handling called panics. A panic is a runtime error that stops normal execution and, unless it is caught, terminates the program (or, in a multi-threaded program, the thread that panicked). Panics are reserved for exceptional situations where it's impossible or undesirable to continue executing the program, such as an unrecoverable error or a broken invariant.
You can cause a panic explicitly using the panic! macro:
fn main() {
    let index = 10;
    if index > 5 {
        panic!("Index is out of bounds!");
    }
    // ...
}
When a panic occurs, Rust unwinds the stack, running destructors for all objects in scope and then terminating the program. Alternatively, Rust can be configured to abort the program directly, without unwinding the stack.
Panics should be used sparingly and only in exceptional circumstances.
In most cases, you should prefer using the Result type for error handling, as it promotes explicit and robust error handling.
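As a sketch of that preference, the two hypothetical functions below do the same lookup. The first panics on a bad index, while the second reports the problem through a Result so the caller can decide what to do.

```rust
// Panics if the index is out of bounds; the caller cannot recover.
fn nth_panicking(items: &[u32], index: usize) -> u32 {
    if index >= items.len() {
        panic!("index {} is out of bounds", index);
    }
    items[index]
}

// Returns an Err instead, leaving the decision to the caller.
fn nth_checked(items: &[u32], index: usize) -> Result<u32, String> {
    items
        .get(index)
        .copied()
        .ok_or_else(|| format!("index {} is out of bounds (len {})", index, items.len()))
}

fn main() {
    let items = [10, 20, 30];
    println!("{}", nth_panicking(&items, 1));
    println!("{:?}", nth_checked(&items, 1));
    println!("{:?}", nth_checked(&items, 10));
}
```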
Async/await
Asynchronous programming is an essential technique for writing efficient and responsive code, especially when dealing with I/O-bound tasks or tasks that may take a long time to complete.
Rust provides an asynchronous programming model through the async and await keywords[1], which allow you to write non-blocking code cleanly.
The async and await pattern in Rust is similar to that in other languages like Python and TypeScript.
In Python, you use the async def syntax to define an asynchronous function, and the await keyword to call an asynchronous function:
import asyncio

async def fetch_data():
    ...

async def main():
    data = await fetch_data()
    print(data)

asyncio.run(main())
In TypeScript, you use the async keyword before the function definition, and the await keyword to call an asynchronous function:
async function fetchData(): Promise<string> {
    // ...
}

async function main() {
    const data = await fetchData();
    console.log(data);
}

main();
Rust's async and await model works similarly to Python and TypeScript.
Async and await
In Rust, the async keyword is used to define asynchronous functions.
An asynchronous function is a function that can be paused and resumed later, allowing other tasks to run concurrently[2].
Asynchronous functions in Rust always return a Future.
A Future is a trait that represents a value that may not be available yet but will be at some point in the future.
Here's an example of an asynchronous function in Rust:
async fn fetch_data() -> Result<String, String> {
    // ...
    todo!()
}
If you invoke this function, you get back a Future.
But the return type is Result, not Future.
That's because Rust hides that detail from us a little bit.
Instead of making us put Future everywhere, we have a little friendlier syntax.
Just remember that when you invoke the function, it doesn't execute anything until you await it.
The await keyword is used to pause the execution of an asynchronous function and wait for a Future to resolve.
When a Future is awaited, the current task is suspended, allowing other tasks (such as the one you awaited) to run concurrently.
Once the awaited Future resolves, the execution of the suspended task resumes.
Here's an example of how to use the await keyword to call an asynchronous function:
// A plain async fn main() won't compile on its own: something has to run the future.
// This assumes the Tokio runtime, which provides the #[tokio::main] attribute.
#[tokio::main]
async fn main() {
    match fetch_data().await {
        Ok(data) => println!("Data: {}", data),
        Err(error) => println!("Error: {}", error),
    }
}
In this example, the fetch_data asynchronous function is called with the .await syntax.
The main function is also defined as asynchronous using the async keyword, and the attribute above it hands the function to a runtime to execute.
Note that you can only use the await keyword inside an asynchronous function or async block.
If you try to use await in a non-async function, you'll get a compile-time error.
Async runtimes
To use async code, you'll need some sort of async runtime! The runtime is responsible for scheduling tasks onto threads. The typical one people go for these days is Tokio, but there are other options. And if you're ever feeling ambitious, you can write your own!
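To give a feel for what a runtime does, here is a deliberately tiny sketch of one, using only the standard library. It is nowhere near a real runtime like Tokio: the block_on function and NoopWaker type are hypothetical names, it runs a single future, and it spins in a loop instead of sleeping until woken. It also shows that calling an async function does not run its body until the future is polled.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing. Real runtimes use the waker to learn
// when a suspended task is ready to be polled again.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Toy executor: poll one future in a loop until it finishes.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut context) {
            return output;
        }
        // A real runtime would sleep here until woken instead of spinning.
        std::thread::yield_now();
    }
}

// Records that its body started running, then returns some data.
async fn fetch_data(started: &Cell<bool>) -> Result<String, String> {
    started.set(true);
    Ok("treasure map".to_string())
}

fn main() {
    let started = Cell::new(false);

    // Calling the function only creates the future; the body has not run.
    let future = fetch_data(&started);
    println!("started before block_on: {}", started.get());

    // Driving the future to completion is what runs the body.
    let result = block_on(future);
    println!("started after block_on: {}, result: {:?}", started.get(), result);
}
```

In practice you would reach for an existing runtime; the point of the sketch is only that something has to poll futures for async code to make progress.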
[1] await looks like a method or field when you use it, but it's a keyword and is used in syntax as such. It's a keyword the same way that match is, but it often feels like you're calling a method, so it can be kind of confusing.
[2] Even with a single-threaded async runtime, tasks are concurrent, although they may not run in parallel.
Other Resources
This course has just scratched the surface with Rust. You should be able to pair, with some difficulty, with someone who is more experienced. But this is not the end of the road. By the nature of this course and the size of Rust, we've skipped a lot and skimmed over what we did cover.
Now you must go forth and keep learning and writing Rust code.
Here are a few paths forward from here.
- Books!
- Rust by Example
- Rustlings, a set of small exercises
And nothing replaces getting your hands on the keyboard and writing code.
Credits
These resources were part of the inspiration for this course:
This content was authored by Nicole, aka ntietz.
Acknowledgements
Huge thanks to these people who have provided invaluable feedback and more throughout this process:
- Cole Brossart
- Jake Weiner
- Dan Reich
- Miccah Castorina