Memory Management

One of the defining characteristics of Rust is that it gives you full control over memory: it doesn't have a runtime garbage collector, and it doesn't do reference counting unless you explicitly ask it to. When things get allocated and deallocated is entirely in your control (but constrained by some safety rules; for example, you can't deallocate something and then use it again).

This means to really get Rust, we have to also really get how memory management works. There's only a little bit here that's specific to Rust; it's more of a primer on memory management for those coming from higher level languages who may not have seen this before, or in a while.

Stack and Heap

The first thing to know is that there are two places memory gets allocated: on the stack and on the heap. If you're familiar with the data structures, you're halfway there. The stack is a stack like you're familiar with: things get pushed on and popped off (but also referenced from the middle of the stack). On the other hand, the heap is just a big open field of memory (not a heap data structure) where we can allocate things.

The stack is controlled by the program's execution, comprising call frames (pushed on for function calls) and local variables of those functions. If something is on the stack, it's going to only live as long as the function it's inside. That's why you can't return references to things that are local variables, because they'll get deallocated when the function ends and its call frame is popped off the stack.
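Here is a minimal sketch of that rule. The function names are made up for illustration. The first function returns its local by value, which is fine. The commented-out one tries to return a reference to a local, which the compiler rejects.

```rust
// Hypothetical helper: `doubled` lives in this function's call frame.
fn double(n: u32) -> u32 {
    let doubled = n * 2;
    // Returning the value copies it out to the caller before the frame is popped.
    doubled
}

// This version would not compile: it tries to hand back a reference to a
// local variable whose call frame is about to be popped.
//
// fn double_ref(n: u32) -> &u32 {
//     let doubled = n * 2;
//     &doubled
// }

fn main() {
    let result = double(21);
    println!("result = {}", result); // prints "result = 42"
}
```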

The heap is different. It's this big wide open space where values can live as long as they'd like. Well, until someone comes along and deallocates them. When a program allocates memory on the heap, that memory has to be deliberately deallocated later. The heap gives us basically unbounded memory (up to what's made available by your operating system) that can live for a long time. The penalty is that the memory has to be managed (we have to deallocate it at some point; it doesn't happen automatically when call frames get popped), and it is often less efficient than memory on the stack [1].

Managing Memory

Different programming languages have different ways of handling heap-allocated memory.

The base level where the language gives you no help is manual memory management. This is what you get in C and C++, where you have to explicitly allocate and deallocate memory. In C, these functions are malloc to allocate memory, free to release it, and realloc to resize some allocation.

With manual memory management, you have all the power but you also have all the problems. This technique is where we get vulnerabilities like buffer overflows leading to remote code execution, or issues with use-after-free which can wreak all sorts of havoc. With great power comes great responsibility.
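To give a feel for what manual management involves, here is a rough sketch using Rust's low-level std::alloc functions as stand-ins for C's malloc and free. This is not how you'd normally write Rust, and the helper name is made up for illustration. Notice that nothing stops you from forgetting the dealloc call (a leak) or from using the pointer after it (a use-after-free).

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

// Hypothetical helper: allocate room for one u32, write to it, read it back,
// and free it, all by hand.
fn manual_round_trip(value: u32) -> u32 {
    // Describes the size and alignment of the allocation we want.
    let layout = Layout::new::<u32>();
    unsafe {
        // Like malloc: ask the allocator for memory.
        let ptr = alloc(layout) as *mut u32;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        ptr.write(value);
        let read_back = ptr.read();
        // Like free: we are responsible for giving the memory back,
        // and for never touching `ptr` again afterwards.
        dealloc(ptr as *mut u8, layout);
        read_back
    }
}

fn main() {
    println!("read back {}", manual_round_trip(7)); // prints "read back 7"
}
```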

The highest amount of help is in languages with automatic memory management. This is what you see in most programming languages. There are a few forms; the two most common are (tracing) garbage collection (as you see in Java and Go, for example) and reference counting (as you see in Python and Swift, for example). These techniques largely handle memory management for you so you don't have to think about it, and you avoid some major problems with manual memory management. Notably, you cannot have use-after-free issues, and it's usually harder to have buffer overflows lead to arbitrary memory access, since the runtime checks for that.
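Rust's standard library has an opt-in reference-counted type, std::rc::Rc, which is handy for seeing the reference-counting idea in miniature. This is a sketch with a made-up helper name. Languages like Python and Swift do this bookkeeping behind the scenes, whereas here it is explicit.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the reference count observed at three points.
fn observe_counts() -> (usize, usize, usize) {
    // One handle to the shared string: the count starts at 1.
    let first = Rc::new(String::from("shared"));
    let one = Rc::strong_count(&first);

    // A second handle to the same allocation bumps the count to 2.
    let second = Rc::clone(&first);
    let two = Rc::strong_count(&first);

    // Dropping a handle lowers the count again.
    drop(second);
    let after_drop = Rc::strong_count(&first);

    (one, two, after_drop)
    // `first` goes away here; the count reaches 0 and the string is freed.
}

fn main() {
    let (one, two, after_drop) = observe_counts();
    println!("{} {} {}", one, two, after_drop); // prints "1 2 1"
}
```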

Rust is a special language. It doesn't employ automatic memory management, but it also avoids the pain and problems that manual memory management brings. You get all the power of manual memory management, and usually don't have to worry about it at all. It does this by keeping track of "ownership" of memory (we'll talk about this in the next section) and the lifetimes of memory, and deallocating memory automatically when it won't be used anymore.
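A small sketch can make "deallocating automatically" more concrete. The Noisy type and drop_order function below are made up for illustration, and the sketch leans on a few things not covered yet (the Drop trait, RefCell, and a lifetime annotation). The point is only the order of events: each value is cleaned up at the closing brace of the scope that holds it.

```rust
use std::cell::RefCell;

// Hypothetical type that records a message when it is cleaned up.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    // Rust calls this automatically when a Noisy value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Hypothetical helper: returns the sequence of events in order.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            log.borrow_mut().push(String::from("end of inner scope"));
        } // _inner is cleaned up here
        log.borrow_mut().push(String::from("end of outer scope"));
    } // _outer is cleaned up here
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

No call to a free-like function appears anywhere. The cleanup points are decided at compile time from the scopes.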

Rust also places some restrictions on what you can do to make this feasible.

References and Pointers

When we talk about memory management, we're talking about how much memory we have, where it is, and (to an extent) what it represents. This leads us to a type we haven't talked about yet: references. (And their cousins, pointers.)

References are found in most languages, but they're not usually called out as such. It's a concept hidden below the surface [2].

A reference is just a way of referring to a variable (or memory) at another location. So if you have a variable x, then a reference to x would be written &x, and is a way of letting you access x. To access the underlying value, you dereference the reference: if ref_x holds &x, you write *ref_x.

This is a little abstract, so let's see it in action.

#![allow(unused)]
fn main() {
// First create a variable, x
let x: u32 = 10;

// Then create a reference to it:
let ref_x: &u32 = &x;

// Then dereference it to print out the value it refers to!
println!("x = {}", *ref_x);
}

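Rust will also often see through references for you: method calls dereference automatically, and println! can format a reference the same way as the value it refers to. So usually you can omit the * dereference operator: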

#![allow(unused)]
fn main() {
// First create a variable, x
let x: u32 = 10;

// Then create a reference to it:
let ref_x: &u32 = &x;

// Print out the value without explicitly dereferencing it
println!("x = {}", ref_x);
}

Pointers have the same underlying representation as references: both hold an address to another place in memory. But while a reference refers to a valid variable at another location, a pointer is just an address in memory. This means that unlike with references, Rust doesn't make guarantees about what a pointer points to. You can create a pointer in safe code, but you have to use unsafe Rust to dereference it and access memory at that location.

You generally don't need to use pointers in Rust unless you're doing something very specific, and reading or writing through them entails using unsafe Rust. It's good to know of them, but we won't go into the details of how to use them here.
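Without going into detail, here is a minimal sketch of the difference, with a made-up helper name. Turning a reference into a pointer is ordinary safe code. Reading through the pointer is the part that needs an unsafe block.

```rust
// Hypothetical helper: reads a u32 through a raw pointer.
fn read_through_pointer(value: &u32) -> u32 {
    // Creating a raw pointer from a reference is safe.
    let ptr: *const u32 = value;

    // Dereferencing needs `unsafe`: the compiler can't check that a raw
    // pointer is valid. It is valid here because it was just derived from
    // a live reference.
    unsafe { *ptr }
}

fn main() {
    let x: u32 = 10;
    println!("x = {}", read_through_pointer(&x)); // prints "x = 10"
}
```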

Allocating on the Stack or on the Heap

To allocate on the stack, you don't do anything special. Any local variable is allocated on the stack, since it's part of the call frame for the function you're in.

To allocate on the heap, you have to use a type that causes heap allocation. We'll talk about that in the next section!
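As a small preview, and only a sketch: Box is one standard library type that puts its contents on the heap. The helper name below is made up for illustration.

```rust
// Hypothetical helper: builds one value on the stack and one on the heap.
fn stack_and_heap() -> (u32, u32) {
    // A plain local variable: stored in this function's call frame.
    let on_stack: u32 = 10;

    // Box::new allocates space on the heap and moves the value into it.
    // The Box itself (an address) is a local variable on the stack.
    let on_heap: Box<u32> = Box::new(20);

    // Dereference the Box to copy the heap value out.
    (on_stack, *on_heap)
} // The heap allocation behind `on_heap` is freed here, automatically.

fn main() {
    let (a, b) = stack_and_heap();
    println!("stack: {}, heap: {}", a, b); // prints "stack: 10, heap: 20"
}
```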

Exercises:

  1. In your own words, what is the difference between allocating on the heap and on the stack?
  2. In your own words, what is the difference between a reference and a pointer?

These exercises are particularly challenging due to how abstract the concepts are when you haven't put them in practice. Be kind to yourself!


[1] The reason is a bit out of scope, but it largely comes down to memory locality and how things get retrieved. The variables on the stack will likely already be in the CPU cache when the related code is executing, but heap memory often requires a slow fetch from main memory, since the CPU can't predict well what you're going to need.

[2] This is one of the things I greatly appreciate about Rust: it doesn't hide these fundamental concepts, so while there's more work to understand things up front before you use the language, you can rest confident that you better understand what's happening, and there's less unexpected behavior.