Inlining - The Rust Performance Book (2024)

Entry to and exit from hot, uninlined functions often accounts for a non-trivial fraction of execution time. Inlining these functions can provide small but easy speed wins.

There are four ways to control inlining of a Rust function: no attribute at all, or one of three inline attributes.

None. The compiler will decide itself if the function should be inlined. This will depend on factors such as the optimization level, the size of the function, whether the function is generic, and if the inlining is across a crate boundary.
#[inline]. This suggests that the function should be inlined.
#[inline(always)]. This strongly suggests that the function should be inlined.
#[inline(never)]. This strongly suggests that the function should not be inlined.

Inline attributes do not guarantee that a function is inlined or not inlined, but in practice #[inline(always)] will cause inlining in all but the most exceptional cases.
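As a minimal sketch of how the four options look in source, the program below defines the same trivial addition four times. The function names (`add_default`, `add_hint`, `add_always`, `add_never`) are hypothetical and chosen only for illustration.

```rust
// No attribute: the compiler decides on its own whether to inline.
fn add_default(x: u32, y: u32) -> u32 {
    x + y
}

// Suggests that the function should be inlined.
#[inline]
fn add_hint(x: u32, y: u32) -> u32 {
    x + y
}

// Strongly suggests that the function should be inlined.
#[inline(always)]
fn add_always(x: u32, y: u32) -> u32 {
    x + y
}

// Strongly suggests that the function should not be inlined.
#[inline(never)]
fn add_never(x: u32, y: u32) -> u32 {
    x + y
}

fn main() {
    // The attributes influence code generation only; the results are identical.
    let results = [
        add_default(2, 3),
        add_hint(2, 3),
        add_always(2, 3),
        add_never(2, 3),
    ];
    assert!(results.iter().all(|&r| r == 5));
    println!("{:?}", results);
}
```

The attributes change how the code is generated, not what it computes, so all four variants return the same value.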

Simple Cases

The best candidates for inlining are (a) functions that are very small, or (b) functions that have a single call site. The compiler will often inline these functions itself even without an inline attribute. But the compiler cannot always make the best choices, so attributes are sometimes needed.

Example 1, Example 2, Example 3, Example 4, Example 5.

Cachegrind is a good profiler for determining if a function is inlined. When looking at Cachegrind’s output, you can tell that a function has been inlined if (and only if) its first and last lines are not marked with event counts. For example:

          .  #[inline(always)]
          .  fn inlined(x: u32, y: u32) -> u32 {
    700,000      eprintln!("inlined: {} + {}", x, y);
    200,000      x + y
          .  }
          .
          .  #[inline(never)]
    400,000  fn not_inlined(x: u32, y: u32) -> u32 {
    700,000      eprintln!("not_inlined: {} + {}", x, y);
    200,000      x + y
    200,000  }
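To try this yourself, the two functions can be wrapped in a small program like the sketch below. The `main` function and the `CALLS` constant are assumptions added for illustration; the source does not say how the functions were driven or how many calls produced the counts above. It also assumes an optimized build with debug info enabled, so that the profile can be mapped back to source lines, for instance by running the binary under `valgrind --tool=cachegrind` and annotating the result with `cg_annotate`.

```rust
#[inline(always)]
fn inlined(x: u32, y: u32) -> u32 {
    eprintln!("inlined: {} + {}", x, y);
    x + y
}

#[inline(never)]
fn not_inlined(x: u32, y: u32) -> u32 {
    eprintln!("not_inlined: {} + {}", x, y);
    x + y
}

fn main() {
    // Hypothetical iteration count; pick whatever gives a readable profile.
    const CALLS: u32 = 1_000;

    let mut total: u64 = 0;
    for i in 0..CALLS {
        total += u64::from(inlined(i, 1));
        total += u64::from(not_inlined(i, 1));
    }

    // Use the accumulated result so the additions are not discarded.
    println!("total = {}", total);
}
```

Exact event counts will differ between machines and compiler versions. What matters is whether the `fn` line and the closing brace of each function carry counts.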

You should measure again after adding inline attributes, because the effects can be unpredictable. Sometimes it has no effect because a nearby function that was previously inlined no longer is. Sometimes it slows the code down. Inlining can also affect compile times, especially cross-crate inlining, which involves duplicating internal representations of the functions.

Harder Cases

Sometimes you have a function that is large and has multiple call sites, but only one call site is hot. You would like to inline the hot call site for speed, but not inline the cold call sites to avoid unnecessary code bloat. The way to handle this is to split the function into always-inlined and never-inlined variants, with the latter calling the former.

For example, this function:

    fn one() {}
    fn two() {}
    fn three() {}

    fn my_function() {
        one();
        two();
        three();
    }

Would become these two functions:

    fn one() {}
    fn two() {}
    fn three() {}

    // Use this at the hot call site.
    #[inline(always)]
    fn inlined_my_function() {
        one();
        two();
        three();
    }

    // Use this at the cold call sites.
    #[inline(never)]
    fn uninlined_my_function() {
        inlined_my_function();
    }

Example 1, Example 2.
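The definitions above do not show the call sites, so here is a rough sketch of how the split might be used. Everything in it is hypothetical: `inlined_checksum` stands in for the large function, `process_records` for the hot caller, and `report_error` for a cold caller.

```rust
// Hypothetical stand-in for the large function. Use this at the hot call site.
#[inline(always)]
fn inlined_checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

// Use this at the cold call sites; the body is emitted once, here.
#[inline(never)]
fn uninlined_checksum(data: &[u8]) -> u32 {
    inlined_checksum(data)
}

// Hot path: runs once per record, so it calls the always-inlined variant.
fn process_records(records: &[Vec<u8>]) -> u32 {
    let mut total = 0u32;
    for record in records {
        total = total.wrapping_add(inlined_checksum(record));
    }
    total
}

// Cold path: runs rarely, so it calls the never-inlined variant to avoid bloat.
fn report_error(message: &str) -> u32 {
    uninlined_checksum(message.as_bytes())
}

fn main() {
    let records = vec![vec![1u8, 2, 3], vec![4, 5, 6]];
    let hot = process_records(&records);
    let cold = report_error("bad input");

    // Both variants compute the same value; only the code generation differs.
    assert_eq!(inlined_checksum(b"abc"), uninlined_checksum(b"abc"));
    println!("hot = {}, cold = {}", hot, cold);
}
```

The cold wrapper adds one extra function, and in exchange the large body is duplicated only where the speed matters. As with any inlining change, whether this helps should be confirmed by measurement.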