2022-02-11
When trying to program asynchronous code in Rust for the first time, you are bound to stumble across a great deal of resources that are less than helpful. Examples that, while working, don’t really tell you why this is needed (as the code could have been written without async as well) and how it works. As someone rather new to the Rust & Async world, I’ll try to provide explanations that would’ve helped me.
I assume, though, that you already know that you need async; not necessarily that you know why you need it.
While much of the Rust documentation is severely beginner-unfriendly, the Rust Async Book starts off pretty well by giving an overview of the bits and pieces in the Rust async world. So well, in fact, that I’m gonna quote them verbatim:
While asynchronous programming is supported by Rust itself, most async applications depend on functionality provided by community crates. As such, you need to rely on a mixture of language features and library support:
- The most fundamental traits, types and functions, such as the Future trait are provided by the standard library.
- The async/await syntax is supported directly by the Rust compiler.
- Many utility types, macros and functions are provided by the futures crate. They can be used in any async Rust application.
- Execution of async code, IO and task spawning are provided by “async runtimes”, such as Tokio and async-std. Most async applications, and some async crates, depend on a specific runtime. See “The Async Ecosystem” section for more details.
Some language features you may be used to from synchronous Rust are not yet available in async Rust. Notably, Rust does not let you declare async functions in traits. Instead, you need to use workarounds to achieve the same result, which can be more verbose.
In other words: Some of the things required for writing async code are provided by the Rust compiler, i.e., built into the language. That is, the keywords async and await are part of the core language and don’t require importing anything, not even from the stdlib. They are part of the language, just like the fn keyword, or if and else. We will figure out what they do soon.
Further, some of the more fundamental parts in expressing asynchronous types are part of the standard library. They work closely in tandem with the core language specification, i.e., the keywords.
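For reference, the central piece the standard library provides is the Future trait itself; its definition in std::future looks (essentially) like this:

use std::pin::Pin;
use std::task::{Context, Poll};

pub trait Future {
    type Output;

    // Asked repeatedly by an executor: either the value is ready,
    // or the future is still pending and should be polled again later.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

We will meet poll() again further down, when looking at how an executor actually drives a Future to completion.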
Less necessary, but still provided by the Rust developers, is an additional futures crate that eases dealing with the Future trait and others provided by the standard library.
However, as we will see, Async only lets us specify that we want to “somehow” execute code asynchronously. The way asynchronous tasks are scheduled (executed) is determined by “async runtimes”, which are provided by third parties.
async/await Keywords

The Rust Book chapter on keywords tells us that

- async – return a Future instead of blocking the current thread
- await – suspend execution until the result of a Future is ready

Indeed, if we modify:
fn foo() -> usize {
42
}
fn main() {
println!("foo: {}", foo())
}
to make foo async by adding the async keyword before fn foo(), then we get a bunch of errors:
error[E0277]: `impl Future<Output = usize>` doesn't implement `std::fmt::Display`
--> src/main.rs:6:25
|
6 | println!("foo: {}", foo())
| ^^^^^ `impl Future<Output = usize>` cannot be formatted with the default formatter
|
= help: the trait `std::fmt::Display` is not implemented for `impl Future<Output = usize>`
= note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
= note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info)
For more information about this error, try `rustc --explain E0277`.
So the async keyword indeed simply replaces our -> usize by something like impl Future<Output = usize>. And it replaces the return value 42 by some specific instance of a Future which, when executed, would return 42.
In fact, we can remove the async keyword from the function signature again and instead write the following (which yields the same error as above, of course):
use std::future::Future;

fn foo() -> impl Future<Output=usize> {
    async { 42 }
}

fn main() {
    println!("foo: {}", foo())
}
In order to actually execute the Future to retrieve the result, we need to use the futures helper crate (don’t forget to add futures to the Cargo.toml):
use futures::executor::block_on;
async fn foo() -> usize {
42
}
// Or:
//fn foo() -> impl Future<Output=usize> {
// async { 42 }
//}
fn main() {
println!("foo: {}", block_on(foo()));
}
This means foo doesn’t actually return 42 anymore, but something like a function that can be called/waited-for/blocked-on to run the computation and retrieve the result. This does sound a bit like a function pointer, but it’s more powerful!
use futures::executor::block_on;
async fn bar(a: usize, b: usize) -> usize {
a*b + 42
}
fn main() {
let bar1 = bar(1,2);
let bar2 = bar(3,4);
println!("bar2: {}", block_on(bar2));
println!("bar1: {}", block_on(bar1));
}
While a function pointer can be passed around and called by providing arguments, a Future already has these arguments provided! At the point where the code is actually run (in the block_on()), the arguments are already fixed! So, while we call bar() with arguments 1 and 2 first, the Future bar2 that was created by calling bar() with 3 and 4 will actually be executed before bar1.
We can do some more fun things by passing these futures around between functions as objects (this requires an import of the Future trait):
use futures::executor::block_on;
use std::future::Future;
async fn bar(a: usize, b: usize) -> usize {
a*b + 42
}
fn baz(f: impl Future<Output=usize>) {
println!("f: {}", block_on(f));
}
fn main() {
let bar1 = bar(1,2);
let bar2 = bar(3,4);
baz(bar2);
baz(bar1);
}
This is essentially the same as the previous code and produces analogous output.
Finally, we can also implement another function that has a different signature than bar() but also produces a usize… and pass it to baz()!
use futures::executor::block_on;
use std::future::Future;
async fn bar(a: usize, b: usize) -> usize {
a*b + 42
}
async fn meep(buf: Vec<u8>) -> usize {
buf.len().try_into().unwrap()
}
fn baz(f: impl Future<Output=usize>) {
println!("f: {}", block_on(f));
}
fn main() {
let bar1 = bar(1,2);
let bar2 = bar(3,4);
let buf = (0..42).collect();
let meow = meep(buf);
baz(bar2);
baz(bar1);
baz(meow);
}
await?

We’ve ignored that there’s this await keyword in the Rust core language while silently introducing not only the standard library Future trait but also the futures crate. So what does .await enable us to do? Basically, we can write code that consumes the result of an asynchronous function (such as the usize in this case) without actually executing the code there and then:
use futures::executor::block_on;
async fn bar(a: usize, b: usize) -> usize {
a*b + 42
}
async fn meep(buf: Vec<u8>) -> usize {
buf.len().try_into().unwrap()
}
async fn complex_stuff() -> usize {
let a = bar(1,2);
let b = bar(3,4);
let buf = (0..42).collect();
let meow = meep(buf);
let m = meow.await;
a.await + b.await + m
}
fn main() {
println!("result: {}", block_on(complex_stuff()));
}
Basically, we can compose asynchronous functions and use their results; this composition creates a new asynchronous function itself.
If we observe the above code in more detail, we notice that the type of m is a usize (just like a.await and b.await). Our .await expression thus allows us to write code that assumes that m is already calculated, even though the execution has not actually started when complex_stuff() is called, as the complex_stuff function is, in itself, asynchronous and only returns a Future.
That is: .await lets us use asynchronous code just as if it were synchronous. And the async annotation similarly helps us write the asynchronous primitives in the first place.
There’s one caveat though: main() cannot be asynchronous (that would be really weird: a completely asynchronous “program”?). So if we have any asynchronous code in our code base and use it somewhere, every calling function would, at first, be asynchronous in itself, leading all the way up to main. We have already hinted at how to solve this problem: the futures crate gives us a rather primitive block_on() executor that simply runs our huge constructed futures-to-be-executed-later tree in the current thread, waiting/blocking until it’s done.
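A minimal sketch of this “async all the way up” effect (the helper names leaf() and middle() are made up for illustration): every caller that wants to .await must itself be async, until a blocking executor cuts the chain in main().

use futures::executor::block_on;

async fn leaf() -> usize {
    42
}

async fn middle() -> usize {
    // To use .await here, middle() must itself be async...
    leaf().await + 1
}

fn main() {
    // ...and main(), which cannot be async, has to use block_on() instead.
    println!("result: {}", block_on(middle()));
}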
futures’ Blocking Executor

Indeed, block_on can be somewhat thought of as:
pub fn block_on<F: Future>(f: F) -> F::Output {
    loop {
        match f.poll() {
            Poll::Pending => continue,
            Poll::Ready(r) => break r,
        }
    }
}
Although, in reality, things are a bit more complicated. block_on() first pins the future on the stack and then calls the internal run_executor() function found here. Understanding this code isn’t necessary to understanding async though :-)
Wrapping up and looking again at the longer example above, the call to complex_stuff() doesn’t really take any time at all, nor do the calls to bar() and meep(), since they all only provide you with an opaque handle, the Future, that must be explicitly .poll()ed until the result is ready.

While simply calling block_on() in main() works… this isn’t really the asynchronous programming we want: simply “queuing” all actions and then running them in a blocking way until they are done. Ideally, we want some kind of multithreading, with a scheduler that dispatches Futures that are currently able to run, with the Futures being able to notify the system when they are currently blocked and execution should be suspended, etc. (effectively cooperative multitasking).
The job of an async runtime is basically to provide a more sophisticated executor.
The Tokio runtime is probably the most well-known Rust async runtime. A Hello World in Tokio looks like this:
#[tokio::main]
async fn main() {
println!("Hello World");
}
Wait… didn’t we say main() isn’t allowed to be async? Indeed, tokio::main is a macro that allows us to treat main() as if it were async while, strictly speaking, it is still synchronous. The Tokio guide even tells us so:
For example, the following:

#[tokio::main]
async fn main() {
    println!("hello");
}

gets transformed into:

fn main() {
    let mut rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(async {
        println!("hello");
    })
}
So the Tokio runtime simply provides us… with a different way to block_on that, unlike futures::executor::block_on, doesn’t simply run everything in the local thread :-)
Besides this, Tokio also provides async equivalents to standard library functions, as well as helpers such as task::spawn(), which simply starts executing the given task (which may even be a Future.await) “in the background” and returns a JoinHandle with which you can refer to said task later on, e.g., to check whether it’s still running or not. This is… quite similar to Futures in itself; however, a Future, once created, is not run until it is .awaited, whereas a spawn()ed task is always run, no matter whether we actually wait for or poll the JoinHandle.
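A minimal sketch of what that looks like (assuming the tokio crate with its full feature set in Cargo.toml; the numbers are made up): the spawned task starts running right away, and awaiting the JoinHandle later retrieves its result.

use tokio::task;

#[tokio::main]
async fn main() {
    // The task starts executing "in the background" as soon as it is spawned.
    let handle = task::spawn(async {
        1 + 2
    });

    // Some other work could happen here while the task is already running.

    // Awaiting the JoinHandle yields the task's result
    // (wrapped in a Result, since the task might have panicked).
    let result = handle.await.unwrap();
    println!("spawned task returned: {}", result);
}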
The canonical code to use the async-std crate, which provides an alternative asynchronous runtime, is:

use async_std::task;

fn main() {
    task::block_on(async {
        println!("hello");
    })
}
Which looks eerily similar to what Tokio does when applying the tokio::main macro :-)
From a user perspective, both crates are simply two approaches to the same problem. async-std even has task::spawn() as well, with pretty much identical semantics. This shouldn’t be surprising, since they are both async versions of std::thread::spawn.
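A minimal sketch of the async-std variant (assuming the async-std crate in Cargo.toml; again the numbers are made up): as with Tokio, the task starts running once spawned, and its JoinHandle can be awaited for the result.

use async_std::task;

fn main() {
    task::block_on(async {
        // Starts running once spawned, just like with Tokio.
        let handle = task::spawn(async { 1 + 2 });

        // async-std's JoinHandle yields the result directly (no Result wrapper).
        println!("spawned task returned: {}", handle.await);
    })
}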
Asynchronous programming in Rust can be thought of as simply stating the intent that I will, at some point, want to execute a function f with some fixed arguments a1,...,aN and will then do something with the eventual result r, without actually needing to compute this function right here and now.
The scheduling of “what to compute when” is mostly delegated to the runtime, as the programmer only states dependencies (such as: meep(buf) must have run before complex_stuff() can finish, since the latter consumes the former’s result).