Functional, Development

Introduction to Rust

Why has Rust been voted the most beloved programming language – seven years in a row, ever since it got released? What’s the hype about? Read along to get a glimpse of Rust, without having to read boring documentation. We’ll introduce Rust, bit-by-bit, by writing a simple CLI program that parses a list of HTTP requests and executes them.

15 min read · By Sivert Johansen, Isak Singh · December 1, 2022

This article is for developers who are interested in learning Rust or would like to see what it has to offer. As a motivating example, we’ll write a simple CLI program that takes an input file containing a list of HTTP requests a user wants to send, parses them, and then runs the requests.

But first, what is Rust and why should you use it?

What is Rust?

Released in 2015, Rust is a general-purpose programming language initially developed by Mozilla in the early 2010s. It’s a strongly and statically typed language designed for “performance, reliability and productivity”. Rust uses an ownership model to manage its memory, in contrast to garbage-collected languages like C# and Java, and manually managed languages like C and Zig. This allows the compiler to keep track of how long values should stay allocated and ensures that you don’t mistakenly misuse or mutate values when you’re not supposed to. Its main goal is to enable you to write robust software, free of the memory-safety bugs that cause security vulnerabilities, like buffer overflows and use-after-free. Together with disallowing arbitrary pointer accesses in safe code, the ownership model also lets Rust rule out data races at compile time, which makes it safer to parallelize tasks.

Want to try it yourself? The recommended way to install Rust and the relevant tools is through its version manager, rustup, found at rustup.rs.

Why Rust?

Rust has seen an increase in use and popularity in recent years. It has won “Most loved programming language” in the Stack Overflow Developer Survey seven years in a row – once a year since its release! The language has won by a mile and lies considerably ahead of the other languages in the survey.

Recently, several big players in the tech world have noticed the benefits of Rust. For instance, Microsoft Azure CTO Mark Russinovich tweeted that new projects that would otherwise be written in C/C++ should instead be written in Rust, and the NSA has urged people to use memory-safe languages, one of which is Rust.

The main reason for this is that memory-related bugs such as buffer overflows are still serious security issues, as many areas require maximized performance and use languages such as C or C++, which aren't memory safe. Also, just recently the widely used OpenSSL library announced a critical vulnerability caused by a buffer overflow. These types of errors are the ones Rust is specifically designed to prevent, effectively eliminating most of them if the code compiles.

Motivating Example

To showcase some features of the language we will create a CLI program that takes an input file containing descriptions for HTTP requests and executes them. To start, the requests will be very limited and will have the form:

GET nrk.no
POST example.com { "id": "12345"}

That is, either GET or POST, a URL, and a potential body in the POST case.

To start off, we set up our project using Rust's built-in package manager, Cargo:

cargo new http-sender

This will create the following file structure:

http-sender
├── Cargo.toml
└── src
    └── main.rs

1 directory, 2 files

Cargo.toml contains the configuration and metadata of the project, such as its authors and version number, and it also specifies dependencies and compiler flags. The main.rs file is the main entry point of our program.

Downloading dependencies and running code made with Cargo can be done with the run command:

cargo run
  Compiling http-sender v0.1.0 (/Documents/http-sender)
    Finished dev [unoptimized + debuginfo] target(s) in 0.89s
     Running `target/debug/http-sender`
Hello, world!

cargo new generated an implementation of "Hello, world!" for us, so we can skip right past writing "Hello, world!" in Rust. Pretty neat!

Now, to finally start writing our CLI program, we'll begin by defining a data structure to encapsulate a request. We could do all parsing and executing inline, but that would quickly become a mess. All our HTTP requests contain a URL and a method, and in the case of POST, a body. We could define our structure like this:

struct HttpRequest {
    method: String,
    url: String,
    body: String,
}

This has a few obvious drawbacks. For one, it allows us to set illegal and nonsensical request methods such as "CRASH" or "BURN". It also allows GET requests with bodies.

This problem can be solved by making types that only allow valid states. We are essentially making Rust's strong type system guarantee the correctness for us. Error handling then happens up front, when we parse the file, which keeps the important part of the code clearer.

To design our structure better, we will make use of Rust's enums (sometimes called sum types or algebraic data types):

enum Method {
    Get,
    Post { body: String },
}

This creates a new type Method which is either the value Get, or the value Post with a field named body carrying a string. This way, we prohibit GET requests with bodies and other invalid methods just by using this type, and the intent is clearer too.

Using our newly created type, we can define the improved HttpRequest to be:

struct HttpRequest {
    url: String,
    method: Method,
}
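As a rough sketch of how these two types might be used together: the helper describe below is hypothetical (it is not part of our program) and is only there to show that a match on Method has to handle every variant, and that only Post carries a body.

```rust
enum Method {
    Get,
    Post { body: String },
}

struct HttpRequest {
    url: String,
    method: Method,
}

// Hypothetical helper: the compiler rejects this match if a variant is missing.
fn describe(request: &HttpRequest) -> String {
    match &request.method {
        Method::Get => format!("GET {}", request.url),
        // Only the Post variant has a body to look at.
        Method::Post { body } => format!("POST {} ({} bytes)", request.url, body.len()),
    }
}

fn main() {
    let get = HttpRequest {
        url: "nrk.no".to_string(),
        method: Method::Get,
    };
    let post = HttpRequest {
        url: "example.com".to_string(),
        method: Method::Post { body: "{ \"id\": \"12345\"}".to_string() },
    };
    println!("{}", describe(&get));
    println!("{}", describe(&post));
}
```

There is simply no way to write down a Get with a body here; the type does not have a place to put one.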

To parse the input file, we will read each line of the supplied input file and try to parse it into an HttpRequest. We start by creating the function that will parse each line. Its signature looks like this:

fn parse_line(line: &str) -> Result<HttpRequest, HttpParseError>

Now, you might notice that we wrote &str and not String here. That is because we want this function to simply read from the string; it does not need to own or modify it. The ampersand (&) specifies that the function wants a reference, and str is the type of a piece of string data, so &str is a borrowed view of a string (a "string slice"). Rust cares a lot about who "owns" a value, because the owner is responsible for cleaning up (deallocating) the value when it is no longer needed. This is checked at compile time and the cleanup is inserted automatically, which is part of how Rust achieves its strong compile-time safety. For now, don't worry about it too much.
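To make the difference between borrowing and owning a bit more tangible, here is a minimal sketch. The functions count_words and shout are made up for this illustration and are not part of our CLI program.

```rust
// Borrows the text: the caller keeps ownership and can keep using it afterwards.
fn count_words(text: &str) -> usize {
    text.split(' ').count()
}

// Takes ownership of the String: the caller gives the value away.
fn shout(text: String) -> String {
    text.to_uppercase()
}

fn main() {
    let line = String::from("GET nrk.no");

    let words = count_words(&line); // borrow: `line` is still ours
    let loud = shout(line); // move: `line` now belongs to `shout`

    // println!("{}", line); // would not compile: `line` was moved above
    println!("{} words, shouted: {}", words, loud);
}
```

Uncommenting the marked line makes the compiler reject the program, which is the kind of mistake the ownership rules are meant to catch before the code ever runs.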

Result is one of Rust's built-in enum types. Its definition in the standard library (stripping some details for brevity) is:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

This definition tells us that a Result is either Ok carrying a T, or Err carrying an E. T and E are generic type parameters, so Result can be used with any types. Rust doesn't have exceptions, so returning a Result is the conventional way of dealing with fallible functions. In our case, the Ok type T is HttpRequest and the error type E is HttpParseError, which is yet another enum type that we define to be:

enum HttpParseError {
    UnrecognizedMethod,
    UnrecognizedFormat,
    // ... More variants could be defined
}

(A quick sidenote: although all of our enum examples have had two variants, any number of variants are allowed.)
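To see what working with a Result looks like from the caller's side, here is a small sketch that is unrelated to our HTTP program. The function parse_port and the error type PortError are invented for the example. It shows the two usual ways of handling a Result: the ? operator, which returns early on an error, and an explicit match.

```rust
#[derive(Debug, PartialEq)]
enum PortError {
    NotANumber,
    OutOfRange,
}

// Hypothetical fallible function: turns text into a port number.
fn parse_port(text: &str) -> Result<u16, PortError> {
    // `?` returns the Err from this function right away if parsing fails.
    let number: u32 = text.parse().map_err(|_| PortError::NotANumber)?;
    if number > u32::from(u16::MAX) {
        return Err(PortError::OutOfRange);
    }
    Ok(number as u16)
}

fn main() {
    for input in ["8080", "http", "99999"] {
        // The caller has to look at both cases before it can use the value.
        match parse_port(input) {
            Ok(port) => println!("{}: port {}", input, port),
            Err(error) => println!("{}: error {:?}", input, error),
        }
    }
}
```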

Let's parse the input file and implement the parse_line function:

fn parse_line(line: &str) -> Result<HttpRequest, HttpParseError> {
    let vec: Vec<&str> = line.split(' ').collect();
    match vec.as_slice() {
        ["GET", url] => Ok(HttpRequest {
            method: Method::Get,
            url: url.to_string(),
        }),
        ["POST", url, body @ ..] => Ok(HttpRequest {
            method: Method::Post {
                body: body.join(" "),
            },
            url: url.to_string(),
        }),
        [header, ..] => Err(HttpParseError::UnrecognizedMethod),
        _ => Err(HttpParseError::UnrecognizedFormat),
    }
}

Now, let's break down the code. On the first line, we split the line (e.g., "POST foo bar") on spaces, and then collect the pieces into a vector (Rust's growable list type) of string slices (so we have ["POST", "foo", "bar"]).

Then we pattern match on the vector as a slice – a view into an array.

When a match expression is evaluated, the value is compared against the patterns from top to bottom, stopping at the first pattern that matches. Each branch has the form SomePattern => some_expression(),. For example, in the first case, if the slice consists of exactly "GET" followed by one other string, which we name url, then that branch is executed and the remaining patterns are not tried. It's essentially a compact version of chained if-else-if statements, but it also evaluates to the value of the branch that is executed, making it a match expression and not a match statement. Since the match is the last expression in the function, its value is also what the function returns. Method::Get just means: create the variant Get of the type Method.

In the second pattern, we check for a slice starting with "POST" and some url, followed by any number of words. body @ .. just means: name a variable body and bind it (@) to the rest of the slice (..), whether it's empty or not. Since we split the line on " " (space), we'll just join the body parts (hehe) back together using a space by writing body.join(" ").

If neither of the first two patterns matches, we reach the pattern [header, ..], which matches a slice that has some first string header followed by anything (.. means the rest of the slice). In that case we know the line is not a well-formed GET or POST, so we consider it an UnrecognizedMethod error and return one accordingly. Another possible error we could match for is the pattern ["POST", url], which corresponds to a POST method without a body.
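Slice patterns are easier to get a feel for in isolation. The following sketch uses a made-up function summarize (not part of our program) to show how patterns are tried from top to bottom and how rest @ .. binds the remainder of the slice, even when that remainder is empty.

```rust
// Hypothetical helper that only demonstrates slice patterns.
fn summarize(words: &[&str]) -> String {
    match words {
        // Matches only the empty slice.
        [] => "nothing".to_string(),
        // Matches a slice with exactly one element.
        [only] => format!("one word: {}", only),
        // Matches two or more elements here, because the shorter cases were tried first.
        [first, rest @ ..] => format!("{} followed by {} more", first, rest.len()),
    }
}

fn main() {
    println!("{}", summarize(&[]));
    println!("{}", summarize(&["GET"]));
    println!("{}", summarize(&["POST", "example.com", "{}"]));
}
```

If the last arm were moved above [only], it would also swallow the one-element case, which is why the order of the arms matters.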

Finally, when creating the HttpRequest structs, we have to use .to_string() on the url (and join for the body), which turns a borrowed string slice (&str) into an owned, heap-allocated string (String). We have to do this to prevent us from potentially accessing a dangling pointer sometime later, i.e., a pointer which points to an invalid object. In our case, the string that the slices borrow from might have been deallocated by the time we try to use the request later in our program. Returning the borrowed slices from the function is also possible, avoiding the allocation entirely, because each slice is just a view into the original string in memory. However, to do that we need to understand Rust's concept of lifetimes, which is a more complex concept to deal with later. For now we can definitely afford the allocations 🤓

Oh, and the last line in the match-expression just says to match whatever (the underscore _), and return an error saying we encountered an unrecognized format.
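For the curious, here is roughly what the allocation-free alternative with lifetimes could look like. This is only a sketch under simplifying assumptions: BorrowedRequest and parse_borrowed are hypothetical names, and the function merely splits the method from the rest of the line instead of doing the full parsing above.

```rust
// The fields borrow from the original line instead of owning copies.
// The lifetime 'a says: this struct must not outlive the string it points into.
#[derive(Debug, PartialEq)]
struct BorrowedRequest<'a> {
    method: &'a str,
    rest: &'a str,
}

// Hypothetical, simplified parser: splits at the first space, allocating nothing.
fn parse_borrowed<'a>(line: &'a str) -> Option<BorrowedRequest<'a>> {
    let (method, rest) = line.split_once(' ')?;
    Some(BorrowedRequest { method, rest })
}

fn main() {
    let line = String::from("GET nrk.no");
    let request = parse_borrowed(&line);
    // `request` points into `line`, so the compiler makes sure
    // `line` stays alive for as long as `request` is used.
    println!("{:?}", request);
}
```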

Testing, testing, 123?

Does it work? Let's print a parsed request to see:

fn main() {
    println!("{:?}", parse_line("GET example.com"));
}

println! is a macro that prints things, and "{:?}" is a format string specifying that we should print the result of the parse_line function call using its debug representation.

But oh no, this does not work! The compiler spits out an error message (which had beautiful colors, but which we cannot show here):

error[E0277]: `HttpRequest` doesn't implement `Debug`
 --> src/main.rs:2:22
  |
2 |     println!("{:?}", parse_line("GET example.com"));
  |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `HttpRequest` cannot be formatted using `{:?}`
  |
  = help: the trait `Debug` is not implemented for `HttpRequest`
  = note: add `#[derive(Debug)]` to `HttpRequest` or manually `impl Debug for HttpRequest`
  = help: the trait `Debug` is implemented for `Result<T, E>`
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider annotating `HttpRequest` with `#[derive(Debug)]`
  |
6 | #[derive(Debug)]
  |

Wow, quite a descriptive message. It might seem like a lot, but if we had the colors, we would quickly see the first part saying that the request type cannot be printed as a debug representation because it doesn't implement the Debug trait. A trait is similar to an interface in other languages, meaning something that a type can implement. Luckily, the error message contained a help message at the bottom with the exact code change we need to fix our code!
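To get a sense of what implementing a trait by hand involves, here is a sketch of the manual route that the compiler note mentions. It uses a simplified, stand-alone HttpRequest with only a url field to keep things short; our real type also has a method.

```rust
use std::fmt;

// Simplified stand-in for our real type (assumption: only one field).
struct HttpRequest {
    url: String,
}

// Implementing the Debug trait by hand: we describe how the value is printed.
impl fmt::Debug for HttpRequest {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("HttpRequest")
            .field("url", &self.url)
            .finish()
    }
}

fn main() {
    let request = HttpRequest { url: "example.com".to_string() };
    println!("{:?}", request);
}
```

This works, but it is boilerplate that has to be kept in sync with the fields of the type.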

The suggested code change is a derive macro, which we place on top of our type, meaning we say to the compiler: "just implement the most basic implementation of Debug that works for this type for me". Let's add the line to the top of all of our types HttpRequest, Method and HttpParseError:

#[derive(Debug)] // 👈 new!
enum Method {
// ...

#[derive(Debug)] // 👈 new!
struct HttpRequest {
// ...

#[derive(Debug)] // 👈 new!
enum HttpParseError {
// ...

Now let's try to run our code again and test that the output is correct:

Ok(HttpRequest { url: "example.com", method: Get })

Nice! Our parse_line-function worked. However, we don't want to have to run our code, read the output and then manually ensure it is correct with every code change. It would be nice if we could test it somehow. Luckily, Rust has a simple and easy test framework built in! In the same file, main.rs, we can simply write the following:

#[test]
fn parses_get_request() {
    assert_eq!(
        parse_line("GET example.com"),
        Ok(HttpRequest {
            url: "example.com".to_string(),
            method: Method::Get
        })
    )
}

#[test] is an attribute that marks the function as a test, so that the test harness picks it up. assert_eq! is a macro that panics (failing the test) if the two values are not equal. Unfortunately, once again, this will fail to compile because our types cannot be compared yet, as they do not implement PartialEq (short for a partial equivalence relation), and once again the compiler suggests implementing it using the derive macro. We'll modify all our types to also derive PartialEq, so they look akin to this:

//             👇 new!
#[derive(Debug, PartialEq)]
struct HttpRequest {
// ...

All right, now it doesn't complain (we can check it using cargo check 😏). So how do we run our test? Well, cargo run runs our code, so maybe cargo test tests it?

running 1 test
test parses_get_request ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Wohoo! Success! Also, the compiler loves colors, but our blog does not, so you cannot see the nice green colors of success 😔

And by the way, how nice is it that Cargo just works the way we expect it to? cargo run, cargo check, cargo test? What's next, cargo fix my code? Well, actually yes... But surely it cannot fix broken code? Well, Cargo strikes once again, with cargo fix --broken-code 😎 Although obviously this doesn't always work, or we would be out of work!

Let's write a couple more tests to ensure it works as expected:

#[test]
fn parses_post_requests() {
    // With body
    assert_eq!(
        parse_line("POST example.com { testing: 123 }"),
        Ok(HttpRequest {
            url: "example.com".to_string(),
            method: Method::Post {
                body: "{ testing: 123 }".to_string()
            }
        })
    );
    // Without body
    assert_eq!(
        parse_line("POST example.com"),
        Ok(HttpRequest {
            url: "example.com".to_string(),
            method: Method::Post {
                body: "".to_string()
            }
        })
    );
}

#[test]
fn does_not_parse_junk() {
    assert_eq!(
        parse_line("input validation is painful sometimes"),
        Err(HttpParseError::UnrecognizedMethod)
    );
}

Alright, and now let's test them using cargo test again:

running 3 tests
test parses_get_request ... ok
test does_not_parse_junk ... ok
test parses_post_requests ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Great success!

Reading from a file

There are several ways to read from a file in Rust. The simplest one (in our opinion), and the one we'll use, is the function read_to_string, found in the fs module of the standard library, std. read_to_string takes a file path, tries to read the file into a String, and returns a Result<String, std::io::Error>. In our case, we'll be reading from a hardcoded file we know exists, so we only care about the Ok case. Since this is supposed to be a short post, we can just tell Rust that we are sure the Result is Ok and take out the value inside Ok using the method unwrap(). Note that if our assumption that the value is always Ok is wrong, the program will panic and stop running. We don't recommend doing this in production code.

let content = std::fs::read_to_string("input").unwrap();
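For comparison, here is a sketch of what handling the error instead of unwrapping could look like. The function line_count is a made-up example and is not used in the rest of the article. The ? operator passes an I/O error on to the caller instead of panicking.

```rust
use std::fs;
use std::io;

// Hypothetical helper: counts the lines of a file, reporting I/O errors to the caller.
fn line_count(path: &str) -> Result<usize, io::Error> {
    // On failure, `?` returns the io::Error from this function right away.
    let content = fs::read_to_string(path)?;
    Ok(content.lines().count())
}

fn main() {
    match line_count("input") {
        Ok(count) => println!("input has {} lines", count),
        Err(error) => eprintln!("could not read input: {}", error),
    }
}
```

With this shape, a missing file produces a readable message instead of a panic.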

To parse each individual line, we will split the content of the string on the newline character (\n):

content.split('\n')

Like so. Now, what we haven't said yet, is that split actually returns an Iterator! Well, some type that implements the trait Iterator, which is a trait for... well... iterating over things! There are a myriad of available methods for iterators, the simplest being next, which returns the next element of the iterator, and the previously seen collect, which neatly creates a vector from the iterator. The method we are interested in is map (which is also common in lots of other languages), which takes as a parameter a function that transforms values of a type T into a new type U. In our case, we want to transform each line in the file into a Result<HttpRequest, HttpParseError>. Finally, having mapped each line, we call our good friend collect to give us a vector. Putting everything into a function read_requests we get:

fn read_requests(file: &str) -> Vec<Result<HttpRequest, HttpParseError>> {
    let content = std::fs::read_to_string(file).unwrap();
    content.split('\n').map(parse_line).collect()
}

Clean.

Notice that we use our parse_line function from before. Let's create a test for the function. For brevity, we assume there is a file named test_input which contains:

POST example.com { "key": 5 }
DROP TABLE articles
GET example.com

The test looks like this:

#[test]
fn test_read_requests() {
    let requests = read_requests("test_input");
    assert_eq!(
        requests,
        vec![
            Ok(HttpRequest {
                url: "example.com".to_string(),
                method: Method::Post {
                    body: "{ \"key\": 5 }".to_string()
                }
            }),
            Err(HttpParseError::UnrecognizedMethod),
            Ok(HttpRequest {
                url: "example.com".to_string(),
                method: Method::Get
            })
        ]
    );
}

Running the tests again, we get...

running 4 tests
test parses_get_request ... ok
test does_not_parse_junk ... ok
test parses_post_requests ... ok
test test_read_requests ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Great!
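One caveat about splitting on '\n' as read_requests does: if the file ends with a newline, split produces one extra, empty piece at the end, which parse_line would report as an error. Whether that matters depends on how the input file was saved. The sketch below compares split with the standard library's lines method, which does not yield that trailing empty piece. The two helper functions are made up for the comparison.

```rust
// Hypothetical helpers, only here to compare the two ways of splitting.
fn with_split(content: &str) -> Vec<&str> {
    content.split('\n').collect()
}

fn with_lines(content: &str) -> Vec<&str> {
    content.lines().collect()
}

fn main() {
    // Note the newline at the very end of the content.
    let content = "GET example.com\nGET nrk.no\n";

    // split keeps an empty string after the final newline...
    println!("{:?}", with_split(content));
    // ...while lines does not.
    println!("{:?}", with_lines(content));
}
```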

Executing HTTP Requests

Since we have done most of the work up until this point, sending the actual HTTP requests is the easy part. We will not implement the HTTP protocol ourselves (sorry to disappoint anyone), we will instead use one of Rust's excellent community libraries: reqwest. We've seen a pattern of Cargo having intuitive commands, so let's try adding it using... cargo add? Well, almost, we can add it using cargo add reqwest --features blocking. This will add reqwest as one of our dependencies while also enabling the blocking API. This will update our Cargo.toml file to be:

[package]
name = "http-sender"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
 # 👇 new
reqwest = { version = "0.11.13", features = ["blocking"] }

To keep the article simple, we will use reqwest's blocking API instead of its asynchronous flavor, leaving async/await for another article. We can now (finally) start writing our main function! Let's go through all the HTTP-request instructions that we've parsed:

fn main() {
    for request in read_requests("input") {
        todo!()
    }
}

This reads the requests from the input file and iterates over them, which works because read_requests returns a vector of Results. There are many ways to handle the error cases, such as using continue in the loop if the value was an error, or doing a pattern match. In our case, we'll simply ignore the errors, and the cleanest and coolest way to do that is probably to use iterators 😎. Turning a vector into an iterator can be done using the into_iter method. From there, we can use flatten, which discards the Err values and unwraps the Ok values, so that the loop iterates over the values inside the Oks. This is not how we would do it if this was a real program: it would be nicer to tell the user that an instruction is incorrect and will be ignored.

fn main() {                               // 👇        👇
    for request in read_requests("input").into_iter().flatten() {
        todo!()
    }
}
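To see flatten on Results in isolation, here is a small sketch with plain numbers instead of requests. Both functions are invented for the illustration. The first one mirrors what our loop does. The second one shows one possible way to keep the errors around so they could be reported to the user.

```rust
// Mirrors our loop: Err values are silently dropped, Ok values are unwrapped.
fn keep_ok(results: Vec<Result<i32, String>>) -> Vec<i32> {
    results.into_iter().flatten().collect()
}

// Alternative sketch: separate the successes from the errors instead of discarding them.
fn split_results(results: Vec<Result<i32, String>>) -> (Vec<i32>, Vec<String>) {
    let mut values = Vec::new();
    let mut errors = Vec::new();
    for result in results {
        match result {
            Ok(value) => values.push(value),
            Err(error) => errors.push(error),
        }
    }
    (values, errors)
}

fn main() {
    let results = vec![Ok(1), Err("bad line".to_string()), Ok(3)];
    println!("{:?}", keep_ok(results.clone()));
    println!("{:?}", split_results(results));
}
```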

To execute each request, we'll just pattern match on each request's method and call the corresponding function in the library we added:

fn main() {
    let client = reqwest::blocking::Client::new();
    for request in read_requests("input").into_iter().flatten() {
        let url = format!("https://{}", request.url);
        let res = match request.method {
            Method::Get => client.get(&url),
            Method::Post { body } => client.post(&url).body(body),
        }
        .send() // 👈 Execute the request
        .unwrap(); // 👈 Just assume sending works for now (this panics if the request could not be sent) 😅
                        // 👇 convert the body to text, failing if it's something else
        println!("{}", res.text().unwrap());
    }
}

Here we create a client that we will be using for every request. We then iterate over each request and use the match expression to either send a GET request or a POST request. Finally, since it's an expression, we can call send on the expression which sends the request and returns a Result which we blindly assume is always Ok, and print the text (which we also blindly assume is text) stored in the response.

Results

Running our program with GET example.com in our input file, we get..

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    ✂️ snip snip
    ✂️ ....

Et voilà!

It runs the GET request and prints the result!

Final remarks

Although our introduction to Rust has been quick (at least that's what we hope it felt like), we hope that you have gotten a decent taste of the Rust programming language. If you are still interested in learning more, there are plenty of good resources. A good place to start is The Rust Programming Language, the official and free book on Rust, known in the community as "The Book".

We hope you enjoyed this article and we hope to see you again in another 👋