4-Dec

Development, Everyday hacks and Rust

A better `ls` in 100 lines of Rust

Don't you just hate how boring ls is? I mean, it just spits out the names of files and folders on a line, one after the other. Come on, it's not 1987 anymore.
Let's try to make our own version, using Rust! Don't know any Rust? Go ahead and read last year's Introduction to Rust first! 🦀 I'll be waiting 🥱

7 min read · By Isak Singh · December 4, 2023

Let's list some files


To get started, let's just grab all the files in our current working directory and spit them back out, using the excellent walkdir crate. Let's create our project and get going. We'll call it explore 🔎

$ cargo new explore
  Created binary (application) `explore` package
$ cd explore
$ cargo add walkdir
  Updating crates.io index
    Adding walkdir v2.4.0 to dependencies.


And in our src/main.rs:

use walkdir::WalkDir;

fn main() {
    for entry in WalkDir::new(".") {
        // Let's postpone error-handling for later
        let entry = entry.unwrap();
        println!("{}", entry.path().display())
    }
}


And run it:

$ cargo run --quiet
.
./Cargo.toml
./target
./target/.rustc_info.json
./target/CACHEDIR.TAG
./target/debug
./target/debug/.fingerprint
./target/debug/.fingerprint/explore-43ca298cd94bedc4
[snip]
./src
./src/main.rs


Wow, that's a lot of output! It prints all the build files and .git files too, so I truncated it. Let's clean that up by narrowing down the files we're interested in, which means hiding hidden files (i.e. files whose names start with .) and only showing one level of depth:

// Add this function:
fn is_hidden(entry: &DirEntry) -> bool {
    entry
        .file_name()
        .to_str()
        .map(|s| s.starts_with('.'))
        .unwrap_or(false)
}

// And in our main loop:
for entry in WalkDir::new(".")
    .min_depth(1)
    .max_depth(1)
    .into_iter()
    .filter_entry(|e| !is_hidden(e))
{
    let entry = entry.unwrap();
    println!("{}", entry.path().display())
}
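
The hidden-file check boils down to a string test on the file name. As a standalone sketch using only the standard library (the name `is_hidden_name` is made up for illustration and isn't part of walkdir), the same logic looks roughly like this:

```rust
use std::ffi::OsStr;

/// Hypothetical std-only twin of `is_hidden`: takes a bare file name
/// instead of a walkdir `DirEntry`.
fn is_hidden_name(name: &OsStr) -> bool {
    name.to_str()
        // Names that aren't valid UTF-8 give us `None` here...
        .map(|s| s.starts_with('.'))
        // ...and we treat those as not hidden.
        .unwrap_or(false)
}
```

Worth knowing: filter_entry doesn't just drop the entries the predicate rejects, it also skips descending into rejected directories, so nothing inside .git would show up even with a larger depth.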


Now let's run it again:

$ cargo run -q
./Cargo.toml
./target
./Cargo.lock
./src


Nice! 🎉

Oh, and by the way, I'm skipping showing you all the imports, so if you're following along, just use rust-analyzer's code actions to import them automatically.


Hold for the applause 👏


Let's add the excellent command-line argument parsing library Clap and use that to parse some input arguments. We want it to figure out the parsing, so we add the feature derive as well:

$ cargo add clap -F derive
Updating crates.io index
    Adding clap v4.4.8 to dependencies.
            Features:
            + color
            + derive
            + error-context
            + help
            + std
            + suggestions
            + usage
            - cargo
            - debug
            - deprecated
            - env
            - string
            - unicode
            - unstable-doc
            - unstable-styles
            - unstable-v5
            - wrap_help


To use Clap, we create a struct representing our input arguments, slap on some attributes to tell Clap how to initialize things, and change our usages so it works just the same:

#[derive(Parser)]
struct Options {
    #[arg(short, long, value_name = "PATH")]
    path: Option<PathBuf>,
    #[arg(long, default_value_t = 1)]
    min_depth: usize,
    #[arg(long, default_value_t = 1)]
    max_depth: usize,
    #[arg(long, default_value_t = false)]
    hidden: bool,
}

fn main() {
    let options = Options::parse();

    for entry in WalkDir::new(options.path.unwrap_or(".".into()))
        .min_depth(options.min_depth)
        .max_depth(options.max_depth)
        .into_iter()
        .filter_entry(|e| options.hidden || !is_hidden(e))
    {
    // ...


And run it again to check it still works as before:

$ cargo run -q
./Cargo.toml
./target
./Cargo.lock
./src


It does! And now we can also tweak the output with --max-depth, --min-depth, --hidden and --path/-p!

$ # Pass arguments through cargo to our program using `--` followed by our arguments:
$ cargo run -q -- --max-depth 2 --hidden -p .
./Cargo.toml
./target
./target/.rustc_info.json
./target/CACHEDIR.TAG
./target/debug
./Cargo.lock
./.gitignore
./.git
./.git/config
./.git/objects
./.git/HEAD
./.git/info
./.git/description
./.git/hooks
./.git/refs
./src
./src/main.rs


Awesome!

Sizable files


Now let's spice things up by adding size output to our entries! Let's look at an entry's metadata and find out:

// In our main loop, add:
let entry = entry.unwrap();
let size = entry.metadata().unwrap().len();
println!("{size:>9}B\t{}", entry.path().display());


Looks simple enough, but that printing stuff is weird. Let's break it down: {size} prints the size variable, which is in bytes. {size:>9} right-aligns the number in a field 9 characters wide. \t adds a tab character, which lines up the next column somewhat nicely. And we add a B because the size is in bytes. Let's run it:

$ cargo run -q
      246B      ./Cargo.toml
      160B      ./target
     7466B      ./Cargo.lock
       96B      ./src


Sweeeet! Now we're almost on par with ls. However, at some point in time, or perhaps from the beginning, ls got support for colors. We can't be any worse! Let's add some colors to differentiate directories, files and symlinks from each other. colored looks good!

$ cargo add colored
    Updating crates.io index
        Adding colored v2.0.4 to dependencies.
                Features:
                - no-color


And in our src/main.rs main loop:

let formatted_entry = if entry.file_type().is_dir() {
    entry.path().display().to_string().blue()
} else if entry.file_type().is_file() {
    entry.path().display().to_string().white()
} else {
    // Symlinks, probably
    entry.path().display().to_string().yellow()
};

println!("{size:>9}B\t{formatted_entry}");


colored adds a Colorize trait which gives us extension methods like .blue() and friends, but it only works on strings (&str and String), so we need to convert the output of .display() to a string using .to_string(). If you inspect .display(), you'll see it returns a Display type which implements std::fmt::Display, which is why printing it worked earlier without converting it to a string first. But anyway, let's run it:

$ cargo run -q
      264B      ./Cargo.toml
      160B      ./target
     9471B      ./Cargo.lock
       96B      ./src


Sweeeeeet!! Ehh... wait.. You can't really see the colors like that.. Give me a second... Now!

A screenshot showing that we ran `cargo run -q` and the output, with directories colored in blue, and files in white. The size in bytes is still white, though.


Woo! It's looking a bit bland with the size missing colors, so let's spice it with some colors too:

// snip
println!("{:>9}{}\t{formatted_entry}", size.to_string().green(), "B".green());
// snip


And rerun:

The same screenshot as before, but now showing the size in green.


Awesome! However, this beautiful new colored output highlights a problem. 9471B? It's over 9000! We should get ahead of ourselves and format large byte quantities into something more readable. Let's add bytesize and use it!

$ cargo add bytesize
    Updating crates.io index
      Adding bytesize v1.3.0 to dependencies.
             Features:
             - serde


bytesize adds B, KB or whatever, so we need to remove our B from the output. Also, since we want the formatting 123 KB first, and then color it, we need to format it using format!("{}", ByteSize(size)) and then color it green, and then print it:

// snip
println!("{:>9}\t{formatted_entry}", format!("{}", ByteSize(size)).green());
// snip
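
If you're curious what a formatter like that does under the hood, here's a hand-rolled sketch. To be clear, this is not bytesize's implementation, just my assumption of the general idea (decimal units, one decimal place), and `human_size` is a made-up name:

```rust
/// Hand-rolled sketch, NOT bytesize's actual code:
/// decimal (1000-based) units with one decimal place.
fn human_size(bytes: u64) -> String {
    const UNITS: [&str; 6] = ["KB", "MB", "GB", "TB", "PB", "EB"];
    if bytes < 1000 {
        return format!("{bytes} B");
    }
    // Keep dividing by 1000 until the value fits the current unit.
    let mut value = bytes as f64 / 1000.0;
    let mut unit = 0;
    while value >= 1000.0 && unit < UNITS.len() - 1 {
        value /= 1000.0;
        unit += 1;
    }
    format!("{value:.1} {}", UNITS[unit])
}
```

The important bit for our program is the order of operations: build the full string first, then pad it, then color it.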


Rerun time:

A sweet, sweet screenshot of the same as before, but our 9471 bytes is now formatted as `9.7 KB`. The rest of the output is the same


Nice!

Putting aside the noise


Now, I don't know about you, but I'm not a big fan of it spitting out ./ as a prefix to everything, plus the whole directory path on every line for nested files.


To fix this, let's grab only the file name and extension of each entry, without the leading path, before printing. To avoid printing everything flat and not understanding which file belongs in which directory, let's indent each file by its depth (the number of nested directories it is in). Also, as a bonus, walkdir gives us the directory itself as a single entry first, so we automatically understand which entry belongs to which directory! Let's make it clearer by putting it into code. Let's change our main loop code for formatting entries to:
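
A minimal sketch of that idea, using only the standard library so it's easy to try in isolation. The helper name `indented_name` and the exact bar style are my assumptions here, not necessarily the code behind the screenshot:

```rust
use std::path::Path;

/// Hypothetical helper: the entry's file name, prefixed with one bar
/// per level of nesting below the top level.
fn indented_name(path: &Path, depth: usize) -> String {
    // Assumption: entries directly inside the root have depth 1,
    // so they get no bars at all.
    let bars = "| ".repeat(depth.saturating_sub(1));
    let name = path
        .file_name()
        .map(|n| n.to_string_lossy().into_owned())
        // Paths like "." have no file name; fall back to the path itself.
        .unwrap_or_else(|| path.display().to_string());
    format!("{bars}{name}")
}
```

In the main loop this could be fed with entry.path() and entry.depth() from walkdir, and the bars could be built separately and passed through colored's .dimmed() to tone them down.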


And let's compile and run, again:

A screenshot of the program run with a depth of 3 using `--max-depth 3`, printing only the filenames, e.g. `main.rs` instead of `src/main.rs`, prefixed by a dimmed `|` times the indentation amount. To be honest, it looks kind of sweet. And, hello and thank you to all the people loving and reading the alt text. 🤠


Awesome! I love the dimmed bars for indentation, too. Nice touch, Isak.

Time for a change


This is a pretty nice-looking CLI-tool already! Let's spice it up to ls's level by adding some date stuff. Rust has support for dates in the standard library, but doesn't support formatting them (trust me, with the amount of weird stuff going on with dates/timestamps and the stability promises of a standard library, this is really a good thing).
Let's use chrono:

$ cargo add chrono
    Updating crates.io index
      Adding chrono v0.4.31 to dependencies.
             Features:
             + android-tzdata
             + clock
             + iana-time-zone
             + js-sys
             + oldtime
             + std
             + wasm-bindgen
             + wasmbind
             + winapi
             + windows-targets
             - __internal_bench
             - alloc
             - arbitrary
             - libc
             - pure-rust-locales
             - rkyv
             - rustc-serialize
             - serde
             - unstable-locales


We saw earlier that each entry has a .metadata() method. It returns a Result<Metadata, walkdir::Error>, and we've been .unwrap()-ping it without problems until now. The Metadata object also has a .modified() method that gives us a Result containing a SystemTime. We've been cowboying fine so far, so let's unwrap that as well and hope it exists:

// In the upper part of our main loop, we'll tweak the `size` variable and extract `metadata` into its own variable, so we don't have to unwrap twice
let metadata = entry.metadata().unwrap();
let size = metadata.len();
// snip
let date = metadata.modified().unwrap();


Let's add an option for actually printing the date modified, because we might not always need it. In our Options-struct, add:

#[arg(short, long, default_value_t = false)]
modified: bool,


Onto formatting and actually using chrono. For non-serialization formats, it's pretty nice to see which day, date and time something changed. I don't know what you like, but my favorite RFC for those types of date formats is RFC 2822, which spits out something along the lines of Mon, 13 Nov 2023 19:18:49 +0000, so we'll just use chrono to spit out that!


Let's create a formatted_date variable that checks if the --modified/-m flag is set, and returns an empty string if it isn't. If it is, it will convert our date to chrono's DateTime<Utc>, spit out the beautiful RFC 2822 output, and then strip away the ugly +0000 suffix, because who wants that.

let formatted_date = if options.modified {
    format!("\t{}",
        DateTime::<Utc>::from(date)
            .to_rfc2822()
            .strip_suffix(" +0000")
            // We know it is in UTC so the stripping works, probably 🤠
            .unwrap()
            .blue()
    )
} else { "".to_string() };

println!(
    "{:>9}{}\t{formatted_entry}",
    format!("{}", ByteSize(size)).green(),
    formatted_date
);
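
If that .unwrap() after .strip_suffix() makes you nervous, here's a slightly more careful sketch. It's std-only, the helper name `strip_utc_offset` is made up, and it simply keeps the original string when the suffix isn't there instead of panicking:

```rust
/// Hypothetical helper: drop a trailing " +0000" if present,
/// otherwise return the input unchanged.
fn strip_utc_offset(rfc2822: &str) -> &str {
    // `strip_suffix` returns `None` when the string doesn't end with the
    // suffix, so `unwrap_or` gives us a fallback rather than a panic.
    rfc2822.strip_suffix(" +0000").unwrap_or(rfc2822)
}
```

You'd call it on the String from .to_rfc2822() before coloring it. The cowboy version is fine as long as the date really is in UTC, though.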


We'll print out our dates in blue:

A screenshot of our program ran, passing along `-m` to show its date modified. The output prints the size (as before), a tabulated space, then the dates, e.g. `Mon, 13 Nov 2023 19:18:49` and then another tabulation and the filename, as before.


Sick! 🤩


Now let's not compare ourselves with others...


I'm not gonna lie, this is pretty similar to the output of the awesome tool exa, also written in Rust. Let's compare them!

A screenshot of two commands and their output. The first is `exa -la`, which outputs similar information, but also permissions of files, the author/owner of the file, and some extra highlights because it understands this is a Rust project, so it underlines and uses yellow+bold for `Cargo.toml`. Afterwards is our command run, same as before.


In fact, ignoring the permissions and author of the files (because who cares about those), they're mostly the same! Except ours lists hidden files and directories last. This is a simple fix:

// In our for-loop:
WalkDir::new(options.path.unwrap_or(".".into()))
    // New 👇
    .sort_by_file_name()
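
Why does sorting by file name put the dot-files first? Assuming the comparison is a plain byte-wise one (which is how Rust orders strings by default), '.' simply sorts before every digit and letter. A quick sketch with the names from our project, where `sorted_names` is a made-up helper:

```rust
/// Sorts names using Rust's default, byte-wise string ordering.
fn sorted_names(mut names: Vec<&str>) -> Vec<&str> {
    // '.' (0x2E) comes before digits, uppercase and lowercase letters,
    // and uppercase letters come before lowercase ones.
    names.sort();
    names
}
```

Feeding it Cargo.toml, target, Cargo.lock, .gitignore, .git and src gives back .git, .gitignore, Cargo.lock, Cargo.toml, src, target. Note that this ordering is case-sensitive, so Cargo.toml lands before src.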


And rerun one last time!

A screenshot of our program being run, with the same output as before, but now with `.git` and `.gitignore` first. Same as `exa`'s output!


Neeat 😎

This article is probably getting a bit too long, so let's leave it at that. Adding more features such as permissions, error-handling and so on is left as an exercise to the reader (but really can be solved by running cargo add anyhow, search-and-replacing .unwrap() with .context("describe the error")? to yeet the error out of the function, assuming you changed the main function's return type to anyhow::Result<()> and return Ok(()) at the end of the function). Huh, we almost could have done all that stuff in the parentheses, but anyhow.
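
To show the shape of that change without pulling in another crate, here's a std-only stand-in. It swaps anyhow's .context(...) for a plain map_err that attaches a message, and `file_size` is a made-up helper rather than code from the program above:

```rust
use std::fs;
use std::path::Path;

/// Std-only stand-in for `.context("...")?`: attach a message to the
/// error and hand it to the caller instead of panicking.
fn file_size(path: &Path) -> Result<u64, String> {
    let metadata = fs::metadata(path)
        .map_err(|e| format!("could not read metadata for {}: {e}", path.display()))?;
    Ok(metadata.len())
}
```

The ? does the yeeting: on an error the function returns early with the described message, and the caller decides whether to print it, skip the entry or bail out.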


Time to feed the dog


If you want to use this program as a replacement for ls, it's super easy!
Just run:

$ cargo build --release && cargo install --path .


And then a bunch of output is spat out. Afterwards it should be installed in $HOME/.cargo/bin/explore, or wherever your Cargo installation is, and assuming you have that in your PATH (which you should), you can run it like this!

A screenshot of `explore --hidden` being run in our project folder, outputting the same as before, except without the date modified. Success!

Conclusion


Thanks for reading! I hope you see how easy it is to create a tool of your own using Rust and its great tooling and ecosystem. And we did all of this in... how many lines? Well, the title was a lie, we actually only used 68 lines of Rust, according to the awesome written-in-Rust tool tokei.

A screenshot of `tokei` being run in our repo, outputting that we have 81 lines of Rust, with 68 actual code lines, 5 comments, and 8 blank lines. Additionally, we have 10 lines of toml


68 lines for all that functionality! And of course some more when we include the dependencies, but oh well. Pretty nice! Doesn't make for a good title, though... Perhaps "A better ls in less than 69 lines of Rust" is nicer. Is it really a better version of ls? Probably not, but that wouldn't make for a good title either.


There's plenty more that can be done if you want to create your own version. As an example, you can add:

  • Permissions
  • Author
  • Date created
  • Options for all of the above
  • Support for ignoring files and folders according to ignore-files (perhaps use ignore)
  • And so on


Happy holidays! 🦀