Carcinisation of mirrord (or: why we use Rust)
Posted June 14, 2022 by Aviram Hassan ‐ 6 min read
A classic example of carcinisation is MetalBear’s mirrord project, where several different components converged on Rust as their main language. In this post, we’ll detail their different evolutionary paths and explain why they ended up being written in Rust.
First of all, what is mirrord?
mirrord is an open-source tool that lets developers run local processes in the context of their cloud environment. It is meant to provide the benefits of running your service in a cloud environment (e.g. staging) without going through the hassle of actually deploying it there, and without disrupting the environment by deploying untested code.
mirrord has four main components:
- Agent - Runs in the cloud and acts as a proxy for the users.
- Layer - Shared library that runs inside the local service, hooking IO operations (sockets, filesystem) and proxying those to the agent.
- CLI - Wrapper to inject/load the Layer into local processes.
- VS Code Extension - Same as CLI, but for VS Code.
Agent
This component is shipped as a container image. The layer creates a job with this image, providing it with elevated permissions on the same node as the impersonated pod. The job then enters the impersonated pod’s namespaces in order to be able to access its file system and network interfaces.
Ideally, we would like to switch namespaces only for the necessary code flows, so we have minimal impact on the impersonated pod. Linux lets you set namespaces per thread, so you can have different functionalities running in different threads and in different namespaces. Controlling threads this way can be hard in some frameworks/languages. For example, in Go, goroutines are scheduled onto OS threads by the runtime, so you need to do some tinkering to ensure correctness. Rust’s standard threads, on the other hand, map directly to OS threads, which gives us fine-grained control over threads and namespaces.
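To make the per-thread idea concrete, here is a minimal sketch (not mirrord’s actual code) that moves only a dedicated worker thread into another namespace and leaves the rest of the process where it is. The helper name run_in_namespace is hypothetical, setns is declared by hand to keep the sketch dependency-free, and it assumes Linux, a recent Rust toolchain, and enough privileges (typically CAP_SYS_ADMIN) to call setns.

```rust
use std::fs::File;
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::thread;

unsafe extern "C" {
    // setns(2) from libc, declared manually instead of pulling in a crate.
    fn setns(fd: c_int, nstype: c_int) -> c_int;
}

/// Hypothetical helper: runs `work` on a dedicated OS thread after moving
/// only that thread into the namespace referred to by `ns_path`
/// (for example "/proc/<pid>/ns/net").
fn run_in_namespace<T, F>(ns_path: &str, work: F) -> io::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    // Open the namespace handle on the calling thread so errors surface early.
    let ns_file = File::open(ns_path)?;

    let handle = thread::spawn(move || -> io::Result<T> {
        // nstype 0 means "accept whatever namespace type the fd refers to".
        let rc = unsafe { setns(ns_file.as_raw_fd(), 0) };
        if rc != 0 {
            return Err(io::Error::last_os_error());
        }
        // Only this thread sees the new namespace; the caller is unaffected.
        Ok(work())
    });

    handle.join().expect("namespace thread panicked")
}
```

A caller could pass a closure that opens a socket or lists interfaces, and everything outside the closure keeps running in the original namespace.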
One of mirrord’s goals is to let multiple developers work on the same environment without impacting each other. To achieve that, our agent has to have a very small footprint. Rust lets us have a fixed-size memory layout, without many allocations and without the overhead of a garbage collector.
The Agent has many functionalities running at the same time, requiring the ability to move data between threads safely.
Rust provides great primitives and safety around concurrency and task management. Thanks to the Send trait, it doesn’t let us unknowingly move thread-bound data types to another thread, and thanks to the Sync trait, the compiler rejects code that shares references to non-thread-safe data types across threads.
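As a small illustration of what Send and Sync buy us, the sketch below shares a counter between several threads. The function name count_events and the counter itself are made up for this example and are not taken from the agent’s code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical example: several workers bump one shared counter.
fn count_events(workers: usize, events_per_worker: usize) -> usize {
    // Arc<AtomicUsize> is both Send and Sync, so it may cross threads.
    // Swapping Arc for Rc here fails to compile, because Rc is not Send.
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..events_per_worker {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining every worker guarantees all increments are visible below.
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    counter.load(Ordering::Relaxed)
}
```

The point is less the counter than the failure mode: the mistake of sharing a non-thread-safe type is caught at compile time rather than as a data race at runtime.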
Layer
This component is shipped as a dylib/so (shared library) and loaded into the local process that’s being plugged into the cloud. Once loaded, it hooks many libc functions (and some other frameworks, such as libuv) to create a smart management layer that decides which operations happen locally and which are relayed to run remotely.
It was obvious to us that the layer had to be written in a low-level language. Yes, we could create a bridge layer (JS actually has one built in with Frida), but that would add complexity and security concerns. In other parts of the solution we run in a self-contained context (i.e. our own process), but this time we’re being loaded into a process that isn’t aware of our existence, and we can’t afford to introduce bugs into it. Rust + Frida let us hook low-level functions such as open, etc. in a relatively safe manner, leveraging great abstractions such as Arc to manage our internal data structures and sync primitives such as mpsc to build communication between different parts of our code (in the layer, it’s mainly between the “main loop thread” and hooks).
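The hook-to-main-loop communication could look roughly like the sketch below. It leaves out Frida entirely: open_detour is a plain function standing in for a real hook, and HookMessage, is_remote, and the path prefixes are hypothetical names and rules chosen only for illustration.

```rust
use std::sync::mpsc::{Receiver, Sender};
use std::thread;

/// Hypothetical message sent from a hook to the main loop thread.
#[derive(Debug, PartialEq)]
enum HookMessage {
    Open { path: String },
    Shutdown,
}

/// Hypothetical routing rule: scratch and device files stay local,
/// everything else is relayed to the remote side.
fn is_remote(path: &str) -> bool {
    !(path.starts_with("/tmp/") || path.starts_with("/dev/"))
}

/// Stand-in for an `open` detour. Returns true when the call was handed
/// to the main loop, false when it should fall through to the local libc.
fn open_detour(path: &str, to_main_loop: &Sender<HookMessage>) -> bool {
    if !is_remote(path) {
        return false;
    }
    to_main_loop
        .send(HookMessage::Open { path: path.to_owned() })
        .is_ok()
}

/// Main loop thread: drains hook messages until told to shut down,
/// returning the paths it would have relayed.
fn spawn_main_loop(rx: Receiver<HookMessage>) -> thread::JoinHandle<Vec<String>> {
    thread::spawn(move || {
        let mut relayed = Vec::new();
        for message in rx {
            match message {
                HookMessage::Open { path } => relayed.push(path),
                HookMessage::Shutdown => break,
            }
        }
        relayed
    })
}
```

Keeping the hook side down to a cheap send keeps the code that runs inside the host process’s own call path short, while the heavier work stays on a thread we control.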
Like the Agent, the Layer also needs to have low overhead in order to provide great developer experience.
CLI
The CLI is responsible for injecting the layer into the target process. Right now, our main (and only) load mechanism is using DYLD_INSERT_LIBRARIES (on macOS) and LD_PRELOAD on Linux.
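In Rust, this kind of injection boils down to setting one environment variable on the child process. The following is a minimal sketch using only the standard library; the function names and the layer path are placeholders, not the CLI’s real interface.

```rust
use std::path::Path;
use std::process::Command;

/// Name of the dynamic-loader variable used for injection on this platform.
fn preload_var() -> &'static str {
    if cfg!(target_os = "macos") {
        "DYLD_INSERT_LIBRARIES"
    } else {
        "LD_PRELOAD"
    }
}

/// Hypothetical helper: builds a command that starts `program` with the
/// layer library preloaded. The caller decides when to spawn it.
fn command_with_layer(program: &str, args: &[&str], layer: &Path) -> Command {
    let mut command = Command::new(program);
    command.args(args).env(preload_var(), layer);
    command
}
```

Calling `.status()` or `.spawn()` on the returned command would then start the target process with the layer loaded by the dynamic linker.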
Implementing this in another language could’ve been easy, but we decided to go with Rust for several reasons.
The load method is fairly simple, but if in the future we want to introduce a more sophisticated injection method, such as “ptrace”, Rust would let us implement it and work with those functions and layouts more comfortably than other languages.
Rust generates standalone binaries (apart from libc). On top of that, using Cargo’s new bindeps feature we could embed mirrord-layer into the CLI, creating a smooth and transparent experience instead of having to ship two files or download dependencies at runtime.
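Once the layer is embedded, the CLI has to write it back to disk before it can be preloaded. Here is a sketch of that step, assuming the library bytes are available as a byte slice; in a real build they would presumably come from include_bytes! pointed at the artifact Cargo produces, which is not shown here. The function and file names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes the embedded layer bytes into `dir` and
/// returns the path that can later be handed to the preload variable.
fn extract_layer(layer_bytes: &[u8], dir: &Path, file_name: &str) -> io::Result<PathBuf> {
    // Make sure the target directory exists before writing into it.
    fs::create_dir_all(dir)?;
    let target = dir.join(file_name);
    // Overwrites any stale copy left behind by a previous run.
    fs::write(&target, layer_bytes)?;
    Ok(target)
}
```

With this shape the user only ever downloads a single binary, and the second file exists just for the lifetime of the run.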
VS Code Extension
The extension can’t really be written in Rust, since VS Code extensions only support JS. We did consider using WASM, but the bridging logic would “cost” too much to be worth any value Rust might provide (which, for the extension, would mostly be consistency with our other components).
Bonus section! I had a feeling that having our codebase be mainly in Rust would make hiring engineers a lot easier. As a Rust enthusiast, I would have loved to work somewhere where I’d get to use Rust regularly, and I suspected that many others felt the same. I even posted a poll in /r/rust to see how others feel. The results were encouraging - around 40% of the people who voted thought the market for hiring Rust engineers is employer driven. I suspect that in any other ecosystem, the results would have been much more one-sided in favor of the market being employee driven. When we finally started hiring, we posted in the “Who’s Hiring” mega thread in /r/rust and received applications from some great candidates, making building the team both fun and fast.
Rust can be used for many use cases, software and applications. We believe that every task requires different tools, but in our case our project’s whole ecosystem fit right into Rust. You always need to do your research before choosing a language and stack, but generally speaking Rust is powerful and versatile and can be an amazing choice for a lot of different solutions.
Do you have any questions/corrections? Our website is completely open-source, so feel free to submit them as an issue or PR to our repo.
Want to help mirrord? Have a look at our open issues in the GitHub issue tracker and feel free to contribute.