TrueNAS Scale Migration, or: Can't Escape K8s

What and Why

I run a small NAS for myself and my family. I hesitate to call it a "homelab" because it's really just the one box.

Four hard drive bays in a black chassis
Figure 1: This fancy case was US$120

Until recently this was running TrueNAS Core, which is based on FreeBSD. This was good for cred, but difficult for me to use in practice. I've spent two decades learning how to use Linux, and very little of that transferred. Maybe if I knew how to use a Mac it'd be easier.

TrueNAS has announced the apparent deprecation of the FreeBSD offering. That gave me the excuse I needed to finally migrate to TrueNAS Scale, the "new" Debian-based version.

This past weekend I finally got around to doing that migration!

Reinstalling

My NAS normally runs headless, and my consumer motherboard definitely doesn't have IPMI. So after making sure it wasn't in use I unplugged it and dragged it over to my workbench. I plugged in a keyboard and monitor, inserted a flash drive I'd dd'd an image onto1, and walked through a very minimal installer.

When I'm setting up Linux I have all sorts of Opinions on dm-crypt and LVM. I had some grand designs on splitting the NAS's boot SSD into halves and installing the two OSes side by side, but I abandoned that. I just hit next-next-next-done, hit the "upgrade" option, and went to get a snack. When I came back the system had rebooted… back into TrueNAS Core.

It turns out I can't actually remember which is which2, and I'd written the BSD one onto the flash drive. Sigh. At least now I knew the key that dumped the motherboard into boot media selection.

A Clean Break

In theory there's supposed to be a way to migrate configuration from Core to Scale, but I didn't bother. I'd bodged together a lot of stuff outside the guard rails. FreeBSD jails are similar to Linux containers, but BSD and Linux are not the same system.

So after rebooting, importing my ZFS pool, and setting the hostname, I clicked the "App" tab and installed the most important application: Plex. Then I saw something in the corner about installing a "chart…"

oh no.

Oh Yes

me: "god I switched to TrueNAS scale because TrueNAS Core (the BSD one) is being deprecated, and what do I find inside? KUBERNETES" lydia: "lmfaooooooo"
Figure 2: jumpscare

Yeah, it's k8s. And helm. It's using k3s, so it's not completely ridiculous, but still. What. Apparently I'm not the only person who was exasperated, since the next version will switch to docker, but that doesn't help me now.

I almost avoided writing anything, too. My first stop was setting up my household dashboard3, and that was just one container. After some editing to move the secrets out of the executable and into environment variables, I just needed to fill out a few fields in the GUI.

But of course I inevitably ran up against a limitation. Despite being k8s under the hood, which loves nothing more than a sidecar4, there was no way to add a pod with multiple containers. So of course I had to dive off the deep end myself.

First off: is there some official way to modify the YAML? No, of course not. Is there a way to add my own helm chart? Not easily. Can I just get access to kubectl? Unfortunately, yes.

The Paradox of Expertise

I often run into a situation, when venturing outside my normal tech bubbles, where I know both too much and too little. When deploying a "Custom App" on TrueNAS, there's a port forwarding setting, which obviously I don't have access to for my own Deployment.

Now how does that work? All the forum posts are unhelpful or condescending (the TrueNAS forums are MEAN!). So it's up to me and my Linux Skills.

I've got a working app running on port 9045, so let's figure this out. There's nothing in netstat:

root@montero[~]# netstat -lpnt | grep 9045
root@montero[~]#

Nor iptables:

root@montero[~]# iptables -L | grep 9045
root@montero[~]# iptables -L -t nat | grep 9045
root@montero[~]#

On the k8s side, there's no ingress controller:

root@montero[~]# k3s kubectl get ingress --all-namespaces
No resources found

And no stand-out annotations in the service:

root@montero[~]# k3s kubectl get svc -n ix-den-tv den-tv-ix-chart -o jsonpath={.metadata.annotations}
{"meta.helm.sh/release-name":"den-tv","meta.helm.sh/release-namespace":"ix-den-tv"}
root@montero[~]# k3s kubectl get svc -n ix-den-tv den-tv-ix-chart -o jsonpath={.metadata.labels}
{"app.kubernetes.io/instance":"den-tv","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"ix-chart","app.kubernetes.io/version":"v1","helm.sh/chart":"ix-chart-2403.0.0"}

Eventually I figured out that I just needed to set a NodePort on the service and it would work. How? No idea! Some magic with kube-router5, probably. Either way, I eventually got my Deployment going, copied here for posterity:

---
apiVersion: v1
kind: Namespace
metadata:
  name: transmission
---
apiVersion: v1
kind: Secret
metadata:
  namespace: transmission
  name: wireguard-private
stringData:
  WIREGUARD_PRIVATE_KEY: <nice try bucko>
---
apiVersion: v1
kind: Service
metadata:
  name: transmission
  namespace: transmission
spec:
  selector:
    app.kubernetes.io/name: ellie-transmission
  type: NodePort
  ports:
  - protocol: TCP
    port: 9091
    nodePort: 9091
    targetPort: web
    name: web
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: transmission
  name: transmission
  labels:
    app.kubernetes.io/name: ellie-transmission
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ellie-transmission
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ellie-transmission
    spec:
      containers:
      - name: transmission
        image: linuxserver/transmission:4.0.5
        ports:
        - containerPort: 9091
          name: web
        - containerPort: 51413
          name: torrent-tcp
        - containerPort: 51413
          name: torrent-udp
          protocol: UDP
        volumeMounts:
        - mountPath: /config
          name: config
        - mountPath: /downloads
          name: download-complete
      - name: vpn
        image: qmcgaw/gluetun
        env:
        - name: VPN_SERVICE_PROVIDER
          value: nordvpn
        - name: SERVER_COUNTRIES
          value: Canada
        - name: SERVER_CITIES
          value: Vancouver
        - name: VPN_TYPE
          value: wireguard
        - name: DNS_KEEP_NAMESERVER
          value: "on"
        envFrom:
        - secretRef:
            name: wireguard-private
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
      volumes:
        - name: config
          hostPath:
            path: /mnt/panini/ix-applications/releases/transmission/volumes/ix_volumes/config
            type: ""
          # emptyDir:
          #   sizeLimit: 500Mi
        - name: download-complete
          # emptyDir:
          #   sizeLimit: 1Gi
          hostPath:
            path: /mnt/panini/media/media
            type: ""

And after spending most of a day chasing down a typo in a port, I had my workload running smoothly. Of course, it still doesn't show up in TrueNAS. I have no idea how that would work (maybe the websocket?), and I don't even know if it'll survive a version bump! But it's there for now.

Conclusion

Now, Kubernetes isn't a horrible choice for this kind of work. Helm is a good templating system, even if I have tiller flashbacks. Using k3s makes… not no sense.

The thing is… I go out of my way to make sure my hobbies are as far from my work as possible. I write in Rust or Haskell, I use Nix, I do web "design." Weekends are not for work! Weekends are for silly projects and reading yuri. Instead I learned things I can use at my job. And for that, TrueNAS will never be forgiven.

At least now I don't need to remember the arguments to BSD sed.

Footnotes:

1

The way that works is fascinating, btw

2

I repeatedly screwed it up while writing this post

3

Under FreeBSD this was running on a Linux VM. I haven't figured out how to set up VMs on Scale yet; the bridging configuration is very "BYO."

4

WHY YOU WANT SIDECAR FOR KUBERNETES? IS NOT GOOD ENOUGH AS PROCURED FROM CORE INFRASTRUCTURE TEAM? YOU THINK NEEDS IMPROVEMENT?

5

Turns out it's IPVS. Today I learned!

Let's Make an Information Display Part 3: Deploying

[Part 1] [Previous]

The Development Environment

Fake API

Several of our gauges involve making real requests to APIs. Ordinarily that's not a problem, but when debugging or iterating you can run up your usage very quickly.

The solution? Fake data!

We could just use static data, but what's the fun in that? This way we can tell when the backend gets updated.

pub trait Mock
where
    Self: Sized,
{
    fn get_one(rng: &mut ThreadRng) -> Self;

    fn get() -> Self {
        Self::get_one(&mut rand::thread_rng())
    }

    fn get_several(rng: &mut ThreadRng) -> Vec<Self> {
        // Calls the free function `get_several`, not this method
        get_several(rng)
    }
}

Here's the example for BusArrival:

fn get_one(rng: &mut ThreadRng) -> Self {
    let arrival = Local::now() + Duration::minutes(rng.gen_range(0..60));

    BusArrival {
        arrival,
        live: rng.gen(),
    }
}

You might notice Mock::get_several calls a function also named get_several.

This is for types like BusArrival that need preprocessing:

fn get_several(rng: &mut ThreadRng) -> Vec<Self> {
    let mut arrivals = get_several(rng);
    arrivals.sort_by_key(|v: &BusArrival| v.arrival);
    arrivals
}

Traits in Rust often behave a lot like Objects, but here's one way they're very different: if my implementation overrides get_several, there's no way to call the default Mock implementation. By breaking the default out into a free function, we can call it and then add our additional logic.
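
The free get_several function itself isn't shown in the post; presumably it just builds a list by calling get_one a few times. A minimal sketch, with the count range as an assumption:

use rand::{rngs::ThreadRng, Rng};

// Hypothetical free helper: build a handful of mocks by calling
// Mock::get_one repeatedly. The 1..=5 count range is an assumption.
fn get_several<T: Mock>(rng: &mut ThreadRng) -> Vec<T> {
    let count = rng.gen_range(1..=5);
    (0..count).map(|_| T::get_one(rng)).collect()
}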

Serving the Mocks

We use a feature to enable these fakes:

[features]
fake = ["den-message/mock", "dep:rand"]

Then when we start up, we just generate our values and seed our cache:

#[cfg(feature = "fake")]
async fn init() {
    use den_message::*;
    use den_tail::cache::GaugeCache;

    let mut rng = rand::thread_rng();

    let vars = vec![
        GaugeUpdate::BusArrival(BusLine::get_several(&mut rng)),
        // etc
    ];

    for var in vars {
        GaugeCache::update_cache(var).await.unwrap();
    }
}

If we wanted, we could schedule updates to be randomly generated later too, just by calling init in a loop with an interval:

actix::spawn(async {
    let mut interval = actix::clock::interval(Duration::from_secs(1));
    loop {
        interval.tick().await;
        init().await;
    }
});

Trunk

Yew recommends trunk as a tool for development. It handles compiling, bundling assets, and serving WebAssembly applications. It even does automatic reloads when code changes.

Configuration, interestingly, comes in the form of an HTML file. Here's mine:

<!DOCTYPE html>
<html lang="en">
    <head>
        <link rel="stylesheet"
              href="https://fonts.googleapis.com/css?family=Overpass">
        <link data-trunk rel="css" href="static/den-head.css">
        <link data-trunk rel="copy-file" href="static/wifi_qr.png"
    </head>
    <body>
    </body>
</html>

I make use of cargo-make to run the application more easily.

Here I run the frontend:

[tasks.serve_head]
workspace = false
command = "trunk"
args = ["serve",  "den-head/index.html", "--port", "8090", "--public-url=/static"]

The workspace = false is because, by default, cargo-make will try to run serve_head in every component directory. Not what we want in this case.

And the backend:

[tasks.serve_tail]
workspace = false
env = {"RUST_LOG" = "den_tail=debug"}
command = "cargo"
args = ["run", "--features", "trunk-proxy,fake", "--bin", "den-tail"]

There's fake from before. trunk-proxy does what it sounds like: it passes requests for / on the backend through to Trunk.

Here's what the index function looks like:

#[get("/")]
async fn index(req: HttpRequest) -> Result<HttpResponse> {
    imp::static_("index.html", req).await
}

Where imp is one of two backends. When trunk-proxy is enabled, it uses actix-proxy and actix-ws-proxy1:

#[cfg(feature = "trunk-proxy")]
mod imp {
    pub(crate) async fn static_(path: &str, _req: HttpRequest) -> Result<HttpResponse> {
        use actix_proxy::IntoHttpResponse;

        let client = awc::Client::new();

        let url = format!("http://{}/static/{}", PROXY_HOST, path);
        log::warn!("proxying {}", url);
        Ok(client.get(url).send().await?.into_http_response())
    }
}

When it's not enabled, i.e. in production, it uses actix-files instead:

pub(crate) async fn static_(path: &str, req: HttpRequest) -> Result<HttpResponse> {
    Ok(NamedFile::open_async(format!("static/{}", path))
       .await?
       .into_response(&req))
}

The Backend Deploy

Let's talk about production.

When I first built this application, I deployed in a Modern Way. I had a Dockerfile, I pushed it to my Dockerhub account, and pulled it down to run it. But this had some annoying properties. For one, because it was public, I couldn't add any secrets to the file. And since the QR code for the wifi is secret I would've needed to generate the image at runtime instead of compile time.

Plus, it was just overkill. Instead, now I just build a tarball.

This uses cargo-make's @rust script runner (backed by rust-script) to create an archive, grab the files, and write them all out.

[tasks.tar]
dependencies =  ["build-all"]
workspace = false
script_runner = "@rust"
script = '''
//! ```cargo
//! [dependencies]
//! tar = "*"
//! flate2 = "1.0"
//! ```
fn main() -> std::io::Result<()> {
    use std::fs::File;
    use tar::Builder;
    use flate2::Compression;
    use flate2::write::GzEncoder;

    let file = File::create("den-tv.tar.gz")?;
    let gz = GzEncoder::new(file, Compression::best());
    let mut ar = Builder::new(gz);

    // Use the directory at one location, but insert it into the archive
    // with a different name.
    ar.append_dir_all("static", "den-head/dist")?;
    ar.append_path_with_name("target/release/den-tail", "den-tail")?;
    ar.into_inner()?.finish()?.sync_all()?;

    Ok(())
}
'''

Then to deploy it, I just use Ansible.

I copy the file over and extract it with unarchive:

- ansible.builtin.unarchive:
    copy: true
    src: '../den-tv/den-tv.tar.gz'
    owner: '{{ user.name }}'
    dest: '{{ dir.path }}'
  become: true
  register: archive

Set up a systemd service:

[Unit]
Description="den TV service"

[Service]
WorkingDirectory={{ dir.path }}
ExecStart={{ dir.path }}/den-tail
User={{ user.name }}
Environment=RUST_LOG=den_tail=debug

[Install]
WantedBy=multi-user.target

Then restart it:

- name: start service
  ansible.builtin.systemd_service:
    name: den-tv
    daemon-reload: "{{ unit.changed }}"
    enabled: true
    state: restarted
  become: true

It runs on a virtual machine on my NAS, so it's easily accessible over the network.

The Frontend Deploy

The frontend is served by, what else, a Raspberry Pi.

A photo of the den-tv display in situ
Figure 1: My phone really did not like taking this picture

I planned to use cage to automatically start a full-screened browser. But for whatever reason, on the version of Raspbian I'm running cage hard-crashes after a minute or two. Instead, I'm just using the default window manager and a full-screened Chrome2. I've got a wireless keyboard I can grab to make changes if need be, but it's been rock solid.

Automatic Reloads

There's one last trick: We know how to restart the backend when the code changes, but what about the frontend?

Take a look at build.rs from den-message:

const ENV_NAME: &str = "CARGO_MAKE_GIT_HEAD_LAST_COMMIT_HASH";

fn main() {
    println!("cargo:rerun-if-env-changed={}", ENV_NAME);
    let git_hash = std::env::var(ENV_NAME).unwrap_or_else(|_| "devel".to_string());
    println!("cargo:rustc-env=GIT_HASH={}", git_hash);
}

We use an environment variable exposed by cargo-make to capture the git hash. It's stored in den-message:

pub const VERSION: &str = env!("GIT_HASH");

When the backend server receives a new connection, it sends a hello message:

fn send_hello(ctx: &mut WebsocketContext<Self>) {
    let hello = &DenMessage::Hello {
        version: den_message::VERSION.to_string(),
    };

    match serde_json::to_string(hello) {
        Err(e) => error!("Failed to encode Hello: {:?}", e),
        Ok(msg) => ctx.text(msg),
    }
}

Because den-message is shared between the backend and frontend, it's also available on the websocket side. When we receive the Hello message, we check to see if it matches the version the webassembly was compiled with:

fn update(&mut self, ctx: &yew::Context<Self>, msg: Self::Message) -> bool {
    match msg {
        // snip
        Ok(DenMessage::Hello { version }) => {
            if version != den_message::VERSION {
                gloo_console::log!("reloading for version mismatch");
                let _ = window().location().reload();
            }
        }
    }
    true
}

If the backend sends a different version, the page knows to reload. Since the backend serves the frontend, the next reload will always have the newest version.

The coolest effect of this is that I can sit at the kitchen table and run ansible-playbook to ship a new version. Then a few seconds later, the screen on the other side of the table automagically refreshes and shows me my changes.

Pretty snazzy!

Conclusion

I hope you've enjoyed this series, or at least found it informative. This project is absolutely over-engineered, and over-complicated. It took me multiple weeks to build, but I learned a ton. Along the way my searches led me to a lot of random folks' blog posts about things they've done. I hope if nothing else, these posts show up in someone's search results and help them solve a problem.

Thanks for reading, and feel free to get in touch!

Footnotes:

1

I wrote this! I learned a lot about actix internals in the process. I'm still slightly annoyed at how short the solution was.

2

For whatever reason, the version of Firefox from Raspbian refuses to run webassembly.

Let's Make an Information Display Part 2: Frontend

Previous: Part One

So we have our data. We need some way to display it in a human-friendly format. Obviously I don't have anything against pure json, but it does not make for good information density.

If we're building a frontend application, the most obvious answer is Javascript. But I'm not going to be writing Javascript in my free time. That'd be like writing a Go backend: completely unbecoming.

What do we use instead? That was rhetorical, we're obviously using rust.

There are a number of front-end Rust libraries, but the three I considered were dioxus, percy, and yew. I'd previously used Yew for ezeerust, a web frontend for a Z80 emulator I wrote. The others I just got from various blog posts other people have written. Since this is a purely personal project, I engaged in some vibes-based engineering.

And by that I mean I started writing this thing in September and have no idea why I picked what I did. Yew it is!

Connect Four

Our data is waiting for us on the other end of the websocket, so the first thing to do is connect.

fn connect(&self, ctx: &yew::Context<Self>) -> Result<(), JsError> {
    let ws = WebSocket::open(format!("ws://{}/ws", get_host()).as_str())?;
    ctx.link()
        .send_stream(ws.err_into().and_then(parse_ws));
    Ok(())
}

Since this is a wasm app intended to run in a browser, we're using gloo_net for websockets.

But already this looks pretty familiar!

Yew and Me

Yew is similar to React.js, which means it's declarative. Where a vanilla Javascript app might say "change #busupdate .route26 to this text," you instead say "The bus route should look like this" and the system figures out how to efficiently make changes.

This looks remarkably similar to actix! We've got a Context and we're going to send a stream somewhere.

Here's the signature for send_stream we're calling here:

pub fn send_stream<S, M>(&self, stream: S)
where
    M: Into<COMP::Message>,
    S: Stream<Item = M> + 'static,

and here's add_stream from actix:

fn add_stream<S>(&mut self, fut: S) -> SpawnHandle
where
    S: Stream + 'static,
    A: StreamHandler<S::Item>,

But while an Actix application features a collection of quasi-autonomous Actors sending each other async messages, Yew applications are built out of a tree of Component objects.

We'll have one Application component that creates lots of little Gauge components, and they'll all create smaller components still. Unlike an Actor, a Component only handles one kind of message. Let's look at the trait:

pub trait Component: Sized + 'static {
  type Message: 'static;
  type Properties: Properties;

  // Required methods
  fn create(ctx: &Context<Self>) -> Self;
  fn view(&self, ctx: &Context<Self>) -> Html;
}

Usually Component::Message would be an enum, but in our case we only care about the results of parsing the websocket inputs we hooked up with send_stream:

type Message = Result<DenMessage, DecodeError>;

Then we just need an update handler:

fn update(&mut self, ctx: &yew::Context<Self>, msg: Self::Message) -> bool {
    match msg {
        Ok(DenMessage::Update(update)) => self.handle_gauge(update),
        // stay tuned for part 3!
    }
}

All handle_gauge will do is update the application's fields, which look like this:

pub struct App {
    bus_arrivals: LastUpdate<Vec<BusLine>>,
    //snip
}
impl App {
    fn handle_gauge(&mut self, msg: GaugeUpdate) {
        match msg {
            GaugeUpdate::BusArrival(bus) => self.bus_arrivals.set(bus),
            // snip
        }
    }
}

LastUpdate is a wrapper that provides a little housekeeping, specifically tracking when a field was last updated.

This information is used to detect stale data. If a gauge hasn't been updated in a while, it'll visually dim itself so we know not to trust it.
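
The post never shows LastUpdate, but based on how it's used (set, get, and stale below), a sketch might look like this. The five-minute threshold is an assumption:

use std::rc::Rc;
use chrono::{DateTime, Duration, Local};

// A sketch of what LastUpdate might look like: a value plus the
// timestamp of its last update.
pub struct LastUpdate<T> {
    value: Rc<T>,
    updated: DateTime<Local>,
}

impl<T> LastUpdate<T> {
    pub fn set(&mut self, value: T) {
        self.value = Rc::new(value);
        self.updated = Local::now();
    }

    pub fn get(&self) -> Rc<T> {
        Rc::clone(&self.value)
    }

    // Whether the data is old enough to dim the gauge.
    // The five-minute threshold is an assumption.
    pub fn stale(&self) -> bool {
        Local::now() - self.updated > Duration::minutes(5)
    }
}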

Render Unto Caesar

Now that we have our data secured, we need to display it! Somehow this internal state needs to become HTML. And Yew has a very nifty mechanism for doing this: the html!() macro.

Similar to React's JSX, this lets us write natural-ish HTML. Here's the snippet for the bus updates:

fn view(&self, _ctx: &yew::Context<Self>) -> yew::Html {
    html! {
        <main class={classes!("container")}>
            <Gauge slug={"bus"} stale={self.bus_arrivals.stale()}>
                <BusGauge routes={self.bus_arrivals.get()} />
            </Gauge>
            // snip
        </main>
    }
}

Like with JSX, lowercase tags are just plain HTML, in this case a semantic HTML tag. But Gauge and BusGauge represent capital-C Components.

Gauge is pretty simple, basically "wrap in an <article> with these classes, indicating if it's stale."

#[derive(Properties, PartialEq)]
pub struct GaugeProps {
    pub stale: bool,
    pub slug: &'static str,
    pub children: Children,
}

#[function_component(Gauge)]
pub fn gauge(props: &GaugeProps) -> Html {
    let cls = classes!("gauge", props.slug, props.stale.then_some("stale"));
    html! {
        <div class={cls}>
            <h2>{props.slug}</h2>
            { props.children.clone() }
        </div>
    }
}

We can see the stale and slug arguments that correspond to attributes we passed in.

Children is a special value that allows us to wrap other tags in <Gauge></Gauge> tags. Otherwise, we'd have <Gauge /> and it wouldn't be nearly as expressive.

But this is a pretty simple component. BusGauge is where it gets interesting.

Gauge your Interest

#[derive(Debug, Clone, Properties, PartialEq)]
pub struct BusProps {
    pub routes: Rc<Vec<BusLine>>,
}

pub struct BusGauge;

Yew components are supposed to store most of their data in Properties. When the Properties change, the component gets re-rendered.

This is good! That's how information trickles down the component graph from the root. The BusGauge struct will exist for the life of the application. If we stored information there at create time, it'd never be updated when App sent us new data.

The Rc there is because Properties are cloned very frequently, so it's a good idea to make those clones cheap.

Let's see what the view method looks like.

fn view(&self, ctx: &yew::Context<Self>) -> Html {
    html! {
        <ul>
            {for ctx.props().routes.iter().map(bus_route)}
        </ul>
    }
}

Fair enough. bus_route similarly maps down to individual_arrival(), which handles the numbers. That's where the interesting stuff happens. Let's look back at BusArrival:

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct BusArrival {
    pub arrival: chrono::DateTime<Local>,
    pub live: bool,
}

arrival is an absolute DateTime, but we know our display is just minutes to arrival. So we calculate that.

let mins = (arrival.arrival - Local::now()).num_minutes();

What happens if mins is zero or less? We skip it, of course. individual_arrival returns Option<Html> instead of Html, because a departure time of -1 isn't too useful. But otherwise, we render it out.

(mins > 0).then(|| {
    html! {
        <li>
            { format!("{}", mins) }
            if arrival.live {
                <sup class={classes!("arrival-live")}>{"🛜"}</sup>
            }
        </li>
    }
})

bool::then is a handy method that returns Some if true, or None if false. Could this be an if? Sure! But I really like chains like this. Debate me in the comments1. Then we just write out the minutes, with the little icon if it's live. It probably should be a fancy SVG icon, but I'm a backend person at heart, cut me some slack.
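
For comparison, here's the same shape with an explicit if, using a plain String to stand in for the Html so the sketch is self-contained:

fn arrival_label(mins: i64) -> Option<String> {
    // Equivalent to: (mins > 0).then(|| format!("{} min", mins))
    if mins > 0 {
        Some(format!("{} min", mins))
    } else {
        None
    }
}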

We're all done, though! We've rendered our gauge!

…once.

Nothing But Time

There are actually two kinds of Component you can write in Yew: function components (like Gauge) and struct components (like BusGauge). The function components have a lot less boilerplate, but the struct components give you a lot more control over the lifecycle.

We're using that here. Here's BusGauge::create, called when our component is initialized:

fn create(ctx: &yew::Context<Self>) -> Self {
    let _ = {
        let link = ctx.link().clone();
        Interval::new(30_000, move || link.send_message(()))
    }
    .forget();

    Self
}

First, we clone a link to ourselves. Then, every 30 seconds, we send ourselves an empty message. Interval is from the gloo_timers crate, and by calling forget we ensure it will run indefinitely.

The content doesn't matter: any message will call Component::update.

fn update(&mut self, _ctx: &yew::Context<Self>, _msg: Self::Message) -> bool {
    true
}

update returns a boolean, which represents whether we should re-render our element. By doing so unconditionally, we re-render our gauge every 30 seconds. And because the minute offset is calculated at render-time, not on the backend, it'll never be more than 30 seconds out of date.

I actually use this for a World Clock gauge too. By setting the update interval to every second and sticking a Local::now() in the render, you've got a nice little clock that never goes stale.
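
Here's a sketch of that clock idea as a minimal struct component (the names and the time format are assumptions, not the real code):

use chrono::Local;
use gloo_timers::callback::Interval;
use yew::prelude::*;

pub struct ClockGauge;

impl Component for ClockGauge {
    type Message = ();
    type Properties = ();

    fn create(ctx: &Context<Self>) -> Self {
        // Tick once a second; any message triggers update()
        let link = ctx.link().clone();
        Interval::new(1_000, move || link.send_message(())).forget();
        Self
    }

    fn update(&mut self, _ctx: &Context<Self>, _msg: ()) -> bool {
        true // unconditionally re-render
    }

    fn view(&self, _ctx: &Context<Self>) -> Html {
        // The time is computed at render time, so it's never stale
        html! { <p>{ Local::now().format("%H:%M:%S").to_string() }</p> }
    }
}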

CSS

Here's the part of web development that feels the most black magic to me. I've got to turn this:

An unstyled bold header that says "bus", and then a bulleted list with arrival times
Figure 1: web 1.0-tastic

Into something that conveys information usefully.

Now, let's do some expectation setting. I picked colours mostly based on named HTML colours. This is not going to win any design awards. But it will, hopefully, be legible.

Let's get started!

Variable Speed

Did you know CSS has variables now?? Check this out:

.bus {
    --bg: aliceblue;
    --2nd: lightblue;
}

.gauge {
    background-color: var(--bg);
}

Did you think I was joking about named colours? I love named colours.

The same unstyled headers, but with pastel colours
Figure 2: who doesn't love pastels

The Grid… a digital frontier…

The biggest thing I learned how to use was the Grid layout.

I've been doing web development since rounded corners required PNGs, so this feels like the deep magic.

Here's my template:

.container {
    display: grid;
    gap: .5em;

    grid-template-areas:
        "weather  weather weather weather"
        "calendar trash   wifi    bus"
        "calendar trash   .       clock";
}

And look what this makes:

A grid of coloured rectangles full of text
Figure 3: this already blew my mind

But that's not the layout we specified. So we give them names:

.bus {
    grid-area: bus;
    --bg: aliceblue;
    --2nd: lightblue;
}
The same grid but arranged with a grey box at the top
Figure 4: a shape emerges

This is starting to look right, but it's not really following our arrangement.

There's a bunch of new units we have access to. No mouse means no scroll bars, so we'll use vh and vw, viewport height and width.

width: 100vw; /* 100% viewport width */
height: 100vh; /* 100% viewport height */

And we can specify the sizes we want in terms of fr units.

grid-template-rows: .2fr 1fr .8fr;

This roughly means "10%, 50%, 40%." The fr values are a ratio, rather than absolute values.

The grid again, but with a grey box at the top, then a grid of eight boxes
Figure 5: now we're talking

And from there, it's just some basic styling:

/* reset the default padding and margins from ul */
ul {
  padding: 0;
  margin: 0;
}

/* we don't use the headings */
h2,
h3 {
    display: none;
}

/* make a little box with the route number */
.bus .route-line {
    background-color: var(--2nd);
    padding: .5em;
    list-style: none;
    text-align: center;
}

/* get out of here bullets */
.bus li {
    list-style: none;
    margin: 0.5em;
}

/* I can deny it no longer! ...i am small */
.bus .arrival-live {
    font-size: var(--font-tiny);
}
The bus gauge with [26] in blue, and the background in lighter blue. several numbers are listed below.
Figure 6: Almost ready

One of the last changes I made was for legibility. The display we're using is only 720p, and I wanted to be able to see it from a distance. For the font, I went with Overpass, based on the venerable Highway Gothic used on American highway signs2.

The other thing was slightly bolding everything:

:root {
    font-weight: 600;
}

The End Result

A grid of several coloured boxes. Across the top is the weather, then across the bottom is a calendar, emoji representing trash, a QR code for our wifi network, bus arrival times, and a clock showing the time in Sydney and New York
Figure 7: Tada 🎉

Next Time

Deployment! I'll walk through how I deployed the client and the server, plus the development tooling I built along the way.

Footnotes:

1

this blog does not have comments

2

this is an anti-Clearview house

Let's Make an Information Display Part 1: Backend

About this Project

Our house in Vancouver came with a TV pre-installed in the kitchen, right above the fridge. This was, originally, for monitoring all of the security cameras. But surveillance cameras show the exact same thing, day in and day out. What if, instead, I aggregated a bunch of different datums our house cares about?

A former housemate had a little web app that displayed the weather, and when the next trash day was. But I was feeling a little more ambitious than that, and I never miss an opportunity to over-engineer something.

A mockup of an information display, showing weather, trash pickup, planes, and more
Figure 1: the initial mockup

There's two important things to notice here. One, planes are cool. Two, both the ADSB and the Slack message want to be near-real-time1, which means this can't just be a server side application.

The First Architecture

Client Side Best Side

My first idea was to have everything happen on the client side. There's a half dozen or so APIs that we need to aggregate, but if this is only going to run in one or two places, why have a backend at all?

It turns out there are a couple of reasons. One is CORS, an annoying but necessary security feature that won't let you just call any old URL you want. If you control the endpoint you can add a couple headers that say "Hey, whatever, do what you want." But I don't control a lot of the endpoints, so I was going to need a proxy anyway.
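
For illustration, here's what those "do what you want" headers look like on a server you do control, using the actix-cors crate (a sketch; this project went the proxy route instead):

use actix_cors::Cors;
use actix_web::{App, HttpServer};

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        // Cors::permissive() sends Access-Control-Allow-Origin and
        // friends set to their most relaxed values
        App::new().wrap(Cors::permissive())
    })
    .bind(("127.0.0.1", 8080))?
    .run()
    .await
}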

The other reason was that making requests from inside Webassembly was just a little awkward.

For example, here's how reqwest, a common Rust library, makes a request:

let client = reqwest::Client::new();
let res = client.post("http://httpbin.org/post")
    .body("Some text")
    .send()
    .await
    .unwrap();

Easy, right?

But on the Webassembly side, you can't just make arbitrary socket calls. You need to use the browser's built-in XHR methods.

Request::get("/path")
    .send()
    .await
    .unwrap();

And while that interface is similar, pretty much no library you'd want to use comes with built-in support. You lose access to a lot of the ecosystem advantages Rust usually gives you. Before I gave up, for example, I wrote my own Slack client because none of the existing ones supported WASM.

You'd expect Rust would have a… trait HttpClient with a bunch of implementations, but no such luck. Maybe once async traits stabilise.

The architecture I eventually settled on was more traditional, consisting of three crates:

  • den-tail is the backend, running in a VM on my NAS
  • den-head is the frontend, running in webassembly in a browser
  • den-message represents the JSON-encoded wire format2, transmitted over websockets from den-tail to den-head.

In this post we'll discuss the first one, and part 2 will introduce our frontend.

The Backend Situation

Here's the basic problem statement: the backend needs to fetch updates from a lot of different sources, on a lot of different schedules. Bus departures should refresh every minute or so, but the trash schedule needs checking once a day at most.

And all that data needed to flow back over the websocket to the web frontend. Are you thinking what I'm thinking?

Actor model!!

A directed graph showing data flow, from GaugeUpdaters to GaugeCache to WsActor to den-tail
Figure 2: the datums must flow

There's a neat actor model library called actix. And even better for my purposes, it's mostly a web framework that happens to use actors. I don't have to choose a web framework!

A basic actor looks like this:

pub struct UpdateActor {
    updater: Rc<dyn GaugeUpdater>,
    name: &'static str,
}

impl Actor for UpdateActor {
    type Context = Context<Self>;
}

#[derive(Message)]
#[rtype(result = "()")]
pub struct RequestUpdate;

impl Handler<RequestUpdate> for UpdateActor {
    type Result = ();

    fn handle(&mut self, _msg: RequestUpdate, ctx: &mut Self::Context) -> Self::Result {
        todo!()
    }
}

An actor can receive several kinds of messages, which are just structs or enums. The compiler even checks that you're sending messages to actors that understand them!

     --> den-tail/src/ws.rs:37:12
    |
37  |     a.send(MyStruct2);
    |       ---- ^^^^^^^^^ the trait `actix::Handler<MyStruct2>` is not implemented for `WsActor`
    |       |
    |       required by a bound introduced by this call
    |
    = help: the following other types implement trait `actix::Handler<M>`:
              <WsActor as actix::Handler<GaugeUpdateMessage>>
              <WsActor as actix::Handler<MyStruct>>

The Data Must Flow

Let's see how data actually moves through the system.

To do that, we'll take a look at the Bus Arrival. I call the individual displays "Gauges." Here's what the bus gauge looks like:

A title (46) with several numbers below it (9, 18, 55). The 9 has a wifi icon next to it
Figure 3: The Bus Gauge (with fake data)

This means the 46 bus has arrivals in 9, 18, and 55 minutes. The 9's arrival is real-time; the others are just from the schedule.

First question: How do we get this information?

The API

I live in Vancouver, and TransLink has a real-time API we can use. I looked into using Transit's API, but it's limited to 1500 calls a month. That's plenty for an app used once a day, but it works out to about 2 an hour for an always-on application. No good!

There wasn't a crate, but it's not too complicated an API:

[
  {
    "RouteNo": "049",
    "RouteName": "METROTOWN STN/DUNBAR LOOP/UBC",
    "Direction": "WEST",
    "RouteMap": {
      "Href": "https://nb.translink.ca/geodata/049.kmz"
    },
    "Schedules": [
      {
        "Pattern": "WB1",
        "Destination": "UBC",
        "ExpectedLeaveTime": "9:49pm 2024-01-03",
        "ExpectedCountdown": 11,
        "ScheduleStatus": "-",
        "CancelledTrip": false,
        "CancelledStop": false,
        "AddedTrip": false,
        "AddedStop": false,
        "LastUpdate": "09:37:09 pm"
      }
    ]
  }
]

There's a lot, but we don't care about most of it. Just the route number, the expected leave time, and the schedule status.

Rust has a really excellent SERialize/DEserialize library called, appropriately, serde.

It's very simple to use! You just create a normal struct, add a few annotations when the JSON object doesn't quite match Rust's conventions, and derive Deserialize. Magic!

Here's our struct:

#[derive(Deserialize, Debug, Clone)]
struct Route {
    #[serde(rename = "RouteNo")]
    route_no: String,

    #[serde(rename = "Schedules")]
    schedules: Vec<EstimatedArrival>,
}

#[derive(Deserialize, Debug, Clone)]
struct EstimatedArrival {
    #[serde(rename = "ExpectedLeaveTime")]
    expected_leave_time: String,
    #[serde(rename = "ScheduleStatus")]
    schedule_status: String,
}

Then we have to retrieve it. There are a lot of Rust HTTP clients, and I initially picked reqwest. But after I settled on actix I moved everything to awc.

async fn fetch(&self) -> Result<Vec<Route>> {
    let url = format!(
        "https://api.translink.ca/rttiapi/v1/stops/{}/estimates",
        self.stop
    );
    Ok(self
        .client
        .get(url)
        .query(&[("apikey", &self.api_token)])?
        .insert_header((ACCEPT, "application/json"))
        .send()
        .await?
        .json()
        .await?)
}

There's a neat trick here with the Rust type system: because we know that we're returning a Vec of Route structs, we don't need to tell json() what format to use! It'll automagically call Vec<Route>::deserialize.

There's a little more post-processing. We parse the date (annoyingly complicated, since the date format changes) and whether it's live or not3.
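
That post-processing isn't shown; here's a sketch of what it might look like, assuming only the "9:49pm 2024-01-03" shape from the sample above (the real code handles more formats):

use chrono::{DateTime, Local, NaiveDateTime, TimeZone};

// Hypothetical helpers, not the real implementation.
fn parse_leave_time(raw: &str) -> Option<DateTime<Local>> {
    // e.g. "9:49pm 2024-01-03"
    let naive = NaiveDateTime::parse_from_str(raw, "%l:%M%P %Y-%m-%d").ok()?;
    Local.from_local_datetime(&naive).single()
}

fn is_live(schedule_status: &str) -> bool {
    // Per footnote 3: an undocumented single character,
    // guessed to mean "live" for these values
    matches!(schedule_status, "+" | "-" | " ")
}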

Here's the end result, from den-message. This is the wire format that will eventually be sent to the frontend

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct BusLine {
    pub bus_line: String,
    pub arrivals: Vec<BusArrival>,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct BusArrival {
    pub arrival: chrono::DateTime<Local>,
    pub live: bool,
}

The Interface

Obviously this isn't the only gauge we need to update. But the logic for every one is the same: retrieve an update, do some parsing, return it. And when you need a lot of implementations of the same thing, you make a trait!

#[async_trait(?Send)]
pub trait GaugeUpdater {
    async fn update(&self) -> Result<GaugeUpdate>;
}

Note that while it currently uses the async_trait crate, it won't need to for long!

The ?Send is a consequence of Actix. Send is a Rust trait meaning "safe to send between threads." But making things Send usually comes with some overhead, involving a mutex or some other synchronization primitive. Actix's own future wrapper, ActorFuture, doesn't require Send. And since awc is mainly for the actix ecosystem, it isn't Send either. Until I figured out the ?Send, I got a lot of gnarly error messages, and I stuck with reqwest (which is Send). But eventually the async_trait docs gave me the answer.

Anyway! GaugeUpdate is an enum, with variants for all the gauges' associated data. Every kind of data we collect can be turned into a GaugeUpdate:

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub enum GaugeUpdate {
    IpAddress(IpAddress),
    TrashDay(TrashDay),
    CalendarEvents(Vec<CalendarEvent>),
    SlackMessages(Vec<SlackMessage>),
    Weather(Weather),
    BusArrival(Vec<BusLine>),
}

There's our BusLine from above!

Updater

Since we're using an actor model, every updater is going to get its own actor. But since we've got a trait, we can use a single actor type for all of them.

pub struct UpdateActor {
    // Not having UpdateActor be typed makes it easier to construct collections
    updater: Rc<dyn GaugeUpdater>,
    name: &'static str,
}

Probably both the Rc and the dyn could be removed with some clever typings, but this works well enough.

Since actors are all about messages, we'll use an empty unit-like struct to indicate we want an update.

#[derive(Message)]
#[rtype(result = "()")]
pub struct RequestUpdate;

The actor could make its own schedule, but instead we use an UpdateSupervisor to periodically send RequestUpdate to every actor, using the run_interval method on an actor's context.

The actors are all run from a Supervisor, so they'll be restarted in the face of any Err results. They can't do anything about panics, so we are extra diligent to not call unwrap or expect.
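
The supervisor itself isn't shown; here's a sketch of the scheduling half, reusing the UpdateActor and RequestUpdate types from above (the field name and the 60-second period are assumptions):

use std::time::Duration;
use actix::prelude::*;

pub struct UpdateSupervisor {
    updaters: Vec<Addr<UpdateActor>>,
}

impl Actor for UpdateSupervisor {
    type Context = Context<Self>;

    fn started(&mut self, ctx: &mut Self::Context) {
        // Nudge every updater on a fixed schedule
        ctx.run_interval(Duration::from_secs(60), |this, _ctx| {
            for updater in &this.updaters {
                updater.do_send(RequestUpdate);
            }
        });
    }
}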

Cache Money

Let's go back to UpdateActor and see how it handles those RequestUpdate messages.

impl Handler<RequestUpdate> for UpdateActor {
    type Result = ();

    fn handle(&mut self, _msg: RequestUpdate, ctx: &mut Self::Context) -> Self::Result {
        let updater = self.updater.clone();
        let name = self.name;
        ctx.spawn(
            async move { updater.update().await }
                .and_then(GaugeCache::update_cache)
                .map_ok_or_else(move |e| error!("Error running {}: {}", name, e), |_| ())
                .into_actor(self),
        );
    }
}

There's some housekeeping to appease the almighty borrow checker, but in essence we call updater.update().await and pass the result to GaugeCache::update_cache.

What's GaugeCache::update_cache?

I don't know if this is a common pattern in Actix, but in Erlang it's not typical to send messages directly to actors. Instead, there will be functions that send the appropriate messages for you:

-spec attach(Node) -> 'already_attached' | 'attached' when
      Node :: node().
attach(Node) ->
    gen_server:call({global, pool_master}, {attach, Node}).

GaugeCache::update_cache performs a similar task. It looks up the running cache instance from the registry (which will start it if necessary). From there, it sends the update in a format the cache can understand.

pub async fn update_cache(msg: GaugeUpdate) -> crate::Result<()> {
    Self::from_registry()
        .try_send(GaugeUpdateMessage(msg))
        .map_err(|_| crate::Error::ActixSendError)
}

Let's take a look at the cache actor. It's the brains of this whole operation.

pub struct GaugeCache {
    gauges: HashMap<UpdateKind, GaugeUpdate>,
    clients: HashSet<Recipient<GaugeUpdateMessage>>,
}

gauges stores the most recent update of every kind we've received. clients represents outgoing websocket connections, which we'll get to.
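
UpdateKind isn't shown in the post; presumably it's just a key type with one variant per GaugeUpdate variant. A sketch:

#[derive(PartialEq, Eq, Hash, Clone, Copy)]
pub enum UpdateKind {
    IpAddress,
    TrashDay,
    CalendarEvents,
    SlackMessages,
    Weather,
    BusArrival,
}

impl UpdateKind {
    // Map an update to its kind, so the cache keeps one entry per gauge
    pub fn of(update: &GaugeUpdate) -> Self {
        match update {
            GaugeUpdate::IpAddress(_) => UpdateKind::IpAddress,
            GaugeUpdate::TrashDay(_) => UpdateKind::TrashDay,
            GaugeUpdate::CalendarEvents(_) => UpdateKind::CalendarEvents,
            GaugeUpdate::SlackMessages(_) => UpdateKind::SlackMessages,
            GaugeUpdate::Weather(_) => UpdateKind::Weather,
            GaugeUpdate::BusArrival(_) => UpdateKind::BusArrival,
        }
    }
}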

Let's see what happens when we receive a GaugeUpdateMessage:

#[derive(Message, Clone)]
#[rtype(result = "()")]
pub struct GaugeUpdateMessage(pub GaugeUpdate);

impl Handler<GaugeUpdateMessage> for GaugeCache {
    type Result = ();

    fn handle(&mut self, msg: GaugeUpdateMessage, _ctx: &mut Self::Context) -> Self::Result {
        // Store our update
        self.receive(UpdateKind::of(&msg.0), msg.0.clone());
        // Send it to everyone subscribed
        self.clients.iter().for_each(|v| v.do_send(msg.clone()));
    }
}

Seems simple enough. But where did those clients come from? A Connect message, of course!

#[derive(Message)]
#[rtype(result = "()")]
struct Connect(Recipient<GaugeUpdateMessage>);

And what's that handler look like?

impl Handler<Connect> for GaugeCache {
    type Result = ();

    fn handle(&mut self, msg: Connect, _ctx: &mut Self::Context) -> Self::Result {
        self.clients.insert(msg.0.clone());
        self.gauges
            .values()
            .cloned()
            .map(GaugeUpdateMessage)
            .for_each(|v| msg.0.do_send(v));
    }
}

First, we store our client in the client list so it can receive future updates. Then, we send a catch-up: the most recent messages of every kind. That way, a reconnecting client doesn't need to wait for fresh updates to come down the pipe. Some of the less time-critical updaters only run once a day, so they'd be waiting a while!

Web Sock It To Me

Actix handles websockets with the actix-web-actors crate. You'd think all of actix-web would be web-actors, but I guess not. Regardless, it plugs in nicely to our existing menagerie of actors.

The way this works is actually pretty interesting.

First of all, for incoming messages, we use a StreamHandler. Where a regular Handler handles a single message, StreamHandler works with a Stream of messages.

pub trait StreamHandler<I>
where
    Self: Actor,
{
    fn handle(&mut self, item: I, ctx: &mut Self::Context);

    //snip
}

For our implementation, there aren't many incoming messages we care about:

impl StreamHandler<Result<ws::Message, ws::ProtocolError>> for WsActor {
    fn handle(&mut self, item: Result<ws::Message, ws::ProtocolError>, ctx: &mut Self::Context) {
        match item {
            Ok(ws::Message::Ping(ping)) => ctx.pong(&ping),
            Ok(ws::Message::Close(c)) => {
                info!("Closing connection: {:?}", c);
                ctx.close(c)
            }
            // error handling omitted
        }
    }
}

We close gracefully, respond to pings, and handle errors.

But what about sending messages back down the websocket?

The key is to use the WebsocketContext instead of the regular Actix Context.

impl Actor for WsActor {
    type Context = WebsocketContext<Self>;

    // snip
}

How does this work? Let's take a look at WebsocketContext::create's signature

pub fn create<S>(actor: A, stream: S) -> impl Stream<Item = Result<Bytes, Error>>
where
    A: StreamHandler<Result<Message, ProtocolError>>,
    S: Stream<Item = Result<Bytes, PayloadError>> + 'static,

We take an Actor (which must be a StreamHandler) and a Stream. The Stream is over Bytes, which will be decoded into the Message we expect. It's designed to take the web::Payload actix-web uses.

But what's that return type? Normally we'd expect a Self to be returned, but instead we get a stream of Bytes.

That stream is how outgoing messages get sent to the client. When you call WebsocketContext::text to send a message, behind the scenes it's enqueueing a message which ultimately ends up in that stream.

Here's how we use it to send GaugeUpdateMessages

impl Handler<GaugeUpdateMessage> for WsActor {
    type Result = ();

    fn handle(&mut self, msg: GaugeUpdateMessage, ctx: &mut Self::Context) -> Self::Result {
        match serde_json::to_string(&DenMessage::Update(msg.0)) {
            Err(e) => error!("Failed to encode payload: {:?}", e),
            Ok(msg) => ctx.text(msg),
        };
    }
}

(We'll get to DenMessage in a later post, but for now it's just a wrapper enum around GaugeUpdate.)

Handle is synchronous, so ctx.text can't wait for a message to be sent. Instead, it gets added to WebsocketContext's internal queue, which is eventually sent back to the client by the whims of Actix's scheduler.

How does the cache know to send us GaugeUpdateMessage? We need to subscribe to the cache. If we look at the Actor trait, we can see that there are some lifecycle hooks we can use:

pub trait Actor: Sized + Unpin + 'static {
  type Context: ActorContext;

  fn started(&mut self, ctx: &mut Self::Context) { ... }
  fn stopped(&mut self, ctx: &mut Self::Context) { ... }

  // snip
}

So these just need to be hooked up to the GaugeCache.

The signature looks like this:

pub fn connect(r: Recipient<GaugeUpdateMessage>) -> crate::Result<()>

where Recipient means "any actor that can receive a GaugeUpdateMessage." In practice this will always be the same kind of actor, but there's no reason to hardcode that. This constrains us to only sending GaugeUpdateMessage, instead of every message we've got a Handler for, but that's okay.
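
The body isn't shown, but presumably it mirrors update_cache from earlier; a sketch:

pub fn connect(r: Recipient<GaugeUpdateMessage>) -> crate::Result<()> {
    // Look up (or start) the cache in the registry, then subscribe
    Self::from_registry()
        .try_send(Connect(r))
        .map_err(|_| crate::Error::ActixSendError)
}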

Wiring it up gets us this:

fn started(&mut self, ctx: &mut Self::Context) {
    if let Err(e) = GaugeCache::connect(ctx.address().recipient()) {
        error!("Failed to connect to cache: {:?}", e);
        ctx.stop();
    }
}

There's a similar stopped method so we don't send messages to a client not listening (there's a metaphor there, probably).

Now all that's left is to actually run actix! Their documentation is very good, this is all we need:

#[get("/ws")]
async fn websocket(req: HttpRequest, stream: web::Payload) -> Result<HttpResponse> {
    Ok(ws::start(WsActor, &req, stream)?)
}

There's the incoming stream for upstream Websocket messages. ws::start will call HttpResponseBuilder::streaming to stream the downstream half. And that's the end of the backend!

A diagram of message flows between different actors in the system
Figure 4: this chart isn't useful but i like graphs

Our bus arrival is traveling over the websocket connection. What does that wire format look like? I'm not actually sure! The beauty of using Rust on both ends is that we just need to make sure serialisation is reversible. The libraries will handle the rest.
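
It's easy to spot-check that property with the BusArrival type from earlier (a sketch, not a test from the project):

use chrono::Local;

fn main() {
    // Serialise and deserialise must be inverses
    let original = BusArrival { arrival: Local::now(), live: true };
    let json = serde_json::to_string(&original).unwrap();
    let parsed: BusArrival = serde_json::from_str(&json).unwrap();
    assert_eq!(original, parsed);
}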

Next Time

We'll build out the frontend and actually use this data we fired down the pipe!

Part 2

Footnotes:

1

I haven't actually implemented either of these. Whoops.

2

I briefly considered using gRPC, but those same restrictions raised their heads: browsers don't provide low-enough-level socket access for gRPC to work without a weird proxy.

3

this is a completely undocumented single character. I have assumed "+", "-", and " " are live, mostly based on guesses.

You Gotta Zag On Em

The Good

The ! error type is really nifty

So in Rust, if you want to indicate a function can fail, it returns a result type:

fn open(path: &str) -> Result<File, std::io::Error>

If you're going to frequently use a single type, it's common to make your own "result" type that includes this:

pub type Result<T> = std::result::Result<T, std::io::Error>;

fn open(path: &str) -> Result<File>

and this works well enough. There's even a handy syntax for unpacking result values:

fn get_data() -> Result<String, std::io::Error> {
    let f = open("file")?;
    f.read_all()
}

This little ? will get desugared (expanded) to something like:

fn get_data() -> Result<String, std::io::Error> {
    let f = match open("file") {
        Ok(v) => v,
        Err(e) => return Err(e.into()),
    };
    f.read_all()
}

The e.into() is important because it means the error returned by open doesn't have to be exactly std::io::Error, just anything that can convert into it.

Rust's use of .into() and its reciprocal, .from() is extremely powerful. But without a library, this can get very verbose.

For example, say you have a function that could return two kinds of error. The typical way to encapsulate this is an enum, which in Rust behaves a lot like a union would in C:

pub enum Error {
    ConnectionError(somelib::ConnectionError),
    ParseError(someotherlib::ParseError),
}

but you don't get those From implementations for free. You'd need to write them yourself:

impl From<somelib::ConnectionError> for Error {
    fn from(value: somelib::ConnectionError) -> Self {
        Error::ConnectionError(value)
    }
}


impl From<someotherlib::ParseError> for Error {
    fn from(value: someotherlib::ParseError) -> Self {
        Error::ParseError(value)
    }
}

but this gets tedious quickly. The Rust solution is a library like thiserror, which lets you write something like this:

use thiserror::Error;

#[derive(Error, Debug)]
pub enum Error {
    #[error(transparent)]
    ConnectionError(#[from] somelib::ConnectionError),
    #[error(transparent)]
    ParseError(#[from] someotherlib::ParseError),
}

Which is quite useful, but does require pulling in an extra library. The result is very "ergonomic," as Rustaceans like to say, but takes a lot of infrastructure to get there. That's probably Rust in a nutshell.

In contrast, Zig has builtin support for error union types:

const Errors = error{ UnknownInstruction, NoDot };

You would then have a function that uses the ! syntax:

fn get_data(file: []const u8) Errors!SomeType

And you can either explicitly return UnknownInstruction like in Rust, or use the try keyword:

try openFile("somepath");

In fact, you can actually omit the explicit error union type, and Zig will make one for you:

// you wouldn't actually write Zig like this, because it doesn't like heap allocations. But that's for later.
fn get_data(file: []const u8) !SomeType {
    try open_file(file);
    return try parse_file();
}

That's pretty handy! What would've taken one or two dozen lines of Rust is just baked into the language.

The comptime abstraction makes a lot of sense

Zig's marquee feature is the comptime syntax, which denotes that a variable or expression runs at compile time instead of runtime.

So for example, while Rust needs a whole extra syntax for generics:

pub struct Stack<T> {
    stack: []T
}

impl<T> Stack<T> {
    fn pop(&mut self) -> Option<T> {
        // ...
    }
}

In Zig, you just write a normal function:

pub fn Stack(comptime T: type) type {
    return struct {
        stack: []T,

        const This = @This();

        pub fn pop(self: *This) ?T {
            // ...
        }
    };
}

You can do this kind of thing in Rust with procedural macros, but it requires a lot of infrastructure. You need to build a special crate, do a bunch of imports, etc.

In Zig you can actually dedicate entire blocks to comptime:

comptime {
    const typ = @typeInfo(someval);
    for (typ.Struct.fields) |field| {
        // .. do something to every field
    }
}

I never really noticed how much extra syntax Rust had built up for handling this kind of thing.

You can do complex type expressions in Rust like this:

fn use_type<T>(input: T) where T: IsNumeric {
    // use numerics somehow
}

But in Zig, you could write it like this:

pub fn constant(comptime T: type, val: T) Instruction {
    if (!is_numeric(T)) {
        @compileError("not a numeric type!");
    }
}

It uses the exact same syntax as the rest of the language, so you don't have to remember some obscure syntax or where the <T> goes.

The @"literal" syntax is cute

This is a little thing. But say you're building a struct with a member whose name is a reserved word like type.

In Rust, you'd simply give it a slightly different name:

pub struct MyStruct {
    type_: String
}

But Zig has a special syntax for this that reminds me of the :"atom" syntax from Erlang or Ruby:

pub const MyStruct = struct {
    @"type": []const u8
};

I don't know if that's better per se, but it's definitely neat.

inline else is neat too

Imagine we have an enum with two types of number.

In Rust, it looks like this:

pub enum Number {
    I32(i32),
    I64(i64),
}

And in Zig:

pub const Number = union(enum) {
    I32: i32,
    I64: i64,
};

Now let's say we want to get that number as a string.

impl Number {
  fn get_string(&self) -> String {
     match self {
         Number::I32(i) => i.to_string(),
         Number::I64(i) => i.to_string(),
     }
  }
}

Now in Zig, you could write this the same way:

fn to_string(n: Number, buf: []u8) ![]u8 {
   return switch(n) {
       .I32 => |i| try std.fmt.bufPrint(buf, "{d}", .{i}),
       .I64 => |i| try std.fmt.bufPrint(buf, "{d}", .{i}),
   };
}

but, you can also write it like this:

fn to_string(n: Number, buf: []u8) ![]u8 {
   return switch(n) {
       inline else => |i| try std.fmt.bufPrint(buf, "{d}", .{i}),
   };
}

i will have a different type for the .I32 and .I64 variants, but the inline else will generate the appropriate branches. Pretty cool!

defer and errdefer

  • "the biggest thing missing from C" - my wife*

I didn't use these much, and they're also found in one of my least favourite languages, Go.

But for a language intending to replace C, it's a big deal.

The idea goes, say you have some cleanup step that always should be done.

void access_shared(struct shared *data) {
  mutex_lock(&data->mutex);
  do_something(&data->shared);

  // a bunch of other junk

  mutex_unlock(&data->mutex);
}

So far so good. But now let's imagine do_something is fallible.

int access_shared(struct shared *data) {
  int retcode;
  mutex_lock(&data->mutex);
  retcode = do_something(&data->shared);
  if (retcode < 0) {
    return retcode;
  }

  // a bunch of other junk

  mutex_unlock(&data->mutex);

  return 0;
}

But oh no! We've introduced an error – if do_something fails, we won't unlock the mutex.

This kind of error is extremely common. Rust's solution looks invisible:

fn access_shared(&mut self) -> Result<()> {
    let mut data = self.lock.lock().unwrap();
    do_something(&mut data)?;
    Ok(())
}

What's happening behind the scenes here is that when data goes out of scope, it's automatically unlocked.

This is usually what you want! But it requires - say it with me - a lot of infrastructure. In this case, the lock guard implements the Drop trait. But this kind of magic can be unsettling.
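
To make the magic a little less invisible, here's roughly the shape of such a guard (a toy sketch, not the real std MutexGuard):

struct MyLock;

impl MyLock {
    fn unlock(&self) {
        // release the underlying lock here
    }
}

struct Guard<'a> {
    lock: &'a MyLock,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // The compiler inserts this call on every exit path,
        // including early returns from `?`
        self.lock.unlock();
    }
}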

The Go and Zig implementations, on the other hand, make it immediately obvious what's happening.

fn access_shared(self: *Shared) !void {
    self.m.lock();
    defer self.m.unlock();

    try do_something(&self.data);
}

The errdefer case is a neat addition. It makes it easy to, say, clean up an allocation after an error:

fn important_task(self: *State, allocator: Allocator) !*MyStruct {
    const foo = try doAnAllocation(allocator);
    // Only runs if a later error propagates out of this function.
    errdefer deallocate(foo);

    try do_something();
    return foo;
}

neat!

The `std` is available freestanding

This one takes some explaining. Rust's standard library is actually divided up into core, alloc, and std. What's the difference? core works anywhere, alloc assumes you have a heap allocator, and std assumes a full operating system underneath.

Allocators are a whole can of worms of their own, but in short: applications can put data either on the stack, where functions and their local variables live, or on the heap, where it can live for an arbitrary amount of time.

Why the split? Well, allocators are usually provided by the operating system, and Rust supports embedded applications where there might not be one.

But of course there's a lot of really handy stuff in std. Things like Vec and println! that most libraries want to use. So in practice, most libraries that support no_std, as it's called, have an std feature that's enabled by default, plus a bunch of conditional compilation flags to disable the rest.

It works well enough, but it's definitely tedious.

Zig takes a different approach. Every function that wants to put data on the heap takes an Allocator. There's a built-in GeneralPurposeAllocator that you can probably use, or a bunch of more specific ones.

What this means is that any freestanding target (one without a hosting operating system) can still use the exact same standard library. No conditional compilation necessary. And no separate crates or hardware-specific implementations.
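Here's a minimal sketch of what that looks like in practice (API names as of 0.11-era Zig):

const std = @import("std");

pub fn main() !void {
    // The allocator is an ordinary value you construct and pass around.
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Anything that touches the heap takes it as an explicit parameter.
    var list = std.ArrayList(u8).init(allocator);
    defer list.deinit();
    try list.appendSlice("hello");
}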

The C interop is probably pretty cool

This is Zig's other headline feature, and I didn't use it at all.

But here's the gist:

In Rust, if you want to access a C library, you usually need bindings. If you're lucky, there's a -sys crate already generated. If not, you're probably going to be learning about bindgen.

Rust, of course, wants you to use C libraries as infrequently as possible, because they can't uphold Rust's memory safety model. So the usual pattern is a dedicated module that's very carefully wrapped and tested to provide a safe-ish interface.

Zig takes a different approach:

const c = @cImport({
    @cInclude("stdio.h");
});
pub fn main() void {
    _ = c.printf("hello\n");
}

You basically include a C library like any Zig library and use it directly. There are obviously plenty of edge cases, but it's elegant in its simplicity.

It reminds me a lot of Cython, a similar toolkit for Python.

Debug/ReleaseSafe/ReleaseFast/ReleaseSmall

Rust has two kinds of build:

  • debug builds, which are used for quick iteration and debugging
  • release builds, which take longer to compile but are more optimized.

If you need anything beyond that, it's time to wade through Cargo options.

Zig, instead, lets you specify what you want to optimize for, and guides the compiler in the appropriate direction.

I've never found myself really needing to squeeze performance out of Rust code, but I love how obvious Zig lets you make your intentions.
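For reference, a sketch of a build.zig (0.11-era API) where the standard option surfaces all four modes:

const std = @import("std");

pub fn build(b: *std.Build) void {
    // Adds -Doptimize={Debug,ReleaseSafe,ReleaseFast,ReleaseSmall}
    const optimize = b.standardOptimizeOption(.{});
    const exe = b.addExecutable(.{
        .name = "app",
        .root_source_file = .{ .path = "src/main.zig" },
        .target = b.standardTargetOptions(.{}),
        .optimize = optimize,
    });
    b.installArtifact(exe);
}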

The Bad

The documentation really isn't done

Zig is a much smaller language than Rust, and I certainly don't expect the same level of polish.

But you very quickly run into documentation limits. How do you split your code into multiple Zig files? TODO. What does the addrspace keyword do? Who knows. How's JSON parsing handled? No documentation provided.

There's a lot of searching for "<issue> zig" or just looking at other people's code. I think I have been spoiled by Rust's and Python's impeccable documentation.

Quirks of the comptime type system

anytype

Go has a type system, in theory, but in practice it was often insufficient to express what you wanted.

Before generics were added, a lot of functions ended up looking like this:

// k8s.io/apimachinery@v0.28.3/pkg/api/meta
func Accessor(obj interface{}) (metav1.Object, error)

where interface{} basically means "whatever."

If it isn't acceptable, Go just panics at runtime.

Zig's comptime features mean you're not going to get a runtime surprise. But it still means function signatures can be a little obscure.

Compare a Zig gzip function signature:

fn decompress(allocator: mem.Allocator, reader: anytype) !Decompress(@TypeOf(reader))

To a Rust one:

impl<R: BufRead> GzDecoder<R> {
    pub fn new(r: R) -> GzDecoder<R>
}

Once you can read the Rust interface syntax, you immediately know what kind of types will be acceptable.

With Zig, you can make some educated guesses, but you'll probably need to read some source code.

@TypeOf(i)

fn compare(this: anytype, that: @TypeOf(this)) bool {}  // Zig
fn compare<T>(this: T, that: T) -> bool {}              // Rust

They seem pretty similar, but there are some weird consequences. Imagine a function that returns an i32:

fn get_val() i32 { return 1234; }

Now, this will work:

compare(get_val(), 1234);

But this will not:

compare(1234, get_val());

Because the `1234` literal has the type comptime_int. As the first argument it fixes @TypeOf(this) to comptime_int, and a runtime i32 can't be coerced back into a compile-time integer.
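The workaround is to pin down the literal's type yourself with the @as builtin:

// @as forces the comptime_int literal to be an i32 up front.
compare(@as(i32, 1234), get_val());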

The Rust code, though, works fine either way, because it only cares that the two types end up the same.

So many reserved words

In some ways it's better than the obscure ASCII syntax Rust leans on, but it feels messy. Sometimes I think the @"literal" syntax only exists because there are so many names you couldn't use as variables otherwise.

No automatic test discovery

const testing = @import("std").testing;

test "stack" {
    testing.refAllDecls(@This());
}

You need to put this in the root of your package, or tests in the files it pulls in won't actually run. Why? test is a language-level construct; you'd think it could be discovered recursively by default.

At least there are assert functions, unlike Go's boilerplate:

if expected != actual {
  t.Fatalf("expected value to be %v, got %v", expected, actual)
}

I'm pretty sure I could type that snippet in my sleep now. Ugh.
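For comparison, a quick sketch of the Zig equivalent:

const std = @import("std");

test "addition" {
    // expectEqual coerces the second argument to the type of the first.
    try std.testing.expectEqual(@as(i32, 4), 2 + 2);
}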

The struct declaration feels messy

In Zig, if you want to have a "method" associated with a struct, you simply put that function right in the struct.

const MyStruct = struct {
    value: i32,

    pub fn print(self: *MyStruct) void {}
};

Contrast that with Go:

type MyStruct struct {
  value int
}

func (s *MyStruct) Print() {}

or Rust:

struct MyStruct {
    value: i32,
}

impl MyStruct {
    fn print(&self) {}
}

Having the methods mixed in with the fields means they're hard to pick out. Especially as functions get longer, the fields tend to get lost in a sea of pub fn.

The try syntax doesn't chain well

try fallible().use_value()

doesn't work! but

(try fallible()).use_value()

does.

This is basically the reason Rust chose a .await suffix instead of an await prefix. It was very controversial at the time, but ultimately I think it was the right call. The ? syntax in Rust lets you write code like this:

async fn get_weather_weer(&self) -> Result<Forecast> {
    Ok(self.client
        .get(FORECAST_URL)
        .query(&self.get_params())
        .send()
        .await?
        .json()
        .await?)
}

Whereas in Zig you'd likely need a bunch of parentheses or intermediate variables.
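You end up with something like this, where the client API is entirely hypothetical and just here to show the shape:

// Hypothetical client API, just illustrating the intermediate-variable style.
const response = try self.client.get(forecast_url);
const forecast = try response.json(Forecast);
return forecast;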

There's nothing wrong with using []const u8 for strings

but I don't like it. Zig might support Unicode, but handing strings around as bare u8 arrays encourages ASCII-like thinking. What's wrong with a proper string type, maybe with some useful iteration methods?
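To be fair, std.unicode will iterate code points if you go looking. A sketch:

const std = @import("std");

test "iterate code points" {
    var it = (try std.unicode.Utf8View.init("héllo")).iterator();
    while (it.nextCodepoint()) |cp| {
        _ = cp; // a u21 code point, not a raw byte
    }
}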

The Ugly

No cute mascot :C

C'mon. I don't like Go, but the gopher is very cute. Zig just has a boring stylised Z.

Maybe I'll try Hare next
