<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<title>MarcoPolo – Partially Functional - Nix</title>
	<author><name>Marco</name></author>
	<link href="https://marcopolo.io/tags/nix/atom.xml" rel="self" type="application/atom+xml"/>
  <link href="https://marcopolo.io"/>
	<generator uri="https://www.getzola.org/">Zola</generator>
	<updated>2021-05-10T00:00:00+00:00</updated>
	<id>https://marcopolo.io/tags/nix/atom.xml</id>
	
	<entry xml:lang="en">
		<title>Declarative Dev Environments</title>
		<published>2021-05-10T00:00:00+00:00</published>
		<updated>2021-05-10T00:00:00+00:00</updated>
		<link href="https://marcopolo.io/code/declarative-dev-environments/" type="text/html"/>
		<id>https://marcopolo.io/code/declarative-dev-environments/</id>
		<content type="html">&lt;p&gt;I don&#x27;t install development tools globally. I don&#x27;t have &lt;code&gt;node&lt;&#x2F;code&gt; added to my
&lt;code&gt;PATH&lt;&#x2F;code&gt; in my &lt;code&gt;~&#x2F;.zshrc&lt;&#x2F;code&gt; file, and running &lt;code&gt;cargo&lt;&#x2F;code&gt; outside a project folder
returns &amp;quot;command not found.&amp;quot; I wipe my computer on every reboot. With the
exception of four folders (&lt;code&gt;&#x2F;boot&lt;&#x2F;code&gt;, &lt;code&gt;&#x2F;nix&lt;&#x2F;code&gt;, &lt;code&gt;&#x2F;home&lt;&#x2F;code&gt;, and &lt;code&gt;&#x2F;persist&lt;&#x2F;code&gt;), everything
gets &lt;a href=&quot;https:&#x2F;&#x2F;grahamc.com&#x2F;blog&#x2F;erase-your-darlings&quot;&gt;deleted&lt;&#x2F;a&gt;. And it has worked
out great.&lt;&#x2F;p&gt;
&lt;p&gt;Instead of installing development packages globally, I declare them as a
dependency in my project&#x27;s dev environment. They become available as soon as I
&lt;code&gt;cd&lt;&#x2F;code&gt; into the project folder. If two projects use the same tool then I only keep
one version of that tool on my computer.&lt;&#x2F;p&gt;
&lt;p&gt;I think installing dev tools globally is a bad pattern that leads to nothing but
heartache and woe. If you are running &lt;code&gt;sudo apt-get install&lt;&#x2F;code&gt; or &lt;code&gt;brew install&lt;&#x2F;code&gt;
prior to building a project, you are doing it wrong. By defining your dev tool
dependencies explicitly you allow your projects to easily build on any
machine at any point in time, whether it&#x27;s on a friend&#x27;s machine today or a new
laptop in 10 years. It even makes CI integration a breeze.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;what-do-i-mean-by-a-declarative-dev-environment&quot;&gt;What do I mean by a declarative dev environment?&lt;&#x2F;h2&gt;
&lt;p&gt;I mean a project that has a special file (or files) that define all the
dependencies required to build and run your project. It doesn&#x27;t necessarily have
to include the actual binaries you will run in the repo, but it should be
reproducible. If you clone my project you should be running the exact
same tools as me.&lt;&#x2F;p&gt;
&lt;p&gt;Just like you have explicit dependencies on libraries you use in your program, a
declarative dev environment lets you define your tooling dependencies (e.g.
which version of Node, Yarn, or your specific cross compiler toolchain).&lt;&#x2F;p&gt;
&lt;h2 id=&quot;how-i-setup-my-declarative-dev-environments&quot;&gt;How I set up my declarative dev environments&lt;&#x2F;h2&gt;
&lt;p&gt;To accomplish this I use &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&quot;&gt;Nix&lt;&#x2F;a&gt; with &lt;a href=&quot;https:&#x2F;&#x2F;www.tweag.io&#x2F;blog&#x2F;2020-05-25-flakes&#x2F;&quot;&gt;Nix Flakes&lt;&#x2F;a&gt; and &lt;a href=&quot;https:&#x2F;&#x2F;direnv.net&#x2F;&quot;&gt;direnv&lt;&#x2F;a&gt;. There are three
relevant files: &lt;code&gt;flake.nix&lt;&#x2F;code&gt;, which defines the build of the project and the tools
I need for development; &lt;code&gt;flake.lock&lt;&#x2F;code&gt;, which is similar in spirit to a &lt;code&gt;yarn.lock&lt;&#x2F;code&gt;
or &lt;code&gt;Cargo.lock&lt;&#x2F;code&gt; file: it &lt;em&gt;locks&lt;&#x2F;em&gt; the exact version of any tool used and is
generated automatically the first time you introduce dependencies; and finally a
&lt;code&gt;.envrc&lt;&#x2F;code&gt; file, which simply tells direnv to ask Nix what the environment should
be and sets up the environment when you &lt;code&gt;cd&lt;&#x2F;code&gt; into the folder. Here are some
simple examples:
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;MarcoPolo&#x2F;templates&#x2F;tree&#x2F;master&#x2F;trivial&quot;&gt;flake.nix&lt;&#x2F;a&gt;,
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;MarcoPolo&#x2F;templates&#x2F;blob&#x2F;master&#x2F;trivial&#x2F;.envrc&quot;&gt;.envrc&lt;&#x2F;a&gt;
(&lt;code&gt;flake.lock&lt;&#x2F;code&gt; omitted since it&#x27;s automatically generated).&lt;&#x2F;p&gt;
&lt;p&gt;As a shortcut for setting up a &lt;code&gt;flake.nix&lt;&#x2F;code&gt; and &lt;code&gt;.envrc&lt;&#x2F;code&gt;, you can use a template
to provide the boilerplate. When I start a new project I&#x27;ll run &lt;code&gt;nix flake init -t github:marcopolo&#x2F;templates&lt;&#x2F;code&gt; which copies the files from this
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;MarcoPolo&#x2F;templates&#x2F;tree&#x2F;master&#x2F;trivial&quot;&gt;repo&lt;&#x2F;a&gt; and puts them
in your current working directory. Then running &lt;code&gt;direnv allow&lt;&#x2F;code&gt; will set up your
local environment, installing any missing dependencies through Nix as a side
effect.&lt;&#x2F;p&gt;
&lt;p&gt;This blog itself makes use of &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;MarcoPolo&#x2F;marcopolo.github.io&#x2F;blob&#x2F;master&#x2F;flake.nix#L14&quot;&gt;declarative dev
environments&lt;&#x2F;a&gt;.
Zola is the static site generator I use. When I &lt;code&gt;cd&lt;&#x2F;code&gt; into my blog my environment
is automatically set up with Zola available for previewing the blog.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;how-nix-works-roughly&quot;&gt;How Nix works, roughly&lt;&#x2F;h2&gt;
&lt;p&gt;This all works off &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&quot;&gt;Nix&lt;&#x2F;a&gt;. Nix is a fantastic package manager and build tool that
provides reproducible versions of packages that don&#x27;t rely on a specific global
system configuration. Specifically, packages installed through Nix don&#x27;t rely on
a user&#x27;s &lt;code&gt;&#x2F;usr&#x2F;lib&lt;&#x2F;code&gt; or anything outside of &lt;code&gt;&#x2F;nix&#x2F;store&lt;&#x2F;code&gt;. You don&#x27;t even need
glibc installed (as may be the case if you are on &lt;a href=&quot;https:&#x2F;&#x2F;www.alpinelinux.org&#x2F;&quot;&gt;Alpine
Linux&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;p&gt;For a deeper dive see &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;guides&#x2F;how-nix-works.html&quot;&gt;How Nix Works&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;an-example-how-to-setup-a-yarn-based-js-project&quot;&gt;An example: how to set up a Yarn-based JS project&lt;&#x2F;h2&gt;
&lt;p&gt;To be concrete, let me show an example. If I wanted to start a JS project and
use &lt;a href=&quot;https:&#x2F;&#x2F;yarnpkg.com&#x2F;&quot;&gt;Yarn&lt;&#x2F;a&gt; as my dependency manager, I would do something
like this: &lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;bash&quot; class=&quot;language-bash &quot;&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# 1. Create the project folder and enter it
mkdir my-project
cd my-project

# 2. Add the boilerplate files.
nix flake init -t github:marcopolo&amp;#x2F;templates

# 3. Edit flake.nix file to add yarn and NodeJS.
# With your text editor apply this diff:
# -          buildInputs = [ pkgs.hello ];
# +          buildInputs = [ pkgs.yarn pkgs.nodejs-12_x ];

# 4. Allow direnv to run this environment. This will also fetch yarn with Nix
#    and add it to your path.
direnv allow

# 5. Yarn is now available, proceed as normal. 
yarn init
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;You can simplify this further by making a Nix Flake template that already has
Yarn and NodeJS included. &lt;&#x2F;p&gt;
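&lt;p&gt;As a rough sketch of what such a template repo could look like, the flake below exposes a template named &lt;code&gt;yarn&lt;&#x2F;code&gt;. The template name and the &lt;code&gt;.&#x2F;yarn&lt;&#x2F;code&gt; folder are assumptions for illustration; that folder would hold a &lt;code&gt;flake.nix&lt;&#x2F;code&gt; and &lt;code&gt;.envrc&lt;&#x2F;code&gt; with Yarn and NodeJS already listed in &lt;code&gt;buildInputs&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{
  description = &amp;quot;Project templates (hypothetical example)&amp;quot;;

  outputs = { self }: {
    # `nix flake init -t` copies the contents of `path` into the current folder.
    templates.yarn = {
      # Hypothetical folder containing the boilerplate flake.nix and .envrc.
      path = .&amp;#x2F;yarn;
      description = &amp;quot;A JS project with Yarn and NodeJS in the dev shell&amp;quot;;
    };
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;With a flake like this pushed to a repo, step 2 would become &lt;code&gt;nix flake init -t github:YOUR_USER&#x2F;YOUR_REPO#yarn&lt;&#x2F;code&gt; and step 3 would no longer be needed.&lt;&#x2F;p&gt;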
&lt;h2 id=&quot;another-example-setting-up-a-rust-project&quot;&gt;Another example: setting up a Rust project&lt;&#x2F;h2&gt;
&lt;pre data-lang=&quot;bash&quot; class=&quot;language-bash &quot;&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# 1. Create the project folder and enter it
mkdir rust-project
cd rust-project

# 2. Add the boilerplate files.
nix flake init -t github:marcopolo&amp;#x2F;templates#rust

# 3. Cargo and Rust are now available, proceed as normal.
cargo init
cargo run
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here we used a Rust-specific template, so no changes were required after initializing it.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;dissecting-the-flake-nix-file&quot;&gt;Dissecting the &lt;code&gt;flake.nix&lt;&#x2F;code&gt; file&lt;&#x2F;h2&gt;
&lt;p&gt;Let&#x27;s break down the &lt;code&gt;flake.nix&lt;&#x2F;code&gt; file so we can understand what it is we are
declaring.&lt;&#x2F;p&gt;
&lt;p&gt;First off, the file is written in &lt;a href=&quot;https:&#x2F;&#x2F;nixos.wiki&#x2F;wiki&#x2F;Nix_Expression_Language&quot;&gt;Nix, the programming
language&lt;&#x2F;a&gt;. At a high level you
can read this as JSON but with functions. Like JSON, a file holds a single
expression (just as a JSON document has one top-level value). Unlike JSON, you can
have functions and variables. &lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# This is our top level set expression. Equivalent to the top level JSON object.
{
  # These are comments

  # Here we are defining an attribute of the set. This is equivalent to a
  # key-value pair in a JSON object.
  # The key is description, and the value is the string.
  description = &amp;quot;A very basic flake&amp;quot;;

  # You can define nested sets by using a `.` between key parts.
  # This is equivalent to the JSON object {inputs: {flake-utils: {url: &amp;quot;github:...&amp;quot;}}}
  inputs.flake-utils.url = &amp;quot;github:numtide&amp;#x2F;flake-utils&amp;quot;;

  # Functions are defined with the syntax of `param: functionBodyExpression`.
  # The param can be destructured if it expects a set, like what we are doing here. 
  # This defines the output of this flake. Our dev environment will make use of
  # the devShell attribute, but you can also define the release build of your
  # package here.
  outputs = { self, nixpkgs, flake-utils }:
    # This is a helper to generate these outputs for each system (x86-linux,
    # arm-linux, macOS, ...)
    flake-utils.lib.eachDefaultSystem (system:
      let
        # The nixpkgs repo has to know which system we are using.
        pkgs = import nixpkgs { system = system; };
      in
      {
        # This is the environment that direnv will use. You can also enter the
        # shell with `nix develop`. The packages in `buildInputs` are what become
        # available to you in your $PATH. As an example this only has the hello
        # package.
        devShell = pkgs.mkShell {
          buildInputs = [ pkgs.hello ];
        };

        # You can also define a package that is built by default when you run
        # `nix build`.  The build command creates a new folder, `result`, that
        # is a symlink to the build output.
        defaultPackage = pkgs.hello;
      });
}

&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
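&lt;p&gt;To make the &amp;quot;JSON but with functions&amp;quot; idea a bit more concrete, here is a small standalone expression. It is not part of the template; the names &lt;code&gt;greet&lt;&#x2F;code&gt; and &lt;code&gt;greetWith&lt;&#x2F;code&gt; are made up for illustration. It shows variables, a plain function, and a function that destructures a set, the same shape as &lt;code&gt;outputs&lt;&#x2F;code&gt; above.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# A `let` block introduces variables visible in the expression after `in`.
let
  # A function of one parameter.
  greet = name: &amp;quot;Hello, ${name}!&amp;quot;;

  # A function that destructures a set. `greeting` has a default value.
  greetWith = { name, greeting ? &amp;quot;Hi&amp;quot; }: &amp;quot;${greeting}, ${name}!&amp;quot;;
in
# The whole file still evaluates to a single set.
{
  first = greet &amp;quot;Nix&amp;quot;;
  second = greetWith { name = &amp;quot;flakes&amp;quot;; };
  doubled = map (x: x * 2) [ 1 2 3 ];
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;If you save this as, say, &lt;code&gt;example.nix&lt;&#x2F;code&gt;, running &lt;code&gt;nix-instantiate --eval --strict example.nix&lt;&#x2F;code&gt; should print the evaluated set.&lt;&#x2F;p&gt;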
&lt;h2 id=&quot;on-dev-tools-and-a-dev-setup&quot;&gt;On Dev Tools and A Dev Setup&lt;&#x2F;h2&gt;
&lt;p&gt;There is a subtle distinction between what constitutes a Dev Tool and a Dev Setup. I
classify Dev Tools as things that need to be available to build or develop a given
project specifically. Think of &lt;code&gt;gcc&lt;&#x2F;code&gt;, &lt;code&gt;yarn&lt;&#x2F;code&gt;, or &lt;code&gt;cargo&lt;&#x2F;code&gt;. The Dev Setup category
is for things that are useful when developing in general. Vim, Emacs, and
&lt;a href=&quot;https:&#x2F;&#x2F;geoff.greer.fm&#x2F;ag&#x2F;&quot;&gt;ag&lt;&#x2F;a&gt; are some examples.&lt;&#x2F;p&gt;
&lt;p&gt;Dev tools are worth defining explicitly in your project&#x27;s declarative dev environment (in
a &lt;code&gt;flake.nix&lt;&#x2F;code&gt; file). A Dev Setup is highly personal and not worth defining in the
project&#x27;s declarative dev environment. But that&#x27;s not to say your dev setup is not
worth defining at all. In fact, if you are (or when you become) familiar with
Nix, you can extend the same ideas of this post to your user account with &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;nix-community&#x2F;home-manager&quot;&gt;Home
Manager&lt;&#x2F;a&gt;. &lt;&#x2F;p&gt;
&lt;p&gt;With Home Manager you can declaratively define which programs you want available
in your dev setup, what Vim plugins you want installed, what ZSH plugins you
want available and much more. It&#x27;s the core idea of declarative dev environments
taken to the user account level.&lt;&#x2F;p&gt;
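&lt;p&gt;A minimal sketch of what such a Home Manager module might look like. The specific package and plugin choices here (&lt;code&gt;silver-searcher&lt;&#x2F;code&gt;, &lt;code&gt;vim-nix&lt;&#x2F;code&gt;, the oh-my-zsh &lt;code&gt;git&lt;&#x2F;code&gt; plugin) are placeholders for illustration, and option names can change between Home Manager releases, so check them against the version you use.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{ pkgs, ... }:
{
  # Programs I want available everywhere, not tied to any one project.
  home.packages = [ pkgs.silver-searcher ];

  # Vim plus the plugins to install for it.
  programs.vim = {
    enable = true;
    plugins = [ pkgs.vimPlugins.vim-nix ];
  };

  # ZSH plus its plugins, here managed through oh-my-zsh.
  programs.zsh = {
    enable = true;
    oh-my-zsh = {
      enable = true;
      plugins = [ &amp;quot;git&amp;quot; ];
    };
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;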
&lt;h2 id=&quot;why-not-docker&quot;&gt;Why not Docker?&lt;&#x2F;h2&gt;
&lt;p&gt;Many folks use Docker to get something like this, but while it gets close (and
is in some cases functionally equivalent), it has some shortcomings:&lt;&#x2F;p&gt;
&lt;p&gt;For one, a Dockerfile is not reproducible out of the box. It is common to use
&lt;code&gt;apt-get install&lt;&#x2F;code&gt; in a Dockerfile to add packages. This part isn&#x27;t reproducible
and brings you back to the initial problem I outlined. &lt;&#x2F;p&gt;
&lt;p&gt;Docker is less efficient with storage. It uses layers as the base block of
Docker images rather than packages. This means that it&#x27;s relatively easy to end
up with many similar docker images (for a more thorough analysis check
out &lt;a href=&quot;https:&#x2F;&#x2F;grahamc.com&#x2F;blog&#x2F;nix-and-layered-docker-images&quot;&gt;Optimising Docker Layers for Better Caching with
Nix&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;p&gt;Spinning up a container and doing development inside may not leverage your
existing dev setup. For example you may have Vim setup neatly on your machine,
but resort to &lt;code&gt;vi&lt;&#x2F;code&gt; when developing inside a container.  Or worse, you&#x27;ll 
rebuild your dev setup inside the container, which does nothing more than
add dead weight to the container since it&#x27;s an addition solely for you and not
really part of the project. Of course there are some workarounds to this issue:
you can bind mount a folder, and VS Code supports opening a project inside a
container.  &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;zmkfirmware&#x2F;zmk&quot;&gt;ZMK&lt;&#x2F;a&gt; does this and it has
worked great.&lt;&#x2F;p&gt;
&lt;p&gt;If you are on macOS, developing inside a container is actually slower. Docker
on Mac relies on running a Linux VM in the background and running containers in
that VM. By default that VM is underpowered relative to the host macOS machine.&lt;&#x2F;p&gt;
&lt;p&gt;There are cases where you actually do only want to run the code in an
x86-linux environment and Docker provides a convenient proxy for this. In these
cases I&#x27;d suggest using Nix to generate the Docker images. This way you get the
declarative and reproducible properties from Nix and the convenience from Docker.&lt;&#x2F;p&gt;
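&lt;p&gt;A minimal sketch of that approach, assuming the &lt;code&gt;dockerTools.buildLayeredImage&lt;&#x2F;code&gt; helper from nixpkgs. The image name and the use of &lt;code&gt;pkgs.hello&lt;&#x2F;code&gt; as the payload are placeholders for illustration.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{ pkgs ? import &amp;lt;nixpkgs&amp;gt; { } }:

pkgs.dockerTools.buildLayeredImage {
  # Placeholder image name and tag.
  name = &amp;quot;hello-image&amp;quot;;
  tag = &amp;quot;latest&amp;quot;;

  # Everything in the image comes from the Nix store, not from apt-get.
  contents = [ pkgs.hello ];

  # The command the container runs by default.
  config.Cmd = [ &amp;quot;${pkgs.hello}&amp;#x2F;bin&amp;#x2F;hello&amp;quot; ];
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Building this with &lt;code&gt;nix-build&lt;&#x2F;code&gt; should leave an image tarball at &lt;code&gt;result&lt;&#x2F;code&gt;, which can then be loaded with &lt;code&gt;docker load -i result&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;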
&lt;p&gt;As a caveat to all of the above, if you already have a reproducible dev environment
with a Docker container that works for you, please don&#x27;t throw that all out and
redesign your system from scratch. Keep using it until it stops meeting your
needs and come back to this when it happens. Until then, keep building.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;on-nix-flakes&quot;&gt;On Nix Flakes&lt;&#x2F;h2&gt;
&lt;p&gt;Nix Flakes is still new and in beta, so it&#x27;s likely that if you install Nix from
their &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;download.html&quot;&gt;download page&lt;&#x2F;a&gt; you won&#x27;t have Nix Flakes
available. If you don&#x27;t already have Nix installed, you can install a version
with Nix Flakes &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;numtide&#x2F;nix-unstable-installer&quot;&gt;with the unstable installer&lt;&#x2F;a&gt;,
otherwise read the section on &lt;a href=&quot;https:&#x2F;&#x2F;nixos.wiki&#x2F;wiki&#x2F;Flakes#Installing_flakes&quot;&gt;installing flakes&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;closing-thoughts&quot;&gt;Closing thoughts&lt;&#x2F;h2&gt;
&lt;p&gt;In modern programming languages we define all our dependencies explicitly and
lock the specific versions used. It&#x27;s about time we do that for all our tools
too. Let&#x27;s get rid of the &lt;code&gt;apt-get install&lt;&#x2F;code&gt; and &lt;code&gt;brew install&lt;&#x2F;code&gt; section of READMEs.&lt;&#x2F;p&gt;
</content>
	</entry>
	
	<entry xml:lang="en">
		<title>Simple Declarative VMs</title>
		<published>2021-03-24T00:00:00+00:00</published>
		<updated>2021-03-24T00:00:00+00:00</updated>
		<link href="https://marcopolo.io/code/simple-vms/" type="text/html"/>
		<id>https://marcopolo.io/code/simple-vms/</id>
		<content type="html">&lt;p&gt;I&#x27;ve been on a hunt to find a simple and declarative way to define VMs. I wanted
something like &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;manual&#x2F;nixos&#x2F;stable&#x2F;#ch-containers&quot;&gt;NixOS
Containers&lt;&#x2F;a&gt;, but with a
stronger security guarantee. I wanted to be able to use a Nix expression to
define what the VM should look like, then reference that on my Server&#x27;s
expression and have it all work automatically. I didn&#x27;t want to manually
run any commands. The hunt is over, I finally found it.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;my-use-case&quot;&gt;My Use Case&lt;&#x2F;h2&gt;
&lt;p&gt;I want a machine that I can permanently hook up to a WireGuard VPN and treat
as if it were in a remote place. At first I did this with a physical machine,
but I didn&#x27;t want to commit the whole machine&#x27;s compute for a novelty. What I
really want is a small VM that is permanently hooked up to a WireGuard VPN.
Minimal investment with all the upsides.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;nixos-qemu&quot;&gt;NixOS QEMU&lt;&#x2F;h2&gt;
&lt;p&gt;NixOS supports building your system in a QEMU runnable environment right out of
the box. &lt;code&gt;nixos-rebuild build-vm&lt;&#x2F;code&gt; is a wrapper over &lt;code&gt;nix build github:marcopolo&#x2F;marcopolo.github.io#nixosConfigurations.small-vm.config.system.build.vm&lt;&#x2F;code&gt;. (Side note, with
flakes you can build this exact VM by running that command&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;). This means NixOS
already did the hard work of turning a NixOS configuration into a valid VM that
can be launched with QEMU. Not only that, but the VM shares the &lt;code&gt;&#x2F;nix&#x2F;store&lt;&#x2F;code&gt;
with the host. This results in a really small VM (disk size is 5MB).&lt;&#x2F;p&gt;
&lt;p&gt;NixOS does the heavy lifting of converting a configuration into a script that
will run a VM, so all I need to do is write a service that manages this process.
Enter &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;MarcoPolo&#x2F;simple-vms&#x2F;&quot;&gt;simple-vms&lt;&#x2F;a&gt;, heavily inspired by
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;Nekroze&#x2F;vms.nix&quot;&gt;vms.nix&lt;&#x2F;a&gt; and
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;Mic92&#x2F;nixos-shell&quot;&gt;nixos-shell&lt;&#x2F;a&gt;. &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;MarcoPolo&#x2F;simple-vms&#x2F;&quot;&gt;simple-vms&lt;&#x2F;a&gt; is a NixOS
module that takes in a reference to the
&lt;code&gt;nixosConfigurations.small-vm.config.system.build.vm&lt;&#x2F;code&gt; derivation and the
option of whether you want state to be persisted, and defines a systemd
service for the VM (there can be multiple VMs). This really is a simple
module: the NixOS service definition is about 10 lines long, and its
&lt;code&gt;ExecStart&lt;&#x2F;code&gt; is simply:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p &amp;#x2F;var&amp;#x2F;lib&amp;#x2F;simple-vms&amp;#x2F;${name}
cd &amp;#x2F;var&amp;#x2F;lib&amp;#x2F;simple-vms&amp;#x2F;${name}
exec ${cfg.vm.out}&amp;#x2F;bin&amp;#x2F;run-nixos-vm;
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;With this service we can get and keep our VMs up and running.&lt;&#x2F;p&gt;
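&lt;p&gt;For readers who want to see the shape of such a service, here is a hypothetical sketch. It is not the actual simple-vms source: the function arguments &lt;code&gt;name&lt;&#x2F;code&gt; and &lt;code&gt;vm&lt;&#x2F;code&gt;, the service name, and the restart policy are all assumptions. It only wraps the same three commands in a systemd unit.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# Hypothetical sketch, not the actual simple-vms module.
# `vm` is assumed to be the guest&amp;#x27;s `config.system.build.vm` derivation.
{ name, vm }:
{
  systemd.services.&amp;quot;simple-vm-${name}&amp;quot; = {
    # Start the VM at boot.
    wantedBy = [ &amp;quot;multi-user.target&amp;quot; ];

    # Assumption: bring the VM back up if the QEMU process exits.
    serviceConfig.Restart = &amp;quot;always&amp;quot;;

    # Run the generated VM script from a per-VM state folder.
    script = &amp;#x27;&amp;#x27;
      mkdir -p &amp;#x2F;var&amp;#x2F;lib&amp;#x2F;simple-vms&amp;#x2F;${name}
      cd &amp;#x2F;var&amp;#x2F;lib&amp;#x2F;simple-vms&amp;#x2F;${name}
      exec ${vm}&amp;#x2F;bin&amp;#x2F;run-nixos-vm
    &amp;#x27;&amp;#x27;;
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;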
&lt;h2 id=&quot;stateless-vms&quot;&gt;Stateless VMs&lt;&#x2F;h2&gt;
&lt;p&gt;I got a sticker recently that said &amp;quot;You either have one source of truth, or
multiple sources of lies.&amp;quot; To that end, I wanted to make my VM completely
stateless. QEMU lets you mount folders into the VM, so I used that to mount host
folders in the VM&#x27;s &lt;code&gt;&#x2F;etc&#x2F;wireguard&lt;&#x2F;code&gt; and &lt;code&gt;&#x2F;etc&#x2F;ssh&lt;&#x2F;code&gt; so that the host can
provide the VM with WireGuard keys, and the VM can persist its SSH host keys.&lt;&#x2F;p&gt;
&lt;p&gt;That&#x27;s all the VM really needs. Every time my VM shuts down I delete the drive.
And just to be safe, I try deleting any drive on boot too.&lt;&#x2F;p&gt;
&lt;p&gt;If you&#x27;re running a service on the VM, you&#x27;ll likely want to persist that
service&#x27;s state files too in a similar way.&lt;&#x2F;p&gt;
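&lt;p&gt;One way to express those mounts in the guest&#x27;s configuration is sketched below. It assumes the &lt;code&gt;virtualisation.sharedDirectories&lt;&#x2F;code&gt; option from the NixOS QEMU VM module is available in your nixpkgs version. The host paths under &lt;code&gt;&#x2F;persist&lt;&#x2F;code&gt; are hypothetical, and my own module may wire this up differently.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# Guest configuration sketch. Host paths are hypothetical.
{
  virtualisation.sharedDirectories = {
    # The host provides the WireGuard keys.
    wireguard = {
      source = &amp;quot;&amp;#x2F;persist&amp;#x2F;small-vm&amp;#x2F;wireguard&amp;quot;;
      target = &amp;quot;&amp;#x2F;etc&amp;#x2F;wireguard&amp;quot;;
    };
    # The VM persists its SSH host keys on the host.
    ssh = {
      source = &amp;quot;&amp;#x2F;persist&amp;#x2F;small-vm&amp;#x2F;ssh&amp;quot;;
      target = &amp;quot;&amp;#x2F;etc&amp;#x2F;ssh&amp;quot;;
    };
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;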
&lt;h2 id=&quot;fin&quot;&gt;Fin&lt;&#x2F;h2&gt;
&lt;p&gt;That&#x27;s it. Just a small post for a neat little trick. If you set this up let
me know! I&#x27;m interested in hearing your use case.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;footnotes&quot;&gt;Footnotes&lt;&#x2F;h3&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;1&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;1&lt;&#x2F;sup&gt;
&lt;p&gt;User&#x2F;pass = root&#x2F;root. Exit qemu with C-a x.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
</content>
	</entry>
	
	<entry xml:lang="en">
		<title>Backups made simple</title>
		<published>2021-03-07T00:00:00+00:00</published>
		<updated>2021-03-07T00:00:00+00:00</updated>
		<link href="https://marcopolo.io/code/backups-made-simple/" type="text/html"/>
		<id>https://marcopolo.io/code/backups-made-simple/</id>
		<content type="html">&lt;p&gt;I&#x27;ve made a backup system I can be proud of, and I&#x27;d like to share it with you
today. It follows a philosophy I&#x27;ve been fleshing out called &lt;em&gt;The
Functional Infra&lt;&#x2F;em&gt;. Concretely it aims to:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Be pure. An output should only be a function of its inputs.&lt;&#x2F;li&gt;
&lt;li&gt;Be declarative and reproducible. A byproduct of being pure.&lt;&#x2F;li&gt;
&lt;li&gt;Support rollbacks. Also a byproduct of being pure.&lt;&#x2F;li&gt;
&lt;li&gt;Surface actionable errors. The corollary being it should be easy to understand
and observe what is happening.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;At a high level, the backup system works like so:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;ZFS creates automatic snapshots every so often.&lt;&#x2F;li&gt;
&lt;li&gt;Those snapshots are replicated to an EBS-backed EC2 instance that is only
alive while backup replication is happening. This takes advantage of ZFS&#x27;
incremental snapshots to make replication generally quite fast.&lt;&#x2F;li&gt;
&lt;li&gt;The EBS drive itself stays around after the instance is terminated. This
drive is a Cold HDD (sc1) which costs about $0.015 per GB per month.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;h2 id=&quot;zfs&quot;&gt;ZFS&lt;&#x2F;h2&gt;
&lt;p&gt;To be honest I haven&#x27;t used ZFS all that much, but that&#x27;s kind of my point. I,
as a non-expert in ZFS, have been able to get a lot out of it just by
following the straightforward documentation. It seems like the API is well
thought out and the semantics are reasonable. For example, a consistent snapshot
is as easy as doing &lt;code&gt;zfs snapshot tank&#x2F;home&#x2F;marco@friday&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;automatic-snapshots&quot;&gt;Automatic snapshots&lt;&#x2F;h3&gt;
&lt;p&gt;On NixOS setting up automatic snapshots is a breeze, just add the following to
your NixOS Configuration:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{
  services.zfs.autoSnapshot.enable = true;
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;and set the &lt;code&gt;com.sun:auto-snapshot&lt;&#x2F;code&gt; option on the filesystem. E.g.: &lt;code&gt;zfs set com.sun:auto-snapshot=true &amp;lt;pool&amp;gt;&#x2F;&amp;lt;fs&amp;gt;&lt;&#x2F;code&gt;. Note that this can also be done on
creation of the filesystem: &lt;code&gt;zfs create -o mountpoint=legacy -o com.sun:auto-snapshot=true tank&#x2F;home&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;With that enabled, ZFS will keep the latest 4 15-minute, 24
hourly, 7 daily, 4 weekly, and 12 monthly snapshots.&lt;&#x2F;p&gt;
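&lt;p&gt;Those retention counts can also be set explicitly. The sketch below spells out the same numbers using the &lt;code&gt;services.zfs.autoSnapshot&lt;&#x2F;code&gt; options; adjust them if you want to keep more or fewer snapshots.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{
  services.zfs.autoSnapshot = {
    enable = true;
    # How many snapshots of each kind to keep.
    frequent = 4; # the 15-minute snapshots
    hourly = 24;
    daily = 7;
    weekly = 4;
    monthly = 12;
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;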
&lt;h3 id=&quot;on-demand-ec2-instance-for-backups&quot;&gt;On Demand EC2 Instance for Backups&lt;&#x2F;h3&gt;
&lt;p&gt;Now that we&#x27;ve demonstrated how to set up snapshotting, we need to tackle the
problem of replicating those snapshots somewhere so we can have real backups.
For that I use one of my favorite little tools:
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;stephank&#x2F;lazyssh&quot;&gt;lazyssh&lt;&#x2F;a&gt;. Its humble description gives
little hint of its true usefulness. The description is simply:
&lt;em&gt;A jump-host SSH server that starts machines on-demand&lt;&#x2F;em&gt;. What it enables is
pretty magical. It essentially lets you run arbitrary code when something SSHs
through the jump-host.&lt;&#x2F;p&gt;
&lt;p&gt;Let&#x27;s take the classic ZFS replication example from the
&lt;a href=&quot;https:&#x2F;&#x2F;docs.oracle.com&#x2F;cd&#x2F;E18752_01&#x2F;html&#x2F;819-5461&#x2F;gbchx.html&quot;&gt;docs&lt;&#x2F;a&gt;:
&lt;code&gt;host1# zfs send tank&#x2F;dana@snap1 | ssh host2 zfs recv newtank&#x2F;dana&lt;&#x2F;code&gt;. This
command copies a snapshot from a machine named &lt;code&gt;host1&lt;&#x2F;code&gt; to another machine named
&lt;code&gt;host2&lt;&#x2F;code&gt; over SSH. Simple and secure backups. But it relies on &lt;code&gt;host2&lt;&#x2F;code&gt; being
available. With &lt;code&gt;lazyssh&lt;&#x2F;code&gt; we can make &lt;code&gt;host2&lt;&#x2F;code&gt; only exist when needed.
&lt;code&gt;host2&lt;&#x2F;code&gt; would start when the ssh command is invoked and terminate when the ssh
command finishes. The command with &lt;code&gt;lazyssh&lt;&#x2F;code&gt; would look something like this
(assuming you have a &lt;code&gt;lazyssh&lt;&#x2F;code&gt; target in your &lt;code&gt;.ssh&#x2F;config&lt;&#x2F;code&gt; as explained in the
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;stephank&#x2F;lazyssh&quot;&gt;docs&lt;&#x2F;a&gt;):&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code&gt;host1# zfs send tank&amp;#x2F;dana@snap1 | ssh -J lazyssh host2 zfs recv newtank&amp;#x2F;dana
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Note the only difference is the &lt;code&gt;-J lazyssh&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;So how do we actually set up &lt;code&gt;lazyssh&lt;&#x2F;code&gt; to do this? Here is my configuration:&lt;&#x2F;p&gt;
&lt;div &gt;
    &lt;script src=&quot;https:&amp;#x2F;&amp;#x2F;gist.github.com&amp;#x2F;MarcoPolo&amp;#x2F;13462e986711f62bfc6b7b8e494c5cc8.js&quot;&gt;&lt;&#x2F;script&gt;
&lt;&#x2F;div&gt;
&lt;p&gt;Note there are a couple of setup steps:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Create the initial sc1 EBS Drive. I did this in the AWS Console, but you
could do this in Terraform or the AWS CLI.&lt;&#x2F;li&gt;
&lt;li&gt;Create the ZFS pool on the drive. I launched my lazy archiver without the ZFS
filesystem option and ran: &lt;code&gt;zpool create -o ashift=12 -O mountpoint=none POOL_NAME &#x2F;dev&#x2F;DRIVE_LOCATION&lt;&#x2F;code&gt;. Then I created the
&lt;code&gt;POOL_NAME&#x2F;backup&lt;&#x2F;code&gt; dataset with &lt;code&gt;zfs create -o acltype=posixacl -o xattr=sa -o mountpoint=legacy POOL_NAME&#x2F;backup&lt;&#x2F;code&gt;.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;As a quality of life and security improvement I set up
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;nix-community&#x2F;home-manager&quot;&gt;Home Manager&lt;&#x2F;a&gt; to manage my SSH
config and known_hosts file so these are automatically correct and properly
set up. I generate the lines for known_hosts when I generate the host keys
that go in the &lt;code&gt;user_data&lt;&#x2F;code&gt; field in the &lt;code&gt;lazyssh-config.hcl&lt;&#x2F;code&gt; above. Here&#x27;s the
relevant section from my Home Manager config:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{
  programs.ssh = {
    enable = true;

    # I keep this file tracked in Git alongside my NixOS configs.
    userKnownHostsFile = &amp;quot;&amp;#x2F;path&amp;#x2F;to&amp;#x2F;known_hosts&amp;quot;;
    matchBlocks = {
      &amp;quot;archiver&amp;quot; = {
        user = &amp;quot;root&amp;quot;;
        hostname = &amp;quot;archiver&amp;quot;;
        proxyJump = &amp;quot;lazyssh&amp;quot;;
        identityFile = &amp;quot;PATH_TO_AWS_KEYPAIR&amp;quot;;
      };

      &amp;quot;lazyssh&amp;quot; = {
        # This assumes you are running lazyssh locally, but it can also
        # reference another machine.
        hostname = &amp;quot;localhost&amp;quot;;
        port = 7922;
        user = &amp;quot;jump&amp;quot;;
        identityFile = &amp;quot;PATH_TO_LAZYSSH_CLIENT_KEY&amp;quot;;
        identitiesOnly = true;
        extraOptions = {
          &amp;quot;PreferredAuthentications&amp;quot; = &amp;quot;publickey&amp;quot;;
        };
      };
    };
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Finally, I use the provided NixOS Module for &lt;code&gt;lazyssh&lt;&#x2F;code&gt; to manage starting it and
keeping it up. Here&#x27;s the relevant parts from my &lt;code&gt;flake.nix&lt;&#x2F;code&gt;:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code&gt;{
  # My fork that supports placements and terminating instances after failing to
  # attach volume.
  inputs.lazyssh.url = &amp;quot;github:marcopolo&amp;#x2F;lazyssh&amp;#x2F;attach-volumes&amp;quot;;
  inputs.lazyssh.inputs.nixpkgs.follows = &amp;quot;nixpkgs&amp;quot;;

    outputs =
    { self
    , nixpkgs
    , lazyssh
    }: {
      nixosConfigurations = {

        nixMachineHostName = nixpkgs.lib.nixosSystem {
          system = &amp;quot;x86_64-linux&amp;quot;;
          modules = [
              {
                imports = [ lazyssh.nixosModule ];
                services.lazyssh.configFile =
                  &amp;quot;&amp;#x2F;path&amp;#x2F;to&amp;#x2F;lazyssh-config.hcl&amp;quot;;
                # You&amp;#x27;ll need to add the correct AWS credentials to `&amp;#x2F;home&amp;#x2F;lazyssh&amp;#x2F;.aws`
                # This could probably be a symlink with home-manager to a
                # managed file somewhere else, but I haven&amp;#x27;t gone down that path
                # yet
                users.users.lazyssh = {
                  isNormalUser = true;
                  createHome = true;
                };
              }
          ];
        };
      };
    };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;With all that set up, I can ssh into the archiver by simply running &lt;code&gt;ssh archiver&lt;&#x2F;code&gt;. Under the hood, &lt;code&gt;lazyssh&lt;&#x2F;code&gt; starts the EC2 instance and attaches the
EBS drive to it. And since &lt;code&gt;ssh archiver&lt;&#x2F;code&gt; works, so does the original example
of: &lt;code&gt;zfs send tank&#x2F;dana@snap1 | ssh archiver zfs recv newtank&#x2F;dana&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;automatic-replication&quot;&gt;Automatic Replication&lt;&#x2F;h2&gt;
&lt;p&gt;The next part of the puzzle is to have backups happen automatically. There are
various tools you can use for this, even a simple cron job that runs the &lt;code&gt;send&#x2F;recv&lt;&#x2F;code&gt;
on a schedule. I opted to go for what NixOS supports out of the box, which is
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;alunduil&#x2F;zfs-replicate&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;alunduil&#x2F;zfs-replicate&lt;&#x2F;a&gt;.
Unfortunately, I ran into a couple issues that led me to make a fork. Namely:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Using &lt;code&gt;&#x2F;usr&#x2F;bin&#x2F;env - ssh&lt;&#x2F;code&gt; fails to use the ssh config file. My fork supports
specifying a custom ssh binary to use.&lt;&#x2F;li&gt;
&lt;li&gt;Support for &lt;code&gt;ExecStartPre&lt;&#x2F;code&gt;. This is to &amp;quot;warm up&amp;quot; the archiver instance. I run
&lt;code&gt;nixos-rebuild switch&lt;&#x2F;code&gt;, which is basically a no-op if there are no changes to
apply from the configuration file, and otherwise blocks until the changes have been
applied. In my case these are usually the changes inside the UserData field.&lt;&#x2F;li&gt;
&lt;li&gt;Support for &lt;code&gt;ExecStopPost&lt;&#x2F;code&gt;. This is to add observability to this process.&lt;&#x2F;li&gt;
&lt;li&gt;I wanted to raise the systemd timeout limit, in case the &lt;code&gt;ExecStartPre&lt;&#x2F;code&gt; takes
a while to warm up the instance.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Thankfully with flakes, using my own fork was painless. Here&#x27;s the relevant
section from my &lt;code&gt;flake.nix&lt;&#x2F;code&gt; file:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;  # inputs.zfs-replicate.url = &amp;quot;github:marcopolo&amp;#x2F;zfs-replicate&amp;#x2F;flake&amp;quot;;
  # ...
  # Inside nixosSystem modules...
  ({ pkgs, ... }:
    {
      imports = [ zfs-replicate.nixosModule ];
      # Disable the existing module
      disabledModules = [ &amp;quot;services&amp;#x2F;backup&amp;#x2F;zfs-replication.nix&amp;quot; ];

      services.zfs.autoReplication =
        let
          host = &amp;quot;archiver&amp;quot;;
          sshPath = &amp;quot;${pkgs.openssh}&amp;#x2F;bin&amp;#x2F;ssh&amp;quot;;
          # Make sure the machine is up-to-date
          execStartPre = &amp;quot;${sshPath} ${host} nixos-rebuild switch&amp;quot;;
          honeycombAPIKey = (import .&amp;#x2F;secrets.nix).honeycomb_api_key;
          honeycombCommand = pkgs.writeScriptBin &amp;quot;reportResult&amp;quot; &amp;#x27;&amp;#x27;
            #!&amp;#x2F;usr&amp;#x2F;bin&amp;#x2F;env ${pkgs.bash}&amp;#x2F;bin&amp;#x2F;bash
            ${pkgs.curl}&amp;#x2F;bin&amp;#x2F;curl https:&amp;#x2F;&amp;#x2F;api.honeycomb.io&amp;#x2F;1&amp;#x2F;events&amp;#x2F;zfs-replication -X POST \
              -H &amp;quot;X-Honeycomb-Team: ${honeycombAPIKey}&amp;quot; \
              -H &amp;quot;X-Honeycomb-Event-Time: $(${pkgs.coreutils}&amp;#x2F;bin&amp;#x2F;date -u +&amp;quot;%Y-%m-%dT%H:%M:%SZ&amp;quot;)&amp;quot; \
              -d &amp;quot;{\&amp;quot;serviceResult\&amp;quot;:\&amp;quot;$SERVICE_RESULT\&amp;quot;, \&amp;quot;exitCode\&amp;quot;: \&amp;quot;$EXIT_CODE\&amp;quot;, \&amp;quot;exitStatus\&amp;quot;: \&amp;quot;$EXIT_STATUS\&amp;quot;}&amp;quot;
          &amp;#x27;&amp;#x27;;
          execStopPost = &amp;quot;${honeycombCommand}&amp;#x2F;bin&amp;#x2F;reportResult&amp;quot;;
        in
        {
          inherit execStartPre execStopPost host sshPath;
          enable = true;
          timeout = 90000;
          username = &amp;quot;root&amp;quot;;
          localFilesystem = &amp;quot;rpool&amp;#x2F;safe&amp;quot;;
          remoteFilesystem = &amp;quot;rpool&amp;#x2F;backup&amp;quot;;
          identityFilePath = &amp;quot;PATH_TO_AWS_KEY_PAIR&amp;quot;;
        };
    })
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;That sets up a systemd service that runs after every snapshot. It also
reports the result of the replication to
&lt;a href=&quot;https:&#x2F;&#x2F;www.honeycomb.io&#x2F;&quot;&gt;Honeycomb&lt;&#x2F;a&gt;, which brings us to our next
section...&lt;&#x2F;p&gt;
&lt;h2 id=&quot;observability&quot;&gt;Observability&lt;&#x2F;h2&gt;
&lt;p&gt;The bane of any automated process is failing silently. This is especially bad
in the context of backups, since you don&#x27;t need them until you do. I solved this
by reporting the result of the replication to Honeycomb after every run. It
reports the &lt;code&gt;$SERVICE_RESULT&lt;&#x2F;code&gt;, &lt;code&gt;$EXIT_CODE&lt;&#x2F;code&gt; and &lt;code&gt;$EXIT_STATUS&lt;&#x2F;code&gt; as returned by
systemd. I then created an alert that fires if there are no successful runs in
the past hour.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;future-work&quot;&gt;Future Work&lt;&#x2F;h2&gt;
&lt;p&gt;While I like this system for being simple, I think there is a bit more work in
making it pure. For one, there should be no more than 1 manual step for setup,
and 1 manual step for tear down. There should also be a similar simplicity in
upgrading&#x2F;downgrading storage space.&lt;&#x2F;p&gt;
&lt;p&gt;For reliability, the archiver instance should scrub its drive on a schedule.
This isn&#x27;t set up yet.&lt;&#x2F;p&gt;
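&lt;p&gt;As a starting point, NixOS ships a &lt;code&gt;services.zfs.autoScrub&lt;&#x2F;code&gt; module that could
cover this. Here&#x27;s a minimal sketch of what might go into the archiver&#x27;s
configuration. The pool name &lt;code&gt;rpool&lt;&#x2F;code&gt; and the weekly interval are assumptions, and
since &lt;code&gt;lazyssh&lt;&#x2F;code&gt; starts the instance on demand, a timer-driven scrub may not get
a chance to run unless the instance stays up long enough:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{
  # Sketch only: scrub the backup pool on a schedule.
  services.zfs.autoScrub = {
    enable = true;
    # A systemd calendar expression; weekly is an assumption.
    interval = &amp;quot;weekly&amp;quot;;
    # Assumed pool name, based on the rpool&amp;#x2F;backup dataset used above.
    pools = [ &amp;quot;rpool&amp;quot; ];
  };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;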
&lt;p&gt;At $0.015 per GB per month this is relatively cheap, but not the cheapest. According to
&lt;a href=&quot;https:&#x2F;&#x2F;filstats.com&#x2F;&quot;&gt;filstats&lt;&#x2F;a&gt; I could use
&lt;a href=&quot;https:&#x2F;&#x2F;www.filecoin.com&#x2F;&quot;&gt;Filecoin&lt;&#x2F;a&gt; to store data for much less. There&#x27;s no
Block Device interface to this yet, so it wouldn&#x27;t be as simple as ZFS
&lt;code&gt;send&#x2F;recv&lt;&#x2F;code&gt;. You&#x27;d lose the benefits of incremental snapshots. But it may be
possible to build a block device interface on top. Maybe with an &lt;a href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Network_block_device&quot;&gt;nbd-server&lt;&#x2F;a&gt;?&lt;&#x2F;p&gt;
&lt;h2 id=&quot;extra&quot;&gt;Extra&lt;&#x2F;h2&gt;
&lt;p&gt;Bits and pieces that may be helpful if you try setting something similar up.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;setting-host-key-and-nix-configuration-with-userdata&quot;&gt;Setting host key and Nix Configuration with UserData&lt;&#x2F;h3&gt;
&lt;p&gt;NixOS on AWS has this undocumented nifty feature of setting the ssh host
key and a new &lt;code&gt;configuration.nix&lt;&#x2F;code&gt; file straight from the &lt;a href=&quot;https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AWSEC2&#x2F;latest&#x2F;APIReference&#x2F;API_UserData.html&quot;&gt;UserData
field&lt;&#x2F;a&gt;.
This lets you, one, be sure that your SSH connection isn&#x27;t being
&lt;a href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Man-in-the-middle_attack&quot;&gt;MITM&lt;&#x2F;a&gt;&#x27;d, and two, configure
the machine in a simple way. I use this feature to set the SSH host key and set
the machine up with ZFS and the &lt;code&gt;lz4&lt;&#x2F;code&gt; compression package.&lt;&#x2F;p&gt;
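&lt;p&gt;For illustration, the Nix half of such a UserData payload could look roughly
like the sketch below. This is not my exact configuration: it assumes the stock
&lt;code&gt;amazon-image.nix&lt;&#x2F;code&gt; module from nixpkgs, the &lt;code&gt;hostId&lt;&#x2F;code&gt; value is a placeholder,
and the host key part is left out.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{ pkgs, modulesPath, ... }:
{
  # EC2 support that ships with nixpkgs (assumed to be what the image uses).
  imports = [ &amp;quot;${modulesPath}&amp;#x2F;virtualisation&amp;#x2F;amazon-image.nix&amp;quot; ];

  # Enable ZFS support on the machine.
  boot.supportedFilesystems = [ &amp;quot;zfs&amp;quot; ];
  # ZFS on NixOS needs a host ID (8 hex digits); this value is a placeholder.
  networking.hostId = &amp;quot;deadbeef&amp;quot;;

  # Make the lz4 package available.
  environment.systemPackages = [ pkgs.lz4 ];
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;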
&lt;h3 id=&quot;questions-comments&quot;&gt;Questions? Comments?&lt;&#x2F;h3&gt;
&lt;p&gt;Email me if you set this system up. This is purposely not a tutorial, so you may
hit snags. If you think something could be clearer feel free to make an
&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;marcopolo&#x2F;marcopolo.github.io&quot;&gt;edit&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
	</entry>
	
	<entry xml:lang="en">
		<title>Goodbye, bit rot</title>
		<published>2021-02-01T00:00:00+00:00</published>
		<updated>2021-02-01T00:00:00+00:00</updated>
		<link href="https://marcopolo.io/code/goodbye-bit-rot/" type="text/html"/>
		<id>https://marcopolo.io/code/goodbye-bit-rot/</id>
		<content type="html">&lt;p&gt;Take a look at this picture:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;marcopolo.io&#x2F;code&#x2F;goodbye-bit-rot&#x2F;smalltalk-76.png&quot; alt=&quot;Smalltalk 76&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;That&#x27;s a photo of Smalltalk 76 running the prototypical desktop UI. It&#x27;s
taken for granted that this photo will be viewable for the indefinite future
(or as long as we keep a PNG viewer around). But when we think about code,
maybe the very same Smalltalk code we took this photo of, it&#x27;s assumed that
eventually that code will stop running. It&#x27;ll stop working because of a
mysterious force known as &lt;a href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Software_rot&quot;&gt;bit
rot&lt;&#x2F;a&gt;. Why? Is this truly
inevitable? Or can we do better?&lt;&#x2F;p&gt;
&lt;h2 id=&quot;we-can-do-better&quot;&gt;We can do better&lt;&#x2F;h2&gt;
&lt;p&gt;Bit rot often manifests in the case where some software &lt;em&gt;A&lt;&#x2F;em&gt; relies on a certain
configured environment. Imagine &lt;em&gt;A&lt;&#x2F;em&gt; relies on a shared library &lt;em&gt;B&lt;&#x2F;em&gt;. As time
progresses, the shared library &lt;em&gt;B&lt;&#x2F;em&gt; can (and probably will) be updated
independently of &lt;em&gt;A&lt;&#x2F;em&gt;, thus breaking &lt;em&gt;A&lt;&#x2F;em&gt;. But what if &lt;em&gt;A&lt;&#x2F;em&gt; could say it
explicitly depends on version &lt;em&gt;X.Y.Z&lt;&#x2F;em&gt; of &lt;em&gt;B&lt;&#x2F;em&gt;, or even better yet, the version
of the library that hashes to the value &lt;code&gt;0xBADCOFFEE&lt;&#x2F;code&gt;? Then you break the
implicit dependency of a correctly configured environment. &lt;em&gt;A&lt;&#x2F;em&gt; stops
depending on the world being in a certain state. Instead, &lt;em&gt;A&lt;&#x2F;em&gt;
&lt;em&gt;explicitly defines&lt;&#x2F;em&gt; what the world it needs should look like.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;enter-nix&quot;&gt;Enter Nix&lt;&#x2F;h2&gt;
&lt;p&gt;This is what &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;&quot;&gt;Nix&lt;&#x2F;a&gt; gives you: a way to explicitly define
what a piece of software needs to build and run. Here&#x27;s an example
definition of how to build the &lt;a href=&quot;https:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;hello&#x2F;&quot;&gt;GNU
Hello&lt;&#x2F;a&gt; program:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;with (import &amp;lt;nixpkgs&amp;gt; {});
derivation {
  name = &amp;quot;hello&amp;quot;;
  builder = &amp;quot;${bash}&amp;#x2F;bin&amp;#x2F;bash&amp;quot;;
  args = [ .&amp;#x2F;builder.sh ];
  buildInputs = [ gnutar gzip gnumake gcc binutils-unwrapped coreutils gawk gnused gnugrep ];
  src = .&amp;#x2F;hello-2.10.tar.gz;
  system = builtins.currentSystem;
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;It&#x27;s not necessary to explain this &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;guides&#x2F;nix-pills&#x2F;generic-builders.html#idm140737320275008&quot;&gt;code in
detail&lt;&#x2F;a&gt;.
It&#x27;s enough to point out that &lt;code&gt;buildInputs&lt;&#x2F;code&gt; defines what the environment should
contain (i.e. it should contain &lt;code&gt;gnutar&lt;&#x2F;code&gt;, &lt;code&gt;gzip&lt;&#x2F;code&gt;, &lt;code&gt;gnumake&lt;&#x2F;code&gt;, etc.). And the
versions of these dependencies are defined by the current version of
&lt;code&gt;&amp;lt;nixpkgs&amp;gt;&lt;&#x2F;code&gt;. These dependencies can be further pinned (or &lt;em&gt;locked&lt;&#x2F;em&gt; in the
terminology of some languages like JavaScript and Rust) to ensure that this
program will always be built with the same exact versions of its dependencies.
This extends to the runtime as well. This means you can run two different
programs that each rely on a different &lt;code&gt;glibc&lt;&#x2F;code&gt;. Or to bring it back to our
initial example, software &lt;em&gt;A&lt;&#x2F;em&gt; will always run because it will always use the
same exact shared library &lt;em&gt;B&lt;&#x2F;em&gt;.&lt;&#x2F;p&gt;
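&lt;p&gt;As a rough sketch of what pinning can look like without flakes, you can import
&lt;code&gt;nixpkgs&lt;&#x2F;code&gt; from one fixed commit instead of whatever &lt;code&gt;&amp;lt;nixpkgs&amp;gt;&lt;&#x2F;code&gt; currently
points to. The commit below is the same &lt;code&gt;nixos-20.09&lt;&#x2F;code&gt; revision that is locked in the
flake example later in this post, and the file name &lt;code&gt;pinned.nix&lt;&#x2F;code&gt; is hypothetical:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# pinned.nix (hypothetical file name)
let
  # Pin nixpkgs to one exact commit rather than a moving channel.
  rev = &amp;quot;ae47c79479a086e96e2977c61e538881913c0c08&amp;quot;;
  pkgs = import (builtins.fetchTarball {
    url = &amp;quot;https:&amp;#x2F;&amp;#x2F;github.com&amp;#x2F;NixOS&amp;#x2F;nixpkgs&amp;#x2F;archive&amp;#x2F;${rev}.tar.gz&amp;quot;;
    # Adding a sha256 attribute here would also verify the download;
    # it is omitted in this sketch.
  }) { };
in
# Every build of this expression uses the GNU Hello from that commit.
pkgs.hello
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Building it with &lt;code&gt;nix-build pinned.nix&lt;&#x2F;code&gt; should give the same result regardless
of which channel the machine happens to be on.&lt;&#x2F;p&gt;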
&lt;h2 id=&quot;a-concrete-example-this-will-never-bit-rot&quot;&gt;A concrete example. This will never bit rot.&lt;&#x2F;h2&gt;
&lt;p&gt;To continue our Smalltalk theme, here&#x27;s a &amp;quot;Hello World&amp;quot; program that, barring a
fundamental change in how Nix Flakes works, will work forever&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; on an x86_64
linux machine.&lt;&#x2F;p&gt;
&lt;p&gt;The definition of our program, &lt;code&gt;flake.nix&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;{
  inputs.nixpkgs.url = &amp;quot;github:NixOS&amp;#x2F;nixpkgs&amp;#x2F;nixos-20.09&amp;quot;;
  outputs =
    { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in
    {
      defaultPackage.x86_64-linux = pkgs.writeScriptBin &amp;quot;hello-smalltalk&amp;quot; &amp;#x27;&amp;#x27;
        ${pkgs.gnu-smalltalk}&amp;#x2F;bin&amp;#x2F;gst &amp;lt;&amp;lt;&amp;lt; &amp;quot;Transcript show: &amp;#x27;Hello World!&amp;#x27;.&amp;quot;
      &amp;#x27;&amp;#x27;;
    };
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The pinned version of all our dependencies, &lt;code&gt;flake.lock&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;json&quot; class=&quot;language-json &quot;&gt;&lt;code class=&quot;language-json&quot; data-lang=&quot;json&quot;&gt;{
  &amp;quot;nodes&amp;quot;: {
    &amp;quot;nixpkgs&amp;quot;: {
      &amp;quot;locked&amp;quot;: {
        &amp;quot;lastModified&amp;quot;: 1606669556,
        &amp;quot;narHash&amp;quot;: &amp;quot;sha256-9rlqZ5JwnA6nK04vKhV0s5ndepnWL5hpkaTV1b4ASvk=&amp;quot;,
        &amp;quot;owner&amp;quot;: &amp;quot;NixOS&amp;quot;,
        &amp;quot;repo&amp;quot;: &amp;quot;nixpkgs&amp;quot;,
        &amp;quot;rev&amp;quot;: &amp;quot;ae47c79479a086e96e2977c61e538881913c0c08&amp;quot;,
        &amp;quot;type&amp;quot;: &amp;quot;github&amp;quot;
      },
      &amp;quot;original&amp;quot;: {
        &amp;quot;owner&amp;quot;: &amp;quot;NixOS&amp;quot;,
        &amp;quot;ref&amp;quot;: &amp;quot;nixos-20.09&amp;quot;,
        &amp;quot;repo&amp;quot;: &amp;quot;nixpkgs&amp;quot;,
        &amp;quot;type&amp;quot;: &amp;quot;github&amp;quot;
      }
    },
    &amp;quot;root&amp;quot;: {
      &amp;quot;inputs&amp;quot;: {
        &amp;quot;nixpkgs&amp;quot;: &amp;quot;nixpkgs&amp;quot;
      }
    }
  },
  &amp;quot;root&amp;quot;: &amp;quot;root&amp;quot;,
  &amp;quot;version&amp;quot;: 7
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Copy those files into a directory and run it:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;bash&quot; class=&quot;language-bash &quot;&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;❯ nix run
Hello World!
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;h2 id=&quot;solid-foundations&quot;&gt;Solid Foundations&lt;&#x2F;h2&gt;
&lt;p&gt;With Nix, we can make steady forward progress without fear that our foundations
will collapse under us like sand castles. Once we&#x27;ve built something in Nix we
can be pretty sure it will work for our colleague or ourselves in 10 years. Nix
is building a solid foundation that I can no longer live without.&lt;&#x2F;p&gt;
&lt;p&gt;If you haven&#x27;t used Nix before, here&#x27;s your call to action:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Nix&#x27;s homepage: &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;&quot;&gt;https:&#x2F;&#x2F;nixos.org&#x2F;&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;Nix&#x27;s Learning page: &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;learn&quot;&gt;https:&#x2F;&#x2F;nixos.org&#x2F;learn&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;Learn Nix in little bite-sized pills: &lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;guides&#x2F;nix-pills&#x2F;&quot;&gt;https:&#x2F;&#x2F;nixos.org&#x2F;guides&#x2F;nix-pills&#x2F;&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;hr &#x2F;&gt;
&lt;h2 id=&quot;disclaimer&quot;&gt;Disclaimer&lt;&#x2F;h2&gt;
&lt;p&gt;There are various factors that lead to bit rot. Some are easier to solve than
others. For the purpose of this post I&#x27;m only considering programs that are
roughly self contained. For example, if a program relies on hitting a specific
Google endpoint, the only way to use this program would be to emulate the whole
Google stack or rely on that &lt;a href=&quot;https:&#x2F;&#x2F;gcemetery.co&#x2F;&quot;&gt;endpoint existing&lt;&#x2F;a&gt;.
Sometimes it&#x27;s doable to emulate the external API, and sometimes it isn&#x27;t. This
post is specifically about cases where it is feasible to emulate the external API.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;footnotes&quot;&gt;Footnotes&lt;&#x2F;h3&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;1&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;1&lt;&#x2F;sup&gt;
&lt;p&gt;Okay forever is a really long time. And this will likely not run forever. But why? The easy reasons are: &amp;quot;Github is down&amp;quot;, &amp;quot;A source tarball you need can&#x27;t be fetched from the internet&amp;quot;, &amp;quot;x86_64 processors can&#x27;t be found or emulated&amp;quot;. But what&#x27;s a weird reason that this may fail in the future? It&#x27;ll probably be hard to predict, but maybe something like: SHA256 has been broken and criminals and&#x2F;or pranksters have published malicious packages that match a certain SHA256. So build tools that rely on a deterministic and hard to break hash algorithm like SHA256 (like what Nix does) will no longer be reliable. That would be a funny future. Send me your weird reasons: &lt;code&gt;&amp;quot;marco+forever&amp;quot; ++ &amp;quot;@marcopolo.io&amp;quot;&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
</content>
	</entry>
	
	<entry xml:lang="en">
		<title>Nix and small containers with Docker multi-stage builds</title>
		<published>2020-05-15T00:00:00+00:00</published>
		<updated>2020-05-15T00:00:00+00:00</updated>
		<link href="https://marcopolo.io/code/nix-and-small-containers/" type="text/html"/>
		<id>https://marcopolo.io/code/nix-and-small-containers/</id>
		<content type="html">&lt;p&gt;Multi Stage builds are great for minimizing the size of your container. The
general idea is you have a stage as your builder and another stage as your
product. This allows you to have a full development and build container while
still having a lean production container. The production container only carries
its runtime dependencies.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;dockerfile&quot; class=&quot;language-dockerfile &quot;&gt;&lt;code class=&quot;language-dockerfile&quot; data-lang=&quot;dockerfile&quot;&gt;FROM golang:1.7.3
WORKDIR &amp;#x2F;go&amp;#x2F;src&amp;#x2F;github.com&amp;#x2F;alexellis&amp;#x2F;href-counter&amp;#x2F;
RUN go get -d -v golang.org&amp;#x2F;x&amp;#x2F;net&amp;#x2F;html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR &amp;#x2F;root&amp;#x2F;
COPY --from=0 &amp;#x2F;go&amp;#x2F;src&amp;#x2F;github.com&amp;#x2F;alexellis&amp;#x2F;href-counter&amp;#x2F;app .
CMD [&amp;quot;.&amp;#x2F;app&amp;quot;]
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;(from Docker&#x27;s &lt;a href=&quot;https:&#x2F;&#x2F;docs.docker.com&#x2F;develop&#x2F;develop-images&#x2F;multistage-build&#x2F;&quot;&gt;docs on multi-stage&lt;&#x2F;a&gt;)&lt;&#x2F;p&gt;
&lt;p&gt;Sounds great, right? What&#x27;s the catch? Well, it&#x27;s not always easy to know what the
runtime dependencies are. For example you may have installed something in &#x2F;lib
that was needed in the build process. But it turned out to be a shared library
and now it needs to be included in the production container. Tricky! Is there
some automated way to know all your runtime dependencies?&lt;&#x2F;p&gt;
&lt;h2 id=&quot;enter-nix&quot;&gt;Enter Nix&lt;&#x2F;h2&gt;
&lt;p&gt;&lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;&quot;&gt;Nix&lt;&#x2F;a&gt; is a functional and immutable package manager. It works great for
reproducible builds. It keeps track of packages and their dependencies via their
content hashes. And, relevant for this exercise, it also keeps track of the
dependencies of a built package. That means we can use Nix to build our project
and then ask Nix what our runtime dependencies are. With that information we can
copy just those files to the product stage of our multi-stage build and end up
with the smallest possible docker container.&lt;&#x2F;p&gt;
&lt;p&gt;Our general strategy will be to use a Nix builder to build our code. Ask the Nix
builder to tell us all the runtime dependencies of our built executable. Then
copy the executable with all its runtime dependencies to a fresh container. Our
expectation is that this will result in a minimal production container.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;example&quot;&gt;Example&lt;&#x2F;h2&gt;
&lt;p&gt;As a simple example let&#x27;s package a &amp;quot;Hello World&amp;quot; program in Rust. The code is
what you&#x27;d expect:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;rust&quot; class=&quot;language-rust &quot;&gt;&lt;code class=&quot;language-rust&quot; data-lang=&quot;rust&quot;&gt;pub fn main() {
    println!(&amp;quot;Hello, world!&amp;quot;);
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;h3 id=&quot;nix-build-expression&quot;&gt;Nix build expression&lt;&#x2F;h3&gt;
&lt;p&gt;If we were just building this locally, we&#x27;d just run &lt;code&gt;cargo build --release&lt;&#x2F;code&gt;.
But we are going to have Nix build this for us so that it can track the runtime
dependencies. Therefore we need a &lt;code&gt;default.nix&lt;&#x2F;code&gt; file to describe the build
process. Our &lt;code&gt;default.nix&lt;&#x2F;code&gt; build file looks like this:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;with (import &amp;lt;nixpkgs&amp;gt; {});
rustPlatform.buildRustPackage {
  name = &amp;quot;hello-rust&amp;quot;;
  buildInputs = [ cargo rustc ];
  src = .&amp;#x2F;.;
  # This is a shasum over our crate dependencies
  cargoSha256 = &amp;quot;1s4vg081ci6hskb3kk965nxnx384w8xb7n7yc4g93hj55qsk4vw5&amp;quot;;
  # Use this to figure out the correct Sha256
  # cargoSha256 = lib.fakeSha256;
  buildPhase = &amp;#x27;&amp;#x27;
    cargo build --release
  &amp;#x27;&amp;#x27;;
  checkPhase = &amp;quot;&amp;quot;;
  installPhase = &amp;#x27;&amp;#x27;
    mkdir -p $out&amp;#x2F;bin
    cp target&amp;#x2F;release&amp;#x2F;hello $out&amp;#x2F;bin
  &amp;#x27;&amp;#x27;;
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Breaking down the Nix expression: we specify the inputs to our
build, &lt;code&gt;cargo&lt;&#x2F;code&gt; and &lt;code&gt;rustc&lt;&#x2F;code&gt;; we figure out what the sha256sum is of our crate
dependencies; and we define some commands to build and install the executable.&lt;&#x2F;p&gt;
&lt;p&gt;We can verify this works locally on our machine by running &lt;code&gt;nix-build .&lt;&#x2F;code&gt;
(assuming you have Nix installed locally). You&#x27;ll end up with a symlink named
result that points to the compiled executable residing in &#x2F;nix&#x2F;store. Running
&lt;code&gt;.&#x2F;result&#x2F;bin&#x2F;hello&lt;&#x2F;code&gt; should print &amp;quot;Hello, world!&amp;quot;.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;docker-file&quot;&gt;Docker file&lt;&#x2F;h3&gt;
&lt;p&gt;Now that we&#x27;ve built our Nix expression that defines how the code is built, we
can add Docker to the mix. The goal is to have a builder stage that runs the
nix-build command, then have a production stage that copies the executable and
its runtime dependencies from builder. The production stage container will
therefore have only the minimal amount of stuff needed to run.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;Dockerfile&quot; class=&quot;language-Dockerfile &quot;&gt;&lt;code class=&quot;language-Dockerfile&quot; data-lang=&quot;Dockerfile&quot;&gt;# Use nix as the builder
FROM nixos&amp;#x2F;nix:latest AS builder

# Update the channel so we can get the latest packages
RUN nix-channel --update nixpkgs

WORKDIR &amp;#x2F;app

# Run the builder first without our code to fetch build dependencies.
# This will fail, but that&amp;#x27;s okay. We just want to have the build dependencies
# cached as a layer. This is just a caching optimization that can be removed.
COPY default.nix .
RUN nix-build . || true

COPY . .

# Now that our code is here we actually build it
RUN nix-build .

# Copy all the run time dependencies into &amp;#x2F;tmp&amp;#x2F;nix-store-closure
RUN mkdir &amp;#x2F;tmp&amp;#x2F;nix-store-closure
RUN echo &amp;quot;Output references (Runtime dependencies):&amp;quot; $(nix-store -qR result&amp;#x2F;)
RUN cp -R $(nix-store -qR result&amp;#x2F;) &amp;#x2F;tmp&amp;#x2F;nix-store-closure

ENTRYPOINT [ &amp;quot;&amp;#x2F;bin&amp;#x2F;sh&amp;quot; ]

# Our production stage
FROM scratch
WORKDIR &amp;#x2F;app
# Copy the runtime dependencies into &amp;#x2F;nix&amp;#x2F;store
# Note we don&amp;#x27;t actually have nix installed on this container. But that&amp;#x27;s fine,
# we don&amp;#x27;t need it, the built code only relies on the given files existing, not
# Nix.
COPY --from=builder &amp;#x2F;tmp&amp;#x2F;nix-store-closure &amp;#x2F;nix&amp;#x2F;store
COPY --from=builder &amp;#x2F;app&amp;#x2F;result &amp;#x2F;app
CMD [&amp;quot;&amp;#x2F;app&amp;#x2F;bin&amp;#x2F;hello&amp;quot;]
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;If we build this &lt;code&gt;Dockerfile&lt;&#x2F;code&gt; with &lt;code&gt;docker build .&lt;&#x2F;code&gt;, we&#x27;ll end up with a 33 MB
container. Compare this to a naive
&lt;a href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;MarcoPolo&#x2F;7953f1ca2691405b5b04659027967336&quot;&gt;Dockerfile&lt;&#x2F;a&gt;
where we end up with a 624 MB container! That&#x27;s an order of magnitude smaller
for a relatively simple change.&lt;&#x2F;p&gt;
&lt;p&gt;Note that our executable has a shared library dependency on libc. The
&lt;code&gt;scratch&lt;&#x2F;code&gt; base image doesn&#x27;t include libc, but this still works. How? When we build our code we
reference the libc shared library stored inside &lt;code&gt;&#x2F;nix&#x2F;store&lt;&#x2F;code&gt;. Then when we copy
the executable, Nix tells us that the libc shared library is also a dependency so
we copy that too. Our executable uses only the libc inside &lt;code&gt;&#x2F;nix&#x2F;store&lt;&#x2F;code&gt; and
doesn&#x27;t rely on any system provided libraries in &lt;code&gt;&#x2F;lib&lt;&#x2F;code&gt; or elsewhere.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;&#x2F;h2&gt;
&lt;p&gt;With a simple Nix build expression and Docker&#x27;s multi-stage builds, we
can combine Docker&#x27;s strength of providing a consistent and portable environment
with Nix&#x27;s fine-grained dependency resolution to create a minimal production
container.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;a-note-on-statically-linked-executables&quot;&gt;A note on statically linked executables&lt;&#x2F;h2&gt;
&lt;p&gt;Yes, you could build the hello world example as a statically linked musl-backed
binary. But that&#x27;s not the point. Sometimes code relies on a shared library, and
it&#x27;s either not worth it or impossible to convert it. The beauty of this system is
that it doesn&#x27;t matter if the output executable is fully statically linked or
not. It will work just the same and copy over the minimum amount of code needed
for the production container to work.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;a-note-on-nix-s-dockertools&quot;&gt;A note on Nix&#x27;s dockerTools&lt;&#x2F;h2&gt;
&lt;p&gt;Nix provides a set of functions for creating Docker images:
&lt;a href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;nixpkgs&#x2F;manual&#x2F;#sec-pkgs-dockerTools&quot;&gt;pkgs.dockerTools&lt;&#x2F;a&gt;. It&#x27;s
very cool, and I recommend checking it out. Unlike Docker, it produces
deterministic images. Note, for all but the simplest examples, KVM is required.&lt;&#x2F;p&gt;
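&lt;p&gt;As a rough, untested sketch, the hello world example above could be packaged
with &lt;code&gt;dockerTools&lt;&#x2F;code&gt; along these lines. It assumes the &lt;code&gt;default.nix&lt;&#x2F;code&gt; from earlier
sits next to this file. The file name &lt;code&gt;docker.nix&lt;&#x2F;code&gt; and the image name are made up
for illustration:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix &quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# docker.nix (hypothetical file name)
with (import &amp;lt;nixpkgs&amp;gt; {});
let
  # Reuse the build expression from earlier in this post.
  helloRust = import .&amp;#x2F;default.nix;
in
dockerTools.buildLayeredImage {
  name = &amp;quot;hello-rust&amp;quot;;
  tag = &amp;quot;latest&amp;quot;;
  # Referencing the store path here pulls the executable and its runtime
  # dependencies into the image.
  config.Cmd = [ &amp;quot;${helloRust}&amp;#x2F;bin&amp;#x2F;hello&amp;quot; ];
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Running &lt;code&gt;nix-build docker.nix&lt;&#x2F;code&gt; should leave a &lt;code&gt;result&lt;&#x2F;code&gt; tarball that can be
fed to &lt;code&gt;docker load&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;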
&lt;h2 id=&quot;a-note-on-bazel-s-rules-docker&quot;&gt;A note on Bazel&#x27;s rules_docker&lt;&#x2F;h2&gt;
&lt;p&gt;I don&#x27;t know much about this, but I&#x27;d assume this would be similar to what I&#x27;ve
described. If you know more about this, please let me know!&lt;&#x2F;p&gt;
</content>
	</entry>
</feed>
