Reproducible Builds with Nix

Reproducible builds - also known as deterministic builds - are the holy grail for many organizations that care about the authenticity and security of their products. In simple terms, a build is reproducible if the same code version and parameters produce exactly the same binary, down to a single bit, regardless of where and when it is built.

The main reason why this is useful is that anyone who might doubt the authenticity of a software release - or just wants to make sure they are getting exactly what they want - can build the software themselves and expect exactly the same result.

Reproducible Android Builds

Because of the complexity of our stack here at Status, achieving deterministic builds required something more than the package managers we already use: Yarn, Maven, and Gradle.

While Yarn is deterministic because it uses a lock file which ensures your dependency versions will not change without you knowing, the other two are much more problematic.

Both Maven and Gradle resolve Java dependencies using POM (Project Object Model) files. These XML files define project dependencies, their types, and relations. The issue is that they often do not specify the exact version required. In addition, they are fetched from Maven repositories at build time, and their contents can change depending on when they are downloaded. The same goes for the maven-metadata.xml files, which list the latest available versions of a package.

The result is non-deterministic builds that use a different collection of dependencies when built at different times or on different machines. The local Maven cache can also affect the outcome.

The Solution

In order to control all dependencies and tool versions used during the build, a more comprehensive solution was necessary. That solution is the Nix package manager.

Nix is a tool which uses its own purely functional, lazily evaluated language, the Nix language, to define the entire tree of software dependencies of a build. The same language is used to describe an entire Linux operating system: NixOS.

The language allows a developer to define everything that is necessary to build a piece of software in a fully deterministic manner. This includes:

  • Sources
  • Patches
  • Compilers
  • Build Scripts
  • Dependencies & Libraries
  • Environment Variables
  • Other Tools

Because of this, all of the variables involved in a build of your software are controlled for. This even includes timestamps: for example, files in the Nix store have their modification time reset to a fixed value, one second after the Unix epoch (00:00:00 UTC on 1 January 1970), instead of the time the build actually ran.

The Example

As an example let's take the simplest program we can get: Hello World written in C.

#include <stdio.h>

int main()
{
  printf("Hello World");
  return 0;
}

This code is available from the collection of Hello World programs at www.helloworld.org.

In Nix we build software using a derivation. We'll make a simple one to build our helloworld.c program:

{ pkgs ? import <nixpkgs> { } }:

let
  inherit (pkgs) stdenv gcc fetchurl;
in stdenv.mkDerivation {
  pname = "hello";
  version = "1.0";

  buildInputs = [ gcc ];

  src = fetchurl {
    url = "https://www.helloworld.org/data/helloworld.c";
    sha256 = "1syr8snddx5v71arsvv205ka82qljhjg2424yylrp5rymr049w69";
  };

  buildPhases = ["unpackPhase" "buildPhase" "installPhase"];

  unpackPhase = ''
    cp $src ./hello.c
  '';

  buildPhase = ''
    gcc -o hello hello.c
  '';

  installPhase = ''
    mkdir -p $out/bin
    cp hello $out/bin/
  '';
}

If we run nix-build on this file, named default.nix, we'll get the resulting hello binary in /nix/store, which is where all Nix build results and inputs are stored:

 > nix-build default.nix
these derivations will be built:
  /nix/store/hzd4ci09wj4bdif462hjwh8imdg6lrl6-hello-1.0.drv
building '/nix/store/hzd4ci09wj4bdif462hjwh8imdg6lrl6-hello-1.0.drv'...
unpacking sources
patching sources
configuring
no configure script, doing nothing
building
installing
post-installation fixup
shrinking RPATHs of ELF executables and libraries in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0
shrinking /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0/bin/hello
strip is /nix/store/hrkc2sf2883l16d5yq3zg0y339kfw4xv-binutils-2.31.1/bin/strip
stripping (with command strip and flags -S) in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0/bin
patching script interpreter paths in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0
checking for references to /build/ in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0...
/nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0

 > /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0/bin/hello
Hello World

Step by Step

Build Arguments

At the top of the default.nix file sits the single argument:

{ pkgs ? import <nixpkgs> { } }:

The argument is an attribute set with one key: pkgs.

This argument has a default value, import <nixpkgs> { }. This incantation essentially means we are importing the default nixpkgs, the copy found via the Nix search path (typically the currently subscribed channel). nixpkgs is a massive Git repository that contains derivations for building a lot of software, including the GCC compiler we used in the build.
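To make the default-argument syntax more concrete, here is a minimal sketch that is not part of the build above. The greet function and its name argument are made up for illustration; only the { arg ? default }: pattern matches what default.nix uses.

```nix
# args.nix (hypothetical file): a function taking an attribute set
# with one key, name, which defaults to "World" when omitted.
let
  greet = { name ? "World" }: "Hello ${name}";
in
[
  (greet { })                # no key given, so the default applies
  (greet { name = "Nix"; })  # an explicit value overrides the default
]
```

Evaluating this with nix-instantiate --eval --strict args.nix should yield a list of the two strings "Hello World" and "Hello Nix". In the same way, nix-build calls the function in default.nix without arguments, so pkgs falls back to import <nixpkgs> { }.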

Preparation

The next let/in block is simply a way to prepare some things we need before the build:

let
  inherit (pkgs) stdenv gcc fetchurl;
in

The stdenv, gcc, and fetchurl variables are actually keys of the pkgs set. The inherit line merely brings them into scope; without it we could simply write pkgs.fetchurl wherever we use fetchurl.
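As a rough illustration of that equivalence, the sketch below compares both spellings. The pkgs set here is a small stand-in with placeholder strings, an assumption made only so the expression evaluates without touching nixpkgs.

```nix
# inherit.nix (hypothetical file): "inherit (set) a b;" is shorthand
# for "a = set.a; b = set.b;".
let
  # Stand-in for the real pkgs set, used only for this demonstration.
  pkgs = { gcc = "gcc-placeholder"; fetchurl = "fetchurl-placeholder"; };

  # Names brought into scope with inherit.
  withInherit = let inherit (pkgs) gcc fetchurl; in [ gcc fetchurl ];

  # The same values accessed through the set directly.
  withoutInherit = [ pkgs.gcc pkgs.fetchurl ];
in
withInherit == withoutInherit
```

Evaluating it with nix-instantiate --eval inherit.nix should print true, since both lists hold the same values.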

The Derivation

The mkDerivation line is simply a call of the mkDerivation function:

  stdenv.mkDerivation {

The function is called with a set ({ ... }) whose attributes are things like pname or version.

Derivation Arguments

Everything within the curly braces after the mkDerivation call is passed to it as a single set of arguments:

stdenv.mkDerivation {
  pname = "hello";
  version = "1.0";

  buildInputs = [ gcc ];
  ...

They define things like buildInputs, which in our case is the gcc compiler coming from nixpkgs. Because it is passed to the build via buildInputs, the tools in /nix/store/blznzy96bwzv58v7iy9bgp1r8hmd3g1f-gcc-wrapper-9.2.0/bin are made available via the PATH environment variable.

Fetching Sources

Next, the build needs to get the program source from somewhere. In our case we call the fetchurl function with a set containing the url of the source file and its sha256 hash, which ensures we are getting exactly what we expect.

  src = fetchurl {
    url = "https://www.helloworld.org/data/helloworld.c";
    sha256 = "1syr8snddx5v71arsvv205ka82qljhjg2424yylrp5rymr049w69";
  };

nixpkgs provides many other fetchers, like fetchFromGitHub or fetchFromGitLab, for other possible code sources.

Build Phases

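The buildPhases key looks self-explanatory: it is meant to list the phases the build will run. Note, however, that stdenv does not read an attribute with this name; the variable it consults is phases. That is why the build log above also shows the default patch, configure, and fixup phases running around our own unpackPhase, buildPhase, and installPhase.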

  buildPhases = ["unpackPhase" "buildPhase" "installPhase"];
  
  unpackPhase = ''
    cp $src ./hello.c
  '';
  
  buildPhase = ''
    gcc -o hello hello.c
  '';
  
  installPhase = ''
    mkdir -p $out/bin
    cp hello $out/bin/
  '';

The slightly magic elements are the $src and $out environment variables.

  • $src - Literally the result of our call to fetchurl:
    • /nix/store/imk4qa4k8rrfmsyckmwsvzd99dlfnp3c-helloworld.c
  • $out - The directory where build result should end up:
    • /nix/store/vrggb8dhxc9cl8330b8xrwfcgkrzsq9w-hello-1.0

This shows another special thing about Nix: all arguments passed to mkDerivation are made available as environment variables in the shell that executes the build steps. This includes ones like buildPhase or installPhase.
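A small sketch can make this visible. The greeting attribute and the env-demo name below are invented for the example and mean nothing to stdenv; the phases attribute is the variable stdenv reads to restrict which phases run, used here so that no source is needed.

```nix
# env-demo.nix (hypothetical file): a custom attribute shows up as an
# environment variable inside the build shell.
{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  name = "env-demo";

  # Made-up attribute; stdenv does not interpret it, it is only exported.
  greeting = "Hello from an attribute";

  # There is nothing to unpack or compile, so run the install phase only.
  phases = [ "installPhase" ];

  # $greeting is the attribute above, $out is the store path to create.
  installPhase = ''
    echo "$greeting" > $out
  '';
}
```

After nix-build env-demo.nix, the result symlink should point to a file in /nix/store containing the greeting text.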

The Result

Calling nix-build on our derivation simply takes all of the inputs for mkDerivation and constructs a shell in which the build phases are executed with all the specified tools and environment variables available. Because packages come from the nixpkgs repository, they can be pinned to specific versions, making the builds deterministic and reproducible.
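One possible way to do that pinning is sketched below; it is not taken from our actual setup. The file name pinned.nix and the choice of the 23.11 release tag are assumptions for illustration, and any tag or commit of nixpkgs could be used instead.

```nix
# pinned.nix (hypothetical file): build default.nix against a fixed
# nixpkgs revision instead of whatever <nixpkgs> currently points to.
let
  nixpkgsSrc = builtins.fetchTarball {
    # Assumption: the 23.11 tag is only an example revision.
    url = "https://github.com/NixOS/nixpkgs/archive/23.11.tar.gz";
    # A sha256 attribute can be added here so Nix verifies the download.
    # It is left out because it has to be computed for the chosen
    # revision, for example with nix-prefetch-url --unpack.
  };

  pkgs = import nixpkgsSrc { };
in
# Pass the pinned package set as the pkgs argument of default.nix.
import ./default.nix { inherit pkgs; }
```

Running nix-build pinned.nix then uses the compiler and libraries from that revision regardless of which channel is installed on the machine.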

The truth is, nixpkgs already has a hello package, but it builds a much more elaborate program called GNU Hello.
You can find its derivation in pkgs/applications/misc/hello/default.nix, and you'll find it to be much simpler than ours. That is because of something called genericBuild, which runs the most common steps for packages written in C, such as ./configure, make, and make install.

Conclusion

This of course only scratches the surface of what Nix is capable of, and to really understand what can be achieved using Nix a much deeper dive is required.

If you found this interesting, I've organized two presentations on Nix these past two weeks to make our developers better acquainted with the Nix package manager, help them use it more, and help them debug its issues.

You can watch the video or browse the presentation PDFs. Hopefully this will help more people discover Nix and NixOS, which are very powerful tools for software developers and sysadmins alike.