[remark] We need deterministic installs, not just immutable OSs -- Volution Notes

These days I see a lot of talk about immutable Linux distributions (like various Fedora, OpenSUSE or Ubuntu spinoffs), some targeting desktops, some servers, and others containers. All the talk is about reliability and maintainability, because any updates are done atomically, any failures can be easily rolled back, and user-state is kept outside the OS file-system -- except for /etc which is "merged", and except for /var which is completely ignored. (A good recent article on the subject is Introduction to immutable Linux systems.)

All of which sounds very nice, and I'm glad to see some positive progress in the unices landscape (Linux and BSD alike).

Because I still remember how Windows used to (perhaps still needs to?) be reinstalled every few months because cruft would pile on, and the whole thing crawled to a halt or crashed constantly. And I still struggle with Linux deployments that slowly drift away from a pristine installation with each new update and with each new configuration change (with or without Ansible-like tools).

However, I'm not overly enthusiastic, as I'll shortly describe.

Let me start first with a parallel: my favorite topic of containers. The technology is great in principle, however it is terrible in practice.

Instead of the container technology nudging people into creating small deployment bases with only the minimum required files to support the use-case at hand, it has helped "devops engineers" bundle even the proverbial kitchen sink with their applications.

(I am aware that there are initiatives to create small, language-tailored containerized runtimes; however, in practice nothing beats the good old apt install this-package that-package the-whole-internet-meta-package...)

Thus, my enthusiasm might be kindled by the possibilities enabled by the technology, but I'm quickly reminded by history that we don't always use the tools at our disposal in the most proper manner...

Getting back to our immutable OSs, my main issue is with how they are assembled. (And my objections apply to all intended scenarios, from desktop, to container, to embedded.)

Namely, regardless of the chosen distribution (perhaps except for NixOS, but I'll have to learn more about NixOS to be sure) they all basically snapshot the outcome of a regular distribution package install and call the outcome the immutable OS "image" or "snapshot" or "layer" or "whatever-your-fancy".

Sure, to have an immutable OS there must be a lot of other supporting infrastructure (both in code and in services), there are certainly a lot of changes to the OS internals (especially bootup scripts), however ignoring all that, they are plain snapshots of regular distributions with some magic sprinkled here and there.

(I won't even mention that they make the usage of "application containers" like Flatpak or Snap mandatory for any kind of desktop application, or "proper containers", like Podman or Docker or any of their wrappers, for any development work...)

The only differentiator, besides which Linux distribution they are based on, is how that snapshot is made, how it's customized (for example, for installing new packages, if that is possible at all), how it's updated and how it's distributed and installed. Most use OSTree or rpm-ostree, others use BTRFS snapshots, while others use plain disk images.

What would I like instead?

I think it is past the time for distributions to take a leap forward into the future and change the way software packages are deployed. (I would have preferred to also leap forward into the future with regard to how these packages are built, but that's a much tougher battle...)

But first, I need to make a detour and describe where we are today within the landscape of software packaging and deployment.

For the sake of this article, I'm just going to ignore how the packages are actually built. I would just assume that there is a magic tool that enables us to build-package --package tool-v42 --output ./directory, and in a blink of an eye, we get our build artifacts nicely tucked into ./directory.

At the moment all package managers, from rpm to dpkg to apk to pacman, do the following (I'll be generalizing a bit):

they assume that ./directory contains a metadata file describing the package (name, version, dependencies, etc.);
they assume that ./directory contains all the files that need to be copied into the target installation file-system;
they assume that ./directory might also contain some hooks; these are scripts to be executed before / after each operation of installing / upgrading / uninstalling / etc.;
they take all these and somehow bundle all into a single "binary package" file that is then published via a package repository;

Then at install time all package managers do the following (again, I'll be generalizing):

(let's ignore for the moment the way the package is discovered in the package repository, the way it is fetched and verified;)
they take the binary package file and extract it somewhere into a temporary ./directory;
looking at the metadata file, they first recurse this process for any of the dependencies;
they take the files and copy them into the target installation file-system;
they run the various hooks;
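To make the install-time steps above concrete, here is a minimal sketch in Python. It is not how rpm, dpkg, apk or pacman are actually implemented: the Package class, the install function and the dictionary-based repository are all hypothetical names I made up for illustration, the package is assumed to be already extracted, and the dependency graph is assumed to have no cycles.

```python
import shutil
import subprocess
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class Package:
    """Hypothetical view of an already extracted binary package."""
    name: str
    version: str
    dependencies: list = field(default_factory=list)  # names of other packages
    files_dir: Optional[Path] = None      # tree to copy into the target
    post_install: Optional[Path] = None   # optional hook script


def install(name, repository, target, order=None, run_hooks=True):
    """Install `name` into `target`, dependencies first; return the install order.

    Assumption: `repository` is a plain dict mapping names to Package objects,
    and dependencies are acyclic (a real tool would have to detect cycles).
    """
    order = [] if order is None else order
    if name in order:
        return order
    package = repository[name]
    # Looking at the metadata, recurse first for any of the dependencies.
    for dependency in package.dependencies:
        install(dependency, repository, target, order, run_hooks)
    # Copy the package files into the target installation file-system.
    if package.files_dir is not None:
        shutil.copytree(package.files_dir, target, dirs_exist_ok=True, symlinks=True)
    # Run the hook; this is the only step that executes arbitrary code.
    # (Real package managers usually chroot into the target; here the hook
    # is merely started with the target as its working directory.)
    if run_hooks and package.post_install is not None:
        subprocess.run(["sh", str(package.post_install)], cwd=target, check=True)
    order.append(name)
    return order
```

Notice that everything up to the hook is plain file copying, whose outcome depends only on the package contents; the hook is the single place where anything at all can happen.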

I would be very glad if anyone writes me an email and points out some package manager that doesn't follow exactly this pattern. Sure, there is Clear's swupd which works with multiple packages as "bundles", and perhaps NixOS which I'll have to check if it supports hooks.

A reader pointed out to me that Haiku, instead of archive packages, uses disk images that are mounted read-only.

So what is the problem with any of the above?

Not much, with the exception of those hooks...

Because you see, these hooks are plain sh scripts that can do basically anything on your system, which means:

they might be unreliable and break in subtle ways;
For example, on my OpenSUSE distributions I symlink /etc/profile to /dev/null and I add set -e -E -u -o pipefail to /etc/bash.bashrc which means that many rpm post-install hooks error out for some reason or another...
Thus, you can't always be sure whether the hook executed successfully; you have to double-check.
they should be (but most aren't) idempotent;
In theory, running the same hook once, twice, or any number of times should yield exactly the same outcome. Obviously, this doesn't happen in practice. Some hooks run echo 'something' >> /etc/conf, others fail on a second execution, etc.
they should be (but most aren't) reversible;
In theory the uninstall hook should completely undo what the install hook has previously done, thus running any sequence of install / uninstall hooks (as in i u i i u u u i u, or i u u u i u) should yield exactly the same outcome as not running any hook. Obviously this doesn't happen as some hooks run sed -e '...' -i /etc/conf which is hard to reverse.
they should be (but most aren't) reproducible;
In theory, if one starts from the same state (say a snapshot of the target file-system) and one runs the same sequence of package installs in separate sessions, each session outcome should be identical (in the cryptographic checksum sense of the word).
In practice this doesn't happen at all, if for no other reason, then because just the simple act of creating a file is non-reproducible as it has a different timestamp (which of course could be fixed, but it is not).
However, leaving the above aside, in practice many of these install hooks download stuff from the internet or create files whose contents are random (like sshd server keys).
they should be (but most aren't) deterministic;
This is a stronger case of the reproducibility property. Running these hooks should have a canonical defined order, so that no matter the arguments I give to the install tool, the same hooks are always run in the same order.
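To illustrate the idempotency and reversibility properties from the list above, here is a small Python sketch. The function names (append_line, ensure_line, install_dropin, remove_dropin) are hypothetical and not taken from any package manager; the first mimics the echo 'something' >> /etc/conf style of hook, the second is an idempotent variant, and the last two show why a drop-in file is trivially reversible.

```python
from pathlib import Path


def append_line(conf: Path, line: str) -> None:
    """Non-idempotent: mimics `echo 'something' >> /etc/conf`."""
    with conf.open("a") as stream:
        stream.write(line + "\n")


def ensure_line(conf: Path, line: str) -> None:
    """Idempotent: appends the line only if it is not already present."""
    existing = conf.read_text().splitlines() if conf.exists() else []
    if line not in existing:
        append_line(conf, line)


def install_dropin(conf_dir: Path, package: str, content: str) -> Path:
    """Idempotent and reversible: the package owns exactly one file."""
    conf_dir.mkdir(parents=True, exist_ok=True)
    dropin = conf_dir / f"{package}.conf"
    dropin.write_text(content)
    return dropin


def remove_dropin(conf_dir: Path, package: str) -> None:
    """Undo `install_dropin`; safe to call any number of times."""
    (conf_dir / f"{package}.conf").unlink(missing_ok=True)
```

Running append_line twice leaves two copies of the line behind, while ensure_line leaves one regardless of how many times it runs. Even so, ensure_line is still hard to reverse (who else depends on that line?), whereas any sequence of install_dropin and remove_dropin calls that ends with a removal leaves no trace of the package.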

So, to wrap things up, the mere existence of such package hooks has the following consequences for an installation:

an installation is unreliable;
an installation is not reproducible;
an installation is not deterministic (in the strong sense);
(we don't even care about idempotency or reversibility at the moment;)

Here are a few links about the kinds of hooks in various distributions:
Debian documentation
Fedora documentation
ArchLinux documentation
RPM manual

Thus, getting back to our immutable OSs, because all of them are the outcome of a classical installation process that relies on hooks, they are as a consequence: unreliable, unreproducible, nondeterministic...

That can't be so, right?

Indeed, it isn't so for the simple reason of release engineering. Namely, when releasing a new snapshot of an immutable OS one doesn't just blindly run the install and snapshot that; instead one runs a battery of tests to see if the outcome is sane, and if all "tastes and smells right" it is then released.

So, how can we fix this?

(And as a direct consequence, not only fix the immutable OS release quality, but also improve the quality of container-based deployments, and even classical package based distributions, which could benefit greatly from all of this.)

First of all, let's see what use-cases these hooks serve, and see if we can't provide a better way to handle these.

To my knowledge (based on my experience) these hooks solve the following broad functions:

create OS-managed entities like users, groups and their membership, fstab entries, etc.; this has already been nicely solved by systemd by simply dropping special files in /etc (for example sysusers.d entries or mount units), and recreating these file-backed OS databases after the install; (even better, we could change the OS to directly support these drop-in files out of the box;)
manipulate OS-level services (especially enabling); this has again been nicely solved by systemd with its unit files, by just dropping these files in /etc;
override or extend some configurations; many tools have added support for conf.d folders, thus drop-in files are again the solution;
create symlinks or hardlinks to installation files; (for example, this is something that Alpine's busybox package needs to do, else the system is unusable;) this use-case is already served by having the symlinks as part of the package;
create executable or library aliases for packages that provide alternatives to other similar packages; (as in the case of Alpine, where some tools can be provided by busybox, while a more complete variant could be provided by util-linux, etc.;) this could be solved by providing optional prioritized overlays (as in files bundled together to be copied into the target file-system), and during install checking whether any lower-priority overlay conflicts with any higher-priority one, and if so warning or not applying it; (alternatively hoist these overlays as first-class packages;)
create installation-unique state files, like for example initializing sshd keys, initializing mysql or postgresql databases, etc.; this could be handled by the service startup scripts themselves (and most are doing just this);
am I missing some other use-case?
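The prioritized overlays idea from the list above could look roughly like the following Python sketch. Nothing here exists in any package manager I know of: Overlay, resolve_overlays, and the numeric priority (larger wins) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlay:
    """Hypothetical bundle of target paths shipped by a package."""
    package: str
    priority: int        # assumption: the larger value wins
    paths: frozenset     # target paths this overlay would provide


def resolve_overlays(overlays):
    """Decide which overlay provides each path.

    Returns (owners, conflicts) where `owners` maps each path to the winning
    package, and `conflicts` lists (path, winner, loser) triples to warn about.
    Overlays are sorted before being applied, so the outcome does not depend
    on the order in which they were given; ties are broken by package name.
    """
    owners = {}
    conflicts = []
    for overlay in sorted(overlays, key=lambda o: (-o.priority, o.package)):
        for path in sorted(overlay.paths):
            winner = owners.get(path)
            if winner is None:
                owners[path] = overlay.package
            else:
                conflicts.append((path, winner, overlay.package))
    return owners, conflicts
```

With a busybox overlay at a low priority and a util-linux overlay at a higher one, the path they both provide is assigned to util-linux and reported as a conflict, while the paths only busybox provides remain with busybox. No script has to run for any of this; it is a pure function of the package metadata.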

Thus, by removing the concept of package hooks, and improving package conflict detection (conflicts shouldn't really happen, because that is why our package managers are full-blown SAT solvers capable of solving Sudoku puzzles), we can truly have immutable OSs.

Not only immutable in the sense of "we always boot from the same immutable snapshot", but also immutable in the stronger sense of "regardless of how many times we try to create that immutable snapshot, we always deterministically get the same outcome".
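One way to check this stronger property would be to compute a cryptographic digest over the whole snapshot tree and compare it across independent builds. The following Python sketch is only an assumption of how such a check might look (tree_digest is a hypothetical helper, not part of OSTree or any other tool); it deliberately ignores timestamps and ownership, and hashes only relative paths, file kinds, permission bits and contents.

```python
import hashlib
import os
import stat
from pathlib import Path


def tree_digest(root: Path) -> str:
    """SHA-256 over the tree under `root`, independent of timestamps."""
    digest = hashlib.sha256()
    for current, dirs, files in os.walk(root):
        dirs.sort()  # sorting in place makes os.walk descend in a fixed order
        for name in sorted(dirs + files):
            path = Path(current) / name
            info = path.lstat()
            if stat.S_ISLNK(info.st_mode):
                kind, payload = b"l", os.fsencode(os.readlink(path))
            elif stat.S_ISDIR(info.st_mode):
                kind, payload = b"d", b""
            elif stat.S_ISREG(info.st_mode):
                kind, payload = b"f", path.read_bytes()
            else:
                kind, payload = b"o", b""  # devices, sockets, etc.
            # Symlink permission bits are platform-dependent, so skip them.
            bits = 0 if kind == b"l" else stat.S_IMODE(info.st_mode)
            mode = format(bits, "o").encode()
            relative = os.fsencode(path.relative_to(root).as_posix())
            # Length-prefix every field so that records cannot be confused.
            for part in (kind, mode, relative, payload):
                digest.update(len(part).to_bytes(8, "big"))
                digest.update(part)
    return digest.hexdigest()
```

Two snapshots built in separate sessions from the same packages should then yield the same digest; if they don't, something in the install process (most likely a hook) was not deterministic.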

What prompted me to write this article?

I've been experimenting with OSTree and various distributions (Alpine and OpenSUSE) to see how easy it is to create (as a non-root user) extremely lightweight containers to be run with bwrap. And obviously, because I do want to have deterministic outcomes, I've opted out of executing package hooks...

As a consequence I've failed with Alpine, because without running busybox's post install hook I get a completely broken system, because many basic UNIX tools (like basename and such) are only installed as symlinks to busybox after running the post install hook. (Obviously these symlinks could have been part of the package itself, or perhaps an additional package, but unfortunately this is not the case...)

Thus, I couldn't manage to have Erlang or Elixir running in such a container, because their bootstrap scripts are actually sh scripts with a lot of magic in them... But this is a topic for the next article...

What others have to say on the subject:
Introduction to immutable Linux systems
Linux distributions: Can we do without hooks and triggers?
Immutable -> reprovisionable, anti-hysteresis