In Go-land you pay even for what you don't use -- Volution Notes

Prologue

Lately I've been programming in both Go and Rust -- among other languages like Python, Erlang, or Bash (and, yes, Bash is a programming language) -- so those acquainted with Rust will notice that the title is a nod to one of Rust's major selling points, zero-cost abstractions: you don't pay for what you don't actually use (right now, though you might use it in the future).

However, although the issue I'm about to describe is present to some extent in any programming language -- from Bash (again, yes, Bash is a programming language, although a very bad one), to Java, Python, Ruby, NodeJS, and even Rust -- unfortunately in Go it is made even worse by not having a suitable workaround.

The issue might seem minor, and usually it is; I only stumbled upon it lately, when I focused on the start-up performance of one of my tools. In fact the first time I tackled it was with the overhead introduced by loading some Python modules I didn't always use.

But, with Python I expected this to be an issue, and I knew immediately where to look. With Go less so, and I've done quite a bit of digging to find the problem (but unfortunately no usable solution)...

What are "initializers"?

Starting with Go, package initializers are nothing more than anonymous functions, tagged init, without any arguments, that are executed (in quite a specific order) before the main function of an executable. (I say "anonymous" and "tagged" because inside the same file you can have many such initializers, and they can't be called explicitly from anywhere.) Obviously these are meant for libraries and not executables (which can just call that code at the beginning of main).

What can you do within these initializers? Basically anything you can in a normal function, just that it executes before main. For example a library could initialize some private (but global) variables with something that requires a bit of code to figure out, like configuration files, runtime parameters, credentials, or loading previously computed and then cached data, etc.

Needless to say, the unwritten (and unenforced) rule of these initializers is that they should be short and not stall the start-up needlessly... (Especially since Go has lightweight goroutines that could do the initialization in the background if more effort is required...)

For example:

package somePackage

var someGlobalA someTypeA
var someGlobalB someTypeB

func init () {
    // do some non-trivial, but quick, computation...
    someGlobalA = ...
    someGlobalB = ...
}
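
To make the "many initializers in the same file" remark concrete, here is a minimal runnable sketch. The order variable and the labels are made up for the illustration; the only thing taken from the language itself is that init may be declared several times and that all of them run, in source order, before main.

```go
package main

import "fmt"

// order records which initializers ran, in the order they ran.
// (The variable and the labels are hypothetical, for this sketch only.)
var order []string

// A file may declare init several times; none of them can be called by name.
func init() { order = append(order, "first init") }

func init() { order = append(order, "second init") }

func main() {
	// By the time main starts, both initializers have already run.
	order = append(order, "main")
	fmt.Println(order)
}
```

Running it prints the two init labels followed by main, which is all the "specific order" amounts to within a single file.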

However, this feature is not specific to Go, and as mentioned all programming languages have some sort of package / module / library initializers:

Java has class static initializers, again with a rigid execution order;
Python, Ruby, NodeJS, and perhaps most similar interpreted languages, just allow arbitrary code to be executed at the top-level of a module's body; for example as described here for Python, this code will be executed in the same order the modules are being loaded (first dependencies then dependents, and so on recursively);
even Erlang, a functional programming language at heart, has something similar as part of OTP (Erlang's runtime platform);
and I'm certain that by now -- in 2 AP, as in second "anno pandemus", or for those reading in the past the year 2021 AD in the future -- C++ already has found a way to provide weapons of mass feet shooting for its developers;
(I would bet the same goes for C# and the entire .NET ecosystem;)

Regarding the other camp:

Rust, by design, makes sure that there is no life before main;
(from what I know) C (the language) also doesn't provide a way to have non-constant static initializers;
moreover, many C-based libraries (and some well-written C++ ones) provide the user with a somelib_init function that must be called before any other calls to their functions;
although GCC (and perhaps many other C compilers and runtimes) does provide a way to have initialization functions (or constructors as they are called); however their main use-case seems to be supporting languages that do require them;
(I would also bet, but not a lot, that more functional languages like Haskell, and certainly Scheme implementations of R7RS, are strict about this;)

But wait, (as the commercial says), there is more...

Most of the languages listed above not only give you some means to execute arbitrary code before main (a feature that is more on the advanced developer's side), but also let the developer initialize global variables with arbitrary expressions.

Again, in Go, for example:

package somePackage

var someGlobalC someTypeC = newSomeC ()

func newSomeC () someTypeC {
    // do some non-trivial, but quick, computation...
    return ...
}

Thus, what initially might seem like a harmless variable initialization could in fact be a call to a slow grinding wheel computing the answer to the ultimate question (to which we all know the answer is usually 43 due to an off-by-one bug)...
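
A small sketch of how such a variable initializer behaves at run time. The trace slice is a hypothetical helper, and an int stands in for someTypeC; the point is only that newSomeC is called before any init function and before main.

```go
package main

import "fmt"

// trace records the order of events (a hypothetical helper for this sketch).
var trace []string

// someGlobalC looks like a plain variable, but its initializer is a function
// call, so newSomeC runs before init and before main.
var someGlobalC = newSomeC()

// newSomeC stands in for the "non-trivial, but quick, computation".
func newSomeC() int {
	trace = append(trace, "newSomeC")
	return 42
}

func init() { trace = append(trace, "init") }

func main() {
	trace = append(trace, "main")
	fmt.Println(trace, someGlobalC)
}
```

Nothing at the declaration site hints at the cost: one has to read newSomeC to find out how much work happens before main.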

A slight detour about "finalizers"

Although many languages provide "initializers", few provide "finalizers", as in code that is assured to be run before a program finishes.

The POSIX standard provides atexit(3), GCC also seems to provide a way to register "destructors", and some languages even expose wrappers for these, but none seems to provide an actual facility tailored for this specific use-case... Granted, some languages do provide hooks that trigger when a "scope" finishes -- for example we have defer in Go, and destructors in Rust and C++ -- however these are meant for another use-case.

Strange... Perhaps that's why many servers don't provide a clean way to be shut down, and instead one has to clobber them with a definitive SIGKILL after asking nicely in vain with SIGTERM, SIGINT and other SIG-perhaps-this-works signals...

Also, as a funny irony, for one application in Rust I encountered a situation where calling the finalizers (i.e. destructors) took quite some time compared to the total execution time, and I had to resort to forcibly exiting the process without waiting for Rust to properly finalize.
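
To illustrate why defer in Go is a scope hook rather than a program finalizer, here is a minimal sketch of a common workaround; the run function and the cleaned flag are hypothetical names. Since os.Exit does not run deferred functions, the real work is moved into a function whose defers complete before the process exits.

```go
package main

import (
	"fmt"
	"os"
)

// cleaned is set by the deferred clean-up (hypothetical state for this sketch).
var cleaned bool

// run holds the real program logic and returns the exit code.
// Deferred calls registered here run when run returns.
func run() int {
	defer func() { cleaned = true }()
	// ... the actual work would go here ...
	return 0
}

func main() {
	code := run()
	fmt.Println("clean-up ran:", cleaned)
	// os.Exit terminates immediately and does not run deferred functions,
	// so it is called only after run (and its defers) have finished.
	os.Exit(code)
}
```

This only covers orderly returns; it does nothing for signals or for code that calls os.Exit elsewhere, which is part of why it is not a true finalizer.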

So what's the issue?

Seeing that almost every programming language under the sun provides a way to sneak some code to be executed before main, one would ask why I am complaining about this, and especially why I have something against how it's implemented in Go?

Well, for starters, Go is not the main culprit. Go doesn't force anyone to use the initializers feature, and neither does any other language for that matter.

Moreover, there is the question of utility: if I import a library doesn't it mean I also intend to use it? After all Go does an outstanding job of making sure (through compilation errors) that there are no unused imports.

And here is where the argument (and especially Go's implementation) starts to show its weakness...

Imagine for a moment we are building a "do everything within a single statically linked binary" tool.

A side-note about single-binary do-all tools

Why would one want to do this? (That is bundle a set of slightly related tools inside the same executable.)

Because it's one of the places where Go really shines as a systems programming language, especially today with so many operating systems, let alone Linux and BSD distributions, flavors and architectures. Given how easily I can build an OSX, BSD or even Windows executable from my Linux machine, even when I use an Intel processor and target Apple's new M1 ARM processor, and given that I don't want to dance the rpm or deb polka (not to speak of the Windows MSI or OSX DMG baroque dance), I would prefer to just throw one binary for each supported platform on GitHub's release page and be done with it.

And it seems this isn't a new or isolated trend, as lots of projects (many of them written in Go) ship this way:

starting with (the C-based) BusyBox (that I believe powers at least 75% of all world-wide home and small-business networking appliances), and its BSD-licensed alternative Toybox;
Cloudflare's cloudflared or flarectl;
Netlify's former netlifyctl;
open source projects like Hugo or Caddy;
(and I could keep on naming similar tools, written from Python to Erlang;)
(heck, come to think of it, this is exactly what the Flatpak and AppImage projects are selling;)

What do all of these have in common? One (usually statically linked, or self-contained / self-extracting) binary that can be thrown anywhere on the file-system and, as long as it's in the $PATH, can easily be used.

What else do they have in common? Lots of slightly related functionality, which I bet is full of initializers...

Getting back to our initializers...

So in this case it seems that even though one imports lots of different libraries, they aren't used all at the same time; at best one or two major libraries are used to implement a given sub-command.

However, given how initializers work in Go (each initializer of every imported package runs before main regardless), all these "small but many" initializers add up and slow down start-up.
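
A single-file simulation of that situation, as a sketch only: tableA and tableB are hypothetical stand-ins for the global state of two unrelated dependencies, and build stands in for whatever work their initializers do. In a real tool they would live in separate imported packages, but the effect is the same.

```go
package main

import (
	"fmt"
	"os"
)

// built lists which tables were constructed (hypothetical bookkeeping).
var built []string

// Stand-ins for the global state of two unrelated dependencies.
var (
	tableA = build("A")
	tableB = build("B")
)

// build simulates the work a dependency does in its initializer.
func build(name string) map[string]int {
	built = append(built, name)
	return map[string]int{name: len(name)}
}

// dispatch picks the sub-command; each one touches only its own table.
func dispatch(cmd string) int {
	switch cmd {
	case "a":
		return tableA["A"]
	case "b":
		return tableB["B"]
	}
	return -1
}

func main() {
	cmd := "a"
	if len(os.Args) > 1 {
		cmd = os.Args[1]
	}
	fmt.Println("result:", dispatch(cmd), "built before main:", built)
}
```

Whichever sub-command is chosen, both tables have already been built by the time main runs.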

So you say this is a "make-believe problem"?

Well not quite... Let's for a second say that my argument about why I would like to bundle multiple functionality in the same binary is moot; let's say I pay the cost of using a good tool for the wrong job. Let's...

Let us imagine one wants to use Go's built-in net/rpc package, which allows one to easily implement RPC clients and servers over HTTP, plain TCP or UNIX domain sockets. It seems that there are some "debugging" facilities that export some HTML for the available services -- I don't use them, I have no idea how they are used, and they can't be disabled -- and as a consequence its source code, at line 39 in debug.go, contains the following code, which unfortunately, based on my experiments, takes at least 0.5 milliseconds to initialize:

var debug = template.Must(template.New("RPC debug").Parse(debugText))

So to summarize:

Do I use it? No.
Can I disable it? No.
Am I paying for what I'm not actually using (now or in the future)? Yes.

A similar situation is encountered with the encoding/gob package, a dependency of net/rpc, whose initialization does quite a bit of work related to reflection as seen through the source code file types.go.

In fact, for the project I was working on, even if I just call os.Exit(0) first thing in main it still takes around 3 milliseconds to start, as opposed to around 0.5 milliseconds for a truly empty main.

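One might say that these 3 milliseconds are a drop in the ocean, but as the ocean is made of many drops, so do overheads add up when a tool is called enough times in a tight loop... (For example, at roughly 2.5 extra milliseconds per start, 10,000 invocations add up to about 25 seconds.)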

Fortunately there are some upsides...

On the plus side, especially for server applications, given that all initialization happens before main is called, dependencies have an opportunity to check that the environment they are being executed in is sane. For example a library could check that there is a $HOME or $TMPDIR, check if there is enough disk or memory, etc.

Thus, one would get an early failure if something doesn't look right, as opposed to the other approach (in case of lazy initialization) where some code could be executing for days until it calls some obscure library that only then discovers that the planets are not properly aligned...

What can we do about this (now)?

In the hope that I've convinced you about the importance of selective initialization of dependencies, I would like to discuss what we can do about it now and in the future.

In Go? Nothing right now. As said previously, initializers are forcibly and automatically executed no matter what.

In other programming languages? Things differ...

for example in Python one can choose to import somelib only inside the function that actually uses it; (or, and I've done this, one can write some code based on __import__ that loads a module only when it's first used, something akin to what Java does;)
the same I assume can be said for most other scripting languages like Ruby or NodeJS;
for Java it's even better, as according to the JVM specification these initializers are executed only when one "interacts" with a class (e.g. creates an instance, reads or writes a static field, etc.);
in Erlang, although the language doesn't provide support for initializers, the OTP (Erlang's runtime platform) gives the user complete control over which "application" to initialize, and in what order; (this is done either through "application" configuration files, or at a lower level through "boot scripts";)

However, there is some hope for Go, as it seems that in the soon-to-be released Go 1.16, one can easily assess the initialization overhead through the export GODEBUG=inittrace=1 feature.
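
As a sketch of how that trace could be summarized, the following program sums the per-package wall-clock time. It assumes the line format described in the runtime documentation ("init PKG @T ms, T ms clock, N bytes, N allocs"), which is explicitly subject to change; the sample lines and package names are invented for the illustration and are not measurements.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// totalClock sums the "ms clock" field over all lines that look like
// inittrace output, and reports how many lines matched.
func totalClock(lines []string) (float64, int) {
	// Assumed line format (subject to change between Go releases).
	// Compiled here rather than in a package-level variable, so that this
	// sketch does not add a start-up initializer of its own.
	re := regexp.MustCompile(`^init (\S+) @\S+ ms, (\S+) ms clock, `)
	total, n := 0.0, 0
	for _, line := range lines {
		m := re.FindStringSubmatch(line)
		if m == nil {
			continue // not an inittrace line
		}
		ms, err := strconv.ParseFloat(m[2], 64)
		if err != nil {
			continue
		}
		total += ms
		n++
	}
	return total, n
}

func main() {
	// Invented sample lines, for illustration only.
	sample := []string{
		"init example/a @0.10 ms, 0.50 ms clock, 1024 bytes, 10 allocs",
		"init example/b @0.70 ms, 0.25 ms clock, 2048 bytes, 20 allocs",
		"some unrelated output",
	}
	total, n := totalClock(sample)
	fmt.Printf("%d packages, %.2f ms in initializers\n", n, total)
}
```

In practice the lines would come from the standard error of a program started with GODEBUG=inittrace=1, rather than from a hard-coded slice.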

In fact, based on that idea, I've taken a look at the runtime package source code, namely the func doInit(t *initTask) function from proc.go; and poking around I was quickly able to support on-demand initialization of packages. However this is non-standard, meaning my code would fail to build with a normal Go compiler, and is thus only an academic endeavor. (Although I'm seriously thinking of using it at least for the binaries I'll provide myself, with fallback code for others that don't use my custom Go runtime source code.)

What could we do better (in the future)?

Preferably the Go developers would find a way (even a cumbersome one for when it's actually needed) to allow the developer to control the initialization of various packages (or modules given how Go groups code these days).

Failing that, and this can be applied to any language, perhaps library developers should rely less on global automatic initializers, and either (preferably) initialize the required global state on first use (similar to what Rust's lazy_static library does), or provide the required hooks to call the initialization explicitly when the user actually needs the library.
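
In Go, initialization on first use can be sketched with sync.Once from the standard library. The page function, the template text and the parses counter below are hypothetical; this is one possible shape for a library author to adopt, not something net/rpc or any other package actually does.

```go
package main

import (
	"fmt"
	"html/template"
	"sync"
)

var (
	pageOnce sync.Once
	pageTmpl *template.Template
	parses   int // counts how often the template was parsed
)

// page returns the template, parsing it on the first call only.
func page() *template.Template {
	pageOnce.Do(func() {
		parses++
		pageTmpl = template.Must(template.New("page").Parse(`<p>{{.}}</p>`))
	})
	return pageTmpl
}

func main() {
	fmt.Println("parses before first use:", parses)
	page()
	page()
	fmt.Println("parses after two uses:", parses)
}
```

A program that never calls page never pays for the parse, and one that calls it many times pays only once; the trade-off is that a broken template is discovered at first use rather than at start-up.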

In fact, and if it hasn't bothered us so far maybe we should reflect more deeply on our own programming style, why do we actually need global state at all... Why can't we just keep everything we need in the values (be they "objects", "structures", "maps", etc.) we return to the caller? The only good reason would be a cached immutable "database" or similar, which could easily be initialized on first use...
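
A minimal sketch of that style, with a hypothetical Greeter type invented for the example: all state lives in the value handed back to the caller, so nothing needs to happen before main and nothing is paid for unless the constructor is actually called.

```go
package main

import (
	"fmt"
	"strings"
)

// Greeter carries its own state instead of relying on package-level globals.
// (The type and its field are hypothetical, made up for this sketch.)
type Greeter struct {
	prefix string
}

// NewGreeter does the set-up work only when a caller asks for it.
func NewGreeter(prefix string) *Greeter {
	return &Greeter{prefix: strings.TrimSpace(prefix)}
}

// Greet uses only the state stored in the value itself.
func (g *Greeter) Greet(name string) string {
	return g.prefix + ", " + name + "!"
}

func main() {
	g := NewGreeter(" Hello ")
	fmt.Println(g.Greet("gopher"))
}
```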

Closing thoughts

In the end I don't know what to say... I really like Go as a replacement for scripting languages, especially for system related tasks, and as a suitable language for those small utilities that should be easily portable from Linux to other POSIX systems.

I especially like how fast it builds, the fact that it can output statically linked binaries, and that cross-platform builds are a breeze. (I can't say any of these about Rust, for example.)

However, this was quite a disappointment for me... Especially after using the Go 1.16 beta and seeing how many built-in packages need to initialize their global state and what the overhead cost is... Frankly I was a little bit more optimistic about the built-in packages...

Next time I start a project, I think I'll lean more towards Rust than Go for non-throw-away projects.