06 Init and Services
The Problem
The generation kernel is running (05 Loading the Real OS). /persist is about to be unlocked. But a kernel by itself does nothing useful -- it manages hardware and schedules processes, but the only process it ever starts on its own is PID 1.
PID 1 is the first process the kernel starts. Everything else in userspace is a descendant of PID 1. If PID 1 dies, the kernel panics. PID 1's job is to start and supervise all the services that make the system useful: the maintainer agent, WireGuard, the key service, the reconciler.
This is what an init system does. It's the conductor of the orchestra -- it decides what starts, in what order, with what dependencies, and what happens when something crashes.
What an Init System Does
An init system has four core responsibilities:
- Service startup: Start services in the right order, respecting dependencies (WireGuard must start before the gossip mesh, the gossip mesh must start before cert renewal)
- Dependency management: Express "service A needs service B running first" and resolve the graph
- Service supervision: If a service crashes, restart it. Keep it running.
- Readiness signaling: Know when a service is actually ready to serve (not just when its process started, but when it's done initializing)
The difference between "started" and "ready" matters. A web server's process might start in 10ms, but it's not ready until it's bound to a port and accepting connections. If another service connects before the server is ready, it gets a "connection refused." Readiness signaling solves this.
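A self-contained sketch of that race, using a marker file as a stand-in for a real readiness signal (a bound socket, an sd_notify message, an s6 notification fd); the path and timings are invented for the example:

```shell
#!/bin/sh
# "Server": its process starts instantly, but it only becomes
# ready after some initialization delay.
READY_MARKER=/tmp/myserver.ready
rm -f "$READY_MARKER"
( sleep 1; touch "$READY_MARKER" ) &

# "Client": started == the process exists; ready == the marker is
# present. Polling with a timeout instead of connecting blindly
# avoids the "connection refused" race.
tries=0
while [ ! -f "$READY_MARKER" ]; do
    tries=$((tries + 1))
    [ "$tries" -ge 100 ] && { echo "timed out"; exit 1; }
    sleep 0.1
done
echo "server ready after ~${tries} polls"
wait
```

Readiness signaling replaces the polling loop: the server announces readiness once, and the init system holds dependents back until then.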
How Others Do It
systemd: The Kitchen Sink
systemd is the init system on most mainstream Linux distributions (Ubuntu, Fedora, Arch, Debian). It manages services, mounts filesystems, handles logging, configures the network, runs timers, manages containers, handles user sessions, and much more.
Service definitions are declarative "unit files":
[Unit]
Description=My Web Server
After=network.target
Requires=database.service
[Service]
ExecStart=/usr/bin/myserver
Restart=on-failure
[Install]
WantedBy=multi-user.target
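Assuming the unit above is installed as /etc/systemd/system/myserver.service, day-to-day management goes through systemctl (a sketch; requires a systemd host):

```shell
systemctl daemon-reload           # re-read unit files after editing
systemctl enable --now myserver   # start immediately and on every boot
systemctl status myserver         # state, main PID, recent log lines
journalctl -u myserver -f         # follow the service's log output
```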
Strengths: Massive ecosystem, well-documented, handles everything. Parallel startup. Socket activation (services start on first connection). Weaknesses: Enormous codebase (~1.4M lines). Complex. Many features that an embedded/immutable OS doesn't need. Requires glibc (no musl). The binary is large.
runit: Simple Supervision
runit is a minimalist init system. Services are directories containing a run
script. runit starts all services in parallel, supervises them, and restarts
them on crash. No dependency management beyond "start everything."
# /etc/sv/myserver/run
#!/bin/sh
exec /usr/bin/myserver
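A service directory becomes live when linked into runsvdir's scan directory, and supervision is then driven with sv. A sketch, assuming the common /var/service scan directory (the path varies by distro):

```shell
ln -s /etc/sv/myserver /var/service/   # runsvdir notices it within ~5s
sv status myserver    # e.g. "run: myserver: (pid 1234) 42s"
sv restart myserver   # supervised restart
sv down myserver      # stop, and do not restart on exit
```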
Strengths: Tiny (~3000 lines of C), simple, reliable. Easy to understand. Weaknesses: No dependency ordering (all services start simultaneously). No readiness signaling (you hope services start fast enough). No declarative dependencies.
OpenRC: Dependency-Aware Scripts
OpenRC (used by Gentoo, Alpine Linux) is a dependency-based init system using shell scripts. Services declare dependencies and OpenRC starts them in order.
# /etc/init.d/myserver
#!/sbin/openrc-run

depend() {
    need net database
    after firewall
}

start() {
    start-stop-daemon --start --exec /usr/bin/myserver
}
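With the script installed as /etc/init.d/myserver on an OpenRC host, runlevels and dependency resolution look like this (sketch):

```shell
rc-update add myserver default   # start in the "default" runlevel
rc-service myserver start        # OpenRC starts `need`ed services first
rc-status                        # per-runlevel view of service states
```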
Strengths: Shell-based (easy to understand and debug), proper dependency management, lightweight. Weaknesses: Sequential start within dependency levels (slower than parallel). Shell scripts are fragile for complex service management. Readiness signaling is limited.
s6: Minimal Supervision + s6-rc: Dependency Layer
s6 is a process supervisor (like runit but with more features). s6-rc adds a dependency management layer on top. They're separate tools that compose.
s6 handles:
- Process supervision (restart on crash)
- Readiness notification (notification-fd protocol)
- Clean shutdown sequencing
- Logging (s6-log)
s6-rc adds:
- Dependency graph (service A depends on B and C)
- Oneshots (run once at boot, like mounting filesystems)
- Bundles (groups of services)
- Atomic service transitions
Strengths: Small, auditable codebase (~5,000 lines total across s6 + s6-rc), proper dependency ordering, real readiness signaling, works with musl libc. Weaknesses: Less documentation than systemd. Learning curve (two-layer model). Fewer tutorials and community resources. Used primarily in embedded systems and container images rather than desktop distributions (Obarun Linux and the 66 init system are the notable distro-level adopters).
The Tradeoffs
| Feature | systemd | runit | OpenRC | s6 + s6-rc |
|---|---|---|---|---|
| Dependency ordering | Yes | No | Yes | Yes |
| Readiness signaling | Yes (sd_notify) | No | Limited | Yes (notification-fd) |
| Parallel startup | Yes | Yes (no ordering) | Partial | Yes |
| Codebase size | ~1.4M lines | ~3,000 lines | ~30,000 lines | ~5,000 lines |
| musl compatible | No (requires glibc) | Yes | Yes | Yes |
| Complexity | High | Low | Medium | Medium |
| Community/docs | Massive | Small | Medium | Small |
To understand the deciding factors, start from FortrOS's image philosophy:
Why Buildroot: FortrOS images are immutable -- rebuilt from scratch for every update, not patched. This requires a build system that compiles everything from source with full control over every binary in the image. Three tools do this: Buildroot, Yocto, and Nix. Buildroot is the simplest (single Makefile, one defconfig, cross-compiles the entire image). Yocto is more powerful but significantly more complex (layer system, bitbake recipes, steeper learning curve). Nix is the most reproducible but heaviest (requires the Nix store, different package model). For a minimal embedded-style image where simplicity and auditability matter, Buildroot is the natural fit.
Why musl: Buildroot supports both glibc and musl as the C library. FortrOS chooses musl because it's smaller (~100K lines vs glibc's ~1.5M lines), simpler (easier to audit), and produces smaller binaries. In an immutable OS where you control every binary, a smaller C library means less code running with the highest privileges and less attack surface. Every process on the system links against the C library -- it's the most pervasive dependency.
The consequence for init: musl eliminates systemd (systemd requires glibc). This isn't the reason FortrOS chose musl -- musl was chosen for its own merits. But it narrows the init system options to s6, runit, or OpenRC.
From those three, s6 + s6-rc wins on:
- Small attack surface: PID 1 runs with root privileges. s6 + s6-rc's ~5,000 lines vs systemd's ~1.4M is the difference between "one person can audit this" and "nobody has read all of it."
- Readiness signaling: Services need to know when dependencies are truly ready (not just started). s6's notification-fd is real readiness. runit has nothing.
- Proper dependency ordering: Services have real dependencies that must be respected. runit starts everything simultaneously with no ordering.
How FortrOS Does It
FortrOS uses s6 for process supervision and s6-rc for dependency management and boot sequencing.
The s6-rc Service Tree
Services are defined as directories in the Buildroot overlay:
buildroot/overlay/etc/s6-rc/source/
    default/                  # The bundle started at boot
        contents              # Lists all services to start
    persist-mount/            # Oneshot: unlock and mount /persist
        type                  # "oneshot"
        up                    # Script to run
    wireguard/                # Oneshot: bring up WG interface
        type
        up
        dependencies          # "persist-mount" (needs keys from /persist)
    fortros-maintainer/       # Longrun: the gossip/CRDT agent
        type                  # "longrun"
        run                   # The supervised process
        notification-fd       # "3" (readiness signaling)
        dependencies          # "wireguard"
    fortros-key-service/      # Longrun: HKDF key derivation
        type
        run
        notification-fd
        dependencies          # "persist-mount"
    fortros-reconciler/       # Longrun: workload lifecycle
        type
        run
        notification-fd
        dependencies          # "fortros-maintainer fortros-key-service"
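As a concrete example, the maintainer's run script might look like the following sketch. The daemon's flag for choosing the readiness fd is an assumption -- the hard requirements are only that the script execs into the daemon (so s6 supervises the daemon itself, not a shell wrapper) and that the daemon writes a newline to the fd named in notification-fd:

```shell
#!/bin/sh
# fortros-maintainer/run (sketch)
# exec, not a plain invocation: the daemon must replace the shell so
# it becomes the process s6 supervises and restarts on crash.
# --ready-fd 3 is hypothetical; it must match the "3" in notification-fd.
exec /usr/bin/fortros-maintainer --ready-fd 3 2>&1
```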
The dependency graph resolves to a boot order:
persist-mount (unlock /persist)
  -> wireguard (needs WG key from /persist)
       -> fortros-maintainer (needs WG overlay)
            -> fortros-reconciler (needs maintainer IPC)
  -> fortros-key-service (needs master key from /persist)
       -> fortros-reconciler (needs key derivation)
Readiness Signaling
s6 uses notification-fd for readiness: a service writes a newline to file
descriptor N (configured in the notification-fd file) when it's truly ready
to serve. s6-rc waits for this signal before starting dependent services.
For the maintainer, "ready" means: WireGuard is up, gossip mesh has been joined, initial TreeSync pull from the provisioner is complete. Only then does it signal readiness, and only then does s6-rc start the reconciler.
This prevents the common bug where a service starts, its dependents start immediately, and the dependents fail because the first service isn't actually ready yet.
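From the service's side, the protocol is just "write one newline to the agreed fd when initialization is done." A runnable sketch -- in production s6 opens the notification fd before starting the service, so opening fd 3 ourselves here is purely to keep the example self-contained:

```shell
#!/bin/sh
# Daemon-side sketch of s6's notification-fd protocol (fd 3,
# matching the notification-fd file in the service directory).
exec 3>/tmp/notify.out   # SIMULATION ONLY -- under s6 the fd already exists

# Stand-in for real initialization: join the gossip mesh, finish
# the initial TreeSync pull, etc. Nothing is signaled before this.
sleep 1

echo >&3    # a single newline on the fd means "truly ready"
exec 3>&-   # then close it, per the protocol

echo "now serving; s6-rc may start dependents"
```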
Service Types
| Type | Behavior | FortrOS Example |
|---|---|---|
| Oneshot | Run a script once at boot | persist-mount, wireguard |
| Longrun | Supervised daemon (restarted on crash) | maintainer, key-service, reconciler |
| Bundle | Group of services (no process) | default (everything) |
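For concreteness, here is roughly how a source tree of these types becomes a running system. The commands are standard s6-rc; the scan-directory path is an assumption:

```shell
# Compile the source directories into the binary service database
# (s6-rc reads only this database at runtime, never the source tree).
s6-rc-compile /etc/s6-rc/compiled /etc/s6-rc/source

# At boot, once the s6 supervision tree (s6-svscan) is running:
s6-rc-init -c /etc/s6-rc/compiled /run/service   # /run/service: assumed scandir
s6-rc -u change default   # bring the "default" bundle up, in dependency order
```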
The Boot Watchdog
The boot watchdog is an s6-rc oneshot that depends on the maintainer. When the maintainer signals readiness, the watchdog runs and writes "ok" to the current generation's status file. This is how 05 Loading the Real OS's generation health marking works -- if the maintainer never becomes ready (bad generation, broken service), the independent timeout marks the generation as "failed" and reboots.
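The watchdog's `up` script can be tiny, because the dependency does the real work: s6-rc runs it only after fortros-maintainer signals readiness, so reaching the script at all means the node's core service came up. A hypothetical sketch -- the status-file path is an assumption (the real location under /persist is defined in 05 Loading the Real OS):

```shell
#!/bin/sh
# boot-watchdog/up (sketch): mark the running generation healthy.
GEN_STATUS="${GEN_STATUS:-/tmp/fortros-generation-status}"  # hypothetical path
echo "ok" > "$GEN_STATUS"
```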
Why Not systemd?
systemd requires glibc (the GNU C library). FortrOS uses musl libc -- a smaller, simpler C library that's easier to audit and produces smaller binaries. This is a hard constraint: systemd cannot be compiled against musl.
Beyond the library constraint, systemd's feature scope (login management, container runtime, network management, DNS resolver, NTP client, device management) duplicates things FortrOS handles through dedicated components (WireGuard for networking, maintainer for orchestration, reconciler for workloads). Each duplicated feature is code running as PID 1 that FortrOS doesn't need and can't easily audit. s6 + s6-rc provides exactly what's needed -- supervision, dependency ordering, readiness signaling -- with nothing extra.
Stage Boundary
What This Stage Produces
After s6-rc completes the default bundle:
- /persist is mounted (persist-mount oneshot)
- WireGuard interface is up (wireguard oneshot)
- Maintainer is running and ready (gossip mesh joined)
- Key service is running (ready for per-service key derivation)
- Reconciler is running (ready to manage workloads)
- Boot watchdog has marked the generation as "ok"
What Is Handed Off
The system is now a running FortrOS node with all core services. The next stages happen concurrently through these services:
- The maintainer handles 07 Overlay Networking (WireGuard mesh management)
- The maintainer handles 08 Cluster Formation (gossip, CRDTs, TreeSync)
- The reconciler handles 09 Running Workloads (containers, VMs)
What This Stage Does NOT Do
- It does not manage the WireGuard mesh topology (that's the maintainer's job, 07 Overlay Networking)
- It does not handle org state (that's gossip/CRDTs, 08 Cluster Formation)
- It does not start workloads (that's the reconciler, 09 Running Workloads)
- It does not handle upgrades (that's 10 Sustaining the Org)
Further Reading
Concepts:
- Init Systems Compared -- Detailed comparison of systemd, s6, runit, OpenRC
FortrOS implementation:
- 01-architecture.md -- Service model and layer architecture
- services/org-services.md -- How org services are structured