
Content-Addressed Storage

What It Is

Content-addressed storage identifies data by its cryptographic hash, not by a filename or path. A 1MB chunk of data is addressed by SHA-256(data) -- if you have the hash, you can verify the data is correct. If two files contain identical chunks, those chunks are stored once (deduplication for free).

FortrOS's storage layer builds on top of erasure coding: files are split into content-addressed chunks, each chunk is erasure-coded into shards, and the shards are distributed across the org. This page covers the chunk layer; the Erasure Coding page covers shard distribution.

Why It Matters

Traditional storage addresses data by location: "file X is at /path/to/file on server Y." If the server changes, the path changes. If you copy the file, the copy has a different address. There's no built-in way to verify the copy matches the original.

Content addressing flips this: the hash IS the address. The same data always produces the same hash, regardless of where it's stored or how many copies exist. Verification is built in -- re-hash the data and compare. This gives you deduplication, integrity verification, and location-independence in one mechanism.
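The hash-as-address idea can be sketched in a few lines of Python. This is a minimal illustration using the standard library's hashlib; the function names (`address`, `verify`) are illustrative, not FortrOS APIs.

```python
import hashlib

def address(data: bytes) -> str:
    """The address of a blob is the hex digest of its SHA-256 hash."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, expected_address: str) -> bool:
    """Verification is just re-hashing and comparing: no trusted server needed."""
    return address(data) == expected_address

blob = b"hello, content-addressed world"
addr = address(blob)
assert verify(blob, addr)             # the data matches its address
assert not verify(b"tampered", addr)  # any change produces a different hash
```

Because the address is derived from the content itself, any two holders of the same bytes compute the same address independently, which is what makes deduplication and location-independence fall out of the same mechanism.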

Git uses this model (commits, trees, and blobs are content-addressed). IPFS uses it. Docker image layers use it. FortrOS's shard storage uses it.

How It Works

Chunking

Files are split into fixed-size blocks (1MB). Each block is hashed independently:

file.dat (4.2 MB)
  -> chunk-0: SHA-256 = a1b2c3... (1 MB)
  -> chunk-1: SHA-256 = d4e5f6... (1 MB)
  -> chunk-2: SHA-256 = 789abc... (1 MB)
  -> chunk-3: SHA-256 = def012... (1 MB)
  -> chunk-4: SHA-256 = 345678... (0.2 MB)

The file is represented by an ordered list of chunk hashes. This list itself is hashed to produce the file's root hash.
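The chunking step above can be sketched as follows. This is a simplified model, assuming the root hash is the SHA-256 of the concatenated chunk-hash list; the helper names (`chunk_file`, `root_hash`) are illustrative.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MB fixed-size blocks, as described above

def chunk_file(data: bytes) -> list[str]:
    """Ordered list of SHA-256 hex digests, one per chunk."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def root_hash(chunk_hashes: list[str]) -> str:
    """The file's root hash: hash of the ordered chunk-hash list."""
    return hashlib.sha256("".join(chunk_hashes).encode()).hexdigest()

# 4.2 MB of data: four full 1 MB chunks plus a 0.2 MB tail
data = bytes(4 * CHUNK_SIZE + 200 * 1024)
hashes = chunk_file(data)
root = root_hash(hashes)
assert len(hashes) == 5
```

Note that order matters: the root hash is over the *ordered* list, so swapping two chunks produces a different file identity even though the chunk set is the same.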

Merkle DAG

The chunks form a Merkle DAG (Directed Acyclic Graph): the file root points to its chunk hashes, each chunk hash points to the data. For large files, intermediate tree nodes group chunks into subtrees, keeping the root hash compact regardless of file size.

This gives:

  • Integrity: Verify any chunk by re-hashing. Verify the whole file by checking the root hash.
  • Deduplication: Identical chunks across different files share storage. Two VMs with the same base image share all base chunks.
  • Partial transfer: If you have some chunks already (from a previous version), only fetch the missing ones.
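The integrity property in the list above can be demonstrated directly: any chunk verifies against its own hash, and the whole file verifies against the root without touching the chunk data again. A minimal sketch (hash helper `h` is illustrative):

```python
import hashlib

def h(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

chunks = [b"alpha", b"beta", b"gamma"]
chunk_hashes = [h(c) for c in chunks]
root = h("".join(chunk_hashes).encode())

# Integrity: each chunk verifies independently against its own hash...
received = b"beta"
assert h(received) == chunk_hashes[1]

# ...and the whole file verifies against the compact root hash.
assert h("".join(h(c) for c in chunks).encode()) == root
```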

Copy-on-Write (COW) History

When a file is edited, only the changed chunks are new. The updated file root points to mostly the same chunks as the previous version, plus the new ones. Old roots are retained as history.

Version 1 root: [chunk-A, chunk-B, chunk-C]
Version 2 root: [chunk-A, chunk-B', chunk-C]  (only B changed -> B')

Both versions share chunk-A and chunk-C. Only chunk-B' is new storage. Rolling back to version 1 is a pointer change to the old root -- no data copying, no restore operation. The Merkle DAG IS the version history.
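The version-sharing arithmetic above can be checked with sets. In this sketch the namespace is modeled as a plain dict mapping a file name to a root (a tuple of chunk hashes); these structures are illustrative, not FortrOS's actual metadata schema.

```python
import hashlib

def h(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

v1 = (h(b"A"), h(b"B"), h(b"C"))
v2 = (h(b"A"), h(b"B'"), h(b"C"))   # only B changed -> B'

shared = set(v1) & set(v2)
assert len(shared) == 2              # chunk-A and chunk-C are shared
assert len(set(v2) - set(v1)) == 1   # only B' is new storage

namespace = {"file.dat": v2}
namespace["file.dat"] = v1           # rollback: a pointer change, no copying
```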

Deletion and Reference Counting

"Deleting" a file means removing its root from the namespace -- the name is gone, but the chunks it pointed to may still be referenced by other files, other versions, or other services. A chunk stays in storage as long as at least one reference exists.

file-A root: [chunk-1, chunk-2, chunk-3]
file-B root: [chunk-2, chunk-4, chunk-5]

Delete file-A:
  chunk-1: no remaining references -> eligible for garbage collection
  chunk-2: still referenced by file-B -> stays
  chunk-3: no remaining references -> eligible for garbage collection

Garbage collection runs periodically: scan for chunks with zero references, reclaim their storage. This is the same model as git's garbage collection (unreachable objects are cleaned up, reachable ones stay). The reference count is maintained by the placement service as part of the file metadata.
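The reference-counting model can be sketched with the same file-A / file-B example. The dict-of-lists store and the `gc` helper are illustrative stand-ins for the placement service's metadata, not its actual implementation.

```python
from collections import Counter

files = {
    "file-A": ["chunk-1", "chunk-2", "chunk-3"],
    "file-B": ["chunk-2", "chunk-4", "chunk-5"],
}
store = {c for roots in files.values() for c in roots}

def gc(files: dict, store: set) -> set:
    """Keep only chunks with at least one remaining reference."""
    refs = Counter(c for roots in files.values() for c in roots)
    return {c for c in store if refs[c] > 0}

del files["file-A"]     # "delete" removes the name, not the chunks
store = gc(files, store)
assert store == {"chunk-2", "chunk-4", "chunk-5"}
```

chunk-1 and chunk-3 are reclaimed because nothing references them; chunk-2 survives because file-B still points at it, exactly as in the trace above.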

Service-Owned Encryption

Each service encrypts its data with its own key (derived from the key service) BEFORE handing chunks to the placement service. The placement service stores and retrieves opaque bytes -- it holds no keys, sees no plaintext.

Service encrypts chunk -> encrypted chunk hashed -> hash is the address
Placement service stores: {hash -> encrypted bytes}
Placement service knows: hash, size, location
Placement service CANNOT: decrypt, read content, determine file type

Even if all shards are collected and the placement DB is compromised, the data is useless without the owning service's key.
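The encrypt-then-hash flow can be sketched as below. The toy XOR keystream is a stand-in for a real cipher (since the Python standard library has no AES); it exists only to show that the ciphertext hash is the address and that the stored bytes are opaque without the key. All names here are illustrative.

```python
import hashlib

def keystream_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (stand-in for a real cipher like AES-GCM).
    XOR is its own inverse, so the same call decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, stream))

service_key = b"service-owned key (illustrative)"
chunk = b"secret service data"

ciphertext = keystream_encrypt(service_key, chunk)
addr = hashlib.sha256(ciphertext).hexdigest()  # hash of ENCRYPTED bytes

placement_store = {addr: ciphertext}           # opaque bytes, no key held
assert keystream_encrypt(service_key, placement_store[addr]) == chunk
```

Because hashing happens after encryption, the placement service can still verify integrity (re-hash the ciphertext) without ever being able to read the plaintext.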

How FortrOS Uses It

All org data flows through this layer:

  • Boot images (kernel + initramfs) are stored as content-addressed chunks
  • VM disk images (base images for client VMs) are chunked and deduplicated
  • WAL entries (Client Profiles and Roaming) are stored as chunks
  • Service data (database backups, config archives) is chunked and encrypted by the owning service

The placement service is the API surface. Services don't interact with chunks or shards directly. They call the placement service to store and retrieve named files. The placement service handles: chunking, erasure coding, shard distribution, integrity verification, and reassembly.

Erasure coding sits below chunking. Each chunk is independently erasure-coded (K-of-N shards) and distributed across nodes. See Erasure Coding for the shard layer.

Deduplication is automatic. Two services storing the same base image share chunks. The placement service detects duplicate hashes and avoids redundant storage.

Alternatives

Filename-addressed storage (traditional): Files identified by path. No built-in integrity verification, no deduplication, location-dependent. Simpler but doesn't scale to distributed systems.

Block-level replication (Ceph RBD, DRBD): Replicate raw disk blocks. No deduplication, no content verification above the block level. Better for VM disk images where block-level access matters.

Object storage (S3-compatible): Key-value store for blobs. Can be content-addressed (hash as key) but typically isn't (user-chosen keys). FortrOS's placement service is conceptually similar but with built-in encryption, erasure coding, and topology-aware placement.

Links