Storage is where Kubernetes gets real. Stateless applications are straightforward; if a pod dies, another takes its place. Stateful workloads are different. Databases, message queues, file stores - they need data to survive pod failures, node failures, and sometimes entire cluster rebuilds.
In air-gapped and secure environments, the usual answer of "use a managed cloud service" doesn't apply. You need to bring your own storage, operate it yourself, and make sure it doesn't lose data.
This post covers how Lattice handles block and object storage and the lessons we've learned about keeping data safe in environments where recovery options are limited.
Two Types of Storage, Two Different Problems
Kubernetes workloads need two fundamentally different types of storage:
Block storage provides persistent volumes that look like disks to containers. Databases, message queues, and applications that need filesystem-level access use block storage. This is your PostgreSQL data directory, your etcd state, your application's local file store.
Object storage provides S3-compatible API access for unstructured data. Backups, log archives, container images, application artifacts, and large files typically use object storage. It's not mounted as a filesystem - applications talk to it over HTTP.
Both are essential for a production platform. Lattice includes Longhorn for block storage and Garage for object storage, each solving different problems.
Block Storage with Longhorn
Longhorn is a distributed block storage system built specifically for Kubernetes. It runs on the cluster itself, using the local disks of cluster nodes to provide replicated persistent volumes.
Why Longhorn
No specialised hardware - Longhorn uses whatever disks are available on your nodes. No SAN, no NFS server, no external storage array. In environments where infrastructure is constrained, this matters.
Kubernetes-native - Longhorn integrates through the Container Storage Interface (CSI). Pods request storage through standard PersistentVolumeClaims, and Longhorn handles provisioning, attachment, and replication transparently.
Replication - Data is replicated across multiple nodes. When a node fails, the data survives on other nodes and the volume remains accessible. The replication factor is configurable per volume; three replicas is the common default for production.
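A minimal sketch of how this looks in practice: a StorageClass that pins the replica count, consumed through an ordinary PVC. The names and sizes here are illustrative, not Lattice defaults.

```yaml
# Illustrative StorageClass: three Longhorn replicas per volume.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-3-replica      # hypothetical name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"         # replicas spread across nodes
  staleReplicaTimeout: "30"     # minutes before a failed replica is cleaned up
reclaimPolicy: Delete
allowVolumeExpansion: true
---
# Workloads request storage through a standard PVC; Longhorn provisions,
# attaches, and replicates the volume behind the scenes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn-3-replica
  resources:
    requests:
      storage: 20Gi
```

The application only sees a mounted filesystem; the replication factor is entirely a property of the StorageClass.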
Snapshot and backup - Longhorn supports volume snapshots (point-in-time copies) and backups to external targets. Combined with object storage, this provides a complete backup strategy.
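Snapshot and backup schedules can be declared with Longhorn's RecurringJob resource. A sketch, assuming a backup target has already been configured; the schedule, group, and retention values are illustrative:

```yaml
# Hypothetical recurring job: nightly backups to the configured backup
# target, keeping the last seven copies.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: nightly-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"    # 02:00 daily
  task: backup          # use "snapshot" for local point-in-time copies instead
  groups: ["default"]   # applies to volumes in the default group
  retain: 7             # keep the last 7 backups
  concurrency: 2        # cap simultaneous jobs to limit I/O impact
```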
Operational Considerations
Longhorn isn't without trade-offs:
Performance - Network-replicated storage is slower than local disk. For most workloads, the difference is negligible. For latency-sensitive applications (high-throughput databases, for instance), it's worth benchmarking and understanding the overhead.
Disk management - Longhorn needs healthy disks on enough nodes to maintain replication. Monitoring disk health and capacity is essential — the observability stack covered in Part 3 handles this with dedicated Longhorn and disk/PVC dashboards.
Rebuilding - When a replica is lost (node failure, disk failure), Longhorn rebuilds it on another node. This consumes network bandwidth and I/O. In resource-constrained environments, rebuilds can impact other workloads if not managed.
Space planning - With three replicas, a 100GB volume consumes 300GB of raw disk across the cluster. Capacity planning needs to account for replication overhead, snapshot space, and headroom for rebuilds.
When Longhorn Isn't Enough
Longhorn is excellent general-purpose block storage. But some workloads need something different:
- Local-path provisioner - For workloads that need raw local disk performance and can tolerate data loss on node failure (caches, temporary storage, stateless processors). K3s includes this by default.
- Host-path volumes - For specific cases where data must reside on a particular node. Less flexible, but sometimes necessary.
The right choice depends on the workload's durability and performance requirements. Longhorn is the default; alternatives exist for specific needs.
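For comparison, opting out of replication is just a different `storageClassName` on the claim. A sketch for a cache that can tolerate losing its data with the node:

```yaml
# Hypothetical PVC backed by K3s's bundled local-path provisioner:
# raw local disk performance, no replication, data dies with the node.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: render-cache
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-path
  resources:
    requests:
      storage: 10Gi
```

Note that local-path volumes tie the pod to the node where the data landed, so this only suits workloads that can rebuild their state from scratch.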
Object Storage with Garage
Garage is a lightweight, S3-compatible object storage system designed for self-hosted environments. It runs on the cluster alongside Longhorn, providing a fundamentally different storage tier.
Why Garage
Lightweight - Garage is designed to run with minimal resources, making it practical for environments where every resource counts. Unlike MinIO, which is well-established but heavier, Garage's footprint is modest enough to run comfortably alongside other workloads.
S3-compatible - Applications and tools that speak S3 work with Garage without modification. Velero for backups, Loki for log chunks, container registries for image storage — they all use the same familiar API.
Geo-aware replication - Garage was designed for distributed deployments across multiple zones or sites. Data placement is zone-aware, ensuring replicas are spread across failure domains. Even in single-site deployments, this translates to node-aware replication.
Resilient by design - Garage tolerates node failures gracefully. Data remains accessible as long as enough nodes are available to satisfy the replication factor.
What Object Storage Serves
In a Lattice deployment, Garage typically handles:
Backup targets - Longhorn volume backups, database dumps, etcd snapshots, and configuration archives. Having backups in object storage means they're accessible independently of the block storage layer. If Longhorn has problems, your backups aren't affected.
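Because Garage speaks S3, pointing a tool like Velero at it is a matter of configuration. A sketch of a BackupStorageLocation, assuming an in-cluster Garage service at `garage.storage.svc:3900` and a pre-created bucket (both are assumptions, not Lattice defaults):

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: garage
  namespace: velero
spec:
  provider: aws             # Garage's S3 API works with the AWS object-store plugin
  objectStorage:
    bucket: velero-backups  # bucket must already exist in Garage
  config:
    region: garage          # placeholder; Garage doesn't enforce AWS regions
    s3ForcePathStyle: "true"
    s3Url: http://garage.storage.svc:3900   # assumed in-cluster endpoint
```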
Log archival - Loki can store log chunks in object storage, extending retention beyond what local disk allows. Older logs move to Garage while recent logs stay in fast local storage.
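Wiring Loki to Garage is similarly a small configuration fragment. A sketch, assuming the same hypothetical endpoint and credentials supplied via environment variables:

```yaml
# Illustrative Loki storage_config fragment pointing chunk storage at Garage.
storage_config:
  aws:
    endpoint: http://garage.storage.svc:3900   # assumed Garage S3 endpoint
    bucketnames: loki-chunks                   # hypothetical bucket
    access_key_id: ${GARAGE_KEY_ID}
    secret_access_key: ${GARAGE_SECRET}
    s3forcepathstyle: true                     # Garage uses path-style addressing
```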
Artifact storage - Container images (when running a private registry), Helm chart archives, deployment bundles, and any other artifacts that need durable storage.
Application data - Any application that stores files, uploads, reports, or exports. Object storage scales more gracefully than block storage for large volumes of unstructured data.
Garage vs MinIO
MinIO is the more established option for self-hosted S3-compatible storage. It's battle-tested, widely deployed, and has extensive tooling. So why Garage?
Resource efficiency - Garage runs comfortably with a fraction of MinIO's resource requirements. For environments where cluster resources are constrained (edge deployments, smaller clusters, or clusters where workloads need most of the capacity), this matters.
Operational simplicity - Garage's configuration and operation are straightforward. Fewer moving parts means fewer things to go wrong.
Replication model - Garage's zone-aware replication is built into the core design, not bolted on. Data placement considers failure domains automatically.
MinIO remains a strong choice, particularly for environments with substantial storage requirements, complex access policies, or teams already experienced with it. Lattice's modular architecture means swapping one for the other is feasible; the S3 API is the same either way.
Combining Block and Object Storage
The real power comes from using both tiers together:
Databases run on block storage - PostgreSQL, etcd, Redis - anything that needs filesystem semantics uses Longhorn persistent volumes. Data is replicated across nodes for durability.
Backups go to object storage - Database dumps, volume snapshots, configuration exports all land in Garage. This creates a separate copy on a separate storage layer. If block storage fails, backups survive.
Logs flow through both - Active logs sit in fast storage for quick queries. Archived logs move to object storage for long-term retention.
Applications choose the right tier - Workloads that need mounted filesystems use block storage. Workloads that store files, artifacts, or large objects use the S3 API.
This layered approach means no single storage failure takes out everything. Block storage problems don't affect backups. Object storage problems don't affect running databases.
Backup Strategies
Backups in air-gapped environments need more thought than "sync to S3 in another region."
What to Back Up
etcd - The cluster's brain. Without it, you've lost all cluster state: deployments, services, secrets, everything. Regular etcd snapshots are non-negotiable.
Persistent volumes - Application data stored in Longhorn volumes. Snapshot schedules provide point-in-time copies; backup to object storage provides an additional tier.
Configuration - Inventories, variable files, certificates, and any customisation. These live in version control, but having a backup in the cluster's object storage provides resilience.
Secrets - Encryption keys, certificates, credentials. Handled carefully - backed up encrypted, access controlled, stored separately from general backups where possible.
Backup Frequency
The right frequency depends on how much data you can afford to lose - the Recovery Point Objective (RPO):
- etcd - Hourly or more frequently. Cluster state changes constantly; losing hours of state means recreating deployments manually.
- Databases - Depends on the application. Transaction-critical databases might need continuous WAL archiving. Others might tolerate daily dumps.
- Volume snapshots - Daily for most workloads, more frequently for critical data.
- Configuration - Every change (via version control), with periodic full exports.
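With K3s, the etcd schedule above can be expressed directly in the server configuration, including offload to S3-compatible storage. A sketch of a `/etc/rancher/k3s/config.yaml` fragment; the endpoint, bucket, and credentials are assumptions:

```yaml
# Hourly etcd snapshots, retained for a day, pushed to Garage over S3.
etcd-snapshot-schedule-cron: "0 * * * *"
etcd-snapshot-retention: 24
etcd-s3: true
etcd-s3-endpoint: garage.storage.svc:3900   # assumed Garage endpoint (host:port)
etcd-s3-bucket: etcd-snapshots              # hypothetical bucket
etcd-s3-access-key: <key-id>
etcd-s3-secret-key: <secret>
etcd-s3-insecure: true                      # plain HTTP inside the cluster
```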
Backup Testing
A backup that can't be restored isn't a backup; it's a waste of storage.
We test restores regularly:
- Can we restore an etcd snapshot to a fresh cluster?
- Can we restore a database dump and verify data integrity?
- Can we recover a Longhorn volume from a Garage backup?
These tests run in staging environments. Discovering your backup process doesn't work during a real disaster is about as bad as not having backups at all.
Disaster Recovery
What happens when things go seriously wrong?
Node Failure
The most common scenario. A node becomes unavailable - hardware failure, OS crash, network partition.
- Block storage - Longhorn serves the volume from remaining replicas. A new replica is rebuilt on another node automatically. Applications may experience a brief interruption during failover.
- Object storage - Garage continues serving from remaining nodes. If replication is sufficient, no data is lost or unavailable.
- Pods - Kubernetes reschedules pods to healthy nodes. Stateless pods recover quickly; stateful pods wait for their volumes to be available.
Disk Failure
A disk fails on one node. Longhorn marks the affected replicas as failed and rebuilds them on healthy disks. Garage rebalances affected data.
Monitoring catches this - the Longhorn dashboard shows degraded volumes, and alerts fire on replica count changes.
Multiple Failures
When enough infrastructure fails simultaneously - multiple nodes, multiple disks, or a combination - recovery depends on how much redundancy was in place.
With three replicas across three or more nodes, losing one node is routine. Losing two simultaneously is serious but survivable for data with three replicas. Losing three means data loss for anything without external backups.
This is why backups to object storage matter. If block storage suffers catastrophic failure, backups provide the recovery path. If the entire cluster is lost, etcd snapshots plus volume backups plus configuration in version control provide the path to rebuild.
Full Cluster Recovery
The worst case: rebuilding from scratch.
- Deploy a fresh cluster using Ansible (same playbooks, same configuration)
- Restore etcd from the most recent snapshot
- Restore persistent volumes from object storage backups
- Verify application functionality
This should be practiced. A documented procedure that's been tested in staging gives confidence that recovery is possible. A theoretical procedure that's never been run is guesswork.
Performance Considerations
Storage performance in Kubernetes has several dimensions:
IOPS - How many I/O operations per second the storage can handle. Matters for databases and transaction-heavy workloads.
Throughput - How much data can be read or written per second. Matters for large file operations and data processing.
Latency - How long each I/O operation takes. Matters for everything, especially databases.
Longhorn adds latency compared to local disk due to network replication. For most applications, this is acceptable. For performance-critical workloads, options include:
- Reducing replica count (trading durability for performance)
- Using local-path provisioner for caches and temporary data
- Placing storage-intensive workloads on nodes with faster disks
- Using data locality settings to prefer local replicas for reads
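Several of these options combine naturally in a dedicated StorageClass. A sketch, trading some durability for performance; the name and values are illustrative:

```yaml
# Hypothetical "fast" StorageClass: fewer replicas, reads preferred from
# a replica co-located with the pod.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"         # less write amplification, less durability
  dataLocality: "best-effort"   # keep a replica on the node running the workload
```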
Garage's performance characteristics differ from block storage - object storage is optimised for throughput rather than latency. Large file uploads and downloads are efficient; many small operations are less so. Design applications accordingly.
Lessons Learned
Plan capacity early - Storage that fills up causes cascading failures. Monitor usage, set alerts at 70-80% capacity, and have a plan for expansion.
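As one sketch of what such an alert might look like, assuming the Prometheus Operator and Longhorn's exported node metrics (threshold and names are illustrative):

```yaml
# Hypothetical PrometheusRule: warn when a node's Longhorn disks pass 75%.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: storage-capacity
  namespace: monitoring
spec:
  groups:
    - name: storage
      rules:
        - alert: LonghornNodeStorageFilling
          expr: |
            longhorn_node_storage_usage_bytes
              / longhorn_node_storage_capacity_bytes > 0.75
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Longhorn storage above 75% on {{ $labels.node }}"
```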
Test recovery before you need it - Backup and restore procedures should be practiced regularly, not discovered during incidents.
Separate your tiers - Block storage failures shouldn't affect your backups. Object storage problems shouldn't affect running databases. Independence between storage layers is deliberate.
Monitor everything - Disk health, volume replication, backup success, capacity trends. Storage failures are often predictable if you're watching the right metrics.
Document your storage architecture - Which workloads use which storage tier, replication factors, backup schedules, recovery procedures. When someone needs to make a decision at 2 AM, documentation saves time.
What's Next
The next post in this series will cover service mesh - when Istio adds genuine value, when it's unnecessary complexity, and how to introduce it incrementally.
Lattice is a project developed by Digital Native Group.