Chukwuma Zikora

Software Engineer. RPA and DevOps Enthusiast

Docker, DevOps, Storage, System Administration

What I Learned Researching Docker Storage (and Why My Disk Hates Me)

A deep dive into Docker storage, OverlayFS, and why you should never forget to use volumes for your stateful containers. Learn how to reclaim your disk space and your sanity.

1. The Moment I Knew Something Was Off

It started with a whisper, a familiar and dreaded message from my OS: "No space left on device". A quick df -h confirmed my fears.

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       251G  249G    2G  99% /

99% usage! Docker Swarm was failing with cryptic messages like "this device is not a swarm node" for every docker service command. Panic stations! Where had all my space gone? My first suspect was, as always, Docker, so I ran its built-in disk usage command:

docker system df

The output was staggering:

TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          50        45        40.2GB    15.1GB (37%)
Containers      24        18        195.5GB   190.2GB (97%)
Local Volumes   12        10        5.8GB     1.1GB (18%)
Build Cache     315       0         8.5GB     8.5GB

Over 195GB used by containers alone, and a whopping 190GB of it was "reclaimable." The beast wasn't just in its lair; it had eaten the whole house. The numbers told a clear story: my Docker containers were the problem.

[Illustration: a stressed developer staring at a nearly full hard-drive meter.]

2. OverlayFS for Humans

To understand the problem, I had to understand how Docker stores files. On modern Linux hosts it uses a union filesystem called OverlayFS, through the default overlay2 storage driver.

Imagine you have a base image, like a pristine photograph of a landscape. This is your read-only LowerDir. Now, you want to draw on it, but without ruining the original. You place a clear plastic sheet over it. This sheet is your writable UpperDir. Any changes you make—drawing a sun, adding a tree—happen on this top layer.

To an observer (your container), it looks like a single, merged picture (MergedDir). They see the original landscape and your additions as one cohesive image.

Here's a simplified view of the layers:

Container's View (the "Merged" directory)
  application reads/writes here
              |
              v
+----------------------------------+
|   UpperDir (read-write layer)    |
+----------------------------------+
|   LowerDir (read-only layers)    |
+----------------------------------+
Host's filesystem (/var/lib/docker/overlay2/)

The container sees a unified view of both layers, but every change lands in the UpperDir. When a container reads a file, Docker looks in the UpperDir first; if it isn't there, it falls back to the read-only LowerDir layers. When a container modifies a file that lives in a LowerDir, OverlayFS first copies it up into the UpperDir and edits the copy (copy-on-write); brand-new files are simply created in the UpperDir. This is the crucial part: everything a container writes ends up in its UpperDir.
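
You don't have to take my word for it: Docker will tell you exactly where these directories live for any container. A quick sketch (my-prometheus is a placeholder name; the paths under /var/lib/docker/overlay2/ will differ on your host):

# Show the on-disk overlay directories (LowerDir, UpperDir, MergedDir) for a container
docker inspect -f '{{ json .GraphDriver.Data }}' my-prometheus
 
# Measure just that container's writable layer
sudo du -sh "$(docker inspect -f '{{ .GraphDriver.Data.UpperDir }}' my-prometheus)"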

3. Who's Writing What, Where, and Why?

The "Aha!" moment came when I realized what was being written to the UpperDir. When you run a container for a stateful application—like a PostgreSQL database or a Prometheus monitoring server—without explicitly telling Docker where to store its data, it defaults to writing inside the container's own filesystem.

And where does that live? You guessed it: the UpperDir in /var/lib/docker/overlay2/.

To monitor my Docker swarm, I had deployed a popular monitoring stack using docker-compose. It included Prometheus for metrics, Loki for logs, and Grafana for dashboards. Here's a simplified version of the docker-compose.yml file that caused the headache:

# docker-compose.yml - THE WRONG WAY
version: '3.8'
 
services:
  prometheus:
    image: prom/prometheus:v2.45.0
    ports:
      - '9090:9090'
    # MISTAKE: No volume for data. All metrics written to container layer.
 
  grafana:
    image: grafana/grafana:9.5.3
    ports:
      - '3000:3000'
    # MISTAKE: No volume for dashboards and settings.
 
  loki:
    image: grafana/loki:2.8.2
    ports:
      - '3100:3100'
    # MISTAKE: No volume for logs.

Each of these services is stateful. Prometheus scrapes and stores massive amounts of time-series data. Loki ingests logs. Grafana stores dashboard configurations. Without volumes, all of this data was being written directly into each container's writable layer, bloating it by the minute.
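
If you suspect the same thing is happening on your machine, docker ps can make the damage visible per container. The first value in the SIZE column is the data written to that container's writable layer (the "virtual" value also includes the image):

# Show every container's writable-layer size
docker ps --all --size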

4. The Cost of Forgetting Volumes

The real kicker is that this data is persistent in a non-obvious way. If you stop the container, the UpperDir and all its contents remain. When you restart it, the data is still there. This is great if you intended it, but a nightmare if you didn't.

My Prometheus "test" container, left running for weeks, had silently been writing gigabytes of data per day into its writable layer, contributing significantly to the 190GB of reclaimable space. It was just sitting there, taking up space in a way that wasn't immediately obvious.

And docker system prune? It's a fantastic tool, but it's not a magic bullet. It removes stopped containers, unused networks, and dangling images, but it never touches the writable layer of a container that's still running. Since my Prometheus container was up the whole time, its giant data layer was safe from the prune, silently consuming my SSD.
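
If you'd rather find the worst offenders yourself than trust the summary, you can go straight to disk. A rough sketch, assuming the default Docker data root of /var/lib/docker:

# Biggest layer directories (each diff/ directory holds one layer's files)
sudo du -sh /var/lib/docker/overlay2/*/diff | sort -h | tail -n 5
 
# Map a suspicious directory back to its container by matching the UpperDir path
docker ps -aq | xargs docker inspect -f '{{ .Name }} -> {{ .GraphDriver.Data.UpperDir }}'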

5. How I Got My Sanity (and my Disk Space) Back

The solution is simple, elegant, and fundamental to using Docker correctly: Use volumes.

Volumes are Docker's mechanism for persisting data generated by and used by Docker containers. They are managed by Docker and exist outside the container's lifecycle, on the host machine.

Here's how I fixed my Prometheus problem and how I handle stateful services now:

A. Use Named Volumes for Stateful Data

Instead of letting Docker write to the UpperDir, I now create a named volume.

# i. Create a managed volume
docker volume create prometheus-data
 
# ii. Run the container, mapping the volume to the data directory
docker run -d -p 9090:9090 \
  -v prometheus-data:/prometheus \
  prom/prometheus

Now, all the data Prometheus writes to its /prometheus directory is stored in the prometheus-data volume on the host, completely bypassing the OverlayFS UpperDir.
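
To convince yourself the data has actually moved, ask Docker where the volume lives on the host. On a default install, the local driver keeps it under /var/lib/docker/volumes/:

# Show the volume's host mount point
docker volume inspect prometheus-data
 
# See how much it's holding (path assumes the default data root)
sudo du -sh /var/lib/docker/volumes/prometheus-data/_data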

B. Manage Services with docker-compose.yml

For any real project, I use Docker Compose. It makes managing volumes a breeze.

Here's a simple docker-compose.yml for a web app with a PostgreSQL database:

version: '3.8'
 
services:
  app:
    build: .
    ports:
      - '8000:8000'
    volumes:
      - .:/app # Bind mount for development
    depends_on:
      - db
 
  db:
    image: postgres:15
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres-data:/var/lib/postgresql/data # Named volume for persistence
 
volumes:
  postgres-data: # Declare the named volume here

With this setup, the database files are safe and sound in the postgres-data volume, and my application code is mapped directly for easy development using a bind mount.
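
Applied back to the monitoring stack that started this mess, the corrected compose file looks roughly like this. Treat it as a sketch: the mount paths assume the default data directories in these images (/prometheus for Prometheus, /var/lib/grafana for Grafana, /loki in Loki's bundled config), so verify them against your own configuration:

# docker-compose.yml - THE RIGHT WAY (sketch)
version: '3.8'
 
services:
  prometheus:
    image: prom/prometheus:v2.45.0
    ports:
      - '9090:9090'
    volumes:
      - prometheus-data:/prometheus # metrics live outside the container layer
 
  grafana:
    image: grafana/grafana:9.5.3
    ports:
      - '3000:3000'
    volumes:
      - grafana-data:/var/lib/grafana # dashboards and settings
 
  loki:
    image: grafana/loki:2.8.2
    ports:
      - '3100:3100'
    volumes:
      - loki-data:/loki # log chunks and index
 
volumes:
  prometheus-data:
  grafana-data:
  loki-data: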

C. Bonus: Use tmpfs for Stateless, Temporary Data

Sometimes you need a container to write temporary data that you don't want to persist at all (e.g., caches, temporary logs). For this, tmpfs mounts are perfect. They mount a temporary filesystem in the host's memory.

docker run -d --tmpfs /app/cache my-app

Data written to /app/cache is lightning-fast and disappears the moment the container stops.
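
The same idea works in Compose. A minimal sketch for the hypothetical my-app service from above:

services:
  my-app:
    image: my-app
    tmpfs:
      - /app/cache # kept in memory, discarded when the container stops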

D. Bonus Tip #2: Tame Your Container Logs

It's not just volumes that can save your disk. Container logs are another silent space-eater. By default, Docker uses the json-file driver, which writes container logs (stdout/stderr) to a JSON file on the host. If left unchecked, these files can grow indefinitely.

I once had a blackbox_exporter container fill up 13GB of disk space with logs I wasn't even actively monitoring. The fix is to configure log rotation directly in your docker-compose.yml file.

services:
  blackbox:
    image: prom/blackbox-exporter:latest
    logging:
      driver: 'json-file'
      options:
        max-size: '10m'
        max-file: '5'
    # ... other service configuration ...

This simple configuration tells Docker:

  • max-size: '10m': Let a log file grow to a maximum of 10 megabytes.
  • max-file: '5': Keep a maximum of 5 log files.

When the primary log file hits 10MB, Docker starts a new one. Once it has 5 files, it will start deleting the oldest one. This simple setup prevents any single container from overwhelming your disk with logs.
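
If you'd rather not repeat this in every compose file, the same limits can be set daemon-wide in /etc/docker/daemon.json. They only apply to containers created after the daemon restarts, so existing containers need to be recreated:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "5"
  }
}
 
Then restart Docker (for example, sudo systemctl restart docker) to pick up the new defaults.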

6. OverlayFS: The Intern and The Senior Dev

If you're still wrapping your head around it, think of it this way:

LowerDir (the image) is the cool, calm senior dev. They did their work weeks ago, and it's perfect, stable, and read-only.

UpperDir (the container's writable layer) is the frazzled intern on their first day. They're frantically writing notes on sticky pads and sticking them all over the senior's pristine monitor.

MergedDir (what the container sees) is their shared desk. It looks like a single workspace, but all the chaos is happening on the intern's sticky notes.

A Volume is a proper, dedicated filing cabinet. When the intern needs to store something important, you tell them to put it in the cabinet, not on another sticky note.

7. TL;DR

My disk-space horror story has a simple moral: don't let your containers write wherever they want. Be intentional about data.

  • Use volumes. For everything that writes data you want to keep (databases, uploads, etc.), use a named volume.
  • Don't ignore /var/lib/docker. It won't ignore you. If your disk is full, it's a prime suspect.
  • Understand OverlayFS basics. Knowing about the UpperDir and LowerDir will save you from future headaches.
  • Run docker system df regularly. Treat it like a regular health check to keep tabs on your container environment.