Danny Willems -- Work In Progress


26 August 2025

ZFS Crash Course

by Danny Willems

ZFS is one of those technologies you only need to set up once to realize you’ll never go back. Originally developed by Sun Microsystems, it combines the roles of a volume manager and a filesystem. That means you don’t need to stack LVM, ext4, and mdadm: ZFS handles storage pooling, snapshots, compression, checksumming, replication, and more, all in one place.

This crash course is designed for people who just installed ZFS and want to get productive fast.

At LeakIX, we rely on ZFS for our own datasets, where we manage tens of terabytes of security scan data. The reliability and snapshotting features are critical for handling that scale safely.

Why ZFS?

Core Concepts

Basic Workflow

1. Create a Pool

Say you have two disks: /dev/sdb and /dev/sdc. To create a mirrored pool named tank:

zpool create tank mirror /dev/sdb /dev/sdc

To check its status:

zpool status

2. Add a Dataset

Datasets are like sub-filesystems:

zfs create tank/data
zfs set compression=lz4 tank/data

Now tank/data is mounted (at /tank/data by default) and ready to use. Compression applies only to data written after the property is set.

3. Snapshots & Rollbacks

Create a snapshot:

zfs snapshot tank/data@before-upgrade

Roll back if something breaks (this discards every change made after the snapshot):

zfs rollback tank/data@before-upgrade
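Snapshots are easier to manage when their names sort by time. The following is a minimal sketch of a naming helper, not part of ZFS itself: snapshot_name is a hypothetical function, and the timestamp format is only one possible convention.

```shell
# Hypothetical helper: build a snapshot name such as tank/data@2025-08-26_143000.
# $1 = dataset, $2 = optional timestamp (defaults to the current local time).
# Nothing here calls zfs; it only prints the name.
snapshot_name() {
    printf '%s@%s\n' "$1" "${2:-$(date +%Y-%m-%d_%H%M%S)}"
}

# Usage (left commented out because it needs a real pool):
# zfs snapshot "$(snapshot_name tank/data)"
# zfs list -t snapshot -r tank/data
```

Keeping the name generation separate from the zfs call makes it possible to check the name before anything is written to the pool.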

4. Send & Receive (Backups)

To send a snapshot to another host (the pool backup must already exist there):

zfs send tank/data@before-upgrade | ssh user@backuphost zfs receive backup/data
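After the first full transfer, later backups can be sent incrementally with zfs send -i, which transmits only the blocks that changed between two snapshots. The sketch below assumes dataset and snapshot names without spaces; send_args is a hypothetical helper and after-upgrade is a hypothetical second snapshot.

```shell
# Hypothetical helper: print the arguments for a full or incremental zfs send.
# $1 = dataset, $2 = snapshot to send, $3 = previous snapshot (optional).
send_args() {
    if [ -n "${3:-}" ]; then
        # Incremental stream: only changes between $3 and $2 are sent.
        printf '%s %s@%s %s@%s\n' -i "$1" "$3" "$1" "$2"
    else
        # Full stream: the whole snapshot is sent.
        printf '%s@%s\n' "$1" "$2"
    fi
}

# Usage (left commented out; needs real pools and a reachable backup host):
# zfs send $(send_args tank/data after-upgrade before-upgrade) | ssh user@backuphost zfs receive backup/data
```

An incremental receive only succeeds if the previous snapshot (here before-upgrade) already exists on the receiving side.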

5. Monitoring

Check usage:

zfs list

Check health:

zpool status -v

What is a ZFS Scrub?

A scrub is a data integrity check that reads all data in the pool, verifies checksums, and repairs any corrupted blocks using redundancy (from mirrors or raidz).
You can think of it like fsck for ZFS, but online: the pool stays in use while the scrub runs, at the cost of some extra I/O. Schedule scrubs periodically (e.g., monthly) to catch silent corruption while a healthy copy still exists to repair it from.

# Start a scrub
zpool scrub poolname

# Watch progress, refreshing every 10 seconds
watch -n 10 zpool status poolname

# Stop a running scrub
zpool scrub -s poolname

Checking Scrub Status

You can check the status of scrubs directly with ZFS commands:

zpool status

The output includes a scan: line showing when the last scrub was run, how much data was scanned, and whether errors were found.
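For unattended checks, the scan: line of zpool status can be extracted with a small filter. This is a rough sketch: scan_line and scrub_clean are hypothetical helpers, and the "with 0 errors" wording is an assumption based on typical OpenZFS output that may differ between versions.

```shell
# Hypothetical helper: read `zpool status` output on stdin and print
# the text that follows "scan:".
scan_line() {
    sed -n 's/^[[:space:]]*scan:[[:space:]]*//p'
}

# Hypothetical helper: succeed only if the scan line reports a scrub
# that finished "with 0 errors" (assumed wording, see above).
scrub_clean() {
    scan_line | grep -q 'with 0 errors'
}

# Usage (left commented out; needs a real pool named poolname):
# zpool status poolname | scrub_clean || echo "no clean completed scrub reported for poolname"
```

The check also fails when no scrub has run yet or one is still in progress, so a failure means "look at the pool", not necessarily "data is corrupted".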

What is atime?

The atime property tracks the last access time of files. While useful for some workloads (e.g., mail servers), it causes additional writes every time a file is read, which can hurt performance.
For most datasets, it’s safe and recommended to disable it with:

zfs set atime=off pool/dataset

Things to Keep in Mind

Understanding the ZFS Cache

ZFS uses an advanced caching system to improve performance:

Adding an NVMe as a Cache Device

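If your pool is backed by slower spinning disks and you don’t want to rely only on RAM, you can add a fast NVMe drive as an L2ARC device. L2ARC is a second-level read cache that sits behind the in-RAM ARC: blocks evicted from RAM can still be served from the NVMe instead of the spinning disks. It does not replace RAM, since the index of what is stored on the L2ARC device is itself kept in the ARC.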

Example:

zpool add tank cache /dev/nvme0n1

Keep in mind:

Best Practices

You can inspect ARC stats directly from /proc/spl/kstat/zfs/arcstats on Linux or use monitoring exporters to visualize hit ratios over time.
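As a quick check without a monitoring stack, the hit ratio can be computed from the hits and misses counters in that file. The sketch below assumes the usual three-column layout (name, type, data); arc_hit_ratio is a hypothetical helper name.

```shell
# Hypothetical helper: print the ARC hit ratio as a percentage.
# $1 = path to the stats file (defaults to the Linux arcstats location).
arc_hit_ratio() {
    awk '
        $1 == "hits"   { hits = $3 }
        $1 == "misses" { misses = $3 }
        END {
            total = hits + misses
            if (total > 0) printf "%.2f\n", 100 * hits / total
            else print "n/a"
        }
    ' "${1:-/proc/spl/kstat/zfs/arcstats}"
}

# Usage (left commented out; the file only exists when the ZFS module is loaded):
# arc_hit_ratio
```

The counters are cumulative since the module was loaded, so the result is a long-term average and not the current hit rate.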

Quick Cheat Sheet

# Create a pool
zpool create tank mirror /dev/sdb /dev/sdc

# Create a dataset
zfs create tank/projects

# Set compression
zfs set compression=lz4 tank/projects

# Take a snapshot
zfs snapshot tank/projects@2025-08-26

# Rollback to snapshot
zfs rollback tank/projects@2025-08-26

# Backup with send/receive
zfs send tank/projects@2025-08-26 | ssh backup zfs receive tank-backup/projects

Conclusion

ZFS is much more than a filesystem: it’s a full data management platform. With snapshots, replication, and built-in integrity checks, it’s one of the most robust choices for anyone who cares about their data.

If you’ve just set it up on your server, spend some time learning its philosophy: pools, datasets, snapshots. With that foundation, you’ll have a rock-solid storage system that grows with your needs.


tags: ZFS - Filesystem - Storage - Linux