Linux Software RAID with mdadm: Complete Setup, Monitoring, and Disk Failure Recovery Guide
🎯 Key Takeaways
- mdadm provides free, portable software RAID: arrays reassemble on any Linux system with no controller lock-in
- Use RAID 1 for boot/OS volumes, RAID 6 for large data arrays, and RAID 10 for high-I/O databases; avoid RAID 5 on large drives
- Save the array to mdadm.conf and rebuild the initramfs so arrays assemble at boot
- Enable the mdmonitor daemon for email alerts and keep a hot spare so rebuilds start automatically
- Practice the fail/remove/replace/rebuild procedure before a real disk dies
📑 Table of Contents
- RAID Levels Explained: Which to Choose
- Prerequisites and Disk Preparation
- Creating a RAID 1 Mirror (Two Disks)
- Creating a RAID 5 Array (Three or More Disks)
- Creating a RAID 6 Array (Four or More Disks)
- Creating RAID 10 (Stripe of Mirrors)
- Filesystem, Mounting, and fstab
- Monitoring RAID Health
- Responding to a Disk Failure
- Adding a Hot Spare
- Expanding an Existing Array
- Booting from a RAID Array
- Benchmarking RAID Performance
- Conclusion
Linux software RAID with mdadm is a mature, reliable, and completely free alternative to hardware RAID controllers. It runs on any server regardless of the storage controller, supports hot-spare replacement, live expansion, and RAID level migration — all without specialized hardware. This guide covers setting up RAID arrays, monitoring them for failures, recovering from a failed disk, and expanding capacity without data loss.
RAID Levels Explained: Which to Choose
| RAID Level | Min Disks | Fault Tolerance | Usable Capacity | Best For |
|---|---|---|---|---|
| RAID 0 | 2 | None | 100% (N disks) | Scratch/cache (no redundancy) |
| RAID 1 | 2 | 1 disk | 50% (N/2) | Boot volumes, OS disks |
| RAID 5 | 3 | 1 disk | (N-1)/N | General storage, read-heavy |
| RAID 6 | 4 | 2 disks | (N-2)/N | Large arrays, critical data |
| RAID 10 | 4 | 1 per mirror pair | 50% | Databases, high I/O |
Recommendation for most use cases: RAID 1 for boot/OS volumes, RAID 6 for large data arrays (4+ disks), and RAID 10 for databases or other high-write-throughput workloads. Avoid RAID 5 when drives exceed roughly 4TB: rebuilding a large drive can take many hours, the array has no redundancy left while the rebuild runs, and a second failure in that window loses the whole array.
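The usable-capacity column reduces to simple arithmetic; a quick shell sketch (disk counts and sizes below are illustrative):

```shell
# Usable capacity in TB for N disks of a given size at each RAID level
usable_tb() {
  local level=$1 n=$2 size=$3
  case $level in
    0)  echo $(( n * size ));;        # striping, no redundancy
    1)  echo $(( n * size / 2 ));;    # mirror
    5)  echo $(( (n - 1) * size ));;  # one disk of parity
    6)  echo $(( (n - 2) * size ));;  # two disks of parity
    10) echo $(( n * size / 2 ));;    # stripe of mirrors
  esac
}
usable_tb 6 4 4   # 4 x 4TB in RAID 6 -> 8
usable_tb 5 3 8   # 3 x 8TB in RAID 5 -> 16
```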
Prerequisites and Disk Preparation
Install mdadm
# RHEL / Rocky Linux / Fedora
dnf install -y mdadm
# Ubuntu / Debian
apt install -y mdadm
Identify Available Disks
# List all block devices and their current state
lsblk
# Identify disks with no partition table (ideal for new RAID members)
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,MODEL
# Check for existing RAID signatures (zero these before reuse)
mdadm --examine /dev/sdb
mdadm --examine /dev/sdc
Zero Superblocks on Disks Being Reused
# CAUTION: This destroys any existing RAID metadata on the disk
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
# Also wipe partition signatures
wipefs -a /dev/sdb
wipefs -a /dev/sdc
Partition the Disks (Optional but Recommended)
Using partitions rather than whole disks makes it easier to replace drives from different manufacturers that may be slightly smaller.
# Create a single partition spanning the whole disk using parted
parted /dev/sdb --script \
mklabel gpt \
mkpart primary 1MiB 100%
parted /dev/sdc --script \
mklabel gpt \
mkpart primary 1MiB 100%
# Verify
lsblk /dev/sdb /dev/sdc
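If a future replacement drive turns out marginally smaller than the originals, a partition ending at 100% will not fit on it. One hedge is to stop the partition about 100MiB short of the end; the sketch below only prints the parted commands (the margin and device names are examples) so you can review them before running anything:

```shell
# Build parted commands that stop 100MiB short of the end of each disk,
# so a slightly smaller replacement drive can still hold the partition.
# Printed for review rather than executed; drop the echo to apply them.
for disk in /dev/sdb /dev/sdc; do
  cmd="parted $disk --script mklabel gpt mkpart primary 1MiB -100MiB"
  echo "$cmd"
done
```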
Creating a RAID 1 Mirror (Two Disks)
# Create RAID 1 array using two partitions
mdadm --create /dev/md0 \
--level=1 \
--raid-devices=2 \
/dev/sdb1 /dev/sdc1
# Monitor initial sync progress
watch -n2 cat /proc/mdstat
# Sample /proc/mdstat output during the initial sync:
# Personalities : [raid1]
# md0 : active raid1 sdc1[1] sdb1[0]
#       976759936 blocks super 1.2 [2/2] [UU]
#       [=====>...............]  resync = 27.4% (267632222/976759936) finish=98.5min speed=120000K/sec
# Examine the array
mdadm --detail /dev/md0
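The initial sync speed is capped by two kernel tunables. Checking and, if needed, raising them shortens the window before the mirror is actually redundant (the 50000 KB/s figure is an example; the settings revert at reboot):

```shell
# Current per-device resync throttle, in KB/s (kernel defaults: 1000 min, 200000 max)
for f in speed_limit_min speed_limit_max; do
  val=$(cat /proc/sys/dev/raid/$f 2>/dev/null || echo "n/a")
  printf '%s = %s\n' "$f" "$val"
done
# Temporarily raise the floor so the resync is not starved by normal I/O:
# sysctl -w dev.raid.speed_limit_min=50000
```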
Creating a RAID 5 Array (Three or More Disks)
# Create RAID 5 with three disks (one parity disk equivalent)
mdadm --create /dev/md0 \
--level=5 \
--raid-devices=3 \
/dev/sdb1 /dev/sdc1 /dev/sdd1
# Create RAID 5 with four disks and a hot spare
mdadm --create /dev/md0 \
--level=5 \
--raid-devices=3 \
--spare-devices=1 \
/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# Monitor sync — RAID 5 takes longer than RAID 1 for the same capacity
cat /proc/mdstat
Creating a RAID 6 Array (Four or More Disks)
# RAID 6: tolerates two simultaneous disk failures
# Minimum 4 disks, usable capacity = (N-2) disks
mdadm --create /dev/md0 \
--level=6 \
--raid-devices=4 \
/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# Example: 4 × 4TB disks → usable capacity = 2 × 4TB = 8TB
# Verify creation
mdadm --detail /dev/md0
Creating RAID 10 (Stripe of Mirrors)
# RAID 10 requires an even number of disks (minimum 4)
# Provides both striping performance and mirror redundancy
# Specify the layout explicitly; n2 (2-way near mirror) is the default and
# the best choice for most cases
mdadm --create /dev/md0 \
--level=10 \
--raid-devices=4 \
--layout=n2 \
/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# Layout options:
# n2 (near): Mirrored copies on sequential drives — good random read
# f2 (far): Copies on far sectors — better sequential read
# o2 (offset): Offset copies — balances read performance
Filesystem, Mounting, and fstab
# The array is usable during the initial sync, so you can create the filesystem
# right away; performance is simply reduced until the sync completes
# (done when /proc/mdstat shows [UU] with no resync line)
# Create XFS filesystem (recommended for RAID — journal helps recovery)
mkfs.xfs -f /dev/md0
# Or ext4 with proper stripe parameters (improves performance)
mdadm --detail /dev/md0 | grep "Chunk Size"
# Chunk Size 512K with 4K blocks gives stride = 512/4 = 128; a 3-disk RAID 5
# has 2 data disks, so stripe-width = 128 * 2 = 256
mkfs.ext4 -b 4096 -E stride=128,stripe-width=256 /dev/md0
# Mount and test
mkdir -p /data/raid
mount /dev/md0 /data/raid
df -h /data/raid
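The stride arithmetic generalises: stride = chunk size / filesystem block size, and stripe-width = stride × data disks (N-1 for RAID 5, N-2 for RAID 6). A small sketch:

```shell
# Compute ext4 stride and stripe-width from chunk size and disk count
chunk_kb=512      # from: mdadm --detail /dev/md0 | grep "Chunk Size"
block_kb=4        # ext4 block size (4096 bytes)
raid_disks=3
parity_disks=1    # 1 for RAID 5, 2 for RAID 6
stride=$(( chunk_kb / block_kb ))
stripe_width=$(( stride * (raid_disks - parity_disks) ))
echo "mkfs.ext4 -b 4096 -E stride=$stride,stripe-width=$stripe_width /dev/md0"
```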
Persistent Mount via fstab
# Get the UUID of the RAID array (more reliable than /dev/md0 which can change)
blkid /dev/md0
# /dev/md0: UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="xfs"
# Add to /etc/fstab
echo "UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /data/raid xfs defaults,nofail 0 0" >> /etc/fstab
# Test fstab without rebooting
mount -a
df -h /data/raid
Save mdadm Configuration
# Save the array config so mdadm can assemble it on next boot
# (inspect the file first; appending twice leaves duplicate ARRAY lines)
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
# Or on RHEL/Rocky (no /etc/mdadm/ by default):
mdadm --detail --scan >> /etc/mdadm.conf
# Update initramfs so RAID is available at early boot
update-initramfs -u # Debian/Ubuntu
dracut -f # RHEL/Rocky/Fedora
Monitoring RAID Health
Check Array Status
# Quick status of all arrays
cat /proc/mdstat
# Detailed info for a specific array
mdadm --detail /dev/md0
# Check individual disk state
mdadm --examine /dev/sdb1
# Key status strings in /proc/mdstat:
# [UU] = all disks Up (healthy)
# [U_] = one disk failed/missing (degraded)
# [_U] = same, different position
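That underscore pattern lends itself to a simple cron check; the sketch below flags any array with a missing member (wire the non-zero exit into your alerting of choice):

```shell
# Exit non-zero (and print the offending lines) if any array in an
# mdstat-format file shows a missing member: an underscore inside [UU...]
check_mdstat() {
  if grep -E '\[[U_]*_[U_]*\]' "$1"; then
    return 1          # degraded
  fi
  return 0            # all members up
}
# In cron you would pass /proc/mdstat; here, a sample degraded status:
printf 'md0 : active raid1 sdb1[0]\n      1024 blocks [2/1] [U_]\n' > /tmp/mdstat.sample
check_mdstat /tmp/mdstat.sample || echo "ALERT: array degraded"
```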
Email Alerts on Failure
# Enable mdadm monitoring and email alerts
# Edit /etc/mdadm/mdadm.conf (or /etc/mdadm.conf):
MAILADDR admin@example.com
MAILFROM mdadm@yourserver.com
# Start the monitor daemon
systemctl enable --now mdmonitor
# Send a one-off test alert for every array to confirm delivery
mdadm --monitor --scan --oneshot --test
Prometheus Metrics with node_exporter
# node_exporter's md collector exposes RAID metrics automatically
# (names below are from node_exporter v1.x; older releases differ):
# node_md_disks{state="active|failed|spare"}: member disks by state
# node_md_disks_required: disks the array needs to be fully healthy
# node_md_state{state="active|inactive|recovering|resync|check"}: 1 for the current state
# PromQL alert expression (fires when md0 is missing a member):
# node_md_disks{device="md0",state="active"} < node_md_disks_required{device="md0"}
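Wrapped into a Prometheus alerting rule, a degraded-array check might look like the sketch below (metric names assume node_exporter v1.x; the group and alert names are placeholders):

```yaml
groups:
  - name: mdraid
    rules:
      - alert: MdArrayDegraded
        expr: node_md_disks{state="active"} < on(device, instance) node_md_disks_required
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "md array {{ $labels.device }} on {{ $labels.instance }} is degraded"
```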
Responding to a Disk Failure
# Step 1: Identify the failed disk (mdadm marks it as (F)ailed)
mdadm --detail /dev/md0
cat /proc/mdstat
# Step 2: Fail the disk manually if mdadm hasn't already
mdadm /dev/md0 --fail /dev/sdb1
# Step 3: Remove the failed disk from the array
mdadm /dev/md0 --remove /dev/sdb1
# Step 4: Physically replace the disk (server may support hot-swap)
# After replacement, the new disk appears as /dev/sdb
# Step 5: Partition the new disk identically to other members
parted /dev/sdb --script mklabel gpt mkpart primary 1MiB 100%
# Step 6: Add the new disk to the array
mdadm /dev/md0 --add /dev/sdb1
# Step 7: Monitor the rebuild
watch -n5 cat /proc/mdstat
# Or block until the rebuild finishes (useful in scripts)
mdadm --wait /dev/md0
Adding a Hot Spare
A hot spare is a disk added to the array that sits idle until a member disk fails. When failure occurs, mdadm automatically begins rebuilding onto the spare — no manual intervention required.
# Add a hot spare to an existing array
mdadm /dev/md0 --add /dev/sde1
# Verify it shows as spare (S)
mdadm --detail /dev/md0 | grep Spare
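With several arrays, one spare can cover all of them: give the ARRAY lines in mdadm.conf the same spare-group, and the monitor daemon will move the spare to whichever array loses a disk. A sketch (device names and UUIDs are placeholders):

```
# /etc/mdadm/mdadm.conf — a single spare serves both arrays
ARRAY /dev/md0 UUID=aaaaaaaa:bbbbbbbb:cccccccc:dddddddd spare-group=shared
ARRAY /dev/md1 UUID=eeeeeeee:ffffffff:00000000:11111111 spare-group=shared
```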
Expanding an Existing Array
# Add a new disk to increase array size (RAID 5/6 supports this)
mdadm /dev/md0 --add /dev/sdf1
# Grow the array to use the new disk; --backup-file protects the critical
# section of the reshape if the system crashes mid-operation (path is an example)
mdadm --grow /dev/md0 --raid-devices=4 --backup-file=/root/md0-reshape.backup
# Monitor the reshape (can take hours for large arrays)
cat /proc/mdstat
# After reshape completes, grow the filesystem to use new space
xfs_growfs /data/raid # XFS (online, no unmount needed)
# or
resize2fs /dev/md0 # ext4 (can be done while mounted)
Booting from a RAID Array
For a RAID-protected OS installation, create a small RAID 1 for /boot and the main RAID array for /. This ensures the system boots even if one disk fails.
# During installation (Anaconda/Debian installer):
# Create partitions on both disks:
# /dev/sda1 + /dev/sdb1 → RAID 1 → /boot (500MB, ext4)
# /dev/sda2 + /dev/sdb2 → RAID 1 → / (remaining, xfs)
# Post-install: install GRUB to both disks so either can boot (BIOS boot;
# on UEFI, mirror or copy the EFI system partition to the second disk instead)
grub2-install /dev/sda
grub2-install /dev/sdb
grub2-mkconfig -o /boot/grub2/grub.cfg
Benchmarking RAID Performance
# Sequential read/write benchmark (direct I/O bypasses the page cache)
fio --name=seqread --rw=read --bs=1M --size=4G --numjobs=1 --direct=1 \
--filename=/data/raid/testfile --runtime=30 --time_based
fio --name=seqwrite --rw=write --bs=1M --size=4G --numjobs=1 --direct=1 \
--filename=/data/raid/testfile --runtime=30 --time_based
# Random I/O (databases)
fio --name=randread --rw=randread --bs=4K --size=1G --numjobs=4 --direct=1 \
--filename=/data/raid/testfile --runtime=30 --time_based --group_reporting
# Simple dd benchmarks
dd if=/dev/zero of=/data/raid/testfile bs=1M count=4096 oflag=direct
dd if=/data/raid/testfile of=/dev/null bs=1M iflag=direct
# Clean up test file
rm /data/raid/testfile
Conclusion
Linux software RAID with mdadm delivers enterprise-grade redundancy on any hardware, from a $30 server chassis to a hyperscale storage node. The key advantages over hardware RAID — portability (arrays reassemble on any Linux system), full transparency (every operation is visible in /proc/mdstat and logs), and the ability to grow and reshape arrays online — make it a compelling choice even when hardware RAID is available. Set up email alerts through mdadm’s monitor daemon, keep a hot spare in each critical array, and test your recovery procedure before you need it: mount a degraded array, confirm data access, then rebuild. That drill will pay off the day a real disk failure wakes you at 3 AM.
About Ramesh Sundararamaiah
Red Hat Certified Architect
Expert in Linux system administration, DevOps automation, and cloud infrastructure. Specializing in Red Hat Enterprise Linux, CentOS, Ubuntu, Docker, Ansible, and enterprise IT solutions.