NAS – Next-gen filesystems – BTRFS RAID 5

Introduction

This part continues from the previous post, except that I’ve loaded up some extra virtual disks for testing RAID5. RAID5 is the most experimental part of BTRFS, having only been implemented in kernel 3.19.

The tests

Test One: Setting up the RAID5

So let’s look at our setup. In this case I have three 6GB disks.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     6G  0 disk
sdc                           8:32   0     6G  0 disk
sdd                           8:48   0     6G  0 disk

So let’s create a BTRFS RAID5 on top of those three.

sudo mkfs.btrfs -d raid5 -m raid5 -L disk-raid5 /dev/sdb /dev/sdc /dev/sdd
sudo btrfs fi show
Label: 'disk-raid5'  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 3 FS bytes used 112.00KiB
        devid    1 size 6.00GiB used 1.23GiB path /dev/sdb
        devid    2 size 6.00GiB used 1.21GiB path /dev/sdc
        devid    3 size 6.00GiB used 1.21GiB path /dev/sdd

As you can see, all devices are connected and assigned to the RAID5.

Now let’s add it to our fstab file.

nano /etc/fstab

Use the UUID ‘5e8d29ae-aea8-4460-a049-fae62e9994fd’ from the ‘fi show’ command.

UUID=5e8d29ae-aea8-4460-a049-fae62e9994fd /media/btrfs-raid5          btrfs defaults 0       0
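
If you’d rather not copy the UUID by hand, it can be read from any member device (a convenience step, not part of the original walkthrough; blkid ships with util-linux):

sudo blkid -s UUID -o value /dev/sdb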

Now create our mountpoint.

sudo mkdir -p /media/btrfs-raid5

And test if it all works with a reboot.

sudo reboot

After rebooting we should see our new filesystem. Note that it will show 18GB instead of the roughly 12GB that is actually usable: ‘df’ has a terrible way of reporting space for BTRFS filesystems and simply shows the raw capacity.

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  43% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  4.0K  477M   1% /dev
tmpfs                        98M  984K   97M   1% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sdb                     18G   17M   16G   1% /media/btrfs-raid5
/dev/sda1                   236M  100M  124M  45% /boot
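
For a more honest picture than df gives, BTRFS’s own reporting (used further on in this series) shows how space is actually allocated per profile:

sudo btrfs filesystem df /media/btrfs-raid5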

Enabling Samba

Add our ‘/media/btrfs-raid5’ as a samba share so we can add some files.

sudo nano /etc/samba/smb.conf
[btrfs-raid5]
   comment = Test BTRFS RAID 5
   browseable = yes
   path = /media/btrfs-raid5
   valid users = btrfs
   writable = yes
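
Optionally, the Samba configuration can be validated before restarting the service (an extra check on my part; testparm ships with the samba package):

testparm -s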

Assign correct rights

sudo chown -R btrfs:btrfs /media/btrfs-raid5/

Restart the service to apply our changes

sudo service smbd restart

Now we can copy some test data to the disks

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  4.0K  477M   1% /dev
tmpfs                        98M  1.3M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sdb                     18G  5.3G  7.6G  42% /media/btrfs-raid5
/dev/sda1                   236M  100M  124M  45% /boot
sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 3 FS bytes used 5.28GiB
        devid    1 size 6.00GiB used 3.95GiB path /dev/sdb
        devid    2 size 6.00GiB used 3.93GiB path /dev/sdc
        devid    3 size 6.00GiB used 3.93GiB path /dev/sdd

Test Two: Expanding the RAID5

For this test I’ve shut down the machine and added an extra disk ‘sde’.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     6G  0 disk
sdc                           8:32   0     6G  0 disk
sdd                           8:48   0     6G  0 disk
sde                           8:64   0     6G  0 disk

Next, add it to the filesystem mounted at ‘/media/btrfs-raid5’. (df reports /dev/sdb as the device backing this mountpoint, but the add command takes the mountpoint itself.)

sudo btrfs device add /dev/sde /media/btrfs-raid5

Once added you can query the filesystem to see if it’s there.

sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    1 size 6.00GiB used 3.95GiB path /dev/sdb
        devid    2 size 6.00GiB used 3.93GiB path /dev/sdc
        devid    3 size 6.00GiB used 3.93GiB path /dev/sdd
        devid    4 size 6.00GiB used 0.00 path /dev/sde

What is noticeable is that there isn’t any data on this disk once we add it. Do note that with BTRFS you are responsible for rebalancing your RAID5 once you start adding disks. Before we balance, let’s record the md5sums so we can check afterwards whether the balance did any harm.

md5sum /media/btrfs-raid5/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid5/S01E01 FLEMISH HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid5/S01E02 FLEMISH HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid5/S01E03 FLEMISH HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid5/S01E04 FLEMISH HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid5/S01E05 FLEMISH HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid5/S01E06 FLEMISH HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid5/S01E07 FLEMISH HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid5/S01E08 FLEMISH HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid5/S01E09 FLEMISH HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid5/S01E10 FLEMISH HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid5/S01E11 FLEMISH HDTV x264.mp4
edac6a857b137136a4d27bf6926e1287  /media/btrfs-raid5/S01E12 FLEMISH HDTV x264.mp4

Next start the balance.

sudo btrfs balance start /media/btrfs-raid5
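
The balance runs in the foreground here. From a second shell its progress can be queried while it runs (not shown in the original run):

sudo btrfs balance status /media/btrfs-raid5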

Once done we should see that the data is evenly spread across the disks.

sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    1 size 6.00GiB used 2.56GiB path /dev/sdb
        devid    2 size 6.00GiB used 2.56GiB path /dev/sdc
        devid    3 size 6.00GiB used 2.56GiB path /dev/sdd
        devid    4 size 6.00GiB used 2.56GiB path /dev/sde

Now let’s generate our md5sums again to see if the data is changed.

md5sum /media/btrfs-raid5/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid5/S01E01 FLEMISH HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid5/S01E02 FLEMISH HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid5/S01E03 FLEMISH HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid5/S01E04 FLEMISH HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid5/S01E05 FLEMISH HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid5/S01E06 FLEMISH HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid5/S01E07 FLEMISH HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid5/S01E08 FLEMISH HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid5/S01E09 FLEMISH HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid5/S01E10 FLEMISH HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid5/S01E11 FLEMISH HDTV x264.mp4
edac6a857b137136a4d27bf6926e1287  /media/btrfs-raid5/S01E12 FLEMISH HDTV x264.mp4

Still all good.

Test Three: Replacing a disk

So let’s replace our first disk, sdb. I chose this disk as it backs the mount point in the df output, which makes it an ideal test case to break.

NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     6G  0 disk
sdc                           8:32   0     6G  0 disk
sdd                           8:48   0     6G  0 disk
sde                           8:64   0     6G  0 disk
sdf                           8:80   0     6G  0 disk

For RAID5 you can add and remove devices, although I would recommend ‘replace’. But as the aim is to see whether anything breaks, I am going ahead with the add/delete option.
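
For reference, the recommended route would look roughly like this, reusing the ‘replace’ command that shows up later in this series (a sketch, not what I’m doing in this test):

sudo btrfs replace start /dev/sdb /dev/sdf /media/btrfs-raid5
sudo btrfs replace status /media/btrfs-raid5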

sudo btrfs device add /dev/sdf /media/btrfs-raid5
sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 5 FS bytes used 5.28GiB
        devid    1 size 6.00GiB used 2.56GiB path /dev/sdb
        devid    2 size 6.00GiB used 2.56GiB path /dev/sdc
        devid    3 size 6.00GiB used 2.56GiB path /dev/sdd
        devid    4 size 6.00GiB used 2.56GiB path /dev/sde
        devid    5 size 6.00GiB used 0.00 path /dev/sdf

Now delete the old /dev/sdb.

sudo btrfs device delete /dev/sdb /media/btrfs-raid5

Deleting the device forces its data to be relocated across the remaining disks, as it would otherwise no longer be redundant. Note: this could take a long time.
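
If you want to keep an eye on the delete while it runs (a suggestion; the original just waits), watching the shrinking ‘used’ figures works well enough:

sudo watch btrfs fi show

Once the delete finishes, ‘fi show’ confirms the new layout.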

sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.56GiB path /dev/sdc
        devid    3 size 6.00GiB used 2.56GiB path /dev/sdd
        devid    4 size 6.00GiB used 2.56GiB path /dev/sde
        devid    5 size 6.00GiB used 2.56GiB path /dev/sdf

OK, now that this is done, generate the md5sums again.

md5sum /media/btrfs-raid5/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid5/S01E01 FLEMISH HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid5/S01E02 FLEMISH HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid5/S01E03 FLEMISH HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid5/S01E04 FLEMISH HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid5/S01E05 FLEMISH HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid5/S01E06 FLEMISH HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid5/S01E07 FLEMISH HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid5/S01E08 FLEMISH HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid5/S01E09 FLEMISH HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid5/S01E10 FLEMISH HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid5/S01E11 FLEMISH HDTV x264.mp4
edac6a857b137136a4d27bf6926e1287  /media/btrfs-raid5/S01E12 FLEMISH HDTV x264.mp4

To see whether these changes survive, let’s check if the filesystem comes back up after a reboot.

sudo reboot

Yup, here it is again.

sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.56GiB path /dev/sdc
        devid    3 size 6.00GiB used 2.56GiB path /dev/sdd
        devid    4 size 6.00GiB used 2.56GiB path /dev/sde
        devid    5 size 6.00GiB used 2.56GiB path /dev/sdf

Test Four: Crashing a disk

In this case I physically (or virtually) disconnected a disk.
So when booting you will see ‘An error occurred while mounting /media/btrfs-raid5’. Press s to skip.
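
If you’d rather not have the boot stall on a missing array at all, the standard ‘nofail’ mount option can be added to the fstab entry (a general mount option, not something the original setup uses):

UUID=5e8d29ae-aea8-4460-a049-fae62e9994fd /media/btrfs-raid5          btrfs defaults,nofail 0       0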

Let’s verify the filesystem.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     6G  0 disk
sdc                           8:32   0     6G  0 disk
sdd                           8:48   0     6G  0 disk
sde                           8:64   0     6G  0 disk

So at first sight it looks like we disconnected disk ‘sdf’, but this is a false report (see later). Let’s verify whether the filesystem is still mounted (it shouldn’t be, since we chose to skip).

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  4.0K  477M   1% /dev
tmpfs                        98M  1.2M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot

Now let’s inspect our BTRFS filesystem.

sudo btrfs fi show
Label: 'disk-raid5'  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.59GiB path /dev/sdc
        devid    4 size 6.00GiB used 2.59GiB path /dev/sdd
        devid    5 size 6.00GiB used 2.56GiB path /dev/sde
        *** Some devices missing

The missing device is devid 3, which was /dev/sdd before the reboot, not sdf. This is the false report: the surviving disks were simply renamed, so the /dev/ assignments got shuffled.

So let’s see if we can repair this. Let’s try mounting it.

sudo mount -v -t btrfs LABEL=disk-raid5 /media/btrfs-raid5/

This won’t work. It seems we need the ‘-o degraded’ mount option.

sudo mount -v -t btrfs -o degraded LABEL=disk-raid5 /media/btrfs-raid5/

This should work.

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M   12K  477M   1% /dev
tmpfs                        98M  1.2M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot
/dev/sdc                     24G  5.3G   13G  31% /media/btrfs-raid5
sudo btrfs fi show

When we inspect the filesystem, we can see that devid 3 is no longer mapped to a device path. However, it still ‘knows’ how much data should be there.

Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.56GiB path /dev/sdc
        devid    3 size 6.00GiB used 2.56GiB path
        devid    4 size 6.00GiB used 2.56GiB path /dev/sdd
        devid    5 size 6.00GiB used 2.56GiB path /dev/sde

Let’s verify if it has some impact on the data.

md5sum /media/btrfs-raid5/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid5/S01E01 FLEMISH HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid5/S01E02 FLEMISH HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid5/S01E03 FLEMISH HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid5/S01E04 FLEMISH HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid5/S01E05 FLEMISH HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid5/S01E06 FLEMISH HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid5/S01E07 FLEMISH HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid5/S01E08 FLEMISH HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid5/S01E09 FLEMISH HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid5/S01E10 FLEMISH HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid5/S01E11 FLEMISH HDTV x264.mp4
edac6a857b137136a4d27bf6926e1287  /media/btrfs-raid5/S01E12 FLEMISH HDTV x264.mp4

Still all good. Performance seems to be severely impacted, but that is to be expected with the missing drive.

Now delete all missing disks from the file system.

sudo btrfs device delete missing /media/btrfs-raid5/

As you can see, this causes a rebalance, so on a nearly full array this step will likely fail. A ‘replace’ should be used instead of this approach.

sudo btrfs fi show
        Total devices 3 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.88GiB path /dev/sdc
        devid    4 size 6.00GiB used 2.88GiB path /dev/sdd
        devid    5 size 6.00GiB used 2.88GiB path /dev/sde

Let’s reuse /dev/sdb, so wipe its old filesystem signature first (note the ‘-a’ flag; without it wipefs only lists the signatures).

sudo wipefs -a /dev/sdb

Now add this disk to the btrfs RAID5.

sudo btrfs device add /dev/sdb /media/btrfs-raid5
sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.88GiB path /dev/sdc
        devid    4 size 6.00GiB used 2.88GiB path /dev/sdd
        devid    5 size 6.00GiB used 2.88GiB path /dev/sde
        devid    6 size 6.00GiB used 0.00 path /dev/sdb

Now that we have added /dev/sdb we can see a severe imbalance in how the data is spread. Luckily this is easily fixed with a balance.

sudo btrfs balance start /media/btrfs-raid5
sudo btrfs fi show
Label: disk-raid5  uuid: 5e8d29ae-aea8-4460-a049-fae62e9994fd
        Total devices 4 FS bytes used 5.28GiB
        devid    2 size 6.00GiB used 2.56GiB path /dev/sdc
        devid    4 size 6.00GiB used 2.56GiB path /dev/sdd
        devid    5 size 6.00GiB used 2.56GiB path /dev/sde
        devid    6 size 6.00GiB used 2.56GiB path /dev/sdb
Let's check our files again.
md5sum /media/btrfs-raid5/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid5/S01E01 FLEMISH HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid5/S01E02 FLEMISH HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid5/S01E03 FLEMISH HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid5/S01E04 FLEMISH HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid5/S01E05 FLEMISH HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid5/S01E06 FLEMISH HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid5/S01E07 FLEMISH HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid5/S01E08 FLEMISH HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid5/S01E09 FLEMISH HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid5/S01E10 FLEMISH HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid5/S01E11 FLEMISH HDTV x264.mp4
edac6a857b137136a4d27bf6926e1287  /media/btrfs-raid5/S01E12 FLEMISH HDTV x264.mp4

To verify that the BTRFS filesystem survives a reboot, reboot once more.

sudo reboot

Test Five: Byte corruption

For this test I will fill the entire RAID with as much data as possible. (Four 6GB drives should give around 18GB of usable space.)

du -sh /media/btrfs-raid5
18G     /media/btrfs-raid5
btrfs filesystem df /media/btrfs-raid5/
Data, RAID5: total=17.62GiB, used=17.02GiB
System, RAID5: total=96.00MiB, used=16.00KiB
Metadata, RAID5: total=288.00MiB, used=19.20MiB
unknown, single: total=16.00MiB, used=0.00

Also note that df isn’t really a great tool to calculate the free space on our BTRFS partition.

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  4.0K  477M   1% /dev
tmpfs                        98M  1.3M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot
/dev/sdb                     24G   18G   50M 100% /media/btrfs-raid5

To test this I will shut down the machine and use wxHexEditor to corrupt some bytes of our virtual disks. This emulates a disk writing bad bytes.
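
As an alternative that stays inside the VM (a hypothetical sketch; the offset is arbitrary and assumes it lands in allocated data, well past the 64KiB superblock), a few bytes of one member device can be clobbered with dd while the filesystem is unmounted:

sudo umount /media/btrfs-raid5
# overwrite 4KiB of /dev/sdc at roughly the 1.9GiB mark to simulate a bad write
sudo dd if=/dev/zero of=/dev/sdc bs=4K seek=500000 count=1 conv=notrunc
sudo mount /media/btrfs-raid5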

So once this is done start the machine again.

cat /var/log/syslog | grep BTRFS

This shows no errors, which is normal as BTRFS hasn’t verified anything yet. To trigger detection you can read a file (which checks its checksums on access), or start a scrub of the disks manually.

sudo btrfs scrub start /media/btrfs-raid5/

Once scrubbing starts we can follow the process.

sudo watch btrfs scrub status /media/btrfs-raid5/
scrub status for 5e8d29ae-aea8-4460-a049-fae62e9994fd
        scrub started at Sun Nov 22 17:08:46 2015, running for 120 seconds
        total bytes scrubbed: 15.62GiB with 3 errors
        error details: csum=3
        corrected errors: 3, uncorrectable errors: 0, unverified errors: 0

Now syslog will show errors popping up.

Nov 22 17:09:22 btrfs kernel: [  261.969305] BTRFS: checksum error at logical 44498075648 on dev /dev/sdc, sector 10316768, root 5, inode 281, offset 569163776, length 4096, links 1 (path: S01E05.mkv)
Nov 22 17:09:22 btrfs kernel: [  261.969310] BTRFS: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Nov 22 17:09:22 btrfs kernel: [  262.159200] BTRFS: fixed up error at logical 44498075648 on dev /dev/sdc
Nov 22 17:09:28 btrfs kernel: [  267.507804] BTRFS: checksum error at logical 48935047168 on dev /dev/sdc, sector 12507592, root 5, inode 287, offset 426938368, length 4096, links 1 (path: S02E03.mkv)
Nov 22 17:09:28 btrfs kernel: [  267.507809] BTRFS: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Nov 22 17:09:28 btrfs kernel: [  267.717962] BTRFS: fixed up error at logical 48935047168 on dev /dev/sdc
Nov 22 17:10:29 btrfs kernel: [  328.740414] BTRFS: checksum error at logical 45808136192 on dev /dev/sdc, sector 11169624, root 5, inode 283, offset 555790336, length 4096, links 1 (path: S01E07.mkv)

So these were the lengthy RAID5 tests and the last part on BTRFS. Next up is mhddfs.

Powershell – Windows Firewall: Trusting range of IPs

Introduction

Just a code snippet this time. For a demo environment I had to trust an entire set of IPs. Being the lazy person that I am, I created a little PowerShell script to add them all at once.

The code

$IPs = @("10.10.10.210", "10.10.10.211", "10.10.10.212", "10.10.10.213", "10.10.10.214", "10.10.10.215", "10.10.10.216", "10.10.10.217", "10.10.10.218", "10.10.10.219") |`
	Foreach-object {
 
	#delete old rule (if there is one)
	netsh advfirewall firewall delete rule name="Allow from $_"
	#add new rule 
	netsh advfirewall firewall add rule name="Allow from $_" dir=in action=allow protocol=ANY remoteip=$_
	write-host "$_ Added Incoming for $?"
 
	#delete old rule (if there is one)
	netsh advfirewall firewall delete rule name="Allow to $_"
	#add new rule 
	netsh advfirewall firewall add rule name="Allow to $_" dir=out action=allow protocol=ANY remoteip=$_
	write-host "$_ Added Outgoing $?"
}

NAS – Next-gen filesystems – BTRFS RAID 0 & conversion to RAID 5

Introduction

This post will handle the creation of a RAID0 and its conversion to RAID5. This is a use case for the creation of my NAS: as I don’t have spare disks lying around, I will migrate the data from my current RAID5 to a BTRFS RAID0, then decommission the old RAID5 and add one of its disks to the RAID0 to create a new BTRFS RAID5 system.

Test One: creating a RAID0

Our first test is to create a RAID0 on our BTRFS filesystem. I am currently using three 5GB disks. For the RAID0 I will be using /dev/sdb and /dev/sdc.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     5G  0 disk
sdc                           8:32   0     5G  0 disk
sdd                           8:48   0     5G  0 disk

So let’s create our RAID0 setup. Note that I am using ‘--mixed’ mode. This is for small disks only!

sudo mkfs.btrfs -d raid0 -m raid0 -L disk-raid0 --mixed /dev/sdb /dev/sdc

Now we can see in ‘fi show’ that we have two devices working together.

sudo btrfs fi show
Label: 'disk-raid0'  uuid: a81963e6-8cfe-4da1-abed-579bc80669c7
        Total devices 2 FS bytes used 28.00KiB
        devid    1 size 5.00GiB used 532.00MiB path /dev/sdb
        devid    2 size 5.00GiB used 520.00MiB path /dev/sdc

Next, add it to the fstab file so it will start on boot.

sudo nano /etc/fstab

Use the UUID found in the ‘sudo btrfs fi show’ command.

UUID=a81963e6-8cfe-4da1-abed-579bc80669c7 /media/btrfs-raid0          btrfs defaults 0       0

Create our mountpoint.

sudo mkdir -p /media/btrfs-raid0

And reboot to test if the configuration works.

sudo reboot

Once rebooted you should see the RAID0 being mounted on ‘/media/btrfs-raid0’

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  4.0K  477M   1% /dev
tmpfs                        98M  1.3M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot
/dev/sdc                     10G  4.3M   10G   1% /media/btrfs-raid0

Samba mount to test

Before continuing, I’d like to add a Samba mount point so I can move files to our newly created RAID0.

sudo nano /etc/samba/smb.conf
[btrfs-raid0]
   comment = Test BTRFS RAID 0
   browseable = yes
   path = /media/btrfs-raid0
   valid users = btrfs
   writable = yes

Assign correct ownership levels.

sudo chown -R btrfs:btrfs /media/btrfs-raid0/

Restart the service to verify if everything works.

sudo service smbd restart

Now copy some files and verify the content. (Skipping this)

Test Two: converting to a RAID5

Before we continue with the conversion to RAID5, let’s check our setup first.

sudo btrfs fi show

This shows two disks in use: sdb and sdc.

Label: disk-raid0  uuid: a81963e6-8cfe-4da1-abed-579bc80669c7
        Total devices 2 FS bytes used 288.00KiB
        devid    1 size 5.00GiB used 528.00MiB path /dev/sdb
        devid    2 size 5.00GiB used 520.00MiB path /dev/sdc

Let’s fill it up with some 5GB of data and verify the content.

btrfs filesystem df /media/btrfs-raid0/
System, RAID0: total=16.00MiB, used=4.00KiB
Data+Metadata, RAID0: total=6.00GiB, used=5.28GiB
Data+Metadata, single: total=8.00MiB, used=4.00KiB
unknown, single: total=112.00MiB, used=0.00
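
If you don’t have real files at hand, equivalent test data can be generated locally (a sketch; the original post copied files in over Samba):

# write five 1GiB files of random data into the RAID0 mount
for i in 1 2 3 4 5; do
  sudo dd if=/dev/urandom of=/media/btrfs-raid0/test$i.bin bs=1M count=1024
done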

Before converting to RAID5 we need to add a disk, as RAID5 requires a minimum of three disks and our RAID0 currently uses two. Let’s add the /dev/sdd disk.

sudo btrfs device add /dev/sdd /media/btrfs-raid0/

Now let’s see what has happened to our drives. As you can see, nothing has been written to the added drive yet. Normally you would rebalance, but the conversion will trigger a rebalance anyway.

sudo btrfs fi show
Label: disk-raid0  uuid: a81963e6-8cfe-4da1-abed-579bc80669c7
        Total devices 3 FS bytes used 5.28GiB
        devid    1 size 5.00GiB used 3.02GiB path /dev/sdb
        devid    2 size 5.00GiB used 3.01GiB path /dev/sdc
        devid    3 size 5.00GiB used 0.00 path /dev/sdd

Now let’s convert to a RAID5 system.

sudo btrfs balance start -dconvert=raid5 -mconvert=raid5 /media/btrfs-raid0/
sudo btrfs fi show

Now you can see that the filesystem has been rebalanced and data has been written to our extra disk ‘sdd’.

Label: disk-raid0  uuid: a81963e6-8cfe-4da1-abed-579bc80669c7
        Total devices 3 FS bytes used 5.28GiB
        devid    1 size 5.00GiB used 4.03GiB path /dev/sdb
        devid    2 size 5.00GiB used 4.03GiB path /dev/sdc
        devid    3 size 5.00GiB used 4.03GiB path /dev/sdd

Now let’s verify if we can see the correct RAID level.

btrfs filesystem df /media/btrfs-raid0/
System, RAID5: total=64.00MiB, used=4.00KiB
Data+Metadata, RAID5: total=8.00GiB, used=5.28GiB
unknown, single: total=112.00MiB, used=0.00

Voilà, done. The RAID0 is now a RAID5. :)

NAS – Next-gen filesystems – BTRFS RAID 1

Setting up BTRFS and upgrading the kernel

My test setup is based on Ubuntu 14.04.3 LTS. This version still uses a 3.19 kernel, and for BTRFS it’s better to use a newer stable release, so we will update the kernel. In my case I will be updating to kernel 4.1.13 (at the moment of testing this is the latest stable: https://www.kernel.org/).
Download the header and image files from the Ubuntu mainline kernel archive.

wget kernel.ubuntu.com/~kernel-ppa/mainline/v4.1.13-wily/linux-headers-4.1.13-040113_4.1.13-040113.201511092325_all.deb 
wget kernel.ubuntu.com/~kernel-ppa/mainline/v4.1.13-wily/linux-headers-4.1.13-040113-generic_4.1.13-040113.201511092325_amd64.deb 
wget kernel.ubuntu.com/~kernel-ppa/mainline/v4.1.13-wily/linux-image-4.1.13-040113-generic_4.1.13-040113.201511092325_amd64.deb

And let’s install them.

sudo dpkg -i linux-headers-4.1*.deb linux-image-4.1*.deb

Once completed, reboot to check whether the new kernel has been applied.

sudo reboot

Checking the running kernel can be done with ‘uname’; it should now report something like 4.1.13-040113-generic.

uname -r

Next install the ‘btrfs-tools’ package.

sudo apt-get install btrfs-tools

For convenience I am installing ‘samba’ too.

sudo apt-get install samba

Create a Samba password for the ‘btrfs’ user.

sudo smbpasswd -a btrfs
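
Note that ‘smbpasswd -a’ assumes a matching Unix account named ‘btrfs’ already exists; if it doesn’t, create it first (an extra step for completeness):

sudo adduser btrfs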

Disk setup

Let’s add three disks to our virtual machine. In this example I’ve added three 5GB disks.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     5G  0 disk
sdc                           8:32   0     5G  0 disk
sdd                           8:48   0     5G  0 disk

Test One: creating a RAID1

Let’s create a RAID1 filesystem spanning sdb & sdc. If you are using disks larger than ~16GB, please don’t use ‘--mixed’: see Intermezzo 1.

sudo mkfs.btrfs -d raid1 -m raid1 -L disk-raid1 --mixed /dev/sdb /dev/sdc 

Intermezzo 1: why ‘--mixed’?

To test the ‘--mixed’ option I’ve created a setup without it and filled it up with a lot of data. Let’s see what happens.

df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  8.0K  477M   1% /dev
tmpfs                        98M  1.3M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot
/dev/sdc                    5.0G  3.8G  203M  96% /media/btrfs-raid1

So after 3.8GB the disk already reports as nearly full. Quite odd, no? The disk size should be somewhere around 5GB.
Let’s see what btrfs shows.

sudo btrfs filesystem df /media/btrfs-raid1
Data, RAID1: total=3.97GiB, used=3.77GiB
Data, single: total=8.00MiB, used=0.00
System, RAID1: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=1.00GiB, used=4.23MiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=16.00MiB, used=0.00

What you can see is that the metadata allocation takes up 1GB (Metadata, RAID1: total=1.00GiB). This means 1GB of the RAID is unusable for data. Hence, 3.8GB + 1GB ~= 5GB.
Let’s try to balance.

sudo btrfs balance start -v /media/btrfs-raid1
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x0): balancing
  METADATA (flags 0x0): balancing
  SYSTEM (flags 0x0): balancing
ERROR: error during balancing '/media/btrfs-raid1' - No space left on device

Apparently it doesn’t work :(.

More info: https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_I_ran_out_of_disk_space.21

/end intermezzo

Let’s continue.
Our new filesystem now spans sdb & sdc. The ‘show’ command should give a little bit more information.

sudo btrfs fi show
Label: 'disk-raid1'  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 2 FS bytes used 28.00KiB
        devid    1 size 5.00GiB used 1.02GiB path /dev/sdb
        devid    2 size 5.00GiB used 1.01GiB path /dev/sdc

Now let’s add this to our fstab file.

sudo nano /etc/fstab

fstab allows ‘btrfs’ as a mount type, so we will use that.

UUID=07759bba-2b6b-4d9a-b09f-605acbd6da0b /media/btrfs-raid1          btrfs defaults 0       0

Create our mountpoint.

sudo mkdir -p /media/btrfs-raid1 

Now reboot and see if our fstab works.

sudo reboot
df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M  4.0K  477M   1% /dev
tmpfs                        98M  1.2M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot
/dev/sdc                    5.0G   17M  4.0G   1% /media/btrfs-raid1

Configuring Samba

Next create a samba share to add some test data.

sudo nano /etc/samba/smb.conf
[btrfs-raid1]
   comment = Test BTRFS RAID 1
   browseable = yes
   path = /media/btrfs-raid1
   valid users = btrfs
   writable = yes

Change ownership.

sudo chown -R btrfs:btrfs /media/btrfs-raid1/
sudo service smbd restart

Let’s create MD5 sums of my copied files so we can verify file integrity later on.

md5sum /media/btrfs-raid1/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid1/S01E01 HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid1/S01E02 HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid1/S01E03 HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid1/S01E04 HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid1/S01E05 HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid1/S01E06 HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid1/S01E07 HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid1/S01E08 HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid1/S01E09 HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid1/S01E10 HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid1/S01E11 HDTV x264.mp4

Test 2: Replacing a disk

Let’s replace sdb with sdd in our setup.

sdd is also a 5GB disk so this should not pose any problems.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     5G  0 disk
sdc                           8:32   0     5G  0 disk
sdd                           8:48   0     5G  0 disk
sudo btrfs fi show
Label: disk-raid1  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 2 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdb
        devid    2 size 5.00GiB used 4.99GiB path /dev/sdc

So let’s start replacing. The command to replace a disk is ‘replace’.

sudo btrfs replace start /dev/sdb /dev/sdd /media/btrfs-raid1

During the replace you will see an extra ‘devid’ attached; it remains listed for the duration of the replace.

sudo btrfs fi show
Label: disk-raid1  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 3 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdb
        devid    2 size 5.00GiB used 4.99GiB path /dev/sdc
        devid    0 size 0.00 used 0.00 path

To check the replace status run ‘btrfs replace status’.

sudo btrfs replace status /media/btrfs-raid1
31.2% done, 0 write errs, 0 uncorr. read errs

Once everything is done the filesystem should reflect our desired new configuration.

sudo btrfs fi show
Label: disk-raid1  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 2 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdd
        devid    2 size 5.00GiB used 4.99GiB path /dev/sdc
df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/btrfs--vg-root  4.6G  1.9G  2.5G  44% /
none                        4.0K     0  4.0K   0% /sys/fs/cgroup
udev                        477M   12K  477M   1% /dev
tmpfs                        98M  1.3M   97M   2% /run
none                        5.0M     0  5.0M   0% /run/lock
none                        488M     0  488M   0% /run/shm
none                        100M     0  100M   0% /run/user
/dev/sda1                   236M  100M  124M  45% /boot
/dev/sdc                    5.0G  4.8G  307M  95% /media/btrfs-raid1

Let’s verify our MD5 sums again to see if everything is still consistent.

md5sum /media/btrfs-raid1/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid1/S01E01 HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid1/S01E02 HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid1/S01E03 HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid1/S01E04 HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid1/S01E05 HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid1/S01E06 HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid1/S01E07 HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid1/S01E08 HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid1/S01E09 HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid1/S01E10 HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid1/S01E11 HDTV x264.mp4

Test 3: Killing one off

In this case we physically (or virtually) disconnect a disk, see what happens, and repair it.
So when booting you will see ‘An error occurred while mounting /media/btrfs-raid1’. Press s to skip.
The ‘lsblk’ output will no longer show sdd. This is expected.

lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0     6G  0 disk
├─sda1                        8:1    0   243M  0 part /boot
├─sda2                        8:2    0     1K  0 part
└─sda5                        8:5    0   5.8G  0 part
  ├─btrfs--vg-root (dm-0)   252:0    0   4.8G  0 lvm  /
  └─btrfs--vg-swap_1 (dm-1) 252:1    0     1G  0 lvm  [SWAP]
sdb                           8:16   0     5G  0 disk
sdc                           8:32   0     5G  0 disk

So let’s check the status of our BTRFS filesystem again.

sudo btrfs fi show
Label: 'disk-raid1'  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 2 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdc
        *** Some devices missing

Interesting: we see that a device is missing, but not which one.

So let’s try to mount our filesystem.

sudo mount -v -t btrfs LABEL=disk-raid1 /media/btrfs-raid1/

This won’t work because there is one drive missing. We will need to mount our filesystem in degraded mode.

sudo mount -v -t btrfs -o degraded LABEL=disk-raid1 /media/btrfs-raid1/

Once mounted in degraded mode we will see a little bit more: the devid is listed, but without a drive path (as the drive is gone).

sudo btrfs fi show
Label: disk-raid1  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 2 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdc
        devid    2 size 5.00GiB used 4.99GiB path

So let’s verify our content first (in degraded mode).

md5sum /media/btrfs-raid1/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid1/S01E01 HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid1/S01E02 HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid1/S01E03 HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid1/S01E04 HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid1/S01E05 HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid1/S01E06 HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid1/S01E07 HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid1/S01E08 HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid1/S01E09 HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid1/S01E10 HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid1/S01E11 HDTV x264.mp4

Our filesystem shows that all our data is still intact, so let’s continue with replacing the disk.

So how do we replace? Add and remove? Think again… that would destroy your RAID1, unless you have less than 50% of it in use.

Luckily replace works with devids too. (As there is no /dev/ mapping anymore.) In our case the missing devid is ‘2’.

sudo btrfs replace start 2 /dev/sdb /media/btrfs-raid1

So let’s check if everything is working. While the replace is running, devid 2 will still be listed.

sudo btrfs fi show
Label: disk-raid1  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 3 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdc
        devid    2 size 5.00GiB used 4.99GiB path
        devid    0 size 0.00 used 0.00 path

To follow the replace you can also query its status.

sudo btrfs replace status /media/btrfs-raid1

Once done, ‘fi show’ will show the repaired filesystem.

sudo btrfs fi show
Label: disk-raid1  uuid: 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        Total devices 2 FS bytes used 4.68GiB
        devid    1 size 5.00GiB used 5.00GiB path /dev/sdc
        devid    2 size 5.00GiB used 4.99GiB path /dev/sdb

Let’s verify our file integrity once more.

md5sum /media/btrfs-raid1/*
03486548bc7b0f1a3881dc00c0f8c5f8  /media/btrfs-raid1/S01E01 HDTV x264.mp4
a9390aed84a6be8c145046772296db26  /media/btrfs-raid1/S01E02 HDTV x264.mp4
2e37ed514579ac282986efd78ac3bb76  /media/btrfs-raid1/S01E03 HDTV x264.mp4
1596a5e56f14c843b5c27e2d3ff27ebd  /media/btrfs-raid1/S01E04 HDTV x264.mp4
f7d494d6858391ac5c312d141d9ee0e5  /media/btrfs-raid1/S01E05 HDTV x264.mp4
fe6f097ff136428bfc3e2a1b8e420e4e  /media/btrfs-raid1/S01E06 HDTV x264.mp4
43c5314079f08570f6bb24b5d6fde101  /media/btrfs-raid1/S01E07 HDTV x264.mp4
3b5ea952b632bbc58f608d64667cd2a1  /media/btrfs-raid1/S01E08 HDTV x264.mp4
db6b8bf608de2008455b462e76b0c1dd  /media/btrfs-raid1/S01E09 HDTV x264.mp4
0d5775373e1168feeef99889a1d8fe0a  /media/btrfs-raid1/S01E10 HDTV x264.mp4
8dd4b25c249778f197fdb33604fdb998  /media/btrfs-raid1/S01E11 HDTV x264.mp4

Test 4: Redundancy check

For our last test we shall damage one of the vmdk files of the virtual machine. This can be done with wxHexEditor: just open the file and overwrite some bytes with zeroes.

Once damaged, start a scrub to see if the errors are detected.

sudo btrfs scrub start /media/btrfs-raid1/

You can also request a status.

sudo watch btrfs scrub status /media/btrfs-raid1/
scrub status for 07759bba-2b6b-4d9a-b09f-605acbd6da0b
        scrub started at Fri Dec  4 15:28:30 2015 and finished after 29 seconds
        total bytes scrubbed: 9.36GiB with 7 errors
        error details: csum=7
        corrected errors: 7, uncorrectable errors: 0, unverified errors: 0

Or grep the results from syslog.

cat /var/log/syslog | grep BTRFS
Dec  4 15:28:52 btrfs kernel: [   70.751292] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Dec  4 15:28:52 btrfs kernel: [   70.799491] BTRFS: fixed up error at logical 3785674752 on dev /dev/sdb
Dec  4 15:28:55 btrfs kernel: [   73.465940] BTRFS: checksum error at logical 4282617856 on dev /dev/sdb, sector 8341960, root 5, inode 266, offset 512442368, length 4096, links 1 (path: S01E10 HDTV x264.mp4)
Dec  4 15:28:55 btrfs kernel: [   73.465946] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Dec  4 15:28:55 btrfs kernel: [   73.480159] BTRFS: fixed up error at logical 4282617856 on dev /dev/sdb
Dec  4 15:28:55 btrfs kernel: [   73.585207] BTRFS: checksum error at logical 4327469056 on dev /dev/sdb, sector 8429560, root 5, inode 266, offset 158330880, length 4096, links 1 (path: S01E10 HDTV x264.mp4)
Dec  4 15:28:55 btrfs kernel: [   73.585213] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
Dec  4 15:28:55 btrfs kernel: [   73.660404] BTRFS: fixed up error at logical 4327469056 on dev /dev/sdb
Dec  4 15:28:58 btrfs kernel: [   76.693662] BTRFS: checksum error at logical 4892540928 on dev /dev/sdb, sector 9533216, root 5, inode 267, offset 246824960, length 4096, links 1 (path: S01E11 HDTV x264.mp4)
Dec  4 15:28:58 btrfs kernel: [   76.693667] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Dec  4 15:28:58 btrfs kernel: [   76.761209] BTRFS: fixed up error at logical 4892540928 on dev /dev/sdb
Dec  4 15:28:59 btrfs kernel: [   77.698926] BTRFS: checksum error at logical 5076742144 on dev /dev/sdb, sector 9892984, root 5, inode 267, offset 432074752, length 4096, links 1 (path: S01E11 HDTV x264.mp4)
Dec  4 15:28:59 btrfs kernel: [   77.698932] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
Dec  4 15:28:59 btrfs kernel: [   77.719481] BTRFS: fixed up error at logical 5076742144 on dev /dev/sdb

So that was every test I wanted to do for RAID1. Up next will be converting existing filesystems.

NAS – Next-gen filesystems – Introduction

Planning a NAS upgrade: Ubuntu 16.04 LTS

For Ubuntu 16.04 LTS I intend to upgrade my NAS and move away from my trusted mdadm RAID. To that end I will be testing whether there are newer filesystems that offer more versatility in creating, expanding and maintaining RAID arrays.
I will be looking at BTRFS for my RAID1 and RAID5 needs and at MHDDFS for JBOD.

Tests

Part 1: BTRFS RAID 1
Part 2: BTRFS RAID 0 & conversion to RAID 5
Part 3: BTRFS RAID 5
Part 4: MHDDFS
Part 5: Conclusion

Benchmarking VMware ESXi versus Citrix XenServer

Introduction

In this little post I’ll be comparing VMware ESXi and Citrix XenServer.

The base image is a Windows 10 32-bit image, on which I installed the Passmark CPU benchmark software. Each VM has 2048MB RAM available, and only one instance runs on the hypervisor host at a time.

The baseline for the i3-5010U is a Passmark score of 3054 and a single-threaded score of 1178.

Testing

For each test I will gradually increase the CPU allocation. P stands for the Passmark score, S for the single-thread score.

Test 1: 1 vCPU

Test one will be a 1 virtual CPU, running a test on each machine.

XenServer: 1 vCPU	P: 910 S: 983	P: 905 S:975	P: 918 S:903	
VMware: 1 vCPU		P: 957 S: 1146	P: 964 S:1151  	P: 951 S:1153

You can see that in the single-threaded score there is a penalty of around 15-18% for Citrix XenServer, while VMware does a better job with only around 3% loss.
As for the overall Passmark score, VMware again does better than XenServer, but the difference is small: around 5% on a single-core VM.

Test 2: 2 vCPUs

2 virtual cores with one core per socket (2×1)

XenServer: 2 vCPU	P: 1583 S:993 	P: 1633 S:991 	P: 1643 S:985
VMware: 2 vCPU		P: 1826 S:1150	P: 1826 S:1155 	P: 1829 S: 1155

In this test VMware keeps a rock-steady single-threaded score, and Citrix XenServer stays level too: adding more virtual CPUs doesn’t improve single-threaded performance.
However, with the extra CPU added to both virtual machines the Passmark score roughly doubles. Good evolution, we are getting there.

Test 3: 4 vCPUs

This test is a little bit odd. XenServer allows a setup of 4 virtual cores on one socket, and those results land in the same range as the 2×1 test, since the i3-5010U has only 2 cores with 2 threads each. This shows that XenServer offers more layout options than VMware, though it’s worth checking your hardware layout before throwing around cores.

4 virtual cores with one core per socket

XenServer: (4x1) 4 vCPU	P: 1691 S: 987	P: 1668 S: 988 	P: 1732 S: 995
XenServer: (2x2) 4 vCPU	P: 2180 S: 987  P: 2110 S: 991  P: 2131 S: 993
VMware: (2x2) 4 vCPU	P: 2337 S: 1157 P: 2335 S: 1161 P: 2339 S: 1162

This test gives the final scores. Noticeably, the single-thread score doesn’t improve much, and Citrix XenServer remains the lower of the two.

All results

(Chart: single-threaded scores for all tests.)

(Chart: Passmark scores for all tests.)

Conclusion

You can see that Citrix XenServer stays below VMware in raw performance. Does that make Citrix XenServer useless? No: despite the lower CPU performance, each XenServer host includes functionality that in VMware requires extra components. In VMware you need a vCenter server running for some of the advanced features, whereas in a Citrix XenServer environment the high-availability configuration, for example, lives in each hypervisor.
This could explain some of the difference in performance.

NAS – Improving Owncloud speed

Introduction

This part continues from the NAS – Part 4: Owncloud.

By default Owncloud can be quite slow, and my setup is no different: it grabs all its files from the network, which causes a delay. To improve this I will be using the RAM disk suggested at: https://doc.owncloud.org/server/8.0/admin_manual/configuration_server/performance_tuning.html

Implementation

In the past I mounted the network location directly on ‘/var/www/owncloud/’, but that isn’t possible anymore, as I want this folder to be used for the RAM disk.
Let’s create a new mount point named ‘/media/network’ and change our fstab to reflect this change.

sudo mkdir -p /media/network
sudo nano /etc/fstab
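
The exact fstab entry depends on how the network share is exported; a purely hypothetical NFS example (host and export path are made up):

nas.local:/export/owncloud  /media/network  nfs  defaults  0  0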

Unmount and remount everything again and verify that it is mounted.

sudo umount /var/www/owncloud
sudo mount -a
 
df -h

Now we shall create the RAM disk. Verify that your installation is less than 192MB in size (hint: du).
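
Following the ‘du’ hint, a quick check looks like this:

sudo du -sh /var/www/owncloud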

sudo mount -t tmpfs -o size=192m tmpfs /var/www/owncloud

And add it to the fstab file.

sudo nano /etc/fstab
tmpfs       /var/www/owncloud tmpfs   nodev,nosuid,noexec,nodiratime,size=192M   0 0

To verify that it is working as desired, please reboot the machine and check if it is mounted. (We don’t want missing files)

sudo reboot
df -h

Now install Unison, the tool I will use to synchronize the files from the network disk to the ‘/var/www/owncloud’ directory. It has its quirks, but in my case it works fine.

sudo apt-get install unison

Start by synchronizing our old files to the new RAM disk.

sudo unison /var/www/owncloud /media/network -ignore 'Path {data}' -force /media/network -confirmbigdel=false

Once this is done we need to create our synchronization scripts.

cd /home/owncloud

The first script loads the files from the network source. It stops Apache2 whilst synchronizing the data to the ‘/var/www/owncloud’ folder. There is also a force option to explicitly force the download from the ‘/media/network’ location; if we don’t force this, unison will treat the newly created RAM disk as the newer version and start deleting the files we need to run Owncloud!
When everything is done, a tmp file is written to flag that the unison cron job may start synchronizing files.

nano load-owncloud.sh
#!/bin/sh
service apache2 stop
unison /var/www/owncloud /media/network -ignore 'Path {data}' -force /media/network -batch -confirmbigdel=false
ln -s /media/network/data /var/www/owncloud/data
chown -R www-data:www-data /var/www/owncloud
service apache2 start
touch /tmp/owncloud-initialized

The second job is our sync job. This is a two way sync, any changes made on the owncloud server and on the network source will be propagated to all.

nano sync-owncloud.sh
#!/bin/sh
FILE=/tmp/owncloud-initialized
if [ -f $FILE ];
then
	unison /media/network /var/www/owncloud -batch -ignore 'Path {data}'
fi

Make them executable.

chmod +x ./sync-owncloud.sh
chmod +x ./load-owncloud.sh

The last part of our scripting needs is the startup job. It’s pretty simple, this will just run our ‘load-owncloud.sh’ script.

sudo nano /etc/init/unison.conf
description "Owncloud File Sync"
author "Robbert Lambrechts"
env HOME=/home/owncloud
start on runlevel [2345]
 
pre-start script
    echo "Starts Owncloud sync..."
end script
 
post-stop script
    echo "Ends Owncloud sync..."
end script
 
exec /home/owncloud/load-owncloud.sh

Allow executing the startup script.

sudo chmod +x /etc/init/unison.conf

The last part is scheduling the sync job so that changes keep getting propagated; a cron entry runs it every five minutes.

sudo crontab -e
*/5 * * * * /home/owncloud/sync-owncloud.sh

That’s it. One fast Owncloud to serve files.

Troubleshooting AsRock C2750D4I

Introduction

In the past I wrote a post about the instability of the AsRock C2750D4I. Guess what: the problems aren’t gone with this motherboard.
I suspect one of the motherboard’s SATA controllers. When the server experiences heavy load, at least two disks disconnect, bringing down the software RAID.

Troubleshooting

Let’s start by finding out the disk layout of my RAID5.

cat /proc/mdstat
md5 : inactive sde1[3](S) sdh1[1](S) sdf1[0](S) sdd1[4](S)
      11720536064 blocks super 1.2

This shows that my RAID is spread across sde1, sdh1, sdf1 and sdd1. The last error logs from dmesg showed me that sdh1 and sdf1 went down before the RAID crashed.

So let’s try to find some more information about these two crashed drives.

sudo lshw -c disk

The result will show you a little bit more information about each drive.

  *-disk
       description: ATA Disk
       product: SAMSUNG HD103SI
       physical id: 0.0.0
       bus info: scsi@2:0.0.0
       logical name: /dev/sda
       version: 1AG0
       serial: S20XJDWS700323
       size: 931GiB (1TB)
       capabilities: partitioned partitioned:dos
       configuration: ansiversion=5 sectorsize=512 signature=0007f8a5
  *-disk
       description: ATA Disk
       product: SAMSUNG HD103SI
       physical id: 0.0.0
       bus info: scsi@3:0.0.0
       logical name: /dev/sdb
       version: 1AG0
       serial: S20XJDWZ118279
       size: 931GiB (1TB)
       capabilities: partitioned partitioned:dos
       configuration: ansiversion=5 sectorsize=512 signature=00071895
  *-disk
       description: ATA Disk
       product: KINGSTON SVP200S
       physical id: 0.0.0
       bus info: scsi@5:0.0.0
       logical name: /dev/sdc
       version: 502A
       serial: 50026B7331033DD9
       size: 55GiB (60GB)
       capabilities: partitioned partitioned:dos
       configuration: ansiversion=5 sectorsize=512 signature=91a29a16
  *-disk
       description: ATA Disk
       product: ST3000DM001-1CH1
       vendor: Seagate
       physical id: 0.0.0
       bus info: scsi@6:0.0.0
       logical name: /dev/sdd
       version: CC24
       serial: Z1F27VHM
       size: 2794GiB (3TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=0556e5e5-1e62-42f4-a89c-29813a6f4a18 sectorsize=4096
  *-disk
       description: ATA Disk
       product: Hitachi HDS5C303
       vendor: Hitachi
       physical id: 0.0.0
       bus info: scsi@7:0.0.0
       logical name: /dev/sde
       version: MZ6O
       serial: MCE9215Q0B5MLW
       size: 2794GiB (3TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=ec9054e2-94c3-4d74-8fea-2d34ce0b92ac sectorsize=4096
  *-disk
       description: ATA Disk
       product: Hitachi HDS5C303
       vendor: Hitachi
       physical id: 0.0.0
       bus info: scsi@8:0.0.0
       logical name: /dev/sdf
       version: MZ6O
       serial: MCE9215Q0BHTDV
       size: 2794GiB (3TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=2f6f5a9b-441e-467d-861c-852e2bdefb5e sectorsize=4096
  *-disk
       description: ATA Disk
       product: WDC WD40EFRX-68W
       vendor: Western Digital
       physical id: 0.0.0
       bus info: scsi@9:0.0.0
       logical name: /dev/sdg
       version: 80.0
       serial: WD-WCC4E1653628
       size: 3726GiB (4TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=4ac4a5a9-ccd1-42c5-907a-9272c076a15c sectorsize=4096
  *-disk
       description: ATA Disk
       product: TOSHIBA DT01ACA3
       vendor: Toshiba
       physical id: 0.0.0
       bus info: scsi@10:0.0.0
       logical name: /dev/sdh
       version: MX6O
       serial: 63NZKNRKS
       size: 2794GiB (3TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=24069398-46d0-4b01-9e8e-2530cb9f1cf8 sectorsize=4096

The logical name field shows that my Toshiba drive (sdh) and my Hitachi drive (sdf) were impacted by the last drive/SATA error on the board. This information can be used to physically trace the SATA cables to the correct drives.
So now that we have the disk names, we need to find out which controller is throwing these errors.

First let’s identify the bus addresses of all SATA controllers available on the motherboard.

sudo lshw -c storage

The bus info field lists the PCI address of each connected controller.

  *-storage
       description: SATA controller
       product: 88SE9172 SATA 6Gb/s Controller
       vendor: Marvell Technology Group Ltd.
       physical id: 0
       bus info: pci@0000:04:00.0
       version: 11
       width: 32 bits
       clock: 33MHz
       capabilities: storage pm msi pciexpress ahci_1.0 bus_master cap_list rom
       configuration: driver=ahci latency=0
       resources: irq:55 ioport:c040(size=8) ioport:c030(size=4) ioport:c020(size=8) ioport:c010(size=4) ioport:c000(size=16) memory:df410000-df4101ff memory:df400000-df40ffff
  *-storage
       description: SATA controller
       product: 88SE9230 PCIe SATA 6Gb/s Controller
       vendor: Marvell Technology Group Ltd.
       physical id: 0
       bus info: pci@0000:09:00.0
       version: 11
       width: 32 bits
       clock: 33MHz
       capabilities: storage pm msi pciexpress ahci_1.0 bus_master cap_list rom
       configuration: driver=ahci latency=0
       resources: irq:56 ioport:d050(size=8) ioport:d040(size=4) ioport:d030(size=8) ioport:d020(size=4) ioport:d000(size=32) memory:df610000-df6107ff memory:df600000-df60ffff
  *-storage:0
       description: SATA controller
       product: Atom processor C2000 AHCI SATA2 Controller
       vendor: Intel Corporation
       physical id: 17
       bus info: pci@0000:00:17.0
       version: 02
       width: 32 bits
       clock: 66MHz
       capabilities: storage msi pm ahci_1.0 bus_master cap_list
       configuration: driver=ahci latency=0
       resources: irq:48 ioport:e0d0(size=8) ioport:e0c0(size=4) ioport:e0b0(size=8) ioport:e0a0(size=4) ioport:e040(size=32) memory:df762000-df7627ff
  *-storage:1
       description: SATA controller
       product: Atom processor C2000 AHCI SATA3 Controller
       vendor: Intel Corporation
       physical id: 18
       bus info: pci@0000:00:18.0
       version: 02
       width: 32 bits
       clock: 66MHz
       capabilities: storage msi pm ahci_1.0 bus_master cap_list
       configuration: driver=ahci latency=0
       resources: irq:54 ioport:e090(size=8) ioport:e080(size=4) ioport:e070(size=8) ioport:e060(size=4) ioport:e020(size=32) memory:df761000-df7617ff

Now, for each drive, we can look up the corresponding SATA controller address; it matches one of the PCI bus addresses found above.

sudo udevadm info -q all -n /dev/sde | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:03.0/0000:02:00.0/0000:03:01.0/0000:04:00.0/ata8/host7/target7:0:0/7:0:0:0/block/sde
 
sudo udevadm info -q all -n /dev/sdd | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:03.0/0000:02:00.0/0000:03:01.0/0000:04:00.0/ata7/host6/target6:0:0/6:0:0:0/block/sdd
 
sudo udevadm info -q all -n /dev/sdf | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:09:00.0/ata9/host8/target8:0:0/8:0:0:0/block/sdf
 
sudo udevadm info -q all -n /dev/sdh | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:09:00.0/ata11/host10/target10:0:0/10:0:0:0/block/sdh

The last PCI address before the ataN element in the path is the controller the drive is connected to. So sde and sdd are connected to the controller at 0000:04:00.0, which is the Marvell 88SE9172 SATA 6Gb/s Controller.
The drives sdf and sdh are connected to the controller at 0000:09:00.0, which is the Marvell 88SE9230 PCIe SATA 6Gb/s Controller.

And that is exactly the controller that has been throwing the errors.
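If you have a lot of drives, a small loop can do this drive-to-controller mapping in one pass. It is only a sketch of the same udevadm lookup shown above, wrapped in bash; the sed expression assumes the usual /devices/pci…/ataN path layout.

#!/bin/bash
# Map every /dev/sd? disk to the PCI address of the SATA controller it hangs off.
for disk in /dev/sd?; do
    devpath=$(udevadm info -q path -n "$disk")
    # The PCI address directly in front of the ataN element is the controller.
    controller=$(echo "$devpath" | sed -n 's|.*/\([0-9a-f]\{4\}:[0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-9]\)/ata[0-9].*|\1|p')
    echo "$disk -> $controller"
done

On my box this would print, for example, ‘/dev/sdf -> 0000:09:00.0’ for the drives hanging off the problematic Marvell controller.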

With this information we can unplug the disks from the controller throwing the errors. The physical location of the Marvell 88SE9230 ports is shown in the manual at http://www.asrockrack.com/general/productdetail.asp?Model=C2750D4I#Manual, so you can match the ports on the board against the disk names found previously.

So I rerouted all disks (I prefer 3Gbps SATA over a dysfunctional 6Gbps any day), and the NAS has been stable ever since.

Google App Engine – goapp: ‘C:\Program’ is not recognized as an internal or external command

Today (24/07/2014) I installed the App Engine components from the Google Cloud SDK installer. However, when I try to run my application with the command ‘goapp serve myapp/’, I receive the following error:

‘C:\Program’ is not recognized as an internal or external command. The problem here is that the ‘goapp.bat’ file tries to call an executable in the ‘C:\Program Files\Google\Cloud SDK\…’ folder. Because Windows batch scripts are (still) terrible at handling spaces in folder names, the path gets cut off at the space and the command fails.

The solution is to go to the ‘C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine’ folder and edit the ‘goapp.bat’ file.
At the bottom of the file you will see:

:: Note that %* can not be used with shift.
%GOROOT%\bin\%EXENAME% %1 %2 %3 %4 %5 %6 %7 %8 %9

Now add some quotes to this last line and your problem should be fixed.

:: Note that %* can not be used with shift.
"%GOROOT%\bin\%EXENAME%" %1 %2 %3 %4 %5 %6 %7 %8 %9

Once these changes are saved, go to the ‘C:\Program Files\Google\Cloud SDK\google-cloud-sdk\bin\’ folder. There is a ‘goapp.cmd’ file there that is added to the Windows path. Rename it to ‘goapp.bck’ and copy your edited ‘goapp.bat’ file into this folder.
In this copied file, change the last line again to:

:: Note that %* can not be used with shift.
"%GOROOT%\..\..\platform\google_appengine\goapp" %1 %2 %3 %4 %5 %6 %7 %8 %9

That’s it. Ugly, but it works…
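To check the workaround, re-run the original command from your project folder; it should now start the local development server instead of complaining about ‘C:\Program’.

goapp serve myapp/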

Original Github issue: windows 7 C:/Program Files/… #688

NAS – Part 6: Health checks mdadm

Introduction

This post builds on NAS – Part 2: Software and services. It adds a small detection script that checks whether your RAID is degraded; in the past I’ve had my fair share of failed RAID configurations.

I know that the mdadm package can send alerts by itself, but this small script can easily be extended to detect specific changes in the RAID/system configuration without relying on the built-in reporting.
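For comparison, mdadm’s built-in reporting only needs a destination address in its configuration file; a minimal sketch, assuming the Debian/Ubuntu location /etc/mdadm/mdadm.conf and that the mdadm monitor daemon is running:

# /etc/mdadm/mdadm.conf – built-in alternative to the script below
MAILADDR <target email>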

Implementation

First, let’s start by installing mailutils. This package provides the ‘mail’ command that the script uses to send the alert.

sudo apt-get install mailutils

Next up is the ‘ssmtp’ package. This package relays the mail to an external SMTP server.

sudo apt-get install ssmtp

Create the ssmtp directory (if it doesn’t exist yet).

sudo mkdir /etc/ssmtp/

And create an ssmtp.conf file.

sudo nano /etc/ssmtp/ssmtp.conf

This ssmtp.conf requires a username (AuthUser), a password (AuthPass) and a mail hub (the SMTP server, for example: mailhub=smtp.gmail.com:587).

AuthUser=<your-email-adres>
AuthPass=<password>
FromLineOverride=YES
mailhub=<smtp-mailserver>
UseSTARTTLS=YES
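As an illustration, a filled-in version for a Gmail account could look like the following; the address and app password are placeholders, not real credentials.

AuthUser=nas.alerts@gmail.com
AuthPass=<app-specific-password>
FromLineOverride=YES
mailhub=smtp.gmail.com:587
UseSTARTTLS=YES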

To test your configuration you can try to send a test mail. Just change ‘email@mail.com’ to your own email address.

echo "This is a test" | mail -s "Test" email@mail.com

If everything works, you are ready to create the cron job script. (I will create this script in my home directory, but you can put it wherever you want.)

cd ~
nano health-mdstat.sh

In the output of ‘cat /proc/mdstat’, a failed or missing disk shows up as an underscore in the array status string, so I’ll be checking for that character.
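For reference, a degraded four-disk RAID5 looks roughly like this in /proc/mdstat (illustrative output, not taken from my machine); note the underscore in the status string.

md5 : active raid5 sde1[3] sdf1[0] sdd1[4]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]

The script below simply greps for that underscore.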

#!/bin/bash
# Mail settings
SUBJECT="---RAID IN DEGRADED STATE---"
EMAIL="<target email>"
FROM="<from email>"
EMAILMESSAGE="/tmp/cron-email"
 
# Dump the current RAID status to a temporary file
cat /proc/mdstat > "$EMAILMESSAGE"
 
# An underscore in the status string means a disk has dropped out, so mail the report
if grep -q "_" "$EMAILMESSAGE"; then
   mail -aFrom:"$FROM" -s "$SUBJECT" "$EMAIL" < "$EMAILMESSAGE"
fi

Let’s assign execute rights to our script.

sudo chmod +x ./health-mdstat.sh

That’s it! Now hook the script up to a cron job; I scheduled mine to run daily (see the example entry below).
And happy scripting when extending this script.
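For example, an entry like the following in ‘crontab -e’ runs the check every morning at 07:00; the path is a placeholder for wherever you saved the script.

0 7 * * * /home/<user>/health-mdstat.sh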