Distributed Replicated Block Device (DRBD) mirrors block devices between multiple hosts over a dedicated network link to form high-availability clusters. The replication is transparent to other applications on the host systems. Any block device (hard disks, partitions, RAID devices, logical volumes, etc.) can be mirrored. DRBD can be thought of as network-based RAID-1.
DRBD can also be used as an over-the-network backup solution or as part of a
disaster recovery plan. In our setup we'll be using the following layers:
[cc lang='bash']
# layers, from the bare hardware up to the filesystem:
| physical disk |
| lvm           |
| drbd          |
| ext4          |
[/cc]
The second layer is LVM because on the backup server you cannot access
the data directly: DRBD will not let you mount /dev/drbd0 while the
resource is in the secondary role, since the device is locked by DRBD
itself. LVM snapshots of the backing volume, however, let us work around
that lock, or at least give us read access to the existing data.
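To illustrate the point, here is roughly what happens if you try to mount the DRBD device directly on the secondary node (a sketch; the exact error message depends on your DRBD version):
[cc lang='bash']
# node02, while the r0 resource is in the secondary role
$ mount /dev/drbd0 /mnt
# mount refuses: DRBD keeps the device locked while it is secondary,
# so the only way to read the data here is via an LVM snapshot of /dev/vg0/lv0
[/cc]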
[cc lang='bash']
# node01 is the main server
# /dev/sdb is the disk containing important data, 16g size
$ apt-get install drbd8-utils lvm2
$ fdisk /dev/sdb # create 1 primary lvm partition (type 8e), 16g size
$ pvcreate /dev/sdb1
$ vgcreate vg0 /dev/sdb1
$ lvcreate -l 100%FREE -n lv0 vg0
$ lvdisplay | grep "Current LE" # this value will be used on node02 setup
  Current LE              4095
[/cc]
[cc lang='bash']
# node02 is the backup server
# /dev/sdb is the disk storing the backup data, 32g size
$ apt-get install drbd8-utils lvm2
$ fdisk /dev/sdb # create 1 primary lvm partition (type 8e), 32g size
$ pvcreate /dev/sdb1
$ vgcreate vg0 /dev/sdb1
$ lvcreate -l 4095 -n lv0 vg0 # this is the "Current LE" value from node01
[/cc]
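Before wiring DRBD on top, it's worth double-checking that the backing volume on node02 is at least as large as the one on node01. A quick sanity check, assuming both nodes use the vg0/lv0 names from above and the default 4MiB extent size:
[cc lang='bash']
# run on both nodes; node02 must report a value >= node01
$ blockdev --getsize64 /dev/vg0/lv0
# with 4095 extents of 4MiB each, both should print the same byte count
[/cc]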
Configure and start DRBD on both servers:
[cc lang='bash']
# node01 & node02
$ cat <<'EOF' > /etc/drbd.conf
global { usage-count no; }
common { syncer { rate 1000M; } }
resource r0 {
  protocol C; # Synchronous replication protocol
  startup {
    wfc-timeout 15;
    degr-wfc-timeout 60;
  }
  net {
    cram-hmac-alg sha1;
    shared-secret "secret";
  }
  on node01 {
    device /dev/drbd0;
    disk /dev/vg0/lv0;
    address 10.20.30.40:7788;
    meta-disk internal;
  }
  on node02 {
    device /dev/drbd0;
    disk /dev/vg0/lv0;
    address 50.60.70.80:7788;
    meta-disk internal;
  }
}
EOF
$ drbdadm create-md r0
$ service drbd start
[/cc]
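Before promoting either node, it's worth confirming that the two nodes actually see each other. A quick check (the output shown is the expected shape on a freshly connected, not-yet-promoted pair, not verbatim from this setup):
[cc lang='bash']
# node01 & node02
$ drbdadm cstate r0
Connected
$ drbdadm role r0
Secondary/Secondary
[/cc]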
Run this only on the main server to promote node01 to the primary role and
start syncing the data to node02:
[cc lang='bash']
# node01
$ drbdadm -- --overwrite-data-of-peer primary all
$ watch -n1 "cat /proc/drbd"
# wait until the data is fully synced
$ mkfs.ext4 /dev/drbd0
$ mkdir /important
$ echo '/dev/drbd0 /important ext4 errors=remount-ro 0 1' >> /etc/fstab
$ mount -a
# let's add some content
$ cd /important/
$ dd if=/dev/zero of=zero.100 bs=1M count=100
$ md5sum zero.100
2f282b84e7e608d5852449ed940bfc51  zero.100
[/cc]
Now let's test whether the data is actually syncing between the main and the
backup server:
[cc lang='bash']
# node01
$ umount /important
$ drbdadm secondary r0 # demote the main server to the secondary role
# node02
$ drbdadm primary r0 # promote the backup server to the primary role
$ mkdir /important
$ mount /dev/drbd0 /important
$ cd /important
$ ls
lost+found  zero.100
$ md5sum zero.100
2f282b84e7e608d5852449ed940bfc51  zero.100 # same md5sum as on the main server
$ service drbd status
drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
m:res  cs         ro                 ds                 p  mounted     fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /important  ext4
[/cc]
Let's switch back to the original scenario:
[cc lang='bash']
# node02
$ umount /important
$ drbdadm secondary r0
# node01
$ drbdadm primary r0 # promote the main server back to the primary role
$ mount -a
[/cc]
Now that everything is working as expected, let's take advantage of our
remote real-time backup system using LVM Snapshots:
[cc lang='bash']
# node02
$ lvcreate -L1G -s -n lv0-bkp01 /dev/vg0/lv0
$ mkdir /important-bkp01
$ mount -t ext4 /dev/vg0/lv0-bkp01 /important-bkp01/
$ cd /important-bkp01/
$ ls
lost+found  zero.100
$ md5sum zero.100
2f282b84e7e608d5852449ed940bfc51  zero.100 # same md5sum as on the main server
[/cc]
You can cron the creation and removal of snapshots according to your
needs (and your available disk space) to provide either a back-in-time
file recovery solution or full read access to a mirror of the data on
the primary server.
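As a rough sketch of what such a cron job could look like on the backup server (the snapshot name, size and retention below are illustrative assumptions, not part of the setup above):
[cc lang='bash']
#!/bin/bash
# /etc/cron.daily/drbd-snapshot (sketch) -- keep one daily read-only
# snapshot of the DRBD backing volume mounted on the backup server
SNAP=lv0-bkp01
MNT=/important-bkp01

# drop yesterday's snapshot if it exists
if lvs /dev/vg0/$SNAP >/dev/null 2>&1; then
    umount $MNT 2>/dev/null
    lvremove -f /dev/vg0/$SNAP
fi

# create and mount a fresh snapshot
lvcreate -L1G -s -n $SNAP /dev/vg0/lv0
mkdir -p $MNT
mount -t ext4 -o ro /dev/vg0/$SNAP $MNT
[/cc]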