Logical Volumes
Create an LVM logical volume on each server for the NFS mount
On both servers:
lvcreate -n nfspoint -V 550G pve/data
mkfs.ext4 /dev/pve/nfspoint
echo '/dev/pve/nfspoint /var/lib/nfspoint ext4 defaults 0 2' >> /etc/fstab
mkdir /var/lib/nfspoint
mount /dev/pve/nfspoint /var/lib/nfspoint/
lvs
vim /etc/exports
/var/lib/nfspoint 10.0.0.0/255.0.0.0(rw,no_root_squash,no_all_squash,sync)
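After adding the line, the NFS server has to re-read /etc/exports before clients can see the share. A quick sanity check of the line's shape, plus the reload commands (the regex is my own rough check, not an official exports validator):

```shell
# Rough shape check: path, one space, client(options) with no space
# before the parenthesis (a common exports mistake).
line='/var/lib/nfspoint 10.0.0.0/255.0.0.0(rw,no_root_squash,no_all_squash,sync)'
if echo "$line" | grep -Eq '^/[^ ]+ [^ ]+\([a-z_,]+\)$'; then
  echo "exports line looks well-formed"
else
  echo "exports line is malformed"
fi
# On the server, apply and verify with:
#   exportfs -ra && exportfs -v
```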
DRBD
DRBD is a little tricky. Because the LVM volume was thin provisioned, it looked like some metadata had already been written to the device /dev/pve/nfspoint, which meant I had to zero out nfspoint before creating the DRBD metadata.
DRBD Configuration:
global { usage-count no; }
common { syncer { rate 100M; } }
resource r0 {
  protocol C;
  device /dev/drbd0 minor 0;
  startup {
    wfc-timeout 120;
    degr-wfc-timeout 60;
    become-primary-on both;
  }
  net {
    cram-hmac-alg sha1;
    allow-two-primaries;
    shared-secret "secret";
  }
  on HLPMX1 {
    disk /dev/pve/nfspoint;
    address 10.0.0.2:7788;
    meta-disk internal;
  }
  on HLPMX2 {
    disk /dev/pve/nfspoint;
    address 10.0.0.3:7788;
    meta-disk internal;
  }
}
This wasn’t a problem here because there was no data yet, but it may become one when I expand the volume with LVM or need to make any other resource changes to the device.
dd if=/dev/zero of=/dev/pve/nfspoint bs=1M count=200
mkfs.ext4 -b 4096 /dev/drbd0
curl --output drbd9.15.tar.gz https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/drbd-utils/9.15.0-1/drbd-utils_9.15.0.orig.tar.gz
tar -xf drbd9.15.tar.gz
cd drbd-utils-9.15.0
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc
make all
make install
On node 2 (gcc wasn’t installed there):
apt install build-essential
apt install gcc
apt install flex
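Before re-running ./configure on the second node, it's worth confirming the toolchain is actually on PATH; a small sketch (the MISSING message is just illustrative):

```shell
# Check the build prerequisites before ./configure; build-essential
# pulls in gcc and make, flex has to be installed separately.
for tool in gcc make flex; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING - install it first"
  fi
done
```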
# Copy config to other server
sudo drbdadm create-md r0
sudo systemctl start drbd.service
sudo drbdadm -- --overwrite-data-of-peer primary all
mkfs.ext4 /dev/drbd0
mkdir /srv/nfspoint
sudo mount /dev/drbd0 /srv/nfspoint
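Once both nodes are up, `drbdadm status r0` should show role:Primary on both sides, since allow-two-primaries is set. A sketch of checking that from the status text — the here-string is illustrative sample output, not anything captured from these hosts:

```shell
# Count Primary roles in `drbdadm status r0` output; in dual-primary
# mode the count should be 2 once the initial sync finishes.
# On a live node use: status=$(drbdadm status r0)
status='r0 role:Primary
  disk:UpToDate
  HLPMX2 role:Primary
    peer-disk:UpToDate'
primaries=$(printf '%s\n' "$status" | grep -c 'role:Primary')
echo "primary count: $primaries"
```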
Split Brain
On Split Brain Victim
drbdadm disconnect r0
drbdadm secondary r0
drbdadm connect --discard-my-data r0
On Split Brain Survivor
drbdadm primary r0
drbdadm connect r0
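A split-brain victim usually reports connection:StandAlone in `drbdadm status`. A quick grep for that state, run here against illustrative sample output (on a live node, substitute the real command output):

```shell
# Flag a StandAlone peer connection, the usual split-brain symptom.
# On a live node use: status=$(drbdadm status r0)
status='r0 role:Primary
  disk:UpToDate
  HLPMX2 connection:StandAlone'
if printf '%s\n' "$status" | grep -q 'connection:StandAlone'; then
  echo "possible split brain: peer connection is StandAlone"
fi
```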
Inverting Resources
On Current Primary:
umount /srv/nfspoint
drbdadm secondary r0
On Secondary:
drbdadm primary r0
mount /dev/drbd0 /srv/nfspoint
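The two halves of the swap can be sketched as a dry run that just echoes each step in order (drop the echoes to run the commands for real; the paths and resource name are the ones used above):

```shell
# Dry-run sketch of the primary/secondary swap: demote on the current
# primary first, then promote on the former secondary.
demote() {
  echo "umount /srv/nfspoint"
  echo "drbdadm secondary r0"
}
promote() {
  echo "drbdadm primary r0"
  echo "mount /dev/drbd0 /srv/nfspoint"
}
demote   # run these on the current primary
promote  # then these on the former secondary
```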
DRBD Broken?
Reboot both servers
systemctl start drbd.service
drbdadm status
umount /dev/pve/nfspoint
mkfs.ext4 -b 4096 /dev/pve/nfspoint
dd if=/dev/zero of=/dev/drbd0 status=progress
mkfs -t ext4 /dev/drbd0
Troubleshooting DRBD Issues
Odd Mountpoint with Loop Device
What should appear in the output of lsblk:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 1007K 0 part
├─sda2 8:2 0 512M 0 part
└─sda3 8:3 0 1.7T 0 part
├─pve-swap 253:0 0 8G 0 lvm [SWAP]
├─pve-root 253:1 0 96G 0 lvm /
├─pve-data_tmeta 253:2 0 15.6G 0 lvm
│ └─pve-data-tpool 253:4 0 1.5T 0 lvm
│ ├─pve-data 253:5 0 1.5T 0 lvm
│ ├─pve-vm--105--disk--0 253:6 0 20G 0 lvm
│ └─pve-nfspoint 253:7 0 550G 0 lvm
│ └─drbd0 147:0 0 550G 0 disk /srv/nfspoint
└─pve-data_tdata 253:3 0 1.5T 0 lvm
└─pve-data-tpool 253:4 0 1.5T 0 lvm
├─pve-data 253:5 0 1.5T 0 lvm
├─pve-vm--105--disk--0 253:6 0 20G 0 lvm
└─pve-nfspoint 253:7 0 550G 0 lvm
└─drbd0 147:0 0 550G 0 disk /srv/nfspoint
What was incorrectly appearing in the output of lsblk:
loop0 7:0 0 200M 0 loop /srv/nfspoint
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 1007K 0 part
├─sda2 8:2 0 512M 0 part
└─sda3 8:3 0 1.7T 0 part
├─pve-swap 253:0 0 8G 0 lvm [SWAP]
├─pve-root 253:1 0 96G 0 lvm /
├─pve-data_tmeta 253:2 0 15.6G 0 lvm
│ └─pve-data-tpool 253:4 0 1.5T 0 lvm
│ ├─pve-data 253:5 0 1.5T 0 lvm
│ ├─pve-vm--100--disk--0 253:6 0 30G 0 lvm
│ ├─pve-vm--102--disk--0 253:7 0 100G 0 lvm
│ ├─pve-vm--103--disk--0 253:8 0 20G 0 lvm
│ ├─pve-vm--104--disk--0 253:9 0 20G 0 lvm
│ ├─pve-vm--101--disk--0 253:10 0 20G 0 lvm
│ └─pve-nfspoint 253:11 0 550G 0 lvm
│ └─drbd0 147:0 0 550G 0 disk
└─pve-data_tdata 253:3 0 1.5T 0 lvm
└─pve-data-tpool 253:4 0 1.5T 0 lvm
├─pve-data 253:5 0 1.5T 0 lvm
├─pve-vm--100--disk--0 253:6 0 30G 0 lvm
├─pve-vm--102--disk--0 253:7 0 100G 0 lvm
├─pve-vm--103--disk--0 253:8 0 20G 0 lvm
├─pve-vm--104--disk--0 253:9 0 20G 0 lvm
├─pve-vm--101--disk--0 253:10 0 20G 0 lvm
└─pve-nfspoint 253:11 0 550G 0 lvm
└─drbd0 147:0 0 550G 0 disk
If the server is creating a loop device and mounting that as the mount point, try rebooting the server, restarting the drbd service, and clearing out the device on the server having the issue and resyncing it with the primary.
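A quick way to spot the bad state is to check whether the mountpoint is backed by a loop device rather than drbd0. This sketch greps sample lsblk output; on a live host, substitute the real `lsblk -rno NAME,TYPE,MOUNTPOINT` output:

```shell
# Flag the broken state: /srv/nfspoint mounted from a loop device
# instead of drbd0. The here-string is illustrative sample output;
# live: lsblk_out=$(lsblk -rno NAME,TYPE,MOUNTPOINT)
lsblk_out='loop0 loop /srv/nfspoint
drbd0 disk'
if printf '%s\n' "$lsblk_out" | grep -q '^loop.* /srv/nfspoint$'; then
  echo "WARNING: /srv/nfspoint is backed by a loop device, not drbd0"
fi
```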
I still don't know if it was split brain or what, but these were the commands I ran:
umount /nfs/sharepoint # This was the location of the mountpoint
drbdadm disconnect r0
drbdadm connect --discard-my-data r0
I honestly think rebooting the server in question is what fixed this issue. I suspect the /dev/drbd0 device wasn't working properly, or hadn't been created properly.
Links 2
- https://www.howtoforge.com/high_availability_nfs_drbd_heartbeat_p2
- https://pve.proxmox.com/wiki/Logical_Volume_Manager_(LVM)
- https://linux.die.net/man/8/lvremove
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/lv_remove
- https://serverfault.com/questions/266697/cant-remove-open-logical-volume
Links
- Ubuntu - Configure HA drbd
- SO - DRDB Not syncing between my nodes
- SO - Split Brain
- Adding DRBD shared volumes to Proxmox to support Live Migration
- drbd-utils 9.15.0-1 Source
- BuildingTutorial
- drbd-utils
- drbdadm
- Suse Configuring HA