1 HOWTO: Ubuntu High Availability - Shared SCSI Disk only Environment
2 (Azure and other Environments)
3
4 --------------------
5
6 This is a mini tutorial showing how to deploy a High Availability cluster in
7 an environment that supports shared SCSI disks. Instead of relying on the APIs
8 of public or private clouds, for example, to fence the clustered virtual
9 machines, this example relies only on the SCSI shared-disk feature, which makes
10 it suitable for both virtual and physical machines that have shared
11 SCSI disks.
12
13 NOTES:
14
15 1. This document was written with the Microsoft Azure Cloud environment in
16 mind, and that's why the beginning of this document shows how to get a SHARED
17 SCSI DISK in an Azure environment. The clustering examples given below will
18 work in any environment, physical or virtual.
19
20 2. If you want to skip the cloud provider configuration, just search for
21 the BEGIN keyword and you will be taken to the cluster and OS specifics.
22
23 --------------------
24
25 Like all High Availability clusters, this one needs some way to guarantee
26 consistency among the different cluster resources. Clusters usually do that by
27 having fencing mechanisms: a way to guarantee that the other nodes are *not*
28 accessing the resources before the services running on them, and managed by the
29 cluster, are taken over.
30
31 If following this mini tutorial in a Microsoft Azure environment, keep in
32 mind that this example needs the Microsoft Azure Shared Disk feature:
33 - docs.microsoft.com/en-us/azure/virtual-machines/windows/disks-shared-enable
34
35 And the Linux Kernel Module called "softdog":
36 - /lib/modules/xxxxxx-azure/kernel/drivers/watchdog/softdog.ko
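
You can quickly check that the module is available for your kernel and that it
creates a watchdog device when loaded (a quick sanity check; the exact module
path varies with the kernel version):

$ modinfo softdog | head -2
$ sudo modprobe softdog
$ ls -l /dev/watchdog*

The watchdog daemon configured later in this document will use this device.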
37
38 --------------------
39
40 Azure clubionicshared01 shared disk JSON file "shared-disk.json":
41
42 {
43 "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
44 "contentVersion": "1.0.0.0",
45 "parameters": {
46 "diskName": {
47 "type": "string",
48 "defaultValue": "clubionicshared01"
49 },
50 "diskSizeGb": {
51 "type": "int",
52 "defaultValue": 1024
53 },
54 "maxShares": {
55 "type": "int",
56 "defaultValue": 4
57 }
58 },
59 "resources": [
60 {
61 "apiVersion": "2019-07-01",
62 "type": "Microsoft.Compute/disks",
63 "name": "[parameters('diskName')]",
64 "location": "westcentralus",
65 "sku": {
66 "name": "Premium_LRS"
67 },
68 "properties": {
69 "creationData": {
70 "createOption": "Empty"
71 },
72 "diskSizeGB": "[parameters('diskSizeGb')]",
73 "maxShares": "[parameters('maxShares')]"
74 },
75 "tags": {}
76 }
77 ]
78 }
79 --------------------
80
81 Command to create the resource in a resource-group called "clubionic":
82
83 $ az group deployment create --resource-group clubionic \
84 --template-file ./shared-disk.json
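
Once the deployment finishes, you can verify that the disk was created and that
it allows multiple attachments (a quick sanity check; it should print 4, the
maxShares default from the template above):

$ az disk show --resource-group clubionic --name clubionicshared01 \
      --query maxShares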
85
86 --------------------
87 Basics:
88
89 - You will create a resource-group called "clubionic" with the following
90 resources at first:
91
92 clubionicplacement Proximity placement group
93
94 clubionicnet Virtual Network
95 subnets:
96 private 10.250.3.0/24
97 public 10.250.98.0/24
98
99 clubionic01 Virtual machine
100 clubionic01-ip Public IP address
101 clubionic01private Network interface
102 clubionic01public Network interface (clubionic01-ip associated)
103 clubionic01_OsDisk... OS Disk (automatic creation)
104
105 clubionic02 Virtual machine
106 clubionic02-ip Public IP address
107 clubionic02private Network interface
108 clubionic02public Network interface (clubionic02-ip associated)
109 clubionic02_OsDisk... OS Disk (automatic creation)
110
111 clubionic03 Virtual machine
112 clubionic03-ip Public IP address
113 clubionic03private Network interface
114 clubionic03public Network interface (clubionic03-ip associated)
115 clubionic03_OsDisk... OS Disk (automatic creation)
116
117 clubionicshared01 Shared Disk (created using cmdline and json file)
118
119 rafaeldtinocodiag Storage account (needed for console access)
120
121 --------------------
122
123 The initial idea is to create the network resources:
124
125 - clubionic{01,02,03}{public,private} network interfaces
126 - clubionic{01,02,03}-ip public IP addresses
127 - associate the clubionic{01,02,03}-ip addresses to the clubionic{01,02,03}public interfaces
128
129 And then create the clubionicshared01 disk (using the json file shown above).
130
131 After those are created, the next step is to create the 3 needed virtual
132 machines with the proper resources, as shown above, so we can move on with the
133 cluster configuration.
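
If you prefer the command line over the portal for the next steps, here is a
rough sketch of what creating one node's network resources and VM could look
like with the az CLI (an illustration only: flag names and image aliases may
vary between CLI versions, and --custom-data points to a file holding the
cloud-init content shown below):

$ az network public-ip create -g clubionic -n clubionic01-ip
$ az network nic create -g clubionic -n clubionic01public \
      --vnet-name clubionicnet --subnet public \
      --public-ip-address clubionic01-ip
$ az network nic create -g clubionic -n clubionic01private \
      --vnet-name clubionicnet --subnet private
$ az vm create -g clubionic -n clubionic01 --image UbuntuLTS \
      --size Standard_D2s_v3 --ppg clubionicplacement \
      --nics clubionic01public clubionic01private \
      --custom-data cloud-init.txt

Repeat the same idea for clubionic02 and clubionic03.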
134
135 --------------------
136
137 I have created a small cloud-init file that can be used in the "Advanced" tab
138 of the VM creation screens (you can copy and paste it there):
139
140 #cloud-config
141 package_upgrade: true
142 packages:
143 - man
144 - manpages
145 - hello
146 - locales
147 - less
148 - vim
149 - jq
150 - uuid
151 - bash-completion
152 - sudo
153 - rsync
154 - bridge-utils
155 - net-tools
156 - vlan
157 - ncurses-term
158 - iputils-arping
159 - iputils-ping
160 - iputils-tracepath
161 - traceroute
162 - mtr-tiny
163 - tcpdump
164 - dnsutils
165 - ssh-import-id
166 - openssh-server
167 - openssh-client
168 - software-properties-common
169 - build-essential
170 - devscripts
171 - ubuntu-dev-tools
172 - linux-headers-generic
173 - gdb
174 - strace
175 - ltrace
176 - lsof
177 - sg3-utils
178 write_files:
179 - path: /etc/ssh/sshd_config
180 content: |
181 Port 22
182 AddressFamily any
183 SyslogFacility AUTH
184 LogLevel INFO
185 PermitRootLogin yes
186 PubkeyAuthentication yes
187 PasswordAuthentication yes
188 ChallengeResponseAuthentication no
189 GSSAPIAuthentication no
190 HostbasedAuthentication no
191 PermitEmptyPasswords no
192 UsePAM yes
193 IgnoreUserKnownHosts yes
194 IgnoreRhosts yes
195 X11Forwarding yes
196 X11DisplayOffset 10
197 X11UseLocalhost yes
198 PermitTTY yes
199 PrintMotd no
200 TCPKeepAlive yes
201 ClientAliveInterval 5
202 PermitTunnel yes
203 Banner none
204 AcceptEnv LANG LC_* EDITOR PAGER SYSTEMD_EDITOR
205 Subsystem sftp /usr/lib/openssh/sftp-server
206 - path: /etc/ssh/ssh_config
207 content: |
208 Host *
209 ForwardAgent no
210 ForwardX11 no
211 PasswordAuthentication yes
212 CheckHostIP no
213 AddressFamily any
214 SendEnv LANG LC_* EDITOR PAGER
215 StrictHostKeyChecking no
216 HashKnownHosts yes
217 - path: /etc/sudoers
218 content: |
219 Defaults env_keep += "LANG LANGUAGE LINGUAS LC_* _XKB_CHARSET"
220 Defaults env_keep += "HOME EDITOR SYSTEMD_EDITOR PAGER"
221 Defaults env_keep += "XMODIFIERS GTK_IM_MODULE QT_IM_MODULE QT_IM_SWITCHER"
222 Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
223 Defaults logfile=/var/log/sudo.log,loglinelen=0
224 Defaults !syslog, !pam_session
225 root ALL=(ALL) NOPASSWD: ALL
226 %wheel ALL=(ALL) NOPASSWD: ALL
227 %sudo ALL=(ALL) NOPASSWD: ALL
228 rafaeldtinoco ALL=(ALL) NOPASSWD: ALL
229 runcmd:
230 - systemctl stop snapd.service
231 - systemctl stop unattended-upgrades
232 - systemctl stop systemd-remount-fs
233 - systemctl reset-failed
234 - passwd -d root
235 - passwd -d rafaeldtinoco
236 - echo "debconf debconf/priority select low" | sudo debconf-set-selections
237 - DEBIAN_FRONTEND=noninteractive dpkg-reconfigure debconf
238 - DEBIAN_FRONTEND=noninteractive apt-get update -y
239 - DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade -y
240 - DEBIAN_FRONTEND=noninteractive apt-get autoremove -y
241 - DEBIAN_FRONTEND=noninteractive apt-get autoclean -y
242 - systemctl disable systemd-remount-fs
243 - systemctl disable unattended-upgrades
244 - systemctl disable apt-daily-upgrade.timer
245 - systemctl disable apt-daily.timer
246 - systemctl disable accounts-daemon.service
247 - systemctl disable motd-news.timer
248 - systemctl disable irqbalance.service
249 - systemctl disable rsync.service
250 - systemctl disable ebtables.service
251 - systemctl disable pollinate.service
252 - systemctl disable ufw.service
253 - systemctl disable apparmor.service
254 - systemctl disable apport-autoreport.path
255 - systemctl disable apport-forward.socket
256 - systemctl disable iscsi.service
257 - systemctl disable open-iscsi.service
258 - systemctl disable iscsid.socket
259 - systemctl disable multipathd.socket
260 - systemctl disable multipath-tools.service
261 - systemctl disable multipathd.service
262 - systemctl disable lvm2-monitor.service
263 - systemctl disable lvm2-lvmpolld.socket
264 - systemctl disable lvm2-lvmetad.socket
265 apt:
266 preserve_sources_list: false
267 primary:
268 - arches: [default]
269 uri: http://us.archive.ubuntu.com/ubuntu
270 sources_list: |
271 deb $MIRROR $RELEASE main restricted universe multiverse
272 deb $MIRROR $RELEASE-updates main restricted universe multiverse
273 deb $MIRROR $RELEASE-proposed main restricted universe multiverse
274 deb-src $MIRROR $RELEASE main restricted universe multiverse
275 deb-src $MIRROR $RELEASE-updates main restricted universe multiverse
276 deb-src $MIRROR $RELEASE-proposed main restricted universe multiverse
277 conf: |
278 Dpkg::Options {
279 "--force-confdef";
280 "--force-confold";
281 };
282 sources:
283 debug.list:
284 source: |
285 # deb http://ddebs.ubuntu.com $RELEASE main restricted universe multiverse
286 # deb http://ddebs.ubuntu.com $RELEASE-updates main restricted universe multiverse
287 # deb http://ddebs.ubuntu.com $RELEASE-proposed main restricted universe multiverse
288 keyid: C8CAB6595FDFF622
289
290 --------------------
291
292 After provisioning machines "clubionic01, clubionic02, clubionic03" (Standard
293 D2s v3 (2 vcpus, 8 GiB memory)) with Ubuntu Bionic (18.04), using the same
294 resource-group (clubionic), located in "West Central US" AND having the same
295 proximity placement group (clubionicplacement), you will be able to access all
296 the VMs through their public IPs... and to make sure the shared disk works as a
297 fencing mechanism by testing SCSI persistent reservations with the "sg3-utils"
298 tools.
299
300 Run these commands on *at least* 1 node after the shared disk is attached to it:
301
302 # clubionic01
303
304 # read current reservations:
305
306 rafaeldtinoco@clubionic01:~$ sudo sg_persist -r /dev/sdc
307 Msft Virtual Disk 1.0
308 Peripheral device type: disk
309 PR generation=0x0, there is NO reservation held
310
311 # register new reservation key 0x123abc:
312
313 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --register \
314 --param-sark=123abc /dev/sdc
315 Msft Virtual Disk 1.0
316 Peripheral device type: disk
317
318 # To reserve the DEVICE (write exclusive):
319
320 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --reserve \
321 --param-rk=123abc --prout-type=5 /dev/sdc
322 Msft Virtual Disk 1.0
323 Peripheral device type: disk
324
325 # Check reservation created:
326
327 rafaeldtinoco@clubionic01:~$ sudo sg_persist -r /dev/sdc
328 Msft Virtual Disk 1.0
329 Peripheral device type: disk
330 PR generation=0x3, Reservation follows:
331 Key=0x123abc
332 scope: LU_SCOPE, type: Write Exclusive, registrants only
333
334 # To release the reservation:
335
336 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --release \
337 --param-rk=123abc --prout-type=5 /dev/sdc
338 Msft Virtual Disk 1.0
339 Peripheral device type: disk
340
341 # To unregister a reservation key:
342
343 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --register \
344 --param-rk=123abc /dev/sdc
345 Msft Virtual Disk 1.0
346 Peripheral device type: disk
347
348 # Make sure reservation is gone:
349
350 rafaeldtinoco@clubionic01:~$ sudo sg_persist -r /dev/sdc
351 Msft Virtual Disk 1.0
352 Peripheral device type: disk
353 PR generation=0x4, there is NO reservation held
354
355 BEGIN --------------------
356
357 Now it is time to configure the cluster network. In the beginning of this recipe
358 you saw there were 2 subnets created in the virtual network assigned to this
359 environment:
360
361 clubionicnet Virtual network
362 subnets:
363 private 10.250.3.0/24
364 public 10.250.98.0/24
365
366 Since there might be a limit of 2 extra virtual network adapters attached to
367 your VMs, we are using the *minimum* required number of networks for the HA
368 cluster to operate in good conditions.
369
370 public network: This is the network where the HA cluster virtual IPs will be
371 placed. This means that every cluster node will have 1 IP from this subnet
372 assigned to itself and possibly a floating IP, depending on where the service
373 is running (i.e. where the resource is active).
374
375 private network: This is the "internal-to-cluster" network, where all the
376 cluster nodes continuously exchange messages regarding the cluster state. This
377 network is important as corosync relies on it to know whether the cluster nodes
378 are online or not. It is also possible to add a 2nd virtual adapter to each of
379 the nodes, creating a 2nd private network (a 2nd ring in the messaging layer).
380 This helps avoid false positives in cluster failure detection caused by network
381 jitter/delays when there is only a single NIC adapter for the inter-node
382 messaging.
383
384 Instructions:
385
386 - Provision the 3 VMs with 2 network interfaces each (public & private)
387 - Make sure that, when started, all 3 of them have an external IP (to access)
388 - A 4th machine is possible (just to access the env, depending on topology)
389 - Make sure both, public and private networks are configured as:
390
391 clubionic01:
392 - public = 10.250.98.10/24
393 - private = 10.250.3.10/24
394
395 clubionic02:
396 - public = 10.250.98.11/24
397 - private = 10.250.3.11/24
398
399 clubionic03:
400 - public = 10.250.98.12/24
401 - private = 10.250.3.12/24
402
403 Also make sure all interfaces are configured as "static". Then, after powering
404 up the virtual machines, disable cloud-init's network configuration AND
405 configure the interfaces as "static" interfaces, as shown below.
406
407 --------------------
408
409 Ubuntu Bionic Cloud Images, deployed by Microsoft Azure to our VMs, come by
410 default with the "netplan.io" network tool installed, using systemd-
411 networkd as its backend network provider. This means that all the network
412 interfaces are configured and managed by systemd.
413
414 Unfortunately, because of the following bug:
415
416 https://bugs.launchpad.net/netplan/+bug/1815101
417
418 (currently being worked on), any HA environment that wants to have "virtual
419 aliases" on any network interface should rely on the previous "ifupdown" network
420 management method. This is because systemd-networkd only recently "learned" how
421 to deal with restarting interfaces that are controlled by HA software and,
422 before that, it used to remove the aliases without cluster synchronization
423 (fixed in Eoan by using the KeepConfiguration= stanza in the
424 systemd-networkd .network file).
425
426 With that, here are the instructions on how to remove netplan.io AND install
427 ifupdown + resolvconf packages:
428
429 $ sudo apt-get remove --purge netplan.io
430 $ sudo apt-get install ifupdown bridge-utils vlan resolvconf
431 $ sudo apt-get install cloud-init
432
433 $ sudo rm /etc/netplan/50-cloud-init.yaml
434 $ sudo vi /etc/cloud/cloud.cfg.d/99-custom-networking.cfg
435 $ sudo cat /etc/cloud/cloud.cfg.d/99-custom-networking.cfg
436 network: {config: disabled}
437
438 And how to configure the interfaces using ifupdown:
439
440 $ cat /etc/network/interfaces
441
442 auto lo
443 iface lo inet loopback
444 dns-nameserver 168.63.129.16
445
446 # public
447
448 auto eth0
449 iface eth0 inet static
450 address 10.250.98.10
451 netmask 255.255.255.0
452 gateway 10.250.98.1
453
454 # private
455
456 auto eth1
457 iface eth1 inet static
458 address 10.250.3.10
459 netmask 255.255.255.0
460
461 $ cat /etc/hosts
462 127.0.0.1 localhost
463
464 ::1 ip6-localhost ip6-loopback
465 fe00::0 ip6-localnet
466 ff00::0 ip6-mcastprefix
467 ff02::1 ip6-allnodes
468 ff02::2 ip6-allrouters
469 ff02::3 ip6-allhosts
470
471 And disable systemd-networkd:
472
473 $ sudo systemctl disable systemd-networkd.service \
474 systemd-networkd.socket systemd-networkd-wait-online.service \
475 systemd-resolved.service
476
477 $ sudo update-initramfs -k all -u
478
479 And make sure grub configuration is right:
480
481 $ cat /etc/default/grub
482 GRUB_DEFAULT=0
483 GRUB_TIMEOUT=5
484 GRUB_DISTRIBUTOR="Ubuntu"
485 GRUB_CMDLINE_LINUX_DEFAULT="console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300 elevator=noop apparmor=0"
486 GRUB_CMDLINE_LINUX=""
487 GRUB_TERMINAL=serial
488 GRUB_SERIAL_COMMAND="serial --speed=9600 --unit=0 --word=8 --parity=no --stop=1"
489 GRUB_RECORDFAIL_TIMEOUT=0
490
491 $ sudo update-grub
492
493 and reboot (stop and start the instance so grub cmdline is changed).
494
495 $ ifconfig -a
496 eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
497 inet 10.250.98.10 netmask 255.255.255.0 broadcast 10.250.98.255
498 inet6 fe80::20d:3aff:fef8:6551 prefixlen 64 scopeid 0x20<link>
499 ether 00:0d:3a:f8:65:51 txqueuelen 1000 (Ethernet)
500 RX packets 483 bytes 51186 (51.1 KB)
501 RX errors 0 dropped 0 overruns 0 frame 0
502 TX packets 415 bytes 65333 (65.3 KB)
503 TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
504
505 eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
506 inet 10.250.3.10 netmask 255.255.255.0 broadcast 10.250.3.255
507 inet6 fe80::20d:3aff:fef8:3d01 prefixlen 64 scopeid 0x20<link>
508 ether 00:0d:3a:f8:3d:01 txqueuelen 1000 (Ethernet)
509 RX packets 0 bytes 0 (0.0 B)
510 RX errors 0 dropped 0 overruns 0 frame 0
511 TX packets 11 bytes 866 (866.0 B)
512 TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
513
514 lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
515 inet 127.0.0.1 netmask 255.0.0.0
516 inet6 ::1 prefixlen 128 scopeid 0x10<host>
517 loop txqueuelen 1000 (Local Loopback)
518 RX packets 84 bytes 6204 (6.2 KB)
519 RX errors 0 dropped 0 overruns 0 frame 0
520 TX packets 84 bytes 6204 (6.2 KB)
521 TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
522
523 Note: This has to be done in ALL cluster nodes in order for the HA software,
524 pacemaker in our case, to correctly manage the interfaces, virtual aliases and
525 services.
526 --------------------
527
528 Now let's start configuring the cluster. First, fill /etc/hosts with all the
529 names. On all nodes, make sure you have something similar to:
530
531 rafaeldtinoco@clubionic01:~$ cat /etc/hosts
532 127.0.0.1 localhost
533 127.0.1.1 clubionic01
534
535 ::1 ip6-localhost ip6-loopback
536 fe00::0 ip6-localnet
537 ff00::0 ip6-mcastprefix
538 ff02::1 ip6-allnodes
539 ff02::2 ip6-allrouters
540 ff02::3 ip6-allhosts
541
542 # cluster
543
544 10.250.98.13 clubionic # floating IP (application)
545
546 10.250.98.10 bionic01 # node01 public IP
547 10.250.98.11 bionic02 # node02 public IP
548 10.250.98.12 bionic03 # node03 public IP
549
550 10.250.3.10 clubionic01 # node01 ring0 private IP
551 10.250.3.11 clubionic02 # node02 ring0 private IP
552 10.250.3.12 clubionic03 # node03 ring0 private IP
553
554 And that all names are accessible from all nodes:
555
556 $ ping clubionic01
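
(A quick way to check all of them at once, from every node, is a small loop
like the one below, using the names defined in /etc/hosts above:)

$ for h in clubionic01 clubionic02 clubionic03 bionic01 bionic02 bionic03; do
      ping -c1 -W2 $h > /dev/null && echo "$h ok" || echo "$h FAILED"; done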
557 --------------------
558
559 Now let's install the corosync package and make sure we are able to create a
560 messaging-only (for now) cluster with corosync. Install the packages on all 3
561 nodes:
562
563 $ sudo apt-get install pacemaker pacemaker-cli-utils corosync corosync-doc \
564 resource-agents fence-agents crmsh
565
566 With packages properly installed it is time to create the corosync.conf file:
567
568 $ sudo cat /etc/corosync/corosync.conf
569 totem {
570 version: 2
571 secauth: off
572 cluster_name: clubionic
573 transport: udpu
574 }
575
576 nodelist {
577 node {
578 ring0_addr: 10.250.3.10
579 # ring1_addr: 10.250.4.10
580 name: clubionic01
581 nodeid: 1
582 }
583 node {
584 ring0_addr: 10.250.3.11
585 # ring1_addr: 10.250.4.11
586 name: clubionic02
587 nodeid: 2
588 }
589 node {
590 ring0_addr: 10.250.3.12
591 # ring1_addr: 10.250.4.12
592 name: clubionic03
593 nodeid: 3
594 }
595 }
596
597 quorum {
598 provider: corosync_votequorum
599 two_node: 0
600 }
601
602 qb {
603 ipc_type: native
604 }
605
606 logging {
607
608 fileline: on
609 to_stderr: on
610 to_logfile: yes
611 logfile: /var/log/corosync/corosync.log
612 to_syslog: no
613 debug: off
614 }
615
616 But, before restarting corosync with this new configuration, we have to make
617 sure we create a keyfile and share it among all the cluster nodes:
618
619 rafaeldtinoco@clubionic01:~$ sudo corosync-keygen
620
621 Corosync Cluster Engine Authentication key generator.
622 Gathering 1024 bits for key from /dev/random.
623 Press keys on your keyboard to generate entropy.
624 Press keys on your keyboard to generate entropy (bits = 920).
625 Press keys on your keyboard to generate entropy (bits = 1000).
626 Writing corosync key to /etc/corosync/authkey.
627
628 rafaeldtinoco@clubionic01:~$ sudo scp /etc/corosync/authkey \
629 root@clubionic02:/etc/corosync/authkey
630
631 rafaeldtinoco@clubionic01:~$ sudo scp /etc/corosync/authkey \
632 root@clubionic03:/etc/corosync/authkey
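
(Optionally, confirm the key is identical on all nodes and kept with restrictive
permissions; corosync-keygen creates it readable by root only:)

$ md5sum /etc/corosync/authkey
$ ls -l /etc/corosync/authkey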
633
634 And now we are ready to enable the corosync service so it starts by default:
635
636 rafaeldtinoco@clubionic01:~$ systemctl enable --now corosync
637 rafaeldtinoco@clubionic01:~$ systemctl restart corosync
638
639 rafaeldtinoco@clubionic02:~$ systemctl enable --now corosync
640 rafaeldtinoco@clubionic02:~$ systemctl restart corosync
641
642 rafaeldtinoco@clubionic03:~$ systemctl enable --now corosync
643 rafaeldtinoco@clubionic03:~$ systemctl restart corosync
644
645 Finally it is time to check if the messaging layer of our new cluster is good.
646 Don't worry too much about restarting nodes, as the resource-manager (pacemaker)
647 is not running yet and quorum won't be enforced in any way.
648
649 rafaeldtinoco@clubionic01:~$ sudo corosync-quorumtool -si
650 Quorum information
651 ------------------
652 Date: Mon Feb 24 01:54:10 2020
653 Quorum provider: corosync_votequorum
654 Nodes: 3
655 Node ID: 1
656 Ring ID: 1/16
657 Quorate: Yes
658
659 Votequorum information
660 ----------------------
661 Expected votes: 3
662 Highest expected: 3
663 Total votes: 3
664 Quorum: 2
665 Flags: Quorate
666
667 Membership information
668 ----------------------
669 Nodeid Votes Name
670 1 1 10.250.3.10 (local)
671 2 1 10.250.3.11
672 3 1 10.250.3.12
673
674 Perfect! We have the messaging layer ready for the resource-manager to be
675 configured!
676
677 --------------------
678
679 It is time to configure the resource-manager (pacemaker) now:
680
681 rafaeldtinoco@clubionic01:~$ systemctl enable --now pacemaker
682
683 rafaeldtinoco@clubionic02:~$ systemctl enable --now pacemaker
684
685 rafaeldtinoco@clubionic03:~$ systemctl enable --now pacemaker
686
687 rafaeldtinoco@clubionic01:~$ sudo crm_mon -1
688 Stack: corosync
689 Current DC: NONE
690 Last updated: Mon Feb 24 01:56:11 2020
691 Last change: Mon Feb 24 01:40:53 2020 by hacluster via crmd on clubionic01
692
693 3 nodes configured
694 0 resources configured
695
696 Node clubionic01: UNCLEAN (offline)
697 Node clubionic02: UNCLEAN (offline)
698 Node clubionic03: UNCLEAN (offline)
699
700 No active resources
701
702 As you can see, we have to wait until the resource manager uses the messaging
703 transport layer and determines the status of all nodes. Give it a few seconds
704 to settle and you will have:
705
706 rafaeldtinoco@clubionic01:~$ sudo crm_mon -1
707 Stack: corosync
708 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
709 Last updated: Mon Feb 24 01:57:22 2020
710 Last change: Mon Feb 24 01:40:54 2020 by hacluster via crmd on clubionic02
711
712 3 nodes configured
713 0 resources configured
714
715 Online: [ clubionic01 clubionic02 clubionic03 ]
716
717 No active resources
718 --------------------
719
720 Perfect! It is time to do a few "basic" setup steps for pacemaker. Here, in
721 this doc, I'm using the "crmsh" tool to configure the cluster. For Ubuntu Bionic
722 this is the preferred way of configuring pacemaker.
723
724 At any time you can execute "crm" and enter/leave the command levels as if they
725 were directories:
726
727 rafaeldtinoco@clubionic01:~$ sudo crm
728
729 crm(live)# ls
730
731 cibstatus help site
732 cd cluster quit
733 end script verify
734 exit ra maintenance
735 bye ? ls
736 node configure back
737 report cib resource
738 up status corosync
739 options history
740
741 crm(live)# cd configure
742
743 crm(live)configure# ls
744 .. get_property cibstatus
745 primitive set validate_all
746 help rsc_template ptest
747 back cd default-timeouts
748 erase validate-all rsctest
749 rename op_defaults modgroup
750 xml quit upgrade
751 group graph load
752 master location template
753 save collocation rm
754 bye clone ?
755 ls node default_timeouts
756 exit acl_target colocation
757 fencing_topology assist alert
758 ra schema user
759 simulate rsc_ticket end
760 role rsc_defaults monitor
761 cib property resource
762 edit show up
763 refresh order filter
764 get-property tag ms
765 verify commit history
766 delete
767
768 And you can even edit the CIB file for the cluster:
769
770 rafaeldtinoco@clubionic01:~$ crm configure edit
771 rafaeldtinoco@clubionic01:~$ crm
772 crm(live)# cd configure
773 crm(live)configure# edit
774 crm(live)configure# commit
775 INFO: apparently there is nothing to commit
776 INFO: try changing something first
777
778 --------------------
779
780 Let's check the current cluster configuration:
781
782 rafaeldtinoco@clubionic01:~$ crm configure show
783 node 1: clubionic01
784 node 2: clubionic02
785 node 3: clubionic03
786 property cib-bootstrap-options: \
787 have-watchdog=false \
788 dc-version=1.1.18-2b07d5c5a9 \
789 cluster-infrastructure=corosync \
790 cluster-name=clubionic
791
792 With these basic settings we can see 2 important things before we attempt to
793 configure any resource: we are missing a "watchdog" device AND there is no
794 "fencing" configured for the cluster.
795
796 NOTE:
797
798 1. This is an important note to read. Since we are going to rely on pacemaker
799 for our cluster health, it is mandatory that pacemaker knows how to decide
800 which side of the cluster is the one that should have its resources enabled IF
801 there is a rupture in the messaging (internal / ring0) layer. The side with
802 more "votes" is the side that will become "active", while the node(s)
803 without communication will be "fenced".
804
805 Usually fencing comes in the form of power fencing: The quorate side of the
806 cluster is able to get a positive response from the fencing mechanism of the
807 broken side through an external communication path (like a network talking to
808 ILOs or BMCs).
809
810 In this case, we are going to use the shared SCSI disk and its SCSI-3 feature
811 called SCSI PERSISTENT RESERVATIONS as the fencing mechanism: every time the
812 interconnect communication faces a disruption, the quorate side (in this 3-node
813 example, the side that still has 2 nodes communicating through the private ring
814 network) will make sure to "fence" the other node using SCSI PERSISTENT
815 RESERVATIONS (by removing the SCSI reservation key used by the node to be
816 fenced, for example).
817
818 Other fencing mechanisms support a "reboot/reset" action whenever the quorate
819 cluster wants to fence some node. Let's start calling things by name: pacemaker
820 has a service called "stonith" (shoot the other node in the head) and that's
821 how it executes fencing actions: through fencing agents (fence_scsi in our
822 case) that receive arguments and execute the programmed
823 actions to "shoot the other node in the head".
824
825 Since the fence_scsi agent does not have a "reboot/reset" action, it is good to
826 have a "watchdog" device capable of realizing that the node cannot read and/or
827 write to the shared disk and of killing the node whenever that happens. With a
828 watchdog device we have a "complete" HA solution: a fencing mechanism that
829 blocks the fenced node from reading or writing to the application disk (saving a
830 shared filesystem from being corrupted, for example) AND a watchdog device that,
831 as soon as it realizes the node has been fenced, resets the node.
832
833 --------------------
834
835 There are multiple HW watchdog devices around, but if you don't have one in your
836 HW (and/or virtual machine) you can always count on the in-kernel software
837 watchdog device (the kernel module called "softdog").
838
839 $ apt-get install watchdog
840
841 For the questions when installing the "watchdog" package, make sure to set:
842
843 Watchdog module to preload: softdog
844
845 and all the others to their defaults. Install the "watchdog" package on all 3 nodes.
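
The answer to that debconf question normally ends up in /etc/default/watchdog,
so it is worth double-checking that the module is set there after the
installation (a quick check; the rest of the file can stay at its defaults):

$ grep watchdog_module /etc/default/watchdog
watchdog_module="softdog"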
846
847 Of course the watchdog won't do anything for pacemaker by itself. We have to
848 tell watchdog that we would like it to check access to the fence_scsi shared
849 disks from time to time. The way we do this is:
850
851 $ apt-file search fence_scsi_check
852 fence-agents: /usr/share/cluster/fence_scsi_check
853
854 $ sudo mkdir /etc/watchdog.d/
855 $ sudo cp /usr/share/cluster/fence_scsi_check /etc/watchdog.d/
856 $ systemctl restart watchdog
857
858 $ ps -ef | grep watch
859 root 41 2 0 00:10 ? 00:00:00 [watchdogd]
860 root 8612 1 0 02:21 ? 00:00:00 /usr/sbin/watchdog
861
862 Also do that on all 3 nodes.
863
864 After configuring watchdog, let's keep it disabled and stopped for now... or
865 else your nodes will keep rebooting because the reservations are not on the
866 shared disk yet (as pacemaker is not configured).
867
868 $ systemctl disable watchdog
869 Synchronizing state of watchdog.service with SysV service script with /lib/systemd/systemd-sysv-install.
870 Executing: /lib/systemd/systemd-sysv-install disable watchdog
871
872 $ systemctl stop watchdog
873
874 --------------------
875
876 Now we have the "fence_scsi" agent to fence a node AND a watchdog device
877 (/dev/watchdog) created by the kernel module "softdog" and managed by the
878 watchdog daemon, which executes our fence_scsi_check script.
879
880 Let's tell this to the cluster:
881
882 rafaeldtinoco@clubionic01:~$ crm configure
883 crm(live)configure# property stonith-enabled=on
884 crm(live)configure# property stonith-action=off
885 crm(live)configure# property no-quorum-policy=stop
886 crm(live)configure# property have-watchdog=true
887 crm(live)configure# commit
888 crm(live)configure# end
889 crm(live)# end
890 bye
891
892 rafaeldtinoco@clubionic01:~$ crm configure show
893 node 1: clubionic01
894 node 2: clubionic02
895 node 3: clubionic03
896 property cib-bootstrap-options: \
897 have-watchdog=true \
898 dc-version=1.1.18-2b07d5c5a9 \
899 cluster-infrastructure=corosync \
900 cluster-name=clubionic \
901 stonith-enabled=on \
902 stonith-action=off \
903 no-quorum-policy=stop
904
905 Besides telling the cluster that we have a watchdog and what the fencing policy
906 is, we also have to configure the fence resource and tell it where to run.
907
908 --------------------
909
910 Let's continue creating the fencing resource in the cluster:
911
912 rafaeldtinoco@clubionic03:~$ sudo sg_persist --in --read-keys --device=/dev/sda
913 LIO-ORG cluster.bionic. 4.0
914 Peripheral device type: disk
915 PR generation=0x0, there are NO registered reservation keys
916
917 rafaeldtinoco@clubionic03:~$ sudo sg_persist -r /dev/sda
918 LIO-ORG cluster.bionic. 4.0
919 Peripheral device type: disk
920 PR generation=0x0, there is NO reservation held
921
922 rafaeldtinoco@clubionic01:~$ crm configure primitive fence_clubionic \
923 stonith:fence_scsi params \
924 pcmk_host_list="clubionic01 clubionic02 clubionic03" \
925 devices="/dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:0:0" \
926 meta provides=unfencing
927
928 After creating the fencing agent, make sure it is running:
929
930 rafaeldtinoco@clubionic01:~$ crm_mon -1
931 Stack: corosync
932 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
933 Last updated: Mon Feb 24 04:06:15 2020
934 Last change: Mon Feb 24 04:06:11 2020 by root via cibadmin on clubionic01
935
936 3 nodes configured
937 1 resource configured
938
939 Online: [ clubionic01 clubionic02 clubionic03 ]
940
941 Active resources:
942
943 fence_clubionic (stonith:fence_scsi): Started clubionic01
944
945 and also make sure that the reservations are in place:
946
947 rafaeldtinoco@clubionic03:~$ sudo sg_persist --in --read-keys --device=/dev/sda
948 LIO-ORG cluster.bionic. 4.0
949 Peripheral device type: disk
950 PR generation=0x3, 3 registered reservation keys follow:
951 0x3abe0001
952 0x3abe0000
953 0x3abe0002
954
955 Having 3 keys registered shows that all nodes have registered their keys, while,
956 when checking which host holds the reservation, you should see a single node
957 key:
958
959 rafaeldtinoco@clubionic03:~$ sudo sg_persist -r /dev/sda
960 LIO-ORG cluster.bionic. 4.0
961 Peripheral device type: disk
962 PR generation=0x3, Reservation follows:
963 Key=0x3abe0001
964 scope: LU_SCOPE, type: Write Exclusive, registrants only
965
966 --------------------
967
968 Testing fencing before moving on
969
970 It is very important to make sure that we are able to fence a node that faces
971 issues. In our case, as we are also using a watchdog device, we want to make
972 sure that the node will reboot in case it loses access to the shared SCSI disk.
973
974 In order to verify that, we can do a simple test:
975
976 rafaeldtinoco@clubionic01:~$ crm_mon -1
977 Stack: corosync
978 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
979 Last updated: Fri Mar 6 16:43:01 2020
980 Last change: Fri Mar 6 16:38:55 2020 by hacluster via crmd on clubionic01
981
982 3 nodes configured
983 1 resource configured
984
985 Online: [ clubionic01 clubionic02 clubionic03 ]
986
987 Active resources:
988
989 fence_clubionic (stonith:fence_scsi): Started clubionic01
990
991 You can see that the fence_clubionic resource is running on clubionic01. With
992 that information we can stop the interconnect (private) network communication of
993 that node only and check 2 things:
994
995 1) the fence_clubionic resource has to be started on another node
996 2) clubionic01 (where fence_clubionic was running) will reboot
997
998 rafaeldtinoco@clubionic01:~$ sudo iptables -A INPUT -i eth2 -j DROP
999
1000 rafaeldtinoco@clubionic02:~$ crm_mon -1
1001 Stack: corosync
1002 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1003 Last updated: Fri Mar 6 16:45:31 2020
1004 Last change: Fri Mar 6 16:38:55 2020 by hacluster via crmd on clubionic01
1005
1006 3 nodes configured
1007 1 resource configured
1008
1009 Online: [ clubionic02 clubionic03 ]
1010 OFFLINE: [ clubionic01 ]
1011
1012 Active resources:
1013
1014 fence_clubionic (stonith:fence_scsi): Started clubionic02
1015
1016 Okay (1) worked. fence_clubionic resource migrated to clubionic02 node AND the
1017 reservation key from clubionic01 node was removed from the shared storage:
1018
1019 rafaeldtinoco@clubionic02:~$ sudo sg_persist --in --read-keys --device=/dev/sda
1020 LIO-ORG cluster.bionic. 4.0
1021 Peripheral device type: disk
1022 PR generation=0x4, 2 registered reservation keys follow:
1023 0x3abe0001
1024 0x3abe0002
1025
1026 After up to 60sec (default timeout for the softdog driver + watchdog daemon):
1027
1028 [ 596.943649] reboot: Restarting system
1029
1030 clubionic01 is rebooted by the watchdog daemon (remember the file
1031 /etc/watchdog.d/fence_scsi_check? That file is responsible for making the
1032 watchdog daemon reboot the node... once it realizes the SCSI disk is no
1033 longer accessible to our node).
1034
1035 After the reboot succeeds:
1036
1037 rafaeldtinoco@clubionic02:~$ sudo sg_persist --in --read-keys --device=/dev/sda
1038 LIO-ORG cluster.bionic. 4.0
1039 Peripheral device type: disk
1040 PR generation=0x5, 3 registered reservation keys follow:
1041 0x3abe0001
1042 0x3abe0002
1043 0x3abe0000
1044
1045 rafaeldtinoco@clubionic02:~$ crm_mon -1
1046 Stack: corosync
1047 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1048 Last updated: Fri Mar 6 16:49:44 2020
1049 Last change: Fri Mar 6 16:38:55 2020 by hacluster via crmd on clubionic01
1050
1051 3 nodes configured
1052 1 resource configured
1053
1054 Online: [ clubionic01 clubionic02 clubionic03 ]
1055
1056 Active resources:
1057
1058 fence_clubionic (stonith:fence_scsi): Started clubionic02
1059
1060 It's all back to normal, but the fence_clubionic agent stays where it was: the
1061 clubionic02 node. This cluster behavior exists to avoid the "ping-pong"
1062 effect during intermittent failures.
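
(If you do want to move it back manually, you can create and then clear a
location constraint with "crm resource"; a small sketch, using the resource and
node names from this document:)

rafaeldtinoco@clubionic01:~$ sudo crm resource move fence_clubionic clubionic01
rafaeldtinoco@clubionic01:~$ sudo crm resource unmove fence_clubionic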
1063
1064 --------------------
1065
1066 Now we will install a simple lighttpd service on all the nodes and have it
1067 managed by pacemaker. The idea is simple: to have a virtual IP migrating
1068 between the nodes, serving a lighttpd service with files coming from the
1069 shared filesystem disk.
1070
1071 AN IMPORTANT THING TO NOTE HERE: If you are using a SHARED SCSI disk to
1072 protect cluster concurrency, it is imperative that the data being served
1073 by the HA application is also contained on the shared disk.
1074
1075 rafaeldtinoco@clubionic01:~$ apt-get install lighttpd
1076 rafaeldtinoco@clubionic01:~$ systemctl stop lighttpd.service
1077 rafaeldtinoco@clubionic01:~$ systemctl disable lighttpd.service
1078
1079 rafaeldtinoco@clubionic02:~$ apt-get install lighttpd
1080 rafaeldtinoco@clubionic02:~$ systemctl stop lighttpd.service
1081 rafaeldtinoco@clubionic02:~$ systemctl disable lighttpd.service
1082
1083 rafaeldtinoco@clubionic03:~$ apt-get install lighttpd
1084 rafaeldtinoco@clubionic03:~$ systemctl stop lighttpd.service
1085 rafaeldtinoco@clubionic03:~$ systemctl disable lighttpd.service
1086
1087 Having the hostname as the index.html content, we will be able to know which
1088 node is active when accessing the virtual IP, which will migrate among all 3
1089 nodes:
1090
1091 rafaeldtinoco@clubionic01:~$ sudo rm /var/www/html/*.html
1092 rafaeldtinoco@clubionic01:~$ echo $HOSTNAME | sudo tee /var/www/html/index.html
1093 clubionic01
1094
1095 rafaeldtinoco@clubionic02:~$ sudo rm /var/www/html/*.html
1096 rafaeldtinoco@clubionic02:~$ echo $HOSTNAME | sudo tee /var/www/html/index.html
1097 clubionic02
1098
1099 rafaeldtinoco@clubionic03:~$ sudo rm /var/www/html/*.html
1100 rafaeldtinoco@clubionic03:~$ echo $HOSTNAME | sudo tee /var/www/html/index.html
1101 clubionic03
1102
1103 And we will have a good way to tell which source the lighttpd daemon is
1104 getting its files from:
1105
1106 rafaeldtinoco@clubionic01:~$ curl localhost
1107 clubionic01 -> local disk
1108
1109 rafaeldtinoco@clubionic01:~$ curl clubionic02
1110 clubionic02 -> local (to clubionic02) disk
1111
1112 rafaeldtinoco@clubionic01:~$ curl clubionic03
1113 clubionic03 -> local (to clubionic03) disk
1114
1115 --------------------
1116
1117 The next step is to configure the cluster as an Active-Passive-only HA cluster.
1118 The shared disk in this scenario only works as a fencing mechanism.
1119
1120 rafaeldtinoco@clubionic01:~$ crm configure sh
1121 node 1: clubionic01
1122 node 2: clubionic02
1123 node 3: clubionic03
1124 primitive fence_clubionic stonith:fence_scsi \
1125 params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
1126 devices="/dev/sda" meta provides=unfencing
1127 primitive virtual_ip IPaddr2 \
1128 params ip=10.250.98.13 nic=eth3 \
1129 op monitor interval=10s
1130 primitive webserver systemd:lighttpd \
1131 op monitor interval=10 timeout=30
1132 group webserver_vip webserver virtual_ip
1133 property cib-bootstrap-options: \
1134 have-watchdog=false \
1135 dc-version=1.1.18-2b07d5c5a9 \
1136 cluster-infrastructure=corosync \
1137 cluster-name=clubionic \
1138 stonith-enabled=on \
1139 stonith-action=off \
1140 no-quorum-policy=stop
1141
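For reference, a minimal sketch of the crmsh commands that produce the two
primitives and the group shown above (the IP, NIC and resource names are the
ones used in this example and will differ in your environment):

crm(live)configure# primitive virtual_ip ocf:heartbeat:IPaddr2 \
        params ip=10.250.98.13 nic=eth3 op monitor interval=10s
crm(live)configure# primitive webserver systemd:lighttpd \
        op monitor interval=10 timeout=30
crm(live)configure# group webserver_vip webserver virtual_ip
crm(live)configure# commit
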
1142 As you can see, I have created 2 resources and 1 group of resources. You can
1143 copy and paste the commands above inside "crmsh", do a "commit" at the end,
1144 and it will create the resources for you. After creating the resources, check
1145 that they are working:
1146
1147 rafaeldtinoco@clubionic01:~$ crm_mon -1
1148 Stack: corosync
1149 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1150 Last updated: Fri Mar 6 18:57:54 2020
1151 Last change: Fri Mar 6 18:52:17 2020 by root via cibadmin on clubionic01
1152
1153 3 nodes configured
1154 3 resources configured
1155
1156 Online: [ clubionic01 clubionic02 clubionic03 ]
1157
1158 Active resources:
1159
1160 fence_clubionic (stonith:fence_scsi): Started clubionic02
1161 Resource Group: webserver_vip
1162 webserver (systemd:lighttpd): Started clubionic01
1163 virtual_ip (ocf::heartbeat:IPaddr2): Started clubionic01
1164
1165 rafaeldtinoco@clubionic01:~$ ping -c 1 clubionic.public
1166 PING clubionic.public (10.250.98.13) 56(84) bytes of data.
1167 64 bytes from clubionic.public (10.250.98.13): icmp_seq=1 ttl=64 time=0.025 ms
1168
1169 --- clubionic.public ping statistics ---
1170 1 packets transmitted, 1 received, 0% packet loss, time 0ms
1171 rtt min/avg/max/mdev = 0.025/0.025/0.025/0.000 ms
1172
1173 And testing the resource is really active in clubionic01 host:
1174
1175 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1176 clubionic01
1177
1178 Note that, in this example, we are not using the shared disk for much, only as
1179 a way of fencing a failed host. This is important for virtual
1180 environments that do not necessarily give you a power-fencing mechanism, for
1181 example, and where you have to rely on SCSI FENCING + WATCHDOG to guarantee
1182 cluster consistency, as said in the beginning of this document.
1183
1184 The final step is to start using the shared SCSI disk as an HA active/passive
1185 resource. It means that the webserver we are clustering will serve files from
1186 the shared disk, but there won't be multiple active nodes simultaneously, just
1187 one. This can serve as a clustering example for other services such as:
1188 CIFS, SAMBA, NFS, and MTAs/MDAs such as postfix/qmail, etc.
1189
1190 --------------------
1191
1192 Note: I'm using the "systemd" resource agent standard because it does not rely
1193 on older agents. You can check the supported standards and agents by executing:
1194
1195 rafaeldtinoco@clubionic01:~$ crm_resource --list-standards
1196 ocf
1197 lsb
1198 service
1199 systemd
1200 stonith
1201
1202 rafaeldtinoco@clubionic01:~$ crm_resource --list-agents=systemd
1203 apt-daily
1204 apt-daily-upgrade
1205 atd
1206 autovt@
1207 bootlogd
1208 ...
1209
1210 The agents list depends on the software you have installed at
1211 the moment you execute that command on a node (as the systemd standard basically
1212 uses the existing systemd service units on the nodes).
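
(You can also inspect the parameters and actions a specific agent supports
before using it; for example, for the IPaddr2 agent used in this document:)

rafaeldtinoco@clubionic01:~$ crm ra info ocf:heartbeat:IPaddr2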
1213
1214 --------------------
1215
1216 For this HA environment we need to first migrate the shared disk (meaning
1217 unmounting it from one node and mounting it on the other one) and then migrate
1218 the dependent services. For this scenario there is no need to configure a
1219 locking manager of any kind.
1220
1221 Let's install LVM2 packages in all nodes:
1222
1223 $ apt-get install lvm2
1224
1225 And configure LVM2 to have a system ID based on the uname command output:
1226
1227 rafaeldtinoco@clubionic01:~$ sudo vi /etc/lvm/lvm.conf
1228 ...
1229 system_id_source = "uname"
1230
1231 Do that in all 3 nodes.
1232
1233 rafaeldtinoco@clubionic01:~$ sudo lvm systemid
1234 system ID: clubionic01
1235
1236 rafaeldtinoco@clubionic02:~$ sudo lvm systemid
1237 system ID: clubionic02
1238
1239 rafaeldtinoco@clubionic03:~$ sudo lvm systemid
1240 system ID: clubionic03
1241
1242 Configure 1 partition for the shared disk:
1243
1244 rafaeldtinoco@clubionic01:~$ sudo gdisk /dev/sda
1245 GPT fdisk (gdisk) version 1.0.3
1246
1247 Partition table scan:
1248 MBR: not present
1249 BSD: not present
1250 APM: not present
1251 GPT: not present
1252
1253 Creating new GPT entries.
1254
1255 Command (? for help): n
1256 Partition number (1-128, default 1):
1257 First sector (34-2047966, default = 2048) or {+-}size{KMGTP}:
1258 Last sector (2048-2047966, default = 2047966) or {+-}size{KMGTP}:
1259 Current type is 'Linux filesystem'
1260 Hex code or GUID (L to show codes, Enter = 8300):
1261 Changed type of partition to 'Linux filesystem'
1262
1263 Command (? for help): w
1264
1265 Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
1266 PARTITIONS!!
1267
1268 Do you want to proceed? (Y/N): y
1269 OK; writing new GUID partition table (GPT) to /dev/sda.
1270 The operation has completed successfully.
1271
1272 And create the physical and logical volumes using LVM2:
1273
1274 rafaeldtinoco@clubionic01:~$ sudo pvcreate /dev/sda1
1275
1276 rafaeldtinoco@clubionic01:~$ sudo vgcreate clustervg /dev/sda1
1277
1278 rafaeldtinoco@clubionic01:~$ sudo vgs -o+systemid
1279 VG #PV #LV #SN Attr VSize VFree System ID
1280 clustervg 1 0 0 wz--n- 988.00m 988.00m clubionic01
1281
1282 rafaeldtinoco@clubionic01:~$ sudo lvcreate -l100%FREE -n clustervol clustervg
1283 Logical volume "clustervol" created.
1284
1285 rafaeldtinoco@clubionic01:~$ sudo mkfs.ext4 -LCLUSTERDATA /dev/clustervg/clustervol
1286 mke2fs 1.44.1 (24-Mar-2018)
1287 Creating filesystem with 252928 4k blocks and 63232 inodes
1288 Filesystem UUID: d0c7ab5c-abf6-4ee0-aee1-ec1ce7917bea
1289 Superblock backups stored on blocks:
1290 32768, 98304, 163840, 229376
1291
1292 Allocating group tables: done
1293 Writing inode tables: done
1294 Creating journal (4096 blocks): done
1295 Writing superblocks and filesystem accounting information: done
1296
1297 Let's now create a directory to mount this volume on, in all 3 nodes. Remember,
1298 we are not *yet* configuring a cluster filesystem. The disk should be mounted
1299 on one node AT A TIME.
1300
1301 rafaeldtinoco@clubionic01:~$ sudo mkdir /clusterdata
1302
1303 rafaeldtinoco@clubionic02:~$ sudo mkdir /clusterdata
1304
1305 rafaeldtinoco@clubionic03:~$ sudo mkdir /clusterdata
1306
1307 And, in this particular case, it should be tested on the node where you ran all
1308 the LVM2 commands and created the EXT4 filesystem:
1309
1310 rafaeldtinoco@clubionic01:~$ sudo mount /dev/clustervg/clustervol /clusterdata
1311
1312 rafaeldtinoco@clubionic01:~$ mount | grep cluster
1313 /dev/mapper/clustervg-clustervol on /clusterdata type ext4 (rw,relatime,stripe=2048,data=ordered)
1314
1315 Now we can go ahead and disable the volume group:
1316
1317 rafaeldtinoco@clubionic01:~$ sudo umount /clusterdata
1318
1319 rafaeldtinoco@clubionic01:~$ sudo vgchange -an clustervg
1320
1321 --------------------
1322
1323 It's time to remove the resources we have configured and re-configure them. This
1324 is needed because the resources of a group are started in the order you created
1325 them and, in this new case, the lighttpd resource will depend on the shared-disk
1326 filesystem being mounted on the node where lighttpd is started.
1327
1328 rafaeldtinoco@clubionic01:~$ sudo crm resource stop webserver_vip
1329 rafaeldtinoco@clubionic01:~$ sudo crm configure delete webserver
1330 rafaeldtinoco@clubionic01:~$ sudo crm configure delete virtual_ip
1331 rafaeldtinoco@clubionic01:~$ sudo crm configure sh
1332 node 1: clubionic01
1333 node 2: clubionic02
1334 node 3: clubionic03
1335 primitive fence_clubionic stonith:fence_scsi \
1336 params pcmk_host_list="clubionic01 clubionic02 clubionic03" \
1337 plug="" devices="/dev/sda" meta provides=unfencing
1338 property cib-bootstrap-options: \
1339 have-watchdog=false \
1340 dc-version=1.1.18-2b07d5c5a9 \
1341 cluster-infrastructure=corosync \
1342 cluster-name=clubionic \
1343 stonith-enabled=on \
1344 stonith-action=off \
1345 no-quorum-policy=stop
1346
1347 Now we can create the resource responsible for taking care of the LVM volume
1348 group migration: ocf:heartbeat:LVM-activate.
1349
1350 crm(live)configure# primitive lvm2 ocf:heartbeat:LVM-activate vgname=clustervg \
1351 vg_access_mode=system_id
1352
1353 crm(live)configure# commit
1354
1355 With only those 2 commands, the cluster will have one of the nodes activating
1356 the volume group "clustervg" we have created. In my case it got enabled on the
1357 2nd node of the cluster:
1358
1359 rafaeldtinoco@clubionic02:~$ crm_mon -1
1360 Stack: corosync
1361 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
1362 Last updated: Fri Mar 6 20:59:44 2020
1363 Last change: Fri Mar 6 20:58:33 2020 by root via cibadmin on clubionic01
1364
1365 3 nodes configured
1366 2 resources configured
1367
1368 Online: [ clubionic01 clubionic02 clubionic03 ]
1369
1370 Active resources:
1371
1372 fence_clubionic (stonith:fence_scsi): Started clubionic01
1373 lvm2 (ocf::heartbeat:LVM-activate): Started clubionic02
1374
1375 And I can check that by executing:
1376
1377 rafaeldtinoco@clubionic02:~$ sudo vgs
1378 VG #PV #LV #SN Attr VSize VFree
1379 clustervg 1 1 0 wz--n- 988.00m 0
1380
1381 rafaeldtinoco@clubionic02:~$ sudo lvs
1382 LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
1383 clustervol clustervg -wi-a----- 988.00m
1384
1385 rafaeldtinoco@clubionic02:~$ sudo vgs -o+systemid
1386 VG #PV #LV #SN Attr VSize VFree System ID
1387 clustervg 1 1 0 wz--n- 988.00m 0 clubionic02
1388
1389 rafaeldtinoco@clubionic02:~$ sudo mount -LCLUSTERDATA /clusterdata
1390
1391 rafaeldtinoco@clubionic02:~$ sudo umount /clusterdata
1392
1393 should both work on the node having the "lvm2" resource started.
1394
1395 Now it's time to re-create the resources we had before, this time in the group
1396 "webservergroup".
1397
1398 crm(live)configure# primitive webserver systemd:lighttpd \
1399 op monitor interval=10 timeout=30
1400
1401 crm(live)configure# group webservergroup lvm2 virtual_ip webserver
1402
1403 crm(live)configure# commit
1404
1405 Now pacemaker should show all resources inside "webservergroup":
1406
1407 - lvm2
1408 - virtual_ip
1409 - webserver
1410
1411 enabled in the *same* node:
1412
1413 rafaeldtinoco@clubionic02:~$ crm_mon -1
1414 Stack: corosync
1415 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
1416 Last updated: Fri Mar 6 21:05:24 2020
1417 Last change: Fri Mar 6 21:04:55 2020 by root via cibadmin on clubionic01
1418
1419 3 nodes configured
1420 4 resources configured
1421
1422 Online: [ clubionic01 clubionic02 clubionic03 ]
1423
1424 Active resources:
1425
1426 fence_clubionic (stonith:fence_scsi): Started clubionic01
1427 Resource Group: webservergroup
1428 lvm2 (ocf::heartbeat:LVM-activate): Started clubionic02
1429 virtual_ip (ocf::heartbeat:IPaddr2): Started clubionic02
1430 webserver (systemd:lighttpd): Started clubionic02
1431
1432 And it does: clubionic02 node.
1433
1434 --------------------
1435
1436 Perfect. It's time to configure the filesystem mount and umount now. Before
1437 moving on, make sure to install the "psmisc" package on all nodes:
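
(Installing psmisc is a one-liner on each node; the Filesystem resource agent
uses its "fuser" command when it needs to kill processes holding a mount:)

rafaeldtinoco@clubionic01:~$ sudo apt-get install -y psmisc
rafaeldtinoco@clubionic02:~$ sudo apt-get install -y psmisc
rafaeldtinoco@clubionic03:~$ sudo apt-get install -y psmisc

Then, back inside crmsh: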
1438
1439 crm(live)configure# primitive ext4 ocf:heartbeat:Filesystem device=/dev/clustervg/clustervol directory=/clusterdata fstype=ext4
1440
1441 crm(live)configure# del webservergroup
1442
1443 crm(live)configure# group webservergroup lvm2 ext4 virtual_ip webserver
1444
1445 crm(live)configure# commit
1446
1447 You will have:
1448
1449 rafaeldtinoco@clubionic02:~$ crm_mon -1
1450 Stack: corosync
1451 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
1452 Last updated: Fri Mar 6 21:16:39 2020
1453 Last change: Fri Mar 6 21:16:36 2020 by hacluster via crmd on clubionic03
1454
1455 3 nodes configured
1456 5 resources configured
1457
1458 Online: [ clubionic01 clubionic02 clubionic03 ]
1459
1460 Active resources:
1461
1462 fence_clubionic (stonith:fence_scsi): Started clubionic01
1463 Resource Group: webservergroup
1464 lvm2 (ocf::heartbeat:LVM-activate): Started clubionic03
1465 ext4 (ocf::heartbeat:Filesystem): Started clubionic03
1466 virtual_ip (ocf::heartbeat:IPaddr2): Started clubionic03
1467 webserver (systemd:lighttpd): Started clubionic03
1468
1469 rafaeldtinoco@clubionic03:~$ mount | grep -i clu
1470 /dev/mapper/clustervg-clustervol on /clusterdata type ext4 (rw,relatime,stripe=2048,data=ordered)
1471
1472 And that makes the environment we just created perfect for hosting the lighttpd
1473 service files, as the physical and logical volumes will migrate from one node to
1474 another together with the needed service (lighttpd) AND the virtual IP used to
1475 serve our end users:
1476
1477 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1478 clubionic03
1479
1480 rafaeldtinoco@clubionic01:~$ crm resource move webservergroup clubionic01
1481 INFO: Move constraint created for webservergroup to clubionic01
1482
1483 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1484 clubionic01
1485
1486 We can start serving files/data from the volume that is currently being managed
1487 by the cluster. On the node with the resource group "webservergroup" enabled you
1488 could:
1489
1490 rafaeldtinoco@clubionic01:~$ sudo rsync -avz /var/www/ /clusterdata/www/
1491 sending incremental file list
1492 created directory /clusterdata/www
1493 ./
1494 cgi-bin/
1495 html/
1496 html/index.html
1497
1498 rafaeldtinoco@clubionic01:~$ sudo rm -rf /var/www
1499 rafaeldtinoco@clubionic01:~$ sudo ln -s /clusterdata/www /var/www
1500 rafaeldtinoco@clubionic01:~$ cd /clusterdata/www/html/
1501 rafaeldtinoco@clubionic01:.../html$ echo clubionic | sudo tee index.html
1502
1503 and in all other nodes:
1504
1505 rafaeldtinoco@clubionic02:~$ sudo rm -rf /var/www
1506 rafaeldtinoco@clubionic02:~$ sudo ln -s /clusterdata/www /var/www
1507
1508 rafaeldtinoco@clubionic03:~$ sudo rm -rf /var/www
1509 rafaeldtinoco@clubionic03:~$ sudo ln -s /clusterdata/www /var/www
1510
1511 and test the fact that, now, data being distributed by lighttpd is shared among
1512 the nodes in an active-passive way:
1513
1514 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1515 clubionic
1516
1517 rafaeldtinoco@clubionic01:~$ crm resource move webservergroup clubionic02
1518 INFO: Move constraint created for webservergroup to clubionic02
1519
1520 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1521 clubionic
1522
1523 rafaeldtinoco@clubionic01:~$ crm resource move webservergroup clubionic03
1524 INFO: Move constraint created for webservergroup to clubionic03
1525
1526 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1527 clubionic
1528
1529 --------------------
1530 --------------------
1531
1532 Okay, so... we've already done 3 important things with our scsi-shared-disk
1533 fenced (+ watchdog'ed) cluster:
1534
1535 - configured scsi persistent-reservation based fencing
1536 - configured watchdog to fence a host without reservations
1537 - configured HA resource group that migrates disk, ip and service among nodes
1538
1539 --------------------
1540 --------------------
1541
1542 It is time to go further and make all the nodes access the same filesystem
1543 on the shared disk being managed by the cluster. This allows different
1544 applications to be enabled on different nodes while accessing the same disk,
1545 among several other use cases that you can find online.
1546
1547 Let's install the distributed lock manager in all cluster nodes:
1548
1549 rafaeldtinoco@clubionic01:~$ apt-get install -y dlm-controld
1550
1551 rafaeldtinoco@clubionic02:~$ apt-get install -y dlm-controld
1552
1553 rafaeldtinoco@clubionic03:~$ apt-get install -y dlm-controld
1554
1555 NOTE:
1556
1557 1. Before enabling the dlm-controld service you should disable the watchdog
1558 daemon "just in case", as it can cause you problems, rebooting your cluster
1559 nodes, if the dlm_controld daemon does not start successfully.
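
(For example, on each node, something like the command below; the watchdog
service gets re-enabled only at the very end, as explained later:)

rafaeldtinoco@clubionic01:~$ sudo systemctl disable --now watchdog
rafaeldtinoco@clubionic02:~$ sudo systemctl disable --now watchdog
rafaeldtinoco@clubionic03:~$ sudo systemctl disable --now watchdog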
1560
1561 Check that dlm service has started successfully:
1562
1563 rafaeldtinoco@clubionic01:~$ systemctl status dlm
1564 ● dlm.service - dlm control daemon
1565 Loaded: loaded (/etc/systemd/system/dlm.service; enabled; vendor preset: enabled)
1566 Active: active (running) since Fri 2020-03-06 20:25:05 UTC; 1 day 22h ago
1567 Docs: man:dlm_controld
1568 man:dlm.conf
1569 man:dlm_stonith
1570 Main PID: 4029 (dlm_controld)
1571 Tasks: 2 (limit: 2338)
1572 CGroup: /system.slice/dlm.service
1573 └─4029 /usr/sbin/dlm_controld --foreground
1574
1575 and, if it didn't, try removing the dlm module:
1576
1577 rafaeldtinoco@clubionic01:~$ sudo modprobe -r dlm
1578
1579 and reloading it again:
1580
1581 rafaeldtinoco@clubionic01:~$ sudo modprobe dlm
1582
1583 as this might happen because the udev rules were not yet processed during
1584 package installation and the /dev/misc/XXXX devices were not created. One way of
1585 guaranteeing that dlm always finds the correct devices is to add it to the
1586 /etc/modules file:
1587
1588 rafaeldtinoco@clubionic01:~$ cat /etc/modules
1589 virtio_balloon
1590 virtio_blk
1591 virtio_net
1592 virtio_pci
1593 virtio_ring
1594 virtio
1595 ext4
1596 9p
1597 9pnet
1598 9pnet_virtio
1599 dlm
1600
1601 So it is loaded during boot time:
1602
1603 rafaeldtinoco@clubionic01:~$ sudo update-initramfs -k all -u
1604
1605 rafaeldtinoco@clubionic01:~$ sudo reboot
1606
1607 rafaeldtinoco@clubionic01:~$ systemctl --value is-active corosync.service
1608 active
1609
1610 rafaeldtinoco@clubionic01:~$ systemctl --value is-active pacemaker.service
1611 active
1612
1613 rafaeldtinoco@clubionic01:~$ systemctl --value is-active dlm.service
1614 active
1615
1616 rafaeldtinoco@clubionic01:~$ systemctl --value is-active watchdog.service
1617 inactive
1618
1619 And, after making sure everything works, disable the dlm service on all nodes:
1620 
1621 rafaeldtinoco@clubionic01:~$ sudo systemctl disable dlm
1622 
1623 rafaeldtinoco@clubionic02:~$ sudo systemctl disable dlm
1624 
1625 rafaeldtinoco@clubionic03:~$ sudo systemctl disable dlm
1626
1627 because this service will be managed by the cluster resource manager. The
1628 watchdog service will be enabled at the end, because it is the watchdog daemon
1629 that reboots/resets a node after its SCSI reservations are fenced.
1630
1631 --------------------
1632
1633 In order to install the cluster filesystem (GFS2 in this case) we first remove
1634 the configuration we previously created in the cluster:
1635
1636 rafaeldtinoco@clubionic01:~$ sudo crm conf show
1637 node 1: clubionic01
1638 node 2: clubionic02
1639 node 3: clubionic03
1640 primitive ext4 Filesystem \
1641 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1642 fstype=ext4
1643 primitive fence_clubionic stonith:fence_scsi \
1644 params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
1645 devices="/dev/sda" meta provides=unfencing target-role=Started
1646 primitive lvm2 LVM-activate \
1647 params vgname=clustervg vg_access_mode=system_id
1648 primitive virtual_ip IPaddr2 \
1649 params ip=10.250.98.13 nic=eth3 \
1650 op monitor interval=10s
1651 primitive webserver systemd:lighttpd \
1652 op monitor interval=10 timeout=30
1653 group webservergroup lvm2 ext4 virtual_ip webserver \
1654 meta target-role=Started
1655 location cli-prefer-webservergroup webservergroup role=Started inf: clubionic03
1656 property cib-bootstrap-options: \
1657 have-watchdog=false \
1658 dc-version=1.1.18-2b07d5c5a9 \
1659 cluster-infrastructure=corosync \
1660 cluster-name=clubionic \
1661 stonith-enabled=on \
1662 stonith-action=off \
1663 no-quorum-policy=stop \
1664 last-lrm-refresh=1583529396
1665
1666 rafaeldtinoco@clubionic01:~$ sudo crm resource stop webservergroup
1667 rafaeldtinoco@clubionic01:~$ sudo crm conf delete webservergroup
1668
1669 rafaeldtinoco@clubionic01:~$ sudo crm resource stop webserver
1670 rafaeldtinoco@clubionic01:~$ sudo crm conf delete webserver
1671
1672 rafaeldtinoco@clubionic01:~$ sudo crm resource stop virtual_ip
1673 rafaeldtinoco@clubionic01:~$ sudo crm conf delete virtual_ip
1674
1675 rafaeldtinoco@clubionic01:~$ sudo crm resource stop lvm2
1676 rafaeldtinoco@clubionic01:~$ sudo crm conf delete lvm2
1677
1678 rafaeldtinoco@clubionic01:~$ sudo crm resource stop ext4
1679 rafaeldtinoco@clubionic01:~$ sudo crm conf delete ext4
1680
1681 rafaeldtinoco@clubionic01:~$ crm conf sh
1682 node 1: clubionic01
1683 node 2: clubionic02
1684 node 3: clubionic03
1685 primitive fence_clubionic stonith:fence_scsi \
1686 params pcmk_host_list="clubionic01 clubionic02 clubionic03" \
1687 plug="" devices="/dev/sda" meta provides=unfencing target-role=Started
1688 property cib-bootstrap-options: \
1689 have-watchdog=false \
1690 dc-version=1.1.18-2b07d5c5a9 \
1691 cluster-infrastructure=corosync \
1692 cluster-name=clubionic \
1693 stonith-enabled=on \
1694 stonith-action=off \
1695 no-quorum-policy=stop \
1696 last-lrm-refresh=1583529396
1697
1698 Now we are ready to create the needed resources.
1699
1700 --------------------
1701
1702 Because we now want multiple cluster nodes to simultaneously access LVM
1703 volumes, in an active/active way, we have to install "clvm". This package
1704 provides the clustering interface for lvm2, when used with a corosync-based
1705 (e.g. Pacemaker) cluster infrastructure. It allows logical volumes to be
1706 created on shared storage devices (e.g. Fibre Channel or iSCSI).
1707
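A minimal sketch of installing it on all 3 nodes, assuming the "clvm" package
from the Ubuntu archive:

rafaeldtinoco@clubionic01:~$ sudo apt-get install -y clvm
rafaeldtinoco@clubionic02:~$ sudo apt-get install -y clvm
rafaeldtinoco@clubionic03:~$ sudo apt-get install -y clvm
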
1708 rafaeldtinoco@clubionic01:~$ egrep "^\s+locking_type" /etc/lvm/lvm.conf
1709 locking_type = 1
1710
1711 The type being:
1712
1713 0 = no locking
1714 1 = local file-based locking
1715 2 = external shared lib locking_library
1716 3 = built-in clustered locking with clvmd
1717 - disable use_lvmetad and lvmetad service (incompatible)
1718 4 = read-only locking (forbids metadata changes)
1719 5 = dummy locking
1720
1721 Let's change the LVM locking type to clustered on all 3 nodes:
1722
1723 rafaeldtinoco@clubionic01:~$ sudo lvmconf --enable-cluster
1724 rafaeldtinoco@clubionic02:~$ ...
1725 rafaeldtinoco@clubionic03:~$ ...
1726
1727 rafaeldtinoco@clubionic01:~$ egrep "^\s+locking_type" /etc/lvm/lvm.conf
1728 rafaeldtinoco@clubionic02:~$ ...
1729 rafaeldtinoco@clubionic03:~$ ...
1730 locking_type = 3
1731
1732 rafaeldtinoco@clubionic01:~$ sudo systemctl disable lvm2-lvmetad.service
1733 rafaeldtinoco@clubionic02:~$ ...
1734 rafaeldtinoco@clubionic03:~$ ...
1735
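If lvmetad is still running at this point, it is probably worth stopping it as
well; a minimal sketch, assuming the unit names shipped with Ubuntu's lvm2:

rafaeldtinoco@clubionic01:~$ sudo systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
rafaeldtinoco@clubionic02:~$ ...
rafaeldtinoco@clubionic03:~$ ...
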
1736 Finally, enable the clustered LVM (and DLM) resources in the cluster:
1737
1738 # clubionic01 storage resources
1739
1740 crm(live)configure# primitive clubionic01_dlm ocf:pacemaker:controld op \
1741 monitor interval=10s on-fail=fence interleave=true ordered=true
1742
1743 crm(live)configure# primitive clubionic01_lvm ocf:heartbeat:clvm op \
1744 monitor interval=10s on-fail=fence interleave=true ordered=true
1745
1746 crm(live)configure# group clubionic01_storage clubionic01_dlm clubionic01_lvm
1747
1748 crm(live)configure# location l_clubionic01_storage clubionic01_storage \
1749 rule -inf: #uname ne clubionic01
1750
1751 # clubionic02 storage resources
1752
1753 crm(live)configure# primitive clubionic02_dlm ocf:pacemaker:controld op \
1754 monitor interval=10s on-fail=fence interleave=true ordered=true
1755
1756 crm(live)configure# primitive clubionic02_lvm ocf:heartbeat:clvm op \
1757 monitor interval=10s on-fail=fence interleave=true ordered=true
1758
1759 crm(live)configure# group clubionic02_storage clubionic02_dlm clubionic02_lvm
1760
1761 crm(live)configure# location l_clubionic02_storage clubionic02_storage \
1762 rule -inf: #uname ne clubionic02
1763
1764 # clubionic03 storage resources
1765
1766 crm(live)configure# primitive clubionic03_dlm ocf:pacemaker:controld op \
1767 monitor interval=10s on-fail=fence interleave=true ordered=true
1768
1769 crm(live)configure# primitive clubionic03_lvm ocf:heartbeat:clvm op \
1770 monitor interval=10s on-fail=fence interleave=true ordered=true
1771
1772 crm(live)configure# group clubionic03_storage clubionic03_dlm clubionic03_lvm
1773
1774 crm(live)configure# location l_clubionic03_storage clubionic03_storage \
1775 rule -inf: #uname ne clubionic03
1776
1777 crm(live)configure# commit
1778
1779 Note: I created the resource groups one by one and constrained each of them to
1780 run on a single node. This is basically to guarantee that all nodes will have
1781 the "clvmd" and "dlm_controld" services always running (or restarted in case
1782 of issues).
1783
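For reference, the more usual way of running dlm_controld and clvmd on every
node is a single cloned group instead of three per-node groups. A minimal
sketch of that alternative (not used here; the resource names dlm, clvmd,
g_storage and cl_storage are mine):

crm(live)configure# primitive dlm ocf:pacemaker:controld \
        op monitor interval=10s on-fail=fence
crm(live)configure# primitive clvmd ocf:heartbeat:clvm \
        op monitor interval=10s on-fail=fence
crm(live)configure# group g_storage dlm clvmd
crm(live)configure# clone cl_storage g_storage meta interleave=true
crm(live)configure# commit
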
1784 rafaeldtinoco@clubionic01:~$ crm_mon -1
1785 Stack: corosync
1786 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1787 Last updated: Mon Mar 9 02:18:51 2020
1788 Last change: Mon Mar 9 02:17:58 2020 by root via cibadmin on clubionic01
1789
1790 3 nodes configured
1791 7 resources configured
1792
1793 Online: [ clubionic01 clubionic02 clubionic03 ]
1794
1795 Active resources:
1796
1797 fence_clubionic (stonith:fence_scsi): Started clubionic02
1798 Resource Group: clubionic01_storage
1799 clubionic01_dlm (ocf::pacemaker:controld): Started clubionic01
1800 clubionic01_lvm (ocf::heartbeat:clvm): Started clubionic01
1801 Resource Group: clubionic02_storage
1802 clubionic02_dlm (ocf::pacemaker:controld): Started clubionic02
1803 clubionic02_lvm (ocf::heartbeat:clvm): Started clubionic02
1804 Resource Group: clubionic03_storage
1805 clubionic03_dlm (ocf::pacemaker:controld): Started clubionic03
1806 clubionic03_lvm (ocf::heartbeat:clvm): Started clubionic03
1807
1808 So... now we are ready to have a clustered filesystem running in this cluster!
1809
1810 --------------------
1811
1812 Before creating the "clustered" volume group in LVM, I'm going to remove the
1813 previous volume group and volumes we had:
1814
1815 rafaeldtinoco@clubionic03:~$ sudo vgchange -an clustervg
1816
1817 rafaeldtinoco@clubionic03:~$ sudo vgremove clustervg
1818
1819 rafaeldtinoco@clubionic03:~$ sudo pvremove /dev/sda1
1820
1821 And re-create them as "clustered":
1822
1823 rafaeldtinoco@clubionic03:~$ sudo pvcreate /dev/sda1
1824
1825 rafaeldtinoco@clubionic03:~$ sudo vgcreate -Ay -cy --shared clustervg /dev/sda1
1826
1827 From man page:
1828
1829 --shared
1830
1831 Create a shared VG using lvmlockd if LVM is compiled with lockd support.
1832 lvmlockd will select lock type sanlock or dlm depending on which lock
1833 manager is running. This allows multiple hosts to share a VG on shared
1834 devices. lvmlockd and a lock manager must be configured and running.
1835
1836 rafaeldtinoco@clubionic03:~$ sudo vgs
1837 VG #PV #LV #SN Attr VSize VFree
1838 clustervg 1 0 0 wz--nc 988.00m 988.00m
1839
1840 rafaeldtinoco@clubionic03:~$ sudo lvcreate -l 100%FREE -n clustervol clustervg
1841
1842 --------------------
1843
1844 rafaeldtinoco@clubionic01:~$ sudo apt-get install -y gfs2-utils
1845 
1846 rafaeldtinoco@clubionic02:~$ sudo apt-get install -y gfs2-utils
1847 
1848 rafaeldtinoco@clubionic03:~$ sudo apt-get install -y gfs2-utils
1849
1850 mkfs.gfs2 -j3 -p lock_dlm -t clubionic:clustervol /dev/clustervg/clustervol
1851 
1852 - 3 journals (1 per node is the minimum)
1853 - use lock_dlm as the locking protocol
1854 - -t clustername:lockspace
1855
1856 The "lock table" pair used to uniquely identify this filesystem in
1857 a cluster. The cluster name segment (maximum 32 characters)
1858 must match the name given to your cluster in its configuration;
1859 only members of this cluster are permitted to use this file
1860 system. The lockspace segment (maximum 30 characters) is a unique
1861 file system name used to distinguish this gfs2 file system. Valid
1862 clusternames and lockspaces may only contain alphanumeric
1863 characters, hyphens (-) and underscores (_).
1864
1865 rafaeldtinoco@clubionic01:~$ sudo mkfs.gfs2 -j3 -p lock_dlm \
1866 -t clubionic:clustervol /dev/clustervg/clustervol
1867
1868 Are you sure you want to proceed? [y/n]y
1869 Discarding device contents (may take a while on large devices): Done
1870 Adding journals: Done
1871 Building resource groups: Done
1872 Creating quota file: Done
1873 Writing superblock and syncing: Done
1874 Device: /dev/clustervg/clustervol
1875 Block size: 4096
1876 Device size: 0.96 GB (252928 blocks)
1877 Filesystem size: 0.96 GB (252927 blocks)
1878 Journals: 3
1879 Resource groups: 6
1880 Locking protocol: "lock_dlm"
1881 Lock table: "clubionic:clustervol"
1882 UUID: dac96896-bd83-d9f4-c0cb-e118f5572e0e
1883
1884 rafaeldtinoco@clubionic01:~$ sudo mount /dev/clustervg/clustervol /clusterdata
1885 rafaeldtinoco@clubionic01:~$ sudo umount /clusterdata
1886 
1887 rafaeldtinoco@clubionic02:~$ sudo mount /dev/clustervg/clustervol /clusterdata
1888 rafaeldtinoco@clubionic02:~$ sudo umount /clusterdata
1889 
1890 rafaeldtinoco@clubionic03:~$ sudo mount /dev/clustervg/clustervol /clusterdata
1891 rafaeldtinoco@clubionic03:~$ sudo umount /clusterdata
1892
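NOTE: -j3 creates one journal per node, which is the minimum for a 3-node
cluster. If you ever add a node, the filesystem needs an extra journal before
that node can mount it; a minimal sketch, run with the filesystem mounted:

rafaeldtinoco@clubionic01:~$ sudo gfs2_jadd -j1 /clusterdata
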
1893 --------------------
1894
1895 Now, since we want to add a new resource to an already existing resource group,
1896 I prefer to run "crm configure edit" and manually edit the cluster
1897 configuration to the following (or something similar in your case):
1898
1899 node 1: clubionic01
1900 node 2: clubionic02
1901 node 3: clubionic03
1902 primitive clubionic01_dlm ocf:pacemaker:controld \
1903 op monitor interval=10s on-fail=fence interleave=true ordered=true
1904 primitive clubionic01_gfs2 Filesystem \
1905 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1906 fstype=gfs2 options=noatime \
1907 op monitor interval=10s on-fail=fence interleave=true
1908 primitive clubionic01_lvm clvm \
1909 op monitor interval=10s on-fail=fence interleave=true ordered=true
1910 primitive clubionic02_dlm ocf:pacemaker:controld \
1911 op monitor interval=10s on-fail=fence interleave=true ordered=true
1912 primitive clubionic02_gfs2 Filesystem \
1913 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1914 fstype=gfs2 options=noatime \
1915 op monitor interval=10s on-fail=fence interleave=true
1916 primitive clubionic02_lvm clvm \
1917 op monitor interval=10s on-fail=fence interleave=true ordered=true
1918 primitive clubionic03_dlm ocf:pacemaker:controld \
1919 op monitor interval=10s on-fail=fence interleave=true ordered=true
1920 primitive clubionic03_gfs2 Filesystem \
1921 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1922 fstype=gfs2 options=noatime \
1923 op monitor interval=10s on-fail=fence interleave=true
1924 primitive clubionic03_lvm clvm \
1925 op monitor interval=10s on-fail=fence interleave=true ordered=true
1926 primitive fence_clubionic stonith:fence_scsi \
1927 params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
1928 devices="/dev/sda" meta provides=unfencing target-role=Started
1929 group clubionic01_storage clubionic01_dlm clubionic01_lvm clubionic01_gfs2
1930 group clubionic02_storage clubionic02_dlm clubionic02_lvm clubionic02_gfs2
1931 group clubionic03_storage clubionic03_dlm clubionic03_lvm clubionic03_gfs2
1932 location l_clubionic01_storage clubionic01_storage \
1933 rule -inf: #uname ne clubionic01
1934 location l_clubionic02_storage clubionic02_storage \
1935 rule -inf: #uname ne clubionic02
1936 location l_clubionic03_storage clubionic03_storage \
1937 rule -inf: #uname ne clubionic03
1938 property cib-bootstrap-options: \
1939 have-watchdog=false \
1940 dc-version=1.1.18-2b07d5c5a9 \
1941 cluster-infrastructure=corosync \
1942 cluster-name=clubionic \
1943 stonith-enabled=on \
1944 stonith-action=off \
1945 no-quorum-policy=stop \
1946 last-lrm-refresh=1583708321
1947 # vim: set filetype=pcmk:
1948
1949 NOTE:
1950
1951 1. I have created the following resources:
1952
1953 - clubionic01_gfs2
1954 - clubionic02_gfs2
1955 - clubionic03_gfs2
1956
1957 and added each of them to its corresponding group.
1958
1959 The final result is:
1960
1961 rafaeldtinoco@clubionic02:~$ crm_mon -1
1962 Stack: corosync
1963 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1964 Last updated: Mon Mar 9 03:26:43 2020
1965 Last change: Mon Mar 9 03:24:14 2020 by root via cibadmin on clubionic01
1966
1967 3 nodes configured
1968 10 resources configured
1969
1970 Online: [ clubionic01 clubionic02 clubionic03 ]
1971
1972 Active resources:
1973
1974 fence_clubionic (stonith:fence_scsi): Started clubionic02
1975 Resource Group: clubionic01_storage
1976 clubionic01_dlm (ocf::pacemaker:controld): Started clubionic01
1977 clubionic01_lvm (ocf::heartbeat:clvm): Started clubionic01
1978 clubionic01_gfs2 (ocf::heartbeat:Filesystem): Started clubionic01
1979 Resource Group: clubionic02_storage
1980 clubionic02_dlm (ocf::pacemaker:controld): Started clubionic02
1981 clubionic02_lvm (ocf::heartbeat:clvm): Started clubionic02
1982 clubionic02_gfs2 (ocf::heartbeat:Filesystem): Started clubionic02
1983 Resource Group: clubionic03_storage
1984 clubionic03_dlm (ocf::pacemaker:controld): Started clubionic03
1985 clubionic03_lvm (ocf::heartbeat:clvm): Started clubionic03
1986 clubionic03_gfs2 (ocf::heartbeat:Filesystem): Started clubionic03
1987
1988 And each node now has the GFS2 filesystem properly mounted:
1989
1990 rafaeldtinoco@clubionic01:~$ for node in clubionic01 clubionic02 \
1991 clubionic03; do ssh $node "df -kh | grep cluster"; done
1992
1993 /dev/mapper/clustervg-clustervol 988M 388M 601M 40% /clusterdata
1994 /dev/mapper/clustervg-clustervol 988M 388M 601M 40% /clusterdata
1995 /dev/mapper/clustervg-clustervol 988M 388M 601M 40% /clusterdata
1996
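A quick way of seeing the clustered filesystem in action is to write a file
from one node and read it from another, for example:

rafaeldtinoco@clubionic01:~$ echo "written from clubionic01" | \
    sudo tee /clusterdata/test.txt
written from clubionic01

rafaeldtinoco@clubionic02:~$ cat /clusterdata/test.txt
written from clubionic01

rafaeldtinoco@clubionic02:~$ sudo rm /clusterdata/test.txt
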
1997 --------------------
1998
1999 Now we can go back to the previous (and original) idea of having lighttpd
2000 resources serving files from the same shared filesystem.
2001
2002 NOTES
2003
2004 1. So... this is just an example, and the setup isn't particularly useful for
2005 anything other than showing pacemaker working in this kind of environment. I'm
2006 enabling 3 instances of lighttpd through the "systemd" resource standard, and
2007 the lighttpd unit very likely does not support multiple instances on the same
2008 node.
2009 
2010 2. That is why I'm not allowing the instances to run on all nodes. With the
2011 right resource agent you could make the instances, and their virtual IPs,
2012 migrate among all nodes if one of them fails.
2013 
2014 3. Instead of having 3 lighttpd instances here you could have 1 lighttpd, 1
2015 postfix and 1 mysql instance, all of them floating among the cluster
2016 nodes with no particular preference, for example. All 3 instances
2017 would be able to access the same clustered filesystem mounted at /clusterdata.
2018 --------------------
2019
2020 rafaeldtinoco@clubionic01:~$ crm config show | cat -
2021 node 1: clubionic01
2022 node 2: clubionic02
2023 node 3: clubionic03
2024 primitive clubionic01_dlm ocf:pacemaker:controld \
2025 op monitor interval=10s on-fail=fence interleave=true ordered=true
2026 primitive clubionic01_gfs2 Filesystem \
2027 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
2028 fstype=gfs2 options=noatime \
2029 op monitor interval=10s on-fail=fence interleave=true
2030 primitive clubionic01_lvm clvm \
2031 op monitor interval=10s on-fail=fence interleave=true ordered=true
2032 primitive clubionic02_dlm ocf:pacemaker:controld \
2033 op monitor interval=10s on-fail=fence interleave=true ordered=true
2034 primitive clubionic02_gfs2 Filesystem \
2035 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
2036 fstype=gfs2 options=noatime \
2037 op monitor interval=10s on-fail=fence interleave=true
2038 primitive clubionic02_lvm clvm \
2039 op monitor interval=10s on-fail=fence interleave=true ordered=true
2040 primitive clubionic03_dlm ocf:pacemaker:controld \
2041 op monitor interval=10s on-fail=fence interleave=true ordered=true
2042 primitive clubionic03_gfs2 Filesystem \
2043 params device="/dev/clustervg/clustervol" directory="/clusterdata" \
2044 fstype=gfs2 options=noatime \
2045 op monitor interval=10s on-fail=fence interleave=true
2046 primitive clubionic03_lvm clvm \
2047 op monitor interval=10s on-fail=fence interleave=true ordered=true
2048 primitive fence_clubionic stonith:fence_scsi \
2049 params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
2050 devices="/dev/sda" \
2051 meta provides=unfencing target-role=Started
2052 primitive instance01_ip IPaddr2 \
2053 params ip=10.250.98.13 nic=eth3 \
2054 op monitor interval=10s
2055 primitive instance01_web systemd:lighttpd \
2056 op monitor interval=10 timeout=30
2057 primitive instance02_ip IPaddr2 \
2058 params ip=10.250.98.14 nic=eth3 \
2059 op monitor interval=10s
2060 primitive instance02_web systemd:lighttpd \
2061 op monitor interval=10 timeout=30
2062 primitive instance03_ip IPaddr2 \
2063 params ip=10.250.98.15 nic=eth3 \
2064 op monitor interval=10s
2065 primitive instance03_web systemd:lighttpd \
2066 op monitor interval=10 timeout=30
2067 group clubionic01_storage clubionic01_dlm clubionic01_lvm clubionic01_gfs2
2068 group clubionic02_storage clubionic02_dlm clubionic02_lvm clubionic02_gfs2
2069 group clubionic03_storage clubionic03_dlm clubionic03_lvm clubionic03_gfs2
2070 group instance01 instance01_web instance01_ip
2071 group instance02 instance02_web instance02_ip
2072 group instance03 instance03_web instance03_ip
2073 location l_clubionic01_storage clubionic01_storage \
2074 rule -inf: #uname ne clubionic01
2075 location l_clubionic02_storage clubionic02_storage \
2076 rule -inf: #uname ne clubionic02
2077 location l_clubionic03_storage clubionic03_storage \
2078 rule -inf: #uname ne clubionic03
2079 location l_instance01 instance01 \
2080 rule -inf: #uname ne clubionic01
2081 location l_instance02 instance02 \
2082 rule -inf: #uname ne clubionic02
2083 location l_instance03 instance03 \
2084 rule -inf: #uname ne clubionic03
2085 property cib-bootstrap-options: \
2086 have-watchdog=false \
2087 dc-version=1.1.18-2b07d5c5a9 \
2088 cluster-infrastructure=corosync \
2089 cluster-name=clubionic \
2090 stonith-enabled=on \
2091 stonith-action=off \
2092 no-quorum-policy=stop \
2093 last-lrm-refresh=1583708321
2094
2095 rafaeldtinoco@clubionic01:~$ crm_mon -1
2096 Stack: corosync
2097 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
2098 Last updated: Mon Mar 9 03:42:11 2020
2099 Last change: Mon Mar 9 03:39:32 2020 by root via cibadmin on clubionic01
2100
2101 3 nodes configured
2102 16 resources configured
2103
2104 Online: [ clubionic01 clubionic02 clubionic03 ]
2105
2106 Active resources:
2107
2108 fence_clubionic (stonith:fence_scsi): Started clubionic02
2109 Resource Group: clubionic01_storage
2110 clubionic01_dlm (ocf::pacemaker:controld): Started clubionic01
2111 clubionic01_lvm (ocf::heartbeat:clvm): Started clubionic01
2112 clubionic01_gfs2 (ocf::heartbeat:Filesystem): Started clubionic01
2113 Resource Group: clubionic02_storage
2114 clubionic02_dlm (ocf::pacemaker:controld): Started clubionic02
2115 clubionic02_lvm (ocf::heartbeat:clvm): Started clubionic02
2116 clubionic02_gfs2 (ocf::heartbeat:Filesystem): Started clubionic02
2117 Resource Group: clubionic03_storage
2118 clubionic03_dlm (ocf::pacemaker:controld): Started clubionic03
2119 clubionic03_lvm (ocf::heartbeat:clvm): Started clubionic03
2120 clubionic03_gfs2 (ocf::heartbeat:Filesystem): Started clubionic03
2121 Resource Group: instance01
2122 instance01_web (systemd:lighttpd): Started clubionic01
2123 instance01_ip (ocf::heartbeat:IPaddr2): Started clubionic01
2124 Resource Group: instance02
2125 instance02_web (systemd:lighttpd): Started clubionic02
2126 instance02_ip (ocf::heartbeat:IPaddr2): Started clubionic02
2127 Resource Group: instance03
2128 instance03_web (systemd:lighttpd): Started clubionic03
2129 instance03_ip (ocf::heartbeat:IPaddr2): Started clubionic03
2130
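One thing the configuration above does not enforce is that each lighttpd
instance only starts after its node's storage group (and therefore
/clusterdata) is up. If you want that, a minimal sketch using order constraints
(the o_instance01..03 names are mine):

crm(live)configure# order o_instance01 inf: clubionic01_storage instance01
crm(live)configure# order o_instance02 inf: clubionic02_storage instance02
crm(live)configure# order o_instance03 inf: clubionic03_storage instance03
crm(live)configure# commit
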
2131 Like we did previously, let's create, on each node, a symbolic link from
2132 /var/www to the shared /clusterdata/www directory.
2133
2134 rafaeldtinoco@clubionic01:~$ sudo ln -s /clusterdata/www /var/www
2135
2136 rafaeldtinoco@clubionic02:~$ sudo ln -s /clusterdata/www /var/www
2137
2138 rafaeldtinoco@clubionic03:~$ sudo ln -s /clusterdata/www /var/www
2139
2140 But now, as this is a clustered filesystem, we have to create the file just
2141 once =) and it will be served by all lighttpd instances, running on all 3
2142 nodes:
2143
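If the www/html directory does not exist yet on the new GFS2 filesystem (right
after mkfs.gfs2 it will not), create it once, from any node:

rafaeldtinoco@clubionic01:~$ sudo mkdir -p /clusterdata/www/html
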
2144 rafaeldtinoco@clubionic01:~$ echo "all instances show the same thing" | \
2145 sudo tee /var/www/html/index.html
2146 all instances show the same thing
2147
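The instance01, instance02 and instance03 names used below are assumed to
resolve to the virtual IPs managed by the cluster (10.250.98.13, .14 and .15).
If they do not resolve in your environment, a minimal sketch of adding them to
/etc/hosts on the node you are testing from:

rafaeldtinoco@clubionic01:~$ cat <<EOF | sudo tee -a /etc/hosts
10.250.98.13    instance01
10.250.98.14    instance02
10.250.98.15    instance03
EOF
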
2148 Check it out:
2149
2150 rafaeldtinoco@clubionic01:~$ curl http://instance01/
2151 all instances show the same thing
2152
2153 rafaeldtinoco@clubionic01:~$ curl http://instance02/
2154 all instances show the same thing
2155
2156 rafaeldtinoco@clubionic01:~$ curl http://instance03/
2157 all instances show the same thing
2158
2159 And voilà =)
2160
2161 You now have a pretty cool cluster to play with! Congrats!
2162
2163 Rafael D. Tinoco
2164 rafaeldtinoco@ubuntu.com
2165 Ubuntu Linux Core Engineer