   1 HOWTO: Ubuntu High Availability - Shared SCSI Disk only Environment
   2        (Azure and other Environments)
   3 
   4                               --------------------
   5 
This is a mini tutorial showing how to deploy a High Availability cluster in
an environment that supports SCSI shared disks. Instead of relying on APIs of
public or private clouds, for example, to fence the virtual machines being
clustered, this example relies only on the SCSI shared-disk feature, making it
a good fit for virtual and/or physical machines that have shared SCSI disks.
  12 
  13 NOTES:
  14 
    1. I wrote this document with the Microsoft Azure cloud environment in
    mind, which is why the beginning of this document shows how to get a SHARED
    SCSI DISK in an Azure environment. The clustering examples given below will
    work in any environment, physical or virtual.
  19 
    2. If you want to skip the cloud provider configuration, just search for
    the BEGIN keyword and you will be taken to the cluster and OS specifics.
  22 
  23                               --------------------
  24 
  Like all High Availability clusters, this one needs a way to guarantee
  consistency among the different cluster resources. Clusters usually do that
  through fencing mechanisms: a way to guarantee that the other nodes are *not*
  accessing the resources before the services running on them, and managed by
  the cluster, are taken over.
  30 
  If you are following this mini tutorial in a Microsoft Azure environment,
  keep in mind that this example needs the Microsoft Azure Shared Disk feature:
  33   - docs.microsoft.com/en-us/azure/virtual-machines/windows/disks-shared-enable
  34 
  35   And the Linux Kernel Module called "softdog":
  36   - /lib/modules/xxxxxx-azure/kernel/drivers/watchdog/softdog.ko
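
  Before going further you can confirm the module is available for the kernel
  you are running; a minimal check (the module is only loaded later, when the
  "watchdog" package described below is configured):

  $ find /lib/modules/$(uname -r) -name 'softdog.ko*'
  $ modinfo softdog | head -3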
  37 
  38                               --------------------
  39 
  40 Azure clubionicshared01 disk json file "shared-disk.json":
  41 
  42 {
  43     "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  44     "contentVersion": "1.0.0.0",
  45     "parameters": {
  46         "diskName": {
  47             "type": "string",
  48             "defaultValue": "clubionicshared01"
  49         },
  50         "diskSizeGb": {
  51             "type": "int",
  52             "defaultValue": 1024
  53         },
  54         "maxShares": {
  55             "type": "int",
  56             "defaultValue": 4
  57         }
  58     },
  59     "resources": [
  60         {
  61             "apiVersion": "2019-07-01",
  62             "type": "Microsoft.Compute/disks",
  63             "name": "[parameters('diskName')]",
  64             "location": "westcentralus",
  65             "sku": {
  66                 "name": "Premium_LRS"
  67             },
  68             "properties": {
  69                 "creationData": {
  70                     "createOption": "Empty"
  71                 },
  72                 "diskSizeGB": "[parameters('diskSizeGb')]",
  73                 "maxShares": "[parameters('maxShares')]"
  74             },
  75             "tags": {}
  76         }
  77     ]
  78 }
  79                               --------------------
  80 
  81 Command to create the resource in a resource-group called "clubionic":
  82 
  83 $ az group deployment create --resource-group clubionic \
  84     --template-file ./shared-disk.json
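
You can confirm the disk was created with the expected maxShares value and,
once the VMs exist, attach it to each of them. A sketch (flag names can vary
slightly between az CLI versions, so check "az vm disk attach --help"):

$ az disk show --resource-group clubionic --name clubionicshared01 \
    --query maxShares

$ az vm disk attach --resource-group clubionic --vm-name clubionic01 \
    --name clubionicshared01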
  85 
  86                               --------------------
  87 Basics:
  88 
  89 - You will create a resource-group called "clubionic" with the following
  90   resources at first:
  91 
  92 clubionicplacement      Proximity placement group
  93 
  94 clubionicnet            Virtual Network
  95     subnets:
  96     private             10.250.3.0/24
    public              10.250.98.0/24
  98 
  99 clubionic01             Virtual machine
 100 clubionic01-ip          Public IP address
 101 clubionic01private      Network interface
 102 clubionic01public       Network interface (clubionic01-ip associated)
 103 clubionic01_OsDisk...   OS Disk (automatic creation)
 104 
 105 clubionic02             Virtual machine
 106 clubionic02-ip          Public IP address
 107 clubionic02private      Network interface
 108 clubionic02public       Network interface (clubionic02-ip associated)
 109 clubionic02_OsDisk...   OS Disk (automatic creation)
 110 
 111 clubionic03             Virtual machine
 112 clubionic03-ip          Public IP address
 113 clubionic03private      Network interface
 114 clubionic03public       Network interface (clubionic03-ip associated)
 115 clubionic03_OsDisk...   OS Disk (automatic creation)
 116 
 117 clubionicshared01       Shared Disk (created using cmdline and json file)
 118 
 119 rafaeldtinocodiag       Storage account (needed for console access)
 120 
 121                               --------------------
 122 
The initial idea is to create the network resources:

 - clubionic{01,02,03}{public,private} network interfaces
 - clubionic{01,02,03}-ip public IP addresses
 - associate each clubionic{01,02,03}-ip to its clubionic{01,02,03}public interface

And then create the clubionicshared01 disk (using the json file shown above).

After those are created, the next step is to create the 3 needed virtual
machines with the proper resources, as shown above, so we can move on with the
cluster configuration.
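
A rough sketch of the az CLI calls for one of the nodes (the flags and the
"UbuntuLTS" image alias are assumptions that depend on your az CLI version;
adjust names to match the resource list above):

$ az network nic create --resource-group clubionic --name clubionic01private \
    --vnet-name clubionicnet --subnet private

$ az network nic create --resource-group clubionic --name clubionic01public \
    --vnet-name clubionicnet --subnet public \
    --public-ip-address clubionic01-ip

$ az vm create --resource-group clubionic --name clubionic01 \
    --image UbuntuLTS --size Standard_D2s_v3 --ppg clubionicplacement \
    --nics clubionic01public clubionic01private \
    --admin-username rafaeldtinoco --generate-ssh-keys \
    --custom-data cloud-init.yaml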
 134 
 135                               --------------------
 136 
I have created a small cloud-init file that can be used in the "Advanced" tab
of the VM creation screens (you can copy and paste it there):
 139 
 140 #cloud-config
 141 package_upgrade: true
 142 packages:
 143   - man
 144   - manpages
 145   - hello
 146   - locales
 147   - less
 148   - vim
 149   - jq
 150   - uuid
 151   - bash-completion
 152   - sudo
 153   - rsync
 154   - bridge-utils
 155   - net-tools
 156   - vlan
 157   - ncurses-term
 158   - iputils-arping
 159   - iputils-ping
 160   - iputils-tracepath
 161   - traceroute
 162   - mtr-tiny
 163   - tcpdump
 164   - dnsutils
 165   - ssh-import-id
 166   - openssh-server
 167   - openssh-client
 168   - software-properties-common
 169   - build-essential
 170   - devscripts
 171   - ubuntu-dev-tools
 172   - linux-headers-generic
 173   - gdb
 174   - strace
 175   - ltrace
 176   - lsof
 177   - sg3-utils
 178 write_files:
 179   - path: /etc/ssh/sshd_config
 180     content: |
 181       Port 22
 182       AddressFamily any
 183       SyslogFacility AUTH
 184       LogLevel INFO
 185       PermitRootLogin yes
 186       PubkeyAuthentication yes
 187       PasswordAuthentication yes
 188       ChallengeResponseAuthentication no
 189       GSSAPIAuthentication no
 190       HostbasedAuthentication no
 191       PermitEmptyPasswords no
 192       UsePAM yes
 193       IgnoreUserKnownHosts yes
 194       IgnoreRhosts yes
 195       X11Forwarding yes
 196       X11DisplayOffset 10
 197       X11UseLocalhost yes
 198       PermitTTY yes
 199       PrintMotd no
 200       TCPKeepAlive yes
 201       ClientAliveInterval 5
 202       PermitTunnel yes
 203       Banner none
 204       AcceptEnv LANG LC_* EDITOR PAGER SYSTEMD_EDITOR
 205       Subsystem     sftp /usr/lib/openssh/sftp-server
 206   - path: /etc/ssh/ssh_config
 207     content: |
 208       Host *
 209         ForwardAgent no
 210         ForwardX11 no
 211         PasswordAuthentication yes
 212         CheckHostIP no
 213         AddressFamily any
 214         SendEnv LANG LC_* EDITOR PAGER
 215         StrictHostKeyChecking no
 216         HashKnownHosts yes
 217   - path: /etc/sudoers
 218     content: |
 219         Defaults env_keep += "LANG LANGUAGE LINGUAS LC_* _XKB_CHARSET"
 220         Defaults env_keep += "HOME EDITOR SYSTEMD_EDITOR PAGER"
 221         Defaults env_keep += "XMODIFIERS GTK_IM_MODULE QT_IM_MODULE QT_IM_SWITCHER"
 222         Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
 223         Defaults logfile=/var/log/sudo.log,loglinelen=0
 224         Defaults !syslog, !pam_session
 225         root ALL=(ALL) NOPASSWD: ALL
 226         %wheel ALL=(ALL) NOPASSWD: ALL
 227         %sudo ALL=(ALL) NOPASSWD: ALL
 228         rafaeldtinoco ALL=(ALL) NOPASSWD: ALL
 229 runcmd:
 230   - systemctl stop snapd.service
 231   - systemctl stop unattended-upgrades
 232   - systemctl stop systemd-remount-fs
  - systemctl reset-failed
 234   - passwd -d root
 235   - passwd -d rafaeldtinoco
 236   - echo "debconf debconf/priority select low" | sudo debconf-set-selections
 237   - DEBIAN_FRONTEND=noninteractive dpkg-reconfigure debconf
 238   - DEBIAN_FRONTEND=noninteractive apt-get update -y
 239   - DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade -y
 240   - DEBIAN_FRONTEND=noninteractive apt-get autoremove -y
 241   - DEBIAN_FRONTEND=noninteractive apt-get autoclean -y
 242   - systemctl disable systemd-remount-fs
 243   - systemctl disable unattended-upgrades
 244   - systemctl disable apt-daily-upgrade.timer
 245   - systemctl disable apt-daily.timer
 246   - systemctl disable accounts-daemon.service
 247   - systemctl disable motd-news.timer
 248   - systemctl disable irqbalance.service
 249   - systemctl disable rsync.service
 250   - systemctl disable ebtables.service
 251   - systemctl disable pollinate.service
 252   - systemctl disable ufw.service
 253   - systemctl disable apparmor.service
 254   - systemctl disable apport-autoreport.path
 255   - systemctl disable apport-forward.socket
 256   - systemctl disable iscsi.service
 257   - systemctl disable open-iscsi.service
 258   - systemctl disable iscsid.socket
 259   - systemctl disable multipathd.socket
 260   - systemctl disable multipath-tools.service
 261   - systemctl disable multipathd.service
 262   - systemctl disable lvm2-monitor.service
 263   - systemctl disable lvm2-lvmpolld.socket
 264   - systemctl disable lvm2-lvmetad.socket
 265 apt:
 266   preserve_sources_list: false
 267   primary:
 268     - arches: [default]
 269       uri: http://us.archive.ubuntu.com/ubuntu
 270   sources_list: |
 271     deb $MIRROR $RELEASE main restricted universe multiverse
 272     deb $MIRROR $RELEASE-updates main restricted universe multiverse
 273     deb $MIRROR $RELEASE-proposed main restricted universe multiverse
 274     deb-src $MIRROR $RELEASE main restricted universe multiverse
 275     deb-src $MIRROR $RELEASE-updates main restricted universe multiverse
 276     deb-src $MIRROR $RELEASE-proposed main restricted universe multiverse
 277   conf: |
 278     Dpkg::Options {
 279       "--force-confdef";
 280       "--force-confold";
 281     };
 282   sources:
 283     debug.list:
 284       source: |
 285         # deb http://ddebs.ubuntu.com $RELEASE main restricted universe multiverse
 286         # deb http://ddebs.ubuntu.com $RELEASE-updates main restricted universe multiverse
 287         # deb http://ddebs.ubuntu.com $RELEASE-proposed main restricted universe multiverse
 288       keyid: C8CAB6595FDFF622
 289 
 290                               --------------------
 291 
 292 After provisioning machines "clubionic01, clubionic02, clubionic03" (Standard
 293 D2s v3 (2 vcpus, 8 GiB memory)) with Linux Ubuntu Bionic (18.04), using the same
 294 resource-group (clubionic), located in "West Central US" AND having the same
 295 proximity placement group (clubionicplacement), you will be able to access all
 296 the VMs through their public IPs... and make sure the shared disk works as a
 297 fencing mechanism by testing SCSI persistent reservations using the "sg3-utils"
 298 tools.
 299 
Run these commands on *at least* 1 node after the shared disk is attached to it:
 301 
 302 # clubionic01
 303 
 304 # read current reservations:
 305 
 306 rafaeldtinoco@clubionic01:~$ sudo sg_persist -r  /dev/sdc
 307   Msft      Virtual Disk      1.0
 308   Peripheral device type: disk
 309   PR generation=0x0, there is NO reservation held
 310 
 311 # register new reservation key 0x123abc:
 312 
 313 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --register \
 314   --param-sark=123abc /dev/sdc
 315   Msft      Virtual Disk      1.0
 316   Peripheral device type: disk
 317 
 318 # To reserve the DEVICE (write exclusive):
 319 
 320 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --reserve \
 321   --param-rk=123abc --prout-type=5 /dev/sdc
 322   Msft      Virtual Disk      1.0
 323   Peripheral device type: disk
 324 
 325 # Check reservation created:
 326 
 327 rafaeldtinoco@clubionic01:~$ sudo sg_persist -r /dev/sdc
 328   Msft      Virtual Disk      1.0
 329   Peripheral device type: disk
 330   PR generation=0x3, Reservation follows:
 331     Key=0x123abc
 332     scope: LU_SCOPE,  type: Write Exclusive, registrants only
 333 
 334 # To release the reservation:
 335 
 336 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --release \
 337   --param-rk=123abc --prout-type=5 /dev/sdc
 338   Msft      Virtual Disk      1.0
 339   Peripheral device type: disk
 340 
 341 # To unregister a reservation key:
 342 
 343 rafaeldtinoco@clubionic01:~$ sudo sg_persist --out --register \
 344   --param-rk=123abc /dev/sdc
 345   Msft      Virtual Disk      1.0
 346   Peripheral device type: disk
 347 
 348 # Make sure reservation is gone:
 349 
 350 rafaeldtinoco@clubionic01:~$ sudo sg_persist -r /dev/sdc
 351   Msft      Virtual Disk      1.0
 352   Peripheral device type: disk
 353   PR generation=0x4, there is NO reservation held
 354 
 355 BEGIN                         --------------------
 356 
Now it is time to configure the cluster network. In the beginning of this
recipe you saw that there were 2 subnets created in the virtual network
assigned to this environment:
 360 
 361 clubionicnet            Virtual network
 362     subnets:
 363     private             10.250.3.0/24
 364     public              10.250.98.0/24
 365 
Since there might be a limit of 2 extra virtual network adapters attached to
your VMs, we are using the *minimum* number of networks required for the HA
cluster to operate in good condition.
 369 
public network: this is the network where the HA cluster virtual IPs will be
placed. Every cluster node will have 1 IP from this subnet assigned to itself
and possibly a floating IP, depending on where the service is running (where
the resource is active).
 374 
private network: this is the "internal-to-cluster" interface, where all the
cluster nodes continuously exchange messages regarding the cluster state. This
network is important because corosync relies on it to know whether the cluster
nodes are online or not. It is also possible to add a 2nd virtual adapter to
each of the nodes, creating a 2nd private network (a 2nd ring in the messaging
layer). This helps avoid false positives in cluster failure detection caused
by network jitter/delays when a single NIC adapter carries all the inter-node
messaging.
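
If you do add that 2nd private network, it shows up later in corosync.conf as
a 2nd ring address per node, roughly like this (a sketch; 10.250.4.0/24 is an
assumed 2nd private subnet, matching the commented ring1_addr lines further
below; with the corosync 2.x shipped in Bionic you would also set rrp_mode,
e.g. "passive", in the totem section):

        node {
                ring0_addr: 10.250.3.10
                ring1_addr: 10.250.4.10
                name: clubionic01
                nodeid: 1
        }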
 383 
 384 Instructions:
 385 
 386 - Provision the 3 VMs with 2 network interfaces each (public & private)
 387 - Make sure that, when started, all 3 of them have an external IP (to access)
 388 - A 4th machine is possible (just to access the env, depending on topology)
- Make sure both the public and private networks are configured as:
 390 
 391 clubionic01:
 392  - public   = 10.250.98.10/24
 393  - private  = 10.250.3.10/24
 394 
 395 clubionic02:
 396  - public   = 10.250.98.11/24
 397  - private  = 10.250.3.11/24
 398 
 399 clubionic03:
 400  - public   = 10.250.98.12/24
 401  - private  = 10.250.3.12/24
 402 
Make sure all interfaces are configured as "static". Then, after powering up
the virtual machines, disable cloud-init's networking configuration AND
configure the interfaces statically, as shown next.
 406 
 407                               --------------------
 408 
Ubuntu Bionic cloud images, deployed by Microsoft Azure to our VMs, come with
the "netplan.io" network tool installed by default, using systemd-networkd as
its backend network provider. This means that all the network interfaces are
configured and managed by systemd.
 413 
 414 Unfortunately, because of the following bug:
 415 
 416 https://bugs.launchpad.net/netplan/+bug/1815101
 417 
(currently being worked on), any HA environment that wants to have "virtual
aliases" on a network interface should rely on the previous "ifupdown" network
management method. This happens because systemd-networkd only recently
"learned" how to deal with restarting interfaces that are being controlled by
HA software and, before that, it used to remove the aliases without cluster
synchronization (fixed in Eoan through the KeepConfiguration= stanza in the
systemd-networkd .network file).
 425 
 426 With that, here are the instructions on how to remove netplan.io AND install
 427 ifupdown + resolvconf packages:
 428 
 429 $ sudo apt-get remove --purge netplan.io
 430 $ sudo apt-get install ifupdown bridge-utils vlan resolvconf
 431 $ sudo apt-get install cloud-init
 432 
 433 $ sudo rm /etc/netplan/50-cloud-init.yaml
 434 $ sudo vi /etc/cloud/cloud.cfg.d/99-custom-networking.cfg
 435 $ sudo cat /etc/cloud/cloud.cfg.d/99-custom-networking.cfg
 436 network: {config: disabled}
 437 
 438 And how to configure the interfaces using ifupdown:
 439 
 440 $ cat /etc/network/interfaces
 441 
 442 auto lo
 443 iface lo inet loopback
 444         dns-nameserver 168.63.129.16
 445 
 446 # public
 447 
 448 auto eth0
 449 iface eth0 inet static
 450         address 10.250.98.10
 451         netmask 255.255.255.0
 452         gateway 10.250.98.1
 453 
 454 # private
 455 
 456 auto eth1
 457 iface eth1 inet static
 458         address 10.250.3.10
 459         netmask 255.255.255.0
 460 
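The same file on clubionic02 and clubionic03 differs only in the addresses;
for example, on clubionic02 (following the addressing table above):

auto lo
iface lo inet loopback
        dns-nameserver 168.63.129.16

# public

auto eth0
iface eth0 inet static
        address 10.250.98.11
        netmask 255.255.255.0
        gateway 10.250.98.1

# private

auto eth1
iface eth1 inet static
        address 10.250.3.11
        netmask 255.255.255.0
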
 461 $ cat /etc/hosts
 462 127.0.0.1 localhost
 463 
 464 ::1 ip6-localhost ip6-loopback
 465 fe00::0 ip6-localnet
 466 ff00::0 ip6-mcastprefix
 467 ff02::1 ip6-allnodes
 468 ff02::2 ip6-allrouters
 469 ff02::3 ip6-allhosts
 470 
 471 And disable systemd-networkd:
 472 
 473 $ sudo systemctl disable systemd-networkd.service \
 474   systemd-networkd.socket systemd-networkd-wait-online.service \
 475   systemd-resolved.service
 476 
 477 $ sudo update-initramfs -k all -u
 478 
 479 And make sure grub configuration is right:
 480 
 481 $ cat /etc/default/grub
 482 GRUB_DEFAULT=0
 483 GRUB_TIMEOUT=5
 484 GRUB_DISTRIBUTOR="Ubuntu"
 485 GRUB_CMDLINE_LINUX_DEFAULT="console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300 elevator=noop apparmor=0"
 486 GRUB_CMDLINE_LINUX=""
 487 GRUB_TERMINAL=serial
 488 GRUB_SERIAL_COMMAND="serial --speed=9600 --unit=0 --word=8 --parity=no --stop=1"
 489 GRUB_RECORDFAIL_TIMEOUT=0
 490 
 491 $ sudo update-grub
 492 
And reboot (stop and start the instance so the new grub cmdline takes effect).
 494 
 495 $ ifconfig -a
 496 eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
 497         inet 10.250.98.10  netmask 255.255.255.0  broadcast 10.250.98.255
 498         inet6 fe80::20d:3aff:fef8:6551  prefixlen 64  scopeid 0x20<link>
 499         ether 00:0d:3a:f8:65:51  txqueuelen 1000  (Ethernet)
 500         RX packets 483  bytes 51186 (51.1 KB)
 501         RX errors 0  dropped 0  overruns 0  frame 0
 502         TX packets 415  bytes 65333 (65.3 KB)
 503         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 504 
 505 eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
 506         inet 10.250.3.10  netmask 255.255.255.0  broadcast 10.250.3.255
 507         inet6 fe80::20d:3aff:fef8:3d01  prefixlen 64  scopeid 0x20<link>
 508         ether 00:0d:3a:f8:3d:01  txqueuelen 1000  (Ethernet)
 509         RX packets 0  bytes 0 (0.0 B)
 510         RX errors 0  dropped 0  overruns 0  frame 0
 511         TX packets 11  bytes 866 (866.0 B)
 512         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 513 
 514 lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
 515         inet 127.0.0.1  netmask 255.0.0.0
 516         inet6 ::1  prefixlen 128  scopeid 0x10<host>
 517         loop  txqueuelen 1000  (Local Loopback)
 518         RX packets 84  bytes 6204 (6.2 KB)
 519         RX errors 0  dropped 0  overruns 0  frame 0
 520         TX packets 84  bytes 6204 (6.2 KB)
 521         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 522 
Note: This has to be done on ALL cluster nodes in order for the HA software,
pacemaker in our case, to correctly manage the interfaces, virtual aliases and
services.
 526                               --------------------
 527 
Now let's start configuring the cluster. First, populate /etc/hosts with all
the names. On all nodes, make sure you have something similar to:
 530 
 531 rafaeldtinoco@clubionic01:~$ cat /etc/hosts
 532 127.0.0.1 localhost
 533 127.0.1.1 clubionic01
 534 
 535 ::1 ip6-localhost ip6-loopback
 536 fe00::0 ip6-localnet
 537 ff00::0 ip6-mcastprefix
 538 ff02::1 ip6-allnodes
 539 ff02::2 ip6-allrouters
 540 ff02::3 ip6-allhosts
 541 
 542 # cluster
 543 
 544 10.250.98.13 clubionic       # floating IP (application)
 545 
 546 10.250.98.10 bionic01        # node01 public IP
 547 10.250.98.11 bionic02        # node02 public IP
 548 10.250.98.12 bionic03        # node03 public IP
 549 
 550 10.250.3.10 clubionic01      # node01 ring0 private IP
 551 10.250.3.11 clubionic02      # node02 ring0 private IP
 552 10.250.3.12 clubionic03      # node03 ring0 private IP
 553 
And make sure all names are reachable from all nodes:
 555 
 556 $ ping clubionic01
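
Or, to check all the names at once, a small loop like this (the names come
from the /etc/hosts file above):

$ for h in clubionic01 clubionic02 clubionic03 bionic01 bionic02 bionic03; do
      ping -c1 -W1 $h > /dev/null && echo "$h: ok" || echo "$h: FAIL"
  done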
 557                               --------------------
 558 
Now let's install the corosync package and make sure we are able to create a
messaging-only (for now) cluster with corosync. Install the packages on all 3
nodes:
 562 
 563 $ sudo apt-get install pacemaker pacemaker-cli-utils corosync corosync-doc \
 564     resource-agents fence-agents crmsh
 565 
 566 With packages properly installed it is time to create the corosync.conf file:
 567 
 568 $ sudo cat /etc/corosync/corosync.conf
 569 totem {
 570         version: 2
 571         secauth: off
 572         cluster_name: clubionic
 573         transport: udpu
 574 }
 575 
 576 nodelist {
 577         node {
 578                 ring0_addr: 10.250.3.10
 579                 # ring1_addr: 10.250.4.10
 580                 name: clubionic01
 581                 nodeid: 1
 582         }
 583         node {
 584                 ring0_addr: 10.250.3.11
 585                 # ring1_addr: 10.250.4.11
 586                 name: clubionic02
 587                 nodeid: 2
 588         }
 589         node {
 590                 ring0_addr: 10.250.3.12
 591                 # ring1_addr: 10.250.4.12
 592                 name: clubionic03
 593                 nodeid: 3
 594         }
 595 }
 596 
 597 quorum {
 598         provider: corosync_votequorum
 599         two_node: 0
 600 }
 601 
 602 qb {
 603         ipc_type: native
 604 }
 605 
 606 logging {
 607 
 608         fileline: on
 609         to_stderr: on
 610         to_logfile: yes
 611         logfile: /var/log/corosync/corosync.log
 612         to_syslog: no
 613         debug: off
 614 }
 615 
But, before restarting corosync with this new configuration, we have to
create a keyfile and share it among all the cluster nodes:
 618 
 619 rafaeldtinoco@clubionic01:~$ sudo corosync-keygen
 620 
 621 Corosync Cluster Engine Authentication key generator.
 622 Gathering 1024 bits for key from /dev/random.
 623 Press keys on your keyboard to generate entropy.
 624 Press keys on your keyboard to generate entropy (bits = 920).
 625 Press keys on your keyboard to generate entropy (bits = 1000).
 626 Writing corosync key to /etc/corosync/authkey.
 627 
 628 rafaeldtinoco@clubionic01:~$ sudo scp /etc/corosync/authkey \
 629         root@clubionic02:/etc/corosync/authkey
 630 
 631 rafaeldtinoco@clubionic01:~$ sudo scp /etc/corosync/authkey \
 632         root@clubionic03:/etc/corosync/authkey
 633 
And now we are ready to enable the corosync service to start by default:
 635 
 636 rafaeldtinoco@clubionic01:~$ systemctl enable --now corosync
 637 rafaeldtinoco@clubionic01:~$ systemctl restart corosync
 638 
 639 rafaeldtinoco@clubionic02:~$ systemctl enable --now corosync
 640 rafaeldtinoco@clubionic02:~$ systemctl restart corosync
 641 
 642 rafaeldtinoco@clubionic03:~$ systemctl enable --now corosync
 643 rafaeldtinoco@clubionic03:~$ systemctl restart corosync
 644 
Finally it is time to check if the messaging layer of our new cluster is good.
Don't worry too much about restarting nodes, as the resource manager
(pacemaker) is not enabled yet and quorum won't be enforced in any way.
 648 
 649 rafaeldtinoco@clubionic01:~$ sudo corosync-quorumtool -si
 650 Quorum information
 651 ------------------
 652 Date:             Mon Feb 24 01:54:10 2020
 653 Quorum provider:  corosync_votequorum
 654 Nodes:            3
 655 Node ID:          1
 656 Ring ID:          1/16
 657 Quorate:          Yes
 658 
 659 Votequorum information
 660 ----------------------
 661 Expected votes:   3
 662 Highest expected: 3
 663 Total votes:      3
 664 Quorum:           2
 665 Flags:            Quorate
 666 
 667 Membership information
 668 ----------------------
 669     Nodeid      Votes Name
 670          1          1 10.250.3.10 (local)
 671          2          1 10.250.3.11
 672          3          1 10.250.3.12
 673 
Perfect! We have the messaging layer ready for the resource manager to be
configured!
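
Another quick check is corosync-cfgtool, which prints the ring status as seen
by the local node; output along these lines is expected on each node (the
addresses are the ring0 ones from corosync.conf):

rafaeldtinoco@clubionic01:~$ sudo corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 10.250.3.10
        status  = ring 0 active with no faults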
 676 
 677                               --------------------
 678 
 679 It is time to configure the resource-manager (pacemaker) now:
 680 
 681 rafaeldtinoco@clubionic01:~$ systemctl enable --now pacemaker
 682 
 683 rafaeldtinoco@clubionic02:~$ systemctl enable --now pacemaker
 684 
 685 rafaeldtinoco@clubionic03:~$ systemctl enable --now pacemaker
 686 
 687 rafaeldtinoco@clubionic01:~$ sudo crm_mon -1
 688 Stack: corosync
 689 Current DC: NONE
 690 Last updated: Mon Feb 24 01:56:11 2020
 691 Last change: Mon Feb 24 01:40:53 2020 by hacluster via crmd on clubionic01
 692 
 693 3 nodes configured
 694 0 resources configured
 695 
 696 Node clubionic01: UNCLEAN (offline)
 697 Node clubionic02: UNCLEAN (offline)
 698 Node clubionic03: UNCLEAN (offline)
 699 
 700 No active resources
 701 
As you can see, we have to wait until the resource manager uses the messaging
transport layer and determines the status of all nodes. Give it a few seconds
to settle and you will have:
 705 
 706 rafaeldtinoco@clubionic01:~$ sudo crm_mon -1
 707 Stack: corosync
 708 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
 709 Last updated: Mon Feb 24 01:57:22 2020
 710 Last change: Mon Feb 24 01:40:54 2020 by hacluster via crmd on clubionic02
 711 
 712 3 nodes configured
 713 0 resources configured
 714 
 715 Online: [ clubionic01 clubionic02 clubionic03 ]
 716 
 717 No active resources
 718                               --------------------
 719 
Perfect! It is time to do some basic setup for pacemaker. Here, in this doc,
I'm using the "crmsh" tool to configure the cluster. For Ubuntu Bionic this is
the preferred way of configuring pacemaker.

At any time you can execute "crmsh" and enter/leave the commands as if they
were directories:
 726 
 727 rafaeldtinoco@clubionic01:~$ sudo crm
 728 
 729 crm(live)# ls
 730 
 731 cibstatus        help             site
 732 cd               cluster          quit
 733 end              script           verify
 734 exit             ra               maintenance
 735 bye              ?                ls
 736 node             configure        back
 737 report           cib              resource
 738 up               status           corosync
 739 options          history
 740 
 741 crm(live)# cd configure
 742 
 743 crm(live)configure# ls
 744 ..               get_property     cibstatus
 745 primitive        set              validate_all
 746 help             rsc_template     ptest
 747 back             cd               default-timeouts
 748 erase            validate-all     rsctest
 749 rename           op_defaults      modgroup
 750 xml              quit             upgrade
 751 group            graph            load
 752 master           location         template
 753 save             collocation      rm
 754 bye              clone            ?
 755 ls               node             default_timeouts
 756 exit             acl_target       colocation
 757 fencing_topology assist           alert
 758 ra               schema           user
 759 simulate         rsc_ticket       end
 760 role             rsc_defaults     monitor
 761 cib              property         resource
 762 edit             show             up
 763 refresh          order            filter
 764 get-property     tag              ms
 765 verify           commit           history
 766 delete
 767 
 768 And you can even edit the CIB file for the cluster:
 769 
 770 rafaeldtinoco@clubionic01:~$ crm configure edit
 771 rafaeldtinoco@clubionic01:~$ crm
 772 crm(live)# cd configure
 773 crm(live)configure# edit
 774 crm(live)configure# commit
 775 INFO: apparently there is nothing to commit
 776 INFO: try changing something first
 777 
 778                               --------------------
 779 
 780 Let's check the current cluster configuration:
 781 
 782 rafaeldtinoco@clubionic01:~$ crm configure show
 783 node 1: clubionic01
 784 node 2: clubionic02
 785 node 3: clubionic03
 786 property cib-bootstrap-options: \
 787         have-watchdog=false \
 788         dc-version=1.1.18-2b07d5c5a9 \
 789         cluster-infrastructure=corosync \
 790         cluster-name=clubionic
 791 
With these basic settings we can see 2 important things before we attempt to
configure any resource: we are missing a "watchdog" device AND there is no
"fencing" configured for the cluster.
 795 
 796 NOTE:
 797 
    1. This is an important note to read. Since we are going to rely on
    pacemaker for our cluster health, it is mandatory that pacemaker knows how
    to decide which side of the cluster should have its resources enabled IF
    there is a rupture in the messaging (internal / ring0) layer. The side with
    more "votes" is the side that will become "active", while the remaining
    node(s) without communication will be "fenced".
 804 
 805 Usually fencing comes in the form of power fencing: The quorate side of the
 806 cluster is able to get a positive response from the fencing mechanism of the
 807 broken side through an external communication path (like a network talking to
 808 ILOs or BMCs).
 809 
For this case, we are going to use the shared SCSI disk and its SCSI-3 feature
called SCSI PERSISTENT RESERVATIONS as the fencing mechanism: every time the
interconnect communication faces a disruption, the quorate side (in this 3-node
example, the side that still has 2 nodes communicating through the private ring
network) will make sure to "fence" the other node using SCSI PERSISTENT
RESERVATIONS (by removing the SCSI reservation key used by the node being
fenced, for example).
 817 
Other fencing mechanisms support a "reboot/reset" action whenever the quorate
cluster wants to fence some node. Let's start calling things by name: pacemaker
has a service called "stonith" (shoot the other node in the head) and that's
how it executes fencing actions: through fencing agents (fence_scsi in our
case) that receive arguments and execute the programmed actions to "shoot the
other node in the head".
 824 
Since the fence_scsi agent does not have a "reboot/reset" action, it is good
to have a "watchdog" device capable of realizing that the node can no longer
read and/or write to the shared disk, and of resetting the node whenever that
happens. With a watchdog device we have a "complete" solution for HA: a fencing
mechanism that blocks the fenced node from reading or writing to the
application disk (saving a shared filesystem from being corrupted, for example)
AND a watchdog device that resets the node as soon as it realizes the node has
been fenced.
 832 
 833                               --------------------
 834 
There are multiple HW watchdog devices around, but if you don't have one in
your HW (and/or virtual machine) you can always count on the in-kernel software
watchdog device (the kernel module called "softdog").
 838 
 839 $ apt-get install watchdog
 840 
 841 For the questions when installing the "watchdog" package, make sure to set:
 842 
 843 Watchdog module to preload: softdog
 844 
and leave all the others at their defaults. Install the "watchdog" package on
all 3 nodes.
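
A quick sanity check after the package is configured (this assumes the debconf
answer above ends up in /etc/default/watchdog as the module to preload):

$ grep -i softdog /etc/default/watchdog
$ lsmod | grep softdog
$ ls -l /dev/watchdog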
 846 
Of course the watchdog won't do anything for pacemaker by itself. We have to
tell watchdog that we would like it to check access to the fence_scsi shared
disks from time to time. The way we do this is:
 850 
 851 $ apt-file search fence_scsi_check
 852 fence-agents: /usr/share/cluster/fence_scsi_check
 853 
 854 $ sudo mkdir /etc/watchdog.d/
 855 $ sudo cp /usr/share/cluster/fence_scsi_check /etc/watchdog.d/
 856 $ systemctl restart watchdog
 857 
 858 $ ps -ef | grep watch
 859 root        41     2  0 00:10 ?        00:00:00 [watchdogd]
 860 root      8612     1  0 02:21 ?        00:00:00 /usr/sbin/watchdog
 861 
Also do that on all 3 nodes.
 863 
After configuring watchdog, let's keep it disabled and stopped for now... or
else your nodes will keep rebooting because the reservations are not on the
shared disk yet (as pacemaker is not configured).
 867 
 868 $ systemctl disable watchdog
 869 Synchronizing state of watchdog.service with SysV service script with /lib/systemd/systemd-sysv-install.
 870 Executing: /lib/systemd/systemd-sysv-install disable watchdog
 871 
 872 $ systemctl stop watchdog
 873 
 874                               --------------------
 875 
Now we have the "fence_scsi" agent available to fence a node AND watchdog
devices (/dev/watchdog), created by the kernel module "softdog" and managed by
the watchdog daemon, which executes our fence_scsi_check script.
 879 
 880 Let's tell this to the cluster:
 881 
 882 rafaeldtinoco@clubionic01:~$ crm configure
 883 crm(live)configure# property stonith-enabled=on
 884 crm(live)configure# property stonith-action=off
 885 crm(live)configure# property no-quorum-policy=stop
 886 crm(live)configure# property have-watchdog=true
 887 crm(live)configure# commit
 888 crm(live)configure# end
 889 crm(live)# end
 890 bye
 891 
 892 rafaeldtinoco@clubionic01:~$ crm configure show
 893 node 1: clubionic01
 894 node 2: clubionic02
 895 node 3: clubionic03
 896 property cib-bootstrap-options: \
 897         have-watchdog=true \
 898         dc-version=1.1.18-2b07d5c5a9 \
 899         cluster-infrastructure=corosync \
 900         cluster-name=clubionic \
 901         stonith-enabled=on \
 902         stonith-action=off \
 903         no-quorum-policy=stop
 904 
Besides telling the cluster that we have a watchdog and what the fencing
policy is, we also have to configure the fence resource and tell the cluster
where to run it.
 907 
 908                               --------------------
 909 
 910 Let's continue creating the fencing resource in the cluster:
 911 
 912 rafaeldtinoco@clubionic03:~$ sudo sg_persist --in --read-keys --device=/dev/sda
 913   LIO-ORG   cluster.bionic.   4.0
 914   Peripheral device type: disk
 915   PR generation=0x0, there are NO registered reservation keys
 916 
 917 rafaeldtinoco@clubionic03:~$ sudo sg_persist -r /dev/sda
 918   LIO-ORG   cluster.bionic.   4.0
 919   Peripheral device type: disk
 920   PR generation=0x0, there is NO reservation held
 921 
 922 rafaeldtinoco@clubionic01:~$ crm configure primitive fence_clubionic \
 923     stonith:fence_scsi params \
 924     pcmk_host_list="clubionic01 clubionic02 clubionic03" \
 925     devices="/dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:0:0" \
 926     meta provides=unfencing
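
The devices= path above is just a stable alias for the shared disk on these
Azure VMs, so that every node refers to the same LUN. To find the right path
on your nodes you can list the by-path (or by-id) aliases and match them
against lsblk:

$ lsblk
$ ls -l /dev/disk/by-path/
$ ls -l /dev/disk/by-id/ | grep -i scsi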
 927 
 928 After creating the fencing agent, make sure it is running:
 929 
 930 rafaeldtinoco@clubionic01:~$ crm_mon -1
 931 Stack: corosync
 932 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
 933 Last updated: Mon Feb 24 04:06:15 2020
 934 Last change: Mon Feb 24 04:06:11 2020 by root via cibadmin on clubionic01
 935 
 936 3 nodes configured
 937 1 resource configured
 938 
 939 Online: [ clubionic01 clubionic02 clubionic03 ]
 940 
 941 Active resources:
 942 
 943  fence_clubionic        (stonith:fence_scsi):   Started clubionic01
 944 
 945 and also make sure that the reservations are in place:
 946 
 947 rafaeldtinoco@clubionic03:~$ sudo sg_persist --in --read-keys --device=/dev/sda
 948   LIO-ORG   cluster.bionic.   4.0
 949   Peripheral device type: disk
 950   PR generation=0x3, 3 registered reservation keys follow:
 951     0x3abe0001
 952     0x3abe0000
 953     0x3abe0002
 954 
Having 3 keys registered shows that all nodes have registered their keys,
while, when checking which host holds the reservation, you should see a single
node key:
 958 
 959 rafaeldtinoco@clubionic03:~$ sudo sg_persist -r /dev/sda
 960   LIO-ORG   cluster.bionic.   4.0
 961   Peripheral device type: disk
 962   PR generation=0x3, Reservation follows:
 963     Key=0x3abe0001
 964     scope: LU_SCOPE,  type: Write Exclusive, registrants only
 965 
 966                               --------------------
 967 
 968 Testing fencing before moving on
 969 
It is very important to make sure that we are able to fence a node that faces
issues. In our case, as we are also using a watchdog device, we want to make
sure that the node will reboot in case it loses access to the shared SCSI
disk.
 973 
 974 In order to obtain that, we can do a simple test:
 975 
 976 rafaeldtinoco@clubionic01:~$ crm_mon -1
 977 Stack: corosync
 978 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
 979 Last updated: Fri Mar  6 16:43:01 2020
 980 Last change: Fri Mar  6 16:38:55 2020 by hacluster via crmd on clubionic01
 981 
 982 3 nodes configured
 983 1 resource configured
 984 
 985 Online: [ clubionic01 clubionic02 clubionic03 ]
 986 
 987 Active resources:
 988 
 989  fence_clubionic        (stonith:fence_scsi):   Started clubionic01
 990 
You can see that the fence_clubionic resource is running on clubionic01. With
that information we can block the interconnect (private) network communication
on that node only and check 2 things:

1) the fence_clubionic service has to be started on another node
2) clubionic01 (where fence_clubionic was running) will reboot

rafaeldtinoco@clubionic01:~$ sudo iptables -A INPUT -i eth2 -j DROP

(eth2 here is the interconnect/ring0 interface of the node used in this test;
use whichever interface carries the corosync traffic in your setup, eth1 in
the ifupdown example earlier.)
 999 
1000 rafaeldtinoco@clubionic02:~$ crm_mon  -1
1001 Stack: corosync
1002 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1003 Last updated: Fri Mar  6 16:45:31 2020
1004 Last change: Fri Mar  6 16:38:55 2020 by hacluster via crmd on clubionic01
1005 
1006 3 nodes configured
1007 1 resource configured
1008 
1009 Online: [ clubionic02 clubionic03 ]
1010 OFFLINE: [ clubionic01 ]
1011 
1012 Active resources:
1013 
1014  fence_clubionic        (stonith:fence_scsi):   Started clubionic02
1015 
1016 Okay (1) worked. fence_clubionic resource migrated to clubionic02 node AND the
1017 reservation key from clubionic01 node was removed from the shared storage:
1018 
1019 rafaeldtinoco@clubionic02:~$ sudo sg_persist --in --read-keys --device=/dev/sda
1020   LIO-ORG   cluster.bionic.   4.0
1021   Peripheral device type: disk
1022   PR generation=0x4, 2 registered reservation keys follow:
1023     0x3abe0001
1024     0x3abe0002
1025 
1026 After up to 60sec (default timeout for the softdog driver + watchdog daemon):
1027 
1028 [  596.943649] reboot: Restarting system
1029 
clubionic01 is rebooted by the watchdog daemon (remember the file
/etc/watchdog.d/fence_scsi_check? That file is responsible for making the
watchdog daemon reboot the node... when it realizes the SCSI disk is no longer
accessible by our node).
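
If you want to make those timers explicit, the softdog margin can be pinned
via a modprobe option and the daemon side lives in /etc/watchdog.conf (a
sketch; 60 seconds matches the default timeout mentioned above):

$ echo "options softdog soft_margin=60" | sudo tee /etc/modprobe.d/softdog.conf
$ grep -E 'watchdog-device|watchdog-timeout|interval' /etc/watchdog.conf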
1034 
1035 After the reboot succeeds:
1036 
1037 rafaeldtinoco@clubionic02:~$ sudo sg_persist --in --read-keys --device=/dev/sda
1038   LIO-ORG   cluster.bionic.   4.0
1039   Peripheral device type: disk
1040   PR generation=0x5, 3 registered reservation keys follow:
1041     0x3abe0001
1042     0x3abe0002
1043     0x3abe0000
1044 
1045 rafaeldtinoco@clubionic02:~$ crm_mon -1
1046 Stack: corosync
1047 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1048 Last updated: Fri Mar  6 16:49:44 2020
1049 Last change: Fri Mar  6 16:38:55 2020 by hacluster via crmd on clubionic01
1050 
1051 3 nodes configured
1052 1 resource configured
1053 
1054 Online: [ clubionic01 clubionic02 clubionic03 ]
1055 
1056 Active resources:
1057 
1058  fence_clubionic        (stonith:fence_scsi):   Started clubionic02
1059 
It's all back to normal, but the fence_clubionic agent stays where it was: the
clubionic02 node. This cluster behavior exists to avoid a "ping-pong" effect
during intermittent failures.
1063 
1064                               --------------------
1065 
Now we will install a simple lighttpd service on all the nodes and have it
managed by pacemaker. The idea is simple: to have a virtual IP migrating
between the nodes, serving a lighttpd service with files coming from the
shared disk.

    AN IMPORTANT THING TO NOTE HERE: if you are using a SHARED SCSI disk to
    protect cluster concurrency, it is imperative that the data being served
    by the HA application is also contained in the shared disk.
1074 
1075 rafaeldtinoco@clubionic01:~$ apt-get install lighttpd
1076 rafaeldtinoco@clubionic01:~$ systemctl stop lighttpd.service
1077 rafaeldtinoco@clubionic01:~$ systemctl disable lighttpd.service
1078 
1079 rafaeldtinoco@clubionic02:~$ apt-get install lighttpd
1080 rafaeldtinoco@clubionic02:~$ systemctl stop lighttpd.service
1081 rafaeldtinoco@clubionic02:~$ systemctl disable lighttpd.service
1082 
1083 rafaeldtinoco@clubionic03:~$ apt-get install lighttpd
1084 rafaeldtinoco@clubionic03:~$ systemctl stop lighttpd.service
1085 rafaeldtinoco@clubionic03:~$ systemctl disable lighttpd.service
1086 
By having the hostname as the index.html content, we will be able to know
which node is active when accessing the virtual IP, which will be migrating
among all 3 nodes:
1090 
1091 rafaeldtinoco@clubionic01:~$ sudo rm /var/www/html/*.html
1092 rafaeldtinoco@clubionic01:~$ echo $HOSTNAME | sudo tee /var/www/html/index.html
1093 clubionic01
1094 
1095 rafaeldtinoco@clubionic02:~$ sudo rm /var/www/html/*.html
1096 rafaeldtinoco@clubionic02:~$ echo $HOSTNAME | sudo tee /var/www/html/index.html
1097 clubionic02
1098 
1099 rafaeldtinoco@clubionic03:~$ sudo rm /var/www/html/*.html
1100 rafaeldtinoco@clubionic03:~$ echo $HOSTNAME | sudo tee /var/www/html/index.html
1101 clubionic03
1102 
And we will have a good way to tell which source the lighttpd daemon is
getting its files from:
1105 
1106 rafaeldtinoco@clubionic01:~$ curl localhost
1107 clubionic01     -> local disk
1108 
1109 rafaeldtinoco@clubionic01:~$ curl clubionic02
1110 clubionic02     -> local (to clubionic02) disk
1111 
1112 rafaeldtinoco@clubionic01:~$ curl clubionic03
1113 clubionic03     -> local (to clubionic03) disk
1114 
1115                               --------------------
1116 
The next step is to configure the cluster as an HA active-passive only
cluster. The shared disk in this scenario works only as a fencing mechanism.
1119 
1120 rafaeldtinoco@clubionic01:~$ crm configure sh
1121 node 1: clubionic01
1122 node 2: clubionic02
1123 node 3: clubionic03
1124 primitive fence_clubionic stonith:fence_scsi \
1125         params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
1126         devices="/dev/sda" meta provides=unfencing
1127 primitive virtual_ip IPaddr2 \
1128         params ip=10.250.98.13 nic=eth3 \
1129         op monitor interval=10s
1130 primitive webserver systemd:lighttpd \
1131         op monitor interval=10 timeout=30
1132 group webserver_vip webserver virtual_ip
1133 property cib-bootstrap-options: \
1134         have-watchdog=false \
1135         dc-version=1.1.18-2b07d5c5a9 \
1136         cluster-infrastructure=corosync \
1137         cluster-name=clubionic \
1138         stonith-enabled=on \
1139         stonith-action=off \
1140         no-quorum-policy=stop
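
For reference, the same resources can be created interactively; a rough sketch
of the crmsh session (adjust nic= to whatever interface holds the public
subnet on your nodes: eth3 in the dump above, eth0 in the ifupdown example
earlier):

rafaeldtinoco@clubionic01:~$ sudo crm configure
crm(live)configure# primitive virtual_ip IPaddr2 \
    params ip=10.250.98.13 nic=eth3 op monitor interval=10s
crm(live)configure# primitive webserver systemd:lighttpd \
    op monitor interval=10 timeout=30
crm(live)configure# group webserver_vip webserver virtual_ip
crm(live)configure# commit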
1141 
As you can see, I have created 2 resources and 1 group of resources. You can
copy and paste the commands from above inside "crmsh" and do a "commit" at the
end, and it will create the resources for you. After creating the resources,
check if everything is working:
1146 
1147 rafaeldtinoco@clubionic01:~$ crm_mon -1
1148 Stack: corosync
1149 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1150 Last updated: Fri Mar  6 18:57:54 2020
1151 Last change: Fri Mar  6 18:52:17 2020 by root via cibadmin on clubionic01
1152 
1153 3 nodes configured
1154 3 resources configured
1155 
1156 Online: [ clubionic01 clubionic02 clubionic03 ]
1157 
1158 Active resources:
1159 
1160  fence_clubionic        (stonith:fence_scsi):   Started clubionic02
1161  Resource Group: webserver_vip
1162      webserver  (systemd:lighttpd):     Started clubionic01
1163      virtual_ip (ocf::heartbeat:IPaddr2):       Started clubionic01
1164 
1165 rafaeldtinoco@clubionic01:~$ ping -c 1 clubionic.public
1166 PING clubionic.public (10.250.98.13) 56(84) bytes of data.
1167 64 bytes from clubionic.public (10.250.98.13): icmp_seq=1 ttl=64 time=0.025 ms
1168 
1169 --- clubionic.public ping statistics ---
1170 1 packets transmitted, 1 received, 0% packet loss, time 0ms
1171 rtt min/avg/max/mdev = 0.025/0.025/0.025/0.000 ms
1172 
And testing that the resource is really active on the clubionic01 host:
1174 
1175 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1176 clubionic01
1177 
Note that, in this example, we are not using the shared disk for much, only as
a way of fencing a failed host. This is important for virtual environments
that do not necessarily give you a power fencing mechanism, for example, where
you have to rely on SCSI FENCING + WATCHDOG to guarantee cluster consistency,
as said in the beginning of this document.

The final step is to start using the shared SCSI disk as an HA active/passive
resource. It means that the webserver we are clustering will serve files from
the shared disk, but there won't be multiple active nodes simultaneously, just
one. This can serve as a clustering example for other services such as CIFS,
SAMBA, NFS, and MTAs/MDAs such as postfix/qmail, etc.
1189 
1190                               --------------------
1191 
Note: I'm using the "systemd" resource agent standard because it does not rely
on older agents. You can check the supported standards and agents by executing:
1194 
1195 rafaeldtinoco@clubionic01:~$ crm_resource --list-standards
1196 ocf
1197 lsb
1198 service
1199 systemd
1200 stonith
1201 
1202 rafaeldtinoco@clubionic01:~$ crm_resource --list-agents=systemd
1203 apt-daily
1204 apt-daily-upgrade
1205 atd
1206 autovt@
1207 bootlogd
1208 ...
1209 
1210 The agents list will be compatible with the software you have installed at
1211 the moment you execute that command in a node (as the systemd standard basically
1212 uses existing service units from systemd on the nodes).
1213 
1214                               --------------------
1215 
For an HA environment we first need to migrate the shared disk (meaning
unmounting it from one node and mounting it on the other) and then migrate the
dependent services. For this scenario there is no need to configure a locking
manager of any kind.
1220 
1221 Let's install LVM2 packages in all nodes:
1222 
1223 $ apt-get install lvm2
1224 
And configure LVM2 to have a system ID based on the uname command output:
1226 
1227 rafaeldtinoco@clubionic01:~$ sudo vi /etc/lvm/lvm.conf
1228 ...
1229         system_id_source = "uname"
1230 
Do that on all 3 nodes.
1232 
1233 rafaeldtinoco@clubionic01:~$ sudo lvm systemid
1234   system ID: clubionic01
1235 
1236 rafaeldtinoco@clubionic02:~$ sudo lvm systemid
1237   system ID: clubionic02
1238 
1239 rafaeldtinoco@clubionic03:~$ sudo lvm systemid
1240   system ID: clubionic03
1241 
1242 Configure 1 partition for the shared disk:
1243 
1244 rafaeldtinoco@clubionic01:~$ sudo gdisk /dev/sda
1245 GPT fdisk (gdisk) version 1.0.3
1246 
1247 Partition table scan:
1248   MBR: not present
1249   BSD: not present
1250   APM: not present
1251   GPT: not present
1252 
1253 Creating new GPT entries.
1254 
1255 Command (? for help): n
1256 Partition number (1-128, default 1):
1257 First sector (34-2047966, default = 2048) or {+-}size{KMGTP}:
1258 Last sector (2048-2047966, default = 2047966) or {+-}size{KMGTP}:
1259 Current type is 'Linux filesystem'
1260 Hex code or GUID (L to show codes, Enter = 8300):
1261 Changed type of partition to 'Linux filesystem'
1262 
1263 Command (? for help): w
1264 
1265 Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
1266 PARTITIONS!!
1267 
1268 Do you want to proceed? (Y/N): y
1269 OK; writing new GUID partition table (GPT) to /dev/sda.
1270 The operation has completed successfully.
1271 
1272 And create the physical and logical volumes using LVM2:
1273 
1274 rafaeldtinoco@clubionic01:~$ sudo pvcreate /dev/sda1
1275 
1276 rafaeldtinoco@clubionic01:~$ sudo vgcreate clustervg /dev/sda1
1277 
1278 rafaeldtinoco@clubionic01:~$ sudo vgs -o+systemid
1279   VG        #PV #LV #SN Attr   VSize   VFree   System ID
1280   clustervg   1   0   0 wz--n- 988.00m 988.00m clubionic01
1281 
1282 rafaeldtinoco@clubionic01:~$ sudo lvcreate -l100%FREE -n clustervol clustervg
1283   Logical volume "clustervol" created.
1284 
1285 rafaeldtinoco@clubionic01:~$ sudo mkfs.ext4 -LCLUSTERDATA /dev/clustervg/clustervol
1286 mke2fs 1.44.1 (24-Mar-2018)
1287 Creating filesystem with 252928 4k blocks and 63232 inodes
1288 Filesystem UUID: d0c7ab5c-abf6-4ee0-aee1-ec1ce7917bea
1289 Superblock backups stored on blocks:
1290         32768, 98304, 163840, 229376
1291 
1292 Allocating group tables: done
1293 Writing inode tables: done
1294 Creating journal (4096 blocks): done
1295 Writing superblocks and filesystem accounting information: done
1296 
Let's now create a directory, on all 3 nodes, to mount this volume on.
Remember, we are not *yet* configuring a cluster filesystem. The disk should be
mounted on ONE node AT A TIME.
1300 
1301 rafaeldtinoco@clubionic01:~$ sudo mkdir /clusterdata
1302 
1303 rafaeldtinoco@clubionic02:~$ sudo mkdir /clusterdata
1304 
1305 rafaeldtinoco@clubionic03:~$ sudo mkdir /clusterdata
1306 
And, in this particular case, it should be tested on the node where you ran
all the LVM2 commands and created the EXT4 filesystem:
1309 
1310 rafaeldtinoco@clubionic01:~$ sudo mount /dev/clustervg/clustervol /clusterdata
1311 
1312 rafaeldtinoco@clubionic01:~$ mount | grep cluster
1313 /dev/mapper/clustervg-clustervol on /clusterdata type ext4 (rw,relatime,stripe=2048,data=ordered)
1314 
Now we can go ahead and deactivate the volume group:
1316 
1317 rafaeldtinoco@clubionic01:~$ sudo umount /clusterdata
1318 
1319 rafaeldtinoco@clubionic01:~$ sudo vgchange -an clustervg
1320 
1321                               --------------------
1322 
It's time to remove the resources we have configured and re-configure them.
This is needed because the resources of a group are started in the order in
which you created them and, in this new case, the lighttpd resource will depend
on the shared-disk filesystem being mounted on the node where lighttpd is
started.
1327 
1328 rafaeldtinoco@clubionic01:~$ sudo crm resource stop webserver_vip
1329 rafaeldtinoco@clubionic01:~$ sudo crm configure delete webserver
1330 rafaeldtinoco@clubionic01:~$ sudo crm configure delete virtual_ip
1331 rafaeldtinoco@clubionic01:~$ sudo crm configure sh
1332 node 1: clubionic01
1333 node 2: clubionic02
1334 node 3: clubionic03
1335 primitive fence_clubionic stonith:fence_scsi \
1336         params pcmk_host_list="clubionic01 clubionic02 clubionic03" \
1337         plug="" devices="/dev/sda" meta provides=unfencing
1338 property cib-bootstrap-options: \
1339         have-watchdog=false \
1340         dc-version=1.1.18-2b07d5c5a9 \
1341         cluster-infrastructure=corosync \
1342         cluster-name=clubionic \
1343         stonith-enabled=on \
1344         stonith-action=off \
1345         no-quorum-policy=stop
1346 
1347 Now we can create the resource responsible for taking care of the LVM volume
1348 group migration: ocf:heartbeat:LVM-activate.
1349 
1350 crm(live)configure# primitive lvm2 ocf:heartbeat:LVM-activate vgname=clustervg \
1351     vg_access_mode=system_id
1352 
1353 crm(live)configure# commit
1354 
With only those 2 commands, our cluster will have one of the nodes activating
the volume group "clustervg" we have created. In my case it got enabled on the
2nd node of the cluster:
1358 
1359 rafaeldtinoco@clubionic02:~$ crm_mon -1
1360 Stack: corosync
1361 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
1362 Last updated: Fri Mar  6 20:59:44 2020
1363 Last change: Fri Mar  6 20:58:33 2020 by root via cibadmin on clubionic01
1364 
1365 3 nodes configured
1366 2 resources configured
1367 
1368 Online: [ clubionic01 clubionic02 clubionic03 ]
1369 
1370 Active resources:
1371 
1372  fence_clubionic        (stonith:fence_scsi):   Started clubionic01
1373  lvm2   (ocf::heartbeat:LVM-activate):  Started clubionic02
1374 
1375 And I can check that by executing:
1376 
1377 rafaeldtinoco@clubionic02:~$ sudo vgs
1378   VG        #PV #LV #SN Attr   VSize   VFree
1379   clustervg   1   1   0 wz--n- 988.00m    0
1380 
1381 rafaeldtinoco@clubionic02:~$ sudo lvs
1382   LV         VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
1383   clustervol clustervg -wi-a----- 988.00m
1384 
1385 rafaeldtinoco@clubionic02:~$ sudo vgs -o+systemid
1386   VG        #PV #LV #SN Attr   VSize   VFree System ID
1387   clustervg   1   1   0 wz--n- 988.00m    0  clubionic02
1388 
rafaeldtinoco@clubionic02:~$ sudo mount -LCLUSTERDATA /clusterdata

and

rafaeldtinoco@clubionic02:~$ sudo umount /clusterdata

should work on the node that has the "lvm2" resource started.
1394 
Now it's time to re-create the resources we had before, in the group
"webservergroup". Since the virtual_ip primitive was deleted above, it has to
be re-created here as well (with the same parameters as before):

crm(live)configure# primitive virtual_ip IPaddr2 \
                    params ip=10.250.98.13 nic=eth3 \
                    op monitor interval=10s

crm(live)configure# primitive webserver systemd:lighttpd \
                    op monitor interval=10 timeout=30

crm(live)configure# group webservergroup lvm2 virtual_ip webserver

crm(live)configure# commit
1404 
1405 Now pacemaker should show all resources inside "webservergroup":
1406 
1407 - lvm2
1408 - virtual_ip
1409 - webserver
1410 
enabled on the *same* node:
1412 
1413 rafaeldtinoco@clubionic02:~$ crm_mon -1
1414 Stack: corosync
1415 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
1416 Last updated: Fri Mar  6 21:05:24 2020
1417 Last change: Fri Mar  6 21:04:55 2020 by root via cibadmin on clubionic01
1418 
1419 3 nodes configured
1420 4 resources configured
1421 
1422 Online: [ clubionic01 clubionic02 clubionic03 ]
1423 
1424 Active resources:
1425 
1426  fence_clubionic        (stonith:fence_scsi):   Started clubionic01
1427  Resource Group: webservergroup
1428      lvm2       (ocf::heartbeat:LVM-activate):  Started clubionic02
1429      virtual_ip (ocf::heartbeat:IPaddr2):       Started clubionic02
1430      webserver  (systemd:lighttpd):     Started clubionic02
1431 
1432 And it does: clubionic02 node.
1433 
1434                               --------------------
1435 
Perfect. It's time to configure the filesystem mount and umount now. Before
moving on, make sure to install the "psmisc" package on all nodes, and then:
crm(live)configure# primitive ext4 ocf:heartbeat:Filesystem \
                    device=/dev/clustervg/clustervol \
                    directory=/clusterdata fstype=ext4
1440 
1441 crm(live)configure# del webservergroup
1442 
1443 crm(live)configure# group webservergroup lvm2 ext4 virtual_ip webserver
1444 
1445 crm(live)configure# commit
1446 
1447 You will have:
1448 
1449 rafaeldtinoco@clubionic02:~$ crm_mon -1
1450 Stack: corosync
1451 Current DC: clubionic01 (version 1.1.18-2b07d5c5a9) - partition with quorum
1452 Last updated: Fri Mar  6 21:16:39 2020
1453 Last change: Fri Mar  6 21:16:36 2020 by hacluster via crmd on clubionic03
1454 
1455 3 nodes configured
1456 5 resources configured
1457 
1458 Online: [ clubionic01 clubionic02 clubionic03 ]
1459 
1460 Active resources:
1461 
1462  fence_clubionic        (stonith:fence_scsi):   Started clubionic01
1463  Resource Group: webservergroup
1464      lvm2       (ocf::heartbeat:LVM-activate):  Started clubionic03
1465      ext4       (ocf::heartbeat:Filesystem):    Started clubionic03
1466      virtual_ip (ocf::heartbeat:IPaddr2):       Started clubionic03
1467      webserver  (systemd:lighttpd):     Started clubionic03
1468 
1469 rafaeldtinoco@clubionic03:~$ mount | grep -i clu
1470 /dev/mapper/clustervg-clustervol on /clusterdata type ext4 (rw,relatime,stripe=2048,data=ordered)
1471 
And that makes the environment we just created perfect for hosting the
lighttpd service files, as the physical and logical volume will migrate from
one node to another together with the needed service (lighttpd) AND the
virtual IP used to serve our end users:
1476 
1477 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1478 clubionic03
1479 
1480 rafaeldtinoco@clubionic01:~$ crm resource move webservergroup clubionic01
1481 INFO: Move constraint created for webservergroup to clubionic01
1482 
1483 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1484 clubionic01
1485 
We can start serving files/data from the volume that is currently being
managed by the cluster. On the node with the resource group "webservergroup"
enabled you could:
1489 
1490 rafaeldtinoco@clubionic01:~$ sudo rsync -avz /var/www/ /clusterdata/www/
1491 sending incremental file list
1492 created directory /clusterdata/www
1493 ./
1494 cgi-bin/
1495 html/
1496 html/index.html
1497 
1498 rafaeldtinoco@clubionic01:~$ sudo rm -rf /var/www
1499 rafaeldtinoco@clubionic01:~$ sudo ln -s /clusterdata/www /var/www
1500 rafaeldtinoco@clubionic01:~$ cd /clusterdata/www/html/
1501 rafaeldtinoco@clubionic01:.../html$ echo clubionic | sudo tee index.html
1502 
and on all other nodes:
1504 
1505 rafaeldtinoco@clubionic02:~$ sudo rm -rf /var/www
1506 rafaeldtinoco@clubionic02:~$ sudo ln -s /clusterdata/www /var/www
1507 
1508 rafaeldtinoco@clubionic03:~$ sudo rm -rf /var/www
1509 rafaeldtinoco@clubionic03:~$ sudo ln -s /clusterdata/www /var/www
1510 
and test that the data served by lighttpd is now shared among the nodes in an
active/passive way:
1513 
1514 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1515 clubionic
1516 
1517 rafaeldtinoco@clubionic01:~$ crm resource move webservergroup clubionic02
1518 INFO: Move constraint created for webservergroup to clubionic02
1519 
1520 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1521 clubionic
1522 
1523 rafaeldtinoco@clubionic01:~$ crm resource move webservergroup clubionic03
1524 INFO: Move constraint created for webservergroup to clubionic03
1525 
1526 rafaeldtinoco@clubionic01:~$ curl clubionic.public
1527 clubionic
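
    NOTE: each "crm resource move" call leaves a "cli-prefer-webservergroup"
    location constraint behind (you will see it later in "crm configure
    show"). If you want the cluster to place the group freely again, you can
    clear it, for example:

    rafaeldtinoco@clubionic01:~$ sudo crm resource unmove webservergroup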
1528 
1529                               --------------------
1530                               --------------------
1531 
Okay, so... we've already done 3 important things with our scsi-shared-disk
fenced (+ watchdog'ed) cluster:
1534 
1535 - configured scsi persistent-reservation based fencing
1536 - configured watchdog to fence a host without reservations
1537 - configured HA resource group that migrates disk, ip and service among nodes
1538 
1539                               --------------------
1540                               --------------------
1541 
It is time to go further and make all the nodes access the same filesystem on
the shared disk being managed by the cluster. This would allow, for example,
different applications to run on different nodes while accessing the same
disk, among many other use cases you can find online.
1546 
Let's install the distributed lock manager on all cluster nodes:
1548 
1549 rafaeldtinoco@clubionic01:~$ apt-get install -y dlm-controld
1550 
1551 rafaeldtinoco@clubionic02:~$ apt-get install -y dlm-controld
1552 
1553 rafaeldtinoco@clubionic03:~$ apt-get install -y dlm-controld
1554 
1555     NOTE:
1556 
    1. Before enabling the dlm-controld service you should disable the
    watchdog daemon "just in case": if the dlm_controld daemon does not start
    successfully, the watchdog may end up rebooting your cluster nodes (see
    the example right after this note).
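
    A minimal way of doing that on every node, for example:

    rafaeldtinoco@clubionic01:~$ sudo systemctl disable --now watchdog.service
    rafaeldtinoco@clubionic02:~$ ...
    rafaeldtinoco@clubionic03:~$ ...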
1560 
1561 Check that dlm service has started successfully:
1562 
1563 rafaeldtinoco@clubionic01:~$ systemctl status dlm
1564 ● dlm.service - dlm control daemon
1565    Loaded: loaded (/etc/systemd/system/dlm.service; enabled; vendor preset: enabled)
1566    Active: active (running) since Fri 2020-03-06 20:25:05 UTC; 1 day 22h ago
1567      Docs: man:dlm_controld
1568            man:dlm.conf
1569            man:dlm_stonith
1570  Main PID: 4029 (dlm_controld)
1571     Tasks: 2 (limit: 2338)
1572    CGroup: /system.slice/dlm.service
1573            └─4029 /usr/sbin/dlm_controld --foreground
1574 
1575 and, if it didn't, try removing the dlm module:
1576 
1577 rafaeldtinoco@clubionic01:~$ sudo modprobe -r dlm
1578 
1579 and reloading it again:
1580 
1581 rafaeldtinoco@clubionic01:~$ sudo modprobe dlm
1582 
This might happen because the udev rules were not yet processed during the
package installation and the /dev/misc/XXXX devices were not created. One way
of guaranteeing dlm will always find the correct devices is to add it to the
/etc/modules file:
1587 
1588 rafaeldtinoco@clubionic01:~$ cat /etc/modules
1589 virtio_balloon
1590 virtio_blk
1591 virtio_net
1592 virtio_pci
1593 virtio_ring
1594 virtio
1595 ext4
1596 9p
1597 9pnet
1598 9pnet_virtio
1599 dlm
1600 
This way the module is loaded at boot time:
1602 
1603 rafaeldtinoco@clubionic01:~$ sudo update-initramfs -k all -u
1604 
1605 rafaeldtinoco@clubionic01:~$ sudo reboot
1606 
1607 rafaeldtinoco@clubionic01:~$ systemctl --value is-active corosync.service
1608 active
1609 
1610 rafaeldtinoco@clubionic01:~$ systemctl --value is-active pacemaker.service
1611 active
1612 
1613 rafaeldtinoco@clubionic01:~$ systemctl --value is-active dlm.service
1614 active
1615 
1616 rafaeldtinoco@clubionic01:~$ systemctl --value is-active watchdog.service
1617 inactive
1618 
And, after making sure it works, disable the dlm service on all nodes:
1620 
1621     rafaeldtinoco@clubionic01:~$ systemctl disable dlm
1622 
1623     rafaeldtinoco@clubionic02:~$ systemctl disable dlm
1624 
1625     rafaeldtinoco@clubionic03:~$ systemctl disable dlm
1626 
because this service will be managed by the cluster resource manager. The
watchdog service will be re-enabled at the end, because it is the watchdog
daemon that reboots/resets a node after its SCSI reservations are fenced.
1630 
1631                               --------------------
1632 
In order to install the cluster filesystem (GFS2 in this case) we first have
to remove the configuration we previously created in the cluster:
1635 
1636 rafaeldtinoco@clubionic01:~$ sudo crm conf show
1637 node 1: clubionic01
1638 node 2: clubionic02
1639 node 3: clubionic03
1640 primitive ext4 Filesystem \
1641         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1642         fstype=ext4
1643 primitive fence_clubionic stonith:fence_scsi \
1644         params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
1645         devices="/dev/sda" meta provides=unfencing target-role=Started
1646 primitive lvm2 LVM-activate \
1647         params vgname=clustervg vg_access_mode=system_id
1648 primitive virtual_ip IPaddr2 \
1649         params ip=10.250.98.13 nic=eth3 \
1650         op monitor interval=10s
1651 primitive webserver systemd:lighttpd \
1652         op monitor interval=10 timeout=30
1653 group webservergroup lvm2 ext4 virtual_ip webserver \
1654         meta target-role=Started
1655 location cli-prefer-webservergroup webservergroup role=Started inf: clubionic03
1656 property cib-bootstrap-options: \
1657         have-watchdog=false \
1658         dc-version=1.1.18-2b07d5c5a9 \
1659         cluster-infrastructure=corosync \
1660         cluster-name=clubionic \
1661         stonith-enabled=on \
1662         stonith-action=off \
1663         no-quorum-policy=stop \
1664         last-lrm-refresh=1583529396
1665 
1666 rafaeldtinoco@clubionic01:~$ sudo crm resource stop webservergroup
1667 rafaeldtinoco@clubionic01:~$ sudo crm conf delete webservergroup
1668 
1669 rafaeldtinoco@clubionic01:~$ sudo crm resource stop webserver
1670 rafaeldtinoco@clubionic01:~$ sudo crm conf delete webserver
1671 
1672 rafaeldtinoco@clubionic01:~$ sudo crm resource stop virtual_ip
1673 rafaeldtinoco@clubionic01:~$ sudo crm conf delete virtual_ip
1674 
1675 rafaeldtinoco@clubionic01:~$ sudo crm resource stop lvm2
1676 rafaeldtinoco@clubionic01:~$ sudo crm conf delete lvm2
1677 
1678 rafaeldtinoco@clubionic01:~$ sudo crm resource stop ext4
1679 rafaeldtinoco@clubionic01:~$ sudo crm conf delete ext4
1680 
1681 rafaeldtinoco@clubionic01:~$ crm conf sh
1682 node 1: clubionic01
1683 node 2: clubionic02
1684 node 3: clubionic03
1685 primitive fence_clubionic stonith:fence_scsi \
1686         params pcmk_host_list="clubionic01 clubionic02 clubionic03" \
1687         plug="" devices="/dev/sda" meta provides=unfencing target-role=Started
1688 property cib-bootstrap-options: \
1689         have-watchdog=false \
1690         dc-version=1.1.18-2b07d5c5a9 \
1691         cluster-infrastructure=corosync \
1692         cluster-name=clubionic \
1693         stonith-enabled=on \
1694         stonith-action=off \
1695         no-quorum-policy=stop \
1696         last-lrm-refresh=1583529396
1697 
Now we are ready to create the needed resources.
1699 
1700                               --------------------
1701 
Because we now want multiple cluster nodes to access the LVM volumes
simultaneously, in an active/active way, we have to install "clvm". This
package provides the clustering interface for lvm2 when used with a corosync
based (e.g. Pacemaker) cluster infrastructure. It allows logical volumes to be
created on shared storage devices (e.g. Fibre Channel or iSCSI).
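
For example, on all 3 nodes:

rafaeldtinoco@clubionic01:~$ apt-get install -y clvm

rafaeldtinoco@clubionic02:~$ apt-get install -y clvm

rafaeldtinoco@clubionic03:~$ apt-get install -y clvm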
1707 
1708 rafaeldtinoco@clubionic01:~$ egrep "^\s+locking_type" /etc/lvm/lvm.conf
1709         locking_type = 1
1710 
1711 The type being:
1712 
1713     0 = no locking
1714     1 = local file-based locking
1715     2 = external shared lib locking_library
1716     3 = built-in clustered locking with clvmd
1717         - disable use_lvmetad and lvmetad service (incompatible)
    4 = read-only locking (forbids metadata changes)
1719     5 = dummy locking
1720 
Let's change the LVM locking type to clustered on all 3 nodes:
1722 
1723     rafaeldtinoco@clubionic01:~$ sudo lvmconf --enable-cluster
1724     rafaeldtinoco@clubionic02:~$ ...
1725     rafaeldtinoco@clubionic03:~$ ...
1726 
1727     rafaeldtinoco@clubionic01:~$ egrep "^\s+locking_type" /etc/lvm/lvm.conf
1728     rafaeldtinoco@clubionic02:~$ ...
1729     rafaeldtinoco@clubionic03:~$ ...
1730         locking_type = 3
1731 
1732     rafaeldtinoco@clubionic01:~$ systemctl disable lvm2-lvmetad.service
1733     rafaeldtinoco@clubionic02:~$ ...
1734     rafaeldtinoco@clubionic03:~$ ...
1735 
Finally, enable the DLM and clustered LVM resources in the cluster:
1737 
1738 # clubionic01 storage resources
1739 
1740 crm(live)configure# primitive clubionic01_dlm ocf:pacemaker:controld op \
1741     monitor interval=10s on-fail=fence interleave=true ordered=true
1742 
1743 crm(live)configure# primitive clubionic01_lvm ocf:heartbeat:clvm op \
1744     monitor interval=10s on-fail=fence interleave=true ordered=true
1745 
1746 crm(live)configure# group clubionic01_storage clubionic01_dlm clubionic01_lvm
1747 
1748 crm(live)configure# location l_clubionic01_storage clubionic01_storage \
1749     rule -inf: #uname ne clubionic01
1750 
1751 # clubionic02 storage resources
1752 
1753 crm(live)configure# primitive clubionic02_dlm ocf:pacemaker:controld op \
1754     monitor interval=10s on-fail=fence interleave=true ordered=true
1755 
1756 crm(live)configure# primitive clubionic02_lvm ocf:heartbeat:clvm op \
1757     monitor interval=10s on-fail=fence interleave=true ordered=true
1758 
1759 crm(live)configure# group clubionic02_storage clubionic02_dlm clubionic02_lvm
1760 
1761 crm(live)configure# location l_clubionic02_storage clubionic02_storage \
1762     rule -inf: #uname ne clubionic02
1763 
1764 # clubionic03 storage resources
1765 
1766 crm(live)configure# primitive clubionic03_dlm ocf:pacemaker:controld op \
1767     monitor interval=10s on-fail=fence interleave=true ordered=true
1768 
1769 crm(live)configure# primitive clubionic03_lvm ocf:heartbeat:clvm op \
1770     monitor interval=10s on-fail=fence interleave=true ordered=true
1771 
1772 crm(live)configure# group clubionic03_storage clubionic03_dlm clubionic03_lvm
1773 
1774 crm(live)configure# location l_clubionic03_storage clubionic03_storage \
1775     rule -inf: #uname ne clubionic03
1776 
1777 crm(live)configure# commit
1778 
Note: I created the resource groups one by one and constrained each of them to
run on a single node. This is basically to guarantee that all nodes will
always have the "clvmd" and "dlm_controld" services running (or restarted in
case of issues).
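
For reference, the same effect is usually achieved with a single cloned group
instead of one group per node. A minimal sketch of that alternative, assuming
hypothetical primitives named p_dlm and p_clvm (not the names used in this
guide), would be:

crm(live)configure# group storage p_dlm p_clvm
crm(live)configure# clone cl_storage storage meta interleave=true

The per-node groups used above achieve the same "always running everywhere"
behavior while keeping each node's storage stack visible as its own group in
crm_mon.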
1783 
1784 rafaeldtinoco@clubionic01:~$ crm_mon -1
1785 Stack: corosync
1786 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1787 Last updated: Mon Mar  9 02:18:51 2020
1788 Last change: Mon Mar  9 02:17:58 2020 by root via cibadmin on clubionic01
1789 
1790 3 nodes configured
1791 7 resources configured
1792 
1793 Online: [ clubionic01 clubionic02 clubionic03 ]
1794 
1795 Active resources:
1796 
1797  fence_clubionic        (stonith:fence_scsi):   Started clubionic02
1798  Resource Group: clubionic01_storage
1799      clubionic01_dlm    (ocf::pacemaker:controld):      Started clubionic01
1800      clubionic01_lvm    (ocf::heartbeat:clvm):  Started clubionic01
1801  Resource Group: clubionic02_storage
1802      clubionic02_dlm    (ocf::pacemaker:controld):      Started clubionic02
1803      clubionic02_lvm    (ocf::heartbeat:clvm):  Started clubionic02
1804  Resource Group: clubionic03_storage
1805      clubionic03_dlm    (ocf::pacemaker:controld):      Started clubionic03
1806      clubionic03_lvm    (ocf::heartbeat:clvm):  Started clubionic03
1807 
1808 So... now we are ready to have a clustered filesystem running in this cluster!
1809 
1810                               --------------------
1811 
1812 Before creating the "clustered" volume group in LVM, I'm going to remove the
1813 previous volume group and volumes we had:
1814 
1815 rafaeldtinoco@clubionic03:~$ sudo vgchange -an clustervg
1816 
1817 rafaeldtinoco@clubionic03:~$ sudo vgremove clustervg
1818 
1819 rafaeldtinoco@clubionic03:~$ sudo pvremove /dev/sda1
1820 
1821 And re-create them as "clustered":
1822 
1823 rafaeldtinoco@clubionic03:~$ sudo pvcreate /dev/sda1
1824 
1825 rafaeldtinoco@clubionic03:~$ sudo vgcreate -Ay -cy --shared clustervg /dev/sda1
1826 
1827 From man page:
1828 
    --shared

    Create a shared VG using lvmlockd if LVM is compiled with lockd support.
    lvmlockd will select lock type sanlock or dlm depending on which lock
    manager is running. This allows multiple hosts to share a VG on shared
    devices. lvmlockd and a lock manager must be configured and running.
1835 
1836 rafaeldtinoco@clubionic03:~$ sudo vgs
1837   VG        #PV #LV #SN Attr   VSize   VFree
1838   clustervg   1   0   0 wz--nc 988.00m 988.00m
1839 
1840 rafaeldtinoco@clubionic03:~$ sudo lvcreate -l 100%FREE -n clustervol clustervg
1841 
1842                               --------------------
1843 
1844 rafaeldtinoco@clubionic01:~$ apt-get install gfs2-utils
1845 
1846 rafaeldtinoco@clubionic02:~$ apt-get install gfs2-utils
1847 
1848 rafaeldtinoco@clubionic03:~$ apt-get install gfs2-utils
1849 
The general form of the command is:

mkfs.gfs2 -jN -p lock_dlm -t <clustername>:<lockspace> <blockdevice>

    - -jN: number of journals (at least 1 per node, so 3 in our case)
    - -p lock_dlm: use lock_dlm as the locking protocol
    - -t clustername:lockspace

        The "lock table" pair used to uniquely identify this filesystem in
        a cluster. The cluster name segment (maximum 32 characters) must
        match the name given to your cluster in its configuration; only
        members of this cluster are permitted to use this file system. The
        lockspace segment (maximum 30 characters) is a unique file system
        name used to distinguish this gfs2 file system. Valid clusternames
        and lockspaces may only contain alphanumeric characters, hyphens
        (-) and underscores (_).
1864 
1865 rafaeldtinoco@clubionic01:~$ sudo mkfs.gfs2 -j3 -p lock_dlm \
1866     -t clubionic:clustervol /dev/clustervg/clustervol
1867 
1868 Are you sure you want to proceed? [y/n]y
1869 Discarding device contents (may take a while on large devices): Done
1870 Adding journals: Done
1871 Building resource groups: Done
1872 Creating quota file: Done
1873 Writing superblock and syncing: Done
1874 Device:                    /dev/clustervg/clustervol
1875 Block size:                4096
1876 Device size:               0.96 GB (252928 blocks)
1877 Filesystem size:           0.96 GB (252927 blocks)
1878 Journals:                  3
1879 Resource groups:           6
1880 Locking protocol:          "lock_dlm"
1881 Lock table:                "clubionic:clustervol"
1882 UUID:                      dac96896-bd83-d9f4-c0cb-e118f5572e0e
1883 
Quickly verify that each node can mount and then unmount the new filesystem:

rafaeldtinoco@clubionic01:~$ sudo mount /dev/clustervg/clustervol /clusterdata \
    && sudo umount /clusterdata

rafaeldtinoco@clubionic02:~$ sudo mount /dev/clustervg/clustervol /clusterdata \
    && sudo umount /clusterdata

rafaeldtinoco@clubionic03:~$ sudo mount /dev/clustervg/clustervol /clusterdata \
    && sudo umount /clusterdata
1892 
1893                               --------------------
1894 
Now, since we want to add a new resource to an already existing resource
group, I prefer to execute "crm configure edit" and manually edit the cluster
configuration to this (or something like this in your case):
1898 
1899 node 1: clubionic01
1900 node 2: clubionic02
1901 node 3: clubionic03
1902 primitive clubionic01_dlm ocf:pacemaker:controld \
1903         op monitor interval=10s on-fail=fence interleave=true ordered=true
1904 primitive clubionic01_gfs2 Filesystem \
1905         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1906         fstype=gfs2 options=noatime \
1907         op monitor interval=10s on-fail=fence interleave=true
1908 primitive clubionic01_lvm clvm \
1909         op monitor interval=10s on-fail=fence interleave=true ordered=true
1910 primitive clubionic02_dlm ocf:pacemaker:controld \
1911         op monitor interval=10s on-fail=fence interleave=true ordered=true
1912 primitive clubionic02_gfs2 Filesystem \
1913         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1914         fstype=gfs2 options=noatime \
1915         op monitor interval=10s on-fail=fence interleave=true
1916 primitive clubionic02_lvm clvm \
1917         op monitor interval=10s on-fail=fence interleave=true ordered=true
1918 primitive clubionic03_dlm ocf:pacemaker:controld \
1919         op monitor interval=10s on-fail=fence interleave=true ordered=true
1920 primitive clubionic03_gfs2 Filesystem \
1921         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
1922         fstype=gfs2 options=noatime \
1923         op monitor interval=10s on-fail=fence interleave=true
1924 primitive clubionic03_lvm clvm \
1925         op monitor interval=10s on-fail=fence interleave=true ordered=true
1926 primitive fence_clubionic stonith:fence_scsi \
1927         params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
1928         devices="/dev/sda" meta provides=unfencing target-role=Started
1929 group clubionic01_storage clubionic01_dlm clubionic01_lvm clubionic01_gfs2
1930 group clubionic02_storage clubionic02_dlm clubionic02_lvm clubionic02_gfs2
1931 group clubionic03_storage clubionic03_dlm clubionic03_lvm clubionic03_gfs2
1932 location l_clubionic01_storage clubionic01_storage \
1933         rule -inf: #uname ne clubionic01
1934 location l_clubionic02_storage clubionic02_storage \
1935         rule -inf: #uname ne clubionic02
1936 location l_clubionic03_storage clubionic03_storage \
1937         rule -inf: #uname ne clubionic03
1938 property cib-bootstrap-options: \
1939         have-watchdog=false \
1940         dc-version=1.1.18-2b07d5c5a9 \
1941         cluster-infrastructure=corosync \
1942         cluster-name=clubionic \
1943         stonith-enabled=on \
1944         stonith-action=off \
1945         no-quorum-policy=stop \
1946         last-lrm-refresh=1583708321
1947 # vim: set filetype=pcmk:
1948 
1949     NOTE:
1950 
1951     1. I have created the following resources:
1952 
1953         - clubionic01_gfs2
1954         - clubionic02_gfs2
1955         - clubionic03_gfs2
1956 
        and added them to their corresponding groups.
1958 
1959 The final result is:
1960 
1961 rafaeldtinoco@clubionic02:~$ crm_mon -1
1962 Stack: corosync
1963 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
1964 Last updated: Mon Mar  9 03:26:43 2020
1965 Last change: Mon Mar  9 03:24:14 2020 by root via cibadmin on clubionic01
1966 
1967 3 nodes configured
1968 10 resources configured
1969 
1970 Online: [ clubionic01 clubionic02 clubionic03 ]
1971 
1972 Active resources:
1973 
1974  fence_clubionic        (stonith:fence_scsi):   Started clubionic02
1975  Resource Group: clubionic01_storage
1976      clubionic01_dlm    (ocf::pacemaker:controld):      Started clubionic01
1977      clubionic01_lvm    (ocf::heartbeat:clvm):  Started clubionic01
1978      clubionic01_gfs2   (ocf::heartbeat:Filesystem):    Started clubionic01
1979  Resource Group: clubionic02_storage
1980      clubionic02_dlm    (ocf::pacemaker:controld):      Started clubionic02
1981      clubionic02_lvm    (ocf::heartbeat:clvm):  Started clubionic02
1982      clubionic02_gfs2   (ocf::heartbeat:Filesystem):    Started clubionic02
1983  Resource Group: clubionic03_storage
1984      clubionic03_dlm    (ocf::pacemaker:controld):      Started clubionic03
1985      clubionic03_lvm    (ocf::heartbeat:clvm):  Started clubionic03
1986      clubionic03_gfs2   (ocf::heartbeat:Filesystem):    Started clubionic03
1987 
And each of the nodes has the proper GFS2 filesystem mounted:
1989 
1990 rafaeldtinoco@clubionic01:~$ for node in clubionic01 clubionic02 \
1991     clubionic03; do ssh $node "df -kh | grep cluster"; done
1992 
1993 /dev/mapper/clustervg-clustervol  988M  388M  601M  40% /clusterdata
1994 /dev/mapper/clustervg-clustervol  988M  388M  601M  40% /clusterdata
1995 /dev/mapper/clustervg-clustervol  988M  388M  601M  40% /clusterdata
1996 
1997                               --------------------
1998 
1999 Now we can go back to the previous (and original) idea of having lighttpd
2000 resources serving files from the same shared filesystem.
2001 
2002     NOTES
2003 
    1. So... this is just an example, and this setup isn't specifically good
    for anything but showing pacemaker working in an environment like this.
    I'm enabling 3 instances of lighttpd using the "systemd" standard, and it
    is very likely that this standard does not accept multiple instances on
    the same node.

    2. That is the reason I'm not allowing the instances to run on all nodes.
    Using the right resource agent you can make the instances, and their
    virtual IPs, migrate among all nodes if one of them fails.

    3. Instead of having 3 lighttpd instances here you could have 1 lighttpd,
    1 postfix and 1 mysql instance, all floating among all cluster nodes with
    no particular preference, for example. All 3 instances would be able to
    access the same clustered filesystem mounted at /clusterdata.

2018                               --------------------
2019 
2020 rafaeldtinoco@clubionic01:~$ crm config show | cat -
2021 node 1: clubionic01
2022 node 2: clubionic02
2023 node 3: clubionic03
2024 primitive clubionic01_dlm ocf:pacemaker:controld \
2025         op monitor interval=10s on-fail=fence interleave=true ordered=true
2026 primitive clubionic01_gfs2 Filesystem \
2027         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
2028         fstype=gfs2 options=noatime \
2029         op monitor interval=10s on-fail=fence interleave=true
2030 primitive clubionic01_lvm clvm \
2031         op monitor interval=10s on-fail=fence interleave=true ordered=true
2032 primitive clubionic02_dlm ocf:pacemaker:controld \
2033         op monitor interval=10s on-fail=fence interleave=true ordered=true
2034 primitive clubionic02_gfs2 Filesystem \
2035         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
2036         fstype=gfs2 options=noatime \
2037         op monitor interval=10s on-fail=fence interleave=true
2038 primitive clubionic02_lvm clvm \
2039         op monitor interval=10s on-fail=fence interleave=true ordered=true
2040 primitive clubionic03_dlm ocf:pacemaker:controld \
2041         op monitor interval=10s on-fail=fence interleave=true ordered=true
2042 primitive clubionic03_gfs2 Filesystem \
2043         params device="/dev/clustervg/clustervol" directory="/clusterdata" \
2044         fstype=gfs2 options=noatime \
2045         op monitor interval=10s on-fail=fence interleave=true
2046 primitive clubionic03_lvm clvm \
2047         op monitor interval=10s on-fail=fence interleave=true ordered=true
2048 primitive fence_clubionic stonith:fence_scsi \
2049         params pcmk_host_list="clubionic01 clubionic02 clubionic03" plug="" \
2050         devices="/dev/sda" \
2051         meta provides=unfencing target-role=Started
2052 primitive instance01_ip IPaddr2 \
2053         params ip=10.250.98.13 nic=eth3 \
2054         op monitor interval=10s
2055 primitive instance01_web systemd:lighttpd \
2056         op monitor interval=10 timeout=30
2057 primitive instance02_ip IPaddr2 \
2058         params ip=10.250.98.14 nic=eth3 \
2059         op monitor interval=10s
2060 primitive instance02_web systemd:lighttpd \
2061         op monitor interval=10 timeout=30
2062 primitive instance03_ip IPaddr2 \
2063         params ip=10.250.98.15 nic=eth3 \
2064         op monitor interval=10s
2065 primitive instance03_web systemd:lighttpd \
2066         op monitor interval=10 timeout=30
2067 group clubionic01_storage clubionic01_dlm clubionic01_lvm clubionic01_gfs2
2068 group clubionic02_storage clubionic02_dlm clubionic02_lvm clubionic02_gfs2
2069 group clubionic03_storage clubionic03_dlm clubionic03_lvm clubionic03_gfs2
2070 group instance01 instance01_web instance01_ip
2071 group instance02 instance02_web instance02_ip
2072 group instance03 instance03_web instance03_ip
2073 location l_clubionic01_storage clubionic01_storage \
2074         rule -inf: #uname ne clubionic01
2075 location l_clubionic02_storage clubionic02_storage \
2076         rule -inf: #uname ne clubionic02
2077 location l_clubionic03_storage clubionic03_storage \
2078         rule -inf: #uname ne clubionic03
2079 location l_instance01 instance01 \
2080         rule -inf: #uname ne clubionic01
2081 location l_instance02 instance02 \
2082         rule -inf: #uname ne clubionic02
2083 location l_instance03 instance03 \
2084         rule -inf: #uname ne clubionic03
2085 property cib-bootstrap-options: \
2086         have-watchdog=false \
2087         dc-version=1.1.18-2b07d5c5a9 \
2088         cluster-infrastructure=corosync \
2089         cluster-name=clubionic \
2090         stonith-enabled=on \
2091         stonith-action=off \
2092         no-quorum-policy=stop \
2093         last-lrm-refresh=1583708321
2094 
2095 rafaeldtinoco@clubionic01:~$ crm_mon -1
2096 Stack: corosync
2097 Current DC: clubionic02 (version 1.1.18-2b07d5c5a9) - partition with quorum
2098 Last updated: Mon Mar  9 03:42:11 2020
2099 Last change: Mon Mar  9 03:39:32 2020 by root via cibadmin on clubionic01
2100 
2101 3 nodes configured
2102 16 resources configured
2103 
2104 Online: [ clubionic01 clubionic02 clubionic03 ]
2105 
2106 Active resources:
2107 
2108  fence_clubionic        (stonith:fence_scsi):   Started clubionic02
2109  Resource Group: clubionic01_storage
2110      clubionic01_dlm    (ocf::pacemaker:controld):      Started clubionic01
2111      clubionic01_lvm    (ocf::heartbeat:clvm):  Started clubionic01
2112      clubionic01_gfs2   (ocf::heartbeat:Filesystem):    Started clubionic01
2113  Resource Group: clubionic02_storage
2114      clubionic02_dlm    (ocf::pacemaker:controld):      Started clubionic02
2115      clubionic02_lvm    (ocf::heartbeat:clvm):  Started clubionic02
2116      clubionic02_gfs2   (ocf::heartbeat:Filesystem):    Started clubionic02
2117  Resource Group: clubionic03_storage
2118      clubionic03_dlm    (ocf::pacemaker:controld):      Started clubionic03
2119      clubionic03_lvm    (ocf::heartbeat:clvm):  Started clubionic03
2120      clubionic03_gfs2   (ocf::heartbeat:Filesystem):    Started clubionic03
2121  Resource Group: instance01
2122      instance01_web     (systemd:lighttpd):     Started clubionic01
2123      instance01_ip      (ocf::heartbeat:IPaddr2):       Started clubionic01
2124  Resource Group: instance02
2125      instance02_web     (systemd:lighttpd):     Started clubionic02
2126      instance02_ip      (ocf::heartbeat:IPaddr2):       Started clubionic02
2127  Resource Group: instance03
2128      instance03_web     (systemd:lighttpd):     Started clubionic03
2129      instance03_ip      (ocf::heartbeat:IPaddr2):       Started clubionic03
2130 
Like we did previously, let's create, on each node, a symbolic link from
/clusterdata/www to its corresponding /var/www directory.
2133 
2134 rafaeldtinoco@clubionic01:~$ sudo ln -s /clusterdata/www /var/www
2135 
2136 rafaeldtinoco@clubionic02:~$ sudo ln -s /clusterdata/www /var/www
2137 
2138 rafaeldtinoco@clubionic03:~$ sudo ln -s /clusterdata/www /var/www
2139 
But now, as this is a clustered filesystem, we have to create the file just
once =) and it will be served by all lighttpd instances, running on all 3
nodes:
2143 
2144 rafaeldtinoco@clubionic01:~$ echo "all instances show the same thing" | \
2145 sudo tee /var/www/html/index.html
2146 all instances show the same thing
2147 
2148 Check it out:
2149 
2150 rafaeldtinoco@clubionic01:~$ curl http://instance01/
2151 all instances show the same thing
2152 
2153 rafaeldtinoco@clubionic01:~$ curl http://instance02/
2154 all instances show the same thing
2155 
2156 rafaeldtinoco@clubionic01:~$ curl http://instance03/
2157 all instances show the same thing
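
If you want to see everything failing over, you can, for example, put one of
the nodes in standby and watch its lighttpd instance and virtual IP stop
(remember they are pinned to that node) while the remaining nodes keep
serving, then bring it back online:

rafaeldtinoco@clubionic01:~$ sudo crm node standby clubionic03

rafaeldtinoco@clubionic01:~$ crm_mon -1

rafaeldtinoco@clubionic01:~$ sudo crm node online clubionic03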
2158 
And voilà =)
2160 
2161 You now have a pretty cool cluster to play with! Congrats!
2162 
2163 Rafael D. Tinoco
2164 rafaeldtinoco@ubuntu.com
2165 Ubuntu Linux Core Engineer
