High Availability: move cluster replication onto a different NIC

I am running High Availability on the FreePBX distro.
Currently both of my machines do the cluster inter-node communication and syncing (172.26.32.1, also on eth0) over my LAN, without a dedicated cable connecting the two servers. However, I want to change that.

I ran a direct connection between my two HA machines and want to run inter-node communication on there and assign 172.26.32.1 to a different NIC (eth1). Is that possible without having to rebuild the cluster from scratch?

Yes, it is possible. Here's the docs: http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html-single/Pacemaker_Explained/index.html#s-add-heartbeat

You probably don't want to do it that way unless you carefully read that whole document and know what you are doing. Unfortunately, there are also currently issues with HA backup/restore, so if this is a live system, I suggest waiting until FreePBX 13 HA backup/restore works properly.

If you must: fully update the machine, back up the machine, do a clean re-install on the same machine, activate (you will get your licenses back), update to the same level, copy the backup file to /var/spool/asterisk/backups, apply the backup, then immediately go to Advanced Settings and select view and override read-only settings, submit (NOT APPLY), then find the block "Remote CDR Database" and clear out all settings. Now apply config. There will be a few left-over issues in your freepbx_settings table.

Fix them at the command line with:
# fwconsole chown
# amportal a s
# fwconsole ma refreshsignatures (repeat until it doesn't change anything)

Then create the cluster with a separate interconnect NIC.
Then do a fresh install of freepbx-b, update it, and join it to the cluster.

Good luck! :smile:

Is it possible without having to rebuild the entire cluster as you are suggesting?

Paging doctor @xrobau …

What are the current issues with FreePBX 13 HA backup/restore?

Hmmm - I thought I understood from Schmooze there was an open issue on this, but I don't see it. I'll create one shortly. I just lived through this saga.
In short, in FreePBX 13, the restore process restores the table freepbx_settings, which includes all your settings under "Advanced Settings". The remote CDR settings (part of HA), module signatures and keys are stored in this table (some in sip_settings, too). When you do a fresh install and try to restore an HA backup, HA will break because your restoration just overwrote all the keys that were stored in your database, and you will see old HA IPs oddly appearing in strange places. The process I outlined earlier takes this into consideration and will allow you to restore to a new installation.

Yes, as mentioned it is in Configuration Explained. Directly messing with Pacemaker is somewhat courageous, but it can certainly be done.

Yep. There's no configuration of 'this physical interface does this' in the cluster.

The easiest way to do this is to move the internal IP address on the standby node, reboot it, and then move the services across, and then move the IP on the other node.

(You don't need to reboot it, but I'd do it, just to be sure)
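For the "move the internal IP address" step, here is a minimal sketch of what the eth1 config could look like. It assumes the distro uses CentOS-style ifcfg files under /etc/sysconfig/network-scripts/, which the thread does not confirm. The netmask is a placeholder and render_ifcfg is a made-up helper name. It only writes a preview file into a scratch directory and does not touch the live network config.

```shell
#!/bin/sh
# Sketch only: render a CentOS-style ifcfg file for the interconnect NIC.
# Assumptions: sysconfig network-scripts layout, placeholder netmask.
render_ifcfg() {
    dev="$1"; ip="$2"; mask="$3"
    printf 'DEVICE=%s\nBOOTPROTO=none\nONBOOT=yes\nIPADDR=%s\nNETMASK=%s\n' \
        "$dev" "$ip" "$mask"
}

# Write to a scratch directory first so the result can be reviewed before
# anything is copied into /etc/sysconfig/network-scripts/ on the standby node.
out_dir="${OUT_DIR:-./ifcfg-preview}"
mkdir -p "$out_dir"
render_ifcfg eth1 172.26.32.1 255.255.255.0 > "$out_dir/ifcfg-eth1"
```

Do this on the standby node first, as described above, and remember the old address also has to come off eth0 on that node before the interface is restarted.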

--Rob

No, if you restore an HA backup using mysqldump, everything breaks. If you restore an HA backup using Backup and Restore, everything is fine. (Well, almost fine: all you still need to do is go into HA and run a Check to fix the CDR entries.)

This is a bit off topic because avayax's original question was about adding a heartbeat NIC to an existing system. (I remember another HA Backup post I responded to, and would link, but I can no longer find it. A mind is a terrible thing to lose.)

However, regarding the HA Backup, I respectfully disagree, as that has not been our experience. It may work on small systems with just a CDR fix. However, we have over 5000 extensions, almost 300 IVRs, and 250+ inbound routes. I do agree that once you do manage to get the backup restored, all you have to do is fix the CDR issue and module signing, and possibly sync creds between Asterisk, FreePBX and MySQL.

The problems with getting the restore done with large restore files are:

  1. PHP's default maximum upload size (upload_max_filesize in php.ini) is 20M. If your restore file is larger than that, it won't restore through the GUI unless you change this and restart httpd, or scp/ftp the restore file to the local machine. (If CDR is included, the file will very likely be over 20M. In our case it was 70M without CDR.)

  2. Once on the local machine, the restoration will crash on timeouts if it is large and takes more than 30 seconds per process. It runs into timeouts from php.ini, libraries/BMO/Less.class.php and libraries/utility.functions.php.
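To see what limits you are actually up against before attempting a large restore, something like the following can read them out of php.ini. This is only a sketch: the /etc/php.ini path is an assumption, php_ini_value is a made-up helper name, and it does not look at the values hard-coded in Less.class.php or utility.functions.php.

```shell
#!/bin/sh
# Sketch: print the last uncommented "key = value" setting for a php.ini key.
# Usage: php_ini_value FILE KEY
php_ini_value() {
    awk -F'=' -v key="$2" '
        /^[ \t]*;/ { next }                      # skip commented-out lines
        {
            k = $1; gsub(/[ \t]/, "", k)         # key with whitespace removed
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                val = v                          # later assignments win
            }
        }
        END { if (val != "") print val }
    ' "$1"
}

PHP_INI="${PHP_INI:-/etc/php.ini}"   # assumed location, override as needed
if [ -r "$PHP_INI" ]; then
    for key in upload_max_filesize post_max_size max_execution_time memory_limit; do
        printf '%s = %s\n' "$key" "$(php_ini_value "$PHP_INI" "$key")"
    done
fi
```

Any change to these values in php.ini needs an httpd restart to take effect for GUI restores.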

I do agree that just restoring a mysqldump is not a good idea. However, we did finally get a successful restore by loading the HA mysqldump into another machine, dropping the freepbxha and freepbx_settings tables, then dumping and restoring to FreePBX sans those two troublesome tables.
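A possibly simpler variation on the same idea (not what we actually did) is to have mysqldump skip the two tables with its --ignore-table option instead of dropping them on a scratch machine. The sketch below only prints the command so it can be reviewed first. The database name "asterisk" is an assumption, as is both tables living in that one database, and build_dump_cmd is a made-up helper name.

```shell
#!/bin/sh
# Sketch: build (but do not run) a mysqldump command that leaves out
# the given tables. Usage: build_dump_cmd DATABASE TABLE...
build_dump_cmd() {
    db="$1"; shift
    cmd="mysqldump"
    for table in "$@"; do
        # --ignore-table expects the db_name.tbl_name form, once per table
        cmd="$cmd --ignore-table=$db.$table"
    done
    printf '%s %s\n' "$cmd" "$db"
}

# "asterisk" is assumed to be the FreePBX database name on your system.
build_dump_cmd asterisk freepbxha freepbx_settings
# prints: mysqldump --ignore-table=asterisk.freepbxha --ignore-table=asterisk.freepbx_settings asterisk
```

Add your own credentials and redirect the output to a file once the printed command looks right.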

Also, I forgot the memory limit settings, which had to be increased (for us) from 512M to 1025M in php.ini and Less.class.php.

That's a bug, and we actually have (had?) a ticket open about that. It should be 512M.

That's another bug 8-( Can you get me another backup (using the same method you did last time - dropbox) and I'll spend some time this weekend, if I can, playing around with it.