High Availability (HA) Join Cluster Errors: "Version Mismatch. Cluster is version 3, should be 4." and "Insufficient disk space"

After having issues with the original cluster not failing over properly, we decided to perform a clean install on Node A with the 10.13.66-64bit ISO download. When we first tried the HA 64 bit option, both the /dev/mapper/vg_ha-slash and the /boot partitions were different sizes, and performing a Join Cluster resulted in UID Mismatch and Insufficient disk space errors. After switching to the “Advanced” option the partition sizes match, but in addition to the Insufficient disk space error we now get “Version Mismatch. Cluster is version 3, should be 4.” Google searches are not turning up anything on the version error, so any assistance would be greatly appreciated. :sweat:

The ‘cluster version mismatch’ error means that the RUNNING cluster isn’t at the correct version. I’ve had another report of this, and I’ve been scratching my head as to why the cluster wasn’t upgraded. However, I’ve had a bit of a eureka moment and realised how it could have happened (and how to stop it happening again!)

As I said on the other post, insufficient disk space means ‘insufficient UNALLOCATED space’. I’m going to fix that wording, and give a bit of debugging in the error too. I’ll edit this post when I’ve released the update.

Edit 1: That disk size error is bogus. Because a variable isn’t being set in a PREVIOUS check, it can’t validate.

OK, I’ve released HA 13.0.10.1, which handles (hopefully!) all of these issues, with better debugging, too.

In Module Administration I have ‘Standard’, ‘Extended’, ‘Commercial’, and ‘Unsupported’ all checked, and clicking on Manage local modules shows HA as Enabled and up to date at 13.0.9.2. :frowning:

You need to get it from the edge repo:

fwconsole ma --edge upgrade freepbx_ha
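
To confirm whether the upgrade actually took, the installed version can be read back from the module table and compared with the release mentioned above. This is only a rough sketch: the helper names module_version and version_at_least are made up for illustration, and it assumes the "| Module | Version | Status | License |" table layout that fwconsole prints, plus a sort that supports -V.

```shell
#!/bin/sh
# Print the Version column for one module from fwconsole's table output (read on stdin).
# Assumes the "| Module | Version | Status | License |" column order.
module_version() {
    awk -F'|' -v mod="$1" '
        { name = $2; gsub(/[ \t]/, "", name) }
        name == mod { ver = $3; gsub(/[ \t]/, "", ver); print ver; exit }
    '
}

# Succeed when version $1 is at least version $2 (relies on sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage on the PBX (hypothetical, adjust as needed):
#   installed=$(fwconsole ma list | module_version freepbx_ha)
#   version_at_least "$installed" 13.0.10.1 || echo "HA is still at $installed"
```

A plain string comparison would rank 13.0.9.2 above 13.0.10.1, which is why the sketch leans on version sort instead.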

I ran the above command and refreshed the Module Administration page in the web administration. The “Apply Config” link came up, so I clicked it, but it still showed 13.0.9.2 as the installed version. I went ahead and rebooted, and it still shows 13.0.9.2. Looking online I found the listonline command and ran it against the repository, and it is also showing 13.0.9.2. Should I be using a different repository, or am I doing something wrong?

[[email protected] ~]# fwconsole ma --edge listonline
Edge repository temporarily enabled
No repos specified, using: [standard,commercial,extended,unsupported] from last GUI settings

+----------------------+-------------+--------------------------------------------+------------+
| Module               | Version     | Status                                     | License    |
+----------------------+-------------+--------------------------------------------+------------+
| freepbx_ha           | 13.0.9.2    | Enabled and up to date                     | Commercial |

:sweat:

Edit: Changed command from --list to --listonline which is showing version 13.0.9.2 and included outcome of running upgrade command.

At a guess, your ‘free updates’ have expired. But that should be immediately obvious to you when you click ‘check online’, as there’s a big red ‘RENEW’ button next to HA?

There’s a bit of a discussion about that in this ticket. http://issues.freepbx.org/browse/FREEPBX-12977

How could I make it more obvious?

However, the version mismatch error can be resolved by running this on the production machine:

touch /var/spool/asterisk/incron/freepbx_ha.update-cluster

That will trigger the update process and upgrade it to version 4!
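
Because touch prints nothing whether or not anything picks the file up, it can help to wrap it with a couple of sanity checks. The following is a minimal sketch only: the function name trigger_cluster_update is hypothetical, the default path is simply the one quoted above, and the thread does not say whether the trigger file is removed once it has been handled, so an empty directory afterwards is not proof either way.

```shell
#!/bin/sh
# Create the trigger file in the incron-watched directory, checking the basics first.
# The default path is the one quoted above; the rest is illustrative.
trigger_cluster_update() {
    dir="${1:-/var/spool/asterisk/incron}"
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "cannot write to: $dir" >&2
        return 1
    fi
    touch "$dir/freepbx_ha.update-cluster"
}

# On the live node one might also confirm the incron daemon is up before triggering:
#   pgrep -x incrond >/dev/null || echo "incrond does not appear to be running"
#   trigger_cluster_update
```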

Ran the command

touch /var/spool/asterisk/incron/freepbx_ha.update-cluster

on the cluster I’m trying to join (Node B) and nothing happens. ls -al shows the directory as empty, and the node still shows version 13.0.9.2. Trying to join the cluster also results in the same mismatch error.

You do that on the live node, to upgrade the cluster version. Then you will be able to join.

Node B is the live node and is the server I ran the above command on. Node A is the new clean install which I’m trying to join to the cluster. After I ran the command, it didn’t output anything to the screen and just returned to the prompt. I tried joining from Node A and got the same two errors, so I then ran Cluster > Manage > Run on Node B and tried to join Node A again, with the same results.