Mysql always fails over to slave node on High Availability


(Avayax) #1

I have an issue with my HA cluster (HA version 13.0.11): the mysql resources (mysql_service, mysql_ip and mysql_fs) always want to fail over to the slave node.

freepbx-b is currently master on all services, except for mysql, which runs on freepbx-a.
When I put freepbx-a into standby, then mysql will run fine on freepbx-b.

Why does mysql not want to run on the node that is master for all other services, such as asterisk and httpd?

So I am looking at this:

pcs status when both nodes are online:

Online: [ freepbx-a freepbx-b ]

Full list of resources:

spare_ip (ocf::heartbeat:IPaddr2): Started freepbx-a
floating_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
Master/Slave Set: ms-asterisk [drbd_asterisk]
    Masters: [ freepbx-b ]
    Slaves: [ freepbx-a ]
Master/Slave Set: ms-mysql [drbd_mysql]
    Masters: [ freepbx-a ]
    Slaves: [ freepbx-b ]
Master/Slave Set: ms-httpd [drbd_httpd]
    Masters: [ freepbx-b ]
    Slaves: [ freepbx-a ]
Master/Slave Set: ms-spare [drbd_spare]
    Masters: [ freepbx-a ]
    Slaves: [ freepbx-b ]
spare_fs (ocf::heartbeat:Filesystem): Started freepbx-a
Resource Group: mysql
    mysql_fs (ocf::heartbeat:Filesystem): Started freepbx-a
    mysql_ip (ocf::heartbeat:IPaddr2): Started freepbx-a
    mysql_service (ocf::heartbeat:mysql): Started freepbx-a
Resource Group: asterisk
    asterisk_fs (ocf::heartbeat:Filesystem): Started freepbx-b
    asterisk_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
    asterisk_service (ocf::heartbeat:freepbx): Started freepbx-b
Resource Group: httpd
    httpd_fs (ocf::heartbeat:Filesystem): Started freepbx-b
    httpd_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
    httpd_service (ocf::heartbeat:apache): Started freepbx-b
Clone Set: ClusterMon-SMTP-clone [ClusterMon-SMTP]
    Started: [ freepbx-a freepbx-b ]
fence_a (stonith:fence_ipmilan): Started freepbx-a
fence_b (stonith:fence_ipmilan): Stopped

pcs status when freepbx-a in standby:

Node freepbx-a: standby
Online: [ freepbx-b ]

Full list of resources:

spare_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
floating_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
Master/Slave Set: ms-asterisk [drbd_asterisk]
    Masters: [ freepbx-b ]
    Stopped: [ freepbx-a ]
Master/Slave Set: ms-mysql [drbd_mysql]
    Masters: [ freepbx-b ]
    Stopped: [ freepbx-a ]
Master/Slave Set: ms-httpd [drbd_httpd]
    Masters: [ freepbx-b ]
    Stopped: [ freepbx-a ]
Master/Slave Set: ms-spare [drbd_spare]
    Masters: [ freepbx-b ]
    Stopped: [ freepbx-a ]
spare_fs (ocf::heartbeat:Filesystem): Started freepbx-b
Resource Group: mysql
    mysql_fs (ocf::heartbeat:Filesystem): Started freepbx-b
    mysql_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
    mysql_service (ocf::heartbeat:mysql): Started freepbx-b
Resource Group: asterisk
    asterisk_fs (ocf::heartbeat:Filesystem): Started freepbx-b
    asterisk_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
    asterisk_service (ocf::heartbeat:freepbx): Started freepbx-b
Resource Group: httpd
    httpd_fs (ocf::heartbeat:Filesystem): Started freepbx-b
    httpd_ip (ocf::heartbeat:IPaddr2): Started freepbx-b
    httpd_service (ocf::heartbeat:apache): Started freepbx-b
Clone Set: ClusterMon-SMTP-clone [ClusterMon-SMTP]
    Started: [ freepbx-b ]
    Stopped: [ freepbx-a ]
fence_a (stonith:fence_ipmilan): Stopped
fence_b (stonith:fence_ipmilan): Stopped


(Avayax) #2

Can anybody help out?


(Avayax) #3

Here is a simple description of the problem:
Node B is master and Node A is slave. While Node A is in standby, all the file systems are mounted correctly on Node B. As soon as Node A leaves standby and comes online, the mysql service immediately switches over to Node A.

What’s wrong and how do I investigate the problem further?
I would like to perform distro and yum updates, as well as an Asterisk version upgrade, and I am concerned that things will break while this issue persists.

Would you have an idea, @xrobau?


(Rob Thomas) #4

Nothing will break, and that works fine. If it’s moving back BY ITSELF, that means that someone’s deleted the resource-stickiness attribute, or used ‘pcs resource move’ to force it to move.

You can run pcs constraint --full to see what the problem is, but running pcs resource clear mysqld will probably be sufficient.
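
A minimal sketch for checking the first possibility above, a missing resource-stickiness attribute. It assumes the cluster-wide defaults are printed by `pcs resource defaults` (the no-argument form in older pcs releases; newer releases may expect a different subcommand). The helper name `stickiness_is_set` is hypothetical, and it only looks at whatever text it is given, so stickiness set on an individual resource would not be detected.

```shell
# Hypothetical helper: succeeds if the text on stdin mentions resource-stickiness.
# Intended input: the output of 'pcs resource defaults' (assumed command form).
stickiness_is_set() {
    grep -q 'resource-stickiness'
}

# On a cluster node (requires pcs):
#   pcs resource defaults | stickiness_is_set && echo "stickiness is set"
```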


(Avayax) #5

Thanks.

Did you mean pcs resource clear mysql?

pcs resource clear mysqld gives me an error:
Error: mysqld is not a valid resource


(Rob Thomas) #6

Yeah, sorry, mysql. You would have seen it in pcs constraint --full (or whatever you DO see there)
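
A short sketch of what to look for in that output. It assumes the leftover constraint carries an id beginning with `cli-prefer-` or `cli-ban-`, which is how constraints created by `pcs resource move` and `pcs resource ban` are normally named. The helper name `find_cli_constraints` is hypothetical.

```shell
# Hypothetical helper: print constraints left behind by a manual move or ban.
# Reads 'pcs constraint --full' output on stdin; prints nothing if none exist.
find_cli_constraints() {
    grep -E 'id:cli-(prefer|ban)-' || true
}

# On a cluster node (requires pcs):
#   pcs constraint --full | find_cli_constraints
# If a line for the mysql group shows up, clearing it should release the pin:
#   pcs resource clear mysql
```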


(Avayax) #7

This is what I am seeing, but this output is from when Node A is master, in which case all file systems are always mounted correctly. It is only when B is master that mysql fails over to A, if A is online.

  promote ms-asterisk then start asterisk_fs (score:INFINITY) (id:order-ms-asterisk-asterisk_fs-INFINITY)
  promote ms-mysql then start mysql_fs (score:INFINITY) (id:order-ms-mysql-mysql_fs-INFINITY)
  promote ms-httpd then start httpd_fs (score:INFINITY) (id:order-ms-httpd-httpd_fs-INFINITY)
  promote ms-spare then start spare_fs (score:INFINITY) (id:order-ms-spare-spare_fs-INFINITY)
  Resource Sets:
    set mysql httpd asterisk sequential=true (id:mysql-httpd-asterisk) setoptions kind=Optional (id:freepbx-start-order)
Colocation Constraints:
  asterisk_fs with ms-asterisk (score:INFINITY) (with-rsc-role:Master) (id:colocation-asterisk_fs-ms-asterisk-INFINITY)
  mysql_fs with ms-mysql (score:INFINITY) (with-rsc-role:Master) (id:colocation-mysql_fs-ms-mysql-INFINITY)
  httpd_fs with ms-httpd (score:INFINITY) (with-rsc-role:Master) (id:colocation-httpd_fs-ms-httpd-INFINITY)
  spare_fs with ms-spare (score:INFINITY) (with-rsc-role:Master) (id:colocation-spare_fs-ms-spare-INFINITY)
  floating_ip with asterisk_ip (score:INFINITY) (id:colocation-floating_ip-asterisk_ip-INFINITY)
  Resource Sets:
    set asterisk httpd (id:colo-freepbx-0) set ms-asterisk ms-httpd role=Master (id:colo-freepbx-1) setoptions score=INFINITY (id:freepbx-colo)

(Avayax) #8

We actually did use ‘pcs resource move’ at one point, so that was apparently the reason.

So whenever you move a resource manually, you have to make sure you run pcs resource clear 'service' on it afterwards?
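
The move-then-clear sequence discussed here could be sketched roughly as follows. The function name `move_then_clear`, the `PCS` override and the `SETTLE_SECONDS` delay are all hypothetical conveniences, and the fixed delay is only a placeholder: on a real cluster it is safer to confirm with pcs status that the move has finished before clearing.

```shell
# Hypothetical wrapper: move a resource to a node, then remove the temporary
# location constraint that the move leaves behind.
# Set PCS=echo to print the pcs arguments instead of running pcs (dry run).
move_then_clear() {
    rsc=$1
    node=$2
    pcs_cmd=${PCS:-pcs}

    # The move pins the resource to the target node via a location constraint.
    "$pcs_cmd" resource move "$rsc" "$node" || return 1

    # Placeholder wait for the cluster to settle; 30 seconds is an arbitrary default.
    sleep "${SETTLE_SECONDS:-30}"

    # Clearing removes the pin so normal placement rules apply again.
    "$pcs_cmd" resource clear "$rsc"
}

# Dry run:
#   PCS=echo SETTLE_SECONDS=0 move_then_clear mysql freepbx-b
```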

Good to know, thank you.

