We have a passive/active cluster of freepbx servers. We had a failover this morning after which none of the SIP extensions could receive calls (they could, however, make calls). The only way we could enable the SIP extensions was to edit each one and make a small change to the configuration (we added “1” to the Display Name); the extension would then work once the changes had been applied. Obviously we don’t want to do this every time we have a cluster failover, so does anyone have any suggestions as to the cause and the fix for this problem?
The SIP phones involved are
X-Lite 3.0 34025
Are you using Dundi for failover?
That would seem logical - I’ve not been able to confirm that I am having the same problem as you as it is a live system and I don’t want to fail it over too often (in light of this problem).
Do you have to amend each extension before doing the save/apply as I do?
I am using csync2 to synchronise the following directories and files:
One of the files included is /var/lib/asterisk/astdb. I have noticed (when I do a manual csync2) that astdb is sometimes marked as dirty, so it could be that the copy on the backup server is corrupt. That wouldn’t explain why the primary server then fails when it is restarted, unless the file has been overwritten during its restart.
No, I am not using DUNDI for failover - I have an Active/Passive cluster. The MySQL database is set up and running on both servers, with database replication between them. The Asterisk config files (and some others) are replicated using csync2 whenever a file is modified.
How would DUNDI failover work?
IIRC, I do not have to make any actual changes before doing the save-apply. I just have to open the edit page on the extension, press the form submit at the bottom of the page, then the red “apply changes” button.
Thanks for that - it will save me a lot of extra typing when it happens again. Do you have to do it for each and every extension though?
freepbx stores data ‘objects’ in the astdb that are essential to the dialplan. If you are trying to keep a backup server in sync, copying the sql database and rsync-ing may not get everything and can have problems with the sql database.
You may want to take a look at a couple of functions, dumpastdb.php and restoreastdb.php. These can be found under the /var/www/html/admin/modules/backup/bin directory or linked to /var/lib/asterisk/bin directory.
Those functions as is will need some minor modification to be used more generally. However, what you would want to do as part of your sync process (and ideally, every time you make a configuration change in freepbx) is to do a dump of that database on the live machine and restore it to the backup machine.
Without these objects properly configured, the system will not function, as you have found. They are integral to the freepbx application.
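As a rough illustration of the dump-and-restore idea (this is not what dumpastdb.php or restoreastdb.php actually do, just a sketch), the text printed by the Asterisk CLI command "database show" on the live machine can be turned into "database put" commands for the backup machine. The function names are made up for the example, the sample keys are only illustrative, and it assumes keys contain no spaces and values contain no double quotes.

```python
import re

# Matches one entry line of `asterisk -rx "database show"` output, e.g.
#   /AMPUSER/101/device                               : 101
ENTRY_RE = re.compile(r"^/([^/\s]+)/(\S+)\s*: (.*)$")


def parse_database_show(text):
    """Return a list of (family, key, value) tuples from `database show` output."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:  # any line that is not an entry (summary, blank) is skipped
            family, key, value = match.groups()
            entries.append((family, key, value.strip()))
    return entries


def build_put_commands(entries):
    """Turn parsed entries into `database put` CLI commands (hypothetical helper)."""
    return [
        'database put {} {} "{}"'.format(family, key, value)
        for family, key, value in entries
    ]
```

The resulting commands would still have to be run on the backup machine once Asterisk is up there, and the approach should be tried on a test pair before trusting it on a live cluster.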
The astdb is one of the files that is copied by the csync2 process, which is initiated after any change is made to the configuration, and asterisk is not started on the backup server until the failover takes place. So I would have thought that at the point of failover the astdb ought to be up to date. I shall be trying a failover later and will sync immediately beforehand to see what happens.
It looks like my problem was probably that the astdb file was not synchronised correctly. I have just failed over two or three times and it seems to have worked each time after a clean csync2 session.
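One way to double-check that the copy really is clean before a planned failover is to compare a checksum of astdb on both servers. This is only a sketch and is independent of csync2's own dirty tracking; the function names are invented for the example, and getting the digest from the other server (over ssh, for instance) is left out.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def in_sync(local_digest, remote_digest):
    """True when both servers report the same digest for their astdb copy."""
    return local_digest == remote_digest
```

Bear in mind that Asterisk also writes SIP registrations into astdb while it is running, so the digests can differ shortly after a sync without anything being wrong.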
I am having exactly the same problem here and it’s driving me mad. I think I have figured out the cause of the problem but have yet to solve it. I am not an expert on Asterisk or FreePBX, so don’t take anything as gospel, but this is what I’ve been able to piece together by looking at the problem for a couple of days.
Basically the problem is that asterisk uses a script called dialparties.agi to determine what you really mean when you enter an extension (i.e. pressing 1000 might be a ring group that rings 10 different phones). This information is stored in an internal asterisk database-like thing (which may or may not be an actual relational SQL database, I haven’t been able to figure that out yet) which is separate from the freePBX “asterisk” database.
If you debug the call, I’m almost positive you’ll see a line like this…
Executing NoOp("SIP/101-081e2258", "Returned from dialparties with no extensions to call") in new stack
where SIP/101-081e2258 is the extension you dialed from (101 on my testing setup). This means that Asterisk asked dialparties.agi to interpret the dialed string and it returned nothing. I’ve spent considerable time debugging this and it seems like this line in dialparties.agi…
$device = $agi->database_get('AMPUSER',$device_str);
is the culprit. The result of this call to asterisk database is blank. Or, more accurately, an array with no values. If you do the same exact thing after doing a save changes to the extension through freePBX, you’d get something more meaningful back (the extension number in the simple case).
The problem is that even though freePBX’s database is right, and Asterisk’s configs are right (I rsync them off of the live server every half hour in my case, I’m sure you do something comparable), the internal asterisk database is not correct. It does not have entries for the extensions that have not gone through the save/apply routine. And as far as I’ve been able to tell, save/apply is the only way to get freePBX to either a) rebuild Asterisk’s internal database, or b) tell asterisk to rebuild its database on its own. I don’t know for sure who is “in charge” during that process, so it could be either asterisk or freePBX that actually performs the update.
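To see which extensions are affected without dialing each one, the output of the Asterisk CLI command "database show" can be checked for AMPUSER entries. The sketch below assumes entry paths of the form /AMPUSER/<extension>/<field> and that the list of extensions is obtained separately (for example from the freePBX MySQL database); the function name is made up for the example.

```python
def missing_ampuser_entries(extensions, database_show_text):
    """Return the extensions that have no /AMPUSER/<extension>/... entry."""
    present = set()
    for line in database_show_text.splitlines():
        path, sep, _value = line.partition(":")
        parts = path.strip().split("/")
        # An entry path looks like /AMPUSER/<extension>/<field>
        if sep and len(parts) >= 3 and parts[0] == "" and parts[1] == "AMPUSER":
            present.add(parts[2])
    return [ext for ext in extensions if ext not in present]
```

Any extension this returns would be expected to hit the "no extensions to call" path until its entries are put back.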
The solution that I think both of us require is some way for the failover script to tell the newly live asterisk server to rebuild its database based on the freePBX mysql database, essentially the equivalent of saving and reloading all extensions.
Sadly I haven’t figured out how to do this. Hopefully someone can help.
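For what it’s worth, one possible shape for the failover step is sketched below. It does not rebuild anything from the freePBX MySQL database; it just replays a list of Asterisk CLI commands (for example "database put" lines prepared from the live server) through "asterisk -rx" on the newly live machine. The function names are invented for the example, and the exit status of "asterisk -rx" may not reflect whether a command actually succeeded inside Asterisk, so treat the failure list as a hint only.

```python
import subprocess


def run_asterisk_cli(command):
    """Send one CLI command to the local Asterisk via `asterisk -rx`."""
    result = subprocess.run(
        ["asterisk", "-rx", command],
        capture_output=True, text=True, check=False,
    )
    return result.returncode


def replay_commands(commands, runner=run_asterisk_cli):
    """Run each command in order; return those whose runner reported failure."""
    failed = []
    for command in commands:
        if runner(command) != 0:
            failed.append(command)
    return failed
```

A failover script could call replay_commands(['database put AMPUSER 101/device "101"']) once Asterisk has started, with the real list of commands in place of that single illustrative one.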