My [awful] upgrade from 13 to 14

slonkak · February 17, 2018, 7:10pm

I compiled a list of everything that went wrong during my upgrade from the latest 66.x version to SNG7 so hopefully others doing an in-place upgrade know what to look out for.
Issues upgrading to SNG7

Asterisk service was left as a SysV service and set to disabled, so literally nothing worked. I had to ‘chkconfig asterisk on’ and reboot.
Fail2Ban service was converted to a systemd unit and set to disabled. I had to ‘systemctl enable fail2ban.service’.
Snmpd service was converted to a systemd unit and set to disabled. I had to ‘systemctl enable snmpd.service’.
.htaccess in /var/www/html/admin/ was overwritten and broke my ‘protected’ directory I made to run custom PHP scripts.
System timezone info was deleted (set to n/a), which broke all Time Conditions on FreePBX. Ran ‘timedatectl status’ to see that the timezone was ‘n/a’. Ran ‘timedatectl list-timezones | grep America’ to find the EST timezone. Then ran ‘timedatectl set-timezone America/New_York’.
DB table asteriskcdrdb/cel was crashed, which broke CDR reports. My FreePBX server didn’t have enough disk space (2x the size of cel) to run a REPAIR TABLE, so I had to copy the cel.* files from /var/lib/mysql/asteriskcdrdb to another server with mysql and run ‘myisamchk --safe-recover ./cel’, then on FreePBX open MariaDB and run ‘FLUSH TABLES WITH READ LOCK’, copy the cel.* files back in place, ‘UNLOCK TABLES’, then restart the MariaDB service.
Asterisk Info module was removed. Added back in the Module Admin interface.
SSHD config was overwritten, breaking my custom access. Fixed /etc/ssh/sshd_config.
Asterisk’s cron job for storage alerts was re-added. I don’t like the default storage alerting values, but they’re in a ZEND-protected PHP file so they can’t be changed. Before the upgrade I had commented out asterisk’s storage cron job so my own script could run instead, but the upgrade added it back into asterisk’s cron jobs. I had to delete it again.
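
A couple of the checks in this list can be scripted. The following is only a minimal sketch, not part of FreePBX or the SNG7 upgrade tooling: the function names tz_is_unset and report_service are made up for illustration, and it assumes timedatectl prints a “Time zone:” (or “Timezone:”) line as it does on systemd-based systems.

```shell
#!/bin/sh
# Post-upgrade sanity checks (hypothetical helpers, for illustration only).

# tz_is_unset: reads `timedatectl status` output on stdin and succeeds
# if the reported time zone is n/a.
tz_is_unset() {
    grep -Eq 'Time ?zone: *n/a'
}

# report_service NAME: prints whether a systemd unit is enabled.
# `systemctl is-enabled` exits non-zero for disabled units, so the
# fallback to "unknown" only applies when nothing was printed at all.
report_service() {
    state=$(systemctl is-enabled "$1" 2>/dev/null) || state=${state:-unknown}
    printf '%s: %s\n' "$1" "$state"
}
```

For example, ‘timedatectl status | tz_is_unset && echo "timezone is not set"’ flags the timezone problem, and ‘report_service snmpd’ shows the unit state. Which units should actually be enabled depends on your setup.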

Hopefully this list will help someone going through their own upgrade and also help the SNG people to make their upgrade process better.

tonyclewis · February 17, 2018, 7:19pm

Most of your document describes custom changes you made, which an upgrade to Linux and FreePBX will remove. As for asterisk, it should not be set to start; fwconsole starts asterisk. Same with fail2ban, so those need to be off like they were.

slonkak · February 17, 2018, 8:17pm

Only two of the items I mentioned were because of my custom configuration: the htaccess file and the SSH configuration. Everything else was the upgrade process not checking previous values before changing things.

Additionally, I would be OK with fwconsole starting asterisk and fail2ban, but they didn’t. After the upgrade I was left with a totally unusable system until I manually started those services. It was at that point that I set them to start automatically.

tm1000 · February 18, 2018, 3:37am

FreePBX has controlled this file for a while. Your custom changes would be overwritten on every upgrade of framework and also generate a security warning.

It does.

mbello · May 23, 2018, 11:51pm

Slonkak, I for one will thank you for your report. One can see you probably spent many hours fixing things. I tried upgrading to v14 some 3 months ago and had a few issues that I did not bother to investigate further so I just deleted the VM and ran the v13 VM again (virtualization does make some things much easier).

I wonder why the attitude from Sangoma staff to your post was so negative.

slonkak · May 24, 2018, 12:18am

Yeah. I’m not sure. I kind of get it, I don’t like when people criticize my code either. But my issues are real and hopefully this post helps others who may run into the same thing.

xrobau · May 24, 2018, 12:58am

But there aren’t actually any issues in there, apart from the timezone breaking. No-one’s actually mentioned that before, and I agree that could be a significant inconvenience.

However, nothing else you posted is caused by the upgrade process, or is by design.

fail2ban and asterisk are started by the freepbx service, and should be disabled.
snmpd is upgraded, and if the package itself doesn’t check its current state, there’s nothing we can do about that
Same with sshd. This is just the RPM itself upgrading
htaccess is overwritten on every upgrade of framework. (Or it should be… I don’t know. I can see both sides of the story here). But as part of a core version upgrade, YES it 100% should be.
Database tables being crashed has nothing to do with us. They were already crashed 8)

There is a ticket open about the asteriskinfo module going missing, and I haven’t been able to pin it down. I may have some time next week to try it, as I need to rebuild and run through the 6->7 upgrade repository now that I’ve published 7.5

So… Sorry, but there’s nothing really we can do about any of that (apart from timezone). If you had set it through the Sysadmin Admin GUI, previously, it would have upgraded properly.

This is because it’s extremely difficult to find which timezone you are using AFTER you’ve set it on 6, as the timezone file is copied and changed when it’s installed, so you can’t even compare it against all the EXISTING timezone files to see which one you’re on.
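
For reference, the comparison described above would look roughly like this on a system where the timezone file is an unmodified copy. This is a sketch only: find_zone is a hypothetical helper, not something FreePBX ships, and per the explanation above it would find no match on a 6-based install because the file has been changed.

```shell
#!/bin/sh
# find_zone FILE DIR: print each zone under DIR whose file content is
# byte-identical to FILE (hypothetical helper, for illustration only).
# Pass DIR without a trailing slash so the prefix is stripped cleanly.
find_zone() {
    find "$2" -type f | while IFS= read -r candidate; do
        if cmp -s "$1" "$candidate"; then
            # Strip the leading "DIR/" to leave a name like America/New_York.
            printf '%s\n' "${candidate#"$2"/}"
        fi
    done
}
```

A typical call would be ‘find_zone /etc/localtime /usr/share/zoneinfo’. Several names can be printed for one zone, since the zoneinfo tree contains aliases with identical content.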

tm1000 · May 24, 2018, 3:42am

I actually don’t see anything negative from any sangoma employee. We all commented on different parts but didn’t reply negatively to this thread.

How else would you have liked Tony and me to respond?

mbello · June 4, 2018, 10:03pm

xrobau’s answer was much nicer in my view. He at least demonstrated he read what was initially written.

Although even he was quite on the defensive side: “But there aren’t actually any issues in there, apart from the timezone breaking”. Yes, there are, and when one says “there aren’t, except” it is already quite defensive.

Let’s count how “there aren’t actually any issues”, when we have:

Timezone breaking;
Asteriskinfo (not confirmed but xrobau says he heard that before so should be counted)
sshd_config… an update that overwrites security-sensitive configuration files is doing something wrong. It would be better to keep changed conf files as they are than to force them to be overwritten. In other words, this is an issue whether or not you think it is worth fixing. After all, we all know that when we update our Linux servers, these files are never silently overwritten.

And if you open your mind to the possibility of unknown bugs in your codebase, we could have a 4th or 5th item there from the database crashing and the services not being brought up, but even without those we already have 3.

In light of all this, if you read Tony’s and Andrew’s first replies again, you will see how poor and lacking the answers were. It would have been much better to thank the user for the report and acknowledge his pain points (rather than quickly pointing out one or two things and then dismissing it entirely).

You know, when we take the time to report these things to you, the devs, we do that in consideration for your work. The worst thing that can happen is you quickly dismissing us, giving the impression you did not even consider the possibility of your system being at fault somewhere.

Anyhow, I got too far into this, I am not the OP author. But the kind of answer he got here resembles my own experience the few times I actually came here or to the IRC to report my findings.

tm1000 · June 4, 2018, 10:19pm

Thanks. But this was never “reported”. It’s a forum thread that has a compiled list of issues. It was never reported to issues.freepbx.org. Which is what we work off of. Therefore I don’t consider this “taking the time”, it’s more of a complaint. Which you and the OP can surely make. But no one is going to do anything about them unless they are reported to the issue tracker. Which is what we ask of everyone.

sorvani · June 5, 2018, 2:04am

And for the record I did two 13 to 14 migrations at two different clients in the last 30 days with no issues like this.

mbello · June 5, 2018, 10:17am

If you see it only as a complaint then it explains the attitude.
However, keep in mind that the first step for many users is to come here and tell people what they found, check whether the finding is indeed a bug or not (for instance, the .htaccess thing is not a bug), and then, if you are lucky, they will finally distill everything into separate bug reports.
Also, many people do not consider themselves knowledgeable enough to file a bug report and prefer to report their findings in a more informal way directly to the community. This is a reality of every software project, open source or not.
However your users report bugs to you, it should be seen as a good thing. But if you prefer to see it as a complaint and react negatively to it, you are wasting that and upsetting people along the way.

As a last point, a dedicated developer can always pay attention to bugs reported on the forums, take the time himself to check whether they make sense, and then open the bug himself on the issue tracker. Actually, engaged developers quite often do exactly that, in contrast to your “this is a forum, it is not on issues.freepbx.org which is what I work off of”. But it requires a “cool, another bug we finally found and can now squash” rather than a “damn, another item on my task list” attitude.