Reboot working super slow


(Jared Busch) #2

After the Vultr console showed the login prompt, I attempted to SSH in and was refused.
I logged in via the console and checked htop: lots of asterisk instances, but things seemed normal, and then I was able to SSH in.

Def some weird stuff tonight. There were 240+ system updates via yum and 40 module updates. Debating rolling back to the snapshot.



#3

MySQL won’t run if there is no space for it to log to, and the firewall apparently depends on MySQL. To fix this Catch-22, just restart the instance from the Vultr console, Yossarian.
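
If you want to rule the disk-space theory in or out before restarting, something like this would do it. It is only a rough sketch: it assumes Python 3 is installed, and the `/var/lib/mysql` path and the 500 MiB threshold are my guesses, not values taken from this system.

```python
import shutil


def free_space_ok(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least min_free_bytes free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Assumed location of the MySQL data directory; adjust for your install.
    mysql_dir = "/var/lib/mysql"
    threshold = 500 * 1024 * 1024  # assumed threshold: 500 MiB
    try:
        ok = free_space_ok(mysql_dir, threshold)
    except FileNotFoundError:
        print(f"{mysql_dir} does not exist on this machine")
    else:
        print("enough free space" if ok else "low on disk space")
```

If it reports low space, clearing old logs first is probably worth trying before anything more drastic.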


(Jared Busch) #4

It eventually restarted but wow something weird is happening tonight.


#5

Just a musing: Asterisk won’t do that if started with safe_asterisk. If using SysVinit, just make sure your default file for Asterisk specifies the asterisk user and group; if using systemd, make sure that the asterisk service file is sane.
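
For the systemd case, one way to sanity-check a unit file is to read its `[Service]` section and confirm the user and group. This is a sketch only: it assumes Python 3 and a hand-written Asterisk unit, and the sample unit text below is made up for illustration, not copied from any distro.

```python
import configparser


def service_user_group(unit_text):
    """Return (User, Group) from the [Service] section of a unit file's text."""
    # strict=False tolerates repeated keys, which unit files allow.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep systemd's case-sensitive key names
    parser.read_string(unit_text)
    if not parser.has_section("Service"):
        return (None, None)
    service = parser["Service"]
    return (service.get("User"), service.get("Group"))


if __name__ == "__main__":
    # Hypothetical unit contents, for illustration only.
    sample_unit = (
        "[Unit]\n"
        "Description=Example Asterisk unit\n"
        "\n"
        "[Service]\n"
        "User=asterisk\n"
        "Group=asterisk\n"
        "ExecStart=/usr/sbin/asterisk -f\n"
    )
    print(service_user_group(sample_unit))
```

A result of `(None, None)` would mean the unit has no `[Service]` section or sets neither key, in which case the service runs as root by default.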


(Jared Busch) #6

This is a native FreePBX distro. It was FreePBX 13 installed in 2016 and upgraded to 14 last fall.

Never had any weird issues not caused by me.


(Jared Busch) #7

Just tried to reboot from the GUI, but left my SSH session open.

MySQL was definitely running as I was making changes and reloading with no issues.

Same thing.
[Image removed: unredacted.]

This repeated until
[Image removed: unredacted.]

After that the console showed the reboot.


(Jared Busch) #8

Okay, I reverted the PBX to the snapshot I had prior to the update last night.

I rebooted cleanly, made sure everything was working normally, and then made a new snapshot.

After the snapshot I rebooted again just because I want to make sure everything is as normal as possible. Everything was working fine.

I then ran yum update from an SSH session. It took almost an hour. I have never seen this happen before. The system was last updated sometime in April or early May.

When the update was done I issued the reboot command from the SSH session.

That gave me a weird error message that seems like some script was still running. So I exited that session.
image

I still had a separate SSH session running to see if I got the above MySQL errors this time. Yup, I did.

Here is everything from the yum update session (it is quite long): https://hastebin.com/bidupexasi.lua

Eventually, it did reboot after about 5 minutes of those errors.

Everything came up normal and was working. I waited about 30 minutes and then issued a reboot from SysAdmin within the web GUI.

I did not get MySQL errors, but I did get this. Again like 5 minutes to actually reboot.

I had a separate window up and it showed /sbin/reboot executing the entire time.

I waited 20 minutes and issued another reboot from the web GUI, and look what happened.
image

At this point I assume it will fully reboot in 5 minutes as it has been.
edit: yup.

Edit 2:
For reference this was FreePBX 13 installed back in 2016. It was upgraded to FreePBX 14 via the upgrade script in 2017 (October I think).

The system has 6 users and a few test extensions.

The system is running on a $5 Vultr instance.


(Jared Busch) #9

This morning I continued testing things.

I logged in and upgraded the firewall module. Then rebooted from the web GUI.

I logged back in after the reboot, waited 10 minutes, and restarted from the web GUI again.

Nope. This is certainly not a good thing.
image

5 minutes later it rebooted.

Logged in and upgraded framework and core, then rebooted. Same thing.


(Jared Busch) #10

One note I never mentioned.

While this thing is puking out Firewall errors, Asterisk is already shut down. All call processing is dead for this 5 minutes.


(Jared Busch) #11

I spun up a new FreePBX 14 instance, made a backup on the original, restored on the new one, and zero issues.

But the original system is doing this every time.

This was our company PBX with few extensions, but we have clients with hundreds. I’m a bit worried about updating the ones that came from FreePBX 13 originally.


(Rob Thomas) #12

When you run ‘reboot’, systemd asks all the services to shut down nicely, and gives them plenty of time to do so. I’ve noticed that sometimes Apache does NOT shut down when it should. I'm not saying that this is your problem, but it’s going to be something like that.

The easiest way to work around this is to just force a reboot next time: run /sbin/reboot -f and the machine will reboot (almost) instantly.

Or, you can spend the time figuring out which service is not shutting down, which I BELIEVE you can get by digging through the systemd journals.
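
As a starting point for that digging, here is a rough sketch that scans the previous boot's journal for services that logged "Stopping ..." but never "Stopped ...". It assumes Python 3 and that journald keeps logs across reboots (persistent storage), which may not be the case by default. The function names are mine, and the matching relies on systemd's usual "Stopping X..." / "Stopped X." message wording.

```python
import re
import subprocess

# systemd logs "Stopping <description>..." and later "Stopped <description>."
STOPPING = re.compile(r"Stopping (.+?)\.\.\.$")
STOPPED = re.compile(r"Stopped (.+?)\.$")


def units_still_stopping(lines):
    """Return descriptions that logged 'Stopping' with no later 'Stopped'."""
    pending = []
    for line in lines:
        line = line.rstrip()
        match = STOPPING.search(line)
        if match:
            if match.group(1) not in pending:
                pending.append(match.group(1))
            continue
        match = STOPPED.search(line)
        if match and match.group(1) in pending:
            pending.remove(match.group(1))
    return pending


def previous_boot_journal():
    """Read the previous boot's journal (needs persistent journald storage)."""
    result = subprocess.run(
        ["journalctl", "-b", "-1", "--no-pager", "-o", "short-monotonic"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    try:
        journal_lines = previous_boot_journal()
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not read the previous boot's journal: {exc}")
    else:
        for name in units_still_stopping(journal_lines):
            print("never finished stopping:", name)
```

Anything it prints is only a candidate. A service that was still being stopped when logging itself shut down will show up here too.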

Additionally, a lot of people don’t know (because I haven’t documented it!) that we provide a ‘kreboot’ service that lets you reboot the KERNEL without rebooting the entire machine. 99% of the time, this is all you need, e.g. after a kernel update.


(Jared Busch) #13

Thanks @xrobau

Well the firewall cannot talk to MySQL according to the big error banner.

So frustrating because I know a new clean system does not do this.

I still have the old system online, but it is no longer activated, so I cannot use SysAdmin to reboot.


(Rob Thomas) #14

That’s because MySQL and Asterisk have already shut down, but something else hasn’t. Firewall is extremely persistent about hanging around when things try to kill it, which is why it’s yelling (legitimately) about other things being broken.


(Jared Busch) #15

Wow, just logged in and after updating yesterday, there is a new kernel today.


(Rob Thomas) #16

Yep, it was just released a few hours ago. It’s related to this:

https://access.redhat.com/solutions/3485131


(Jared Busch) #17

Well then, let’s see what happens!


(Jared Busch) #18

Nice to know. But, I do a full reboot once a month so that I know with zero doubt that the system can do so in case of some hardware or other external issue.

I’m not a believer in striving for some stupid high “uptime” like some people do.


#19

Maybe the firewall systemd .service file could be better tuned/ordered as to dependencies and requirements. In other arenas such an approach has worked for me.


(Rob Thomas) #20

Firewall is started and stopped by fwconsole. This has nothing to do with firewall, firewall is just helpfully - and correctly - saying ‘Things that should be running are not running’.


(Jared Busch) #21

Updated via yum and rebooted: same error. But I decided to stop working for the night.

I will look at it more tomorrow.

Thanks for pointing me in a direction @xrobau