Server Reboots

I typically reboot my servers at least once a month. This is normally scripted and just runs a reboot command via cron. For rebooting FreePBX, is there any script/command that I should run before the reboot command to ensure Asterisk has shut down properly?

Don’t like to play the uptime game? CentOS is pretty stable; you can probably go years without a reboot.

fwconsole stop
(or, if running an older build, amportal stop)

Follow that with

shutdown -r now

to reboot it.

Uptime game… I used to play that game. Now I like rebooting every so often, stable or not.

I’ll script out what you mentioned, as that’s exactly what I was thinking of doing. I didn’t know if there was a better way or not. It’s quick and simple though. It should work just fine.
Thanks!
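
For anyone else wanting to script this, here is a minimal sketch of what that cron-driven reboot could look like. The script name, paths, and schedule below are just illustrative assumptions, not anything prescribed in this thread:

#!/bin/bash
# Example only: monthly FreePBX reboot script (e.g. saved as /usr/local/sbin/pbx-reboot.sh)
PATH=/usr/sbin:/usr/bin:/sbin:/bin   # cron runs with a limited PATH, so set one explicitly
fwconsole stop                       # stop FreePBX services and Asterisk cleanly (amportal stop on older builds)
shutdown -r now                      # then reboot the box

# Example crontab entry: run at 03:00 on the first of every month
# 0 3 1 * * /usr/local/sbin/pbx-reboot.sh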

The uptime game is stupid. You have no idea if the system will boot after a failure. It doesn’t matter how stable the underlying OS is.

Rebooting everything monthly and testing your processes is definitely a better path than some small-man complex about getting the best uptime.

haha! Well I guess you can run your DC any way you like! :slight_smile: I’d be vilified by my paycheck signers if I kept regularly rebooting systems without a really good reason in 2019.

You could tell 'em it’s windoze perhaps? :wink:

Ha, yeah, sometimes MS does force your hand; in my experience it’s usually for OS patching rather than performance problems. For people reading this, Asterisk on CentOS can go a long time without a reboot, measurable in years.

Some of us don’t have that luxury. For some, rebooting a single system once a month would require a lot of work just for a silly reboot that isn’t needed or required: informing the entire customer base that there will be downtime, having a plan in place for that just-in-case situation of “oh crap, things didn’t come back right after the reboot,” and, of course, the fact that some of us don’t have just one system, we have several.

Seriously, there is absolutely no reason to be rebooting these servers on a monthly or even regular basis. This has nothing to do with playing the “uptime game”; it has to do with the fact that it’s Linux. You can restart the services you need, when you need to, for the things that actually need a restart. The number of people in this community who randomly reboot their systems with changes is staggering, and it’s a horrible practice.
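
For instance, restarting the services in place rather than rebooting might look like the following. These are the usual FreePBX/Asterisk commands, but treat the exact invocation as a sketch and adjust for your own build:

fwconsole restart                        # restart the FreePBX services, including Asterisk
# or, to bounce only Asterisk and let active calls finish first:
asterisk -rx "core restart gracefully"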

It is most certainly not randomly rebooting with changes. A scheduled (pick your own schedule, monthly in my case) reboot is part of a good DR process. An untested DR process is the same as no DR process.

I have many systems out there that I have set up. Not all are under my direct control, but every single deployment has been advised to follow what I just described.

Disaster recovery is what is needed; it has nothing to do with rebooting. The machines might no longer exist due to fire, flood, or malicious practice.

By all means rehearse ‘failure recovery’ on any schedule you choose.

Nobody said anything about untested DR processes, and nothing says that testing your DR processes requires you to reboot or shut down the server. What are you actually testing each month when you reboot these servers? Most things would only require you to shut down the service that your DR systems take over for. Shut down Asterisk and all your SIP/Asterisk destination traffic should flip to the DR system; no need to shut the entire server down.
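
As a rough sketch (the exact commands and checks are assumptions here, not something from this thread), that service-level failover test could be as simple as:

# Drain and stop only Asterisk; SIP/trunk traffic should fail over to the DR system
asterisk -rx "core stop gracefully"

# ...verify calls are landing and completing on the DR box...

# Bring Asterisk and the FreePBX services back up on the primary
fwconsole start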

Are you really just testing the whole “I gave it a reboot and it came back cleanly with everything running how it should” part of it? That’s great and all, but when have you ever run into a situation where a server died or randomly shut down cleanly?! That is generally not the case: something locked up, it lost its mind or its power, and it either rebooted on its own or required someone to do it. Either way, that’s the real honey and milk of it: what happens when the system takes a hit and doesn’t go through the shutdown process cleanly?

A clean reboot as a test only verifies that the system rebooted properly and all the services started up correctly and in the proper order. That does nothing to test for actual disaster situations. The real test would be randomly pulling the power to force a hard shutdown that isn’t clean and then bringing it back up. Does it recover fine then?

I very clearly said part of the process.

Yes, for the systems I have direct control over, I also test that the backup boots and services come up in a new copy of the VM, or in a new instance if it’s on something like Vultr or Digital Ocean.

So your process includes shutting down the primary for 5-10 minutes, watching everyone roll over to the backup server, making sure that everything is working as it should on the backup (calls, modules, etc.), and then bringing the primary back up after about 15 minutes or so to see if everyone rolls back over to the primary, and then doing the testing again? Or are you just rebooting the servers to make sure they come up properly?

That would be about 15-30 minutes of possible downtime per location that you needed to “test”, because rebooting servers in a controlled and safe environment/process isn’t a test; it’s just rebooting servers.

Then if that all works, what is the point of ‘rebooting’? It will likely come up in exactly the same way as your ‘copies’, no?