Any updates on letsencrypt certs not renewing automatically?

Another example that caused me a heart attack for about an hour that was caused by a non renewing certificate.

In an environment with two PBXact servers in a redundancy configuration using the Advanced Recovery module, starting at midnight tonight the primary server’s asterisk wouldn’t stay running for more than two minutes. Nothing we did would ensure that asterisk stayed up. We would watch it start up after running asterisk -vvvc and then a minute into the startup procedure we would see it getting shut down. It wasn’t crashing or freezing up, asterisk was simply going through the shutdown procedure and then when we checked right after there was no running asterisk on the primary system to process calls.

Restarting the system would make no difference either.

After about an hour of sheer panic we figured out that the certificate on the secondary unit was expired. It took us a while to think of checking on the secondary server because the problem was manifesting itself only on the primary.

The expired certificate caused heartbeat connectivity issues between the servers and basically was causing the secondary server to keep shutting down asterisk on the primary without really anything showing up in the asterisk full logs on the primary side (side note, it would be great if Advanced Recovery would somehow log that it was shutting down asterisk when it was shutting it down inside the asterisk full logs).

Again, it took us a while to consider checking anything related to the secondary server and Advanced Recovery because nothing obvious was showing up in the asterisk full logs of the primary.

Renewing the certificate on the secondary unit immediately resolved the issue.

No idea how we got unbelievably lucky that the expiration fell on a Sunday, the only day this environment is shut down and receives no calls so we had some breathing room to troubleshoot.

Confirmed that the cert renewal cron line 44 1 * * * /usr/sbin/fwconsole certificates --updateall -q 2>&1 >/dev/null is missing inside the /var/spool/cron/asterisk file on the affected systems.

Going to try adding just that line in manually and see if it disappears again.