Asterisk Randomly stopped. Can you help me interpret these logs?

Hi. One of my FreePBX/Asterisk instances randomly stopped today. It required an fwconsole restart to get it back up and running. I checked the logs but I can’t seem to decipher if the problem is actually apparent from them. Clearly something went wrong, because there are a lot of unepected warnings and whatnot all happening at the same time. But I can’t tell if the cause for all of these problems is present.

I’m running asterisk 16.17.0
FreePBX 15.0.17.34

I’m mostly worried it might be some time of hack. If that’s not it, then I just don’t want it to happen again. I noticed that my full logfile isn’t rotating anymore. I’m not sure if that’s relevant. This one goes all the way back to February 1.

Here is a pastebin to the logs starting just after everything had been calm for a while until until just after restarting asterisk. Can anyone see what might have gone wrong?

I looked for the core dump in the tmp directory, but couldn’t find it. There are a few unexpected directories there, though. There are a number of systemd service directories. There is a reader.log file that has the line “Module is not licensed” repeated almost 2 million times. There is a cron.error file too, but it seems to be empty.

I noticed two lines that said:

[2022-04-20 12:35:16] VERBOSE[3365] res_pjsip.c: Local IPv4 address determined to be: obfuscated
[2022-04-20 12:35:16] VERBOSE[3365] res_pjsip.c: Local IPv6 address determined to be: obfuscated

Except that these were my public IPs, not local.

I did some searching for some of the logs and came across this post, this post, and this post, but none of them led to me a solution.

Lines like this:
[2022-04-20 12:35:21] VERBOSE[3365] translate.c: Registered translator 'slin 24000khz -> 16000khz' from codec slin to slin, table cost, 850000, computational cost 999999
seem to indicate a problem too.

My guess is that just before 2022-04-20 12:34:45 , your Asterisk crashed, look in /var/log/messages (/var/log/syslog) at about that time for a clue.

I couldn’t download the messages file for some reason, so I copied the pertinent parts out of the terminal. I couldn’t find the syslog file.

Here is the messages file

Is your host set to suspend or hibernate?

I’ve never adjusted that setting, to be honest. I did a google search on how to change/check it, but I didn’t come across anything particularly helpful. What power method should it be set to?

I checked my monitoring server and didn’t see any downtime on the host. Perhaps it was very brief…

Your system crashed at 12:34 and restarted 7 hours later :slight_smile: , presumably a ‘clock issue’

Apr 20 12:34:09 convergentspokanemts systemd: Switching root.
Apr 20 12:34:10 convergentspokanemts journal: Journal stopped
Apr 20 19:34:10 convergentspokanemts journal: Runtime journal is using 8.0M (max allowed 91.8M, trying to leave 137.8M free of 910.8M available → current limit 91.8M).

10 seconds later you will see fwconsole restarting. There follows a loop as FreePBX can’t see Asterisk. (safe_asterisk)

Starting around your line 800: . . .

Two things I would suggest, make sure your clock settings both hardware and software are solid. Verify that the service that controls fwconsole is behaving properly to start and stop asterisk in a timely fashion and has the correct permissions

Hi. Thanks for looking into the files. Yeah, I don’t think it actually restarted 7 hours later, so probably a clock issue, or something. The logs actually switch back to the correct time after 34 seconds or so (which can be seen on line 809). However, I don’t think asterisk came back up until I was made aware of the problem a couple hours later.

Would an fwconsole chown be a good way to ensure proper permissions? What is the best way to test starting and stopping asterisk in a timely manner? Should I just restart it with fwconsole and time it?

As far as time settings, they’re on the west coast and the server is hosted in seattle. I have the PHP timezone set as America/Los_Angeles. But I’m not sure how to check to see if my hardware settings are “solid.” I’m hosting with Vultr and I’m not aware of how to change a hardware time setting. I think you were wondering about more than just timezone settings on the software side as well. Do you have a recommendations on what to check there?

Here is the output of timedatectl:

      Local time: Thu 2022-04-21 13:10:48 PDT
  Universal time: Thu 2022-04-21 20:10:48 UTC
        RTC time: Thu 2022-04-21 20:10:48
       Time zone: America/Los_Angeles (PDT, -0700)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: yes
      DST active: yes
 Last DST change: DST began at
                  Sun 2022-03-13 01:59:59 PST
                  Sun 2022-03-13 03:00:00 PDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2022-11-06 01:59:59 PDT
                  Sun 2022-11-06 01:00:00 PST

Warning: The system is configured to read the RTC time in the local time zone.
         This mode can not be fully supported. It will create various problems
         with time zone changes and daylight saving time adjustments. The RTC
         time is never updated, it relies on external facilities to maintain it.
         If at all possible, use RTC in UTC by calling
         'timedatectl set-local-rtc 0'.

Should I heed that warning and run the set-local-rtc command?

dahdi was started before the time was reset, that might be a problem but what is dahdi doing on a vultr instance?

says the hwclock should be UTC , so I can’t see the problem, but if it isn’t

man hwclock
( hwclock -w , when 'the time is right :slight_smile: and should likely be done before starting dahdi)
Apparently set-local-rtc should be 0 (the default)

No, the freepbx.service file has execute permissions, systemd doesn’t like that and I don’t believe fwconsole chown would fix that either. it should be owned by root:root with 0644 mode

lines 1359 and following suggests there might be a problem with your firewall, as soon after it reloads a failure to connect to asterisk ( TCP:5038 on the lo interface usually) is noted which is not resolved by the end of your posted logs.

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.