Consistent Asterisk/FreePBX Crash Issue

It crashes when I have the Proxmox VM processor type set to kvm64. I have tried other processor types and still have the crash.

The other Proxmox hosts, on which the same VM works flawlessly, are Xeon servers supported by CentOS 6.6.

If you’re trying to run CentOS 6.6 on an unsupported processor it’s entirely possible your problem could be the same as mine.

Is there a reason you can’t use FreePBX 14, based on CentOS 7, which is fully supported on newer CPUs?

It’s not ProxMox and it’s not FreePBX, it’s Asterisk. Use any Asterisk version other than 13.16.n or 13.17.n (asterisk <> 13.1[67].n).
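For anyone unsure how to read the "13.1[67].n" shorthand, here is a minimal Python sketch that treats it as a pattern covering 13.16.x and 13.17.x. The function name `is_affected` is made up for illustration, and whether every release in that range is actually affected is the poster's claim, not something this snippet verifies.

```python
import re

# "13.1[67].n" from the post above: 13.16.<n> or 13.17.<n>.
AFFECTED = re.compile(r"^13\.1[67]\.\d+")


def is_affected(version: str) -> bool:
    """Return True if the version string looks like 13.16.x or 13.17.x."""
    return AFFECTED.match(version) is not None


# Versions mentioned in this thread.
for v in ("13.15.0", "13.17.0", "13.17.2", "13.18.0", "14.6.2"):
    print(v, "matches 13.1[67].n" if is_affected(v) else "outside 13.1[67].n")
```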

As mentioned in the original post, I was running 13.17.0.

That said, I’m on 13.17.2 now and haven’t had a crash yet. BUT I’ve intentionally spread the extensions across a couple of other servers to lower the load. That reduced the crash frequency to once a week, and then further, until we went about three weeks without a crash. We still had another crash about three weeks later.

Since moving to 13.17.2 we haven’t had any more, but it’s only been about a week. I’m going to move all the extensions back over to the one server and see if the increased load causes another crash on this newer version of Asterisk.

We could. Would just prefer some more time to go by before we make the switch. Also, wasn’t convinced that would fix anything.

Also, we are using Xeon processors that should be fully supported by CentOS 6.6, but I haven’t double checked that yet.

Hi!

You don’t have .wav files but this is the same problem:

and it has nothing to do with your problem:

Good luck and have a nice day!

Nick

Again, there IS a problem with 13.17 that Digium has acknowledged. Just don’t use it and you won’t have to chase rabbits down the wrong hole.

Thanks Dicko. However, we’ve experienced the issue on 13.15 as well. So it’s hard to decide what’s safe. Go to current 14 version?

Also, our procs are 24 x Intel® Xeon® CPU E5-2620 and 24 x Intel® Xeon® CPU L5640, and both seem to be supported by CentOS 6.6, so that theory’s out the window.

Personally, I have found that pretty much any 13.x release with cdr-mysql will cause that. Update to ODBC, unload the cdr mysql stuff, and maybe . . . But yes, my machines are quite happy now under 13.18.rc? or 14.6.2 on ProxMox or Vultr (both have been a PITA for a few months).

13.17.2-3.shmz65.1.183 is now live, which, as you can see, has BETTER_BACKTRACES.

The same applies to Asterisk 14 as well

freepbxdev1*CLI> core show settings

PBX Core settings
-----------------
  Version:                     13.17.2
  Build Options:               DONT_OPTIMIZE, COMPILE_DOUBLE, BETTER_BACKTRACES, OPTIONAL_API
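If you want to check for the flag without reading the output by eye, a minimal Python sketch along these lines could parse the "Build Options" line. It assumes the output layout shown above. The function name `build_options` and the `SAMPLE` text are illustrative only.

```python
from typing import List


def build_options(settings_text: str) -> List[str]:
    """Extract the 'Build Options' list from 'core show settings' output."""
    for line in settings_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Build Options":
            return [opt.strip() for opt in value.split(",") if opt.strip()]
    return []  # no 'Build Options' line found


# Sample copied from the CLI output in this thread.
SAMPLE = """\
PBX Core settings
-----------------
  Version:                     13.17.2
  Build Options:               DONT_OPTIMIZE, COMPILE_DOUBLE, BETTER_BACKTRACES, OPTIONAL_API
"""

opts = build_options(SAMPLE)
print("BETTER_BACKTRACES enabled:", "BETTER_BACKTRACES" in opts)
```

On a live system the text could be captured with something like `asterisk -rx "core show settings"` and passed to the same function. That capture step is not shown here.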

Great news!

We have run the FreePBX update scripts and can confirm that BETTER_BACKTRACES shows in the build options. Thanks!

Are you referring to the “bad magic number” FRACK Error I mentioned at the top of the thread?

If so, our next crash or FRACK, now that we have BETTER_BACKTRACES enabled, should show that, if it is indeed the cause, correct?

No, a predictable Asterisk crash on the second ‘core reload’ (I have never seen a FRACK in Asterisk).

So we finally had another FRACK. It now appears that the “Serious Network Trouble” error we’ve been having is related. Here’s part of our log:

[2017-11-11 06:03:50] WARNING[8721] chan_sip.c: Unable to cancel schedule ID 0. This is probably a bug (chan_sip.c: do_dialog_unlink_sched_items, line 3266).
[2017-11-11 06:03:50] ERROR[5146] /builddir/build/BUILD/asterisk-13.17.2/include/asterisk/utils.h: Memory Allocation Failure in function ast_str_create at line 655 of /builddir/build/BUILD/asterisk-13.17.2/include/asterisk/strings.h
[2017-11-11 06:03:50] WARNING[5146] chan_sip.c: sip_xmit of 0x7f0428c3af80 (len 139655827686296) to 108.23.78.98:4279 returned -2: Cannot allocate memory
[2017-11-11 06:03:50] ERROR[5146] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data
[2017-11-11 06:03:50] ERROR[5146] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f04286eac38 (0)

More info on Asterisk bug tracker here: https://issues.asterisk.org/jira/browse/ASTERISK-27321

You didn’t upload the backtrace to the asterisk ticket. Please ensure you do this.

There was no core dump file, since Asterisk didn’t fully crash; it only logged FRACKs. How do I go about getting a backtrace for that?

You can’t. Since it didn’t crash, it’s not really related to your original issue.

Did you see the log I attached? Seems to have a lot more info around the FRACKs than previous ones, before BETTER_BACKTRACES was enabled. Do you see anything helpful there?

Particularly here:

[2017-11-11 06:03:50] ERROR[5146] /builddir/build/BUILD/asterisk-13.17.2/include/asterisk/utils.h: Memory Allocation Failure in function ast_str_create at line 655 of /builddir/build/BUILD/asterisk-13.17.2/include/asterisk/strings.h

So we’ve been good since November or so.

Yesterday, seemingly out of nowhere, we had 8,000 FRACKs! And today so far, 4,000!
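For reference, per-day counts like these can be tallied from the log with a minimal Python sketch such as the one below. It assumes every log line starts with a "[YYYY-MM-DD HH:MM:SS]" timestamp, as in the excerpts in this thread. The function name `fracks_per_day` is made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Leading "[YYYY-MM-DD HH:MM:SS]" timestamp, as seen in the log lines in this thread.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2}\]")


def fracks_per_day(lines: Iterable[str]) -> Counter:
    """Count log lines containing 'FRACK!', grouped by calendar date."""
    counts: Counter = Counter()
    for line in lines:
        if "FRACK!" not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from this thread (two FRACKs, one unrelated error).
SAMPLE_LINES = [
    "[2017-11-11 06:03:50] ERROR[5146] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data",
    "[2017-11-11 06:03:50] ERROR[5146] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f04286eac38 (0)",
    "[2018-01-26 09:50:48] ERROR[20467]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x3fe16a0 (0)",
]

for day, count in sorted(fracks_per_day(SAMPLE_LINES).items()):
    print(day, count)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LINES`. The log location (for example /var/log/asterisk/full) is an assumption and depends on your logger configuration.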

But sadly, there’s still no useful info. This is what the Asterisk CLI is showing every several seconds:

[2018-01-26 09:50:48] ERROR[20467]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x3fe16a0 (0)
Got 18 backtrace records
#0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)
#1: [0x45e2c6] asterisk <unknown>()
#2: [0x45e2f3] asterisk <unknown>()
#3: [0x45f5f2] asterisk <unknown>()
#4: [0x45f829] asterisk __ao2_link() (0x45f7e6+43)
#5: [0x45fc9c] asterisk <unknown>()
#6: [0x45ff3f] asterisk __ao2_callback() (0x45fee0+5F)
#7: [0x7fed4997312d] chan_sip.so <unknown>()
#8: [0x7fed49972e6f] chan_sip.so <unknown>()
#9: [0x4dba68] asterisk ast_cli_command_full() (0x4db7f4+274)
#10: [0x4dbbcc] asterisk ast_cli_command_multiple_full() (0x4dbb34+98)
#11: [0x45512a] asterisk <unknown>()
#12: [0x603d14] asterisk <unknown>()
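To see at a glance which frames were left unresolved, a minimal Python sketch like the following can group the `<unknown>` addresses by module. It assumes the frame layout shown in the backtrace above. The name `unresolved_frames` is illustrative, and the sketch only collects addresses; it does not resolve them to symbols.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Frame layout as printed above: "#N: [0xADDR] module symbol() ..."
FRAME = re.compile(r"^#(\d+): \[(0x[0-9a-fA-F]+)\] (\S+) (\S+)\(\)")


def unresolved_frames(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group the addresses of '<unknown>' frames by the module they belong to."""
    by_module: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        match = FRAME.match(line.strip())
        if match and match.group(4) == "<unknown>":
            by_module[match.group(3)].append(match.group(2))
    return dict(by_module)


# A few frames copied from the backtrace in this thread.
SAMPLE = [
    "#0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)",
    "#1: [0x45e2c6] asterisk <unknown>()",
    "#4: [0x45f829] asterisk __ao2_link() (0x45f7e6+43)",
    "#7: [0x7fed4997312d] chan_sip.so <unknown>()",
]

for module, addresses in unresolved_frames(SAMPLE).items():
    print(module, addresses)
```

Turning those addresses into function names would need debug symbols that match the installed build. That step is not covered here.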

That said, no crashes yet… but calls seem to take a long time to initiate.

Has anything changed so that we can see what is behind the "unknown"s above?

No. We have followed all of Digium’s recommendations.