Reoccurring Issues: Multiple Problems with DAHDI Line

Let me start with what is going on, and apologize for how long this is; I just want to provide all I can upfront and be detailed about the issues and attempted fixes. We just moved buildings and decided to upgrade to the latest FreePBX since the installed version was from 2009. We set up a Dell server with 12GB of RAM, running a minimal CentOS 7.4.1708 install. The software is FreePBX 14.2.0.10, Asterisk 15.2.0, and DAHDI 2.11.1+2.11.1. We are using a Digium 8 FXO AEX800 PCI-E card with the WCTDM24XXP module and VPMADT032 hardware echo cancellation. We are running one DAHDI line into a hunt group of four total numbers and PJSIP extensions. When the phones go down, the PJSIP lines still function and can call extensions, but the incoming DAHDI line does not work until we run “fwconsole restart”. I have made it a point to call in 4-5 times a day to check the status of the phones, which is not a solution going forward.

Multiple issues have presented themselves since this install went live a month ago. They occur almost daily or every other day. Sometimes when people call in they are unable to hear the IVR. Other times people call in and it just rings and rings without stopping, and they never reach the IVR or the default route set up by the IVR. Sometimes people can call in and the IVR answers but does not accept any input, though if left alone the call will reach the default route. The last of the reoccurring issues is when people call in and our side answers: the caller cannot hear the answering side, and it appears to be one-way audio.

We have been troubleshooting and scouring many of the wiki pages and other help pages, and feel like we have tried everything. We are hoping the community will be able to assist before we attempt a rollback or reach out to Digium.

The following error messages have been observed in dmesg, and seem to occur before we find out the phones are down.

  • Missed interrupt. Increasing latency to 5 ms in order to compensate.
  • Unable to disable sw companding on echo cancellation channel 0 (reason 4)
  • Unable to set SW Companding on channel 0 (reason 4)
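
To avoid watching dmesg by hand, a small filter along these lines could be fed from `dmesg`. This is only a sketch: the function name `dahdi_kernel_warnings` is made up for illustration, and the patterns are simply the three messages quoted above, matched case-insensitively.

```shell
#!/bin/sh
# Hypothetical helper (not part of DAHDI): keep only the kernel log lines
# that match the messages quoted in this post. Reads log text on stdin.
dahdi_kernel_warnings() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -i -E 'Missed interrupt|sw companding' || true
}

# Example on a live system (assumes this user may read the kernel log):
#   dmesg | dahdi_kernel_warnings
```

Run periodically, it would show whether the messages line up in time with the outages, which is what we have only been able to guess at so far.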

Some of the things we have attempted to do to resolve the issue include, but are definitely not limited to:

  • fwconsole restart fixes issues most of the time
  • Changing from Software Echo Cancellation to Hardware
  • Upgraded Server’s BIOS
  • Latest Dahdi firmware
  • Set relaxdtmf to ‘yes’
  • Used fxotune to attempt to configure
  • Disabled the system frame buffer by adding nomodeset to the kernel boot line per the following knowledge base article: https://support.digium.com/community/s/article/How-to-disable-the-Linux-frame-buffer-if-it-s-causing-problems
  • Installed acpid.
  • Made sure irqbalance was installed
  • Installed OS updates and rebuilt the kernel module
  • Ran the fxotune command again with a silence timeout of 2 seconds

https://www.voip-info.org/wiki/view/Asterisk+debugging
https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information
Issuing systemctl restart dahdi did not fix the issue. Only fwconsole restart did.

http://lists.digium.com/pipermail/asterisk-users/2009-August/236189.html

The only way we have been able to fully restore phone functionality is to run “fwconsole restart”. This has also occasionally caused the FXO port groups and context to need to be reconfigured under DAHDI Config.

Any help or suggestions would be greatly appreciated!

Just a thought: perhaps it is sharing interrupts.

watch -d -n 1 'cat /proc/interrupts'

Can you move your hardware to another slot?
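
If the full table is too noisy to eyeball, something like the following could narrow it down. It is a rough sketch under a couple of assumptions: the helper name `shared_irq_lines` is invented here, and the driver is assumed to show up in /proc/interrupts under a name containing the string you pass in (for example `wctdm`). It relies on shared IRQ lines listing their devices separated by commas.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/interrupts-style text on stdin and print
# only the lines where the given driver name appears together with at
# least one other device (shared lines are comma-separated).
shared_irq_lines() {
    driver="$1"
    awk -v d="$driver" 'index(tolower($0), tolower(d)) && index($0, ",")'
}

# Example on a live system:
#   shared_irq_lines wctdm < /proc/interrupts
```

No output would suggest the card has an IRQ line to itself, at least as far as this simple check can tell.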

Sorry for the delay in replying; we wanted to wait a bit to see whether assigning the card to a processor with a lower interrupt count would have an effect. We did not see the “Missed interrupt. Increasing latency to 5 ms in order to compensate.” message again, but the phones did go down again: I called in and Asterisk showed that it picked up and was playing the IVR, but I was unable to hear anything on my side and was also unable to enter an extension. I ran “fwconsole restart”, called back in, and was able to hear again. We did still receive “Unable to set SW Companding on channel 0 (reason 4)” in dmesg, though. I can take a look at moving the card to a different slot if the phones go down again today.

dicko,

Phones went down again this morning. I was able to call in but did not hear anything. I had the logs pulled up and dmesg running while I called in, and the logs showed the call connecting and all the playbacks happening, such as the IVR and the prompt asking for a response, but there was silence on my side. I tried typing in an extension and saw the digits go through in the log, but still silence on my side. Also, the logs didn’t show any errors such as the Missed interrupt or the SW Companding ones that we were experiencing before. I ran fwconsole restart, called back in, and could hear everything properly. The card is still set to its own CPU for interrupts, and this server is only running FreePBX and Asterisk. It is not used for anything else.
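
For anyone following along, pinning an IRQ to one CPU is normally done by writing a hexadecimal bitmask to /proc/irq/<number>/smp_affinity. The sketch below only computes that mask; the helper name `cpu_affinity_mask` is invented for illustration, IRQ 16 in the comments is a placeholder rather than our actual IRQ, and it assumes a machine with fewer than 32 CPUs so the mask fits in one group.

```shell
#!/bin/sh
# Hypothetical helper: print the hexadecimal smp_affinity mask that selects
# a single CPU, given its zero-based index (CPU 0 -> 1, CPU 3 -> 8).
cpu_affinity_mask() {
    printf '%x\n' $((1 << $1))
}

# Example (as root). IRQ 16 is a placeholder; take the real number from
# /proc/interrupts:
#   cpu_affinity_mask 3 > /proc/irq/16/smp_affinity
#   cat /proc/irq/16/smp_affinity
```

One caveat I am not certain applies here: since irqbalance is installed, it may move the interrupt again unless it is configured to leave that IRQ alone.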

Any ideas?

What exactly do you mean by that? Do you mean that the phones went straight to UNREACHABLE status, or the issue with one-way audio?

Sorry, it appears my first reply was posted as a general reply rather than as a reply to you.

By down again, I mean it was not functioning correctly or as expected/intended. The phones were not in unreachable status, but I was not able to hear the IVR when I called in, even though according to the logs and Asterisk console, it was playing the IVR and receiving my input. But I was not able to hear anything.

Look to the basics,

http://kb.digium.com/articles/FAQ/How-do-I-record-calls-using-dahdi-monitor

Is the audio present after passing through DAHDI?

Do you happen to also have a SIP trunk or just the dahdi card?

Yes, we do. We have a pjsip trunk through VoicePulse and the Dahdi trunk through Verizon. So far we have not experienced any issues with the pjsip lines.

I’ll give the audio recording a try next time it happens and let you know. Thank you by the way.

Have you considered the possibility of the DAHDI card being faulty? Probably not the cause, but it is worth testing with the dahdi command. I can’t remember its exact name, but it is the one that shows the reliability as a percentage.

Another possibility could be that you are running on Asterisk 15, have you tried switching to Asterisk 13? It can be done on the fly, just a couple of minutes of downtime.

man dahdi_monitor

Just choose an active channel and run it with -vv to get a quick TUI graph of rx/tx. If the lines are not jumping, there is no audio in that direction.

Audio is passed by the DAHDI channel driver to asterisk. The two systems are separate so you need to isolate whether the problem is DAHDI or asterisk.

Yes, we have considered it, but without a way to confirm it, that was going to be a last step after other resources were exhausted.

[ServerName]# dahdi_test
Opened pseudo dahdi interface, measuring accuracy...
100.000% 99.998% 99.996% 99.997% 99.999% 99.999% 99.998% 100.000%
99.996% 100.000% 99.999% 99.999% 99.999% 99.998% 99.996% 99.995%
99.996% 99.999% 99.999% 99.998% ^C
--- Results after 20 passes ---
Best: 100.000% -- Worst: 99.995% -- Average: 99.998068%
Cummulative Accuracy (not per pass): 99.999
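
To check a longer run without reading every pass, the per-pass figures could be reduced to the single worst value. This is a sketch only: `dahdi_test_worst` is a made-up name, and the 99.98 default threshold is an arbitrary figure picked for illustration, not a limit taken from this thread or from Digium.

```shell
#!/bin/sh
# Hypothetical helper: read dahdi_test output on stdin, print the lowest
# per-pass percentage and whether it falls below a threshold.
# The default threshold (99.98) is an assumption for illustration.
dahdi_test_worst() {
    threshold="${1:-99.98}"
    # Drop the summary lines (they contain ':'), split into tokens,
    # keep only tokens that look like "99.998%", then strip the '%'.
    grep -v ':' | tr ' ' '\n' | grep -E '^[0-9]+\.[0-9]+%$' | tr -d '%' |
    awk -v t="$threshold" '
        NR == 1 || $1 + 0 < min { min = $1 + 0 }
        END {
            if (NR == 0) { print "no samples"; exit 1 }
            printf "worst=%.3f %s\n", min, (min < t + 0 ? "BELOW" : "ok")
        }'
}

# Example: paste saved dahdi_test output into a file and run
#   dahdi_test_worst < dahdi_test.log
```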

We have considered this, but were hoping for some kind of positive results from troubleshooting before just giving up and rolling back.

See my edited post: Asterisk and DAHDI are totally separate processes. You can run either without the other one running.

I ran dahdi_monitor --vv 1 and got this:

( # = Audio Level * = Max Audio Hit )
<----------------(RX)----------------> <----------------(TX)---------------->
Rx: 100 ( 101) Tx: 0 ( 0)

I called in and saw the TX move when the IVR was talking and the RX move when I was talking or inputting an extension.

There’s no need to roll back. The Asterisk version can be changed on the fly with a command from the CLI.

[ServerName ~]# asterisk-version-switch
-bash: asterisk-version-switch: command not found

This is not a FreePBX Distro. We installed it from Asterisk’s website:
https://www.asterisk.org/downloads/asterisk/all-asterisk-versions

The RX is moving when you “input an extension”; that is the DTMF, so it's working. “Set it and forget it” to record the files as a stereo pair:

dahdi_monitor 1 -s -t capture1.wav

See what happens when it breaks :)

dicko,

Well, so far I haven’t had to test the command you provided. Phones have been working each time I’ve called in since the 14th. This is the longest they have remained up consistently, knock on wood, but I am not really confident in the system. Thank you so much for your help thus far, and if anything changes, I’ll update this.

Alright dicko, I spoke too soon, or that wood wasn’t good. I called in this morning and couldn’t hear the IVR or anything. So I ran the command you gave me and recorded the Stereo and the txstream. When I played them back on my computer, on the Stereo I first heard what sounded like dial-up connection sounds, then the IVR played and I could hear myself type in my extension, then my greeting played. On the txstream, I didn’t hear those weird sounds at the start, but I heard the IVR and my greeting. But there is nothing on my phone when I call in.