How to troubleshoot these kinds of problems

We have multiple on-prem FreePBX setups for different customers in different locations and everything is fine, except for one client that blows our minds.

How can we troubleshoot these problems:

  1. After an agent answers a call from the queue, the queue continues to ring the other agents with the same caller number. For example, client X calls queue 1000 through the IVR, which rings 4 agents. One of them answers the call, but the system continues to ring the other 3 agents until the call is hung up. This happens all the time at one location.

  2. Our FreePBX system is configured to record all inbound and outbound calls. This works perfectly, except that sometimes a call is not recorded and appears in the CDR as FAILED or BUSY, even though in reality the call took place, the agent answered, and everything worked.

We have been Googling, GPTing, and testing for at least 3 months and nothing works. We fixed a lot of the customer’s network problems, but nothing solved the problems above.

Regarding logs, everything seems to be OK. We are worried that there may be some bugs in Asterisk, but I don’t want to think about that.

For that customer we use: FreePBX

Have you experienced something like this?

For problem 1 I would start with a SIP packet capture to confirm that calls are completing as expected and that all parties involved (extensions and Asterisk) are receiving and sending the expected network traffic. You’ll also want to comb through /var/log/asterisk/full for any indication of a problem when a call is sent to an agent who actually takes it.
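To make combing through the full log less painful, here is a minimal Python sketch that groups log lines by call identifier so one failing call can be read in isolation. It assumes the usual line shape `[timestamp] LEVEL[thread][C-xxxxxxxx] source.c: message`; the function names and the sample lines are invented placeholders, not real Asterisk output.

```python
import re
from collections import defaultdict

# Assumed line shape: [timestamp] LEVEL[thread][C-xxxxxxxx] source.c: message
CALL_ID = re.compile(r"\[(C-[0-9a-fA-F]+)\]")


def group_by_call(lines):
    """Map each call identifier (C-xxxxxxxx) to the log lines that carry it."""
    calls = defaultdict(list)
    for line in lines:
        match = CALL_ID.search(line)
        if match:
            calls[match.group(1)].append(line.rstrip("\n"))
    return dict(calls)


def calls_mentioning(calls, needle):
    """Return the call identifiers whose lines contain the given text."""
    return sorted(
        cid for cid, rows in calls.items() if any(needle in row for row in rows)
    )


# Invented placeholder lines; real messages and source file names will differ.
SAMPLE = [
    "[2024-05-01 10:00:01] VERBOSE[2210][C-0000002a] placeholder.c: ringing PJSIP/101",
    "[2024-05-01 10:00:03] VERBOSE[2210][C-0000002a] placeholder.c: answered PJSIP/101",
    "[2024-05-01 10:05:00] VERBOSE[2300][C-0000002b] placeholder.c: ringing PJSIP/102",
    "[2024-05-01 10:05:09] NOTICE[1999] placeholder.c: line without a call identifier",
]

if __name__ == "__main__":
    grouped = group_by_call(SAMPLE)
    for call_id in calls_mentioning(grouped, "PJSIP/101"):
        print(call_id)
        for row in grouped[call_id]:
            print("   ", row)
```

On a real system you would pass an open file handle for /var/log/asterisk/full to `group_by_call` instead of `SAMPLE`, then search for the extension of an agent whose phone kept ringing.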

I would even start there for problem 2 and ensure that there aren’t networking problems that are causing these issues for you.

I would also make sure that you are using the latest version of Asterisk on the particular branch that you are running and that your system and all modules are up to date.

The OP didn’t say they were using VoIP, although past experience is that people not using SIP say so explicitly, and the main problem is there is no simple way of deducing whether they are using the legacy (chan_sip) or the current (chan_pjsip) SIP driver.

I can’t think of anything that would cause (1), and I doubt that an Asterisk version with such a bug would make it into the field; maybe there is a problem, downstream, with handling CANCEL. The logs will confirm that.

Whilst I also can’t think of how (2) would happen, we need the Asterisk full log and the CDR report to really understand what you are describing.
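As a starting point for lining up the CDR report with what actually happened, here is a rough Python sketch that picks out FAILED or BUSY rows that still look like a real conversation. It assumes a CSV export with the columns `disposition`, `billsec` and `linkedid`; the function name and sample rows are made up, and a flagged row is only a candidate for manual review (with ringall, a genuinely busy agent leg can share a linkedid with the answered leg).

```python
import csv
import io

SUSPECT = {"FAILED", "BUSY"}


def suspicious_cdr_rows(rows):
    """Return FAILED/BUSY rows that have talk time or an ANSWERED sibling row."""
    rows = list(rows)
    # linkedid ties together all CDR rows that belong to one call.
    answered_links = {
        row["linkedid"] for row in rows if row["disposition"] == "ANSWERED"
    }
    flagged = []
    for row in rows:
        if row["disposition"] not in SUSPECT:
            continue
        has_talk_time = int(row["billsec"] or 0) > 0
        if has_talk_time or row["linkedid"] in answered_links:
            flagged.append(row)
    return flagged


# Made-up export; column names are an assumption about your CDR dump.
SAMPLE_CSV = """calldate,src,dst,disposition,billsec,linkedid,recordingfile
2024-05-01 10:00:00,0700000001,1000,ANSWERED,42,1714557600.10,rec-a.wav
2024-05-01 10:00:00,0700000001,101,BUSY,0,1714557600.10,
2024-05-01 11:00:00,0700000002,1000,FAILED,0,1714561200.20,
2024-05-01 12:00:00,0700000003,1000,FAILED,17,1714564800.30,
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    for row in suspicious_cdr_rows(reader):
        print(row["calldate"], row["dst"], row["disposition"], row["linkedid"])
```

The linkedid values it prints can then be used to find the matching section of the full log for each suspect call.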

The only thing that I can think of causing problem #1 is a networking issue: the SIP packets from the phone system, telling the other phones that the call has been picked up and that the ringing can stop, never reach the phones. This would of course show up in a packet trace.

I am of course also assuming that these queues are set up with a ringall strategy.

Your point stands though; there are certainly many assumptions on my part that were not clarified by @doubleweb.

When a call is answered, Asterisk sends out a SIP message (a CANCEL) to terminate the ringing calls on the other phones. The phones then stop ringing. If that is not happening, it sounds like a phone or network issue. The trace and logs will help. More details about your queue setup and endpoints will also be helpful.
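To check that behaviour in a trace, here is a minimal Python sketch that takes raw SIP messages (for example, pasted out of a capture as text) and reports call legs that were sent an INVITE but never finished. It relies only on standard SIP behaviour: a cancelled INVITE should end with a 487 response whose CSeq method is INVITE. The helper names and the sample messages are hypothetical.

```python
def parse_sip(message):
    """Extract (start_line, call_id, cseq_method) from one raw SIP message."""
    lines = message.strip().splitlines()
    start = lines[0].strip()
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    call_id = headers.get("call-id") or headers.get("i", "")  # "i" is the compact form
    cseq_parts = headers.get("cseq", "").split()
    return start, call_id, cseq_parts[-1] if cseq_parts else ""


def classify_legs(messages):
    """Report Call-IDs whose INVITE never received a final (>= 200) response."""
    state = {}
    for raw in messages:
        start, call_id, cseq_method = parse_sip(raw)
        leg = state.setdefault(call_id, {"invite": False, "cancel": False, "final": None})
        if start.startswith("INVITE "):
            leg["invite"] = True
        elif start.startswith("CANCEL "):
            leg["cancel"] = True
        elif start.startswith("SIP/2.0 ") and cseq_method == "INVITE":
            status = int(start.split()[1])
            if status >= 200:
                leg["final"] = status
    report = {"never_cancelled": [], "cancel_not_confirmed": []}
    for call_id, leg in state.items():
        if leg["invite"] and leg["final"] is None:
            key = "cancel_not_confirmed" if leg["cancel"] else "never_cancelled"
            report[key].append(call_id)
    return report


def msg(start, call_id, cseq):
    """Build a minimal, hypothetical SIP message for the sample below."""
    return f"{start}\r\nCall-ID: {call_id}\r\nCSeq: {cseq}\r\n\r\n"


INVITE = "INVITE sip:agent@example.invalid SIP/2.0"
CANCEL = "CANCEL sip:agent@example.invalid SIP/2.0"
SAMPLE = [
    msg(INVITE, "leg-a", "1 INVITE"), msg("SIP/2.0 200 OK", "leg-a", "1 INVITE"),
    msg(INVITE, "leg-b", "1 INVITE"), msg(CANCEL, "leg-b", "1 CANCEL"),
    msg("SIP/2.0 200 OK", "leg-b", "1 CANCEL"),
    msg("SIP/2.0 487 Request Terminated", "leg-b", "1 INVITE"),
    msg(INVITE, "leg-c", "1 INVITE"), msg(CANCEL, "leg-c", "1 CANCEL"),
    msg(INVITE, "leg-d", "1 INVITE"), msg("SIP/2.0 180 Ringing", "leg-d", "1 INVITE"),
]

if __name__ == "__main__":
    print(classify_legs(SAMPLE))
```

Running it against captures taken on the server side and on the agent side is the useful comparison: a leg that shows a CANCEL on the server but none on the agent’s network points at something in between dropping the packet.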

Are you open to a round robin ring strategy with auto fill set to yes?

For all our customers we use SIP trunks from Vodafone, with the hardware server colocated in a datacenter. All the agents have chan_pjsip extensions and all of them use softphones, which work great. The ring strategy for all the queues is “ringall”.

What exactly should we verify, and how do we check and analyze the network for these random issues? Maybe there are router or switch problems? How can we check which of the customer’s devices is causing trouble?

Thank you for all your responses!

Wireshark an example of the failure scenario.
