PJSIP Trunk stops working

FreePBX 15.0.17.34

I have a PJSIP Trunk that will stop working every month or so. What seems to be happening is that my system will stop sending requests to my provider and my trunk will fail once the registration period ends. (Configured to 3600 seconds as of right now.)

Here is the easy fix I found :
Connectivity → Trunks → Choose Trunk
Disable Trunk: No → Yes, Submit / Apply Config
Disable Trunk: Yes → No, Submit / Apply Config

After 30 seconds the Trunk is back Online and everything starts working as it should… For another month :')

Any Ideas ?

Are you sure? Not even OPTIONS (qualify) requests?
What does the Asterisk log show around the time of the failure?

AFAIK, Asterisk never gives up retrying a failed registration, unless Permanent Auth Rejection is Yes, but that’s not the default. Of course, it could be sending requests that are unreplied, or getting error responses.

If your PBX is behind a NAT, what router/firewall do you have? What VoIP-related settings in it?

Definitely need more information to help find the real fix

Next time the trunk dies, try to grab a sip capture in the failed state to be sure that your side isn’t still trying to send OPTIONS, and if it is, hopefully you can see what your provider is complaining about.

Also, grab your logs from /var/log/asterisk/full as soon as you realize the trunk is down and there may be something in there that will help troubleshoot.

Last time this happened, I believe there were OPTIONS requests, but nothing else.

I will get asterisk logs on the next failure.

I have the trunk’s “Max Retries” set to 1000000 and “Permanent Auth Rejection” set to No, so this should be OK.

It is behind a Watchguard T25 router, but to my knowledge it has no VoIP-specific options. I know it doesn’t have SIP-ALG enabled, but that’s about it.

By SIP capture, would logging in to asterisk with “asterisk -rvvv” and setting sip debug on be good enough or is there a better way ?

Will do.

Yes, “pjsip set logger on” in your Asterisk CLI will add the SIP traffic to your logs, but tcpdump or any other packet capture would do the trick as well. Alternatively, you could use the sngrep tool on the FreePBX console. My favorite trick is to pipe tcpdump output straight into Wireshark on my Windows machine via ssh in a command prompt:

ssh root@freepbxhostnameorIPhere "tcpdump port not 22 -s 0 -U -n -w -" | "C:\Program Files\Wireshark\Wireshark.exe" -k -i -

(If you don’t have ssh keys, don’t forget to type the password in your command prompt window after Wireshark launches.)

With the Wireshark method, I don’t need to activate any SIP logging beforehand?

I am pretty sure I don’t, so I will test this to see what happens and where I need to enter the password.

Also, a bonus question: if I enable SIP logging, does it also write the SIP messages I see in the CLI to /var/log/asterisk/full?

Correct, you’ll have the info in Wireshark, so you shouldn’t need to fill your logs with sip traffic

It will.

When did this last happen? By default, the Asterisk logs are kept for a week and even without the SIP trace, there is useful info there, e.g., no response, auth failed, etc.

This happened today. I will get the log and post a link. I’ve never posted logs, but I have seen people use pastebin. Is that the way to go?

I think I’ve got something interesting for you guys.

Here is what happens when it fails : Asterisk Log

WARNING[14022] res_pjsip_outbound_registration.c: No response received from 'sip:PROVIDER-SERVERNAME:5060' on registration attempt to 'sip:username@PROVIDER-SERVERNAME:5060', retrying in '60'

This repeats forever until I deactivate and reactivate my trunk in the GUI.
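To see when the retries started and how many have piled up, the warning above can be pulled out of the log with grep. A rough sketch, assuming the default log location mentioned earlier in the thread; the function name registration_timeouts is made up for illustration.

```shell
# Hypothetical helper: list registration timeouts from an Asterisk log.
# Usage: registration_timeouts [log-file]
registration_timeouts() {
  local log="${1:-/var/log/asterisk/full}"
  # -F treats the pattern as literal text rather than a regular expression
  grep -F 'res_pjsip_outbound_registration.c: No response received' "$log"
}
```

For example, "registration_timeouts | head -n 1" shows the first timeout, which should roughly mark when the trunk went down, and "registration_timeouts | wc -l" counts the retries since then.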

This is often caused by a ‘poisoned’ NAT association being kept alive by aggressive retries.
The initial cause is typically a brief internet outage.

Try setting General Retry Interval to 600. Also set Qualify Frequency to 600, as the OPTIONS packets are sometimes also involved.
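After Submit / Apply Config, it may be worth confirming that the new values reached the generated Asterisk config. The sketch below assumes FreePBX writes the trunk settings to /etc/asterisk/pjsip.registration.conf and /etc/asterisk/pjsip.aor.conf, and that the GUI fields map to the pjsip.conf options retry_interval, qualify_frequency, and max_retries; verify both assumptions on your version. The function name show_retry_settings is hypothetical.

```shell
# Hypothetical helper: print retry/qualify settings from generated PJSIP config.
# Usage: show_retry_settings [file ...]
show_retry_settings() {
  # Assumed default paths for the FreePBX-generated files; pass files to override.
  if [ "$#" -eq 0 ]; then
    set -- /etc/asterisk/pjsip.registration.conf /etc/asterisk/pjsip.aor.conf
  fi
  # -H prefixes each match with its file name, -E enables the alternation
  grep -HE '^(retry_interval|qualify_frequency|max_retries)[[:space:]]*=' "$@"
}
```

If the values shown are still the old ones, the change was probably submitted without Apply Config.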

You may have to wait a month to see whether this helps, but (unless you’re a 24/7 shop) you could try an after-hours test (before making the changes) to see whether you can cause the issue at will. Disconnect the Ethernet cable between modem/ONT and the Watchguard, wait about three minutes, then reconnect it. This will often trigger the issue you are seeing.

I have changed these settings and will be testing this. Thank you !

If the post gets archived before I can get my answer, can I still mark this as Solution after ?

I think I might have run into this issue myself. Sometimes after an Internet outage, the trunk will not re-register with the SIP provider. A reboot fixes it right away. However, sometimes I don’t get around to rebooting, and I have noticed that about 30 minutes later it will be working again. This has happened a few times: I don’t get to it right away, and it starts working by itself at some later point.

Does this 30 minute delay also fit in with the NAT issue that you mentioned? My re-registration period is set to around 5 minutes, so I know it isn’t the registration timeout causing it to start working again.
