I have a PJSIP Trunk that will stop working every month or so. What seems to be happening is that my system will stop sending requests to my provider and my trunk will fail once the registration period ends. (Configured to 3600 seconds as of right now.)
Here is the easy fix I found:
Connectivity → Trunks → Choose Trunk
Disable Trunk: No → Yes, Submit / Apply Config
Disable Trunk: Yes → No (re-enables the trunk), Submit / Apply Config
After 30 seconds the Trunk is back Online and everything starts working as it should… For another month :')
Are you sure? Not even OPTIONS (qualify) requests?
What does the Asterisk log show around the time of the failure?
AFAIK, Asterisk never gives up retrying a failed registration unless Permanent Auth Rejection is set to Yes, and that’s not the default. Of course, it could be sending requests that go unanswered, or getting error responses.
If your PBX is behind a NAT, what router/firewall do you have? What VoIP-related settings in it?
We definitely need more information to help find the real fix.
Next time the trunk dies, try to grab a sip capture in the failed state to be sure that your side isn’t still trying to send OPTIONS, and if it is, hopefully you can see what your provider is complaining about.
Also, grab your logs from /var/log/asterisk/full as soon as you realize the trunk is down and there may be something in there that will help troubleshoot.
Yes, running pjsip set logger on in your Asterisk CLI will add the SIP traffic to your logs, but a packet capture with tcpdump would do the trick as well. Alternatively, you could use the sngrep tool on the FreePBX console. My favorite trick is to pipe tcpdump output straight into Wireshark on my Windows machine via ssh in a command prompt:
ssh root@freepbxhostnameorIPhere "tcpdump port not 22 -s 0 -U -n -w -" | "C:\Program Files\Wireshark\Wireshark.exe" -k -i -
(If you don’t have SSH keys set up, don’t forget to type the password into the command prompt window after Wireshark launches.)
When did this last happen? By default, the Asterisk logs are kept for a week and even without the SIP trace, there is useful info there, e.g., no response, auth failed, etc.
WARNING[14022] res_pjsip_outbound_registration.c: No response received from ‘sip:PROVIDER-SERVERNAME:5060’ on registration attempt to ‘sip:username@PROVIDER-SERVERNAME:5060’, retrying in ‘60’
This repeats forever until I deactivate and reactivate my trunk in the GUI.
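To see how long the retries have been going on, a short Python sketch along these lines could pull the warnings out of the log. It relies only on the message text quoted above. The name find_no_response is made up for this example, and the pattern accepts both straight quotes (as found in a real log file) and the curly quotes shown in this thread.

```python
import re

# Quote characters: straight (real log files) or curly (forum paste).
QUOTE = "['\u2018\u2019]"
NOT_QUOTE = "[^'\u2018\u2019]+"

# Built only from the warning text quoted in this thread.
PATTERN = re.compile(
    r"res_pjsip_outbound_registration\.c: No response received from "
    + QUOTE + "(?P<server>" + NOT_QUOTE + ")" + QUOTE
    + ".*retrying in " + QUOTE + r"(?P<retry>\d+)" + QUOTE
)


def find_no_response(lines):
    """Return (server, retry_seconds) for every unanswered-registration warning."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append((match.group("server"), int(match.group("retry"))))
    return hits


# Sample input: the warning from this thread plus one unrelated line.
sample = [
    "WARNING[14022] res_pjsip_outbound_registration.c: No response received "
    "from 'sip:PROVIDER-SERVERNAME:5060' on registration attempt to "
    "'sip:username@PROVIDER-SERVERNAME:5060', retrying in '60'",
    "NOTICE[14022] some_other_module.c: unrelated message",
]
print(find_no_response(sample))  # [('sip:PROVIDER-SERVERNAME:5060', 60)]
```

To run it against the real log, pass the open file instead of the sample list, for example find_no_response(open("/var/log/asterisk/full", errors="replace")).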
This is often caused by a ‘poisoned’ NAT association being kept alive by aggressive retries.
The initial cause is typically a brief internet outage.
Try setting General Retry Interval to 600. Also set Qualify Frequency to 600, as the OPTIONS packets are sometimes also involved.
You may have to wait a month to see whether this helps, but (unless you’re a 24/7 shop) you could try an after-hours test (before making the changes) to see whether you can cause the issue at will. Disconnect the Ethernet cable between modem/ONT and the Watchguard, wait about three minutes, then reconnect it. This will often trigger the issue you are seeing.
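The reasoning behind the 600-second suggestion can be illustrated with a deliberately simplified Python model: a stale NAT association can only time out if outbound traffic pauses for longer than the router's UDP idle timeout. The 180-second timeout is an assumed value (routers differ), the 60-second retry comes from the log line above, and the 60-second qualify interval is a guess at the current setting. The model ignores inbound packets and any other traffic sharing the same association.

```python
def longest_idle_gap(intervals, horizon=3600):
    """Longest gap (seconds) between outbound packets within `horizon`,
    when each periodic sender fires every `interval` seconds from t=0."""
    times = sorted({t for step in intervals for t in range(0, horizon + 1, step)})
    return max(later - earlier for earlier, later in zip(times, times[1:]))


def mapping_can_expire(intervals, nat_timeout):
    """True if traffic ever pauses longer than the NAT idle timeout."""
    return longest_idle_gap(intervals) > nat_timeout


NAT_TIMEOUT = 180  # assumed UDP idle timeout in seconds; check your router

# Each list is [registration retry interval, qualify (OPTIONS) interval].
print(mapping_can_expire([60, 60], NAT_TIMEOUT))    # False: retries keep it alive
print(mapping_can_expire([600, 60], NAT_TIMEOUT))   # False: OPTIONS still keep it alive
print(mapping_can_expire([600, 600], NAT_TIMEOUT))  # True: the association can expire
```

The middle case is why both settings are changed together: raising only the retry interval leaves the OPTIONS packets refreshing the association.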
I think I might have run into this issue myself. Sometimes after an Internet outage, the trunk will not re-register with the SIP provider. A reboot fixes it right away. However, sometimes I don’t get around to rebooting, and I have noticed that about 30 minutes later it will be working again. This has happened a few times: I don’t get to it right away, and it starts working again by itself at some later point.
Does this 30 minute delay also fit in with the NAT issue that you mentioned? My re-registration period is set to around 5 minutes, so I know it isn’t the registration timeout causing it to start working again.