SIP trunks not re-registering after failing

pbxact
(Brad Facer) #1

I have a few of these installed, using SIP trunks with pjsip in the configuration. It varies from site to site, but one recurring issue is that the trunks become unregistered and do not come back on their own. Some sites have never had this issue, at one site it has happened once, and at another it happens almost weekly.

Usually this seems to be caused by a network interruption that takes the trunk down, or by a power issue causing a brief outage.

When this happens, I go in manually and restart the trunks (change an authentication name, then change it back) and they're back to normal.

There has to be a way to automate this so that when a trunk drops, the system knows to try restoring it. I see an option for “monitor trunk failures”, but that asks for a custom script. I upped the “max retries” from 10 to 1000, in case 10 retries were running out before the network came back.


(Dave Burgess) #2

“fwconsole reload” should get you back online pretty quickly, if history is any indication. We’ve been hearing about this problem, and if you look back through the archives, you should find a few places where we’ve talked about it at length.
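For anyone wanting to automate that recovery, here is a minimal Python sketch of a watchdog that could be run from cron. It asks Asterisk for the outbound registration list and runs “fwconsole reload” only if a trunk looks down. The column layout and the status words (Registered, Unregistered, Rejected, Stopped) are assumptions about the output of `pjsip show registrations`, and the function names are made up for illustration, so check both against your own system before relying on it.

```python
import subprocess

# Status words assumed to appear in `pjsip show registrations` output
# for a registration that is not currently up.
DOWN_STATES = {"Unregistered", "Rejected", "Stopped"}


def unregistered_trunks(cli_output):
    """Return the names of registrations whose line carries a down state."""
    down = []
    for line in cli_output.splitlines():
        tokens = line.split()
        # Skip blank lines, the column header, and the ===== ruler.
        if not tokens or tokens[0].startswith(("<", "=")):
            continue
        if DOWN_STATES.intersection(tokens):
            # First column is assumed to be "<registration>/<server URI>".
            down.append(tokens[0].split("/", 1)[0])
    return down


def check_and_reload():
    """Query Asterisk once and run `fwconsole reload` if any trunk is down."""
    result = subprocess.run(
        ["asterisk", "-rx", "pjsip show registrations"],
        capture_output=True, text=True, check=True,
    )
    down = unregistered_trunks(result.stdout)
    if down:
        subprocess.run(["fwconsole", "reload"], check=True)
    return down
```

Calling `check_and_reload()` every few minutes would replace the manual step. The parsing is kept separate from the reload so it can be tested against saved output without touching a live PBX.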


(Jared Busch) #3

The “monitor trunk failures” option is not for checking whether a trunk is up. Its script is fired when an outbound call fails on the trunk.


(Jared Busch) #4

Set maxretries to 0 to never stop trying.


#5

If you have a static (or de facto static) IP address and your provider supports it, use IP authentication instead of registration.

Strange that your max retries was set to 10; I believe the default is 10000.

The usual cause of this issue is a ‘poisoned’ NAT association kept alive by aggressive retries. Router/firewall make/model? In general, you want to disable any SIP ALG or source port rewriting and enable consistent NAT. The UDP ‘unreplied’ timeout should be less than half the retry interval but the ‘assured’ timeout should be at least twice as great. If you can’t control these, try setting all three Retry Intervals and Qualify Frequency to 600. That should allow sufficient time for the router to time out any bad entries.
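The timing rule above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it reads “twice as great” as twice the retry interval, all values are in seconds, and the function name and the numbers in the usage note are hypothetical.

```python
def check_nat_timers(retry_interval, unreplied_timeout, assured_timeout):
    """Return a list of rule violations; an empty list means the timers fit.

    Rule 1: the UDP 'unreplied' timeout is less than half the retry interval.
    Rule 2: the UDP 'assured' timeout is at least twice the retry interval.
    """
    problems = []
    if not unreplied_timeout < retry_interval / 2:
        problems.append(
            f"unreplied timeout {unreplied_timeout}s is not below "
            f"half the retry interval ({retry_interval / 2:g}s)"
        )
    if not assured_timeout >= 2 * retry_interval:
        problems.append(
            f"assured timeout {assured_timeout}s is below "
            f"twice the retry interval ({2 * retry_interval}s)"
        )
    return problems
```

For example, `check_nat_timers(60, 30, 180)` reports one problem, because an unreplied timeout of 30 is not strictly less than half of 60, while `check_nat_timers(600, 30, 1200)` returns an empty list.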

If you still have trouble, look at whether the log shows no reply, or an error response, to the failed register attempts.