Trunk unregistering

VoipMS is the provider, set up as a PJSIP trunk on PBXACT UC 25 behind NAT.
Tried setting registration expiration to 60, 100, 120, 300, 3600… (both on VoipMS provider and PBX side), all end up with the same outcome:

PBX unregisters after some time; this can be 30mins, 1 hr, 24hrs… it will eventually happen. VoipMS already did traces and checked logs, this is what they replied finally:
“I’ve ran several traces, your device is not sending any request to registration. There is no setting on our side to may cause your device to unregister, I’d suggest to check device and network configuration.”

PBX is using port 5080 to communicate with VoipMS server. Firewall is not blocking any traffic.
My question is - why is FreePBX failing to send re-registration attempts and how to fix it?

Registration logs attached.
registrationlogs.tgz (1.4 KB)

Below are the trunk’s relevant PJSIP settings from FreePBX:

Username: SIPACCOUNTNUMBERFROMPROVIDER
Auth username: SIPACCOUNTNUMBERFROMPROVIDER
Secret: SIPACCOUNTPASSWORDFROMPROVIDER
Authentication: Outbound
Registration: Receive
SIP Server: toronto3.voip.ms
SIP Server Port: 5080
Context: from-pstn
Transport: 0.0.0.0-udp
Send Line in Registration: Yes
Send Connected Line: No
Permanent Auth Rejection: No
Forbidden Retry Interval: 20sec
Fatal Retry Interval: 20sec
General Retry Interval: 20sec
Expiration: 100 (also tried 60, 120, 300, 3600…)
Max Retries: 5
Qualify Frequency: 5
User = Phone: No
From Domain: WAN IP of PBX
Match Inbound Authentication: Default
Message Context: (this field is blank)
Codecs: ulaw, g729

Asterisk General SIP Settings:

External Address: WANIPOFPBX
RTP Port Ranges: 10001-20000
RTP Checksums: Yes
Strict RTP: Yes
RTP Timeout: 30
RTP Hold Timeout: 300
RTP Keep Alive: 1

Asterisk SIP Settings chan_pjsip:

Allow Transports Reload: No
Keep Alive Interval: 45

Advanced Settings:
SIP nat: yes

Settings on VoipMS (provider) side:

NAT: yes
Max Expiry: 100 (also tried 60, 120, 300, 3600…)
Encrypted SIP Traffic: No
RTP Time Out: 30
RTP Hold Time Out: 300

No issues at all with calls dropping or audio loss; the calls (both incoming+outgoing) are perfect when the PBX is registered. Issue is that it unregisters every day several times!

The log you attached (showing unreachable / reachable) is almost certainly related to slow responses to OPTIONS (qualify) sent by Asterisk, unrelated to registration. This is common because providers de-prioritize OPTIONS so they can respond more quickly to real requests.

Try adding the following to /etc/asterisk/pjsip_custom_post.conf:

[voip.ms](+type=aor)
qualify_timeout=16.0

then restart Asterisk.

Your config shows Registration: Receive, but it should be Send. I assume that’s a typo, because incoming calls would not work at all when set to Receive (unless they come in on a different trunk or by SIP URI).

Loss of registration is typically caused by a ‘poisoned’ NAT association in your router/firewall (caused by a brief loss of connectivity) being kept alive by aggressive retries. Assuming that you actually are losing registration (shows not registered on VoIP.ms side) as opposed to the trunk merely showing unreachable, the settings below should permit recovery:

Forbidden Retry Interval: 600
Fatal Retry Interval: 600
General Retry Interval: 600
Expiration: 120
Qualify Frequency: 600

This won’t help the underlying issue, but by waiting 10 minutes to retry, the bad NAT association should time out and the next registration attempt should succeed. Of course, you will be unable to receive calls for up to 10 minutes when this happens. If it’s more than a few times per year, you should try to find what is going wrong. Router/firewall make/model? VoIP-related settings? ISP?
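The reasoning behind the longer retry interval can be sketched as a toy model in Python. It assumes the router forgets a UDP NAT association after a fixed idle period and that every retry from the PBX refreshes it. The 180-second idle timeout is purely illustrative; it is not a value from this thread or from any particular router.

```python
def stale_mapping_expires(retry_interval_s: float, nat_idle_timeout_s: float) -> bool:
    """Toy model: True if the gap between retries is long enough for the
    router to drop a bad NAT association before the next attempt."""
    return retry_interval_s > nat_idle_timeout_s


# Hypothetical idle timeout; real routers differ and often make it configurable.
ASSUMED_NAT_IDLE_TIMEOUT_S = 180

for interval in (20, 600):
    if stale_mapping_expires(interval, ASSUMED_NAT_IDLE_TIMEOUT_S):
        outcome = "can time out"
    else:
        outcome = "is kept alive"
    print(f"retry every {interval}s: stale association {outcome}")
```

Under that assumption, a 20-second retry keeps refreshing the bad association, while a 600-second retry leaves it idle long enough to expire. Any other traffic to the same server and port would also refresh it, which is presumably why Qualify Frequency is raised to 600 as well.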

Thanks, Stewart.

I’ve added what you mentioned to pjsip_custom_post.conf

Yes I had a typo in my config above, thanks for noticing; the registration is indeed set as SEND…

Now I will monitor after these changes were applied and if it starts unregistering again I will follow up.

If anyone has any more tips/things to tweak in these settings that is also appreciated.
Thanks

Can you also explain how you determine it has unregistered and what is happening as a result? Does it return to a working state by itself? Do you have to restart things?

I find out that it is unregistered by:

  1. when you try to place outbound call it says ‘all circuits are busy, please try your call again later’, and inbound calls don’t work/go to the failover dest. set on the provider’s side (because PBX is unreachable)
  2. I visit the provider voip.ms portal and see that the PBX is unregistered (their portal displays trunk registration status including IP/port/next registration time etc.) I then visit PBX and check logs/run trunk registration status commands in Asterisk to confirm that it is indeed unregistered.

To force trunk on the PBX to re-register (to be able to make calls again), I go into the trunk’s settings (click Edit), don’t make any changes and click Submit, then Apply Changes and it re-registers
I haven’t seen it return to working state by itself yet. Since the phone system needs to work, I didn’t have time to let it sit and wait for the outcome (people need to use the phone system all the time at this location)

Ok update - the trunk is still unregistering. See attached registration logs for today.
registration logs second.tgz (985 Bytes)

Please advise what to do

This is all that you provided:

[2022-05-31 09:51:12] VERBOSE[1204] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-05-31 14:21:12] VERBOSE[6288] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-05-31 15:31:12] VERBOSE[1204] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-05-31 16:41:12] VERBOSE[12163] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-05-31 17:11:12] VERBOSE[12163] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable

You need to provide a lot more logging. You definitely need to find log lines saying why it is unregistered, and you probably need to capture the SIP protocol for the failed re-registration.

Search the log for ‘regist’ (without the quotes) and post what you find.
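If you would rather script the search, a minimal Python sketch follows. The function name and the sample lines are only for illustration (the two sample lines are copied from logs posted in this thread); the match is case-insensitive so it catches "Registered", "Unregistered", "registration" and so on.

```python
import re
from typing import Iterable, List


def find_registration_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that contains 'regist', ignoring case."""
    pattern = re.compile(r"regist", re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Sample input taken from log lines posted in this thread.
sample_log = [
    "[2022-05-31 09:51:12] VERBOSE[1204] res_pjsip/pjsip_configuration.c: "
    "Endpoint voipms is now Unreachable",
    "[2022-05-31 17:16:10] ERROR[1204] res_pjsip.c: Endpoint 'voipms': Could not "
    "create dialog to invalid URI 'voipms'. Is endpoint registered and reachable?",
]

for hit in find_registration_lines(sample_log):
    print(hit)
```

To run it against a real log, pass an open file object instead of the sample list, for example `find_registration_lines(open("/var/log/asterisk/full", errors="replace"))`. That path is an assumption about where your system writes the full Asterisk log.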

Also, at the Asterisk command prompt, type
pjsip show aor voip.ms
to confirm that qualify_timeout actually changed to 16.

However, the only place that AST_ENDPOINT_ONLINE gets unset appears to be in a file that handles OPTIONS (res/res_pjsip/pjsip_options.c), and AST_ENDPOINT_ONLINE is what controls the production of this message. It looks to me as though this log may be refuting the theory that the trunk is unregistering.

FYI I changed the trunk name to voipms (from voip.ms) to make it easier to distinguish between the domain and the trunk’s name in the PBX logs etc.

Ok I checked the logs again and it looks like there is a pattern…

After yesterday’s changes, the trunk de-registers on times ending in 1 min 12secs and then EXACTLY 10 minutes later, it re-registers:

i.e. de-registered at 09:51:12, then re-registered at 10:01:11.
de-registered at 14:21:12, then re-registered at 14:31:10
de-registered at 15:31:12, then re-registered at 15:41:12
de-registered at 16:41:12, then re-registered at 16:51:12
de-registered at 17:11:12, then re-registered at 17:21:21

Now this definitely looks more like a PBX-config related problem.
Yesterday I was advised to change the retry intervals to 600 secs (10 mins), and the trunk is automatically re-registering exactly 10 mins later.

@Stewart1
Forbidden Retry Interval: 600
Fatal Retry Interval: 600
General Retry Interval: 600
Expiration: 120
Qualify Frequency: 600

In response to @Stewart1’s request to type “pjsip show aor voipms” at the Asterisk command prompt, I got the following:

Aor: <Aor…>
Contact: <Aor/ContactUri…> <Hash…> <RTT(ms)…>

Aor: voipms 0
Contact: voipms/sip:[email protected]:5080 3a7d78a806 NonQual nan

ParameterName : ParameterValue

authenticate_qualify : false
contact : sip:[email protected]:5080
default_expiration : 3600
mailboxes :
max_contacts : 0
maximum_expiration : 7200
minimum_expiration : 60
outbound_proxy :
qualify_frequency : 600
qualify_timeout : 16.000000
remove_existing : false
remove_unavailable : false
support_path : false
voicemail_extension :
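For anyone who wants to check this value from a script instead of reading the table by eye, here is a rough Python sketch that parses the "ParameterName : ParameterValue" rows of output like the above. The function name is made up for illustration, and the sample text is a shortened copy of the output in this post.

```python
def parse_parameters(cli_output: str) -> dict:
    """Parse 'name : value' rows that follow the ParameterName header."""
    params = {}
    in_table = False
    for line in cli_output.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if not sep or not name:
            continue  # blank line or no separator
        if name == "ParameterName":
            in_table = True  # rows after this header are parameters
            continue
        if in_table:
            params[name] = value
    return params


# Shortened copy of the output shown above.
sample_output = """
Aor: voipms 0

ParameterName : ParameterValue

authenticate_qualify : false
mailboxes :
qualify_frequency : 600
qualify_timeout : 16.000000
"""

params = parse_parameters(sample_output)
print("qualify_timeout =", float(params["qualify_timeout"]))
print("qualify_frequency =", int(params["qualify_frequency"]))
```

Lines before the header (such as the "Aor:" summary row) are skipped, so only real parameters end up in the dictionary.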

in the asterisk logs it only specifies Unregistered then it specifies registered 10 mins later like below:

[2022-05-31 09:51:12] VERBOSE[1204] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-05-31 09:51:12] VERBOSE[1204] res_pjsip/pjsip_options.c: Contact voipms/sip:[email protected]:5080 is now Unreachable. RTT: 0.000 msec
[2022-05-31 10:01:09] VERBOSE[1204] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable
[2022-05-31 10:01:09] VERBOSE[1204] res_pjsip/pjsip_options.c: Contact voipms/sip:[email protected]:5080 is now Reachable. RTT: 26.343 msec

It says absolutely nothing about being unregistered or registered!

Exactly, unreachable is not unregistered.

ok my bad it was a typo
I mixed up the two terms because voipMS showed it as unregistered but asterisk logs on PBX show voipMS as unreachable

the logs show unreachable then reachable exactly 600secs later
kindly advise

Although the qualify and registration issues are likely caused by the same problem, if you are actually losing registration, the Asterisk log will show it. Search for ‘regist’ (without the quotes).

The only registration log I could find for the trunk is:
[2022-05-31 17:16:10] ERROR[1204] res_pjsip.c: Endpoint 'voipms': Could not create dialog to invalid URI 'voipms'. Is endpoint registered and reachable?

However I noticed the following…

I recalled that originally I set the following in pjsip_custom_post.conf:

[voip](+type=aor)
qualify_timeout=16.0

However, the above was a typo on my part, because there is no ‘voip’ trunk; it is ‘voipms’.
I also recalled that the ‘pjsip show aor voipms’ command originally showed the qualify_timeout as 3. After your post today, I fixed the entry in pjsip_custom_post.conf to be [voipms](+type=aor) qualify_timeout=16.0, which changed the qualify_timeout to 16.000000 in the ‘pjsip show aor voipms’ output. My bad there.
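A mismatch like this is easy to miss, so here is a small Python sketch that lists override sections whose name does not match any known trunk. It only understands the `[name](+...)` form used in this thread, and both the function name and the set of known trunk names are assumptions you would supply yourself.

```python
import re

# Matches section headers of the form [name](+...) as used in this thread.
OVERRIDE_RE = re.compile(r"^\[([^\]]+)\]\(\+")


def unknown_override_sections(conf_text: str, known_names: set) -> list:
    """Return override section names that are not in known_names."""
    unknown = []
    for line in conf_text.splitlines():
        match = OVERRIDE_RE.match(line.strip())
        if match and match.group(1) not in known_names:
            unknown.append(match.group(1))
    return unknown


mistyped = "[voip](+type=aor)\nqualify_timeout=16.0\n"
corrected = "[voipms](+type=aor)\nqualify_timeout=16.0\n"

print(unknown_override_sections(mistyped, {"voipms"}))
print(unknown_override_sections(corrected, {"voipms"}))
```

The first call reports the mistyped ‘voip’ section and the second reports nothing. Confirming the result with ‘pjsip show aor voipms’ is still the real test.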

So since the qualify_timeout is 16 now for sure (finally), are there any other tips right now? If it unregisters again (which results in no outbound/inbound calls using the trunk) I will chime back here… Thank you all for your support

Being unregistered will stop inbound calls but doesn’t necessarily stop outbound calls. Being unreachable will stop outbound calls, but not inbound calls. However, being unreachable is also stopping the re-registration, if that happens during the outage.

Ok just now I experienced first-hand the trunk losing connection. It displayed as “no registration found” on provider’s status panel.

This time, no matter how many times I went in to edit the trunk, clicked Submit, then Apply Changes, it didn’t re-establish the connection. I had to reboot the entire PBX for it to re-establish the connection.

And it was still becoming unreachable then reachable every few hours for ten minutes:

[2022-06-01 03:26:32] VERBOSE[32628] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-06-01 03:36:28] VERBOSE[27872] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable
[2022-06-01 04:06:32] VERBOSE[32628] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-06-01 04:16:16] VERBOSE[20463] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable
[2022-06-01 06:46:32] VERBOSE[10772] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-06-01 06:56:16] VERBOSE[2762] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable
[2022-06-01 07:46:32] VERBOSE[32628] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable
[2022-06-01 07:56:16] VERBOSE[27872] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable
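To measure how long each outage lasts, the Unreachable/Reachable pairs can be matched up with a short Python sketch like the one below. The function name is illustrative, and the sample input is the first two pairs from the log above.

```python
import re
from datetime import datetime

EVENT_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\].*"
    r"Endpoint (\S+) is now (Unreachable|Reachable)"
)


def outage_durations(lines):
    """Return (went_unreachable_at, seconds_until_reachable) tuples."""
    down_since = {}  # endpoint name -> time it became Unreachable
    results = []
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        endpoint, state = match.group(2), match.group(3)
        if state == "Unreachable":
            down_since[endpoint] = stamp
        elif endpoint in down_since:
            start = down_since.pop(endpoint)
            results.append((start, int((stamp - start).total_seconds())))
    return results


sample_log = [
    "[2022-06-01 03:26:32] VERBOSE[32628] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable",
    "[2022-06-01 03:36:28] VERBOSE[27872] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable",
    "[2022-06-01 04:06:32] VERBOSE[32628] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Unreachable",
    "[2022-06-01 04:16:16] VERBOSE[20463] res_pjsip/pjsip_configuration.c: Endpoint voipms is now Reachable",
]

for start, seconds in outage_durations(sample_log):
    print(f"{start}: unreachable for {seconds} s")
```

For these two pairs the gaps come out at 596 and 584 seconds, just under the 600-second qualify_frequency. That is roughly what you would expect if reachability is only re-tested once per qualify interval, so the real network interruption may be much shorter than ten minutes.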

This time the qualify timeout for the trunk was definitely 16 secs.

Please advise how I can stop the trunk from becoming unreachable.

Fix the network problem. That’s the only really good solution.

You can suppress some of the symptoms by disabling qualify, but if you try to make a call when it really is unreachable, you may have to wait 30 seconds before the call fails. Note that you might be relying on qualify to keep temporary NAT and firewall rules live.

You can reduce the qualify interval so it detects the recovery of the network faster.

You can suppress the registration failures by using IP authentication, with voip.ms, which will mean you don’t need to register.

See my other thread on this issue being solved with the solution: