FreePBX Losing SIP/PJSIP Registrations over time

Hello!

I have an issue that has been bugging me for months, which I haven’t found a solution for. It started happening maybe a year ago (or after the last major upgrade from v15 to v16, but I can’t confirm this is exactly the case).

FPBX loses registration status for both PJSIP and SIP trunks over time, seemingly at random. Once all trunks are successfully registered (I have 5 PJSIP and 5 SIP), every 2 or 3 days or so, one or more of the PJSIP trunks start showing the status “Rejected”, and one or more of the SIP ones show the status “Request Sent”.

The ONLY way I’ve found that would make them all re-register successfully is:

  • Change my DHCP settings to force FPBX to get a new internal IP address. Once this happens, 9 out of 10 times, all trunks register correctly, and the cycle starts again (they degrade over time and I have to switch back to the previous IP, allowing them to re-register again).

FPBX is running as a Hyper-V VM that I back up twice a week (Hyper-V Export). To automate this process and not spend 30 minutes on it, I created a script that switches between 2 internal IPs so FPBX can register the trunks correctly. (Of course, changing this IP requires me to reboot all IP phones and devices connected to the FPBX so they pick up the new IP address. This usually happens at night.)

Now the question I have for this group is: where do I start troubleshooting to understand why registrations are being lost? Maybe more importantly, why does FPBX work fine again right after I switch to a new internal IP address? It seems like something inside FPBX is cleared or flushed with the new IP, causing routes to work (something that doesn’t happen with a simple reboot).

I’ve tried pinging the trunks from FPBX and they all respond fine (even when the registrations fail).

Thoughts?

Thank you!

The typical cause: a NAT association in your router/firewall somehow becomes ‘poisoned’ and aggressive retries from the PBX keep it alive. I suspect that if, instead of changing the local IP, you simply shut down the PBX for 20 minutes (allowing all associations to time out), it would work fine when you start it up again.

Router/firewall make/model? Confirm that any SIP ALG or similar is disabled. Confirm that firmware is up to date. Post any VoIP-related settings such as port forwarding. Does it have a public IP address on its WAN interface? If not, please explain (ISP does NAT, ISP gateway can’t be placed in bridge mode, etc.)

Is the VM using bridged networking (PBX local address is in the same subnet as the Hyper-V host)? If not, explain why.

Why do you have multiple trunks with the same provider (they don’t support multiple numbers on the same trunk, pricing advantage, etc.)?

Why are you using registration at all (you don’t have a static IP address, provider doesn’t support IP authentication, etc.)?

At the Asterisk command prompt, type
pjsip set logger on
sip set debug on
which will cause SIP requests and responses to appear in the Asterisk log (along with the regular entries). Then, when the trouble occurs, you can see why the response was classified as Rejected.
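
If you would rather switch the debug output on from a script than type the commands at the prompt, something like the following could work. It is only a sketch: it assumes the `asterisk` binary is on the PATH, that the script runs as a user allowed to connect to the running instance, and that chan_sip is loaded (otherwise the second command will not be recognized). The function names are made up for illustration.

```python
import subprocess

# The two CLI commands quoted in the post above.
DEBUG_COMMANDS = ["pjsip set logger on", "sip set debug on"]


def build_cli_call(command, asterisk_bin="asterisk"):
    """Return the argv list that runs one Asterisk CLI command non-interactively.

    `asterisk -rx "<command>"` connects to the running instance, executes the
    command once and exits.
    """
    return [asterisk_bin, "-rx", command]


def enable_sip_debug(run=subprocess.run):
    """Run every debug command in turn.

    `run` is injectable so the function can be dry-run without Asterisk.
    """
    results = []
    for command in DEBUG_COMMANDS:
        argv = build_cli_call(command)
        results.append(run(argv, capture_output=True, text=True))
    return results
```

Calling `enable_sip_debug()` on the PBX should have the same effect as typing both commands; the SIP traffic then shows up in the Asterisk log as described.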

Thanks Stewart! Here are some answers to your questions:

  • I’ll try shutting down the PBX for 20 minutes to see if that makes a difference.

  • My firewall is “Sophos XG Firewall v18”. I’ll need to check more about the “SIP ALG”. Firmware is up-to-date. It has a public IP address on its WAN interface.

  • The VM is using bridged networking (PBX IP on the same subnet as VM host).

  • The reason I have multiple trunks with the same provider is that this is how I was able to put off dealing with this issue a bit longer. I know it’s not ideal, but now I have it so that if provider B using PJSIP fails, I can still make calls through the SIP trunk. Once I am able to fix this problem, I’ll only use one trunk per provider.

  • I am using registration because my WAN IP is “sticky” but not static. I can have it for months, but it is not guaranteed to stay the same. VoIP providers don’t usually take DNS names (they use IPs instead). If I can make them take DNS names, I can solve this issue (since I have dyndns set up).

After enabling logging and debugging, this is how the only PJSIP trunk currently showing as rejected appears in the logs:

<— Transmitting SIP request (472 bytes) to UDP:208.100.60.29:5060 —>
OPTIONS sip:[ACCOUNT_NAME]@dallas1.voip.ms:5060 SIP/2.0
Via: SIP/2.0/UDP [MY-WAN-IP]:5060;rport;branch=z9hG4bKPj88c6c13c-1907-44e5-9ccc-50f0963aa98c
From: <sip: [TRUNK-NAME]@[LOCAL-PBX-IP]>;tag=fa0251c4-3674-44f5-8bce-6704c17349a0
To: <sip: [ACCOUNT_NAME]@dallas1.voip.ms>
Contact: <sip: [TRUNK-NAME]@[MY-WAN-IP]:5060>
Call-ID: f5d3e957-deba-4dba-bc0a-f8fe8d6e33ba
CSeq: 28245 OPTIONS
Max-Forwards: 70
User-Agent: FPBX-15.0.17.55(16.20.0)
Content-Length: 0

<— Received SIP response (570 bytes) from UDP:208.100.60.29:5060 —>
SIP/2.0 200 OK
Via: SIP/2.0/UDP [MY-WAN-IP]:5060;branch=z9hG4bKPj88c6c13c-1907-44e5-9ccc-50f0963aa98c;received=[MY-WAN-IP];rport=1024
From: <sip: [TRUNK-NAME]@[LOCAL-PBX-IP]:5060>;tag=fa0251c4-3674-44f5-8bce-6704c17349a0
To: <sip: [ACCOUNT_NAME]@dallas1.voip.ms>;tag=as27835d10
Call-ID: f5d3e957-deba-4dba-bc0a-f8fe8d6e33ba
CSeq: 28245 OPTIONS
Server: voip.ms
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO, PUBLISH, MESSAGE
Supported: replaces, timer
Contact: <sip: 208.100.60.29:5060>
Accept: application/sdp
Content-Length: 0

But you have 10 trunks – does that mean you have 5 providers? There should be no reason to have (for example) multiple pjsip trunks connecting to the same VoIP.ms server; several DIDs can route to the same subaccount. If you are using multiple VoIP.ms servers, then (from Asterisk’s point of view) they are essentially separate providers and should not cause trouble.

You posted an OPTIONS request (caused by qualify) and an OK response; you need to find a REGISTER and the related response(s). However, I did notice

received=[MY-WAN-IP];rport=1024

so the Sophos rewrote the source port from 5060 to 1024. This can be an issue for Contact headers, etc. If there is a firewall setting to avoid this (sometimes called ‘consistent NAT’), try it. Otherwise, forwarding UDP port 5060 to the PBX should prevent that. You could limit the forwarding to source addresses from which VoIP.ms sends calls, or forward everything but make sure that FreePBX Firewall is set to block unwanted access.
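
To spot this kind of port rewrite in a longer log without reading every response by eye, a small check along these lines might help. It is a rough sketch, not a full SIP parser: it compares the port the PBX put in the Via header with the `rport` value the far end reports back, and the helper names are hypothetical. The 203.0.113.5 address in the usage note is a documentation placeholder, not an address from this thread.

```python
def parse_via(via_value):
    """Split a Via header value into (host, port, params).

    Expects the part after "Via: ", for example
    "SIP/2.0/UDP 203.0.113.5:5060;branch=z9hG4bK1;received=203.0.113.5;rport=1024".
    """
    _protocol, _, rest = via_value.strip().partition(" ")
    sent_by, *raw_params = rest.strip().split(";")
    sent_by = sent_by.strip()
    # A bracketed host with no port (IPv6 literal or a redacted placeholder)
    # or a bare host means the default SIP port applies.
    if sent_by.endswith("]") or ":" not in sent_by:
        host, port = sent_by, 5060
    else:
        host, _, port_text = sent_by.rpartition(":")
        port = int(port_text)
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key.strip().lower()] = value.strip()
    return host, port, params


def source_port_rewritten(via_value):
    """True if rport differs from the sent-by port, False if equal,
    None if the response carries no rport value."""
    _host, port, params = parse_via(via_value)
    rport = params.get("rport")
    if not rport:
        return None
    return int(rport) != port
```

For a Via value such as `SIP/2.0/UDP 203.0.113.5:5060;branch=z9hG4bKabc;received=203.0.113.5;rport=1024`, `source_port_rewritten` returns `True`, which matches the 5060 to 1024 rewrite noted above.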

As a result of the recent DDoS attack on VoIP.ms, they have implemented some security measures that may not be properly adjusted yet. Possibly, they are rejecting your valid REGISTER requests because they see ‘excessive’ traffic. How many of these ten trunks are registering to the same server? Expiry? Qualify frequency?

But you have 10 trunks – does that mean you have 5 providers? There should be no reason to have (for example) multiple pjsip trunks connecting to the same VoIP.ms server; several DIDs can route to the same subaccount. If you are using multiple VoIP.ms servers, then (from Asterisk’s point of view) they are essentially separate providers and should not cause trouble.

Yes, there are in fact 10 trunks (5 SIP and 5 PJSIP). Of those, I realized I don’t use 2 (one SIP and one PJSIP) anymore. The other ones are needed since they provide different calling rates for different destinations. Also, I have them as a backup. If one provider is down, the outbound rules will try the next one available.

Out of the 8 trunks, only 2 (voip.ms) can receive inbound calls. It is worth noting that even though I have different trunks with the same provider, they are using different sub_accounts (and hitting different SIP servers in different locations), so there shouldn’t be any conflicts.

How many of these ten trunks are registering to the same server? Expiry? Qualify frequency?

At most, 2 trunks (one PJSIP, one SIP) are registering to the same server using separate sub_accounts. They are set to re-register every 120 seconds (this might not be ideal, but it was part of my troubleshooting to try to keep them registered; if you think I should change to a different value, let me know). The SIP trunk has “qualify=yes”, and the PJSIP trunk has 60 as its Qualify frequency.

Something worth noting is this issue is happening on ALL providers I use (2 more), not just voip.ms (so I don’t believe this has anything to do with Voip.ms hardening their environments due to the recent attacks).

I am leaning more towards your first assessment of the firewall somehow blocking the access (or NAT poisoning), but I want logs on the PBX to show that before I can look at changing settings on the firewall.

Here’s another log. Let me know if there’s anything else you spot here (and thanks for the prompt help!)

Something worth noting: while I was capturing the logs, I opened the failing trunk’s settings, didn’t change anything and saved/applied the changes (there shouldn’t have been any), and the trunk registered successfully again… weird.

Log:

<— SIP read from UDP:208.100.60.29:5060 —>
SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP [LOCAL_DEVICE_IP]:5160;branch=z9hG4bK7353e256;received=[LOCAL_DEVICE_IP];rport=5160
From: <sip: [email protected]>;tag=as1dfeeffb
To: <sip: [email protected]>;tag=as548e41e2
Call-ID: [email protected][::1]
CSeq: 5119 REGISTER
Server: voip.ms
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO, PUBLISH, MESSAGE
Supported: replaces, timer
WWW-Authenticate: Digest algorithm=MD5, realm=“dallasnew1.voip.ms”, nonce=“6f825b71”
Content-Length: 0

<------------->
— (11 headers 0 lines) —

Responding to challenge, registration to domain/host name dallas1.voip.ms
REGISTER 12 headers, 0 lines
Reliably Transmitting (NAT) to 208.100.60.29:5060:
REGISTER sip:dallas1.voip.ms SIP/2.0
Via: SIP/2.0/UDP [WAN-IP]:5160;branch=z9hG4bK3c4fc400;rport
Max-Forwards: 70
From: <sip: [email protected]>;tag=as1dfeeffb
To: <sip: [email protected]>
Call-ID: [email protected][::1]
CSeq: 5120 REGISTER
Supported: replaces, timer
User-Agent: FPBX-15.0.17.55(16.20.0)
Authorization: Digest username=“SUBACCOUNT_3”, realm=“dallasnew1.voip.ms”, algorithm=MD5, uri=“sip:dallas1.voip.ms”, nonce=“6f825b71”, response=“3fffee150bd689e476d01210bd54a5a9”
Expires: 120
Contact: <sip: [email protected][WAN-IP]:5160>
Content-Length: 0


<— SIP read from UDP:208.100.60.29:5060 —>

SIP/2.0 200 OK
Via: SIP/2.0/UDP [LOCAL_DEVICE_IP]:5160;branch=z9hG4bK3c4fc400;received=[LOCAL_DEVICE_IP];rport=5160
From: <sip: [email protected]>;tag=as1dfeeffb
To: <sip: [email protected]>;tag=as548e41e2
Call-ID: [email protected][::1]
CSeq: 5120 REGISTER
Server: voip.ms
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO, PUBLISH, MESSAGE
Supported: replaces, timer
Expires: 120
Contact: <sip: [email protected][LOCAL_DEVICE_IP]:5160>;expires=120
Date: Tue, 19 Oct 2021 17:42:07 GMT
Content-Length: 0

This shows that SIP ALG is enabled on the Sophos, since VoIP.ms could not have possibly received the request from the PBX local IP. See
https://support.sophos.com/support/s/article/KB-000035917?language=en_US
for checking that and turning it off. (If turning it off causes problems with no audio, you will likely need to forward the RTP port range to the PBX.)
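
The same reasoning can be turned into a quick check on an unredacted log: if the `received=` value in a provider's response is a private address, something between the PBX and the provider (most likely an ALG) has been rewriting the SIP headers. The sketch below assumes the PBX's local address is in one of the RFC 1918 ranges, which is not confirmed anywhere in this thread, and the function names are illustrative only.

```python
import ipaddress

# RFC 1918 private ranges; an assumption about what the PBX's LAN uses.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def received_address(via_value):
    """Return the text of the received= parameter of a Via value, or None."""
    for item in via_value.split(";")[1:]:
        key, _, value = item.partition("=")
        if key.strip().lower() == "received":
            return value.strip()
    return None


def alg_suspected(via_value):
    """True if received= is a private address, False if it is not,
    None if it is missing or unparseable (e.g. a redacted placeholder)."""
    text = received_address(via_value)
    if text is None:
        return None
    try:
        address = ipaddress.ip_address(text)
    except ValueError:
        return None
    return any(address in network for network in PRIVATE_NETWORKS)
```

A response whose Via ends in `received=192.168.1.50;rport=5160` would give `True`. The redacted lines posted in this thread give `None`, because `[LOCAL_DEVICE_IP]` is not a parseable address.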

If I understand your trunk setup correctly, I would assume that you have only two registrations, both to VoIP.ms, one from pjsip and one from chan_sip. If things are working correctly, there should be no conflict.

Very few providers require registration for outbound calls. If you are registering to those providers, try turning that off and confirm that you can still call out.

Thanks Stewart. I disabled SIP ALG (it was in fact enabled) and also increased the UDP stream timeout value from 60 to 150 on the firewall. I checked the debug output again, and this time it looked like this:

<— SIP read from UDP:208.100.60.29:5060 —>

SIP/2.0 200 OK
Via: SIP/2.0/UDP [WAN-IP]:5160;branch=z9hG4bK3c4fc400;received=[WAN-IP];rport=5160

If I understood you correctly, this should be better. All trunks are registered after these changes and after rebooting both the firewall and the PBX. I’ll let you know if registration issues keep coming up. I think we are moving in the right direction.
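
As a sanity check on the timer values mentioned in this thread (UDP stream timeout raised from 60 to 150 seconds, PJSIP qualify every 60 seconds, re-register every 120 seconds), the arithmetic can be sketched as below. It assumes the firewall drops an idle UDP mapping once the stream timeout expires and that any packet on the flow resets that timer; how the Sophos actually behaves is not confirmed here, and the function names are hypothetical.

```python
def keepalive_margin(nat_timeout_s, intervals_s):
    """Seconds of slack between the most frequent keepalive and the NAT timeout.

    `intervals_s` lists how often the PBX sends traffic to the provider
    (qualify interval, registration expiry, ...), in seconds.
    """
    return nat_timeout_s - min(intervals_s)


def mapping_stays_open(nat_timeout_s, intervals_s):
    """True only if some packet is sent strictly before the mapping would expire."""
    return keepalive_margin(nat_timeout_s, intervals_s) > 0


# Values taken from the posts above.
intervals = [60, 120]  # PJSIP qualify frequency, re-register interval
old_timeout_ok = mapping_stays_open(60, intervals)    # zero slack with the old setting
new_timeout_ok = mapping_stays_open(150, intervals)   # 90 seconds of slack
```

With the old 60-second timeout the qualify packet arrives right at the expiry boundary, so the mapping could be torn down and recreated on a different port; with 150 seconds there is a comfortable margin.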

Thanks again for the help!

Update: as of today (2 days later), only the voip.ms PJSIP trunks are getting a status of “Rejected”. To fix it, I edit the trunk and, without making any changes, click “Submit” and “Apply Changes”, and it registers again.

The rest seems to be showing “Registered”. I’ll keep monitoring to see if this is related to just voip.ms.

Is there a way to get notified when a trunk’s status changes?
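
One possible approach, offered only as a sketch: poll `pjsip show registrations` from a cron job or loop, remember the previous statuses, and report differences. The parsing below assumes each data row has three whitespace-separated columns (registration/server URI, auth, status) and that the name is the part before the first slash; check this against the output of your own Asterisk version before relying on it. It covers PJSIP trunks only, and how to deliver the notification (email, webhook, etc.) is left out.

```python
import subprocess


def fetch_registrations(asterisk_bin="asterisk"):
    """Return the raw text of `pjsip show registrations` from the running PBX."""
    result = subprocess.run(
        [asterisk_bin, "-rx", "pjsip show registrations"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_registrations(cli_output):
    """Map registration name -> status (column layout is an assumption)."""
    statuses = {}
    for line in cli_output.splitlines():
        fields = line.split()
        # Skip blank lines, the "<...>" column header, "====" rulers
        # and the trailing "Objects found" summary.
        if len(fields) < 3 or fields[0].startswith(("<", "=")) or fields[0] == "Objects":
            continue
        name = fields[0].split("/", 1)[0]
        statuses[name] = fields[2]
    return statuses


def status_changes(previous, current):
    """List (name, old_status, new_status) for every trunk whose status differs."""
    changes = []
    for name in sorted(set(previous) | set(current)):
        old, new = previous.get(name), current.get(name)
        if old != new:
            changes.append((name, old, new))
    return changes
```

A polling script would call `parse_registrations(fetch_registrations())` every minute or so, compare the result with the previous one via `status_changes`, and send a message whenever the returned list is not empty.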
