Issues connecting from external network after migration

Hi All

I attempted to migrate my VM running FreePBX from VMware ESXi to an XCP-ng cluster, but this didn't work as there were boot issues afterwards, so I have gone down the route of building a new server on XCP-ng.

So what I have done is: performed a backup of the old host. Built a new host alongside it and updated the modules on the new server so they are up to date. Unregistered the old server from the GUI, then registered the new one. Performed the restore from the backup. Shut down the old server. Changed the static DHCP lease so the IP address will be received by the new host.

On the new host I can connect with no issues internally on the network; however, external connections that worked on the old server are not working.

I can confirm the connection attempts are being received, because I run tail -f /var/log/fail2ban and I can see the successful connection attempts associated with the external device.

under fail2ban i see
[2024-02-29 05:13:05] SECURITY[2477] res_security_log.c: SecurityEvent=“SuccessfulAuth”,EventTV=“2024-02-29T05:13:05.633+0000”,Severity=“Informational”,Service=“PJSIP”,EventVersion=“1”,AccountID=“101”,SessionID="[email protected]",LocalAddress=“IPV4/UDP/192.168.0.234/5060”,RemoteAddress=“IPV4/UDP/58.87.6.70/1030”,UsingPassword=“redacted”

under full logs i see it doing this
[2024-02-29 05:11:11] VERBOSE[9285] res_pjsip_registrar.c: Added contact ‘sip:[email protected]:1030;user=phone’ to AOR ‘101’ with expiration of 60 seconds
[2024-02-29 05:11:14] VERBOSE[9285] res_pjsip/pjsip_options.c: Contact 101/sip:[email protected]:1030;user=phone is now Unreachable. RTT: 0.000 msec
[2024-02-29 05:14:05] VERBOSE[2415] res_pjsip_registrar.c: Removed contact ‘sip:[email protected]:1030;user=phone’ from AOR ‘101’ due to expiration
[2024-02-29 05:14:05] VERBOSE[13543] res_pjsip/pjsip_options.c: Contact 101/sip:[email protected]:1030;user=phone has been deleted
[2024-02-29 05:14:22] VERBOSE[9285] res_pjsip_registrar.c: Added contact ‘sip:[email protected]:1030;user=phone’ to AOR ‘101’ with expiration of 60 seconds
[2024-02-29 05:14:25] VERBOSE[9285] res_pjsip/pjsip_options.c: Contact 101/sip:[email protected]:1030;user=phone is now Unreachable. RTT: 0.000 msec

What could be going wrong here?

I would check a couple of sections and verify that they are setup correctly for your environment.

1st the firewall settings inside of FreePBX under Connectivity → Firewall → Interfaces and Networks.

2nd Settings → Asterisk SIP settings and make sure the WAN and LAN IP addresses/networks are properly configured and restart asterisk once the settings have been saved and applied if anything needed to change.

If you continue having issues you’ll need to enable SIP logging and provide the output again.

In the Asterisk console type:

pjsip set logger on

And share the output with pastebin.

At present the local firewall is disabled for testing, so it won't be the cause.
The network firewall is unchanged from the working configuration.

I have checked the sip settings and rebooted the server to restart asterisk with no success

Here is paste bin output.

In this logging I use one of my softphones to dial the extension that is affected.
The only thing I notice out of the ordinary is this:

[2024-03-01 02:13:18] WARNING[6144][C-00000007]: ast_expr2.fl:470 ast_yyerror: ast_yyerror(): syntax error: syntax error, unexpected ‘>’, expecting ‘-’ or ‘!’ or ‘(’ or ‘’; Input:
“”=“LIMIT” & 3 & 0 & >0 & 0>=
^
[2024-03-01 02:13:18] WARNING[6144][C-00000007]: ast_expr2.fl:474 ast_yyerror: If you have questions, please refer to Home - Asterisk Documentation
<— Transmitting SIP response (900 bytes) to UDP:192.168.0.1:54446 —>

What is the network topology here?

Are the endpoints on the same subnet as your server? Is the server hosted in the cloud?

Something is not configured correctly here, as the phone is trying to connect to the server at an external IP, and I am also seeing both 192.168.1.x and 192.168.0.x subnets in these connection attempts. I'm not really in a position to analyze this in depth, but something with your network is not configured correctly.

My local network, where the PBX server is located, is 192.168.0.0.
The network 192.168.1.0 is a completely different network where the endpoint is located; it's the local IP range at that site. So the traffic travels from that network out across the internet using a SIP domain address that resolves to the external IP address of the home network, and is then routed by the Ubiquiti gateway to the IP address of the PBX server.
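
For anyone following along, here is a rough Python sketch of how the addresses in this thread sort out from the PBX's point of view. It assumes the PBX LAN is a /24 (the post only says "192.168.0.0"), and 192.168.1.50 is a made-up stand-in for the remote phone's LAN address. The point is only that a 192.168.1.x address is private but not local to the PBX, so it should never be something the PBX tries to reach directly.

```python
import ipaddress

# Assumption: the PBX LAN is a /24; the thread only gives "192.168.0.0".
PBX_LOCAL_NET = ipaddress.ip_network("192.168.0.0/24")


def classify(addr: str) -> str:
    """Classify an address relative to the PBX's own LAN."""
    ip = ipaddress.ip_address(addr)
    if ip in PBX_LOCAL_NET:
        return "local"            # same subnet as the PBX
    if ip.is_private:
        return "private-remote"   # private range, but at another site
    return "public"               # routable internet address


# 192.168.0.234 and 58.87.6.70 come from the log above;
# 192.168.1.50 is a hypothetical phone address at the remote site.
for addr in ("192.168.0.234", "192.168.1.50", "58.87.6.70"):
    print(addr, classify(addr))
```

Only the "local" network would normally be listed as a local network in the Asterisk SIP settings; the remote site's range would not.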

ok this seems to be a problem with the way the backup and restore process happens.

I just blew away the server and rebuilt from the ground up configuring exactly the same way as before and it now works

ok there seems to be some major issues with this most recent version.

I have rebuilt the environment and the SIP trunking doesn't work anymore.

I have confirmed the settings are all the same. I can dial out but not receive any incoming calls.

There seems to be something that is not being passed through to the trunk provider, despite the configuration being identical.
There are, however, two new options that were never there in older versions.

I have also powered up the old host with a different ip address and interestingly just having it on sends whatever registration request is needed by the voip provider to get the lines working properly.

This isn't really a solution; I need to repurpose that hardware and I'm not leaving it running. Is this a bug?

You have SIP ALG enabled in your Ubiquiti, which I suspect is causing trouble. Try disabling it. Also, make sure that you don’t have the 30 second UDP stream timeout (the default of 180 is ok). https://community.ui.com/questions/Disable-SIP-ALG-on-USG/6ce1f278-e658-4ac8-8063-2c60696cbcb6

OK, I might just be being stupid here, but why would that setting only be affecting this new server? When I power up the old server, which still sits behind the Ubiquiti network in this configuration, it works.

I mean, if this was an issue across the board I would accept it to be an issue on the network. My issue is that everything but the OS deployment is the same. That points to the issue not being network related.

OK, so if you insist that we prove that the ALG is causing trouble:

First, does anything appear in the Asterisk log (including pjsip logger) on an attempted incoming call? If so, paste that and we can see whether it is corrupted.

Otherwise, capture traffic on the WAN interface and look at the successful REGISTER going out (the one that gets a 200 OK response), specifically the Contact header. It should contain your public IP and some port number. Then call in and you should see an INVITE from the provider directed to that port, which the firewall should forward to port 5060 on the PBX.
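
If it helps, here is a minimal Python sketch of that Contact header check, assuming you have copied the captured REGISTER out of the capture as plain text. The sample message is invented for illustration (provider.example and user 1000 are placeholders, not values from this thread), and the helper names are my own. It pulls the host and port out of the Contact header and flags an RFC 1918 address, which is what you would not want the provider to see.

```python
import ipaddress
import re

# RFC 1918 private ranges; a Contact in one of these is not reachable
# from the provider's side.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def contact_host_port(sip_message: str):
    """Return (host, port) from the first Contact header, or None."""
    for line in sip_message.splitlines():
        # "m:" is the compact form of the Contact header.
        if line.lower().startswith(("contact:", "m:")):
            match = re.search(
                r"sips?:(?:[^@>;\s]+@)?([^:;>\s]+)(?::(\d+))?", line)
            if match:
                # Assume the default SIP port when none is given.
                port = int(match.group(2)) if match.group(2) else 5060
                return match.group(1), port
    return None


def is_rfc1918(host: str) -> bool:
    """True if host is a literal IPv4 address in a private range."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not a literal IP
    return any(ip in net for net in RFC1918)


# Hypothetical captured REGISTER, trimmed to the relevant headers.
sample_register = "\r\n".join([
    "REGISTER sip:provider.example SIP/2.0",
    "Contact: <sip:1000@192.168.0.234:5060>",
    "",
    "",
])

host, port = contact_host_port(sample_register)
print(host, port, "private" if is_rfc1918(host) else "not private")
```

A private address in the Contact on the WAN side would suggest the external address setting in Asterisk is not being applied, or that something in the path is rewriting it.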

Yes please because if i am going to be making unsupported changes to a closed ecosystem i need to be able to justify that.

and unfortunately that doesn’t appear to be the cause at this stage because the only thing that has changed is this server

no. nothing is coming inbound because the trunk provider has said the pbx is not registering on their system. unfortunately they just ran down the clock without resolving.

however they said their system is not passing anything to the pbx because they are not receiving a registration request.

OK, I have begrudgingly gone through the instructions provided, and ALG is already disabled, so that can't be the problem.

i went through the steps anyway then retested and the issue remains.

Does the PBX show that the trunk is registered ok? If not, if it is sending requests, what shows in Reports → Asterisk Info → Registries and what replies is it getting?

If the PBX does think it’s registered ok, what do the requests going out on the WAN interface look like? Does the Contact header have the correct public IP address and port?

<Registration/ServerURI…> <Auth…> <Status…>

number redacted/sip:siptcpeast.simtex.com.au:5062 number redacted Registered (exp. 3580s)

now here is something interesting.

I just changed the virtual network the vm is connected to, there are two in there
one is HB_VLAN100
the other is LB_VLAN100
Both are configured the same, with one exception: one is the low-bandwidth 1Gb Ethernet connection with MTU 1500, and the other is a 10Gb fiber connection with MTU 9000. When the VM is connected to the high-bandwidth link, this issue seems to be present. When it is connected to the 1Gb Ethernet link, it registers and inbound routes seem to work (well, it does what I have told it to do, which is route to voicemail while testing). If this means I have to commission another dedicated Ethernet link, so be it, but I was hoping to avoid doing that if possible.
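
That behaviour would be consistent with an MTU mismatch, although nothing in this thread confirms it. A quick Python sketch of the arithmetic, assuming plain IPv4 headers with no options and assuming the path out to the internet is limited to 1500 bytes (typical, but not verified here):

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_udp_payload(mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def tcp_mss(mtu: int) -> int:
    """TCP maximum segment size a host would derive from this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def fits_without_fragmentation(payload_bytes: int, path_mtu: int) -> bool:
    """True if a UDP payload of this size crosses the path in one packet."""
    return payload_bytes <= max_udp_payload(path_mtu)


# The two virtual networks from the post above.
for name, mtu in (("LB_VLAN100", 1500), ("HB_VLAN100", 9000)):
    print(name, "max UDP payload:", max_udp_payload(mtu),
          "TCP MSS:", tcp_mss(mtu))

# Hypothetical 2000-byte SIP message leaving the MTU 9000 interface
# and then crossing a 1500-byte internet path.
print(fits_without_fragmentation(2000, 9000))  # fits on the local link
print(fits_without_fragmentation(2000, 1500))  # too big for the WAN path
```

With MTU 1500 the limit is 1472 bytes of UDP payload; with MTU 9000 the VM may send much larger packets that then have to be fragmented or dropped further along. If that is what is happening, lowering the MTU on the VM's interface to 1500 while staying on the 10Gb network may be worth trying before commissioning another link.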
