How a cloud-based PBX reaches phones that are behind NAT: am I correct?

OK so I’ve been trying to wrap my head around why my setup functions:

Normally, sending any type of request to a public IP address for a home or business router is dropped unless you explicitly open up a port on your router and forward it to an internal IP address within your LAN.

My phones are in a LAN, behind NAT, and the PBX is on a public IP address on a VPN. When someone calls my company, the call travels to the trunk, then to the PBX, and the PBX in turn signals the phones to ring.

So that’s the part that baffles me, since the PBX can’t just send a signal to my router’s public (WAN) IP address. Well, it can, but it will be dropped. So for this to work: does the SIP protocol mean that the phone registers by sending a request (similar to an HTTP request) to the PBX and tells the NAT router to wait a really, really long time for an answer?

By that I mean the PBX doesn’t really send an unsolicited signal to my router, but rather an answer to a relatively old SIP request (old compared with, for instance, how quickly HTTP requests time out).

Am I right in assuming that this is how the PBX is able to reach my phones on incoming calls in the first place?

There might be better ways to explain it, but: you have the Gmail app on your phone, and you don’t need to open any ports on your phone to allow “receiving email”. Your mail client has a connection to the mail server, which notifies you once there’s new mail for you on the server and lets you view, send and manage mail in the mail client.

It is essentially the same with SIP clients. Your phones have a live session to the SIP server, and of course for that to happen you have to allow the SIP ports on the firewall the SIP server is behind.
Once there’s a new call, the SIP server “tells the phone” via its live session that there’s a new call. So there is no need to open any ports on the client side.

However, in some cases, there are routers/firewalls that can kill live SIP Sessions resulting in no incoming calls. Then you need to play with the SIP Settings on the client router/firewall side.

Hope I came across clear.

Not sure how accurate this is, but take a look: https://www.asteriskpbxsystems.com/sip-signaling.html

Well I’m not really sure this is correct to be frank. I mean e-mail is either a response to an IMAP request or pushed through a push-protocol (like a websocket). But I don’t think SIP is a modern enough protocol to allow push and I certainly don’t have any ports open on my router.

The way @Stewart1 explained it (if I’m correct) is that the router keeps a port ‘available’ for a certain phone, as long as the incoming message from the PBX has the right ‘nonce’. The problem I sometimes have is that the PBX still thinks it’s using a valid port/nonce combination while the router has already deleted it. That is, if I’m correct, which is what I’m wondering about.

Thanks for the link!

The SIP client continuously sends REGISTER packets to the PBX and the PBX continuously sends SIP OPTIONS packets to the client. Most of the time this is enough to keep the NAT session alive in the client router, but sometimes there are issues and you have to tweak the default settings to increase the frequency. You can see the packets for yourself by looking at the signalling.

Cool post! So the OPTIONS packets are in response to the REGISTER packets? Anyway, let me read the post first and then ask questions.


They aren’t strictly in response. Once the phone is registered, they are sent periodically to make sure the remote endpoint is still present and to keep any NAT mappings open. The act of registering just provides the address information, so that the PBX knows where to send the OPTIONS requests.
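
For a sense of what such a keep-alive looks like on the wire, here is a minimal Python sketch that builds an OPTIONS request as plain text. The addresses, the user name `pbx` and the helper name `build_options` are made up for illustration. The header set is the minimum RFC 3261 requires for a request; a real PBX will normally add more (User-Agent, Contact and so on).

```python
import uuid


def build_options(target_ip, target_port, local_ip, local_port, cseq=1):
    """Return a bare-bones SIP OPTIONS request as a string (illustrative only)."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]   # RFC 3261 magic-cookie prefix
    lines = [
        f"OPTIONS sip:{target_ip}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:pbx@{local_ip}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_ip}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_ip}",
        f"CSeq: {cseq} OPTIONS",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"   # an empty line ends the headers


# Hypothetical addresses: a PBX at 198.51.100.10 pinging a phone that the
# router exposed as 203.0.113.5:40000 when the phone registered.
message = build_options("203.0.113.5", 40000, "198.51.100.10", 5060)
print(message)
```

Note that the request is addressed to the router’s public address and mapped port, not to the phone’s LAN address, which is exactly the information the earlier REGISTER supplied.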

Then my knowledge of routers is lacking; won’t a router drop anything that isn’t traceable to a request made from inside the LAN except for when you specifically opened up ports?

For UDP, routers map an external port to an inside IP address and port. This mapping is kept alive for a period of time, and the lifetime is refreshed as traffic flows. For example, a router may keep a mapping alive for 180 seconds; if traffic flows through it (from inside or outside) 60 seconds in, the timer is reset and the mapping again has 180 seconds before it expires.

In the case of SIP, since the inside device sent a REGISTER externally, the router establishes a port mapping, so traffic from outside goes back to the inside device. The OPTIONS requests then ensure that this mapping remains valid.
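
The timer behaviour described above can be sketched as a toy Python model. This is not any real router’s code: the class and method names are invented, time is simulated with plain numbers, and the 180-second timeout is just the example figure from this post.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    inside: tuple        # (LAN IP, port) of the phone
    expires_at: float    # simulated time at which the mapping is dropped


class NatTable:
    """Toy model of a router's UDP mapping table."""

    def __init__(self, idle_timeout=180.0):
        self.idle_timeout = idle_timeout
        self.mappings = {}   # external port -> Mapping

    def outbound(self, inside, ext_port, now):
        """Packet from the LAN: create the mapping or refresh its lifetime."""
        self.mappings[ext_port] = Mapping(inside, now + self.idle_timeout)

    def inbound(self, ext_port, now):
        """Packet from outside: deliver only if a live mapping exists."""
        m = self.mappings.get(ext_port)
        if m is None or now >= m.expires_at:
            self.mappings.pop(ext_port, None)
            return None                           # dropped
        m.expires_at = now + self.idle_timeout    # traffic refreshes the timer
        return m.inside


nat = NatTable(idle_timeout=180)
phone = ("192.168.1.20", 5060)
nat.outbound(phone, 40000, now=0)        # REGISTER leaves the LAN
at_60 = nat.inbound(40000, now=60)       # OPTIONS arrives, timer reset to 240
at_200 = nat.inbound(40000, now=200)     # still inside the refreshed window
at_600 = nat.inbound(40000, now=600)     # long silence, mapping has expired
print(at_60, at_200, at_600)
```

The last call shows the failure mode: once the gap between packets exceeds the router’s timeout, the PBX is sending to a port that no longer maps to anything.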

Yeah the UDP part is mostly the call itself I guess.

I think we’re roughly talking about the same things when it comes to registering and keeping some entry open to the outside world in the routing table. The problems I experience are mostly due to a mismatch between the OPTIONS packet and the routing table of the router, causing phones to unregister.

The most crude (and effective) way of solving this would just be to take off the power to the router at night so that the routing table is emptied.

Keep in mind that what the other posts have told you is a great simplification of how router translation code works. In reality the translator has code to identify that it’s a SIP packet, and that there are UDP packets associated with a registration, and to not reject them since they are connectionless. Phones can also mess around with the SIP packets if they think they are behind a translator, in an effort to “help” the connection stay open, and the routers have to know about that, too.

For example, with a Cisco IOS router, if you port forward 5060 from the outside to the inside you can connect SIP phones on the outside to a PBX on the inside with no problem. If you change that port forward from 5060 to 5061, then it won’t work, even if the PBX and phones are reconfigured to use the new port. That’s because the port forward code isn’t just doing a strict map and header rewrite that would work for any TCP service; it’s doing something special for a forward on 5060 that it won’t do for any other forward. Similarly, with the dd-wrt firmware, some versions work with SIP going through them, while with other versions, all other config being identical, SIP calls fail in the middle of the call.

The simplified explanation is useful to pretend we know what’s going on to set the router config properly but you will never really know exactly what is going on unless you read the router’s source code (and you won’t get access to it for many devices)

I learned a long time ago with problems getting SIP across translators to just keep trying different configs, different phones, different router firmware. If you try to use logic to solve these problems you will never get anywhere.

That’s… alarming :wink:

The router in this specific case is an ISP-provided Fritz Box from AVM; there’s not much you can do with it. Isn’t SIP a protocol that’s higher up the OSI model than transport protocols like TCP or UDP? I mean, SIP packets can travel over both TCP and UDP, right? Depending on what you’re doing, either setting up registration or a call versus actually calling?

In the end, it seems a thorough knowledge of the SIP protocol is the only way to truly debug this sort of thing properly.

Wrong:

https://www.edpnet.be/en/support/installation-and-usage/internet/manage-fritz!box/how-do-i-configure-my-fritzbox-in-bridge-mode.html

Considering the box has a VoIP phone and a SIP server in it, I am sure it is really screwing around with SIP. If you are forced to use it, then turn all that garbage off, put it into bridged mode, and put a real router behind it.

Yeah, sure, you can bridge it, but that’s not what I meant by ‘do with it’. What I meant was changing settings so that it better supports the setup in use. Indeed, to have that kind of flexibility you’re better off setting up a proper router and bridging the ISP box, which is surely what I would do if I were to set up more of these installations.

I found this by the way, really cool.

I’ve seen hundreds of tickets associated with routers running SIP ALG (variously called SIP Inspection, VoIP optimization, etc.), and in every single case except one, the fix was to just disable SIP ALG on the router and configure the PBX correctly for NAT. I can’t think of a single reason why one should put any effort into making SIP work with the ALG enabled.


So that’s a router’s attempt to mangle the packets and ‘improve’ them, but actually making it worse?

Correct. Possible workarounds:

Setting up NAT traversal on the phone might avoid further modifications by the router.

Use nonstandard ports for both the phone’s local port and the PBX. The ALG on most routers is port based.

SIP over TCP or TLS. Most routers won’t butcher TCP; modifying TLS is impossible.

VPN client on phone to VPN server on PBX.

VPN client on auxiliary router provides non-NAT connection for all phones on the LAN.

The above quote is incorrect, and it is the reason why you don’t understand why your setup works. You need not explicitly open up a port to allow incoming traffic to traverse your router and end up at your PBX or another device on your network. If that were the case, your computer could never receive a response when you reached out to a web-site, your Fire TV could never receive a response from Netflix when you pushed “play” on a movie, and your PBX could never receive a call.

Most modern routers use Linux at their core. Linux typically uses iptables to manage its NAT and firewall. Iptables setups usually include two firewall rules commonly described as “allow related” and “allow established” (they match the connection-tracking states RELATED and ESTABLISHED). Those two rules are the magic that allows your PBX to work.

A register packet is how your PBX tells a remote system what your IP address is and the port to route calls to on your IP address. Register packets are not necessary for SIP to work, but you’d have to provide the remote system with your IP address and port number in some other manner (via a web interface, for example) if you didn’t do so with a registration packet.

A qualify (options) packet is how your PBX asks the remote system to verify that it is functioning and what functions it supports. Generally, if you configure Asterisk to send a qualify packet, and it doesn’t receive a response within 2000ms (2 seconds), Asterisk will assume that the trunk is down and will not send calls out on that trunk until it comes back up. Options packets are typically sent every 60 seconds, but again you can change that by changing the qualifyfreq= settings in the PEER details.
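
The decision described above boils down to a timeout check. Here is a rough Python sketch of it; it is not Asterisk’s actual code, the function name `trunk_usable` is invented, and the 2000 ms default is simply the figure mentioned in this post.

```python
def trunk_usable(reply_delay_ms, max_ms=2000):
    """Decide whether a trunk should receive calls after one qualify attempt.

    reply_delay_ms is the time until the OPTIONS reply arrived, in
    milliseconds, or None if no reply was seen at all.
    """
    if reply_delay_ms is None:
        return False                  # no answer: treat the trunk as down
    return reply_delay_ms <= max_ms   # a late answer also counts as down


fast = trunk_usable(35)       # reply after 35 ms
slow = trunk_usable(2500)     # reply arrived, but too late
silent = trunk_usable(None)   # no reply
print(fast, slow, silent)
```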

When your PBX sends out a register packet or a qualify (options) packet to a remote host from port 5060 (or to port 5060), iptables recognizes that the remote host is likely to respond at some point in the near future. Your router forwards the register or options packet out on a defined port from your public IP address and then keeps that port open for replies for a certain period of time. It usually starts with the port your PBX is using (i.e., 5060); if some other device on your network got to that port first, it will pick some other random port.

If your PBX continues to send periodic traffic to that destination from the same source port to the same destination, your router will hold that port open for replies essentially forever. If your PBX doesn’t keep sending traffic, eventually, your router will expire the relationship and stop forwarding replies to that port back to your PBX.

A good router (most routers) will only keep the port open for replies from the destination IP address you reached out to. A crummy router will open the port to the entire world (i.e., any packet from any IP address to that port will get forwarded to the device on your network). I’ve seen consumer grade routers that appeared to do that, i.e., opening port 5060 to the world and thus allowing SIP scanning tools to attempt to hack the device.
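
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration: the class name `OpenPort`, the flag `restrict_to_contacted` and all addresses are made up, and real routers track considerably more state than this.

```python
class OpenPort:
    """One port a router is holding open after the PBX reached out."""

    def __init__(self, inside, contacted_ip, restrict_to_contacted=True):
        self.inside = inside                      # (LAN IP, port) of the PBX
        self.contacted_ip = contacted_ip          # host the PBX sent traffic to
        self.restrict_to_contacted = restrict_to_contacted

    def forward(self, source_ip):
        """Return the inside address the packet goes to, or None if dropped."""
        if self.restrict_to_contacted and source_ip != self.contacted_ip:
            return None
        return self.inside


pbx = ("192.168.1.10", 5060)
good = OpenPort(pbx, "198.51.100.10", restrict_to_contacted=True)
crummy = OpenPort(pbx, "198.51.100.10", restrict_to_contacted=False)

provider_ok = good.forward("198.51.100.10")    # reply from the SIP provider
scanner_blocked = good.forward("192.0.2.66")   # stranger hits the same port
scanner_leaks = crummy.forward("192.0.2.66")   # crummy router lets it through
print(provider_ok, scanner_blocked, scanner_leaks)
```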

“Allow related” and “allow established” are not the same as SIP ALG, but they could be confused with SIP ALG. “Allow related” means that iptables looks at your traffic, attempts to determine what it is, and then allows packets that seem related to it back to the same device, sometimes on related ports. For example, if your router sees traffic on port 5060, it might allow packets in on ports 10,000 to 20,000, thus letting through the RTP traffic that is related to your 5060 traffic.

SIP ALG is different than allow related/established. Instead of simply allowing in traffic that would otherwise be discarded, SIP ALG changes the contents of the SIP Headers, in recognition of the fact that when SIP is configured strictly, SIP devices will ignore the UDP headers and instead respond to the addresses contained in the SIP headers.

If you’re behind a firewall and you aren’t configured correctly, the SIP headers could have internal IP addresses (192.168.1.xx) instead of external IP addresses. If your remote system is configured strictly, it will attempt to reply to those internal IP addresses, and you’ll never get the response. And, as noted above, even if you send out requests from Port 5060, your router might have to change the source port on the UDP headers if another device on your network is already using port 5060. Again, if the remote system attempts to respond to the port in the SIP headers (5060), the traffic may go to a port that your router may not be expecting, and the traffic may be dropped.

SIP ALG is almost never necessary because (1) most people do configure their SIP headers correctly (to show their public IP address instead of their internal private IP address) and (2) most SIP providers configure their systems to use the UDP headers rather than the SIP headers (NAT=Yes) because of the port altering that can occur when the source port is already being used by another device behind your router.
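
To make the “SIP headers versus UDP headers” distinction concrete, here is a small Python sketch of how a remote system might pick the address it replies to. The function names and the `nat_mode` flag are invented for illustration (the flag stands in for a NAT=Yes style setting), and the parsing is deliberately naive: it handles a single IPv4 Via value and nothing else.

```python
def via_sent_by(via_value):
    """Extract (host, port) from a Via header value such as
    'SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776'."""
    sent_by = via_value.split()[1].split(";")[0]   # drop protocol and params
    host, _, port = sent_by.partition(":")
    return host, int(port) if port else 5060      # 5060 is the SIP default


def reply_target(via_value, udp_source, nat_mode):
    """Pick where the response is sent.

    udp_source is the (IP, port) the packet actually arrived from, i.e. what
    the sender's router put in the UDP/IP headers.
    """
    return udp_source if nat_mode else via_sent_by(via_value)


via = "SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776"
seen_on_wire = ("203.0.113.5", 40000)   # router rewrote address and port

strict = reply_target(via, seen_on_wire, nat_mode=False)
nat_aware = reply_target(via, seen_on_wire, nat_mode=True)
print(strict, nat_aware)
```

In strict mode the reply goes to a private address that is unreachable from the internet; in NAT mode it goes back to the public address and port the router actually used, which is the mapping the router is holding open.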

SIP ALG is also a cause of problems because, more often than not, it rewrites the SIP headers incorrectly, i.e., by rewriting the wrong packets or writing the wrong information. Add the fact that it’s almost never necessary (see above), and SIP ALG is one of the most common causes of VOIP issues.

I’ve greatly oversimplified this post in order to answer your questions, and I’m confident that someone will reply to point out that I’ve gotten something wrong. But, basically, this is why your system works even though you haven’t explicitly forwarded a port. If you want to read more about it, I suggest that you google “iptables” and “allow related”.


You’re not entirely right. I was referring to unsolicited requests coming in. If I send an HTTP request to my private home IP address, it’ll be dropped unless it’s part of an established connection and is therefore a response to something initiated from inside the LAN. (I understand how firewalls work; I’ve configured iptables on my VPS.)

The rest of your post seems like a very good read, thanks!

Could I also assume that ‘trunk’ here is a synonym for ‘peer’ (like a phone)?