Sudden NAT handling change in FreePBX 13

michaelschefczyk · November 20, 2015, 4:29pm

Dear Daniel Friedman,

Thank you very much! I am testing basically one step per day now so that I can see whether each change makes a practical difference. So far, I am at FreePBX 13 with the system firewall disabled, no localnet, and on the device/extension side: no nat, trustrpid but no sendrpid. Changing sendrpid=yes to no in the trunk definition will be next, and then it will be time to try the firewall again. I will report where I end up.

Regards,

Michael

michaelschefczyk · November 28, 2015, 11:03am

Dear Daniel Friedman,

As next steps, I replaced sendrpid=yes with sendrpid=no for my trunks without causing any issues that I could detect. With Core 3.0.14 and Framework 3.0.25, I could also turn on Firewall 3.0.11.1 again without NAT/audio issues. This probably needs practical testing for a few more days, but it looks/sounds good so far. I have Network Manager off and the interface configured through System Admin -> Network Settings. My interface is in the external zone and everything else is pretty much standard.

Thank you very much for guiding me to get to this point!

The only thing I could not get back after upgrading to FreePBX 13 is NIC bonding (NIC Bonding in FreePBX 13). It seems that it is now a must to configure interfaces through System Admin -> Network Settings and that does plainly not permit bonding. Is that correct?

Regards,

Michael

danielf · November 28, 2015, 1:51pm

Hi Michael,

I am glad that you got it working at last.

As for the bonding configuration, to avoid misconfiguration, I would restore a backup to another machine and configure the bonding manually there to see how it interacts with the network settings module in FreePBX. Just make sure you have a console connection to the backup server before configuring the bonding.

You can read a little about network bonding in Centos 6.6 here (read this manual to its end before configuring):

http://www.paulmellors.net/centos-6-6-network-bonding/

Thank you,

Daniel Friedman
Trixton LTD.

michaelschefczyk · November 28, 2015, 2:53pm

Hi Daniel,

Basically, I am almost certain that I have the bonding configuration set correctly. It seems to be in line with what Paul Mellors and the CentOS documentation describe, and it was in use for a long time with CentOS 6.6 (and remains in use on other CentOS 7.1 servers I am running). The files in /etc/sysconfig/network-scripts are:

ifcfg-bond0:
DEVICE=bond0
ONBOOT=yes
TYPE=Ethernet
BONDING_OPTS='mode=802.3ad miimon=100'
BRIDGE=br0
NM_CONTROLLED=no
BOOTPROTO=none
IPV6INIT=no
NOZEROCONF=yes

ifcfg-br0:
DEVICE=br0
ONBOOT=yes
TYPE=Bridge
IPADDR=192.168.12.10
NETMASK=255.255.255.0
GATEWAY=192.168.12.1
DNS1=192.168.12.1
NM_CONTROLLED=no
NOZEROCONF=yes

ifcfg-eth0:
DEVICE=eth0
TYPE=Ethernet
USERCTRL=no
SLAVE=yes
MASTER=bond0
BOOTPROTO=none
HWADDR=00:25:90:C7:B5:32
NM_CONTROLLED=no

ifcfg-eth1:
DEVICE=eth1
TYPE=Ethernet
USERCTRL=no
SLAVE=yes
MASTER=bond0
BOOTPROTO=none
HWADDR=00:25:90:C7:B5:33
NM_CONTROLLED=no

This configuration used to work with FreePBX 12 and it still works on similar servers running CentOS 7.1. In line with this, the switch reports that a Link Aggregation Group is running. On CentOS, the bridge-utils package is present and current, of course. ifconfig output:

bond0 Link encap:Ethernet HWaddr 00:25:90:C7:B5:32
inet6 addr: fe80::225:90ff:fec7:b532/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:1125 errors:0 dropped:0 overruns:0 frame:0
TX packets:814 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:122959 (120.0 KiB) TX bytes:341828 (333.8 KiB)

br0 Link encap:Ethernet HWaddr 00:25:90:C7:B5:32
inet addr:192.168.12.10 Bcast:192.168.12.255 Mask:255.255.255.0
inet6 addr: fe80::225:90ff:fec7:b532/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1093 errors:0 dropped:0 overruns:0 frame:0
TX packets:710 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:102549 (100.1 KiB) TX bytes:334652 (326.8 KiB)

eth0 Link encap:Ethernet HWaddr 00:25:90:C7:B5:32
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:242 errors:0 dropped:0 overruns:0 frame:0
TX packets:100 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:27926 (27.2 KiB) TX bytes:23414 (22.8 KiB)
Memory:fe120000-fe13ffff

eth1 Link encap:Ethernet HWaddr 00:25:90:C7:B5:32
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:883 errors:0 dropped:0 overruns:0 frame:0
TX packets:714 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:95033 (92.8 KiB) TX bytes:318414 (310.9 KiB)
Memory:fe100000-fe11ffff

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:570 errors:0 dropped:0 overruns:0 frame:0
TX packets:570 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:52567 (51.3 KiB) TX bytes:52567 (51.3 KiB)

The firewall does see all interfaces including bond and bridge and one can move all of them into the external zone. System Admin -> Network Settings is practically out of the loop.

What may point to the cause of the problem is that whenever one opens a terminal, the following red error message is shown:

[Whoops\Exception\ErrorException]
file_get_contents(/sys/class/net/bonding_masters/type): failed to open stream: file or directory not found

I do not really understand this but I sense that sysfs is now involved in bonding and as far as I understand that is not as stable and persistent as the classical method.

Regards,

Michael

P.S.: One edit, because I did happen to turn off one of the interfaces at the switch during testing. This is now corrected - with the key error message remaining, though.

danielf · November 28, 2015, 3:50pm

Hi Michael,

It seems that this is a bug in the firewall module (probably it cannot read the file because of a permissions problem). I would try removing the bridge, since you do not really need it. Did you try disabling the firewall module?

Try to run this without the bridge interface, and if you still get the same result, open a bug in the FreePBX bug tracker.

Thank you,

Daniel Friedman
Trixton LTD.

michaelschefczyk · November 28, 2015, 5:54pm

Hi Daniel,

The error shows up whenever the bridge is defined. Enabling or disabling (even removing) the firewall does not seem to have any impact on the error message. Nevertheless, the system seems to work. Indeed, I could go without NIC bonding. On the other hand, if high availability hardware is available, one may just want to use it.

As I am in a dual SOHO situation, I tend to leave one server running with NIC bonding and one with just a single NIC connected. Then, I should be able to collect further experiences.

Would you recommend to file a bug now?

Regards,

Michael

danielf · November 29, 2015, 12:22pm

Hi Michael,

I meant that you can remove the bridge interface, as you do not need it for bonding. You can configure the IP address directly on the bond0 interface. Try to configure your server with bonding according to this (refer to the second part of the article): http://www.unixmen.com/linux-basics-create-network-bonding-on-centos-76-5/

It is clearer than the first link that I gave you.

Thank you,

Daniel Friedman
Trixton LTD.

michaelschefczyk · November 29, 2015, 3:48pm

Hi Daniel,

thanks again. Practical networking does work with bond and bridge as well as with the bond only. However, the error message does stay the same in both variants. So in the end, the only hard fact that I can see is the error message - possibly there are no other implications.

Regards,

Michael

danielf · November 29, 2015, 7:27pm

Hi,

This error:

[Whoops\Exception\ErrorException] file_get_contents(/sys/class/net/bonding_masters/type): failed to open stream: file or directory not found

usually indicates a permission problem, but just to be on the safe side, can you paste the output of:

ls -la /sys/class/net/bonding_masters/type

and if it exists, please paste the contents of the file:

cat /sys/class/net/bonding_masters/type

Thank you,

Daniel Friedman
Trixton LTD.

michaelschefczyk · November 29, 2015, 8:20pm

Hi Daniel,

thanks again! Unfortunately, /sys/class/net/bonding_masters/type does not exist. The directory /sys/class/net just contains symlinks to all interfaces plus a file (readable to all users and read/write to root as its owner) “bonding_masters” with one line of content: “bond0”.
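The finding above suggests that whatever raises the error treats every entry under /sys/class/net as an interface directory and tries to read <entry>/type, while bonding_masters is a plain file listing the bond names. As a minimal sketch in Python (not FreePBX's actual code; the function names and the fake directory tree are made up for illustration), an enumeration that skips non-directory entries could look like this:

```python
import os
import tempfile


def interface_types(sys_net="/sys/class/net"):
    """Map interface name -> contents of its 'type' file.

    Hypothetical helper: entries that are not directories (such as the
    plain file 'bonding_masters') are skipped instead of being read as
    <entry>/type.
    """
    types = {}
    for name in sorted(os.listdir(sys_net)):
        entry = os.path.join(sys_net, name)
        if not os.path.isdir(entry):  # isdir() follows symlinks
            continue
        type_file = os.path.join(entry, "type")
        if os.path.isfile(type_file):
            with open(type_file) as fh:
                types[name] = fh.read().strip()
    return types


def bonding_masters(sys_net="/sys/class/net"):
    """Return the bond names listed in the bonding_masters file, if any."""
    path = os.path.join(sys_net, "bonding_masters")
    if not os.path.isfile(path):
        return []
    with open(path) as fh:
        return fh.read().split()


def demo_tree():
    """Build a fake tree shaped like the one described in the post."""
    root = tempfile.mkdtemp()
    for name in ("bond0", "eth0"):
        os.mkdir(os.path.join(root, name))
        with open(os.path.join(root, name, "type"), "w") as fh:
            fh.write("1\n")
    with open(os.path.join(root, "bonding_masters"), "w") as fh:
        fh.write("bond0\n")
    return root
```

Calling interface_types(demo_tree()) returns only the real interface entries, and bonding_masters() reads the bond list from the file itself rather than from a path below it.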

Regards,

Michael

danielf · November 30, 2015, 7:51am

Hi,

Well here is your bug. Take your findings from your last answer and report a bug.

Thank you,

Daniel Friedman
Trixton LTD.

michaelschefczyk · December 5, 2015, 9:50pm

Dear All,

Oh no! I thought that NAT and one sided audio issues would be resolved and my systems would be running better than ever.

Today, I did check the SIP settings and noticed that the RTP port ranges were 10000 through 20000. No big thing, but I had set them to 30000 through 30099 originally with optimized firewall settings - while 10000 through 20000 would still work. I did change that back in the SIP settings and reloaded thereafter. From that point onwards, I had one sided audio again. Regardless of further changes of the SIP settings and reboots, these problems persisted and “sip show settings” did not reveal a cause.

What I did then was to revert one of my virtual machine warm spares by two weeks, to Framework 13.0.19, Core 13.0.11 and SIP Settings 13.0.14.5. With that version, I was able to change the SIP settings as I liked, reload and find no one sided audio issues. I had to reboot once, but that was all there was. Then, I upgraded to the current versions. This reverted the RTP port range from 30000 through 30099 to 10000 through 20000 again. From that point onwards, anything that leads to a reload after the slightest change of SIP settings results in one sided audio.

What I ended up doing is use my virtual machine warm spares based on the backup of the early hours of today, i.e. before making any changes to the SIP settings. I left the SIP settings untouched, as they do work and as I know that any change would render the system unusable. My main hardware servers must remain off, since I am unable to revert them to before the change of the SIP settings.

I would very much welcome any advice on how to cure this.

Regards,

Michael

dicko · December 5, 2015, 10:11pm

/etc/asterisk/rtp.conf holds your usable range of ports (and it is definitive). You need to match that range in your router; it should start on an even number and span at least twice the number of likely concurrent calls. If that file does not match what you set in the GUI, it is likely a bug.
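The two rules of thumb above (even start port, at least two ports per concurrent call) can be checked mechanically. The following is only a rough Python sketch: it assumes the range is given by rtpstart and rtpend lines in Asterisk-style config text, and the function names are invented for this example. On a FreePBX system the values may sit in an included file rather than in rtp.conf itself, so pass in whichever text actually contains them.

```python
import re


def parse_rtp_range(conf_text):
    """Return (rtpstart, rtpend) found in Asterisk-style config text."""
    values = {}
    for line in conf_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        match = re.match(r"(rtpstart|rtpend)\s*=\s*(\d+)$", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values["rtpstart"], values["rtpend"]


def check_rtp_range(start, end, concurrent_calls):
    """Return a list of problems according to the two rules of thumb."""
    problems = []
    if start % 2:
        problems.append("range should start on an even port")
    ports = end - start + 1
    if ports < 2 * concurrent_calls:
        problems.append(
            f"{ports} ports is fewer than 2 x {concurrent_calls} calls"
        )
    return problems
```

For example, check_rtp_range(30000, 30099, 10) reports no problems, since 100 ports cover 10 concurrent calls with room to spare.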

michaelschefczyk · December 6, 2015, 2:34pm

Hi dicko,

Thank you very much. In the meantime, I checked the issue further, found a solution, but did not understand the cause. Thus, I would very much welcome feedback:

The RTP Port Range in Asterisk SIP Settings and in /etc/asterisk/rtp_additional.conf are always synchronized. This does seem to work as it should.
Our carrier (Deutsche Telekom AG) specifies RTP ports 30000 through 31000 plus inbound port forwarding for UDP packets. In line with recommendations, I did set RTP port ranges of 30000 through 30099 (more than wide enough for me) plus corresponding port forwarding rules. If I recall correctly, port forwarding did make sense one or two years ago. Today, I could not find that to be critical anymore. Furthermore, the carrier also connects if one specifies standard RTP port ranges of 10000 through 20000.
To my surprise, if I take a wireshark trace at my WAN interface, the RTP packets’ ports do never correspond to the RTP port ranges set in FreePBX. I have some older traces on file to compare - in the past, there was also no correspondence between RTP port ranges entered and what wireshark finds. Does this make sense??
When upgrading Asterisk SIP Settings from 13.0.14.5 to 13.0.14.6, it changes the RTP port ranges back to 10000 through 20000 regardless of what one did specify before. This did not happen with other recent upgrades which I tested.
In my scenario, I did not immediately notice the change of RTP port ranges due to the Asterisk SIP Settings module upgrade - how should I? When reverting RTP port ranges back to 30000 through 30099 yesterday, I did trigger a FreePBX System Firewall issue. It seems that one may go either with 10000 through 20000 and the System Firewall on, or with other port ranges and the System Firewall off. Unfortunately, this is not obvious - at least this was not obvious to me.

Regards,

Michael

dicko · December 6, 2015, 3:02pm

rtp set debug on

will show the audio payload as Asterisk sees it, hopefully bi-directional. Make sure any SIP "helper" function on your router is disabled. The rest might be FreePBX bugs.

lgaetz · December 6, 2015, 3:49pm

If the RTP port ranges specified in the conf files are not respected (even after asterisk restart) that points to a possible asterisk issue. It’s hard to imagine how the dahdi config module would have any influence over this, tho stranger things have happened.

The rtp ranges getting reset on upgrading the SIP Settings module sounds like a FreePBX bug, recommend filing a report at issues.FreePBX.org.

Difficult to say for sure, but the Firewall module sounds like it may be working properly, it should recognize the non default port range and allow the traffic. If Asterisk is not using these ports tho, it will be a problem.

plindheimer · December 6, 2015, 9:04pm

When a provider specifies RTP port ranges, that is almost always the range they will be signalling to you as where to send the media to them. The rtp.conf port ranges are where Asterisk is specifying to receive the streams. Unless your provider is requiring some really invasive rules on you, which is unlikely, there should be no reason to change your ranges, as they should send the media to wherever you tell them to, and as such you shouldn’t have to change the rtp.conf default configuration. The provider never dictates where it’s going to send to, only where you are to send to it.

In either case, if you choose to change your ports, it’s usually a good idea to have a range that is a lot more than adequate to cover the number of simultaneous calls you’ll have (x2 since it’s only even port numbers it will use).

It does appear from what you’ve described that there may be a bug in the changing and resetting of default RTP port settings, so if so, it’s a valid bug to report for us to explore. But my recommendation would be to stick with the defaults if the only reason you made these changes was thinking you had to based on your provider. (Unless I’m wrong, and they really are dictating the ‘from ports’ for your RTP as if they would reject any traffic that didn’t come ‘from’ their specified range; possible, but unlikely.)

michaelschefczyk · December 16, 2015, 5:40pm

Dear danielf, dicko, lgaetz, p_lindheimer,

Again, thank you very much for your responses! Over the last 10 days, I have been - unlike in the years before, where I did try to go by the book - ignoring the carrier’s specifications and changed the RTP ports back to the standard 10000 - 20000 range. I have also deleted all RTP port forwarding rules. The only port which I am still actively forwarding to FreePBX is now SIP 5060 UDP.

That led to a completely functional system without any issues - except for the mildly related NIC bonding point which I did file as an issue.

Regards,

Michael

plindheimer · December 16, 2015, 8:45pm

If you don’t have the RTP ports forwarded to you it may result in intermittent issues, so you may want to consider whether or not you want to keep it that way. The reason it works ‘most of the time’ is because upon the negotiated signalling, you begin to transmit to them through the ports that you’ve requested they send the media to you in. Once you have initiated that, they’ll be able to reach you through those ports as you’ve opened up a hole from inside. The things that can go wrong are twofold. First, until you do that, the hole isn’t opened, so any transmission they do to you will get blocked. Secondly, if your firewall re-maps your ports, you may or may not receive the media; it depends on how aggressive they are in trying to accommodate an ‘incorrect configuration’ on your part. Some providers may wait until they see your media stream, ignore the signalling information you supplied them in the SIP dialog, and send the media back to the ports that your stream is coming from. As long as they are doing that, it would continue to work. However, the proper operation would be for them to send the media to your advertised ports. If your firewall has remapped those ports, then the media will get blocked and you’ll end up with one way media, whereas if you had the ports forwarded, they would still arrive at your PBX.

So … the crux here is, it may work now, and it may continue working, but it may also break or even get intermittent problems if you don’t have those ports forwarded. The latter could happen because of changes on their side (which would not be anything they’re doing wrong) or changes on your side because of firewall behavior.
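The three cases described above (no hole opened yet, remapped ports, forwarded ports) can be walked through with a toy model. This Python sketch is a deliberate simplification and not how any particular router behaves: the ToyNat class, its remap_offset parameter and the port numbers are all invented for illustration, and timeouts and per-peer filtering are left out.

```python
class ToyNat:
    """Very simplified NAT: no timeouts, no per-peer filtering."""

    def __init__(self, forwarded=(), remap_offset=0):
        self.forwarded = set(forwarded)   # ports statically forwarded to the PBX
        self.remap_offset = remap_offset  # nonzero = NAT rewrites the source port
        self.pinholes = {}                # external port -> internal (PBX) port

    def outbound(self, pbx_port):
        """PBX sends RTP from pbx_port; return the port the provider sees."""
        external = pbx_port + self.remap_offset
        self.pinholes[external] = pbx_port
        return external

    def inbound(self, external_port):
        """Return the PBX port a packet is delivered to, or None if dropped."""
        if external_port in self.forwarded:
            return external_port
        return self.pinholes.get(external_port)


ADVERTISED = 10000  # port the PBX announced in its SIP signalling

# Case 1: no forwarding, no remapping. Media is dropped until the PBX sends first.
plain = ToyNat()
before = plain.inbound(ADVERTISED)     # dropped
plain.outbound(ADVERTISED)
after = plain.inbound(ADVERTISED)      # delivered through the pinhole

# Case 2: no forwarding, NAT remaps the source port.
remapping = ToyNat(remap_offset=50000)
seen = remapping.outbound(ADVERTISED)
to_advertised = remapping.inbound(ADVERTISED)  # dropped: one way audio
to_observed = remapping.inbound(seen)          # works only if the provider
                                               # replies to the observed port

# Case 3: ports forwarded. Media to the advertised port always arrives.
forwarding = ToyNat(forwarded={ADVERTISED}, remap_offset=50000)
forwarded_result = forwarding.inbound(ADVERTISED)
```

In this model only the forwarded case delivers media to the advertised port regardless of who sends first and regardless of remapping, which is the point made in the post.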

michaelschefczyk · December 16, 2015, 9:54pm

Dear Philippe Lindheimer,

Well, the attitude of Deutsche Telekom AG reflects that they are the incumbent and still dominant carrier in Germany operating hundreds of VoIP servers. While they push for fast migration to VoIP in order to scrap ISDN, they do not care a bit about users of Asterisk and the like. They rather want to sell their proprietary Speedport routers most of which are far too limited in quality and functionality to be used in any (semi-)professional environment. There is technical documentation about Telekom’s VoIP functionality available, hundreds of pages of it, but rather illegible and out of touch with reality, as it tends to be outdated, different servers seem to have different policies at each point in time and issues such as load balancing and proprietary DNS resolution strategies remain behind the curtain. Smaller competitors have working sample Asterisk configurations available for their customers, of course.

In the past, I did only open and forward a small range of ports (30000 - 30099), which is a subset of the range claimed by the carrier. I did also limit that to RTP traffic coming from Deutsche Telekom’s servers. I do hesitate, however, to open and forward very broad ranges, such as 10000 - 20000. Furthermore, limiting port forwarding to traffic originating from the carrier’s addresses is also tricky, as they refuse to reveal from which IP range their traffic is sent, so it is a constant game of watching and updating. Thus, I intend to avoid RTP port forwarding as long as I can while watching for adverse consequences. I did try this last about 18 months ago and, as you said, it did work for some time and then it did suddenly stop.

Regards,

Michael