Unusual disconnects

I just upgraded my PBX system to the newest stable 2.10 iso distro, and I am having an unusual disconnect problem. I have a ring group that ringall’s to an extension and to an external number (“210” and “7195551212#”). When a call comes in, the system disconnects the call. If I remove the external number, “7195551212#”, the call rings through fine to the extension. See that attached asterisk log (note the two WARNING[3790], these are present whenever a call is disconnected):

[2012-08-09 20:54:04] VERBOSE[14543] app_dial.c: – SIP/Falcon1-00000071 is making progress passing it to Local/[email protected];2
[2012-08-09 20:54:04] VERBOSE[14540] app_dial.c: – Local/[email protected];1 is making progress passing it to SIP/Falcon3-0000006f
[2012-08-09 20:54:04] VERBOSE[14543] app_dial.c: – SIP/Falcon1-00000071 answered Local/[email protected];2
[2012-08-09 20:54:04] VERBOSE[14540] app_dial.c: – Local/[email protected];1 answered SIP/Falcon3-0000006f
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:1] Set(“Local/[email protected];1”, “__MACRO_RESULT=”) in new stack
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:2] Macro(“Local/[email protected];1”, “blkvm-clr,”) in new stack
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:1] Set(“Local/[email protected];1”, “SHARED(BLKVM,SIP/Falcon3-0000006f)=”) in new stack
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:2] Set(“Local/[email protected];1”, “GOSUB_RETVAL=”) in new stack
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:3] MacroExit(“Local/[email protected];1”, “”) in new stack
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:3] ExecIf(“Local/[email protected];1”, “0?Set(MASTER_CHANNEL(CONNECTEDLINE(num))[email protected]/n)”) in new stack
[2012-08-09 20:54:04] VERBOSE[14540] pbx.c: – Executing [[email protected]:4] ExecIf(“Local/[email protected];1”, “0?Set(MASTER_CHANNEL(CONNECTEDLINE(name))=)”) in new stack
[2012-08-09 20:54:10] WARNING[3790] chan_sip.c: Retransmission timeout reached on transmission [email protected] for seqno 101 (Critical Response) – See https://wiki.asterisk.org/wiki/display/AST/SIP+Retransmissions
Packet timed out after 6400ms with no response
[2012-08-09 20:54:10] WARNING[3790] chan_sip.c: Hanging up call [email protected] - no reply to our critical packet (see https://wiki.asterisk.org/wiki/display/AST/SIP+Retransmissions).
[2012-08-09 20:54:10] VERBOSE[14540] pbx.c: – Executing [[email protected]:1] Macro(“SIP/Falcon3-0000006f”, “hangupcall”) in new stack
[2012-08-09 20:54:10] VERBOSE[14540] pbx.c: – Executing [[email protected]:1] GotoIf(“SIP/Falcon3-0000006f”, “1?theend”) in new stack
[2012-08-09 20:54:10] VERBOSE[14540] pbx.c: – Goto (macro-hangupcall,s,3)
[2012-08-09 20:54:10] VERBOSE[14540] pbx.c: – Executing [[email protected]:3] ExecIf(“SIP/Falcon3-0000006f”, “0?Set(CDR(recordingfile)=)”) in new stack
[2012-08-09 20:54:10] VERBOSE[14540] pbx.c: – Executing [[email protected]:4] Hangup(“SIP/Falcon3-0000006f”, “”) in new stack
[2012-08-09 20:54:10] VERBOSE[14540] app_macro.c: == Spawn extension (macro-hangupcall, s, 4) exited non-zero on ‘SIP/Falcon3-0000006f’ in macro ‘hangupcall’
[2012-08-09 20:54:10] VERBOSE[14540] features.c: == Spawn extension (macro-dial, h, 1) exited non-zero on ‘SIP/Falcon3-0000006f’
[2012-08-09 20:54:10] VERBOSE[14540] app_macro.c: == Spawn extension (macro-dial, s, 7) exited non-zero on ‘SIP/Falcon3-0000006f’ in macro ‘dial’
[2012-08-09 20:54:10] VERBOSE[14540] pbx.c: == Spawn extension (ext-group, 810, 14) exited non-zero on ‘SIP/Falcon3-0000006f’
[2012-08-09 20:54:10] VERBOSE[14543] pbx.c: – Executing [[email protected]:1] Macro(“Local/[email protected];2”, “hangupcall,”) in new stack
[2012-08-09 20:54:10] VERBOSE[14543] pbx.c: – Executing [[email protected]:1] GotoIf(“Local/[email protected];2”, “1?theend”) in new stack
[2012-08-09 20:54:10] VERBOSE[14541] app_mixmonitor.c: == End MixMonitor Recording SIP/Falcon3-0000006f
[2012-08-09 20:54:10] VERBOSE[14543] pbx.c: – Goto (macro-hangupcall,s,3)
[2012-08-09 20:54:10] VERBOSE[14543] pbx.c: – Executing [[email protected]:3] ExecIf(“Local/[email protected];2”, “0?Set(CDR(recordingfile)=)”) in new stack
[2012-08-09 20:54:10] VERBOSE[14543] pbx.c: – Executing [[email protected]:4] Hangup(“Local/[email protected];2”, “”) in new stack
[2012-08-09 20:54:10] VERBOSE[14543] app_macro.c: == Spawn extension (macro-hangupcall, s, 4) exited non-zero on ‘Local/[email protected];2’ in macro ‘hangupcall’
[2012-08-09 20:54:10] VERBOSE[14543] features.c: == Spawn extension (macro-dialout-trunk, h, 1) exited non-zero on ‘Local/[email protected];2’
[2012-08-09 20:54:10] VERBOSE[14543] app_macro.c: == Spawn extension (macro-dialout-trunk, s, 22) exited non-zero on ‘Local/[email protected];2’ in macro ‘dialout-trunk’
[2012-08-09 20:54:10] VERBOSE[14543] pbx.c: == Spawn extension (from-internal, 7193390500, 6) exited non-zero on ‘Local/[email protected];2’

Additionally, the call is disconnected when a time condition or ring group calls a custom extension that calls an external number (“Local/[email protected]”). I can manually dial the extension from a phone and successfully connect, but if the extension is the destination of another function, it disconnects. The log files are almost identical.

Help!

Thanks,
Russ

I hope there is someone out there who may be able to help with this issue. I have a completely ready to go system, but I cant deploy it until I get this resolved.

Is this a possible bug? It seems like this is a software problem since I can dial the outbound extension from a phone but when I have a ringgroup dial it, the call gets disconnected.

I see timeouts on a SIP transaction. Are you sure it’s not a network issue?

I can guarantee you that you can have outside destinations in ring groups and follow me with the current Asterisk.

I see that same warnings. I dont know what to do with that info. They go through perfectly if I dial the extension but it fails if I dial a ringgroup that dial the extension. I tried it with my iptables service stopped. I just dont known enough about the software back end to know what the difference is between the two routes. I read the linked article, but it seems to reference a network problem - like you said. I dont know how it could be a network problem if it works when I direct dial it.

I assume that it also works if you dial the ring group number internally (from an on net extension)?

If that assumption is correct when the call comes in from the trunk Asterisk is “looping” the media back through itself and out to the same carrier, perhaps even the same proxy.

Make sure you have direct media and reinvite behavior set to never. Then test and do an RTP debug. Make sure you have two symetric RTP sessions that originate and terminate from reachable IP’s. If any of those IP’s are NAT’d make sure the localnet and externip settings are correct.

Your assumption is correct. All works perfectly dialing any internal extension.

Im going to do some googling tonight and see if I can find a problem based on your last paragraph. Thanks! If I solve it, Ill post a solution.

This might be something the developers look at. This was pretty much a stock deployment.

I don’t understand this statement. What do you want the developers to look at? What developers are you talking about?

Asterisk default behavior is to allow reinvite. If you need to anchor media you need to change that behavior. That is why the FreePBX developers exposed those setting in the SIP Settings module so you can configure for your deployment.

SIP over NAT is not tolerant of incorrect configuration or network equipment that alters SIP communications.

There is no way to have a default configuration that will make everyone happy.

in sip_general_additional.conf which is referenced by sip.conf

canreinvite=no
nat=yes
externip is set to my static ip address
localnet=192.168.1.0/255.255.255.0

That looks correct, these settings are written out by SIP settings module.

You need to do the 'RTP debug" command from Asterisk CLI. You will see a from and to address. When the call drops take note of where the media session was going. That’s the address Asterisk is loosing touch with.

This is not an easy problem to fix. It seems to be different providers are more tolerant of the scenario that others. I had one client I could never get it to work with his provider. We ended up using an IAX trunk.

IAX is immune from these issues because the media is in the same socket as the signalling.

I guess I just assumed that my configuration is pretty standard and probably the most common. Subnet with the server/phones behind with port forwarding. If that is not correct, then I stand corrected.

The above referenced settings are after I set them in the gui.

I have a did routing directly to an extension that calls out. The extension works from internal phones. The behavior is still not correct but a little different. Before it would hang up in about 1 second. Now it rings once, goes silent, and hangs up in about 6-7 seconds.

I tried the asterisk command “rtp debug” in the asterisk cli and it gave me the following “No such command ‘rtp debug’ (type ‘core show help rtp debug’ for other possible commands)” Im working on getting a error log of the issue.

I have been using this provider for a while now. I have an old box that I have hooked up during the day. Its behavior is the same as this new box. I have no problem calling in and routing out.

The old box was hacked so I wanted to update the asterisk/freepbx, but it is proving more difficult than I anticipated.

the cli command was “rtp set debug on”

with debug on, I assume the data of interest is on in the reports>asterisk logfiles (which I assume is /var/log/asterisk/full

Sorry, old memory.

The output should also scroll across the Asterisk CLI. Usually you can spot the anomaly in real time, as it is easier to correlate with your ears and your eyes.

Anyway yes, the data is also in the /var/log/asterisk/full.

I would not use the FreePBX log facility. It is way too cumbersome for this type of real time work.

You can tail the file from the shell (tail -f /var/log/asterisk/full) or pull it up in your favorite editor (IE: vi /var/log/asterisk/full)

You also want to disable dial plan and debug logging to eliminate the clutter.

core set verbose 0

and

core set debug 0

You also may find that this is an issue with your router.

I think I may have found the problem. The following was in the file:

[2012-08-12 23:41:57] VERBOSE[13997] netsock2.c: == Using SIP RTP TOS bits 184
[2012-08-12 23:41:57] VERBOSE[13997] netsock2.c: == Using SIP RTP CoS mark 5
[2012-08-12 23:41:57] VERBOSE[13997] app_dial.c: – Called SIP/Falcon1/3035737469
[2012-08-12 23:42:03] VERBOSE[13997] app_dial.c: – SIP/Falcon1-00000015 answered Local/[email protected];2
[2012-08-12 23:42:03] VERBOSE[13995] app_dial.c: – Local/[email protected];1 answered SIP/Falcon1-00000014
[2012-08-12 23:42:34] NOTICE[3787] chan_sip.c: Disconnecting call ‘SIP/Falcon1-00000015’ for lack of RTP activity in 31 seconds
[2012-08-12 23:42:34] NOTICE[3787] chan_sip.c: Disconnecting call ‘SIP/Falcon1-00000014’ for lack of RTP activity in 31 seconds

After reading a little but about the error, it dawned on me the when we locked down this server after the previous hack, we may have locked down ports 10000-20000. -crossing my fingers that it is that simple.

Why does everyone think it is a problem with the software? I go through this many times in a week. You suggested bringing it up with developers. I asked you to review your network config in my first post. We have to go through all this pain until you are willing to even consider that it’s your issue.

Asterisk is deployed at millions of sites around the world. If there was a fundamental bug in the SIP stack one would think it would have surfaced long ago.

When it comes to technology that folks don’t completely understand it seems as if we suffer from “terminal uniqueness” where I have found that someone long before me has had the same problem and taken the time to document how they fixed it. We must remain open minded to avail ourselves of this wisdom.

You are correct.

I actually discovered that the problem was not my firewall. One of the following fixed the problem (or a combination of them):

canreinvite=no
nat=yes
externip is set to my static ip address
localnet=192.168.1.0/255.255.255.0

When I switched over the servers this evening I forgot to change the port forwarding. After I made these changes, the problem symptom changed. Then it was a forwarding problem, and the root problem was resolved.

Thanks again for your help!!!

I’m also experiencing disconnects on outgoing calls on my new FreePbx install. My Granstream GX2000 when configured to register directly with my VoIP provider does not do this, but when it registered as an extension to my FreePBX it does.

I have NAT enabled in SIP Settings and reinvite=no in Trunk Peer settings.
Can anyone advise on what I can do to stop this happening please?

NOTICE[1734]: chan_sip.c:26488 check_rtp_timeout: Disconnecting call ‘SIP/100-0000005b’ for lack of RTP activity in 31 seconds

looks like you are experiencing " for lack of RTP activity in 31 seconds", look into setting :-

rtpholdtimeout

and/or

rtptimeout

Thanks for this. I have been playing around with these setting but it still happens.

On a call today, it suddenly terminated for no apparent reason leaving a buzzing sound in the handset. CLI reported the following:

– Executing [[email protected]:1] Macro(“SIP/100-00000014”, “hangupcall,”) in new stack
– Executing [[email protected]:1] GotoIf(“SIP/100-00000014”, “1?theend”) in new stack
– Goto (macro-hangupcall,s,3)
– Executing [[email protected]:3] ExecIf(“SIP/100-00000014”, “0?Set(CDR(recordingfile)=)”) in new stack
– Executing [[email protected]:4] Hangup(“SIP/100-00000014”, “”) in new stack
== Spawn extension (macro-hangupcall, s, 4) exited non-zero on ‘SIP/100-00000014’ in macro ‘hangupcall’
== Spawn extension (macro-dialout-trunk, h, 1) exited non-zero on ‘SIP/100-00000014’
== Spawn extension (macro-dialout-trunk, s, 22) exited non-zero on ‘SIP/100-00000014’ in macro ‘dialout-trunk’
== Spawn extension (from-internal, 08003761090, 6) exited non-zero on ‘SIP/100-00000014’

Can anyone advise what happened here?

Broken hardware? SIP can’t “buzz” :wink: