I’m encountering a tricky problem with a newly installed client, where in both outbound and inbound calls, sometimes the audio starts after 15-20 seconds.
It doesn’t happen on every call, but it happens often enough to cause some inconvenience.
The trunk and endpoints use PJSIP, and the server is on a VPS at Vultr.
After the client reported the situation to us, I tested it on our network and on alternative networks and realized that this only happens with endpoints registered on their network, which has a Sonicwall Firewall.
Comparing sngrep captures of a call that worked well and one that showed this problem, apart from the time it took for the RTP packets to start flowing, one thing that caught my attention was a change in the port the RTP was being sent to: at first it was sent to port 6000 on the endpoint, and after 20 seconds of silence it switched to the correct port and the audio started flowing in both directions.
6000 will be what the SDP (which you didn’t provide) is telling it to use. The change will be the result of receiving incoming traffic from a different port, combined with having symmetric RTP enabled (the latter a good thing, in this case).
This would suggest that NAT is rewriting the port number.
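As a rough illustration of where that initial port comes from, here is a minimal Python sketch that pulls the advertised audio address and port out of an SDP body. The SDP text is hypothetical (the real one was not posted), and `advertised_audio_target` is just a helper name made up for this example, not anything from Asterisk.

```python
# Hypothetical SDP body; the address and port are made up for illustration.
SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 6000 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
"""


def advertised_audio_target(sdp):
    """Return (address, port) that the SDP asks the far end to send audio to."""
    session_addr = None   # c= line before any m= line
    audio_addr = None     # c= line inside the audio section, if present
    audio_port = None
    section = "session"
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            fields = line[2:].split()
            if fields[0] == "audio" and audio_port is None:
                section = "audio"
                # The port field may carry a "/count" suffix.
                audio_port = int(fields[1].split("/")[0])
            else:
                section = "other"
        elif line.startswith("c="):
            addr = line[2:].split()[2]   # "IN IP4 <address>"
            if section == "session":
                session_addr = addr
            elif section == "audio":
                audio_addr = addr
    if audio_port is None:
        return None
    # A media-level c= line overrides the session-level one.
    return (audio_addr or session_addr, audio_port)


target = advertised_audio_target(SAMPLE_SDP)
print(target)
```

If the port seen in the actual RTP packets differs from the one in the `m=audio` line, something between the endpoint and the server has rewritten it.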
That doesn’t explain the late start of the media in the opposite direction, only that the port changes when that starts. It is quite possible that you don’t generate any media in that time. There is enough information to determine whether that might be the case.
Yes, it shows they have told Asterisk to send to port 10000, when they are actually sending from a different one (and presumably receiving on it). Asterisk cannot correct the destination port until it receives something from them.
A router is mangling the port number. Whilst Asterisk can recover from that, it can only do so if it receives traffic in the other direction.
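The recovery behaviour described here can be sketched in a few lines of Python. This is only a toy model of symmetric RTP (the `rtp_symmetric` endpoint option in PJSIP), not Asterisk's actual code; the class name, addresses and ports are all hypothetical.

```python
class RtpSession:
    """Toy model of one RTP stream as seen from the server side."""

    def __init__(self, sdp_target, symmetric=True):
        # Until media arrives, the only known destination is the SDP one.
        self.send_to = sdp_target
        self.symmetric = symmetric

    def on_packet_received(self, source):
        # With symmetric RTP, reply to wherever the media really came from.
        if self.symmetric and source != self.send_to:
            self.send_to = source


# The SDP advertised port 6000, but NAT mapped the stream to another port.
session = RtpSession(("203.0.113.5", 6000))
before = session.send_to          # outgoing audio goes here and is lost

session.on_packet_received(("203.0.113.5", 23456))
after = session.send_to           # corrected only once a packet arrives

print(before, after)
```

In this model the server keeps sending to the wrong port for as long as nothing arrives from the endpoint, which matches the silence at the start of the affected calls. The open question is why the endpoint's own media takes that long to show up.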