Just guessing here: perhaps Flowroute deprioritizes OPTIONS and sometimes takes longer than 3 seconds to respond. I don’t have this issue with them but had it with another provider.
Try putting this in (or adding it to) /etc/asterisk/pjsip.aor_custom_post.conf:
[FlowRoute](+type=aor)
qualify_timeout=16.0
Restart Asterisk and see whether this eliminates or at least greatly reduces the occurrences.
If not, I’m wondering if pjsip is dumb about picking from the many servers that us-east-va.sip.flowroute.com resolves to. If it hits one that’s down, it might keep beating on the same one with successive OPTIONS retries, instead of trying another. Given that your log segment is in the wee morning hours on a Sunday, this might just be their routine maintenance. A current nslookup shows:
> _sip._udp.us-east-va.sip.flowroute.com
Server: dns.google
Address: 8.8.8.8
Non-authoritative answer:
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 30
weight = 50
port = 5060
svr hostname = ep-us-west-or-02.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 30
weight = 50
port = 5060
svr hostname = ep-us-west-or-01.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 20
weight = 50
port = 5060
svr hostname = ep-us-east-nj-01.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 10
weight = 50
port = 5060
svr hostname = ep-us-east-va-01.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 10
weight = 50
port = 5060
svr hostname = ep-us-east-va-02.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 20
weight = 50
port = 5060
svr hostname = ep-us-east-nj-04.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 20
weight = 50
port = 5060
svr hostname = ep-us-east-nj-02.flowroute.com
_sip._udp.us-east-va.sip.flowroute.com SRV service location:
priority = 20
weight = 50
port = 5060
svr hostname = ep-us-east-nj-03.flowroute.com
>
Turning on pjsip logger (or capturing with tcpdump, etc.) would show whether this is the case.
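For reference, here is a rough Python sketch of the order in which an RFC 2782 style client would try the SRV targets listed above: lowest priority number first, with a weighted random pick inside each priority group. The record data is copied from the nslookup output; the function name srv_order is made up for illustration, and whether pjsip actually walks the list this way is exactly the open question, so treat this as the expected behaviour to compare a capture against, not as a description of what Asterisk does.

```python
import random
from collections import defaultdict

# SRV records copied from the nslookup output: (priority, weight, port, target)
SRV_RECORDS = [
    (30, 50, 5060, "ep-us-west-or-02.flowroute.com"),
    (30, 50, 5060, "ep-us-west-or-01.flowroute.com"),
    (20, 50, 5060, "ep-us-east-nj-01.flowroute.com"),
    (10, 50, 5060, "ep-us-east-va-01.flowroute.com"),
    (10, 50, 5060, "ep-us-east-va-02.flowroute.com"),
    (20, 50, 5060, "ep-us-east-nj-04.flowroute.com"),
    (20, 50, 5060, "ep-us-east-nj-02.flowroute.com"),
    (20, 50, 5060, "ep-us-east-nj-03.flowroute.com"),
]


def srv_order(records, rng=random):
    """Return (priority, target, port) tuples in an RFC 2782 style try order.

    Hypothetical helper: lower priority values are tried first; within one
    priority the next target is drawn at random, proportionally to weight.
    """
    by_priority = defaultdict(list)
    for priority, weight, port, target in records:
        by_priority[priority].append((weight, port, target))

    ordered = []
    for priority in sorted(by_priority):
        pool = list(by_priority[priority])
        while pool:
            weights = [weight for weight, _, _ in pool]
            if sum(weights) == 0:
                # All weights zero: fall back to a uniform pick.
                index = rng.randrange(len(pool))
            else:
                index = rng.choices(range(len(pool)), weights=weights)[0]
            _, port, target = pool.pop(index)
            ordered.append((priority, target, port))
    return ordered


if __name__ == "__main__":
    for priority, target, port in srv_order(SRV_RECORDS):
        print(priority, target, port)
```

If a capture shows OPTIONS going to the same address over and over while it is not answering, and never moving on to the other priority 10 host or down to the priority 20 group, that would support the theory.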
Fwiw, I didn’t mean to imply that this was time specific. Those log lines are still being added, at xx:xx:39 and xx:xx:42 on the dot, as it happens (63 and 57 seconds apart; indeed, qualify_frequency is 60 seconds, as is the general retry).
The 3 second difference between 42 and 39 is the default value of qualify_timeout. If the config change works as intended but doesn’t solve the problem, you should now see a 16 second difference.
At the Asterisk command prompt, you can issue pjsip show aor FlowRoute
and it should show the qualify_timeout value.
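To check the timing without eyeballing the log, a small Python sketch like the one below can compute the gaps between consecutive log timestamps. The sample timestamps are invented to match the :39 / :42 pattern described above (the hour and minute values are placeholders), the helper names are hypothetical, and it assumes the lines alternate around a 60 second qualify_frequency, so the deviation from 60 is the timeout in effect.

```python
from datetime import datetime

QUALIFY_FREQUENCY = 60  # seconds, as reported in the thread


def gaps_in_seconds(timestamps):
    """Return the gaps between consecutive HH:MM:SS timestamps, in seconds.

    Assumes all timestamps fall on the same day (no midnight wrap handling).
    """
    parsed = [datetime.strptime(stamp, "%H:%M:%S") for stamp in timestamps]
    return [
        int((later - earlier).total_seconds())
        for earlier, later in zip(parsed, parsed[1:])
    ]


def implied_timeout(timestamps, frequency=QUALIFY_FREQUENCY):
    """Return the largest deviation of any gap from the qualify frequency."""
    return max(abs(gap - frequency) for gap in gaps_in_seconds(timestamps))


if __name__ == "__main__":
    # Placeholder timestamps following the :39 / :42 pattern from the log.
    sample = ["03:10:39", "03:11:42", "03:12:39", "03:13:42"]
    print(gaps_in_seconds(sample))   # [63, 57, 63]
    print(implied_timeout(sample))   # 3
```

With qualify_timeout=16.0 in effect and the problem still present, the same calculation should report gaps of roughly 76 and 44 seconds, i.e. an implied timeout of 16.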
Yeah, I think you can ignore this whole thread. I had turned the responsive firewall back “all the way” on, and had a typo on the permit line, I think. For reference, I had the firewall on since setup, but I was debugging a bunch of clients, had run into issues, and dropped the pjsip service back to the Internet zone. Fixed that right about the same time this cropped up.
Gonna let it do its thing overnight and see if anything persists.
Thanks for your time.
EDIT - some lies. maybe the match field is still wrong:
Unless there is also a bug in the firewall logic, this doesn’t make sense. The OPTIONS request is an outbound packet and AFAIK the firewall blocks nothing going out. The response should appear as an established/related packet and should be accepted regardless of firewall settings.
AFAIK match/permit is only used for routing incoming INVITEs; responses are routed based on Call-ID, and other SIP tags if needed.
Update: that change seems to have fixed the problem. No log entries since my last post, connection seems stable.
That said, @Stewart1, I totally agree - it should be an outgoing connection, and have nothing to do with what I changed assuming I am thinking about it right. But, it works, and I’ll not complain over-much.