SSL_ERROR_SSL TLS/SIPS SRTP Endpoints periodically disconnect

Running the most recent updates, Asterisk 18.15.1. Trying to get a solid SSL/TLS setup going. I’m using Sangoma Connect as well for some of these. Endpoints are a mix of Polycom and Grandstream phones.

All my remote phones connect and register, but after several minutes (seems random, anywhere from 2 to 30 minutes or so), an SSL_ERROR_SSL (Read) error marked as ‘sslv3 alert certificate expired’ shows up. Immediately after, the AOR is ‘shutdown’ and deleted. The phones don’t know; they stay green and look happy, but attempts to call them fail since the server doesn’t have them on the AOR list. Each extension re-registers after a while, making it available to receive calls again, and then gets disconnected again. My UDP-connected peers don’t experience anything like this. Some lines from my FreePBX ‘full’ log are included further down.

The PBX is using the Let’s Encrypt functionality built into the Certificate Manager in FreePBX.

What is causing these errors to pop up? Is it the PBX server? Is it telling me about a phone certificate or its own? If it’s just a ‘verbose’ entry and not an error, why am I seeing my AOR get shut down? So confused.

Any ideas?

[2023-01-11 08:54:01] VERBOSE[14229] res_pjsip_registrar.c: Removed contact 'sips:[email protected][sangoma_talk_ip1]:28003;transport=TLS;rinstance=0D6D3F82;x-ast-orig-host=[internal_ip1]:28003' from AOR '804' due to shutdown
[2023-01-11 08:54:01] VERBOSE[12642] res_pjsip/pjsip_options.c: Contact 804/sips:[email protected][sangoma_talk_ip1]:28003;transport=TLS;rinstance=0D6D3F82;x-ast-orig-host=[internal_ip1]:28003 has been deleted
[2023-01-11 08:54:01] VERBOSE[12642] res_pjsip/pjsip_configuration.c: Endpoint 804 is now Unreachable
[2023-01-11 08:54:01] VERBOSE[12642] res_pjsip_registrar.c: Attempted to remove non-existent contact 'sips:[email protected][sangoma_talk_ip1]:25381;transport=TLS;rinstance=0D6D3F82;x-ast-orig-host=[internal_ip1]:28003' from AOR '804' by request
[2023-01-11 08:54:01] VERBOSE[14229] res_pjsip_registrar.c: Added contact 'sips:[email protected][sangoma_talk_ip1]:25381;transport=TLS;rinstance=0D6D3F82;x-ast-orig-host=[internal_ip1]:25381' to AOR '804' with expiration of 600 seconds
[2023-01-11 08:54:01] VERBOSE[2552] res_pjsip/pjsip_configuration.c: Endpoint 804 is now Reachable
[2023-01-11 08:54:01] VERBOSE[2552] res_pjsip/pjsip_options.c: Contact 804/sips:[email protected][sangoma_talk_ip1]:25381;transport=TLS;rinstance=0D6D3F82;x-ast-orig-host=[internal_ip1]:25381 is now Reachable.  RTT: 39.673 msec
[2023-01-11 08:58:11] WARNING[26965] pjproject:                            SSL SSL_ERROR_SSL (Read): Level: 0 err: <336151573> <SSL routines-ssl3_read_bytes-sslv3 alert certificate expired> len: 65535 peer: 111.111.111.111:57432
[2023-01-11 08:59:46] WARNING[26965] pjproject:                            SSL SSL_ERROR_SSL (Read): Level: 0 err: <336151576> <SSL routines-ssl3_read_bytes-tlsv1 alert unknown ca> len: 65535 peer: [sangoma_talk_ip2]:29939
[2023-01-11 08:59:46] VERBOSE[2552] res_pjsip_registrar.c: Removed contact 'sips:[email protected][sangoma_talk_ip2]:29939;transport=TLS;rinstance=E72BE005;x-ast-orig-host=[internal_ip2]:29939' from AOR '803' due to shutdown
[2023-01-11 08:59:46] VERBOSE[14229] res_pjsip/pjsip_options.c: Contact 803/sips:[email protected][sangoma_talk_ip2]:29939;transport=TLS;rinstance=E72BE005;x-ast-orig-host=[internal_ip2]:29939 has been deleted
[2023-01-11 08:59:46] VERBOSE[14229] res_pjsip/pjsip_configuration.c: Endpoint 803 is now Unreachable
[2023-01-11 08:59:46] VERBOSE[14229] res_pjsip_registrar.c: Attempted to remove non-existent contact 'sips:[email protected][sangoma_talk_ip2]:37493;transport=TLS;rinstance=E72BE005;x-ast-orig-host=[internal_ip2]:29939' from AOR '803' by request
[2023-01-11 08:59:46] VERBOSE[2552] res_pjsip_registrar.c: Added contact 'sips:[email protected][sangoma_talk_ip2]:37493;transport=TLS;rinstance=E72BE005;x-ast-orig-host=[internal_ip2]:37493' to AOR '803' with expiration of 600 seconds
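
The numeric codes in the two pjproject warnings above can be unpacked to see exactly what OpenSSL is reporting. Here is a minimal Python sketch. It assumes the OpenSSL 1.0.x/1.1.x error layout (8-bit library, 12-bit function, 12-bit reason) and that reasons of 1000 and above are a received TLS alert number plus 1000. The function name `decode_openssl_error` is made up for illustration, and the alert table only lists a few certificate-related alerts from RFC 5246.

```python
# Assumption: OpenSSL 1.0.x/1.1.x error packing (lib: 8 bits, func: 12 bits, reason: 12 bits).
SSL_AD_REASON_OFFSET = 1000  # OpenSSL adds this to the number of an alert it received

# A few TLS alert descriptions from RFC 5246, section 7.2 (not a complete list).
TLS_ALERTS = {
    42: "bad_certificate",
    45: "certificate_expired",
    46: "certificate_unknown",
    48: "unknown_ca",
}


def decode_openssl_error(code: int) -> dict:
    """Split a packed OpenSSL error code into its lib/func/reason fields."""
    reason = code & 0xFFF
    alert = None
    if reason >= SSL_AD_REASON_OFFSET:
        # Look up the alert name; fall back to None for alerts not in the table.
        alert = TLS_ALERTS.get(reason - SSL_AD_REASON_OFFSET)
    return {
        "lib": (code >> 24) & 0xFF,
        "func": (code >> 12) & 0xFFF,
        "reason": reason,
        "alert": alert,
    }


# The two codes from the warnings in the log.
for code in (336151573, 336151576):
    info = decode_openssl_error(code)
    print(code, info["reason"], info["alert"])
```

Under those assumptions the two codes come out as alert 45 (certificate_expired) and alert 48 (unknown_ca). Since the error is raised on a read, these look like alerts sent by the remote side, which would suggest the peer is objecting to the certificate the PBX presented rather than the PBX objecting to the phone's certificate. Treat that as a reading of the codes, not a confirmed diagnosis.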

edit: corrected asterisk version. Added additional PBX server information and questions about direction.

The most recent Asterisk 18 is 18.15.1. Certified versions are only intended for people with Asterisk support contracts.

A soft error for certificate expired is very strange. Are the phones actually configured with private keys and certificates? I’m wondering if they are generating temporary, time-limited certificates for the key exchange (and you are not validating the signer). The only other thing I can think of is that something is messing with the time on your machine.

Thanks for that info about certified versions. When I went to swap to asterisk 18.15.1, non-cert, looks like I was already on it, so I must have switched while troubleshooting at some point and not switched back. No change with my problem, but good to know.

The phones authenticate with user/secret, so they’re not using certificate-based auth. One of these is a Grandstream cert; it doesn’t expire for several years, but there isn’t a CA (self-signed). AFAIK, that’s the certificate attached to the web/SIPS interface on the phone that it uses to bind and communicate. So, not sure why it would say expired, but ‘unknown ca’ seems more likely. Either way, the error is “verbose” and I’ve seen people talk about the messages being “ignorable” in forums. I’m not sure what to look at. (phone cert says: Not After Mon, 21 Dec 2026 07:37:40 GMT)
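
As a quick sanity check on the dates, the phone certificate's "Not After" value can be compared against the time of the "certificate expired" warning. This is only a sketch: the helper names `parse_not_after` and `is_expired` are made up for illustration, and it assumes the log timestamps are UTC (the margin here is years, so the time zone does not change the answer).

```python
from datetime import datetime, timezone


def parse_not_after(text: str) -> datetime:
    """Parse a 'Mon, 21 Dec 2026 07:37:40 GMT' style date as an aware UTC datetime."""
    parsed = datetime.strptime(text, "%a, %d %b %Y %H:%M:%S GMT")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(not_after: str, at: datetime) -> bool:
    """Return True if the certificate's Not After date has passed at time `at`."""
    return at >= parse_not_after(not_after)


# Not After value quoted from the phone certificate.
phone_not_after = "Mon, 21 Dec 2026 07:37:40 GMT"
# Time of the 'certificate expired' warning in the log (assumed to be UTC).
warning_time = datetime(2023, 1, 11, 8, 58, 11, tzinfo=timezone.utc)

print(is_expired(phone_not_after, warning_time))
```

The phone certificate was still valid when the warning was logged, so the "expired" alert is probably about some other certificate in the exchange.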

Time on the PBX and on the phones is all NTP. I assume the issue is on the server, since it doesn’t seem to matter which extension is connecting (it affects all TLS peer connections), but server time is right on.

I’m not sure where to focus my efforts. Does anybody understand these messages a little better than me that can tell me where to look? Maybe I don’t have enough info? I’ve also been grabbing the syslog off my phone; it doesn’t seem to be giving me much. I have it only showing warnings/errors right now. This is the only error from my phone:

2023-01-12,09:33:20,2023-01-12,09:33:20,10.1.1.55,16,05,GXP2170_PHONE: USER.ERROR   [00:0b:82:a9:bf:11][1.0.11.64][2910032992] append:SSLKeyLogFile.cc(43)->Can not open /var/user/pcap/sslkeylogfile.txt

I was pointed to a forum post in which a gentleman was having similar trouble, and it turned out to be a bad endpoint. One old, troublesome endpoint was causing all the endpoints to periodically disconnect at seemingly random intervals. I’ve started to remove any physical phones and switch completely to Sangoma Connect.

I got this all down to Sangoma Connect clients, and I’m still getting the same behavior: I continue to get entries like the ones in the log above. Nothing has changed.

<edit: found some old information I removed from December>

There don’t seem to be any errors that correlate with this; just endpoints disconnecting and reconnecting, and a few SSL warnings that don’t always line up with the time of the disconnections. Calls establish and work just fine and don’t get cut off. It’s just that occasionally an endpoint will disconnect and then reconnect somewhere between 1 and 10 minutes later.
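
One way to check whether the SSL warnings line up with the disconnections is to parse the log and count how many warnings occur shortly before each contact removal. The following is a rough Python sketch, not a finished tool. The names `parse_events` and `correlate` and the 5-second window are arbitrary choices for illustration, and the regular expression assumes the line format shown in the log excerpt earlier in this thread.

```python
import re
from datetime import datetime, timedelta

# Matches lines like: [2023-01-11 08:54:01] VERBOSE[14229] res_pjsip_registrar.c: ...
# An optional call-id bracket after the thread id is tolerated.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+)\[\d+\](?:\[[^\]]*\])? (.*)$"
)


def parse_events(lines):
    """Return (timestamp, kind) tuples for SSL warnings and shutdown removals."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        body = match.group(3)
        if "SSL_ERROR_SSL" in body:
            events.append((stamp, "ssl"))
        elif "Removed contact" in body and "due to shutdown" in body:
            events.append((stamp, "removed"))
    return events


def correlate(events, window=timedelta(seconds=5)):
    """For each removal, count the SSL warnings logged within `window` before it."""
    ssl_times = [stamp for stamp, kind in events if kind == "ssl"]
    result = []
    for stamp, kind in events:
        if kind != "removed":
            continue
        nearby = [s for s in ssl_times if timedelta(0) <= stamp - s <= window]
        result.append((stamp, len(nearby)))
    return result
```

To use it, read the lines of the 'full' log (on a default FreePBX install this is usually /var/log/asterisk/full, but that path is an assumption here), pass them to `parse_events`, and pass the result to `correlate`. A removal with a count of 0 had no SSL warning in the preceding window, which would point to a cause other than the TLS alerts.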

I would try Asterisk 18.13.0 (or prior) to see if that changes anything. They added TLS wildcards after that, which gave me fits. I now patch it (just in one spot) for my systems.