Let's Encrypt Certificate renewals failing


(Jon Frey) #1

Has anyone else noticed problems recently with Let’s Encrypt certificates automatically renewing?
I’ve been getting the following error on just about all of my servers:

Some Certificates are expiring or have expired:

There was an error updating certificate “exampledomain.com”: Verification timed out

When I go to Certificate Management and try to update the certificate, it times out. I then have to disable the firewall and go back to Certificate Management, and only then does the renewal work. I haven’t changed any firewall settings, and this is consistent across the 10-15 servers that I manage.

I’m curious whether a recent update broke something, and I’d like to know if anyone else is experiencing this. If so, we should notify Sangoma so it can be fixed.


#2

There is a ‘search’ icon at the top of this page, I suggest you use it to see the dozens of posts about this.


(Jon Frey) #3

Thank you. I did not see any relevant threads since this appears to be something that happened from a recent update.


#4

No, it was revealed after letsencrypt’s repeated warnings to not have acme clients ‘do that’ went unheeded for five years.


(Jon Frey) #5

Thank you.


#6

I posted this solution in another thread, but didn’t get any feedback. I’ve tested this on a few systems now with positive results…

I’m pretty green when it comes to iptables, but I messed around with it long enough to get a working proof of concept. I’m hoping someone can tell me if this is a bad idea and/or how it could be improved.

Start by enabling “Custom Firewall Rules” via Connectivity > Firewall > Advanced > Advanced Settings > Custom Firewall Rules: Enabled

edit /etc/firewall-4.rules and insert

#Lets Encrypt
#Create lefilter chain
-N lefilter

#Remove FreePBX rule that allows all established states through
-D fpbxfirewall -p tcp -m state --state RELATED,ESTABLISHED -j ACCEPT

#Insert rule into INPUT chain to pass all port 80 traffic to lefilter
-I INPUT -p tcp -m tcp --dport 80 -j lefilter

#Insert rule back into fpbxfirewall chain to allow all established traffic except for traffic on port 80
-I fpbxfirewall -p tcp ! --dport 80 -m state --state RELATED,ESTABLISHED -j ACCEPT

#Allow new port 80 states to be generated
-A lefilter -m state --state NEW -j ACCEPT

#Filter subsequent traffic to allow access to /.well-known/acme-challenge
-A lefilter -m string --string "GET /.well-known/acme-challenge" --algo kmp -j ACCEPT

#Return back to the INPUT chain for further processing
-A lefilter -j RETURN
#End of Lets Encrypt

Ensure /etc/firewall-4.rules is owned by the ‘root’ user and not writable by any other user. If it’s not, then ‘chown root:root /etc/firewall-4.rules’ and ‘chmod 644 /etc/firewall-4.rules’.
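For anyone repeating this across many systems, the ownership and permission check can be scripted. This is only a minimal sketch: `check_rules_file` is a hypothetical helper name, and it assumes GNU `stat` (as found on CentOS-based systems). It compares numeric user IDs, with 0 meaning root.

```shell
#!/bin/sh
# Sketch (hypothetical helper): succeed only if the file is owned by the
# expected uid (default 0 = root) and is not writable by group or others.
# Assumes GNU stat, which understands -c FORMAT.
check_rules_file() {
    file=$1
    want_uid=${2:-0}
    uid=$(stat -c '%u' "$file") || return 2
    mode=$(stat -c '%a' "$file") || return 2
    # 022 (octal) is the group-write plus other-write mask.
    if [ "$uid" = "$want_uid" ] && [ $(( 0$mode & 022 )) -eq 0 ]; then
        echo "ok: $file uid=$uid mode=$mode"
    else
        echo "fix: $file uid=$uid mode=$mode"
        return 1
    fi
}
```

Usage would be `check_rules_file /etc/firewall-4.rules`, followed by the chown/chmod above if it reports "fix".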

Stop and start the firewall, either in the gui or on the command line. From the command line: ‘fwconsole firewall disable’ then ‘fwconsole firewall start’

After that, with any luck, letsencrypt traffic on port 80 will get through, any other exceptions will continue to get through, and any non-whitelisted traffic to port 80 should be dropped.

I’m hoping this, or something like it, can be added to the official distro/firewall module. My fear now is that the ‘fpbxfirewall’ rule that I’m removing could be renamed, which would prevent this custom rule from removing it and would subsequently allow all port 80 traffic through.


#7

I noticed the same problem with dozens of FreePBX instances. All out of the box, where it all worked for a long time, suddenly no new certificates.


(Jared K Smith) #8

Correct – Let’s Encrypt used to send all of its traffic from one of three known hosts. They recently changed their policy, and now send their traffic from any number of hosts. Hence it’s no longer good enough just to whitelist those three known hosts.


(Lorne Gaetz) #9

Hi @thx2000

I’ve tested your hack and confirm it’s working as expected, very nice. I’m not versed well enough in iptables config to comment on the technical aspects, but I will advance this concept internally as a potential improvement for how firewall deals with this.


#10

Just to be clear, this is NOT a recent “change in policy”

jsha (Let’s Encrypt engineer), Dec '15:

We plan to frequently change the set of IPs from which we validate, and will validate from multiple IPs in the future. Any host answering challenges should have port 80 or 443 available to the Internet.

The 443 bit was removed in 2018.

So either it’s port 80 open to the world for ACME challenges or, if you have control over your nameserver, the better solution of DNS-01 challenges, which doesn’t need any ports opened.
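To make the DNS-01 option a little more concrete, here is a rough sketch of what ends up in DNS, based on the ACME specification (RFC 8555): a TXT record under `_acme-challenge.<domain>` whose value is the unpadded base64url encoding of the SHA-256 digest of the key authorization. The function names are hypothetical, the `openssl` CLI is assumed to be installed, and in practice an ACME client computes all of this for you.

```shell
#!/bin/sh
# Sketch only: these helper names are made up for illustration.

# Name of the TXT record that a DNS-01 challenge is published under.
acme_txt_name() {
    echo "_acme-challenge.$1"
}

# TXT value: base64url (no padding) of SHA-256 over the key authorization
# string that the ACME client derives from the token and the account key.
# Requires the openssl command-line tool.
acme_txt_value() {
    printf '%s' "$1" \
        | openssl dgst -sha256 -binary \
        | openssl base64 \
        | tr '+/' '-_' \
        | tr -d '=\n'
}
```

For example, `acme_txt_name pbx.example.com` prints `_acme-challenge.pbx.example.com`. Validation then happens purely over DNS, so nothing has to be opened on the PBX itself.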


#11

Thanks Lorne!


(Lucas Ryan) #12

@lgaetz, This would be fantastic. Any way to fast track this to get it into the production tracks?


(Jared Busch) #13

@jsmith very much this.

It is only recently that this was enforced. But it has been known for a long time.

This is a failure of the Sangoma team to stay current. This is not any different than the recent thread about fail2ban being so old. Or the entire fiasco as to why php7 is not standard yet.

And of course the core OS is now even getting out of date. RHEL 8 was released almost a year ago (May 7, 2019). Yet we have heard zero about plans to update.

Schmoozecom chose to go their own way and build SNG7 for their own reasons. Honestly, there is not much wrong with that, but it does absolutely require that you dedicate resources to maintaining and moving forward. In theory Sangoma knew this when they bought Schmoozecom.


(Adolfo) #14

Just want to say thanks. This custom firewall ruleset worked for me and was easy enough to follow.
Now I need to sit down and run this on several more systems.


(Lorne Gaetz) #15

Hi @thx2000

Development is looking at your feature suggestion now and found a fatal flaw. I said a month ago:

“I’m not versed well enough in iptables config to comment on the technical aspects”

well that’s changed (a bit anyway) and I now realize that adding static known rules to iptables is not a solution to this problem. The flaw is that I, and malicious users, are free to set the user agent to match the rule string, GET /.well-known/acme-challenge, and when I do, I end up with full access to all resources on port 80, which is definitely not what you want. Test for yourself by doing the following curls from an untrusted host:

curl http://<pbx_ip>
curl http://<pbx_ip>  -A "GET /.well-known/acme-challenge"

The first will time out, the second won’t. I would advise everyone against using these rules as currently written. At this point the only recommended method of using LE is dedicating port 80 to LE validation.
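The reason the second curl gets through can be shown without touching a firewall. The sketch below is a simplified model, not iptables itself: it treats the rule as "accept if the string occurs anywhere in the request", which is how an unanchored string match behaves. The sample requests and function names are made up for illustration.

```shell
#!/bin/sh
# Simplified model of the unanchored rule: accept when the needle occurs
# anywhere in the request text. Illustration only, iptables is not involved.
NEEDLE='GET /.well-known/acme-challenge'

matches_anywhere() {
    case $1 in
        *"$NEEDLE"*) return 0 ;;
        *) return 1 ;;
    esac
}

verdict() {
    if matches_anywhere "$1"; then echo "ACCEPT"; else echo "no match"; fi
}

# Hypothetical requests (CRLF line endings omitted for readability).
plain_request='GET / HTTP/1.1
Host: pbx.example.com
User-Agent: curl'

spoofed_request='GET / HTTP/1.1
Host: pbx.example.com
User-Agent: GET /.well-known/acme-challenge'

echo "plain:   $(verdict "$plain_request")"
echo "spoofed: $(verdict "$spoofed_request")"
```

The spoofed request asks for `/`, not the challenge path, yet it is accepted because the string appears in its User-Agent header.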

The rules are also incomplete. In addition to allowing access to /.well-known/acme-challenge it is also necessary to allow access to /.freepbx-known, at least for cert creation.

edit:

The cause is not hopeless. Static, non-unique iptables rules are out, but you could create a rule unique to your system that allows inbound access from anywhere for requests that contain your FQDN. In that case you can change the ACCEPT rule to:

-A lefilter -m string --string "pbx.example.com" --algo kmp -j ACCEPT


(Jeremy Lizzotte) #16

What I did for this (since I have over 40 PBXs with the same issue) was change the port management setting for HTTP to 8888 and for LetsEncrypt to 80. Then I went into the firewall module, went to Status, and deleted all of the existing blocked connections. You’ll find many IPs that start with 45, which I believe are the Let’s Encrypt IPs, but remove them all, wait a few minutes, then try again. You should be all set after that. I didn’t have to mess with iptables.


#17

Not sure I like the approach, but for the sake of discussion:

That still opens up to anyone that can derive the fqdn. Not sure if it’s much better.

Require the GET request line and the Host header to match, and limit the match to the beginning of the packet:

iptables -A lefilter -m string --hex-string "GET /.well-known/acme-challenge HTTP/1.1|0D0A|Host: pbx.example.com|0D0A|" --algo kmp --to 40 -j ACCEPT

Should only allow access to the specific resource and require a proper fqdn. A header added later (like the above user agent hack) won’t match.

Above is untested - might need to be tweaked to match LE’s exact request syntax.
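The effect of anchoring can be modelled in plain shell, again as an illustration rather than a firewall test. This sketch covers only the "must be at the start" part of the idea, not the Host header check, and the requests and the token `abc123` are made up.

```shell
#!/bin/sh
# Simplified model of an anchored match: the payload must BEGIN with the
# challenge path. Illustration only, iptables is not involved.
starts_with_challenge() {
    case $1 in
        'GET /.well-known/acme-challenge/'*) return 0 ;;
        *) return 1 ;;
    esac
}

anchored_verdict() {
    if starts_with_challenge "$1"; then echo "ACCEPT"; else echo "no match"; fi
}

# Hypothetical requests (CRLF line endings omitted for readability).
legit_request='GET /.well-known/acme-challenge/abc123 HTTP/1.1
Host: pbx.example.com'

spoofed_request='GET / HTTP/1.1
Host: pbx.example.com
User-Agent: GET /.well-known/acme-challenge/'

echo "legit:   $(anchored_verdict "$legit_request")"
echo "spoofed: $(anchored_verdict "$spoofed_request")"
```

Here the spoofed User-Agent no longer helps, because the string is not at the very start of the request.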


(Jared Busch) #18

Why do this the hard way?

FreePBX knows when it is issuing a request to LE.

Simply have it add the rule above prior to executing the request and then remove it again.
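A rough sketch of that add-then-remove idea as a shell wrapper follows. It is not how certman or the firewall module actually does it. For simplicity it inserts a plain "accept TCP port 80" rule rather than the string-match rule, `with_port80_open` is a hypothetical name, and the renewal command shown in the comment is an assumption.

```shell
#!/bin/sh
# Sketch: open port 80 only while a renewal command runs, then close it.
# IPTABLES can be overridden, e.g. with a logging stub for a dry run.
IPTABLES=${IPTABLES:-iptables}

with_port80_open() {
    # Temporary rule at the top of INPUT.
    $IPTABLES -I INPUT -p tcp --dport 80 -j ACCEPT || return 1
    status=0
    "$@" || status=$?
    # Always remove the temporary rule, even if the renewal failed.
    $IPTABLES -D INPUT -p tcp --dport 80 -j ACCEPT
    return $status
}

# Example, as root (the renewal command here is an assumption):
#   with_port80_open fwconsole certificates --updateall
```

The wrapper returns the exit status of the wrapped command, so a failed renewal is still visible to whatever calls it.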


(Lorne Gaetz) #19

Hi Jared:

To be clear, my suggestion above was for individual users who want to test with their own custom firewall rules. Whatever gets done with certman will probably be along the lines you suggest, but obviously there are a lot more moving parts than having firewall define static rules.


#20

Good catch Lorne. What about this…

#Lets Encrypt
#Create lefilter chain
-N lefilter

#Remove FreePBX rule that allows all established states through
-D fpbxfirewall -p tcp -m state --state RELATED,ESTABLISHED -j ACCEPT

#Insert rule into INPUT chain to pass all port 80 traffic to lefilter
-I INPUT -p tcp -m tcp --dport 80 -j lefilter

#Insert rule back into fpbxfirewall chain to allow all established traffic except for traffic on port 80
-I fpbxfirewall -p tcp ! --dport 80 -m state --state RELATED,ESTABLISHED -j ACCEPT

#Allow new port 80 states to be generated
-A lefilter -m state --state NEW -j ACCEPT

#Filter subsequent traffic to allow access to /.well-known/acme-challenge
-A lefilter -m string --from 52 --to 53 --string "GET /.well-known/acme-challenge/" --algo kmp -j ACCEPT

#Return back to the INPUT chain for further processing
-A lefilter -j RETURN
#End of Lets Encrypt

This modification looks for “GET /.well-known/acme-challenge/” at the start of the HTTP request, and only there. So any malformed requests, or packets with that string anywhere else, will be dropped.

My initial interpretation of the --from and --to offsets seems to be a little off. I thought the --to offset would be where the matched string would have to end, but it appears that it will attempt to match the entire string, regardless of where it ends, from that starting point.

From my tests, this appears to solve the user-agent problem you mentioned, and still allows legitimate LE requests through.
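A note on where a value like 52 comes from, as a sketch under one assumption: that the string match counts offsets from the start of the IPv4 header, which the working value suggests. The HTTP payload then begins after the IP header plus the TCP header, and both lengths are stored in 32-bit words. `payload_offset` is a hypothetical helper.

```shell
#!/bin/sh
# Byte offset of the TCP payload, counted from the start of the IPv4 header.
# Both header-length fields are expressed in 32-bit words.
payload_offset() {
    ihl_words=$1    # IPv4 IHL field: 5 when there are no IP options
    doff_words=$2   # TCP data offset: 5 with no options, 8 with timestamps
    echo $(( (ihl_words + doff_words) * 4 ))
}

echo "with TCP timestamps: $(payload_offset 5 8)"
echo "no TCP options:      $(payload_offset 5 5)"
```

With TCP timestamps the payload starts at byte 52, without any TCP options at byte 40. If that assumption holds, a client that does not send TCP timestamps would not match a rule fixed at `--from 52`, which is worth keeping in mind if a renewal fails unexpectedly.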