CPU usage maxed

bksales · August 9, 2023, 5:56pm

I need some help here. I ran through a few different ideas (cxpanel had been causing problems on a few servers, and I checked the intrusion detection firewall sync), and in the process the GUI broke. I see a bunch of these in the firewall.err log; I tried turning off IPv6, to no avail.

ip6tables v1.4.21: invalid port/service `disabled' specified
Try `ip6tables -h' or 'ip6tables --help' for more information.

I found a thread here from a couple of years ago where upgrading the firewall to the edge track fixed it, but that’s old and it didn’t work on this one.

Here’s what all it’s doing now:

I see a lot of stuff getting written to the cron log (maybe 10 lines at the top of every minute), not sure if this is relevant.

Asterisk 18.9
FreePBX 12.7.8-2306-1.sng7

comtech · August 9, 2023, 8:11pm

This doesn’t look like the CPU is maxed?

bksales · August 9, 2023, 11:12pm

When I run top it bounces around in the mid-80% range, but at the hypervisor level it shows 100%.

comtech · August 10, 2023, 4:44pm

top -i

What does mpstat show?

dobrosavljevic · August 10, 2023, 5:36pm

You also didn’t give the specifics of your CPU or how much of it is assigned to the virtual machine. These stats could all be correct for the resources you have assigned.
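One way to compare what the guest sees with what the hypervisor reports is to sample the aggregate cpu line of /proc/stat twice and work out the busy percentage directly. This is only a sketch: cpu_busy_pct is a made-up helper name, it assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal), and it counts idle plus iowait as idle time. The figure is averaged over all vCPUs, so it should be read alongside the nproc count before comparing it with the hypervisor graph.

```shell
#!/usr/bin/env bash
# cpu_busy_pct (hypothetical helper): percentage of non-idle CPU time between
# two samples of the aggregate "cpu" line from /proc/stat.
# Assumed field order after the label: user nice system idle iowait irq softirq steal.
cpu_busy_pct() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, " "); split(b, y, " ")
    # Fields 2..9 only: guest time (fields 10-11) is already counted in user/nice.
    for (i = 2; i <= 9; i++) {
      d = y[i] - x[i]
      total += d
      if (i == 5 || i == 6) idle += d   # idle + iowait
    }
    if (total <= 0) { print 0; exit }
    printf "%d\n", 100 * (total - idle) / total
  }'
}

# Usage on a live system (left commented out so the sketch has no side effects):
#   s1=$(head -n 1 /proc/stat); sleep 5; s2=$(head -n 1 /proc/stat)
#   echo "busy: $(cpu_busy_pct "$s1" "$s2")% across $(nproc) vCPU(s)"
```

If the guest-side number is well below 100% while the hypervisor shows the VM pegged, the steal column in mpstat is worth a look too.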

bksales · August 10, 2023, 6:01pm

top -i initially shows this

and then changes to

mpstat shows

comtech · August 10, 2023, 6:11pm

So it doesn’t stay maxed out according to top, but on the hypervisor it looks like it does. What about mpstat? So maybe the difference lies in the hypervisor, or in how it interacts with the guest?

Are you really on version 12? You might need to update to get more support, but I don’t have anything more.

bksales · August 10, 2023, 6:26pm

I see that the UCP error logs are getting huge. I made the mistake of trying to open one without trimming a bunch of stuff out first, but here is what I’m seeing repeated:

2023-08-09 03:08 -07:00: Error: error:0906D06C:PEM routines:PEM_read_bio:no start line
2023-08-09 03:08 -07:00: at Object.createSecureContext (_tls_common.js:88:17)
2023-08-09 03:08 -07:00: at Server (_tls_wrap.js:819:25)
2023-08-09 03:08 -07:00: at new Server (https.js:60:14)
2023-08-09 03:08 -07:00: at Object.createServer (https.js:82:10)
2023-08-09 03:08 -07:00: at Server (/var/www/html/admin/modules/ucp/node/lib/server.js:98:19)
2023-08-09 03:08 -07:00: at EventEmitter.<anonymous> (/var/www/html/admin/modules/ucp/node/index.js:19:49)
2023-08-09 03:08 -07:00: at emitNone (events.js:106:13)
2023-08-09 03:08 -07:00: at EventEmitter.emit (events.js:208:7)
2023-08-09 03:08 -07:00: at /var/www/html/admin/modules/ucp/node/lib/freepbx.js:44:11
2023-08-09 03:08 -07:00: at EventEmitter.<anonymous> (/var/www/html/admin/modules/ucp/node/lib/freepbx.js:150:4)

So maybe this is a UCP issue.
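The "no start line" message is what OpenSSL reports when the file it was handed contains no -----BEGIN ...----- line, so whatever certificate or key UCP is loading is probably empty, truncated, or not PEM at all. A rough way to check candidate files is sketched below. pem_has_start_line is a made-up name, and the directory in the usage comment is an assumption; the thread does not say which files UCP reads.

```shell
#!/usr/bin/env bash
# pem_has_start_line FILE (hypothetical helper): succeeds if FILE is readable and
# contains a PEM header such as "-----BEGIN CERTIFICATE-----".
pem_has_start_line() {
  [ -r "$1" ] && grep -q -e '^-----BEGIN [A-Z0-9 ]*-----' "$1"
}

# Usage (the directory is an assumption; point it at the files UCP is configured to use):
#   for f in /etc/asterisk/keys/*.crt /etc/asterisk/keys/*.key; do
#     pem_has_start_line "$f" || echo "no PEM start line: $f"
#   done
```

This only checks for the header line; it does not validate the certificate itself.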

bksales · August 10, 2023, 6:32pm

fwconsole restart ucp

drops the CPU usage way down; will see if it stays down. The GUI is still broken: systemctl restart httpd does not fix it, and neither does fwconsole restart.

bksales · August 10, 2023, 8:53pm

CPU usage seems to stay low until I do fwconsole restart or reboot the PBX, then it shoots back up. I cleared the logs that are getting big (ucp_out and ucp_err), will have to see what’s happening.
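Before clearing a huge log, it can help to summarize only its tail instead of opening the whole file. The sketch below strips the leading timestamps (in the "2023-08-09 03:08 -07:00: " form seen in this thread) and counts the most frequent messages. top_log_messages is a made-up name, and the 10000-line window is an arbitrary choice.

```shell
#!/usr/bin/env bash
# top_log_messages FILE [N] (hypothetical helper): show the N most frequent
# messages in the last 10000 lines of FILE, ignoring leading timestamps.
top_log_messages() {
  local file=$1 n=${2:-5}
  tail -n 10000 "$file" \
    | sed -E 's/^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2} [+-][0-9]{2}:[0-9]{2}: )+//' \
    | sort | uniq -c | sort -rn | head -n "$n"
}

# Usage (pass the real path to the log file):
#   top_log_messages ucp_err.log 10
```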

Any suggestions on getting the GUI back up?

dicko · August 10, 2023, 9:20pm

systemctl status {httpd,apache2}

bksales · August 10, 2023, 10:07pm

systemctl status {httpd,apache2} -l

● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2023-08-10 11:42:32 PDT; 3h 20min ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 23843 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 13176 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
Main PID: 23879 (httpd)
Status: “Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec”
CGroup: /system.slice/httpd.service
├─13177 /usr/sbin/httpd -DFOREGROUND
├─13178 /usr/sbin/httpd -DFOREGROUND
├─13179 /usr/sbin/httpd -DFOREGROUND
├─13180 /usr/sbin/httpd -DFOREGROUND
├─13181 /usr/sbin/httpd -DFOREGROUND
└─23879 /usr/sbin/httpd -DFOREGROUND

Aug 10 11:42:31 host systemd[1]: Starting The Apache HTTP Server…
Aug 10 11:42:32 host httpd[23879]: AH00112: Warning: DocumentRoot [/invalid/folder/name] does not exist
Aug 10 11:42:32 host systemd[1]: Started The Apache HTTP Server.
Aug 10 14:59:37 host systemd[1]: Reloading The Apache HTTP Server.
Aug 10 14:59:37 host httpd[13176]: AH00112: Warning: DocumentRoot [/invalid/folder/name] does not exist
Unit apache2.service could not be found.

dicko · August 10, 2023, 10:25pm

Then it’s Red Hat, not Debian.

grep -ir DocumentRoot /etc/httpd/
systemctl restart httpd
systemctl status httpd

bksales · August 10, 2023, 10:43pm

grep -ir DocumentRoot /etc/httpd/

/etc/httpd/conf/httpd.conf:# DocumentRoot: The directory out of which you will serve your
/etc/httpd/conf/httpd.conf:DocumentRoot "/var/www/html"
/etc/httpd/conf/httpd.conf: # access content that does not live under the DocumentRoot.
/etc/httpd/conf.d/schmoozecom.conf: DocumentRoot /var/www/html
/etc/httpd/conf.d/schmoozecom.conf: DocumentRoot /var/www/html/restapps/
/etc/httpd/conf.d/schmoozecom.conf: DocumentRoot /tftpboot/
/etc/httpd/conf.d/schmoozecom.conf: DocumentRoot /invalid/folder/name
/etc/httpd/conf.d/schmoozecom.conf: DocumentRoot /var/spool/asterisk/sangoma_phone_service/
/etc/httpd/conf.d/freepbx.conf:# This should be changed to whatever you set DocumentRoot to.
/etc/httpd/conf.d/ssl.conf.old: DocumentRoot /var/www/html
/etc/httpd/conf.d/ssl.conf.old: DocumentRoot /var/www/html/ucp/
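To see which of those DocumentRoot entries point at directories that do not exist (the source of the AH00112 warning), something like the following could be run against the config files. It is a sketch only: missing_docroots is a made-up name, it ignores commented-out lines, and it will not cope with paths containing spaces. Note that the status output above shows httpd running despite the warning.

```shell
#!/usr/bin/env bash
# missing_docroots FILE... (hypothetical helper): print every DocumentRoot path
# from the given Apache config files that is not an existing directory.
missing_docroots() {
  awk 'tolower($1) == "documentroot" { gsub(/"/, "", $2); print $2 }' "$@" |
    sort -u |
    while IFS= read -r dir; do
      [ -d "$dir" ] || printf '%s\n' "$dir"
    done
}

# Usage:
#   missing_docroots /etc/httpd/conf/httpd.conf /etc/httpd/conf.d/*.conf
```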

About the same as before

systemctl status httpd -l
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2023-08-10 15:26:35 PDT; 15min ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 17562 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 13176 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
Main PID: 17565 (httpd)
Status: “Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec”
CGroup: /system.slice/httpd.service
├─17565 /usr/sbin/httpd -DFOREGROUND
├─17566 /usr/sbin/httpd -DFOREGROUND
├─17567 /usr/sbin/httpd -DFOREGROUND
├─17568 /usr/sbin/httpd -DFOREGROUND
├─17569 /usr/sbin/httpd -DFOREGROUND
└─17570 /usr/sbin/httpd -DFOREGROUND

Aug 10 15:26:34 host systemd[1]: Stopped The Apache HTTP Server.
Aug 10 15:26:34 host systemd[1]: Starting The Apache HTTP Server…
Aug 10 15:26:35 host httpd[17565]: AH00112: Warning: DocumentRoot [/invalid/folder/name] does not exist
Aug 10 15:26:35 host systemd[1]: Started The Apache HTTP Server.

bksales · August 10, 2023, 11:18pm

Now HTTP is working, just not HTTPS. I see a note on the dashboard that the certificate can’t be renewed: “Self test error: Pest_NotFound - 404 - File or directory not found”

and while I can get to the Certificate Manager, if I try to edit the existing Let’s Encrypt cert, I get this screen

I deleted and successfully recreated and reinstalled the cert and now I can get back in via https.

Not sure exactly what got HTTP access back, how a broken cert caused this much trouble, or even how it broke. Suggestions/explanations welcome, but for now the GUI is working.

Restarted Asterisk and it immediately jumped back to 80-90% CPU.

From ucp_out.log, there are tons of these:

2023-08-10 16:13 -07:00: Starting FreePBX…
2023-08-10 16:13 -07:00: 28800000
2023-08-10 16:13 -07:00: 14400000
2023-08-10 16:13 -07:00: Result set finished
2023-08-10 16:13 -07:00: No more result sets!
2023-08-10 16:13 -07:00: FreePBX is Ready!
2023-08-10 16:13 -07:00: Asterisk version is: 18.9
2023-08-10 16:13 -07:00: Starting FreePBX…
2023-08-10 16:13 -07:00: 28800000
2023-08-10 16:13 -07:00: 14400000
2023-08-10 16:13 -07:00: Result set finished
2023-08-10 16:13 -07:00: No more result sets!
2023-08-10 16:13 -07:00: FreePBX is Ready!
2023-08-10 16:13 -07:00: Asterisk version is: 18.9
2023-08-10 16:13 -07:00: Starting FreePBX…
2023-08-10 16:13 -07:00: 28800000
2023-08-10 16:13 -07:00: 14400000
2023-08-10 16:13 -07:00: Result set finished

and from ucp_err.log there are tons of these

2023-08-10 16:13 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: 2023-08-10 16:13 -07:00: -07:00: Error: error:0906D06C:PEM routines:PEM_read_bio:no start line
at Object.createSecureContext (_tls_common.js:88:17)
at Server (_tls_wrap.js:819:25)
at new Server (https.js:60:14)
at Object.createServer (https.js:82:10)
at Server (/var/www/html/admin/modules/ucp/node/lib/server.js:98:19)
at EventEmitter.<anonymous> (/var/www/html/admin/modules/ucp/node/index.js:19:49)
at emitNone (events.js:106:13)
at EventEmitter.emit (events.js:208:7)
at /var/www/html/admin/modules/ucp/node/lib/freepbx.js:44:11
at EventEmitter.<anonymous> (/var/www/html/admin/modules/ucp/node/lib/freepbx.js:150:4)
-07:00: Error: error:0906D06C:PEM routines:PEM_read_bio:no start line
at Object.createSecureContext (_tls_common.js:88:17)
at Server (_tls_wrap.js:819:25)
at new Server (https.js:60:14)
at Object.createServer (https.js:82:10)
at Server (/var/www/html/admin/modules/ucp/node/lib/server.js:98:19)
at EventEmitter.<anonymous> (/var/www/html/admin/modules/ucp/node/index.js:19:49)
at emitNone (events.js:106:13)
at EventEmitter.emit (events.js:208:7)
at /var/www/html/admin/modules/ucp/node/lib/freepbx.js:44:11
at EventEmitter.<anonymous> (/var/www/html/admin/modules/ucp/node/lib/freepbx.js:150:4)
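The ucp_out.log excerpt shows "Starting FreePBX…" several times within the same minute, which looks like the UCP node process crashing on the certificate error and being started again in a loop. Counting the starts per minute makes that easier to confirm. This is a sketch under the assumption that every line begins with a 16-character "YYYY-MM-DD HH:MM" timestamp, as in the excerpt; starts_per_minute is a made-up name.

```shell
#!/usr/bin/env bash
# starts_per_minute FILE (hypothetical helper): count "Starting FreePBX" lines
# per minute, keyed on the first 16 characters (the "YYYY-MM-DD HH:MM" timestamp).
starts_per_minute() {
  grep -F 'Starting FreePBX' "$1" | cut -c1-16 | sort | uniq -c
}

# Usage (pass the real path to the log file):
#   starts_per_minute ucp_out.log | tail -n 20
```

More than one start per minute, sustained, would fit the high CPU usage seen after each restart.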

bksales · August 11, 2023, 5:20pm

I followed Andrew’s steps to reinstall pm2, and while it took forever, it appeared to complete from the CLI

PM2 won’t install – how do I troubleshoot? - FreePBX - FreePBX Community Forums

But I see this in the GUI

I tried running that command again.

The log it references only has this in it

Session terminated, killing shell…The process “runuser ‘asterisk’ -s ‘/bin/bash’ -c ‘cd /var/www/html/admin/modules/pm2/node && mkdir -p /home/asterisk/.pm2 && mkdir -p /var/www/html/admin/modules/pm2/node/logs && export NODE_TLS_REJECT_UNAUTHORIZED=0 && export HOME=/home/asterisk && export PM2_HOME=/home/asterisk/.pm2 && export ASTLOGDIR=/var/log/asterisk && export ASTVARLIBDIR=/var/lib/asterisk && export PATH=$HOME/.node/bin:$PATH && export NODE_PATH=$HOME/.node/lib/node_modules:$NODE_PATH && export MANPATH=$HOME/.node/share/man:$MANPATH && npm install --only=production’” exceeded the timeout of 600 seconds.

bksales · August 14, 2023, 5:39am

So this continues to get worse. I tried uninstalling pm2 again and now the GUI is broken again, and I can’t reinstall it. I get the same error if I try to do anything fwconsole-related.

GUI just shows this

dwsiemens · August 14, 2023, 2:37pm

I would try to reinstall core and framework with the most current version.

I’m also on RHEL, and sometimes some items are a bit different because the way we got things working is a bit different.

bksales · August 14, 2023, 8:24pm

With downloadinstall, or how do I reinstall from the CLI?

dwsiemens · August 14, 2023, 8:38pm

From the CLI,

like fwconsole ma downloadinstall core
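Spelled out for both modules, that could look like the sketch below. reinstall_modules and run_or_print are made-up wrapper names, the order (framework before core) and the final fwconsole reload are assumptions, and it defaults to a dry run that only prints the commands. It also assumes fwconsole itself still runs, which may not be the case on this box.

```shell
#!/usr/bin/env bash
# run_or_print CMD... (hypothetical helper): print the command when DRY_RUN is 1
# (the default), otherwise execute it.
run_or_print() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# reinstall_modules MODULE... (hypothetical wrapper): re-download and reinstall
# each module with fwconsole, stop at the first failure, then reload.
reinstall_modules() {
  local m
  for m in "$@"; do
    run_or_print fwconsole ma downloadinstall "$m" || return 1
  done
  run_or_print fwconsole reload
}

# Usage:
#   reinstall_modules framework core              # dry run, prints the commands
#   DRY_RUN=0 reinstall_modules framework core    # actually runs them
```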