Today we had our PBX go down. I used PuTTY to SSH into it. I ran fwconsole restart and it showed that the file system was read-only. I tried to run reboot now and it said the same thing. I had to go to the server physically and reboot it.
After the reboot everything appeared to be OK. After a while the file system went read-only again, but this time Asterisk/FreePBX was still working.
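For anyone hitting the same symptom, here is a minimal sketch of how to confirm which file systems the kernel currently has mounted read-only. It assumes a Linux box where /proc/mounts is available; the function name read_only_mounts is made up for this example and is not part of FreePBX.

```python
def read_only_mounts(mounts_text):
    """Return (device, mount_point) pairs whose mount options include 'ro'.

    mounts_text is the content of /proc/mounts, where each line looks like:
    device mount_point fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Split on commas so that "errors=remount-ro" is not mistaken for "ro".
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found


# Usage on the live system (assumption: run on the PBX itself):
#   with open("/proc/mounts") as f:
#       for device, mount_point in read_only_mounts(f.read()):
#           print(device, "on", mount_point, "is mounted read-only")
```

Some mounts are read-only by design, so the one to look for here is the root file system (/). If it shows up in the list, the kernel has most likely remounted it after hitting an error.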
The server has been in place for about 2 years and working great. It is an HP DL380 G6 with a RAID 5 array with 2 hot spares. The drives show OK. The only issue I found was that the array battery has failed, per the HP Array config at boot.
I checked the drive space with df -h and it is not full:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       672G   85G  554G  14% /
tmpfs           7.8G     0  7.8G   0% /dev/shm
/dev/sda1       283M   49M  219M  19% /boot
I ran fsck and it showed that the file systems were OK.
Has anybody else had this issue?
There might be a clue in /var/log/messages.
Nothing in there that I see as obvious. Have you ever seen this before? I know it can happen if there is a bad disk, but it is not showing a bad drive at this time.
A few times. It is often an OOM from a runaway process or a horribly screwed journal, but both are noticed by the kernel.
But without doubt replace the battery. Without it the controller can no longer safely cache writes, which can screw the journal and decrease performance.
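Since both an OOM and a journal problem get logged by the kernel, a quick way to look for them is to filter the log for the usual phrases. This is a rough sketch only: the patterns below are assumptions based on common kernel message wording, which varies between kernel versions, and scan_log is a made-up name for this example.

```python
import re

# Assumed message fragments; exact wording differs between kernel versions.
PATTERNS = {
    "remounted read-only": re.compile(r"remounting filesystem read-only", re.I),
    "journal abort": re.compile(r"aborting journal|journal has aborted", re.I),
    "I/O error": re.compile(r"I/O error", re.I),
    "out of memory": re.compile(r"out of memory|oom-killer", re.I),
}


def scan_log(lines):
    """Return (label, line) pairs for log lines matching a known pattern.

    Each line is reported once, under the first pattern that matches it.
    """
    hits = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((label, line.rstrip("\n")))
                break
    return hits


# Usage on the live system (assumption: syslog is written to /var/log/messages):
#   with open("/var/log/messages", errors="replace") as f:
#       for label, line in scan_log(f):
#           print(f"[{label}] {line}")
```

I/O errors that show up shortly before a journal abort or a read-only remount point at the disk or controller rather than at the file system itself.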
@dicko So I changed the BBWC battery and everything was fine for a few days, then BAM, it did it again at 1:15 am today.
Makes for a long day.
Here is that error
Nasty. Last time I saw that I had to replace the RAID controller and rebuild the array.
I ended up swapping the drives into another server of the same model (DL380 G6), and what do you know: one of the drives was reporting a failure. Replaced the drive and it has been a week today with no issues (KNOCK ON WOOD).
@dicko Thanks for the help.