Major system crash

Last Friday one of my systems crashed. Users could still dial via DUNDi and extension to extension (I figure that is because of what was loaded in memory). However, they could not dial out and all inbound calls were just getting a ring. The system is a Dell OptiPlex 755 with a Core 2 Duo processor, 2 GB of RAM and a 250 GB SATA drive. It is running Trix 2.something, with FreePBX upgraded to 2.5 rc2.

The site was 3.5 hours away, so my first step was to log into the system to see if I could find errors. All attempts to log in failed. I then made a remote desktop connection to the Windows network and tried to use Webmin or the GUI to connect. Still could not connect to the box.

I asked one of the people there to see what was on the screen. Nothing, it was black. Wiggle the mouse, tap a key on the keyboard: still nothing. So I said power it down and restart, and see what happens. All that happened was it came up to GRUB. I walked them through mounting and booting the file system. No luck, we kept getting "no file system found".

So I grabbed a spare box that was 90% set up and headed out. When I got there I went through the same steps. Nothing. Loaded a boot CD and started up Linux rescue. Same message: no file system. It was as if the hard drive had been wiped. So I got them up and going with the spare box, and all is good in their world.

I now have the time to see if I can find the cause. I ran the diagnostics that are built into the CMOS. No errors reported. It shows 2 GB of RAM, all good. Video good. Net card good. Hard drive is present, 250 GB, but no file system.

So I tried to install CentOS 5.2. It comes up to the first screen, then when you go to install you get a bunch of very colourful flashing square boxes. Weird. I’m thinking video card. Tried a new one, same thing. Hmmmmm. Grabbed an old Trix install CD and it boots and formats the file system, but does not complete as it should. Cannot connect to it after install. Cannot amportal start, restart, or stop. Seems like it is not installed.

Scratch my head some more. Ran a file system check. Seems good, no errors. Scratch some more. Decided to try yum update. This seemed to update everything. Did a shutdown and reboot. I have now noticed that Trix did not complete the install. No Asterisk, no FreePBX, nothing. Going to try PBX in a Flash and see what happens… Well, PBX in a Flash seemed to install fine.

I’m now concerned as to what happened. Is my hard drive bad, or is it a bad controller? I don’t know if I should use this system or not. I would hate to get it working just to have it crash again. I have a number of systems all based on the same hardware and software, and all appear to be running with no fuss.

Currently I’m running the SMART drive tests to see if I can get any info from there.

Does anyone have any ideas of what I could try? Or, as I’m thinking, should I just toss the drive and get a new one? I’m just concerned that it was something else.

Rob

I would put that machine through a couple of hours' worth of hard drive tests and memory tests. If it passes, run a burn-in on it. All of these utilities are on the Ultimate Boot CD. You might also want to take voltmeter readings of the power supply while you are running these tests. The 12 volt side should be over 11.5 volts and the 5 volt side should be within 1/10 of a volt of 5 volts. If the 5 volt side is less than 4.8 volts, I would replace the power supply. If everything tests out OK, you may have had an unwanted visitor.
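If you end up logging several meter readings during the burn-in, the thresholds above are easy to turn into a quick check. This is only a rough sketch of that rule of thumb: the function name `check_rails` and the example readings are made up for illustration, and the limits are just the ones quoted in this post, not a manufacturer spec.

```python
def check_rails(v12, v5):
    """Compare voltmeter readings against the rule of thumb from this thread.

    v12 and v5 are readings in volts taken by hand (hypothetical inputs).
    Returns a list of problems; an empty list means both rails look OK.
    """
    problems = []
    # 12 volt side should be over 11.5 volts
    if v12 <= 11.5:
        problems.append("12 V rail is not over 11.5 V")
    # 5 volt side should be within 1/10 of a volt of 5 volts
    if abs(v5 - 5.0) > 0.1:
        problems.append("5 V rail is more than 0.1 V away from 5 V")
    # below 4.8 volts on the 5 volt side: replace the supply
    if v5 < 4.8:
        problems.append("5 V rail is under 4.8 V, replace the power supply")
    return problems


if __name__ == "__main__":
    # Example readings, invented for illustration only
    for reading in [(12.1, 5.02), (11.2, 5.0), (12.0, 4.7)]:
        print(reading, check_rails(*reading) or "OK")
```

Take the readings while the machine is under load, since a weak supply can look fine at idle.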

That was my last thought. Unwanted visitor. I cannot see how they could have gotten in though. But I know anything is possible.

I will try your other suggestions and then check my security.

Rob

There have been many hacks on TB as of late.

Well, I’m thinking gremlins. I have run every disk check and disk scan I can find. I did come up with one bad sector. Everything else checks out. Power is in the range expected.

So, the drive seems to check out. I’m still reluctant to use it and think I’ll get it replaced by Dell.

Rob

Rob,

In my company, one bad sector is adequate reason to trash the drive. Our experience is that they fail fairly quickly after that first sector goes bad.
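For anyone who wants to apply that "one bad sector and it goes" rule across a batch of machines, here is a minimal sketch that reads the attribute table printed by `smartctl -A` and flags a drive once the raw `Reallocated_Sector_Ct` is 1 or more. Some assumptions to be aware of: the sample text below is invented for illustration and is not output from the drive in this thread, the helper names are hypothetical, and a bad sector found by a surface scan may show up under a different attribute (for example a pending sector count) rather than this one.

```python
def reallocated_sectors(smartctl_output):
    """Return the raw Reallocated_Sector_Ct value, or None if it is not listed.

    Expects the attribute table as text, in the usual ten-column layout
    (ID#, ATTRIBUTE_NAME, ..., RAW_VALUE).
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None


def should_replace(smartctl_output):
    """Apply the 'one bad sector is enough' policy from this thread."""
    count = reallocated_sectors(smartctl_output)
    return count is not None and count >= 1


# Invented sample table, for illustration only
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4521
"""

if __name__ == "__main__":
    print("reallocated:", reallocated_sectors(SAMPLE))
    print("replace drive:", should_replace(SAMPLE))
```

To use it on a real box you would paste in, or capture, the text that `smartctl -A` prints for the drive in question.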

It took two hours, but I finally convinced Dell that the drive was NFG. They have sent a replacement. The system is running fine, but I will feel more comfortable knowing that the drive is brand new.

Rob