SPA phones randomly restarting

Hi everyone:

I’m not exactly a noob but this issue has me wondering if I’ve done something wrong. New PBX distro v12 was installed on Saturday and, starting Tuesday, some phones have been randomly rebooting. Phones are SPA509G but not on the latest firmware yet. EPM is installed. Of note, Phone Restart doesn’t work reliably using either the Admin menu or the EPM option to reboot. Sometimes a phone will restart right away, but more often it doesn’t restart at all and I have to restart it manually. I’m guessing there are stuck ‘restart’ commands that are getting processed later, much later. Is anyone else having this issue? Is there a way to check what SIP restart commands are waiting to be sent to phones and, if so, clear them out? I restarted the PBX last night and one of the phones restarted this morning.

Thanks very much!

Guru,

I have several clients with FreePBX servers and SPA514G phones in production. I don't use EPM though; I write my own configs.

I also had a problem where phones were rebooting every hour or so.

The phones aren't actually rebooting at random. The periodic resync is what triggers it.

Resync periodic is set to 1 hr in my settings. Normally a resync does not cause a reboot; a reboot only happens if the config on the server is different from the phone's current config OR if the current config has a problem / is corrupt.

In my case it was due to some updates I had made to their configs: the XML was corrupt or had an incorrect parameter. It's a pain in the ass to figure out where the problem is in the XML file, because it could be as simple as not closing an open tag.

Post a phone XML config for us to look at

Also, on a phone

Press settings
11
10

This will take you to the provision status menu.
What does it say? Sometimes it will say failed or corrupt.

The first thing I do to debug the XML is run it through an online XML validator to check for syntax errors. Then, if there are no syntax errors, I read the config line by line and make sure everything looks good: verify that the proxy and secret are correct, etc.
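If you'd rather not paste a config with secrets into a website, the same syntax check can be done locally. Here is a minimal sketch using Python's standard library. The function name `check_well_formed` and the sample fragment are made up for illustration, not taken from a real phone config. It only checks that the XML is well formed (tags closed, properly nested); it will not catch a wrong parameter name or a bad value, so the line-by-line read is still needed.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, "ok") if xml_text parses, else (False, description)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is (line, column) of the spot where parsing failed.
        line, column = err.position
        return False, f"line {line}, column {column}: {err}"
    return True, "ok"


# Hypothetical fragment: the inner tag is opened but never closed.
broken = "<flat-profile>\n  <Resync_Periodic>3600\n</flat-profile>"
ok, message = check_well_formed(broken)
print(ok, message)
```

The reported line is where the parser gave up, which is often a line or two after the tag that was actually left open.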

Reilly, thanks much for the heads-up! I’ll be onsite tomorrow morning and will check what you’ve asked and upload one of the XML files. If time permits, I’ll do that tonight after hours.

No problem! Sounds good

Hi Reilly:

How do I attach a file to a post? Copy/pasting the XML file was not the ideal move.

Checked the .XML with an online validator, www.xmlvalidation.com, but it found no errors. Didn’t choose external schema or any other options - just copy/pasted the file contents.

Checked the phone provision status menu - Menu, 11, 26 (on these phones) - some say ‘Idle’ but newly rebooted phones say ‘Succeeded’. None say failed or corrupt.

Thx

Paste the config and then you have to highlight it and click the “< >” icon so that the forum knows to treat it as code. Or you could drop it in http://www.pastebin.com/ and link to it here

Thx :smile:

http://pastebin.com/LS7atfw0

Also, I just updated FW to 7.5.6a on all phones - I wonder if that will affect anything.

Cool! All my phones are on 7.5.6a and it has been very stable.

Checking out your config

Guru,
Have you ever used Wireshark?

In situations like these,

I plug the “SW” port of the phone into a PoE brick, then plug that into the network jack on my laptop.

In Windows, I select both my LAN and wireless cards, right click and bridge them.

Then I connect to the company's wireless network.

I run Wireshark and watch for errors as the phone registers. The traffic passes through my LAN card, is bridged to my WLAN card, and reaches the company network, so I sniff the LAN card. If I don't see anything obvious, I turn the debug level up to 3, set the debug server to my wireless IP, and look at those messages.
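For the debug-server step, something has to be listening on the laptop to receive the messages. Wireshark will show them in the capture, but a tiny listener also works. This is a rough sketch, assuming the phone sends its debug output as plain UDP syslog datagrams to port 514 (the usual syslog port; check what the phone is actually set to). The names `open_listener` and `read_messages` are made up for this example.

```python
import socket


def open_listener(host="0.0.0.0", port=514):
    """Open a UDP socket on host:port (assumed syslog port, adjust as needed)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_messages(sock, count, timeout=5.0):
    """Collect up to count datagrams as (sender_ip, text); stop early on timeout."""
    sock.settimeout(timeout)
    messages = []
    while len(messages) < count:
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(4096)
        except socket.timeout:
            break  # nothing arrived within the timeout, return what we have
        messages.append((sender_ip, data.decode("utf-8", errors="replace")))
    return messages
```

Typical use would be `sock = open_listener()` followed by `for ip, text in read_messages(sock, 50, timeout=60): print(ip, text)` while the phone boots and registers. Binding to a port below 1024 may need admin rights on some systems, and the local firewall has to allow inbound UDP on that port.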

If you're not familiar, I might be able to load this config on one of my phones on Monday and debug it.

Also, I noticed that your “resync periodic” parameter is set to 24 hrs. Can you verify that the phones are rebooting once every 24 hrs? It should be at the same time every day.

To check, go to the phone's IP in a web browser, and on the information page it will have a “last reboots” section. Copy and paste that off a few phones for us.
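Once you have the reboot times from a few phones, checking them against the resync interval is just arithmetic on the gaps. A small sketch of that check follows. The function names, the 10-minute tolerance, and the timestamps in the usage note are all made up for illustration; the times would have to be typed in by hand from each phone's information page.

```python
from datetime import datetime, timedelta


def reboot_gaps(timestamps):
    """Return the gaps between consecutive reboot times, oldest first."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def looks_periodic(timestamps, period, tolerance=timedelta(minutes=10)):
    """True if every gap is within tolerance of a whole multiple of period."""
    gaps = reboot_gaps(timestamps)
    if not gaps:
        return False  # fewer than two reboots, nothing to compare
    for gap in gaps:
        # Nearest whole number of periods that fits this gap.
        multiples = round(gap / period)
        if multiples < 1 or abs(gap - multiples * period) > tolerance:
            return False
    return True
```

For example, `looks_periodic([datetime(2015, 12, 1, 9, 0), datetime(2015, 12, 2, 9, 2)], timedelta(hours=24))` treats two reboots about a day apart as matching a 24 hr resync. Whole multiples are allowed because a resync only reboots the phone when it finds something different, so some days may be skipped. If the gaps don't line up with the interval at all, the resync is probably not the cause.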

Hi Reilly:

Thanks so much for your help. The issue is resolved, and it was related to power issues.
First of all, FW upgrade to 7.5.6a helped a lot - most phones stopped rebooting.
But I was using these inexpensive POE pigtail adapters - they use the original power adapter and carry its power over ethernet, using an RJ45/male power on one end and an RJ45/female power on the other end. I only needed them for a few phones so didn’t need a POE switch. They were apparently causing issues with the ethernet signal and causing spontaneous reboots. Once I got rid of those and installed a dedicated POE brick, problems went away. I’m still keeping my fingers crossed because it was such a nightmare for a few days.

Thanks again. Happy Holidays (Christmas, Chanukah, etc) and Happy New Year. :smile:

Wow,
That one was a pain to troubleshoot, I bet!

The last FreePBX/Cisco SPA phone system I put in, the client had 25+ phones, so it would have been financially logical to get a PoE switch. The problem was there was no real MDF, just a bunch of randomly connected Netgear switches, because it was a large warehouse.

So we used a bunch of these $20 PoE injectors by TP-Link.

They've worked great; hoping that they don’t cause any weird issues in the future like what you described.