Sangoma S500 sluggish and keeps rebooting every 15 minutes

A few more notes:

  1. After purchasing 20 or so S500 phones and putting them in production, everything worked great for a couple of weeks. Then 2 phones bugged out in a similar fashion to our first phone, the one we purchased for testing purposes. We have replaced one of them but decided to hold onto the second one. Such a high failure rate is suspicious. In addition, all of our phones were randomly turning sluggish/unresponsive (once every 5 minutes or so). Occasionally, users would be unable to answer calls, etc. (the phones would not respond to commands).

  2. Having the buggy phone plugged in caused high amounts of network traffic between it and the FreePBX box (GBs every 5 minutes or so) as it was locking up/rebooting. We are using 10/100 PoE Cisco switches, and the network traffic was saturating the available bandwidth and degrading call quality (that’s how we detected the high traffic in the first place).

  3. We disconnected/powered off the phone. A few hours later, our DHCP server assigned the IP address originally held by the phone to another device (an employee time tracking card reader box). The high network traffic between the FreePBX box and that IP (now assigned to the time tracking box, not the phone) resumed. I was really curious, so I ran a tcpdump and peeked at the traffic. FreePBX was sending SIP packets (complete with the SIP extension number of the disconnected phone, doh!) to the time tracking box, thinking it was the phone. EndPointManager accurately portrayed the phone as disconnected, but Asterisk obviously did not get the memo (even though the two devices have different MAC addresses). We had to statically bind the IP to the phone’s MAC address in order to resolve the high traffic problem.

  4. After reading "Sangoma Phone Quality Thoughts", I decided to go ahead and upgrade all our Cisco PoE switches to the latest firmware, as they were powering our phones. I even placed one of the phones behind a PoE injector. It did not make any difference.

  5. Grasping at straws, I rebooted the FreePBX box. I had applied several patches in the last couple of weeks without rebooting, and uptime was around 1 month. The reboot solved most of the call quality and sluggish phone issues. I also noticed memory usage on the FreePBX box went from 70% (with 11% swap) to 34%. Note to self: always reboot after patching. Our old Digium Switchvox box (also running Asterisk) mandated it.

  6. Things were much better, but we still had occasional call quality and sluggish phone issues, so we dug deeper. Turns out call history length is the culprit. When it gets big (more than a couple of hundred items), problems start to appear. I looked but couldn’t find an option to limit call history length to 99 items (the old Digium D70 phones only kept 99 items). Frankly, I don’t think one needs 1000 items anyway. If you are looking that far back, it’s far better to use UCP, as it sports a vastly superior interface and contains all entries (not just the most recent 1000).
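
For anyone who wants to repeat the check from note 3, here is a rough Python sketch that tallies SIP packets per source/destination pair from `tcpdump -n` text output, which makes a flood toward a stale IP easy to spot. It is only an illustration, not the exact thing I ran: the function name `count_sip_by_destination`, the sample addresses, and the extension number are made up, and it assumes SIP is on UDP port 5060 and that tcpdump prints IPv4 lines in its usual `IP src.port > dst.port:` form.

```python
import re
from collections import Counter

# Address part of a `tcpdump -n` IPv4 line, e.g.
# "12:00:01.000001 IP 10.0.0.5.5060 > 10.0.0.42.5060: SIP: NOTIFY ..."
# (assumed output format; adjust the pattern if your tcpdump prints differently)
LINE_RE = re.compile(
    r"\bIP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def count_sip_by_destination(lines, sip_port=5060):
    """Count packets per (source IP, destination IP) where either port is sip_port."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not an IPv4 line with ports (ARP, IPv6, etc.)
        src_ip, src_port, dst_ip, dst_port = match.groups()
        if int(src_port) == sip_port or int(dst_port) == sip_port:
            counts[(src_ip, dst_ip)] += 1
    return counts


# Hypothetical capture lines: 10.0.0.5 stands in for the PBX,
# 10.0.0.42 for the IP that used to belong to the phone.
sample = [
    "12:00:01.000001 IP 10.0.0.5.5060 > 10.0.0.42.5060: SIP: NOTIFY sip:201@10.0.0.42 SIP/2.0",
    "12:00:01.000002 IP 10.0.0.5.5060 > 10.0.0.42.5060: SIP: NOTIFY sip:201@10.0.0.42 SIP/2.0",
    "12:00:01.500000 IP 10.0.0.5.22 > 10.0.0.9.51000: Flags [P.], length 64",
    "12:00:02.000000 ARP, Request who-has 10.0.0.1 tell 10.0.0.5, length 28",
]

for (src, dst), n in count_sip_by_destination(sample).most_common(5):
    print(f"{src} -> {dst}: {n} SIP packets")
```

To use it on a real capture, save the tcpdump text output to a file and pass the open file object in place of `sample`. If one pair dominates the list and the destination is not a phone, you are probably looking at the same stale-registration traffic described above.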