Sangoma S500 sluggish and keeps rebooting every 15 minutes

We are using a brand new Sangoma FreePBX Phone System 300 (bought in February). We are currently using Digium phones provisioned with EPM (old phones from our previous Switchvox setup). They work OK-ish (anything other than basic features is buggy or quirky as hell).

We bought a test Sangoma S500. Provisioning with EPM was a breeze. The phone boots up with the settings, but after 10-15 seconds it becomes sluggish/unresponsive (very long delays, measured in seconds, when interacting with it), and it eventually reboots itself after 15 minutes or so.

All phone modules are up to date:

asterisk 23656 0.0 1.0 728104 39740 ? Ssl 11:17 0:01 /usr/bin/node /var/www/html/admin/modules/restapps/node/node_modules/forever/bin/monitor …/restapps.php
asterisk 23677 2.1 1.3 362284 54256 ? S 11:17 5:28 php /var/www/html/admin/modules/restapps/restapps.php

I am thinking (and hoping) this is probably an obvious issue with my settings. Maybe some FreePBX guru could provide me some guidance?

What firmware are you using on the phone? Upgrade to current if you haven’t.
https://wiki.freepbx.org/display/PHON/Updating+Firmware

It is on the latest that EPM allows. It reads version 1.31, which in turn translates to 2.0.4.28 for the S500.

Also downgraded to 1.30 (2.0.4.27) for giggles and it made no difference.

Can you verify on the actual phone what firmware it has installed?

Also might be best to open a support ticket at support.sangoma.com and they can help figure out what’s going on.

Tony,

Thank you for replying. Yes, this has been looked at. The status page of the phone’s web interface shows:

Product Model S500

Firmware Version
BOOT–2.0.3.36(2016-01-31 11:10:00)
IMG–2.0.4.28(2017-03-27 16:33:00)
ROM–2.0.4.28(2017-03-27 16:33:00)
DSP–9.0.3(Patch 1.0.16)

EHS Version
Version on phone V1.2
EHS Module Version N/A

I opened a ticket on Tuesday morning. I uploaded the phone’s log and config files as per wiki instructions. It has been 48 hours and the ticket has not yet been assigned. I am giving it another day. But if I hear nothing from the support team 3 days after I posted the ticket it will be necessary for me to move on to other phones and/or phone systems.

Well, the team can get behind on things, but rest assured someone will get back to you soon.

Talked to support. He concluded the phone is broken.
I am having it replaced. I will update this thread once I get the new phone.

Just a quick update and then probably this thread can be closed.

I got the replacement today and the phone works without issues. We are very happy with the functionality. We will be replacing all our old phones with the s500 model.

A few more notes:

  1. After purchasing 20 or so S500 phones and putting them in production, everything worked great for a couple of weeks. Then 2 phones bugged out in a similar fashion to our first phone, the one we purchased for testing purposes. We have replaced one of them but decided to hold onto the second one. Such a high failure rate is suspicious. In addition, all of our phones were randomly turning sluggish/unresponsive (once every 5 minutes or so). Occasionally, users would be unable to answer calls etc. (phones would not respond to commands).

  2. Having the buggy phone plugged in caused high amounts of network traffic between it and the FreePBX box (GBs every 5 minutes or so) as it was locking up/rebooting. We are using 10/100 POE Cisco switches, and the network traffic was saturating the available bandwidth and degrading call quality (that’s how we detected the high traffic in the first place).

  3. We disconnected/powered off the phone. A few hours later the IP address originally assigned to the phone was assigned to another device (an employee time tracking card reading box) by our DHCP server. The high network traffic between the FreePBX box and the IP (now no longer assigned to the phone but to the time tracking box) resumed. I was really curious so I ran a tcpdump and peeked at the traffic. FreePBX was sending SIP packets (complete with the correct SIP extension number, matching the disconnected phone, doh!) to the time tracking box thinking it was the phone. EndPointManager accurately portrayed the phone as disconnected but Asterisk obviously did not get the memo (even though the two devices have different MAC addresses). We had to statically bind the IP to the phone’s MAC address in order to resolve the high traffic problem.

  4. After reading “Sangoma Phone Quality Thoughts” I decided to go ahead and upgrade to the latest firmware for all our Cisco POE switches, as they were powering our phones. I even placed one of the phones behind a POE injector. It did not make any difference.

  5. Grasping at straws, I rebooted the FreePBX box. I had applied several patches in the last couple of weeks without rebooting. Uptime was around 1 month. The reboot solved most of the call quality/sluggish phone issues. I also noticed memory usage on the FreePBX box went from 70% with 11% swap to 34%. Note to self: always reboot after patching. Our old Digium Switchvox box (also running Asterisk) mandated it.

  6. Things were much better, but we still had occasional call quality/sluggish phone issues, so we dug deeper. It turns out call history length is the culprit. When that gets big (greater than a couple of hundred items) problems start to appear. I looked but couldn’t find an option to limit call history length to 99 items (old Digium D70 phones only did 99 items). Frankly I don’t think one needs 1000 items anyways. If you are looking that far back it’s far better to use UCP, as it sports a vastly superior interface and contains all entries anyways (not just the most recent 1000).
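
For anyone who wants to check for the situation in note 3, here is a minimal Python sketch of the cross-check we ended up doing by hand: compare the address the PBX is still sending SIP traffic to against what the DHCP server currently has leased. Everything in it is an assumption for illustration (the `Contact` record, the `find_stale_contacts` helper, the addresses and MACs); it does not query Asterisk or pfSense, you would have to fill in both tables yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    extension: str      # SIP extension the PBX believes is reachable
    ip: str             # address the PBX is sending SIP traffic to
    expected_mac: str   # MAC of the phone provisioned for that extension


def find_stale_contacts(contacts, leases):
    """Return (contact, current_holder) pairs where the contact's IP is now
    leased to a different MAC, or to nobody at all.

    `leases` maps IP address -> MAC address currently holding the lease.
    """
    stale = []
    for c in contacts:
        holder = leases.get(c.ip)
        if holder is None or holder.lower() != c.expected_mac.lower():
            stale.append((c, holder))
    return stale


# Hypothetical data: the phone for extension 105 was unplugged and its old
# address was later handed to a different device.
contacts = [
    Contact("101", "192.168.1.50", "aa:bb:cc:00:00:01"),
    Contact("105", "192.168.1.57", "aa:bb:cc:00:00:05"),
]
leases = {
    "192.168.1.50": "AA:BB:CC:00:00:01",   # same phone, different letter case
    "192.168.1.57": "dd:ee:ff:00:00:99",   # now the time tracking box
}

for contact, holder in find_stale_contacts(contacts, leases):
    print(f"ext {contact.extension}: {contact.ip} is now held by {holder}")
```

With the made-up data above, only extension 105 is flagged. A static mapping for the phone, which is what we did, makes this mismatch impossible in the first place.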

So in the end your issue was mainly related to your PBX not being rebooted after upgrades and some weird network issues. Your DHCP server should not have given out an IP that was also assigned somewhere else, or handed an existing IP to another device. That is very dangerous.

As far as call history in the phone, I have not seen this. Can you open a bug report on this at issues.freepbx.org so we can dig into it and decide on the best course of action? On my phone it only shows the last 100 calls in call history. Where on the phone are you seeing a larger call history?

We need a schematic of your hardware layout to better help you.

Not quite, Tony.

The one phone that bugged out completely (reboot every 5-15 minutes) never recovered. It still does this. The reboot of the server fixed the other phones’ behavior of intermittently freezing/being sluggish. To clear up any confusion, these are in my mind two separate issues, albeit with sluggish phones as a shared symptom.

The DHCP leases are assigned for 2 hours, the default setup on pfSense. After 2 hours another device was assigned the phone’s IP (since the phone was disconnected). Nothing dangerous or strange, just standard procedure.
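
To illustrate why this is expected behavior, here is a toy Python model of lease reuse. It is only a sketch under simplifying assumptions: `ToyDhcpPool` is a made-up class, the pool has a single address, and a real DHCP server (pfSense included) has more elaborate allocation rules.

```python
LEASE_SECONDS = 2 * 60 * 60  # 2-hour lease, as in the setup described above


class ToyDhcpPool:
    """Toy model of lease reuse; not how pfSense is actually implemented."""

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self.leases = {}  # ip -> (mac, expiry time in seconds)

    def request(self, mac, now):
        # Renew if this MAC still holds an unexpired lease.
        for ip, (holder, expiry) in self.leases.items():
            if holder == mac and expiry > now:
                self.leases[ip] = (mac, now + LEASE_SECONDS)
                return ip
        # Otherwise take the first address that is free or whose lease expired.
        for ip in self.addresses:
            lease = self.leases.get(ip)
            if lease is None or lease[1] <= now:
                self.leases[ip] = (mac, now + LEASE_SECONDS)
                return ip
        raise RuntimeError("pool exhausted")


pool = ToyDhcpPool(["192.168.1.57"])

# The phone gets its lease at t=0 and is then unplugged, so it never renews.
phone_ip = pool.request("phone-mac", now=0)

# One hour later the lease is still held, so the card reader cannot get it.
try:
    pool.request("reader-mac", now=1 * 60 * 60)
except RuntimeError as err:
    print("after 1 hour:", err)

# Once the 2 hours are up, the same address goes to the card reader.
reader_ip = pool.request("reader-mac", now=2 * 60 * 60)
print("after 2 hours:", reader_ip, "(same as phone:", reader_ip == phone_ip, ")")
```

The DHCP side is doing nothing wrong here. The problem is only that the PBX kept sending to the old address after the lease had moved on.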

Regarding phone history, I stand corrected. It appears the call history only shows 100 entries but the missed call counter goes up to 999 (hence the confusion).

Nothing fancy. The standard ISP gateway -> pfSense router -> Cisco master switch. The phones are connected either to the master switch or to smaller Cisco switches, which in turn connect to the master switch.

+1 for pfSense

A couple of things to try:

  1. Static leases for equipment that is permanent.

  2. 24-hour DHCP leases for the rest.

  3. Try using the default background image on the phones.

A quick update. We switched from PJSIP to SIP for our Sangoma S500 phones and all problems went away. This even fixed the phone that was constantly rebooting.

We still have a bunch of Digium D40s left using PJSIP, as those phones never had any problems.

TL;DR: don’t use pajama SIP for production, as it doesn’t seem to be mature enough. But in all fairness the official wiki says the same thing.

You are correct. PJSIP actually causes quite a few issues. It’s been around a long time but it’s not yet production worthy. When having issues, I often find that I accidentally set up the extension as PJSIP.

I have a phone doing the same thing and it is not using PJSIP. This unit has the EXP-100 attached and I also have a custom background image. Other phones on the same network seem to be just fine.

On a side note, people are also randomly reporting that some of these phones send the wrong digits for outbound DTMF.

The constant “Checking Firmware” and “Resetting” of the devices “could” sometimes also be rooted in the phone’s power supply, in this case the POE. I am suffering from the same problems as described above and now have my phone connected to a separate POE injector to see if it stabilizes.

I will keep the results posted, whether it works or I end up going for an RMA :wink:

JWR