SIPStation Major Outage

siptrunk
freepbx

(Mark Moore) #1

Is anyone else seeing that SIPStation is having a major outage? Our trunks are down, and the 920-886-8130 number reports that no one is there to take support calls.

The status.sangoma.com page reports numerous “Major Outage” entries but says that SIPStation - Trunking is Operational.

Unfortunately, none of our incoming numbers are working.

Anyone know what’s going on??


(Edrick Smith) #2

Yeah, isn’t this wonderful from such a “big” company as Sangoma? They’re having a major systems outage with no notice and no time frame for a fix, and you can’t reach anyone in support for the hardware or, apparently, for SIPStation, as you say. Just another fubar from a company that keeps buying out anyone it can under the sun, yet amazingly doesn’t provide a more reliable system.

Hilarious: if you go to SIPStation.com you now just get a white page, or an Error 500 in Chrome.


(Jared K Smith) #3

We experienced a problem with one of our production databases, and our IT staff is working to resolve the issue. The impact on SIPStation trunks/DIDs seems to be limited to trunks/DIDs purchased in the last couple of months. Please rest assured that we’re working to restore full service as quickly as possible.


(Tony Lewis - https://bit.ly/2SbDAyc) #4

Their status page shows this.


(Edrick Smith) #5

Well, I can tell you that even though it claims support.sangoma.com is fixed, it certainly isn’t. And, per my other thread, I’ve lost all connectivity with the phones connected over the VPN and can’t do anything to provision them, as that also seems to be down.

On top of that, once we did a module upgrade it hosed the GUI, because the OEM module can’t reach something with DDNS and is complaining about the system being down on Sangoma’s end. So this isn’t just some small glitch; it’s an issue that has kept production environments offline all day now.


(Tony Lewis - https://bit.ly/2SbDAyc) #6

Are you on PBXact? That is the only reason you would have the OEM branding module. If so, and you have a support contract, call Sangoma support. That is part of your support agreement along with the SLA. They should be able to provide a solution to get you limping along.


(Edrick Smith) #7

That’s why I’m pissed. I called at 12PM PST about the outage issue we’re having with VPN functionality, and the only person I could get ahold of was through the sales channel. They told me they had no idea what was going on other than that there were issues all over the board. They couldn’t provide any point of contact, said they had zero way of getting ahold of technical support, and said there was nothing they could do. They said that since you can’t access support.sangoma.com to open a ticket, there’s nothing you can do other than wait.

What a wonderful response that was… I insisted he take my information and send me a follow-up email, which I never received. I tried running a module update to see if perhaps there was an issue, since the system hadn’t had its modules updated in the past 60 days. Then I got borked on the OEM module, which complains that it can’t talk to Sangoma due to the outage and refuses to install / update.

I called again and got “through” the support menu by putting in an old ticket number only to be instantly dumped in the support voicemail.

I can’t open new tickets since there’s no support portal, and on top of that, since the portal is also down, I can’t check any provisioning settings, so there’s really zero I can actually diagnose on the module or VPN end, unfortunately.


(Tony Lewis - https://bit.ly/2SbDAyc) #8

Your VPN should not be dependent on any Sangoma servers unless you are using the Sangoma DDNS which is also down now.


(Edrick Smith) #9

I do believe we are, but I am not sure, and with so many different systems experiencing outages there’s no way for me to check, other than the fact that all VPN-enabled devices are refusing to connect. When I tried factory resetting one of the phones to test, the phone wouldn’t re-provision, which I can only guess is related to the portal being down.

I can’t even diagnose further via the GUI on my end since… that’s down also, due to the module failing.

But again, I can’t check any settings, so those phones have been without service all day.


(Tony Lewis - https://bit.ly/2SbDAyc) #10

I am sorry you’re experiencing all this. If you could get into the PBX GUI you could update your VPN server to use your own DNS.


(Tony Lewis - https://bit.ly/2SbDAyc) #11

Have you tried updating modules from fwconsole to see if you can get the GUI back? fwconsole ma upgradeall


(Edrick Smith) #12

Yes. Unfortunately, I can remove OEMbranding via CLI and get the portal to “load”, but no menu structure shows up, so the only option is manually typing the URL for everything, which is going to be a major pain for testing, plus I don’t want to screw things up even more…

Any time you try to install / re-setup oembranding via CLI it fails because it can’t talk to Sangoma.

[root@uc-~]# fwconsole ma install oembranding

In Sysadmin.class.php line 1449:

  Tried to update DDNS with {"deploymentname":"xx","hash":"xx","publish_ddns":"true","endpoint_proto":"http","endpoint_port":84,"distro_version":"12.7.6-1910-1.sng7","failsafe_version":"1.0.0.0"}, crashed with {"result":false,"message":"System currently under maintenance. You can check the current status of our applications in the following URL: <a href='https:\/\/status.sangoma.com\/' target='_blank'>https:\/\/status.sangoma.com\/<\/a>"}


(Jared K Smith) #13

Our engineers continue to work on restoring the failed database that has caused the service outages. They are making significant progress in restoring the database to full functionality, but we still expect that it will take several hours to fully restore all services. We will continue to update https://status.sangoma.com with more information as the services are restored.


(Nenad Corbic) #14

Most of the services have been restored!
Sangoma support will be up soon.

Sangoma engineers have worked through the night and were able to bring all critical
services back up. Please refer to https://status.sangoma.com/ for detailed status update.

We will continue to perform detailed monitoring throughout
the day in order to make sure that the new systems are working
optimally.

Over the next few days, we will be doing a detailed review of why the outage occurred
so that we can identify the root cause and improve the reliability of our systems.

Thank you for your patience.


(Lorne Gaetz) split this topic #15

2 posts were split to a new topic: SIP Provider recommendation


(Mark Moore) #16

Thank you very much, Mr. Corbic. I hope that you can keep your customer base informed of the investigation, the findings, and the changes.


(Nenad Corbic) #17

Hi @mmoo9154, for sure we will.

And we are going to be changing our processes to be much more transparent and to over-communicate.
One of the many things this outage taught us was the unintended consequences of relying on cloud and the effects of that cloud infra going down. We have a lot of work ahead of us, and we will make it better, that is for sure.

As for SIP Trunking, if you are interested in talking to VoIP Innovations
I would be very happy to make the introductions. You can email me at ncorbic@sangoma.com with your request and I’ll have someone reach out to you today.

Nenad


(Jared Busch) #18

Cloud fails, it is not magic. But that does not mean cloud is bad. It means you simply have to appropriately plan for DR for the cloud services. With AWS it could be as simple as a checkbox to enable regional failover, etc.
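
As a rough illustration of the failover idea above, here is a minimal Python sketch that picks the first reachable endpoint from an ordered list. The function name `first_healthy`, the `probe` callable, and the region names in the usage note are all hypothetical; this is not how AWS or Sangoma implement failover, only the general pattern.

```python
def first_healthy(endpoints, probe):
    """Return the first endpoint for which probe(endpoint) is truthy.

    endpoints: ordered list, primary first (hypothetical names).
    probe: caller-supplied health check; may raise OSError when the
           endpoint is unreachable.
    Returns None when every endpoint fails its check.
    """
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except OSError:
            # Treat a network error as "unhealthy" and try the next one.
            continue
    return None
```

For example, `first_healthy(["us-east", "us-west"], probe)` would return "us-west" whenever the probe fails for "us-east" but succeeds for "us-west".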


(Nenad Corbic) #19

Agreed.
I was more referring to how FreePBX handles the infra outage.
We will improve it so it handles the infra outage gracefully when it comes to module updates, licence checks etc…
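
A minimal sketch of what handling the outage gracefully could look like, written in Python rather than FreePBX's own PHP. The names `install_module` and `notify_remote` are hypothetical and are not FreePBX functions; the only detail taken from this thread is the shape of the remote reply (`{"result": false, "message": ...}`). The point is that a failed remote call is recorded as a warning instead of aborting the local install.

```python
def install_module(name, notify_remote):
    """Install a module locally; treat the remote call as best effort.

    notify_remote: hypothetical callable that performs the remote
    update (e.g. DDNS) and returns the parsed JSON reply as a dict.
    It may raise OSError (network down) or ValueError (bad reply).
    """
    # The local install step is assumed to succeed here; a real
    # implementation would do the actual work before this point.
    status = {"module": name, "installed": True, "ddns_updated": False}
    try:
        reply = notify_remote()
    except (OSError, ValueError) as exc:
        status["warning"] = f"remote update deferred: {exc}"
        return status
    if reply.get("result"):
        status["ddns_updated"] = True
    else:
        # Remote answered but refused, e.g. "under maintenance".
        message = reply.get("message", "remote reported failure")
        status["warning"] = f"remote update deferred: {message}"
    return status
```

With this shape, a maintenance reply like the one in post #12 would leave the module installed and surface the message as a warning that can be retried later.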


#20

This seems like a bug that needs immediate attention. A module update shouldn’t hose the system ever, particularly when there’s nothing available to update.