Endpoint registration failure traced to config problem (no port)

15.0.27
Distro 12.7.8-2203.sng7
Asterisk 16.29.1

All modules updated yesterday to current.

A user called to report two extensions offline. The AORs showed no IPs for the devices.
However, the devices were online with IPs.
Rebooted handsets (Sangoma S500)
Removed one from Endpoint Manager and re-added.
Reset it to defaults.
Still no registration.
Checked config on device. “Primary SIP Server” showed 10.74.1.250:
Correct IP, but no port specified, so yes, no way to register.
Other affected extension was the same.
I manually added the PJSIP port to the Primary SIP Server setting, saved, and rebooted the devices, and they registered.
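For anyone who wants to test a server string for this defect, here is a minimal Python sketch. The function name `has_explicit_port` is made up for illustration and is not part of EPM or the phone firmware. It assumes the setting is meant to be a plain `host:port` string, and it treats a bare host with no colon as "no explicit port", even though a phone might fall back to a default port in that case.

```python
def has_explicit_port(server: str) -> bool:
    """Return True if 'host:port' carries a numeric port in the range 1..65535."""
    host, sep, port = server.rpartition(":")
    if not sep or not host:
        # No colon at all, or nothing in front of the colon.
        return False
    if not (port.isascii() and port.isdigit()):
        # Covers the trailing-colon case, where the port part is empty.
        return False
    return 1 <= int(port) <= 65535


# The two values seen on the handsets in this thread.
for value in ("10.74.1.250:", "10.74.1.250:5061"):
    print(value, "->", "ok" if has_explicit_port(value) else "missing port")
```

The trailing-colon value fails because the part after the colon is empty, which matches what the affected handsets showed.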

So the immediate problem is resolved, but it has left me wondering what happened, why just two devices were affected, and how this could have persisted after removing the device from EPM and re-adding it.

Anyone experience an issue like this?

The SIP ports are pulled from Settings → Asterisk SIP Settings, so that’s the first place to check.

I should have added that I did that. The odd thing is that it’s only two devices that had the issue. I’ll pull a couple config files from the server and see if they look right.

Pertinent section of affected extension config file:

<hl_provision version="1">
  <config version="1">
    <P271>1</P271>
    <P47>10.74.1.250:</P47>

Pertinent section of non-affected extension:
<hl_provision version="1">
  <config version="1">
    <P271>1</P271>
    <P47>10.74.1.250:5061</P47>

The affected extension was removed from EPM and re-added, so this is somehow borked in the extension setup itself, not how the file is getting built by EPM.

Edit to add: Both used the same EPM template, both are S500s, both are set to use PJSIP.

I renamed the bad XML file and rebuilt it with EPM. New file has the same defect.
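To check the other generated files after a rebuild without opening each one by hand, something like the following Python sketch could be used. It is only a rough helper, not anything EPM provides: the function names are hypothetical, the directory you pass in is whatever location your system writes the phone configs to (an assumption you need to fill in yourself), and it uses a regex rather than an XML parser so it also works on partial excerpts like the ones above.

```python
import re
from pathlib import Path
from typing import Dict, List

# Value of every <P47> element (the Primary SIP Server setting shown above).
P47_RE = re.compile(r"<P47>\s*([^<]*?)\s*</P47>")
# A usable value is assumed to look like host:port with a numeric port.
HOST_PORT_RE = re.compile(r"[^:\s]+:\d{1,5}")


def find_bad_p47(text: str) -> List[str]:
    """Return every P47 value in the text that lacks a numeric port."""
    return [v for v in P47_RE.findall(text) if not HOST_PORT_RE.fullmatch(v)]


def scan_configs(directory: str, pattern: str = "*.xml") -> Dict[str, List[str]]:
    """Map file name to offending P47 values for matching files in directory."""
    report = {}
    for path in sorted(Path(directory).glob(pattern)):
        bad = find_bad_p47(path.read_text(errors="replace"))
        if bad:
            report[path.name] = bad
    return report


# The two excerpts from this thread, checked as plain strings.
affected = "<P271>1</P271>\n<P47>10.74.1.250:</P47>"
unaffected = "<P271>1</P271>\n<P47>10.74.1.250:5061</P47>"
print("affected:", find_bad_p47(affected))
print("unaffected:", find_bad_p47(unaffected))
```

Calling `scan_configs()` with your provisioning directory returns only the files that contain the defect, which makes it easy to confirm whether a rebuild changed anything.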

We have seen this in the past, but I didn’t dig this deep before we deleted and recreated the extension, which fixed it. That required the user to reconfigure their voicemail, etc. so we’d like to know how to properly fix it.

I recommend opening a support ticket if you haven’t yet done so.

Thanks, Lorne. I’m not sure I have the will to do the debate with support about whether this is an EPM Pro issue or a bug, or requires support credits. I’ve had several of those and they are exhausting.

I don’t mind buying support credits, but I do mind paying hundreds of $$ for support credits when most of them expire before we need to use them. Expiring support credits is my biggest thorn with FreePBX/Sangoma.

You don’t need (and have never needed) support credits for commercial module bugs when they are within maintenance nor for hardware phones that are not EOL.

Not wanting to hijack the thread but I wanted to second this. It’s tough forking over a few hundred for support (unrelated to stuff that you don’t need to buy support credits for) only to have them expire because I really don’t need to utilize support for 99% of issues that do come up as we are able to handle them on our own.

It just doesn’t feel good to have that happen to you and for sure goes into the calculation when trying to decide to spend money on new credits because we are hitting an issue that we just can’t figure out.

Yes, understood. And in the past there has been great debate about whether the problem is in a commercial module or a bug, or whether it’s in a non-commercial module.

In one case I spent months trying to get a bug acknowledged to great upset, including me posting bug reports and having them deleted. I was ultimately told by an exec that “…you’ll be f…(effed) if you ever need support again.”

A year or two later that same issue was accepted as a bug and fixed.

That’s the most severe instance, but far from the only time I’ve been told “it’s not a bug” or “this isn’t a problem with a commercial module.”

@lgaetz, if your sense is that this is an EPM Pro issue, I will open a ticket.

Appreciate the open back and forth here.

If EPM is writing that line to an xml file, it’s an EPM bug. It is likely caused by some unusual configuration on your setup, otherwise it would be a widespread issue, but EPM needs to handle it more gracefully than blindly writing an invalid host.

Case #00972244

Will post back with resolution.

So a week later and support has agreed to submit a bug report, but told me “If they are not able to reproduce it locally they may not accept it.”

I told him that I’m not going to engage in (another) debate about what’s a bug and what’s not, and that if they want to push back, they should contact you, @lgaetz. I have zero energy to invest in yet another argument in this area.

Support also said “this happens sometimes when the extension has been restored from a backup”. This is a system that was upgraded using the restore backup process.

Bottom line is that we need to know how to fix this without deleting and recreating the extension. I told support this.

Nearly three weeks after opening the case they still don’t think it’s a bug. So, so frustrating.

Since engineering can’t reproduce it and nobody else is reporting it, this is what the process gets reduced to.

Advised by support to upgrade to EPM Pro 15.0.57.1 (edge)
fwconsole ma downloadinstall endpoint --tag 15.0.57.1
Done.
Select all extensions, rebuild configs.
Delete config file for affected extension.
Rebuild.
Check config for affected extension. No change; port still missing.

I also found this post. I’m not sure it’s related, but this is a problem we’ve seen on two separate systems after a backup/restore upgrade, just like this person: