FreePBX goes down once a week

That’s pretty funny. Must be a development site or something.

I was looking here:
(http://www.freepbxhosting.com/virtual-private-server/)

Sorry Paul, I somehow missed the 2nd half of your post.

Here is some info, just for posterity:
lscpu:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Stepping:              4
CPU MHz:               2500.078
BogoMIPS:              5000.15
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0

lsusb:

unable to initialize libusb: -99

I read somewhere that DAHDI can get timing from the USB driver in the absence of a real hardware telephony adapter. Perhaps this is contributing to the problem?

lsblk:

NAME  MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvde1 202:65   0  10G  0 disk /
xvdf  202:80   0   4G  0 disk 

df -H

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvde1       11G  3.7G  6.4G  37% /
none            2.0G     0  2.0G   0% /dev/shm

free -m

             total       used       free     shared    buffers     cached
Mem:          3758       2292       1466          0        205       1370
-/+ buffers/cache:        715       3043
Swap:            0          0          0

…perhaps it’s running out of RAM? There’s no swap partition.
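
One way to check the RAM theory, sketched in shell. The first function recomputes the memory actually used by applications from old-style `free -m` output (used minus buffers minus cached), and the second filters a kernel log for lines that look like OOM-killer activity. Both function names are made up for illustration, the column layout is assumed to match the `free` output posted above, and the log patterns are assumptions about typical kernel wording rather than a complete list.

```shell
# Hypothetical helper: memory used by applications, in MB, from old-style
# `free -m` output on stdin (columns: total used free shared buffers cached).
app_used_mb() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Hypothetical helper: print kernel log lines on stdin that look like
# OOM-killer activity. The patterns are assumptions, not a complete list.
oom_lines() {
    grep -Ei 'out of memory|oom-killer|killed process'
}
```

Typical use would be `free -m | app_used_mb` and `dmesg | oom_lines`. Fed the numbers posted above, the first gives 717, close to the 715 on the "-/+ buffers/cache" line (the small gap is presumably rounding). If the second prints nothing after a hang, memory exhaustion becomes a less likely suspect.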

cat /proc/version

Linux version 2.6.32-431.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Nov 22 03:15:09 UTC 2013

OK, thanks wes.

Ensure that you’re running the ztdummy module. Critical.

Memory should be good.

Is ztdummy still the one, or is it dahdi_dummy now?

Oops, old habits die hard! Yep.
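
A quick way to see whether a timing module is loaded, as a minimal sketch. The helper name is hypothetical; it just looks for an exact module name in the first column of `lsmod`-style output. Which module name applies depends on the DAHDI version installed, so checking both `dahdi_dummy` and `dahdi` is a reasonable assumption rather than a rule.

```shell
# Hypothetical helper: succeed if the named kernel module appears in
# `lsmod`-style output on stdin (the module name is the first column).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

For example: `lsmod | module_loaded dahdi_dummy && echo "dahdi_dummy is loaded"`, and the same again with `dahdi`.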

Right, I remember. You’ll need to set the affinity bits on the VM and lock all cores down to a dedicated CPU. That was the other trick for Linux.

Oddly enough, I have a few small installs running on Microsoft’s hypervisor and they work extremely well. No complaints yet.

Can you point me to a resource on this? That sounds a bit over my head…

Wes,

I think you’re using Xen, so see http://wiki.xen.org/wiki/Tuning_Xen_for_Performance#vCPU_Pinning_for_guests
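
For reference, vCPU pinning with the `xl` toolstack looks roughly like the sketch below. The helper only prints the commands so they can be reviewed first; its name and the example domain name are hypothetical. The commands themselves have to be run on the Xen host (dom0), not inside the guest.

```shell
# Hypothetical helper: print (not run) the Xen toolstack commands that would
# pin one guest vCPU to one physical CPU and then show the result.
# The printed commands are meant to be run in dom0 on the Xen host.
print_pin_cmds() {
    domain=$1   # guest domain name or ID
    vcpu=$2     # vCPU number inside the guest
    pcpu=$3     # physical CPU to pin it to
    printf 'xl vcpu-pin %s %s %s\n' "$domain" "$vcpu" "$pcpu"
    printf 'xl vcpu-list %s\n' "$domain"
}
```

For example, `print_pin_cmds pbx-guest 0 2` prints `xl vcpu-pin pbx-guest 0 2` followed by `xl vcpu-list pbx-guest`, where `pbx-guest` is a made-up domain name.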

It’s on Amazon Web Services; is that end-user tunable?

This sounds like a typical Asterisk deadlock. Unfortunately, debugging those really needs an Asterisk binary built for debugging, which pretty much means you must have compiled it yourself.

You will want optimisation disabled (minor performance penalty) and thread debugging enabled (non-trivial performance penalty, but it will still support medium workloads). You can then connect to the CLI, run “core show locks” and look for circular chains of locks. You can also force a crash dump to see what is waiting for a lock. Getting a dump doesn’t need thread debugging.
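
The steps above could be sketched as the shell functions below. This is an outline under assumptions, not a tested procedure for any particular Asterisk version: it assumes an unpacked Asterisk source tree, uses the menuselect options `DONT_OPTIMIZE` and `DEBUG_THREADS`, and uses `gcore` (shipped with gdb) to write a core file without killing the process. The function names and the `/tmp/asterisk-core` path are made up for illustration. By default every command is only printed; set `DRY_RUN=0` to actually run them.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: rebuild Asterisk with optimisation off and thread
# debugging on. Assumes the current directory is an Asterisk source tree.
build_debug_asterisk() {
    run ./configure
    run make menuselect.makeopts
    run menuselect/menuselect --enable DONT_OPTIMIZE --enable DEBUG_THREADS menuselect.makeopts
    run make
    run make install
}

# Hypothetical helper: list held locks, then dump a core file of the
# running Asterisk process. $1 is the Asterisk process ID.
inspect_deadlock() {
    run asterisk -rx "core show locks"
    run gcore -o /tmp/asterisk-core "$1"
}
```

When the system next hangs, something like `DRY_RUN=0 inspect_deadlock "$(pidof asterisk)"` would capture the lock list and a core file for later inspection in gdb.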

I’d say the underlying platform is the issue.

Are you able to compile your own build of Asterisk?

Sorry Paul, I left this browser tab open so I didn’t get an email notification that you responded.

I’m sure I could compile Asterisk.

How would that affect FreePBX? Is there a lot of configuration involved in changing Asterisk builds?

Wes,

What’s the latest with this? Did you resolve it?

No, not yet.

I was going to try a few things.

1) Ask the PIAF guys how they have a stable AWS image and what’s different from my config.
2) Look at changing AMI kernels as described here: http://stackoverflow.com/questions/20738635/asterisk-11-on-amazon-ec2-instance-to-handle-100-concurrent-calls
3) Recompile Asterisk, as you recommend. Any special flags/options I should use?
4) Scrap it and move to freepbxhosting.com.
5) Move it back to a server in my office. We’re about to have gigabit fiber! (Sorry to rub it in.)

-Wes