How can I flush call state from the astdb? Is there a CLI command for that?

I am having a problem where millions of tasks are queued up and then the PBX stops responding. Even without registered endpoints or calls it takes hours for the queue to empty itself.
While I migrate to different hardware, I am having to restart Asterisk and/or the complete machine several times per day.
Is there a command I could type to force-flush the queue and all other call state data?
Thank you.

Does anybody have a suggestion for what to try? We are now having to restart every 10-20 minutes and we only have about 20 active calls.
Thanks.

You could run a fwconsole restart command as a cron job, to run automatically every x minutes, but that will take everything down. You could do something like this instead, but be more selective and restart only the Asterisk service.

https://wiki.asterisk.org/wiki/display/AST/Stopping+and+Restarting+Asterisk+From+The+CLI
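
A narrower alternative to a scheduled fwconsole restart is to ask only Asterisk to restart itself from its own CLI, as the page above describes. The sketch below is an illustration and a workaround, not a fix: the function name `restart_asterisk_only`, the `ASTERISK_BIN` override and the binary path in the sample crontab line are assumptions. `core restart gracefully` stops accepting new calls and restarts once active calls have ended.

```shell
#!/bin/sh
# Ask the running Asterisk process to restart itself, rather than
# restarting everything that "fwconsole restart" touches.
# ASTERISK_BIN is a hypothetical override so the function can be dry-run.
restart_asterisk_only() {
    "${ASTERISK_BIN:-asterisk}" -rx "core restart gracefully"
}

# Equivalent crontab entry (assumed binary path), every 30 minutes:
#   */30 * * * * /usr/sbin/asterisk -rx "core restart gracefully"
```

Use `core restart now` instead only if dropping the calls in progress is acceptable.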

comtech, that’s what we are doing now, but it doesn’t fully flush everything, nor does a full machine restart. It helps our problem, but only temporarily.
Thank you for the suggestion though.

Anybody? Is this possible? Once the cache is full, it takes hours to recover. This error state survives hangup requests, fwconsole restarts and machine restarts.

If you know SQL syntax, then

rasterisk -x 'database query "(your sql query)"'

allows you to drill down and remove bad keys and values. A hammer solution, with reference to:

https://wiki.asterisk.org/wiki/display/AST/SQLite3+astdb+back-end

is to delete the sqlite3 database. It will be recreated on Asterisk startup, then populated by a FreePBX reload; you will just lose any ad-hoc call forwarding/extension state and any blacklist.

The trouble with that is: what is causing your problem? If you don’t fix it, it will just reappear.
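
Before deleting anything, it may help to see which astdb families hold the most keys. The sketch below counts keys per top-level family from the output of `database show`. It assumes the usual `/family/key : value` line layout of that command; the function name `astdb_family_counts` and the `ASTERISK_BIN` override are illustrative and not part of Asterisk.

```shell
#!/bin/sh
# Count astdb keys per top-level family, largest first.
# Assumes each data line of "database show" starts with "/family/...".
# ASTERISK_BIN is a hypothetical override so the function can be dry-run.
astdb_family_counts() {
    "${ASTERISK_BIN:-asterisk}" -rx "database show" |
        awk '/^\// { split($1, part, "/"); count[part[2]]++ }
             END   { for (f in count) print count[f], f }' |
        sort -rn
}

# Possible follow-ups once a suspicious family is found
# (FAMILY is a placeholder, not a real family name):
#   asterisk -rx "database show FAMILY"
#   asterisk -rx "database deltree FAMILY"
```

`database query` expects the whole SQL statement as one quoted argument. The SQLite3 back-end is understood to keep its data in a single table named `astdb` with `key` and `value` columns, but check that against your own file before running anything destructive.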

Dicko, thanks, maybe I should try that. Can you tell me how to delete the DB completely?
I don’t know what happened. It was a new deployment, and it worked fine for over a month. Suddenly, on a Friday a week and a half ago, things started getting wonky: lots of queued-up tasks on the taskprocessor, SIP errors popping up on the CLI, then people stopped receiving calls or being able to make new ones. A system reboot solved it until the following Monday. On Monday the same thing happened twice. It got worse and worse by the day. Today we had to do fwconsole restarts every 20-30 minutes.
It’s a Xen Server virtual machine with 4 processors and 16GB of RAM for only about 100 registered extensions and about 40 active calls during peak hours.
The only thing I’ve noticed is that the period during which it works fine is longest in the morning, as if some cache is getting filled up and is not clearing fast enough, but clears overnight. In the morning things work fine for a few hours and then all the same problems return.
Yesterday, under the assumption that the problem was at a lower level (hardware or system installation), we installed a new machine with the latest FreePBX distro download, then backed up the current machine and restored it onto the new installation. When we rebooted, even though it had a different IP and all trunks were disabled, it still showed queued-up activity that had transferred over from the old machine, and it took hours for it to clean itself up. So maybe it is a DB problem? What exactly should I delete to test this? Very few people have forwarding set up and nobody has blacklists.
THANK YOU!

dicko, should I simply delete (or temporarily rename) this?
utils/astdb2bdb /var/lib/asterisk/astdb.sqlite3

(at whichever PATH it is located in my machine)

is that what you meant?

yes, delete /var/lib/asterisk/astdb.sqlite3 to hammer it :slight_smile:
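
Renaming instead of deleting keeps a copy to fall back on. A minimal sketch of that step, assuming Asterisk has already been stopped; the function name `set_aside_astdb` and the timestamped backup name are made up for illustration, and the default path is the one mentioned above.

```shell
#!/bin/sh
# Move the astdb file aside under a timestamped name and print the new path.
# Run this only while Asterisk is stopped, for example:
#   fwconsole stop
#   set_aside_astdb
#   fwconsole start
#   fwconsole reload
set_aside_astdb() {
    db="${1:-/var/lib/asterisk/astdb.sqlite3}"
    if [ ! -f "$db" ]; then
        echo "astdb file not found: $db" >&2
        return 1
    fi
    backup="$db.$(date +%Y%m%d-%H%M%S).bak"
    mv -- "$db" "$backup" && printf '%s\n' "$backup"
}
```

The function refuses to do anything if the file is missing, so a wrong path does not silently look like a successful reset.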

will try it first thing in the morning. thank you!

dicko - genius-level user that you are (absolutely NO irony; I’ve read your posts and your first-hand experience is formidable).

Both amcoit and I have similar problems which suddenly started for no discernible reason. We regularly see a 300-400% breach of the high water level for some taskprocessors. In some instances (amcoit is here now), after the PBX has been running for a short while it becomes completely unresponsive to SIP traffic. Incoming and outgoing calls get various unavailable signals and NOTHING (literally nothing!) is logged in any of the logfiles.

Our separate posts on the subject are here:-

If only someone could give some idea of what the problematic taskprocessors are actually responsible for, it would help :frowning:
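
The processor names often hint at which subsystem they serve, so a first step is to list the ones that have breached their high-water mark. A minimal sketch using `core show taskprocessors`: it assumes the six-column layout (Processor, Processed, In Queue, Max Depth, Low water, High water) and processor names without spaces, and the function name `overloaded_taskprocessors` and the `ASTERISK_BIN` override are hypothetical.

```shell
#!/bin/sh
# Print "name max_depth high_water" for every task processor whose
# maximum queue depth has exceeded its high-water mark.
# Assumes six whitespace-separated columns per data row; header and
# footer lines are skipped because they do not match that shape.
# ASTERISK_BIN is a hypothetical override so the function can be dry-run.
overloaded_taskprocessors() {
    "${ASTERISK_BIN:-asterisk}" -rx "core show taskprocessors" |
        awk 'NF == 6 && $2 ~ /^[0-9]+$/ && $4 + 0 > $6 + 0 { print $1, $4, $6 }'
}
```

Comparing Max Depth rather than In Queue catches processors that overflowed earlier and have since drained.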

The one commonality I notice with this problem is ‘virtualization’, perhaps Xen and VMware more often than containers or KVM.

One would surmise this to be an Asterisk ‘problem’, not a FreePBX one. As such, I would recompile Asterisk from source against the running kernel, which is not an easy option if you are distro based though. This should eliminate a couple of possible causes.

I’m not 100% certain that amcoit is virtualised, unless I missed it in his post?

And although it’s not very scientific testing, I did rebuild my FreePBX installation on a quad core i7 and still received regular warnings about taskprocessor limits being exceeded once I had restored my original configuration.

I have re-installed (and re-configured manually - subject to copying moh, voicemail etc) the installation on a new machine - I’m just waiting for an opportunity to switch over to the new install and see if the issue goes away - the new install isn’t virtual and is on a very overspecced machine.

Sadly, I lack the ability and knowledge to recompile the source - and this is a production system :frowning:

I tend to agree that this is probably an Asterisk thing, not a distro one. I just wish I could understand what the taskprocessors are responsible for so I could narrow down the issue.

Again, you have.

“once I had restored my original configuration.”

That points to the sqlite3 database, as the rest of the restore is just dialplan and voicemail, HTML and MySQL tables and spooled files, none of which are likely culprits. If it is the astdb, then the question remains: what is causing that?

dicko - good call. On the new machine, I didn’t restore any backup but reconfigured manually. I’ll see what happens :slight_smile:

Dicko,

I tried deleting the sqlite DB. Here are the results:

    fwconsole stop
    mv astdb.sqlite3 astdb.sqlite3.bak
    fwconsole start
    fwconsole reload

The new astdb.sqlite3 DB was created and populated.
All extensions immediately registered and were able to make outgoing external calls. BUT nobody was able to receive calls, neither calls coming in from the trunks nor internal ext-to-ext calls. All attempts gave the “the person you are calling is unavailable” message and were sent to VM. I tried toggling DND from the extension, and though the voice response was correct (“DND activated”, “DND deactivated”), nothing changed (it still says “extension unavailable” when called). If I went to the FreePBX extensions page and searched for this extension, the little checkboxes that show the status of DND (and other features) were all unchecked, as if the DND status is not recorded.
Here is the pjsip show endpoint output for the extension I am testing with. This extension, using Zoiper, shows as registered (in Zoiper) and can successfully make outgoing calls.

    pjsip show endpoint 7292
    Endpoint:  <Endpoint/CID.....................................>  <State.....>  <Channels.>
        I/OAuth:  <AuthId/UserName...........................................................>
            Aor:  <Aor............................................>  <MaxContact>
        Contact:  <Aor/ContactUri..........................> <Hash....> <Status> <RTT(ms)..>
    Transport:  <TransportId........>  <Type>  <cos>  <tos>  <BindAddress..................>
    Identify:  <Identify/Endpoint.........................................................>
            Match:  <criteria.........................>
        Channel:  <ChannelId......................................>  <State.....>  <Time.....>
            Exten: <DialedExten...........>  CLCID: <ConnectedLineCID.......>
    ==========================================================================================

    Endpoint:  7292/7292                                            Unavailable   0 of inf
        InAuth:  7292-auth/7292
            Aor:  7292                                               2


    ParameterName                      : ParameterValue
    ===========================================================
    100rel                             : yes
    accept_multiple_sdp_answers        : false
    accountcode                        : 
    acl                                : 
    aggregate_mwi                      : true
    allow                              : (g729|ulaw|alaw|gsm)
    allow_overlap                      : true
    allow_subscribe                    : true
    allow_transfer                     : true
    aors                               : 7292
    asymmetric_rtp_codec               : false
    auth                               : 7292-auth
    bind_rtp_to_media_address          : false
    call_group                         : 
    callerid                           : "MyName" <7292>
    callerid_privacy                   : allowed_not_screened
    callerid_tag                       : 
    connected_line_method              : invite
    contact_acl                        : 
    context                            : from-internal
    cos_audio                          : 5
    cos_video                          : 4
    device_state_busy_at               : 0
    direct_media                       : true
    direct_media_glare_mitigation      : none
    direct_media_method                : invite
    disable_direct_media_on_nat        : false
    dtls_ca_file                       : 
    dtls_ca_path                       : 
    dtls_cert_file                     : 
    dtls_cipher                        : 
    dtls_fingerprint                   : SHA-256
    dtls_private_key                   : 
    dtls_rekey                         : 0
    dtls_setup                         : active
    dtls_verify                        : No
    dtmf_mode                          : rfc4733
    fax_detect                         : false
    fax_detect_timeout                 : 0
    follow_early_media_fork            : true
    force_avp                          : false
    force_rport                        : true
    from_domain                        : 
    from_user                          : 
    g726_non_standard                  : false
    ice_support                        : false
    identify_by                        : username,ip
    inband_progress                    : false
    incoming_mwi_mailbox               : 
    language                           : es_MX
    mailboxes                          : 7292@device
    media_address                      : 
    media_encryption                   : no
    media_encryption_optimistic        : false
    media_use_received_transport       : false
    message_context                    : 
    moh_suggest                        : default
    mwi_from_user                      : 
    mwi_subscribe_replaces_unsolicited : true
    named_call_group                   : 
    named_pickup_group                 : 
    notify_early_inuse_ringing         : false
    one_touch_recording                : true
    outbound_auth                      : 
    outbound_proxy                     : 
    pickup_group                       : 
    record_off_feature                 : apprecord
    record_on_feature                  : apprecord
    refer_blind_progress               : true
    rewrite_contact                    : true
    rpid_immediate                     : false
    rtcp_mux                           : false
    rtp_engine                         : asterisk
    rtp_ipv6                           : false
    rtp_keepalive                      : 0
    rtp_symmetric                      : true
    rtp_timeout                        : 0
    rtp_timeout_hold                   : 0
    sdp_owner                          : -
    sdp_session                        : Asterisk
    send_diversion                     : true
    send_pai                           : true
    send_rpid                          : false
    set_var                            : 
    srtp_tag_32                        : false
    sub_min_expiry                     : 0
    subscribe_context                  : 
    t38_udptl                          : false
    t38_udptl_ec                       : none
    t38_udptl_ipv6                     : false
    t38_udptl_maxdatagram              : 0
    t38_udptl_nat                      : false
    timers                             : yes
    timers_min_se                      : 90
    timers_sess_expires                : 1800
    tone_zone                          : 
    tos_audio                          : 184
    tos_video                          : 136
    transport                          : 
    trust_id_inbound                   : true
    trust_id_outbound                  : false
    use_avpf                           : false
    use_ptime                          : false
    user_eq_phone                      : false
    voicemail_extension                : 

In the case of Queues, even though all static agents were registered (I checked all pjsip endpoints), nobody was receiving any calls. The CLI was flooded with these messages:

[2018-08-22 08:37:28] ERROR[1760][C-0000003f]: res_pjsip_header_funcs.c:454 func_read_header: This function requires a PJSIP channel.
[2018-08-22 08:37:28] WARNING[1783][C-0000003f]: chan_sip.c:22996 func_header_read: This function can only be used on SIP channels.

Which I assume is the system attempting to reach each agent in round robin.

Here is the last section of what the CLI shows for a test we did with an external call coming in to a single extension:

   Goto (macro-user-callerid,s,16)
-- Executing [s@macro-user-callerid:16] NoOp("PJSIP/amco-cc-MyCarrier-0000000f", "Macro Depth is 3") in new stack
-- Executing [s@macro-user-callerid:17] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?report2:macroerror") in new stack
-- Goto (macro-user-callerid,s,18)
-- Executing [s@macro-user-callerid:18] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?continue") in new stack
-- Goto (macro-user-callerid,s,37)
-- Executing [s@macro-user-callerid:37] Set("PJSIP/amco-cc-MyCarrier-0000000f", "CALLERID(number)=3418653391") in new stack
-- Executing [s@macro-user-callerid:38] Set("PJSIP/amco-cc-MyCarrier-0000000f", "CALLERID(name)=3418653391") in new stack
-- Executing [s@macro-user-callerid:39] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "0?cnum") in new stack
-- Executing [s@macro-user-callerid:40] Set("PJSIP/amco-cc-MyCarrier-0000000f", "CDR(cnam)=3418653391") in new stack
-- Executing [s@macro-user-callerid:41] Set("PJSIP/amco-cc-MyCarrier-0000000f", "CDR(cnum)=3418653391") in new stack
-- Executing [s@macro-user-callerid:42] Set("PJSIP/amco-cc-MyCarrier-0000000f", "CHANNEL(language)=es_MX") in new stack
-- Executing [s@macro-user-callerid:43] GosubIf("PJSIP/amco-cc-MyCarrier-0000000f", "0?app-check-classofservce,s,1()") in new stack
-- Executing [s@macro-vm:2] Set("PJSIP/amco-cc-MyCarrier-0000000f", "VMGAIN=") in new stack
-- Executing [s@macro-vm:3] Macro("PJSIP/amco-cc-MyCarrier-0000000f", "blkvm-check,") in new stack
-- Executing [s@macro-blkvm-check:1] Set("PJSIP/amco-cc-MyCarrier-0000000f", "GOSUB_RETVAL=") in new stack
-- Executing [s@macro-blkvm-check:2] ExecIf("PJSIP/amco-cc-MyCarrier-0000000f", "0?Set(GOSUB_RETVAL=TRUE)") in new stack
-- Executing [s@macro-blkvm-check:3] MacroExit("PJSIP/amco-cc-MyCarrier-0000000f", "") in new stack
-- Executing [s@macro-vm:4] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?vmx,1") in new stack
-- Goto (macro-vm,vmx,1)
-- Executing [vmx@macro-vm:1] Set("PJSIP/amco-cc-MyCarrier-0000000f", "MEXTEN=7292") in new stack
-- Executing [vmx@macro-vm:2] Set("PJSIP/amco-cc-MyCarrier-0000000f", "MMODE=NOANSWER") in new stack
-- Executing [vmx@macro-vm:3] Set("PJSIP/amco-cc-MyCarrier-0000000f", "RETVM=") in new stack
-- Executing [vmx@macro-vm:4] Set("PJSIP/amco-cc-MyCarrier-0000000f", "MODE=unavail") in new stack
-- Executing [vmx@macro-vm:5] Macro("PJSIP/amco-cc-MyCarrier-0000000f", "get-vmcontext,7292") in new stack
-- Executing [s@macro-get-vmcontext:1] Set("PJSIP/amco-cc-MyCarrier-0000000f", "VMCONTEXT=") in new stack
-- Executing [s@macro-get-vmcontext:2] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?200:300") in new stack
-- Goto (macro-get-vmcontext,s,200)
-- Executing [s@macro-get-vmcontext:200] Set("PJSIP/amco-cc-MyCarrier-0000000f", "VMCONTEXT=default") in new stack
-- Executing [vmx@macro-vm:6] Set("PJSIP/amco-cc-MyCarrier-0000000f", "MODE=unavail") in new stack
-- Executing [vmx@macro-vm:7] NoOp("PJSIP/amco-cc-MyCarrier-0000000f", "MODE IS: unavail") in new stack
-- Executing [vmx@macro-vm:8] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?chknomsg") in new stack
-- Goto (macro-vm,vmx,10)
-- Executing [vmx@macro-vm:10] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "0?s-NOANSWER,1") in new stack
-- Executing [vmx@macro-vm:11] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?notdirect") in new stack
-- Goto (macro-vm,vmx,13)
-- Executing [vmx@macro-vm:13] NoOp("PJSIP/amco-cc-MyCarrier-0000000f", "Checking if ext 7292 is enabled: ") in new stack
-- Executing [vmx@macro-vm:14] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?s-NOANSWER,1") in new stack
-- Goto (macro-vm,s-NOANSWER,1)
-- Executing [s-NOANSWER@macro-vm:1] Macro("PJSIP/amco-cc-MyCarrier-0000000f", "get-vmcontext,7292") in new stack
-- Executing [s@macro-get-vmcontext:1] Set("PJSIP/amco-cc-MyCarrier-0000000f", "VMCONTEXT=") in new stack
-- Executing [s@macro-get-vmcontext:2] GotoIf("PJSIP/amco-cc-MyCarrier-0000000f", "1?200:300") in new stack
-- Goto (macro-get-vmcontext,s,200)
-- Executing [s@macro-get-vmcontext:200] Set("PJSIP/amco-cc-MyCarrier-0000000f", "VMCONTEXT=default") in new stack
-- Executing [s-NOANSWER@macro-vm:2] VoiceMail("PJSIP/amco-cc-MyCarrier-0000000f", "7292@default,u") in new stack
-- <PJSIP/amco-cc-MyCarrier-0000000f> Playing 'vm-theperson.g729' (language 'es_MX')
-- <Local/7225@from-queue-00000f08;2>AGI Script attendedtransfer-rec-restart.php completed, returning 0
-- Executing [s@macro-hangupcall:6] Hangup("Local/7225@from-queue-00000f08;2", "") in new stack
== Spawn extension (macro-hangupcall, s, 6) exited non-zero on 'Local/7225@from-queue-00000f08;2' in macro 'hangupcall'
== Spawn extension (ext-local, h, 1) exited non-zero on 'Local/7225@from-queue-00000f08;2'
-- Nobody picked up in 1000 ms
--     -- LazyMembers debugging - Numbusies: 0, Nummems: 17
cos.agi: Starting Class Of Service checks
cos.agi: Detected EXTERNAL Call. Skipping CoS Checks
cos.agi: Starting Class Of Service checks
cos.agi: Detected EXTERNAL Call. Skipping CoS Checks
-- <Local/7238@from-queue-00000f0a;2>AGI Script cos.agi completed, returning 0

any and all help is welcome
thank you!

PS.- Jon, yes, we are virtualized. Xen Server 4 cores, 16GB.

Did you reload FreePBX?

I restarted and reloaded Asterisk and also restarted the machine. No difference, all extensions unavailable. Even weirder, while Asterisk was stopped I restored the previous (theoretically broken) astdb.sqlite3 file and I now have the exact same issue with that one, i.e. all extensions are Unavailable.
I assumed this part of the issue might be related to https://community.asterisk.org/t/reload-endpoint-runtime/72156/5 but I did restart.
Is there anything else you can think of that I should try?

thank you

Go back to your last known good backup?

we are doing that as we speak.

Something we just realized: on the failing system, with all extensions showing as unavailable, if we go one by one and press “Submit” followed by Apply (with no changes made), the extension is able to receive phone calls right after the Apply. Does that tell you anything? Is there any way to do this to all extensions at once?
thanks!

Edit: just to clarify, this is on the system where we deleted astdb and had it recreated by asterisk.