OOM Killer Invoked


(Matt) #1
Nov 27 07:12:54 staff kernel: [55091.128660] node /var/www/h invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0, order=0, oom_score_adj=0
Nov 27 07:12:54 staff kernel: [55091.128663] node /var/www/h cpuset=/ mems_allowed=0
Nov 27 07:12:54 staff kernel: [55091.128673] CPU: 10 PID: 26582 Comm: node /var/www/h Tainted: G           O    4.9.0-14-amd64 #1 Debian 4.9.240-2
Nov 27 07:12:54 staff kernel: [55091.128674] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Nov 27 07:12:54 staff kernel: [55091.128678]  0000000000000000 ffffffffbb337e7e ffffa99e45fdbc10 ffff91ae0678b080
Nov 27 07:12:54 staff kernel: [55091.128683]  ffffffffbb20a92b 0000000000000000 0000000000000000 ffffa99e45fdbc90
Nov 27 07:12:54 staff kernel: [55091.128686]  ffff91ae726fcce0 ffffffffbb18dd67 0000004200000000 0000000000000001
Nov 27 07:12:54 staff kernel: [55091.128690] Call Trace:
Nov 27 07:12:54 staff kernel: [55091.128721]  [<ffffffffbb337e7e>] ? dump_stack+0x66/0x88
Nov 27 07:12:54 staff kernel: [55091.128732]  [<ffffffffbb20a92b>] ? dump_header+0x78/0x1fd
Nov 27 07:12:54 staff kernel: [55091.128737]  [<ffffffffbb18dd67>] ? get_page_from_freelist+0x3f7/0xb20
Nov 27 07:12:54 staff kernel: [55091.128740]  [<ffffffffbb1899fa>] ? oom_kill_process+0x22a/0x3f0
Nov 27 07:12:54 staff kernel: [55091.128743]  [<ffffffffbb189e91>] ? out_of_memory+0x111/0x470
Nov 27 07:12:54 staff kernel: [55091.128746]  [<ffffffffbb18f24f>] ? __alloc_pages_slowpath+0xa1f/0xb30
Nov 27 07:12:54 staff kernel: [55091.128750]  [<ffffffffbb18f561>] ? __alloc_pages_nodemask+0x201/0x260
Nov 27 07:12:54 staff kernel: [55091.128754]  [<ffffffffbb1e1221>] ? alloc_pages_current+0x91/0x140
Nov 27 07:12:54 staff kernel: [55091.128756]  [<ffffffffbb187e27>] ? filemap_fault+0x327/0x5d0
Nov 27 07:12:54 staff kernel: [55091.128812]  [<ffffffffc04410e1>] ? ext4_filemap_fault+0x31/0x50 [ext4]
Nov 27 07:12:54 staff kernel: [55091.128816]  [<ffffffffbb1b94f9>] ? __do_fault+0xa9/0x1c0
Nov 27 07:12:54 staff kernel: [55091.128819]  [<ffffffffbb1bd704>] ? handle_mm_fault+0xda4/0x1350
Nov 27 07:12:54 staff kernel: [55091.128839]  [<ffffffffbb61f831>] ? __switch_to_asm+0x41/0x70
Nov 27 07:12:54 staff kernel: [55091.128844]  [<ffffffffbb063465>] ? __do_page_fault+0x255/0x4f0
Nov 27 07:12:54 staff kernel: [55091.128847]  [<ffffffffbb620b28>] ? page_fault+0x28/0x30
Nov 27 07:12:54 staff kernel: [55091.128849] Mem-Info:
Nov 27 07:12:54 staff kernel: [55091.128859] active_anon:5037244 inactive_anon:24933 isolated_anon:0
Nov 27 07:12:54 staff kernel: [55091.128859]  active_file:1255 inactive_file:2586 isolated_file:151
Nov 27 07:12:54 staff kernel: [55091.128859]  unevictable:0 dirty:0 writeback:0 unstable:0
Nov 27 07:12:54 staff kernel: [55091.128859]  slab_reclaimable:9400 slab_unreclaimable:20359
Nov 27 07:12:54 staff kernel: [55091.128859]  mapped:35347 shmem:51410 pagetables:38453 bounce:0
Nov 27 07:12:54 staff kernel: [55091.128859]  free:37675 free_pcp:330 free_cma:0
Nov 27 07:12:54 staff kernel: [55091.128865] Node 0 active_anon:20148976kB inactive_anon:99732kB active_file:5020kB inactive_file:10344kB unevictable:0kB isolated(anon):0kB isolated(file):604kB mapped:141388kB dirty:0kB writeback:0kB shmem:205640kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 10240kB writeback_tmp:0kB unstable:0kB pages_scanned:85553 all_unreclaimable? no
Nov 27 07:12:54 staff kernel: [55091.128867] Node 0 DMA free:15908kB min:48kB low:60kB high:72kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Nov 27 07:12:54 staff kernel: [55091.128873] lowmem_reserve[]: 0 3485 20333 20333 20333
Nov 27 07:12:54 staff kernel: [55091.128878] Node 0 DMA32 free:78816kB min:11576kB low:15144kB high:18712kB active_anon:3456736kB inactive_anon:9152kB active_file:652kB inactive_file:1336kB unevictable:0kB writepending:0kB present:3653568kB managed:3588000kB mlocked:0kB slab_reclaimable:1132kB slab_unreclaimable:4368kB kernel_stack:2536kB pagetables:14788kB bounce:0kB free_pcp:604kB local_pcp:0kB free_cma:0kB
Nov 27 07:12:54 staff kernel: [55091.128884] lowmem_reserve[]: 0 0 16847 16847 16847
Nov 27 07:12:54 staff kernel: [55091.128888] Node 0 Normal free:55976kB min:55952kB low:73200kB high:90448kB active_anon:16692240kB inactive_anon:90580kB active_file:4092kB inactive_file:8708kB unevictable:0kB writepending:0kB present:17603584kB managed:17256156kB mlocked:0kB slab_reclaimable:36468kB slab_unreclaimable:77068kB kernel_stack:26904kB pagetables:139024kB bounce:0kB free_pcp:716kB local_pcp:0kB free_cma:0kB
Nov 27 07:12:54 staff kernel: [55091.128902] lowmem_reserve[]: 0 0 0 0 0
Nov 27 07:12:54 staff kernel: [55091.128906] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
Nov 27 07:12:54 staff kernel: [55091.128922] Node 0 DMA32: 1056*4kB (UME) 586*8kB (UME) 636*16kB (UME) 340*32kB (ME) 208*64kB (UME) 136*128kB (UME) 51*256kB (UME) 8*512kB (E) 1*1024kB (E) 0*2048kB 0*4096kB = 78864kB
Nov 27 07:12:54 staff kernel: [55091.128939] Node 0 Normal: 10155*4kB (ME) 1886*8kB (UME) 13*16kB (ME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 55916kB
Nov 27 07:12:54 staff kernel: [55091.128953] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Nov 27 07:12:54 staff kernel: [55091.128961] 55513 total pagecache pages
Nov 27 07:12:54 staff kernel: [55091.128963] 0 pages in swap cache
Nov 27 07:12:54 staff kernel: [55091.128964] Swap cache stats: add 0, delete 0, find 0/0
Nov 27 07:12:54 staff kernel: [55091.128965] Free swap  = 0kB
Nov 27 07:12:54 staff kernel: [55091.128966] Total swap = 0kB
Nov 27 07:12:54 staff kernel: [55091.128967] 5318286 pages RAM
Nov 27 07:12:54 staff kernel: [55091.128968] 0 pages HighMem/MovableOnly
Nov 27 07:12:54 staff kernel: [55091.128969] 103270 pages reserved
Nov 27 07:12:54 staff kernel: [55091.128970] 0 pages hwpoisoned

OOM-Killer is being invoked every morning around 7AM. This only started occurring after FreePBX/Asterisk was installed.

asterisk 12720  0.7  0.2 311556 41764 ?        Ssl  13:41   0:08 node /var/www/html/admin/modules/core/node/fastagi-server.js
asterisk 12912  0.8  0.2 342656 54616 ?        Ssl  13:42   0:09 node /var/www/html/admin/modules/ucp/node/index.js

These are the only processes that are running node in /var/www/..., after OOM-Killer is invoked other processes are killed by OOM-Killer shortly after and the entire server comes unresposive.
Anyone have any ideas?


#2

Start with your cron jobs, as root, from a shell

crontab -l -u asterisk
crontab -l -u root

explore your memory

free

consider adding swap space if you need more


(Matt) #3

Hi, thanks for your response.

@daily [ -x /var/lib/asterisk/bin/freepbx_sipstation_check ] && /var/lib/asterisk/bin/freepbx_sipstation_check 2>&1 > /dev/null
26 3 * * * /usr/sbin/fwconsole certificates --updateall -q 2>&1 >/dev/null
* 2 * * * /usr/sbin/fwconsole util cleanplaybackcache -q
31 6 * * 6 [ -e /usr/sbin/fwconsole ] && /usr/sbin/fwconsole ma listonline --sendemail -q > /dev/null 2>&1
31 8 * * 6 [ -e /usr/sbin/fwconsole ] && /usr/sbin/fwconsole ma upgradeall --sendemail -q > /dev/null 2>&1
* * * * * [ -e /usr/sbin/fwconsole ] && sleep $((RANDOM\%30)) && /usr/sbin/fwconsole job --run --quiet 2>&1 > /dev/null

This is the crontab for asterisk, nothing in root. RAM is around 19GB & SWAP is 1GB.


(Matt) #4

I’m guessing it’s one of the Node servers running in /var/www since they’re the first processes OOM-Killer has killed since FreePBX was installed, however I’m unsure how to disable/turn them off to see if they’re the culprit or not.


#5

top -b -n 1 -o %MEM|head -20

Will show in batch mod top’s output for 1 iteration sorted by memory used filtering the top few processes


(Matt) #6

Shows up Jira and GitLab processes which are separate. The server is running completely fine as of right now, the OOM issue only seems to happen every morning around 7am. Even around 3am things are fine.
Every thing looks normal as of now.


#7

add a cron job at 6:59 running that same command without the iteration qualifier piped into a convenient file

for forensics

grep -C 20 -i OOM /var/log/syslog, (maybe your OS might prefer messages instead of syslog

might also work to reveal the process


#8

You have no swap space. If that was an intentional choice, then you understand that OOM Killer will do its job when the memory is consumed. Add swap or physical memory. You obviously have a lot running on that server and it would benefit from some more breathing room.

Other tasks are run by FreePBX at regular intervals besides what you see in the system crontabs. In mysql database ‘asterisk’, select * from cronmanager; and select * from cron_jobs; to see.

Also, the 7am memory situation could be from other processes. Look at all users’ crontabs and the system ones in /etc/cron*


#9

He has swap space , (he says 1g), he also has plenty of memory ‘around 19g’ so something is horribly running away at 7am, kernel messages will report the base cause, but not always the rogue thread.

I would suggest installing sysstat and see if sar if indicating any trend prior to the ‘event’, I doubt it is anything asterisk or FreePBX related, as both will happily run in 1G even on a quite large system until backups are performed, but really 19G ? . . .


#10

This is what I was seeing. Am I misunderstanding these log lines?


#11

No, I saw that too, but free would confirm or not that the 19G of ram is otherwise used. But only synchronously, hence a suggestion for systat and a near contemporaneous realtime top report


(Matt) #12

Yeah, I saw that line too and I thought it was a little weird. There’s definitely swap space available.


Nothing interesting in anyone’s crontabs running anything weird or running anything in /var/www, however I’ve setup resource usage logging and sysstat as suggested.

I’ll let y’all know what happens. Thanks to everyone who responded!


#13

Just noticed in your post:-

[55091.128674] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006

Can you elucidate?

Is this a VM or you are hosting one, if so then that’s a very different story.