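How do I handle the SGX driver memory leak that occurs under certain conditions on ECS instances running Alibaba Cloud Linux 2?
This topic describes the cause of the SGX driver memory leak that occurs under certain conditions on ECS instances running Alibaba Cloud Linux 2, and how to resolve it.
Problem description
On Alibaba Cloud Linux 2 instances that meet all of the following conditions, the SGX driver leaks memory under certain conditions until system memory is exhausted. Most of the memory ends up occupied by the SGX test process (app), and the kernel prints messages similar to the log that follows the list of conditions.
Image: Alibaba Cloud Linux 2.1903 LTS 64-bit.
Kernel: kernel-4.19.91-23.al7 and earlier kernel versions.
ECS instance families: c7t, r7t, and g7t.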
[ 71.938733] systemd-journal invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[ 71.938735] systemd-journal cpuset=/ mems_allowed=0
[ 71.938738] CPU: 0 PID: 415 Comm: systemd-journal Not tainted 4.19.91-23.al7.x86_64 #1
[ 71.938738] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015
[ 71.938739] Call Trace:
[ 71.938746] dump_stack+0x66/0x8b
[ 71.938749] dump_global_header+0x12/0x10f
[ 71.938750] oom_kill_process+0x2cf/0x310
[ 71.938752] out_of_memory+0xf7/0x4c0
[ 71.938754] __alloc_pages_nodemask+0xf07/0xfd0
[ 71.938757] ? blk_flush_plug_list+0xd7/0x220
[ 71.938759] pagecache_get_page+0x8c/0x350
[ 71.938761] filemap_fault+0x37e/0x6e0
[ 71.938764] ext4_filemap_fault+0x2c/0x3b
[ 71.938766] __do_fault+0x38/0x170
[ 71.938768] do_fault+0x2eb/0x640
[ 71.938769] __handle_mm_fault+0x621/0xa20
[ 71.938772] ? apic_timer_interrupt+0xa/0x20
[ 71.938774] handle_mm_fault+0x106/0x1c0
[ 71.938776] __do_page_fault+0x1ba/0x480
[ 71.938778] do_page_fault+0x32/0x140
[ 71.938780] ? async_page_fault+0x8/0x30
[ 71.938781] async_page_fault+0x1e/0x30
[ 71.938782] RIP: 0033:0x55a1ca49516f
[ 71.938786] Code: Bad RIP value.
[ 71.938787] RSP: 002b:00007ffcd58b22b0 EFLAGS: 00010246
[ 71.938788] RAX: 0000000000000000 RBX: 000055a1cbcc4400 RCX: a1fcdcf819d7e1e5
[ 71.938788] RDX: 00007f3d4d72a000 RSI: 000055a1cbcc2060 RDI: 000055a1cbcc4400
[ 71.938789] RBP: a1fcdcf819d7e1e5 R08: 00007ffcd58b23b0 R09: 00007ffcd58b23a8
[ 71.938790] R10: 000055a1ca49a935 R11: 00000000d1ba4319 R12: 000055a1cbcc4400
[ 71.938790] R13: 0000000000000011 R14: 000055a1cbcc2060 R15: a1fcdcf819d7e1e5
[ 71.938791] Task in / killed as a result of limit of host
[ 71.938792] Mem-Info:
[ 71.938795] active_anon:85 inactive_anon:410619 isolated_anon:0
active_file:150 inactive_file:353 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:6038 slab_unreclaimable:17336
mapped:98 shmem:403568 pagetables:1793 bounce:0
free:12881 free_pcp:440 free_cma:0
[ 71.938797] Node 0 active_anon:340kB inactive_anon:1642476kB active_file:600kB inactive_file:1412kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:392kB dirty:0kB writeback:0kB shmem:1614272kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2048kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 71.938798] Node 0 DMA free:7408kB min:392kB low:488kB high:584kB active_anon:0kB inactive_anon:8312kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:16kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 71.938800] lowmem_reserve[]: 0 1761 1761 1761 1761
[ 71.938801] Node 0 DMA32 free:44116kB min:44660kB low:55824kB high:66988kB active_anon:340kB inactive_anon:1633492kB active_file:688kB inactive_file:1812kB unevictable:0kB writepending:0kB present:1914960kB managed:1826408kB mlocked:0kB kernel_stack:2208kB pagetables:7156kB bounce:0kB free_pcp:1760kB local_pcp:1396kB free_cma:0kB
[ 71.938804] lowmem_reserve[]: 0 0 0 0 0
[ 71.938805] Node 0 DMA: 0*4kB 2*8kB (UM) 2*16kB (UE) 0*32kB 1*64kB (E) 3*128kB (UME) 3*256kB (UME) 2*512kB (ME) 3*1024kB (UME) 1*2048kB (E) 0*4096kB = 7408kB
[ 71.938810] Node 0 DMA32: 233*4kB (UMEH) 158*8kB (UMEH) 177*16kB (UMEH) 79*32kB (UEH) 34*64kB (UMEH) 16*128kB (UMEH) 6*256kB (E) 3*512kB (UE) 3*1024kB (ME) 3*2048kB (UME) 5*4096kB (M) = 44548kB
[ 71.938815] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 71.938816] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 71.938816] 404127 total pagecache pages
[ 71.938817] 0 pages in swap cache
[ 71.938818] Swap cache stats: add 0, delete 0, find 0/0
[ 71.938818] Free swap = 0kB
[ 71.938819] Total swap = 0kB
[ 71.938819] 482739 pages RAM
[ 71.938820] 0 pages HighMem/MovableOnly
[ 71.938820] 22160 pages reserved
[ 71.938820] 0 pages cma reserved
[ 71.938821] 0 pages hwpoisoned
[ 71.938821] Tasks state (memory values in pages):
[ 71.938822] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 71.938824] [ 415] 0 415 11814 85 147456 0 0 systemd-journal
[ 71.938826] [ 439] 0 439 11430 228 118784 0 -1000 systemd-udevd
[ 71.938827] [ 550] 0 550 22654 218 212992 0 0 rngd
[ 71.938828] [ 554] 81 554 15051 155 167936 0 -900 dbus-daemon
[ 71.938829] [ 573] 0 573 48803 120 180224 0 0 gssproxy
[ 71.938830] [ 585] 0 585 6598 91 98304 0 0 systemd-logind
[ 71.938831] [ 587] 0 587 4456 115 61440 0 0 assist_daemon
[ 71.938832] [ 597] 32 597 17316 135 188416 0 0 rpcbind
[ 71.938833] [ 601] 0 601 31598 153 106496 0 0 crond
[ 71.938834] [ 606] 997 606 29454 129 143360 0 0 chronyd
[ 71.938835] [ 616] 0 616 27553 33 57344 0 0 agetty
[ 71.938836] [ 819] 0 819 25740 516 221184 0 0 dhclient
[ 71.938837] [ 887] 0 887 121900 708 430080 0 0 rsyslogd
[ 71.938838] [ 953] 0 953 10512 391 102400 0 0 AliYunDunUpdate
[ 71.938839] [ 1078] 0 1078 32317 732 274432 0 0 AliYunDun
[ 71.938840] [ 1235] 0 1235 28237 261 266240 0 -1000 sshd
[ 71.938841] [ 1283] 0 1283 39209 337 348160 0 0 sshd
[ 71.938842] [ 1292] 0 1292 29086 317 90112 0 0 bash
[ 71.938843] [ 1310] 0 1310 87597 530 311296 0 -900 abrt-dbus
[ 71.938844] [ 1397] 0 1397 39209 347 348160 0 0 sshd
[ 71.938845] [ 1399] 0 1399 29080 279 81920 0 0 bash
[ 71.938846] [ 1430] 0 1430 27028 25 77824 0 0 dmesg
[ 71.938847] [ 1431] 0 1431 8392985 92 3219456 0 0 app
[ 71.938848] [ 1432] 0 1432 39209 339 356352 0 0 sshd
[ 71.938849] [ 1434] 0 1434 29053 276 81920 0 0 bash
[ 71.938850] [ 1470] 0 1470 2146 23 57344 0 0 systemd-cgroups
[ 71.938851] [ 1471] 0 1471 2146 23 57344 0 0 systemd-cgroups
[ 71.938852] [ 1472] 0 1472 2146 23 53248 0 0 systemd-cgroups
[ 71.938853] [ 1473] 0 1473 2143 15 57344 0 0 systemd-cgroups
[ 71.938854] Out of memory: Kill process 1431 (app) score 1 or sacrifice child
[ 71.939026] Killed process 1431 (app) total-vm:33571940kB, anon-rss:320kB, file-rss:48kB, shmem-rss:0kB
[ 71.942922] oom_reaper: reaped process 1431 (app), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
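Cause
The sgx_encl_mm_release_deferred function in arch/x86/kernel/cpu/sgx/encl.c does not handle the reference count of the Encl structure correctly. When a process that holds EPC memory calls fork, the Encl reference count never drops to zero, so the encrypted memory (EPC) is leaked. Once the physical EPC memory is used up, the kernel uses shared memory to swap out encrypted memory, and eventually regular memory is exhausted as well.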
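Because the leaked EPC pages are swapped out to shared memory, a growing Shmem value is one symptom worth watching (in the log above, shmem accounts for 403568 of 482739 pages). The following is a minimal sketch, not part of the original article, that reports Shmem as a percentage of MemTotal. The function name shmem_percent is hypothetical, and a high value alone does not prove that this particular bug is the cause.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/meminfo-style text on stdin and prints
# Shmem as a whole-number percentage of MemTotal (truncated, not rounded).
shmem_percent() {
    awk '
        $1 == "MemTotal:" { total = $2 }   # value is in kB
        $1 == "Shmem:"    { shmem = $2 }   # value is in kB
        END {
            if (total > 0) printf "%d\n", shmem * 100 / total
            else exit 1                    # MemTotal missing: report failure
        }
    '
}

# Usage on a live system:
#   shmem_percent < /proc/meminfo
```

Running the check periodically while the SGX workload is active shows whether shared memory keeps growing after enclave processes fork and exit.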
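Solution
If you encounter this problem, you can resolve it as follows.
Log on to the instance and run the following command to confirm that the kernel version is covered by this solution.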
uname -r
Output similar to the following is displayed.
4.19.91-23.al7.x86_64
Choose the fix that matches your kernel version.
For versions earlier than 4.19.91-23.al7.x86_64 (exclusive):
Update the kernel to the latest version.
yum update kernel
Restart the instance for the update to take effect.
reboot
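Update the kernel hotfix.
If the problem persists on the latest kernel version, install the kernel hotfix as described in the solution for 4.19.91-23.al7.x86_64.
For 4.19.91-23.al7.x86_64, install the kernel hotfix by running the following command.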
yum install -y kernel-hotfix-5577959-`uname -r | awk -F"-" '{print $NF}'`
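The version check and the hotfix package name can be scripted. The sketch below is illustrative only: the helper names kernel_action and hotfix_package are hypothetical, the comparison relies on GNU sort -V, and the package name simply follows the pattern of the command above. Whether a hotfix package exists for any release other than 4.19.91-23.al7.x86_64 is not confirmed by this article.

```shell
#!/bin/sh
# Kernel release that this article fixes with a hotfix.
HOTFIX_RELEASE="4.19.91-23.al7.x86_64"

# Hypothetical helper: prints which path of the article applies to a release.
#   update -> earlier than 4.19.91-23.al7.x86_64: yum update kernel, then reboot
#   hotfix -> exactly 4.19.91-23.al7.x86_64: install the kernel hotfix
#   newer  -> later than the versions this article covers
kernel_action() {
    rel="$1"
    if [ "$rel" = "$HOTFIX_RELEASE" ]; then
        echo hotfix
        return
    fi
    # sort -V orders version strings; the first line is the older release.
    oldest=$(printf '%s\n%s\n' "$rel" "$HOTFIX_RELEASE" | sort -V | head -n 1)
    if [ "$oldest" = "$rel" ]; then echo update; else echo newer; fi
}

# Hypothetical helper: builds the package name the same way as the yum
# command above, using the last "-"-separated field of the release string.
hotfix_package() {
    suffix=$(printf '%s\n' "$1" | awk -F"-" '{print $NF}')
    echo "kernel-hotfix-5577959-$suffix"
}

# Usage on a live system:
#   kernel_action "$(uname -r)"
#   hotfix_package "$(uname -r)"
```

For 4.19.91-23.al7.x86_64, hotfix_package prints kernel-hotfix-5577959-23.al7.x86_64, which is the name that the backtick expression in the yum command expands to on that kernel.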