View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000509 | Anolis OS 8 | kernel | public | 2021-11-10 15:29 | 2021-11-29 16:00 |
Reporter | wb-wpp899309 | Assigned To | fghuims | ||
Priority | high | Severity | major | Reproducibility | always |
Status | assigned | Resolution | open | ||
Platform | aarch64 | OS | Anolis OS | OS Version | 8 |
Summary | 0000509: [Anolis OS 8.2][4.19.91-25.rc1] [aarch64]kernel crash:Unable to handle kernel NULL pointer dereference at virtual address 000000 | ||||
Description | [Defect description]: Running the stress-ng stress test crashes the system immediately. Partial vmcore information:
[ 1531.077555] mmap: stress-ng (3661) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.rst.
[ 1532.141477] Unable to handle kernel NULL pointer dereference at virtual address 00000000000006d8
[ 1532.150415] Mem abort info:
[ 1532.153247] ESR = 0x96000006
[ 1532.156337] Exception class = DABT (current EL), IL = 32 bits
[ 1532.162339] SET = 0, FnV = 0
[ 1532.165424] EA = 0, S1PTW = 0
[ 1532.168791] Data abort info:
[ 1532.171894] ISV = 0, ISS = 0x00000006
[ 1532.175968] CM = 0, WnR = 0
[ 1532.178965] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000021b47eb8
[ 1532.185675] [00000000000006d8] pgd=000020a9fe92b003, pud=000020a9fe92c003, pmd=0000000000000000
[ 1532.194508] Internal error: Oops: 96000006 [#1] SMP
[ 1532.199450] Modules linked in: unix_diag(E+) tun(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) bonding(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nf_tables_set(E) nft_chain_nat_ipv6(E) nf_nat_ipv6(E) nft_chain_route_ipv6(E) nft_chain_nat_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) nft_chain_route_ipv4(E) ip6_tables(E) nft_compat(E) ip_set(E) nf_tables(E) nfnetlink(E) vfat(E) fat(E) ipmi_ssif(E) aes_ce_blk(E) crypto_simd(E) cryptd(E) aes_ce_cipher(E) mousedev(E) crc32_ce(E) crct10dif_ce(E) hibmc_drm(E) ghash_ce(E) ttm(E) sha2_ce(E) drm_kms_helper(E) sha256_arm64(E) sha1_ce(E) sbsa_gwdt(E) drm(E) hns_roce_hw_v2(E) hisi_sas_v3_hw(E) hisi_sas_main(E) hns_roce(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E)
[ 1532.272594] ib_core(E) ipmi_si(E) libsas(E) sysimgblt(E) ipmi_devintf(E) scsi_transport_sas(E) ipmi_msghandler(E) spi_dw_mmio(E) sch_fq_codel(E) ip_tables(E) sd_mod(E) sg(E) realtek(E) hns3(E) mlx5_core(E) ahci(E) nvme(E) i2c_designware_platform(E) nfit(E) libahci(E) i2c_designware_core(E) hclge(E) mlxfw(E) nvme_core(E) hnae3(E) libata(E) devlink(E) i2c_core(E) libnvdimm(E)
[ 1532.318940] Process stress-ng (pid: 3530, stack limit = 0x0000000074d061ef)
[ 1532.332537] CPU: 63 PID: 3530 Comm: stress-ng Kdump: loaded Tainted: G E 4.19.91-25.rc1.an8.aarch64 #1
[ 1532.349948] Hardware name: H3C R4960 G3/BC82AMDDA, BIOS 1.70 01/07/2021
[ 1532.363346] pstate: 80400009 (Nzcv daif +PAN -UAO)
[ 1532.374895] pc : is_mem_section_removable+0x70/0x1f8
[ 1532.386778] lr : show_mem_removable+0x90/0xd8
[ 1532.397659] sp : ffff00004cf8bc10
[ 1532.407363] x29: ffff00004cf8bc10 x28: 0000000000000001
[ 1532.419396] x27: ffff7e0000000000 x26: 0000000000040000
[ 1532.431226] x25: ffffa0a9efb62e00 x24: ffff0000095632e0
[ 1532.443006] x23: ffff00000959df60 x22: ffff000008d0b000
[ 1532.455008] x21: ffff0000091f9780 x20: ffffa0accfbbf000
[ 1532.466754] x19: 0000000000000001 x18: 0000000000000000
[ 1532.478463] x17: 0000000000000000 x16: 0000000000000000
[ 1532.490205] x15: 0000000000000000 x14: 0000000000000000
[ 1532.501953] x13: 0000000000000000 x12: 0000000000000000
[ 1532.513625] x11: 0000000000000000 x10: 0000000000000000
[ 1532.525193] x9 : ffff0000086e46d0 x8 : 0000000000000000
[ 1532.536789] x7 : 0000000000080000 x6 : 0000000000000680
[ 1532.548314] x5 : 0000000000000000 x4 : 0000000000000005
[ 1532.549907] sched: DL replenish lagged too much
[ 1532.570412] x3 : 0000000000000001 x2 : 0000000001000000
[ 1532.581842] x1 : ffff7e0001000000 x0 : 0000000000000001
[ 1532.593193] Call trace:
[ 1532.601623] is_mem_section_removable+0x70/0x1f8
[ 1532.612115] show_mem_removable+0x90/0xd8
[ 1532.621856] dev_attr_show+0x28/0x60
[ 1532.631043] sysfs_kf_seq_show+0x8c/0x160
[ 1532.640684] kernfs_seq_show+0x30/0x38
[ 1532.650232] seq_read+0x148/0x478
[ 1532.659094] kernfs_fop_read+0x2c/0x1e0
[ 1532.668493] __vfs_read+0x20/0x48
[ 1532.677371] vfs_read+0x98/0x168
[ 1532.686159] ksys_read+0x6c/0xe0
[ 1532.694933] __arm64_sys_read+0x20/0x28
[ 1532.704294] el0_svc_common.constprop.0+0xa8/0x200
[ 1532.714531] el0_svc_handler+0x30/0x80
[ 1532.723606] el0_svc+0x10/0x14
[ 1532.731795] Code: 8b030466 f8607aa8 8b060866 8b061d06 (f9402cc3)
[ 1532.742939] ---[ end trace 3dc4b79508ccd27a ]---
[ 1532.752705] Kernel panic - not syncing: Fatal exception
[ 1532.762683] SMP: stopping secondary CPUs
[ 1532.771222] Kernel Offset: disabled
[ 1532.779130] CPU features: 0x88,22200a38
[ 1532.787220] Memory Limit: none
[ 1532.796499] Starting crashdump kernel...
[ 1532.804315] Bye!
[Reproduction rate]: Reproduced in both of the 2 runs so far; the system crashes immediately after the stress-ng command is executed.
[Reproduction environment]:
Kernel: 4.19.91-25.rc1.an8.aarch64
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.2"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.2"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.2"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.org/"
CPU information:
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-95
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 1
Vendor ID: 0x48
Model: 0
Stepping: 0x1
CPU max MHz: 2600.0000
CPU min MHz: 200.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 24576K
NUMA node0 CPU(s): 0-95
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
Memory information:
# free -h
       total   used    free    shared  buff/cache  available
Mem:   755Gi   3.5Gi   739Gi   10Mi    12Gi        748Gi
Swap:  2.0Gi   0B      2.0Gi
[Expected result]: The system keeps running normally under the stress-ng load and does not crash.
[Actual result]: The system crashes immediately once the stress-ng load starts. | ||||
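The `Mem abort info` lines in the Oops header are derived from the raw `ESR` value. As a rough triage aid, the field split can be sketched with shell arithmetic. The function name `decode_esr` is hypothetical; the bit positions assume the Arm ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]), and WnR and DFSC are only meaningful when EC indicates a data abort.

```shell
#!/bin/sh
# Sketch: split an AArch64 ESR_ELx value into the fields the Oops header prints.
# decode_esr is a hypothetical helper name, not a kernel or distro tool.
# The body runs in a subshell "( ... )" so its variables stay local.
decode_esr() (
    esr=$(( $1 ))                   # accepts hex input such as 0x96000006
    ec=$(( (esr >> 26) & 0x3f ))    # Exception Class, bits [31:26]
    il=$(( (esr >> 25) & 0x1 ))     # Instruction Length, bit 25 (1 = 32-bit)
    iss=$(( esr & 0x1ffffff ))      # Instruction Specific Syndrome, bits [24:0]
    wnr=$(( (iss >> 6) & 0x1 ))     # Write-not-Read (data aborts only)
    dfsc=$(( iss & 0x3f ))          # Data Fault Status Code (data aborts only)
    printf 'EC=0x%02x IL=%d ISS=0x%08x WnR=%d DFSC=0x%02x\n' \
        "$ec" "$il" "$iss" "$wnr" "$dfsc"
)

decode_esr 0x96000006
```

For the value in this report the sketch prints `EC=0x25 IL=1 ISS=0x00000006 WnR=0 DFSC=0x06`: a data abort taken without a change of exception level, caused by a read, with a level 2 translation fault. That agrees with the `pmd=0000000000000000` entry in the page-table walk logged for address 00000000000006d8.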
Steps To Reproduce | [Reproduction steps]:
1. Preparation
Set the kernel parameters:
echo 1 > /proc/sys/kernel/panic
echo 1 > /proc/sys/kernel/hardlockup_panic
echo 1 > /proc/sys/kernel/softlockup_panic
echo 50 > /proc/sys/kernel/watchdog_thresh
echo 1200 > /proc/sys/kernel/hung_task_timeout_secs
echo 0 > /proc/sys/kernel/hung_task_panic
Mount the data disk:
[ -d /disk1 ] || mkdir /disk1
wipefs -a --force /dev/nvme0n1p1 # in a virtual machine this is more often /dev/vdb1
mkfs -t ext4 -q -F /dev/nvme0n1p1
mount -t ext4 /dev/nvme0n1p1 /disk1
Create the log directory:
mkdir -p /disk1/tmpdir/stress-ng
2. Download and build stress-ng
git clone https://github.com/ColinIanKing/stress-ng.git
cd stress-ng
make
make install
3. Run the command
nohup stress-ng -a 1 -x softlockup,resources -t 72h --metrics --times --verify -v -Y /disk1/tmpdir/stress-ng/stress-statistic-12.yaml --log-file /disk1/tmpdir/stress-ng/stress-logfile-12.txt --temp-path /disk1/tmpdir/stress-ng/ & | ||||
Tags | No tags attached. | ||||
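The call trace in the description ends in `show_mem_removable`, which is the sysfs handler behind each memory block's `removable` attribute. Assuming that read is the actual trigger (the report does not confirm this), a much narrower check than the full 72-hour stress-ng run might look like the sketch below. The function name `scan_removable` and the optional directory argument are hypothetical conveniences; on an affected kernel this may panic the machine, so it should only be run on a test system.

```shell
#!/bin/sh
# Sketch: read the "removable" attribute of every memory block, one at a time.
# scan_removable is a hypothetical helper; the optional first argument lets the
# loop be pointed at a directory other than the real sysfs tree.
scan_removable() {
    memdir=${1:-/sys/devices/system/memory}
    for f in "$memdir"/memory*/removable; do
        [ -r "$f" ] || continue                 # skip if the glob matched nothing
        block=$(basename "$(dirname "$f")")
        echo "reading $block"                   # last line logged names the block being read
        echo "$block removable=$(cat "$f")"
    done
}
```

Calling `scan_removable` with no argument walks `/sys/devices/system/memory`. If the kernel oopses, the last `reading memoryN` line identifies the memory block whose section lookup faulted, which could then be compared against the vmcore.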
|
@wb-wpp899309 Does this occur only on the ARM64 architecture? Has it been tested in an Alinux2 environment? @baolinwang Could you please arrange for someone to look into this? Thanks! |
|
The issue cannot currently be reproduced in the test environment, and a related known patch has been added. |
Date | Username | Field | Change |
---|---|---|---|
2021-11-10 15:29 | wb-wpp899309 | New Issue | |
2021-11-10 21:25 | jacobwang | Assigned To | => Shiloong |
2021-11-10 21:25 | jacobwang | Status | new => assigned |
2021-11-23 16:04 | Shiloong | Assigned To | Shiloong => baolinwang |
2021-11-23 16:04 | Shiloong | Note Added: 0000771 | |
2021-11-23 16:25 | baolinwang | Assigned To | baolinwang => fghuims |
2021-11-29 16:00 | fghuims | Note Added: 0000794 |