View Issue Details

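ID: 0000499
Project: Anolis OS 7
Category: General
View Status: public
Last Update: 2021-12-10 13:55
Reporter: shanxifanshi
Assigned To: Shiloong
Priority: normal
Severity: minor
Reproducibility: always
Status: assigned
Resolution: open
Platform: aarch64
OS: Anolis OS
OS Version: 7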
Summary: 0000499: [Anolis 7.7-rhck-aarch64][vhd][ECS] With 250G of memory, no vmcore is generated after a crash is triggered, with either the default crashkernel setting (256M) or crashkernel=512M
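Description:
[Defect description]:
Image used: anolisos_7_7_arm64_20G_rhck_alibase_20211105.vhd
Cloud ECS instance type: ecs.g6r.16xlarge
1. With crashkernel left at its default value, trigger a crash with echo c >/proc/sysrq-trigger
2. With crashkernel=512M set, trigger a crash with echo c >/proc/sysrq-trigger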

[root@iZ2zeh8hn89q43evsv4tyxZ 127.0.0.1-2021-11-08-16:44:39]# pwd
/var/crash/127.0.0.1-2021-11-08-16:44:39
[root@iZ2zeh8hn89q43evsv4tyxZ 127.0.0.1-2021-11-08-16:44:39]# ll
total 124
-rw-r--r-- 1 root root 123829 Nov 8 16:44 vmcore-dmesg.txt
[root@iZ2zeh8hn89q43evsv4tyxZ 127.0.0.1-2021-11-08-16:44:39]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.18.0-80.7.2.an7.aarch64 root=UUID=2d224c03-5112-4556-9628-bc8db71e32dc ro crashkernel=512M cryptomgr.notests cgroup.memory=nokmem rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet console=tty0 biosdevname=0 net.ifnames=0 console=ttyAMA0,115200n8 noibrs nvme_core.io_timeout=4294967295
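
Before triggering the crash, it can help to confirm that the reservation really took effect and that a crash kernel is loaded. The following is a minimal sketch, not part of the original report: the function name `kdump_status` and its two optional arguments are hypothetical, while `/proc/cmdline`, `/sys/kernel/kexec_crash_size` (reserved bytes) and `/sys/kernel/kexec_crash_loaded` are the standard kernel interfaces it reads by default.

```shell
# kdump_status [cmdline-file] [sysfs-kernel-dir]   (hypothetical helper)
# Prints the crashkernel= parameter, the memory actually reserved for the
# crash kernel, and whether a crash kernel image is currently loaded.
kdump_status() {
    local cmdline="${1:-/proc/cmdline}" sysdir="${2:-/sys/kernel}"
    local param size loaded

    # Take the last crashkernel= word found on the kernel command line.
    param=$(tr ' ' '\n' < "$cmdline" | sed -n 's/^crashkernel=//p' | tail -n 1)
    [ -n "$param" ] || param="(not set)"

    size=$(cat "$sysdir/kexec_crash_size")      # reserved size in bytes
    loaded=$(cat "$sysdir/kexec_crash_loaded")  # 1 if a crash kernel is loaded

    echo "crashkernel parameter: $param"
    echo "reserved: $(( size / 1024 / 1024 )) MiB"
    echo "crash kernel loaded: $loaded"
}
```

Calling `kdump_status` with no arguments on the instance reads the live values; a reserved size of 0 or a loaded flag of 0 would mean the crash was triggered without a working kdump setup.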

[Expected result]: After a crash is triggered with echo c >/proc/sysrq-trigger, a vmcore is generated and can be parsed normally.

[Actual result]: After a crash is triggered with echo c >/proc/sysrq-trigger, no vmcore is generated; only the vmcore-dmesg output is present.
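
The check behind the expected and actual results can be scripted so that every crash directory is inspected the same way after each test run. This is a minimal sketch under assumptions: the function name `report_crash_dirs` is hypothetical, and it only tests for a non-empty file named `vmcore`, not whether that file can be parsed.

```shell
# report_crash_dirs [crash-root]   (hypothetical helper)
# Lists each dump directory under the crash root and says whether it
# contains a non-empty vmcore file.
report_crash_dirs() {
    local root="${1:-/var/crash}" dir

    for dir in "$root"/*/; do
        # Skip the unexpanded pattern when the crash root has no subdirectories.
        [ -d "$dir" ] || continue
        if [ -s "${dir}vmcore" ]; then
            echo "vmcore present: ${dir%/}"
        else
            echo "vmcore missing: ${dir%/}"
        fi
    done
}
```

Run without arguments, it walks `/var/crash`; a directory that holds only `vmcore-dmesg.txt`, as in this report, is listed as missing its vmcore.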

[Reproducibility]: Always

[Environment]:
Kernel:
# uname -r
4.18.0-80.7.2.an7.aarch64

Machine type: online ECS instance

Specification: ecs.g6r.16xlarge (64 vCPU, 250G memory)

Operating system:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="7.7"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="7.7"
PRETTY_NAME="Anolis OS 7.7"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugs.openanolis.cn/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
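Steps to Reproduce:
1. Create a cloud instance of type ecs.g6r.16xlarge from the image anolisos_7_7_arm64_20G_rhck_alibase_20211105.vhd
2. Check /proc/cmdline. crashkernel has the default configuration (on a machine with more than 8G of memory, the crashkernel value is 256M). Trigger a crash with echo c >/proc/sysrq-trigger and check whether a vmcore is generated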
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.18.0-80.7.2.an7.aarch64 root=UUID=2d224c03-5112-4556-9628-bc8db71e32dc ro crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M cryptomgr.notests cgroup.memory=nokmem rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet console=tty0 biosdevname=0 net.ifnames=0 console=ttyAMA0,115200n8 noibrs nvme_core.io_timeout=4294967295

3. Set crashkernel to 512M, trigger a crash again, and check whether a vmcore is generated and can be parsed normally
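
The default value in step 2 uses the kernel's range syntax (`start-[end]:size`, comma separated), where the first range containing the system RAM selects the reservation. The sketch below shows how such a value resolves for a given amount of memory. It is an illustration under assumptions, not the kernel's own code: the helper names `to_mib` and `crashkernel_for` are hypothetical, sizes are handled in whole MiB, and the RAM figure the kernel actually compares against may differ slightly from what `free` reports.

```shell
# to_mib <size>   (hypothetical helper)
# Converts a size such as 192M or 8G to MiB; an empty size stays empty.
to_mib() {
    case $1 in
        "")     echo "" ;;
        *[Gg])  echo $(( ${1%[Gg]} * 1024 )) ;;
        *[Mm])  echo $(( ${1%[Mm]} )) ;;
        *[Kk])  echo $(( ${1%[Kk]} / 1024 )) ;;
        *)      echo $(( $1 / 1048576 )) ;;   # plain byte count
    esac
}

# crashkernel_for <crashkernel-value> <ram-in-MiB>   (hypothetical helper)
# Prints the reservation in MiB selected for the given amount of RAM.
crashkernel_for() {
    local spec="$1" ram="$2" entry range size start end old_ifs

    # A value without ':' is a fixed size, e.g. crashkernel=512M.
    case $spec in
        *:*) ;;
        *)   to_mib "${spec%%@*}"; return ;;
    esac

    # Split the comma-separated list into positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $spec
    IFS=$old_ifs

    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        size=${size%%@*}                  # drop an optional @offset
        start=$(to_mib "${range%%-*}")
        end=$(to_mib "${range#*-}")
        # Start is inclusive, end is exclusive; an empty end means no upper limit.
        if [ "$ram" -ge "$start" ] && { [ -z "$end" ] || [ "$ram" -lt "$end" ]; }; then
            to_mib "$size"
            return
        fi
    done
    echo 0
}
```

For example, `crashkernel_for '0M-2G:0M,2G-8G:192M,8G-:256M' 256000` models the 250G instance (250 × 1024 MiB) and selects the `8G-` range, which is why the default reservation on this machine is 256M.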
Tags: No tags attached.

Activities

geliwei-ali

2021-11-16 15:35

manager   ~0000712

I tested the CentOS 7.9 image and it has the same problem.

shanxifanshi

2021-11-23 15:22

reporter   ~0000763

The problem still occurs when testing the Anolis 7.9 vhd image.
Specification:
ecs.g6r.16xlarge (64 vCPU, 250G memory)

Image:
# cat /etc/image-id
image_name="Anolis 7.9 RHCK 64 bit ARM Edition"
image_id="anolisos_7_9_arm64_20G_rhck_alibase_20211119.vhd"
release_date="20211119165352"

Kernel:
# uname -r
4.18.0-80.7.2.an7.aarch64

With 512M of memory reserved, no vmcore is generated; only the vmcore-dmesg file is present.
# ll /var/crash/127.0.0.1-2021-11-23-15\:00\:10/
total 124
-rw-r--r-- 1 root root 123866 Nov 23 15:00 vmcore-dmesg.txt
[root@iZbp1fwd1qi584r0yo9sdkZ ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.18.0-80.7.2.an7.aarch64 root=UUID=efba3c00-74c3-4cc5-a9b2-4c1d91770a95 ro crashkernel=0M-2G:0M,2G-8G:192M,8G-:512M cryptomgr.notests cgroup.memory=nokmem rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet console=tty0 biosdevname=0 net.ifnames=0 console=ttyAMA0,115200n8 noibrs nvme_core.io_timeout=4294967295
[root@iZbp1fwd1qi584r0yo9sdkZ ~]# free -h
              total used free shared buff/cache available
Mem: 250G 1.3G 248G 16M 453M 231G
Swap: 0B 0B 0B

jacobwang

2021-12-10 13:54

manager   ~0000833

OS | Instance | Package versions (kexec-tools / kernel) | makedumpfile problem?
--|--|--|--
CentOS 7.9 | ecs.g6r.large | kexec-tools-2.0.15-51.1.an7.3.aarch64 / 4.18.0-193.28.1.el7.aarch64 | Yes
CentOS 7.9 | ecs.g6r.large | kexec-tools-2.0.15-51.1.an7.3.aarch64 / 4.19.91-24.8.an7.aarch64 | No
CentOS 7.9 | ecs.g6r.large | kexec-tools-2.0.15-51.el7_9.3.aarch64 / 4.19.91-24.8.an7.aarch64 | No

Investigation shows that this problem is unrelated to the kexec-tools version and is related to the kernel.

Kernel contact, please continue to check.

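Issue History

Date Modified | Username | Field | Change
--|--|--|--
2021-11-08 17:08 | shanxifanshi | New Issue |
2021-11-08 17:13 | shanxifanshi | Summary | [Anolis 7.7-rhck-aarch64][vhd][ECS] No vmcore is generated after a crash is triggered, with either the default crashkernel setting or crashkernel=512M => [Anolis 7.7-rhck-aarch64][vhd][ECS] No vmcore is generated after a crash is triggered, with either the default crashkernel setting (256M) or crashkernel=512M
2021-11-08 17:56 | shanxifanshi | Summary | [Anolis 7.7-rhck-aarch64][vhd][ECS] No vmcore is generated after a crash is triggered, with either the default crashkernel setting (256M) or crashkernel=512M => [Anolis 7.7-rhck-aarch64][vhd][ECS] With 250G of memory, no vmcore is generated after a crash is triggered, with either the default crashkernel setting (256M) or crashkernel=512M
2021-11-08 17:56 | shanxifanshi | Description Updated |
2021-11-08 17:56 | shanxifanshi | Description Updated |
2021-11-08 17:57 | shanxifanshi | Reproducibility | have not tried => always
2021-11-16 15:35 | geliwei-ali | Note Added: 0000712 |
2021-11-16 19:15 | jacobwang | Assigned To | => geliwei-ali
2021-11-16 19:15 | jacobwang | Status | new => assigned
2021-11-23 15:22 | shanxifanshi | Note Added: 0000763 |
2021-12-03 15:44 | wb-zmy745940 | Assigned To | geliwei-ali => jacobwang
2021-12-10 13:54 | jacobwang | Note Added: 0000833 |
2021-12-10 13:55 | jacobwang | Assigned To | jacobwang => Shiloong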