View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000622 | Anolis OS 8 | - cloud kernel 5.10 | public | 2022-01-11 15:05 | 2022-01-17 09:20 |
Reporter | kangwen429 | Assigned To | |||
Priority | high | Severity | major | Reproducibility | random |
Status | new | Resolution | open | ||
Platform | x86_64 | OS | Anolis OS | OS Version | 8 |
Title | 0000622: [Anolis 8.4-5.10-x86] After upgrading to kernel 5.10.84-10_rc2.an8.x86, a crash occurs during stability testing: watchdog: BUG: soft lockup RIP: 0010:rt_flush_dev+0x84/0xb0 | ||||
Description |
After upgrading to kernel 5.10.84-10_rc2.an8.x86, a crash occurs during stability testing: watchdog: BUG: soft lockup RIP: 0010:rt_flush_dev+0x84/0xb0

An excerpt of the vmcore-dmesg log follows; see the attachment for the full log:

[10582.149998] Kernel panic - not syncing: softlockup: hung tasks
[10582.150683] CPU: 42 PID: 119122 Comm: kworker/u128:21 Kdump: loaded Tainted: G W EL 5.10.84-10_rc2.an8.x86_64 #1
[10582.151411] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 90210cb 04/01/2014
[10582.152116] Workqueue: netns cleanup_net
[10582.152746] Call Trace:
[10582.153333] <IRQ>
[10582.153921] dump_stack+0x57/0x6a
[10582.154520] panic+0x10d/0x2e9
[10582.155100] watchdog_timer_fn.cold.14+0xc/0x16
[10582.155699] ? report_softlockup+0x1a0/0x1a0
[10582.156286] __hrtimer_run_queues+0xf1/0x230
[10582.156876] hrtimer_interrupt+0x100/0x210
[10582.157433] __sysvec_apic_timer_interrupt+0x5d/0xd0
[10582.158000] asm_call_irq_on_stack+0xf/0x20
[10582.158548] </IRQ>
[10582.159057] sysvec_apic_timer_interrupt+0x73/0x80
[10582.159606] asm_sysvec_apic_timer_interrupt+0x12/0x20
[10582.160150] RIP: 0010:rt_flush_dev+0x84/0xb0
[10582.160675] Code: ff ff 48 39 c6 74 36 48 39 1a 75 1e 48 8b 0d ab 60 35 02 48 89 0a 48 8b 89 d0 04 00 00 65 ff 01 48 8b 8b d0 04 00 00 65 ff 09 <48> 8b 8a a0 00 00 00 48 8d 91 60 ff ff ff 48 39 ce 75 ca 4c 89 e7
[10582.161902] RSP: 0018:ffffa0ea10d2bd10 EFLAGS: 00000287
[10582.162476] RAX: ffff8ea0b2718e20 RBX: ffff8e958125d000 RCX: ffff8e7f0152e820
[10582.163097] RDX: ffff8e7f0152e780 RSI: ffff8eb4433b4988 RDI: ffff8eb4433b4980
[10582.163705] RBP: 0000000000034980 R08: 0000000000000000 R09: 0000000000000016
[10582.164302] R10: 00000000ffffffff R11: 0000000000000001 R12: ffff8eb4433b4980
[10582.164911] R13: 0000000000000016 R14: 0000000000000020 R15: 0000000000000000
[10582.165528] fib_netdev_event+0x110/0x140
[10582.166071] raw_notifier_call_chain+0x41/0x50
[10582.166628] ? dev_disable_lro+0xe0/0xe0
[10582.167172] rollback_registered_many+0x320/0x5b0
[10582.167728] unregister_netdevice_many+0x17/0x70
[10582.168276] default_device_exit_batch+0x131/0x150
[10582.168834] ? do_wait_intr_irq+0xa0/0xa0
[10582.169391] cleanup_net+0x224/0x340
[10582.169942] process_one_work+0x19e/0x340
[10582.170503] worker_thread+0x30/0x360
[10582.171037] ? process_one_work+0x340/0x340
[10582.171575] kthread+0x116/0x130
[10582.172102] ? __kthread_cancel_work+0x40/0x40
[10582.172648] ret_from_fork+0x1f/0x30
[10582.174524] Kernel Offset: 0x22000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

[Expected result]: Stability testing does not crash the environment.
[Actual result]: The environment crashes after about 1 hour of stress-ng testing.
[Reproducibility]: Reproduces with high probability.
[Environment]:

Kernel:
# uname -r
5.10.84-10_rc2.an8.x86_64

Machine type: ECS

Operating system:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.4"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.4"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.4"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

# free -mh
       total   used    free    shared  buff/cache  available
Mem:   247Gi   820Mi   245Gi   2.0Mi   755Mi       244Gi
Swap:  0B      0B      0B

# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
BIOS Vendor ID: Alibaba Cloud
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz
BIOS Model name: pc-i440fx-2.1
Stepping: 6
CPU MHz: 2699.998
BogoMIPS: 5399.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 48K
L1i cache: 32K
L2 cache: 1280K
L3 cache: 49152K
NUMA node0 CPU(s): 0-63
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm arch_capabilities
Steps To Reproduce |
Install the latest version of stress-ng, then set the system parameters:

echo 1 > /proc/sys/kernel/panic
echo 1 > /proc/sys/kernel/hardlockup_panic
echo 1 > /proc/sys/kernel/softlockup_panic
echo 50 > /proc/sys/kernel/watchdog_thresh
echo 1200 > /proc/sys/kernel/hung_task_timeout_secs
echo 0 > /proc/sys/kernel/hung_task_panic

Run the stress test:

nohup stress-ng -a 1 -x seccomp,mmap,mmapaddr,mmapfixed,mmapfork,mmapmany,mremap,rlimit,stack,bigheap,env,brk,bad-altstack,aio,sysfs,bad-altstack,shm,close,clock,fallocate,l1cache,pci,sigio,rlimit,binderfs,munmap,softlockup,resources,fifo,set,zlib,wcs,tree,splice,sockfd,sctp,radixsort,pipe,mergesort,key,inotify,heapsort,epoll,dccp,cap,aiol,vforkmany,switch,sock,cyclic,cpu-online,mlockmany,oom-pipe,sysinval,watchdog -t 72h --metrics --times --verify -v -Y /disk1/tmpdir/stress-ng/stress-statistic-11.yaml --log-file /disk1/tmpdir/stress-ng/stress-logfile-11.txt --temp-path /disk1/tmpdir/stress-ng/ --oomable --skip-silent &
Tags | No tags attached. | ||||
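
The `echo` settings in the reproduction steps last only until the next reboot, and the test machine reboots each time it panics. The sketch below prints the same six values in `sysctl.conf` syntax so they could be persisted. It also computes the soft-lockup threshold they imply, since the kernel reports a soft lockup after twice `watchdog_thresh` seconds. The function names `emit_lockup_sysctls` and `softlockup_threshold` are hypothetical helpers and do not come from the report.

```shell
#!/bin/sh
# Print the report's reproduction settings in sysctl.conf syntax.
# Key names mirror the /proc/sys/kernel/* paths written by the echo commands.
emit_lockup_sysctls() {
    cat <<'EOF'
kernel.panic = 1
kernel.hardlockup_panic = 1
kernel.softlockup_panic = 1
kernel.watchdog_thresh = 50
kernel.hung_task_timeout_secs = 1200
kernel.hung_task_panic = 0
EOF
}

# Soft-lockup threshold in seconds for a given watchdog_thresh value.
# The kernel uses 2 * watchdog_thresh for the soft-lockup detector.
softlockup_threshold() {
    echo $(( $1 * 2 ))
}
```

With `watchdog_thresh` set to 50, `softlockup_threshold 50` gives 100 seconds. That means the worker in the trace stayed on the CPU without rescheduling for roughly that long before the watchdog fired. One possible way to persist the settings, run as root, is to redirect `emit_lockup_sysctls` into a file under `/etc/sysctl.d/` (the file name is up to you) and then run `sysctl --system`.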
Date Modified | Username | Field | Change |
---|---|---|---|
2022-01-11 15:05 | kangwen429 | New issue | |
2022-01-11 15:05 | kangwen429 | File added: vmcore-dmesg.rar | |
2022-01-11 15:06 | kangwen429 | Title | [Anolis 8.4-5.10-x86] After upgrading to kernel 5.10.84-10_rc2.an8.x86, a crash occurs during stability testing: watchdog: BUG: soft lockup RIP: 0010:rt_flush_dev+0x80/0xc0 => [Anolis 8.4-5.10-x86] After upgrading to kernel 5.10.84-10_rc2.an8.x86, a crash occurs during stability testing: watchdog: BUG: soft lockup RIP: 0010:rt_flush_dev+0x8 |
2022-01-11 15:06 | kangwen429 | Description updated | |
2022-01-11 15:07 | kangwen429 | Title | [Anolis 8.4-5.10-x86] After upgrading to kernel 5.10.84-10_rc2.an8.x86, a crash occurs during stability testing: watchdog: BUG: soft lockup RIP: 0010:rt_flush_dev+0x8 => [Anolis 8.4-5.10-x86] After upgrading to kernel 5.10.84-10_rc2.an8.x86, a crash occurs during stability testing: watchdog: BUG: soft lockup RIP: 0010:rt_flush_dev+0x84/0xb0 |
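
The history shows the title's `RIP` offset being corrected twice by hand. A small filter like the one below could pull the `symbol+offset/size` value straight out of the dmesg text, so the title can be checked against the log. This is a minimal sketch: `extract_rip` is a hypothetical helper name, and the pattern assumes the `RIP: <segment>:<symbol>+0x<offset>/0x<size>` layout seen in this report's log.

```shell
#!/bin/sh
# Read dmesg text on stdin and print symbol+0xOFF/0xSIZE for each RIP line.
# Lines whose RIP value is incomplete (no "/0x<size>" part) print nothing.
extract_rip() {
    sed -n 's/.*RIP: [0-9a-fA-F]*:\([A-Za-z0-9_.]*+0x[0-9a-f]*\/0x[0-9a-f]*\).*/\1/p'
}
```

Feeding it the log line `[10582.160150] RIP: 0010:rt_flush_dev+0x84/0xb0` prints `rt_flush_dev+0x84/0xb0`. A truncated value such as `rt_flush_dev+0x8` produces no output, which flags it as incomplete. For a whole log, something like `extract_rip < vmcore-dmesg.txt` would work, where the file name is an assumption about how the attached archive unpacks.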