View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000345 | Anolis OS 8 | - cloud kernel 4.19 | public | 2021-10-16 16:56 | 2021-11-23 16:07 |
Reporter | guanjun | Assigned To | guanjun | ||
Priority | high | Severity | major | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Summary | 0000345: ANCK 4.19 triggers a divide-by-zero error under certain conditions | ||||
Description | A standard NVMe drive supports 129 queues in hardware. The latest NVMe driver allocates IRQs as the smaller of the possible CPU count (on a machine with 128 CPUs and HT disabled, possible cpus = 256) and the number of queues the hardware supports (129), so it allocates 129 IRQs. In this case the kernel hits a divide-by-zero bug, with the following call stack:
4.19.91-22.2.al7.x86_64
2021-10-02 18:41:58 [ 31.234817] CPU: 9 PID: 1154 Comm: kworker/u513:2 Not tainted 4.19.91-22.2.al7.x86_64 #1
2021-10-02 18:41:58 [ 31.234819] Hardware name: Inventec Horsea-12U /Horsea-F , BIOS 1.1.EY.IV.D.060.02 08/17/2020
2021-10-02 18:41:58 [ 31.234827] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
2021-10-02 18:41:58 [ 31.270457] RIP: 0010:__irq_build_affinity_masks.isra.3+0x17a/0x360
2021-10-02 18:41:58 [ 31.270459] Code: 24 14 48 63 54 24 24 48 c1 e2 06 48 03 54 24 28 89 c3 e8 c9 c1 35 00 be 00 02 00 00 4c 89 ef e8 4c c7 35 00 39 c3 0f 4f d8 99 <f7> fb 85 db 89 5c 24 10 89 54 24 04 89 44 24 08 0f 8e b8 01 00 00
2021-10-02 18:41:58 [ 31.306520] RSP: 0018:ffffb2025cd47b60 EFLAGS: 00010287
2021-10-02 18:41:58 [ 31.306522] RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000200
2021-10-02 18:41:58 [ 31.306522] RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000
2021-10-02 18:41:58 [ 31.306525] RBP: 0000000000000040 R08: 0000000000000010 R09: 0000000000000008
2021-10-02 18:41:58 [ 31.344929] R10: ffffb2025cd47bf0 R11: ffffd2393d0cd400 R12: 0000000000000040
2021-10-02 18:41:58 [ 31.344929] R13: ffffb2025cd47bf0 R14: 0000000000000081 R15: 000000000000f1a0
2021-10-02 18:41:58 [ 31.344930] FS: 0000000000000000(0000) GS:ffff8dc81f640000(0000) knlGS:0000000000000000
2021-10-02 18:41:58 [ 31.344931] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2021-10-02 18:41:58 [ 31.344932] CR2: 00007f416fa9722d CR3: 0000003f4e97c000 CR4: 0000000000340ee0
2021-10-02 18:41:58 [ 31.344932] Call Trace:
2021-10-02 18:41:58 [ 31.344936] irq_build_affinity_masks.isra.4+0xf3/0x170
2021-10-02 18:41:58 [ 31.344938] irq_create_affinity_masks+0x205/0x300
2021-10-02 18:41:58 [ 31.344941] __pci_enable_msix_range+0x209/0x520
2021-10-02 18:41:58 [ 31.344942] pci_alloc_irq_vectors_affinity+0xbb/0x110
2021-10-02 18:41:58 [ 31.344944] nvme_reset_work+0xad2/0x162d [nvme]
2021-10-02 18:41:58 [ 31.344948] ? dequeue_entity+0x1e6/0x970
2021-10-02 18:41:58 [ 31.344950] ? sched_clock+0x5/0x10
2021-10-02 18:41:58 [ 31.344952] ? sched_clock_cpu+0xc/0xa0
2021-10-02 18:41:58 [ 31.344953] ? try_to_wake_up+0x219/0x580
2021-10-02 18:41:58 [ 31.344954] process_one_work+0x15b/0x370
2021-10-02 18:41:58 [ 31.344956] worker_thread+0x49/0x3e0
2021-10-02 18:41:58 [ 31.344957] kthread+0xf8/0x130
2021-10-02 18:41:58 [ 31.344958] ? process_one_work+0x370/0x370
2021-10-02 18:41:58 [ 31.344959] ? kthread_park+0xb0/0xb0
2021-10-02 18:41:58 [ 31.344961] ret_from_fork+0x1f/0x40
2021-10-02 18:41:58 [ 31.344963] Modules linked in: kvm_amd(+) sunrpc kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd cryptd glue_helper pcspkr nvme i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt vfat fb_sys_fops fat sp5100_tco nvme_core drm sg i2c_piix4 i2c_designware_platform ipmi_si(+) i2c_designware_core iosf_mbi ipmi_devintf i2c_core ipmi_msghandler pcc_cpufreq acpi_cpufreq ip_tables sd_mod crc32c_intel ahci libahci libata
2021-10-02 18:41:58 [ 31.344991] ---[ end trace 1506a87a8299d5c8 ]---
2021-10-02 18:41:59 [ 32.765477] RIP: 0010:__irq_build_affinity_masks.isra.3+0x17a/0x360
2021-10-02 18:41:59 [ 32.765480] Code: 24 14 48 63 54 24 24 48 c1 e2 06 48 03 54 24 28 89 c3 e8 c9 c1 35 00 be 00 02 00 00 4c 89 ef e8 4c c7 35 00 39 c3 0f 4f d8 99 <f7> fb 85 db 89 5c 24 10 89 54 24 04 89 44 24 08 0f 8e b8 01 00 00
2021-10-02 18:41:59 [ 32.765481] RSP: 0018:ffffb2025cd47b60 EFLAGS: 00010287
2021-10-02 18:41:59 [ 32.765482] RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000200
2021-10-02 18:41:59 [ 32.765482] RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000
2021-10-02 18:41:59 [ 32.765483] RBP: 0000000000000040 R08: 0000000000000010 R09: 0000000000000008
2021-10-02 18:41:59 [ 32.765483] R10: ffffb2025cd47bf0 R11: ffffd2393d0cd400 R12: 0000000000000040
2021-10-02 18:41:59 [ 32.765483] R13: ffffb2025cd47bf0 R14: 0000000000000081 R15: 000000000000f1a0
2021-10-02 18:41:59 [ 32.765484] FS: 0000000000000000(0000) GS:ffff8dc81f640000(0000) knlGS:0000000000000000
2021-10-02 18:41:59 [ 32.765485] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2021-10-02 18:41:59 [ 32.765486] CR2: 00007f416fa9722d CR3: 0000003f4e97c000 CR4: 0000000000340ee0
2021-10-02 18:41:59 [ 32.765487] Kernel panic - not syncing: Fatal exception
2021-10-02 18:41:59 [ 32.766622] Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Corresponding code range:
138 ncpus = cpumask_weight(nmsk);
139 vecs_to_assign = min(vecs_per_node, ncpus);
140
141 /* Account for rounding errors */
142 extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
140 in kernel/irq/affinity.c
141 in kernel/irq/affinity.c
142 in kernel/irq/affinity.c
0xffffffff81103d99 <+377>: cltd
0xffffffff81103d9a <+378>: idiv %ebx
0xffffffff81103da2 <+386>: mov %edx,0x4(%rsp)
0xffffffff81103da6 <+390>: mov %eax,0x8(%rsp) | ||||
Tags | No tags attached. | ||||
Date Modified | Username | Field | Change |
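The arithmetic quoted in the description (lines 139 and 142 of kernel/irq/affinity.c) can be reproduced in user space to show why the `idiv %ebx` faults. This is only a sketch under stated assumptions: `spread_extra_vecs` is a hypothetical helper, not a kernel function, and the zero check it adds is for illustration; it is not claimed to be the fix that resolved this issue. Reading the register dump as RAX = 0x10 (dividend, `ncpus` = 16) and RBX = 0 (divisor, `vecs_to_assign` = 0) is an inference from the oops, not something the report states.

```c
/*
 * Hypothetical user-space helper (assumption, not kernel code).
 * Mirrors the two statements quoted from kernel/irq/affinity.c:
 *   vecs_to_assign = min(vecs_per_node, ncpus);
 *   extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
 * and adds an explicit check so a zero divisor is reported instead of
 * raising a divide error.
 *
 * Returns 0 and stores the remainder in *extra_vecs on success,
 * or -1 (leaving *extra_vecs untouched) when the divisor would be zero.
 */
static int spread_extra_vecs(int vecs_per_node, int ncpus, int *extra_vecs)
{
    /* Same result as the kernel's min() for plain ints. */
    int vecs_to_assign = vecs_per_node < ncpus ? vecs_per_node : ncpus;

    /* With vecs_per_node == 0 this is 0, the value seen in RBX. */
    if (vecs_to_assign <= 0)
        return -1;

    /* Account for rounding errors, as in the quoted source. */
    *extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
    return 0;
}
```

With the values inferred from the dump, `spread_extra_vecs(0, 16, &extra)` returns -1, which is the case where the unguarded kernel expression divides by zero. A non-degenerate call such as `spread_extra_vecs(3, 16, &extra)` returns 0 and sets `extra` to 1.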
---|---|---|---|
2021-10-16 16:56 | guanjun | New Issue | |
2021-10-18 15:14 | geliwei-ali | Assigned To | => Shiloong |
2021-10-18 15:14 | geliwei-ali | Status | new => assigned |
2021-10-20 10:52 | guanjun | Note Added: 0000512 | |
2021-11-23 16:07 | Shiloong | Assigned To | Shiloong => guanjun |
2021-11-23 16:07 | Shiloong | Status | assigned => resolved |
2021-11-23 16:07 | Shiloong | Resolution | open => fixed |