View Issue Details

ID: 0000036
Project: Anolis OS 8
Category: kernel
View Status: public
Last Update: 2021-10-14 16:35
Reporter: wb-zmy745940
Assigned To: jacobwang
Priority: low
Severity: minor
Reproducibility: always
Status: assigned
Resolution: open
Platform: x86_64/aarch64
OS: Anolis OS
OS Version: 8
Product Version: 8.2-rc1
Target Version: 8.2 official release
Summary: 0000036: [Anolis 8.2-RC1-4.18-x86]dmesg check:unchecked MSR access error: RDMSR from 0x1fc at rIP: 0xffffffff8cc61db3 (native_read_msr+0x
Description: [Defect description]:
Upstream has already identified and fixed this issue: https://bugzilla.kernel.org/show_bug.cgi?id=203637

dmesg contains the following error log entries:

[    1.097211] intel_idle: v0.4.1 model 0x55
[    1.097265] unchecked MSR access error: RDMSR from 0x1fc at rIP: 0xffffffff8cc61db3 (native_read_msr+0x3/0x30)
[    1.097266] Call Trace:
[    1.097301]  intel_idle_cpu_online+0x6c/0xf0
[    1.097323]  cpuhp_invoke_callback+0x8d/0x500
[    1.097338]  ? sort_range+0x20/0x20
[    1.097340]  cpuhp_thread_fun+0xcb/0x130
[    1.097342]  smpboot_thread_fn+0xc5/0x160
[    1.097352]  kthread+0x112/0x130
[    1.097354]  ? kthread_flush_work_fn+0x10/0x10
[    1.097363]  ret_from_fork+0x35/0x40
[    1.097378] unchecked MSR access error: WRMSR to 0x1fc (tried to write 0x0000000000000000) at rIP: 0xffffffff8cc61f14 (native_write_msr+0x4/0x20)
[    1.097379] Call Trace:
[    1.097381]  intel_idle_cpu_online+0x83/0xf0
[    1.097382]  cpuhp_invoke_callback+0x8d/0x500
[    1.097384]  ? sort_range+0x20/0x20
[    1.097386]  cpuhp_thread_fun+0xcb/0x130
[    1.097387]  smpboot_thread_fn+0xc5/0x160
[    1.097389]  kthread+0x112/0x130
[    1.097390]  ? kthread_flush_work_fn+0x10/0x10
[    1.097391]  ret_from_fork+0x35/0x40
[    1.097526] intel_idle: lapic_timer_reliable_states 0x2
[    1.097650] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[    1.097671] ACPI: Power Button [PWRF]
[    1.116885] PCI Interrupt Link [LNKC] enabled at IRQ 10
[    1.116935] virtio-pci 0000:00:03.0: virtio_pci: leaving for legacy driver
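
The check described in this report (scanning dmesg for "unchecked MSR access error" lines) can be sketched as below. This is a minimal illustration, not part of the original report: the function name `find_msr_errors` and the regular expression are assumptions written against the log format shown above, and the sample text is a shortened copy of that log. In the kernel's msr-index.h, 0x1fc is named MSR_IA32_POWER_CTL.

```python
import re

# Pattern written against the "unchecked MSR access error" lines quoted in this
# report; other kernel versions may format the message differently.
MSR_ERROR = re.compile(
    r"unchecked MSR access error: (?P<op>RDMSR|WRMSR) (?:from|to) "
    r"(?P<msr>0x[0-9a-fA-F]+).*?\((?P<symbol>[\w.]+)\+"
)


def find_msr_errors(dmesg_text):
    """Return (operation, msr_index, faulting_symbol) for each matching line."""
    hits = []
    for line in dmesg_text.splitlines():
        match = MSR_ERROR.search(line)
        if match:
            hits.append(
                (match.group("op"), int(match.group("msr"), 16), match.group("symbol"))
            )
    return hits


# Shortened copy of the log above, used only as sample input.
SAMPLE = """\
[    1.097211] intel_idle: v0.4.1 model 0x55
[    1.097265] unchecked MSR access error: RDMSR from 0x1fc at rIP: 0xffffffff8cc61db3 (native_read_msr+0x3/0x30)
[    1.097378] unchecked MSR access error: WRMSR to 0x1fc (tried to write 0x0000000000000000) at rIP: 0xffffffff8cc61f14 (native_write_msr+0x4/0x20)
[    1.097526] intel_idle: lapic_timer_reliable_states 0x2
"""

if __name__ == "__main__":
    for op, msr, symbol in find_msr_errors(SAMPLE):
        print(f"{op} MSR {msr:#x} in {symbol}")
```

On a live system the same function could be fed the output of `dmesg` instead of the sample string; an empty result would correspond to the expected outcome of this test case.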



[Reproducibility]

Always reproducible

 

[Reproduction environment]

Host: virtual machine, x86

OS: Anolis OS release 8.2

kernel: 4.18.0-193.el8.x86_64



# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           2
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
Stepping:            7
CPU MHz:             2500.000
BogoMIPS:            5000.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_vnni



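[Steps to reproduce]:
After boot, check the dmesg log.

[Expected result]:
The dmesg log contains no error messages.

[Actual result]:
The dmesg log contains error messages.

[Root cause]:

[Suggested fix]:
Tags: No tags attached.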

Activities

jacobwang

2021-04-13 10:38

manager   ~0000063

This is an expected issue on 4.18. Please help check whether the corresponding 4.19 kernel still has it.

wb-wpp899309

2021-04-14 10:53

reporter   ~0000080

1. Kernel: 4.19.91-23.1.an8.x86_64, machine model: F62M1. The same log entries appear.
Partial dmesg log:
[ 1.418938] intel_idle: MWAIT substates: 0x2020
[ 1.418948] intel_idle: ACPI _CST not found or not usable
[ 1.418948] intel_idle: v0.4.1 model 0x55
[ 1.419002] unchecked MSR access error: RDMSR from 0x1fc at rIP: 0xffffffffbc0670c3 (native_read_msr+0x3/0x30)
[ 1.419003] Call Trace:
[ 1.419039] intel_idle_cpu_online+0x71/0xf3
[ 1.419060] cpuhp_invoke_callback+0x9b/0x540
[ 1.419074] ? sort_range+0x20/0x20
[ 1.419076] cpuhp_thread_fun+0xb0/0x110
[ 1.419080] smpboot_thread_fn+0xc5/0x160
[ 1.419088] kthread+0x112/0x130
[ 1.419093] ? kthread_park+0x80/0x80
[ 1.419102] ret_from_fork+0x35/0x40
[ 1.419120] unchecked MSR access error: WRMSR to 0x1fc (tried to write 0x0000000000000000) at rIP: 0xffffffffbc067134 (native_write_msr+0x4/0x20)
[ 1.419121] Call Trace:
[ 1.419122] intel_idle_cpu_online+0x88/0xf3
[ 1.419125] cpuhp_invoke_callback+0x9b/0x540
[ 1.419126] ? sort_range+0x20/0x20
[ 1.419128] cpuhp_thread_fun+0xb0/0x110
[ 1.419129] smpboot_thread_fn+0xc5/0x160
[ 1.419130] kthread+0x112/0x130
[ 1.419131] ? kthread_park+0x80/0x80
[ 1.419133] ret_from_fork+0x35/0x40
[ 1.419279] intel_idle: lapic_timer_reliable_states 0x2
2. There is also one warning log entry:
[ 3.953952] random: 7 urandom warning(s) missed due to ratelimiting

jacobwang

2021-04-24 23:48

经理   ~0000108

Tested in the following scenarios; could not reproduce the issue.

- 'Alibaba ECS' with instance types ecs.g5.large and ecs.s6-c1m1.small.
- VM created on an Intel Sandy Bridge platform.

Perhaps this issue is specific to certain Intel platforms.
Will continue to check on QA's test platform.
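
To compare platforms across test machines, the relevant `lscpu` fields can be collected with a small script. The sketch below is only an illustration: the helper names `parse_lscpu` and `platform_signature` are hypothetical, and the sample input is a subset of the `lscpu` output quoted in this issue. It reports the model in hex so it can be matched against the "intel_idle: ... model 0x55" line in dmesg.

```python
def parse_lscpu(text):
    """Parse 'Key: value' lines from lscpu output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def platform_signature(fields):
    """Return (vendor, family, model in hex, hypervisor vendor or None)."""
    return (
        fields.get("Vendor ID"),
        int(fields["CPU family"]),
        hex(int(fields["Model"])),
        fields.get("Hypervisor vendor"),
    )


# Subset of the lscpu output from the affected machine, used as sample input.
SAMPLE = """\
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
Hypervisor vendor:   KVM
"""

if __name__ == "__main__":
    print(platform_signature(parse_lscpu(SAMPLE)))
```

Running this on both a reproducing and a non-reproducing machine gives two tuples that can be compared directly.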

wb-wpp899309

2021-10-14 16:35

reporter   ~0000492

Anolis OS 7.7 has the same issue:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="7.7"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="7.7"
PRETTY_NAME="Anolis OS 7.7"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugs.openanolis.cn/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

# uname -r
4.19.91-24.8.an7.x86_64
[root@localhost ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
Stepping: 7
CPU MHz: 2500.000
BogoMIPS: 5000.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 36608K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_vnni

dmesg log:
# dmesg -l warn -T
[Thu Oct 14 15:59:39 2021] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[Thu Oct 14 15:59:39 2021] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
[Thu Oct 14 15:59:39 2021] #2
[Thu Oct 14 15:59:39 2021] #3
[Thu Oct 14 15:59:39 2021] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[Thu Oct 14 15:59:40 2021] unchecked MSR access error: RDMSR from 0x1fc at rIP: 0xffffffffb606c023 (native_read_msr+0x3/0x30)
[Thu Oct 14 15:59:40 2021] Call Trace:
[Thu Oct 14 15:59:40 2021] intel_idle_cpu_online+0x71/0xf5
[Thu Oct 14 15:59:40 2021] cpuhp_invoke_callback+0x9a/0x5b0
[Thu Oct 14 15:59:40 2021] cpuhp_thread_fun+0xb8/0x130
[Thu Oct 14 15:59:40 2021] smpboot_thread_fn+0x10e/0x160
[Thu Oct 14 15:59:40 2021] kthread+0xf8/0x130
[Thu Oct 14 15:59:40 2021] ? sort_range+0x20/0x20
[Thu Oct 14 15:59:40 2021] ? kthread_park+0xb0/0xb0
[Thu Oct 14 15:59:40 2021] ret_from_fork+0x35/0x40
[Thu Oct 14 15:59:40 2021] unchecked MSR access error: WRMSR to 0x1fc (tried to write 0x0000000000000000) at rIP: 0xffffffffb606c094 (native_write_msr+0x4/0x20)
[Thu Oct 14 15:59:40 2021] Call Trace:
[Thu Oct 14 15:59:40 2021] intel_idle_cpu_online+0x88/0xf5
[Thu Oct 14 15:59:40 2021] cpuhp_invoke_callback+0x9a/0x5b0
[Thu Oct 14 15:59:40 2021] cpuhp_thread_fun+0xb8/0x130
[Thu Oct 14 15:59:40 2021] smpboot_thread_fn+0x10e/0x160
[Thu Oct 14 15:59:40 2021] kthread+0xf8/0x130
[Thu Oct 14 15:59:40 2021] ? sort_range+0x20/0x20
[Thu Oct 14 15:59:40 2021] ? kthread_park+0xb0/0xb0
[Thu Oct 14 15:59:40 2021] ret_from_fork+0x35/0x40
[Thu Oct 14 15:59:40 2021] systemd[1]: [/run/systemd/generator/dev-mapper-ao\x2droot.device.d/timeout.conf:3] Unknown lvalue 'JobRunningTimeoutSec' in section 'Unit'
[Thu Oct 14 15:59:45 2021] systemd: 16 output lines suppressed due to ratelimiting

Issue History

Date Modified Username Field Change
2021-03-19 16:17 wb-zmy745940 New Issue
2021-03-19 16:17 wb-zmy745940 Status new => assigned
2021-03-19 16:17 wb-zmy745940 Assigned To => jacobwang
2021-03-19 16:18 wb-zmy745940 Platform x86_64 => x86_64/aarch64
2021-03-19 22:44 swordantcs Product Version 8.2 official release => 8.2-rc1
2021-03-19 22:45 swordantcs Target Version => 8.2-rc2
2021-04-02 03:22 swordantcs Category - aliyun-images => (no category)
2021-04-02 12:18 swordantcs Category (no category) => kernel
2021-04-13 10:38 jacobwang Note Added: 0000063
2021-04-13 23:09 jacobwang Assigned To jacobwang => qingming
2021-04-14 10:53 wb-wpp899309 Note Added: 0000080
2021-04-15 10:36 jacobwang Assigned To qingming => jacobwang
2021-04-15 10:36 jacobwang Target Version 8.2-rc2 => 8.2 official release
2021-04-24 23:48 jacobwang Note Added: 0000108
2021-10-14 16:35 wb-wpp899309 Note Added: 0000492