View Issue Details

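ID: 0000497
Project: Anolis OS 7
Category: General
View Status: public
Last Update: 2021-11-16 15:27
Reporter: anolislw
Assigned To: Shiloong
Priority: low
Severity: minor
Reproducibility: always
Status: assigned
Resolution: open
Platform: x86_64
OS: Anolis OS
OS Version: 7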
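Summary: 0000497: [Anolis7.7-anck vhd][ecs][x86_64] Abnormal messages in dmesg after instance reboot
Description: [Problem description]
Anolis7.7-anck vhd ecs x86_64: abnormal messages appear in dmesg after the instance is rebooted.
Image: anolisos_7_7_x64_20G_anck_alibase_20211105.vhd
Instance: ecs.g7.32xlarge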

[Machine details]
[root@iZbp13tgwor95du6lgnbjzZ ~]# uname -a
Linux iZbp13tgwor95du6lgnbjzZ 4.19.91-24.8.an7.x86_64 #1 SMP Sat Sep 18 16:53:17 CST 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@iZbp13tgwor95du6lgnbjzZ ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="7.7"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="7.7"
PRETTY_NAME="Anolis OS 7.7"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugs.openanolis.cn/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@iZbp13tgwor95du6lgnbjzZ ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.19.91-24.8.an7.x86_64 root=UUID=678fa482-e865-43e6-a841-68840e679524 ro crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M cryptomgr.notests cgroup.memory=nokmem rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api spectre_v2=retpoline biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295
[root@iZbp13tgwor95du6lgnbjzZ ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz
Stepping: 6
CPU MHz: 2699.998
BogoMIPS: 5399.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 48K
L1i cache: 32K
L2 cache: 1280K
L3 cache: 49152K
NUMA node0 CPU(s): 0-63
NUMA node1 CPU(s): 64-127
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
[root@iZbp13tgwor95du6lgnbjzZ ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 248G 0 248G 0% /dev
tmpfs 248G 0 248G 0% /dev/shm
tmpfs 248G 1.1M 248G 1% /run
tmpfs 248G 0 248G 0% /sys/fs/cgroup
/dev/vda1 40G 13G 25G 34% /
tmpfs 50G 0 50G 0% /run/user/0
[root@iZbp13tgwor95du6lgnbjzZ ~]# free -m
              total used free shared buff/cache available
Mem: 507603 251675 249019 1 6908 253096
Swap: 0 0 0


[Error details]
[root@iZbp13tgwor95du6lgnbjzZ ~]# dmesg -l warn -T
[Sat Nov 6 02:24:37 2021] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[Sat Nov 6 02:24:37 2021] #2
[Sat Nov 6 02:24:37 2021] #3
[Sat Nov 6 02:24:37 2021] #4
[Sat Nov 6 02:24:37 2021] #5
......
......
[Sat Nov 6 02:24:37 2021] #29
[Sat Nov 6 02:24:37 2021] #30
[Sat Nov 6 02:24:37 2021] #31
[Sat Nov 6 02:24:37 2021] #32
[Sat Nov 6 02:24:37 2021] #33
[Sat Nov 6 02:24:37 2021] #34
[Sat Nov 6 02:24:37 2021] #35
[Sat Nov 6 02:24:37 2021] #36
[Sat Nov 6 02:24:37 2021] #37
[Sat Nov 6 02:24:37 2021] #38
[Sat Nov 6 02:24:37 2021] #39
[Sat Nov 6 02:24:37 2021] #40
[Sat Nov 6 02:24:37 2021] #41
[Sat Nov 6 02:24:37 2021] #42
[Sat Nov 6 02:24:37 2021] #43
[Sat Nov 6 02:24:37 2021] #44
[Sat Nov 6 02:24:38 2021] #45
[Sat Nov 6 02:24:38 2021] #46
[Sat Nov 6 02:24:38 2021] #47
[Sat Nov 6 02:24:38 2021] #48
[Sat Nov 6 02:24:38 2021] #49
[Sat Nov 6 02:24:38 2021] #50
[Sat Nov 6 02:24:38 2021] #51
[Sat Nov 6 02:24:38 2021] #52
[Sat Nov 6 02:24:38 2021] #53
[Sat Nov 6 02:24:38 2021] #54
[Sat Nov 6 02:24:38 2021] #55
[Sat Nov 6 02:24:38 2021] #56
[Sat Nov 6 02:24:38 2021] #57
[Sat Nov 6 02:24:38 2021] #58
[Sat Nov 6 02:24:38 2021] #59
[Sat Nov 6 02:24:38 2021] #60
[Sat Nov 6 02:24:38 2021] #61
[Sat Nov 6 02:24:38 2021] #62
[Sat Nov 6 02:24:38 2021] #63
[Sat Nov 6 02:24:34 2021] ------------[ cut here ]------------
[Sat Nov 6 02:24:34 2021] sched: CPU #64's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[Sat Nov 6 02:24:34 2021] WARNING: CPU: 64 PID: 0 at arch/x86/kernel/smpboot.c:426 topology_sane.isra.5+0x63/0x70
[Sat Nov 6 02:24:34 2021] Modules linked in:
[Sat Nov 6 02:24:34 2021] CPU: 64 PID: 0 Comm: swapper/64 Not tainted 4.19.91-24.8.an7.x86_64 #1
[Sat Nov 6 02:24:34 2021] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 90210cb 04/01/2014
[Sat Nov 6 02:24:34 2021] RIP: 0010:topology_sane.isra.5+0x63/0x70
[Sat Nov 6 02:24:34 2021] Code: f0 01 5b c3 80 3d 16 83 3d 01 00 75 f0 45 8b 0c 09 31 c0 89 f1 89 fe 48 c7 c7 70 ef 0a 94 c6 05 fc 82 3d 01 01 e8 dd 4b 04 00 <0f> 0b eb cf 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 0f b6 05 18
[Sat Nov 6 02:24:34 2021] RSP: 0000:ffffb745cc6dfed0 EFLAGS: 00010086
[Sat Nov 6 02:24:34 2021] RAX: 0000000000000061 RBX: ffff945342c0f001 RCX: ffffffff94269568
[Sat Nov 6 02:24:34 2021] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000046
[Sat Nov 6 02:24:34 2021] RBP: 0000000000000040 R08: 0000000000000000 R09: 0000000000000005
[Sat Nov 6 02:24:34 2021] R10: 0000000000000000 R11: ffffb745cc6dfc60 R12: 000000000000f020
[Sat Nov 6 02:24:34 2021] R13: 0000000000000040 R14: 0000000000000000 R15: 0000000000000001
[Sat Nov 6 02:24:34 2021] FS: 0000000000000000(0000) GS:ffff945342c00000(0000) knlGS:0000000000000000
[Sat Nov 6 02:24:34 2021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sat Nov 6 02:24:34 2021] CR2: 0000000000000000 CR3: 000000226820a001 CR4: 00000000003706e0
[Sat Nov 6 02:24:34 2021] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Sat Nov 6 02:24:34 2021] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Sat Nov 6 02:24:34 2021] Call Trace:
[Sat Nov 6 02:24:34 2021] set_cpu_sibling_map+0x162/0x5a0
[Sat Nov 6 02:24:34 2021] start_secondary+0x9f/0x1e0
[Sat Nov 6 02:24:34 2021] secondary_startup_64+0xa4/0xb0
[Sat Nov 6 02:24:34 2021] ---[ end trace 1d46b6b5544b02c7 ]---
[Sat Nov 6 02:24:38 2021] #65
[Sat Nov 6 02:24:38 2021] #66
[Sat Nov 6 02:24:38 2021] #67
[Sat Nov 6 02:24:38 2021] #68
......
......
[Sat Nov 6 02:24:39 2021] #125
[Sat Nov 6 02:24:39 2021] #126
[Sat Nov 6 02:24:39 2021] #127
[Sat Nov 6 02:24:39 2021] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[Sat Nov 6 02:24:43 2021] systemd[1]: [/run/systemd/generator/dev-disk-by\x2duuid-678fa482\x2de865\x2d43e6\x2da841\x2d68840e679524.device.d/timeout.conf:3] Unknown lvalue 'JobRunningTimeoutSec' in section 'Unit'
[Sat Nov 6 02:24:44 2021] systemd: 15 output lines suppressed due to ratelimiting
[Sat Nov 6 02:24:44 2021] systemd-journald[2380]: File /var/log/journal/20211105100401200910406909548250/system.journal corrupted or uncleanly shut down, renaming and replacing.
[Sat Nov 6 02:29:23 2021] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
Steps To Reproduce: [Reproduction]
dmesg -l warn -T
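
A minimal sketch for spotting the scheduler topology warning in captured output of the command above. It assumes the dmesg text is already available as a string; the function name `find_sibling_node_mismatches` and the regular expression are illustrative and modelled only on the message format seen in this report.

```python
import re

# Pattern modelled on the message in this report, e.g.
#   sched: CPU #64's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]
# The "#" is optional and leading zeros are tolerated (int() strips them).
# Assumption: the format is inferred from this single log, not from a spec.
MISMATCH_RE = re.compile(
    r"sched: CPU #?(\d+)'s (\S+)-sibling CPU #?(\d+) is not on the same node! "
    r"\[node: (\d+) != (\d+)\]"
)


def find_sibling_node_mismatches(dmesg_text):
    """Return one dict per sibling/node mismatch message found in dmesg_text."""
    results = []
    for line in dmesg_text.splitlines():
        match = MISMATCH_RE.search(line)
        if match is None:
            continue
        cpu, kind, sibling, cpu_node, sibling_node = match.groups()
        results.append(
            {
                "cpu": int(cpu),
                "kind": kind,  # sibling type as printed, e.g. "llc"
                "sibling": int(sibling),
                "cpu_node": int(cpu_node),
                "sibling_node": int(sibling_node),
            }
        )
    return results


if __name__ == "__main__":
    sample = (
        "[Sat Nov 6 02:24:34 2021] sched: CPU #64's llc-sibling CPU #0 "
        "is not on the same node! [node: 1 != 0]. Ignoring dependency."
    )
    for hit in find_sibling_node_mismatches(sample):
        print(hit)
```

The input text could be captured from `dmesg -l warn -T`, for example with `subprocess.run(..., capture_output=True, text=True)`.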
Tags: No tags attached.

Activities

Shiloong

2021-11-11 15:07

developer   ~0000658

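Known issue on ICX: https://lore.kernel.org/lkml/20210216195804.24204-1-alison.schofield@intel.com/
The bugfix has not yet been merged upstream. The issue only produces a warning and has no actual functional impact, so it will not be fixed for now.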
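
To illustrate what the warning reports, the sketch below maps CPUs to NUMA nodes using the `lscpu` ranges from this report (`0-63` for node0, `64-127` for node1) and checks whether two CPUs are on the same node. The helper names are hypothetical, and this only approximates the idea behind the check; it is not the kernel's `topology_sane` code.

```python
def parse_cpu_list(spec):
    """Expand a CPU list string such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_of(cpu, node_cpu_lists):
    """Return the NUMA node whose CPU list contains cpu, or None."""
    for node, spec in node_cpu_lists.items():
        if cpu in parse_cpu_list(spec):
            return node
    return None


def same_node(cpu_a, cpu_b, node_cpu_lists):
    """True if both CPUs are known and belong to the same NUMA node."""
    node_a = node_of(cpu_a, node_cpu_lists)
    node_b = node_of(cpu_b, node_cpu_lists)
    return node_a is not None and node_a == node_b


if __name__ == "__main__":
    # NUMA layout copied from the lscpu output in this report.
    nodes = {0: "0-63", 1: "64-127"}
    # CPU 64 and CPU 0 are the pair named in the warning.
    print(same_node(64, 0, nodes))  # False
```

With this layout CPU 64 is on node 1 and CPU 0 is on node 0, which matches the `[node: 1 != 0]` part of the message.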

Shiloong

2021-11-11 15:08

developer   ~0000659

Aone: 34275025

Issue History

Date Modified     Username   Field        Change
2021-11-08 16:51  anolislw   New Issue
2021-11-11 15:07  Shiloong   Note Added: 0000658
2021-11-11 15:07  Shiloong   Assigned To  => Shiloong
2021-11-11 15:07  Shiloong   Status       new => assigned
2021-11-11 15:08  Shiloong   Note Added: 0000659
2021-11-16 15:27  jacobwang  Priority     normal => low