View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000289 | Anolis OS 8 | - cloud kernel 4.19 | public | 2021-09-22 11:27 | 2021-09-22 11:27 |
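Reporter | CruzZhao | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | new | Resolution | open | ||
Platform | x86_64 | OS | Anolis OS | OS Version | 8 |
Product Version | 8.2-rc1 | ||||
Summary | 0000289: A container cannot exit if a core dump occurs while it is exiting | ||||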
Description |

If a core dump occurs while a container is exiting, the container cannot exit.

Symptom: the threads on the exit path are stuck waiting for the signal that the coredump has completed.

    [<0>] __refrigerator+0x75/0x160
    [<0>] do_exit+0x224/0xc60
    [<0>] do_group_exit+0x3a/0xa0
    [<0>] get_signal+0x156/0x8c0
    [<0>] do_signal+0x36/0x610
    [<0>] exit_to_usermode_loop+0x95/0x100
    [<0>] do_syscall_64+0x178/0x1a0
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<0>] 0xffffffffffffffff

    [<0>] __refrigerator+0x75/0x160
    [<0>] do_exit+0x224/0xc60
    [<0>] do_group_exit+0x3a/0xa0
    [<0>] get_signal+0x156/0x8c0
    [<0>] do_signal+0x36/0x610
    [<0>] exit_to_usermode_loop+0x95/0x100
    [<0>] do_syscall_64+0x178/0x1a0
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<0>] 0xffffffffffffffff

    [<0>] __refrigerator+0x75/0x160
    [<0>] do_exit+0x224/0xc60
    [<0>] do_group_exit+0x3a/0xa0
    [<0>] get_signal+0x156/0x8c0
    [<0>] do_signal+0x36/0x610
    [<0>] exit_to_usermode_loop+0x95/0x100
    [<0>] do_syscall_64+0x178/0x1a0
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<0>] 0xffffffffffffffff

    [root@bd033027000051.na610 /sys/fs/cgroup]
    #pstree -pa 244857
    containerd-shim,244857 -namespace default -workdir /home/t4/pouch/containerd/root/io.containerd.runtime.v1.linux/default/677c77263a5b2fe4c8677cd6a7180f8c2dff50fe4fa4b255a328de1b9c2ec612 -address /run/containerd/containerd.sock -containerd-binary...
      ├─(start.sh,244922)
      │   └─(holo-worker,245213)
      │       ├─{holo-worker},245457
      │       ├─{holo-worker},245458
      │       ├─{holo-worker},245584
      │       ├─{holo-worker},245603
      │       ├─{holo-worker},245604
      │       ├─{holo-worker},245787
      │       ├─{holo-worker},246131
      │       ├─{holo-worker},246132
      │       ├─{holo-worker},246133
      │       ├─{holo-worker},246134
      │       ├─{holo-worker},246135
      │       ├─{holo-worker},246136
      │       ├─{holo-worker},246137
      │       ├─{holo-worker},246138
      │       ├─{holo-worker},246144
      │       ├─{holo-worker},246153
      │       ├─{holo-worker},246172
      │       └─{holo-worker},249564
      ├─{containerd-shim},244858
      ├─{containerd-shim},244859
      ├─{containerd-shim},244860
      ├─{containerd-shim},244862
      ├─{containerd-shim},244863
      ├─{containerd-shim},244864
      ├─{containerd-shim},244865
      ├─{containerd-shim},244866
      ├─{containerd-shim},245167
      └─{containerd-shim},245435

None of the threads under PID 245213 can exit.
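Steps To Reproduce |

1. Trigger a coredump in a multithreaded process that has allocated a large amount of memory, so that the coredump synchronization takes longer.
2. While step 1 is running, perform cgroup freeze/unfreeze operations on the process. The call stack shown above appears.

Code:

    #cat a.c
    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Each worker thread just blocks until the process is killed. */
    void *thread(void *unused)
    {
        pause();
        return NULL;
    }

    int main()
    {
        int i;
        size_t len = 2 * 1024 * 1024 * 1024UL;  /* 2 GiB, to make the core dump slow */
        void *p = malloc(len);
        pthread_t tid;

        memset(p, 1, len);  /* touch every page so it is written to the core file */
        for (i = 0; i < 10; i++) {
            pthread_create(&tid, NULL, thread, NULL);
        }
        *(int *)(0) = 1;  /* deliberate NULL dereference to trigger the coredump */
    }

    gcc a.c -lpthread -o a.out

Terminal 1 triggers the coredump:

    #!/bin/bash
    ulimit -c unlimited
    mkdir /sys/fs/cgroup/freezer/test
    echo $$ > /sys/fs/cgroup/freezer/test/cgroup.procs
    while true
    do
        taskset -c 1-14 ./a.out
        rm ./core*
    done

Terminal 2 triggers freeze/unfreeze:

    #!/bin/bash
    while true
    do
        echo THAWED > /sys/fs/cgroup/freezer/test/freezer.state
        sleep 1
        echo FROZEN > /sys/fs/cgroup/freezer/test/freezer.state
        sleep 1
    done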
Additional Information | Aone id: 36929403 | ||||
Tags | No tags attached. | ||||
Date Modified | Username | Field | Change |
---|---|---|---|
2021-09-22 11:27 | CruzZhao | New Issue |
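
The symptom in the description (threads whose kernel stack shows `__refrigerator` called from `do_exit`) can be checked for with a small script. The following is only a minimal sketch, not part of the original report. It assumes the kernel exposes per-thread stacks at `/proc/<pid>/task/<tid>/stack`, which normally requires root. The function name `find_frozen_exiting_threads` and the `PROC_ROOT` variable are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Sketch (assumption, not from the report): print the TIDs of a process whose
# kernel stack contains both __refrigerator and do_exit, i.e. threads that
# were frozen while already on the exit path.
# PROC_ROOT is a hypothetical override so the function can be pointed at a
# directory other than /proc.
find_frozen_exiting_threads() {
    local pid="$1"
    local proc_root="${PROC_ROOT:-/proc}"
    local task stack

    for task in "$proc_root/$pid"/task/*; do
        stack="$task/stack"
        # Skip threads that vanished or whose stack cannot be read.
        [ -r "$stack" ] || continue
        if grep -q '__refrigerator' "$stack" && grep -q 'do_exit' "$stack"; then
            basename "$task"
        fi
    done
}

# Example (run as root): find_frozen_exiting_threads 245213
```

For the process in this report, it would be called with PID 245213. An empty result only means that no readable thread stack matched both symbols.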