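Overview
When an application gets stuck in an infinite loop, its interface may stop producing any output and all of its services stop responding, which makes the problem hard to locate.
A program stuck in an infinite loop usually shows high CPU usage, which can typically be seen with the top command.
Multithreaded programs take patience to debug; this article describes the method the author has been using recently.
Debugging steps
Test program
Write a multithreaded test program whose threads enter an infinite loop, as follows: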
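#include <stdio.h>
#include <pthread.h>
#include <unistd.h>   /* pause() */
#define MAX_THREAD 4

static void dead_loop(void)
{
    while (1) {
        continue;
    }
}

int main(void)
{
    int i = 0;
    pthread_t ntid[MAX_THREAD];

    // dead_loop();
    for (i = 0; i < MAX_THREAD; i++) {
        /* The cast keeps dead_loop() as originally written; a conforming
         * start routine would have the signature void *fn(void *). */
        if (pthread_create(&ntid[i], NULL, (void *(*)(void *))dead_loop, NULL) == 0) {
            /* pthread_t is opaque; this cast assumes an integer type, as on Linux/glibc. */
            printf("pthread id = %lu\n", (unsigned long)ntid[i]);
        }
    }
    /* The main thread sleeps while the worker threads spin. */
    while (1) {
        pause();
    }
    return 0;
}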
Compile: gcc -g while_test.c -pthread
Once started, the program enters the infinite loop. The debugging method is described below.
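The test program hands dead_loop() to pthread_create() through a cast, and its threads can only be stopped by killing the process. For experimenting, a variant along the following lines may be more convenient: it uses a start routine with the standard void *(*)(void *) signature and a stop flag, so the spinning threads can be joined again. This is only a sketch; the names spin_until_stopped, run_spinners, stop_flag and started are illustrative assumptions and not part of the original program.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREAD 4

static atomic_bool stop_flag;   /* hypothetical flag: set to true to end the loops */
static atomic_int started;      /* number of threads that have reached the loop */

/* Conforming start routine: takes and returns void *. */
static void *spin_until_stopped(void *arg)
{
    (void)arg;
    atomic_fetch_add(&started, 1);
    while (!atomic_load(&stop_flag)) {
        continue;               /* busy loop, like dead_loop(), until the flag flips */
    }
    return NULL;
}

/* Start up to MAX_THREAD spinning threads, wait until all of them are
 * running, then stop and join them. Returns the number of threads joined. */
static int run_spinners(void)
{
    pthread_t tid[MAX_THREAD];
    int created = 0;
    int joined = 0;

    atomic_store(&stop_flag, false);
    atomic_store(&started, 0);

    for (int i = 0; i < MAX_THREAD; i++) {
        if (pthread_create(&tid[created], NULL, spin_until_stopped, NULL) == 0) {
            created++;
        }
    }

    /* Wait until every created thread has entered its loop. */
    while (atomic_load(&started) < created) {
        sched_yield();
    }

    atomic_store(&stop_flag, true);
    for (int i = 0; i < created; i++) {
        if (pthread_join(tid[i], NULL) == 0) {
            joined++;
        }
    }
    return joined;
}
```

Calling run_spinners() from main() and building with gcc -g -pthread should briefly load the CPUs and then return. To reproduce the hang studied in this article, leave stop_flag unset.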
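Identify the program stuck in the infinite loop
After it starts, the program prints each thread id and then produces no further output: it is stuck in the infinite loop.
- Use ps to confirm that the program is still running; ps can also show the process state
- ps -eT lists the threads contained in each process (columns: PID, thread ID, TTY, CPU time, command)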
-> % ps -eT | grep a.out
10703 10703 pts/0 00:00:00 a.out
10703 10704 pts/0 00:00:39 a.out
10703 10705 pts/0 00:00:40 a.out
10703 10706 pts/0 00:00:41 a.out
10703 10707 pts/0 00:00:40 a.out
- ps -eL shows the accumulated CPU time of each thread, which helps confirm which threads are stuck in the infinite loop
-> % ps -eL | grep a.out
10703 10703 pts/0 00:00:00 a.out
10703 10704 pts/0 00:01:33 a.out
10703 10705 pts/0 00:01:34 a.out
10703 10706 pts/0 00:01:31 a.out
10703 10707 pts/0 00:01:34 a.out
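On Linux, the per-thread CPU time shown by ps -eL is also exposed under /proc/<pid>/task/<tid>/stat, so it can be read from a program when ps is not convenient. The following is a minimal sketch of that idea, assuming a Linux /proc filesystem. The helper names thread_cpu_ticks and print_thread_cpu are made up for illustration, and the code is not meant to mirror how ps itself is implemented.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Return utime + stime (in clock ticks) of one thread, read from
 * /proc/<pid>/task/<tid>/stat, or -1 on failure. */
static long thread_cpu_ticks(long pid, long tid)
{
    char path[64];
    char buf[1024];
    unsigned long utime = 0;
    unsigned long stime = 0;

    snprintf(path, sizeof(path), "/proc/%ld/task/%ld/stat", pid, tid);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    size_t n = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    /* Field 2 (the command name) is in parentheses and may contain
     * spaces, so parsing starts after the last ')'. */
    char *p = strrchr(buf, ')');
    if (p == NULL) {
        return -1;
    }
    /* Skip fields 3..13 (state, ppid, ..., cmajflt); read 14 and 15. */
    if (sscanf(p + 1, " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               &utime, &stime) != 2) {
        return -1;
    }
    return (long)(utime + stime);
}

/* Print "pid tid seconds" for every thread of pid.
 * Returns the number of threads printed, or -1 if pid cannot be read. */
static int print_thread_cpu(long pid)
{
    char dirpath[64];
    int count = 0;
    long hz = sysconf(_SC_CLK_TCK);   /* clock ticks per second */

    if (hz <= 0) {
        hz = 100;                      /* assumption: common Linux default */
    }
    snprintf(dirpath, sizeof(dirpath), "/proc/%ld/task", pid);
    DIR *dir = opendir(dirpath);
    if (dir == NULL) {
        return -1;
    }
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.') {
            continue;                  /* skip "." and ".." */
        }
        long tid = strtol(ent->d_name, NULL, 10);
        long ticks = thread_cpu_ticks(pid, tid);
        if (ticks < 0) {
            continue;                  /* the thread may have exited meanwhile */
        }
        printf("%ld %ld %.2f s\n", pid, tid, (double)ticks / (double)hz);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling print_thread_cpu() twice a few seconds apart with the PID of the suspect process should show the CPU time of spinning threads growing by roughly the elapsed wall-clock time, while idle threads stay almost unchanged.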
- top shows the CPU usage of each process
- top -H shows the CPU usage of each thread
- Press 1 inside top to show the usage of each individual CPU
-> % top -H
top - 21:47:24 up 6:18, 3 users, load average: 0.90, 0.64, 0.60
Threads: 402 total, 5 running, 396 sleeping, 0 stopped, 1 zombie
%Cpu0 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 99.7 us, 0.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 99.7 us, 0.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 1024256 total, 1002628 used, 21628 free, 101592 buffers
KiB Swap: 1046524 total, 4456 used, 1042068 free. 342212 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10796 yao 20 0 35068 608 544 R 99.9 0.1 0:15.59 a.out
10797 yao 20 0 35068 608 544 R 99.9 0.1 0:15.68 a.out
10794 yao 20 0 35068 608 544 R 99.3 0.1 0:15.66 a.out
10795 yao 20 0 35068 608 544 R 99.3 0.1 0:15.50 a.out
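Find where the program is stuck
Debug the program with gdb:
- Use attach pid to attach to the process found earlier
- Use info threads to list the threads; it shows where each of the four worker threads is executing
- Use thread N to switch to the thread with gdb Id N; gdb prints its current location
- Use bt to print the stack and see the chain of function calls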
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) attach 10848
Attaching to process 10848
Reading symbols from /home/yao/work/util/a.out...done.
Reading symbols from /lib/i386-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/libpthread-2.19.so...done.
done.
[New LWP 10852]
[New LWP 10851]
[New LWP 10850]
[New LWP 10849]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/i386-linux-gnu/libpthread.so.0
Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/libc-2.19.so...done.
done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/ld-2.19.so...done.
done.
Loaded symbols for /lib/ld-linux.so.2
0xb7718cb0 in ?? ()
(gdb) info threads
Id Target Id Frame
5 Thread 0xb7530b40 (LWP 10849) "a.out" dead_loop () at while_test.c:10
4 Thread 0xb6d2fb40 (LWP 10850) "a.out" dead_loop () at while_test.c:10
3 Thread 0xb652eb40 (LWP 10851) "a.out" dead_loop () at while_test.c:10
2 Thread 0xb5d2db40 (LWP 10852) "a.out" dead_loop () at while_test.c:10
* 1 Thread 0xb7531700 (LWP 10848) "a.out" 0xb7718cb0 in ?? ()
(gdb) bt
#0 0xb7718cb0 in ?? ()
#1 0xb754ca83 in __libc_start_main (main=0x8048552 <main>, argc=1, argv=0xbf999c34, init=0x80485d0 <__libc_csu_init>,
fini=0x8048640 <__libc_csu_fini>, rtld_fini=0xb7729180 <_dl_fini>, stack_end=0xbf999c2c) at libc-start.c:287
#2 0x08048471 in _start ()
(gdb) thread 2
[Switching to thread 2 (Thread 0xb5d2db40 (LWP 10852))]
#0 dead_loop () at while_test.c:10
10 }
(gdb) bt
#0 dead_loop () at while_test.c:10
#1 0xb76e7f70 in start_thread (arg=0xb5d2db40) at pthread_create.c:312
#2 0xb761ebee in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
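Conclusion
When a program is stuck in an infinite loop, the CPUs stay busy and the server's cooling typically has to work harder than usual. Such loops can show up in multithreaded programs with complex business logic.
Developers should be thorough when coding and try to anticipate possible exceptional conditions; many infinite loops are introduced by conditions nobody expected.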
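One common way to guard against such unexpected conditions is to give every wait or retry loop an explicit exit for errors and an upper bound on iterations. The sketch below illustrates the idea only; poll_fn, wait_until_ready and the three sample callbacks are hypothetical names introduced for this example.

```c
#include <stddef.h>

/* Hypothetical poll callback: returns >0 when ready, 0 when not yet
 * ready, and <0 on an unexpected error. */
typedef int (*poll_fn)(void *ctx);

enum wait_result { WAIT_OK = 0, WAIT_ERROR = -1, WAIT_TIMEOUT = -2 };

/* Poll until ready, but never more than max_attempts times. */
static enum wait_result wait_until_ready(poll_fn poll, void *ctx,
                                         unsigned max_attempts)
{
    for (unsigned i = 0; i < max_attempts; i++) {
        int rc = poll(ctx);
        if (rc > 0) {
            return WAIT_OK;
        }
        if (rc < 0) {
            return WAIT_ERROR;   /* unexpected failure: stop instead of spinning */
        }
    }
    return WAIT_TIMEOUT;         /* the condition never became true */
}

/* Sample callbacks used to exercise the three outcomes. */
static int never_ready(void *ctx)
{
    (void)ctx;
    return 0;
}

static int always_fails(void *ctx)
{
    (void)ctx;
    return -1;
}

static int ready_after(void *ctx)
{
    int *left = ctx;             /* remaining "not ready" answers */
    if (*left > 0) {
        (*left)--;
        return 0;
    }
    return 1;
}
```

With a plain while (!ready) loop, never_ready would spin forever; with the bound, the caller gets WAIT_TIMEOUT and can log the problem or recover.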
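For a program stuck in an infinite loop, working through the steps methodically and debugging one step at a time makes the cause not hard to locate.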