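Performance testing with sysbench (install with apt install -y sysbench); tutorial
Contents
- Operating system
- CPU
  - Checking clock frequency, core count, and thread count
  - Preparing for the performance test
  - Performance test
- GPU
  - Checking memory size, model, and whether the driver is installed correctly
  - Performance test
- Memory
  - Checking size
  - Throughput
- Disk
  - Checking size
  - I/O performance
- Switch and cluster
  - Connectivity between all nodes
  - Network speed
Operating system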
$ lsb_release -a
CPU
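Checking clock frequency, core count, and thread count
In the lscpu output, Socket(s) is the number of CPU sockets, Core(s) per socket is the number of cores on each chip, and Thread(s) per core is the number of hardware threads each core supports (a value of 2 means hyper-threading is in use).
The total number of logical CPUs is Socket(s) * Core(s) per socket * Thread(s) per core. For the machine below that is 2 * 28 * 2 = 112, which matches the CPU(s) field.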
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 112
On-line CPU(s) list: 0-111
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 2
Stepping: 6
CPU max MHz: 3100.0000
CPU min MHz: 800.0000
BogoMIPS: 4000.00
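The product described above can be checked directly against the lscpu output. The following is a minimal sketch, not part of the original tutorial: total_logical_cpus is a hypothetical helper name, and it assumes lscpu prints English field labels (run lscpu with LC_ALL=C if the locale differs).

```shell
# Read lscpu-style text on stdin and print
# Socket(s) * Core(s) per socket * Thread(s) per core.
total_logical_cpus() {
    awk -F: '
        /^[ \t]*Socket\(s\)/          { sockets = $2 + 0 }
        /^[ \t]*Core\(s\) per socket/ { cores   = $2 + 0 }
        /^[ \t]*Thread\(s\) per core/ { threads = $2 + 0 }
        END { print sockets * cores * threads }
    '
}
```

Used as lscpu | total_logical_cpus, the result should agree with the CPU(s) line; a mismatch would suggest that some CPUs are offline or that the labels were not parsed.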
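Preparing for the performance test
The /sys/devices/system/cpu/cpu*/cpufreq directory holds the configuration and status of each CPU, where * is the CPU number (0 to N-1). Each file can be viewed with cat.
/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor holds the type of the CPU frequency governor. The following command changes it: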
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
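The following governors are available:
- performance: sets the CPU frequency to its maximum for best performance. Suited to workloads that need fast response and high processing power, at the cost of more power draw and heat.
- powersave: sets the CPU frequency to its minimum to save power. Suited to battery-powered devices or power-sensitive environments.
- userspace: lets user-space programs set the CPU frequency by writing to the scaling_setspeed attribute.
- ondemand: adjusts the CPU frequency dynamically according to the current system load, raising it as load increases and lowering it under light load to save power.
- conservative: similar to ondemand, but changes the frequency more gradually instead of jumping straight to the maximum. Suited to workloads that need a balance between performance and power.
- schedutil: adjusts the frequency based on utilization data from the CPU scheduler. It is a newer governor, generally regarded as a replacement for ondemand and conservative because it is more tightly integrated with the scheduler and has lower overhead.
/sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq holds the upper limit for frequency scaling.
/sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq holds the lower limit for frequency scaling.
/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq holds the current CPU frequency. Values in these files are in kHz.
You can also keep a separate terminal running watch -n 1 "cat /proc/cpuinfo | grep 'MHz'" to monitor the current CPU frequency.
/sys/devices/system/cpu/cpu*/cpufreq/base_frequency holds a base (reference) frequency.
When scaling_governor is performance and base_frequency is present, the CPU frequency appears to stay at base_frequency rather than rising to scaling_max_freq. Likewise, when scaling_governor is powersave and base_frequency is present, the frequency appears to stay at base_frequency rather than dropping to scaling_min_freq.
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq holds the hardware frequency upper limit, corresponding to CPU max MHz in lscpu.
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq holds the hardware frequency lower limit, corresponding to CPU min MHz in lscpu.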
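Before starting a run it can help to confirm that every core actually picked up the intended governor. The following is a minimal sketch that assumes the sysfs layout described above. governor_summary is a hypothetical helper name, and its optional argument (the CPU sysfs directory) is only there so the function can be pointed at a different root.

```shell
# Count how many CPUs use each cpufreq governor.
# $1 (optional): CPU sysfs directory, defaults to /sys/devices/system/cpu.
governor_summary() {
    root=${1:-/sys/devices/system/cpu}
    # One governor name per CPU, grouped and printed as "name: count".
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null |
        sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

After switching the governor with tee, a single line such as performance: 112 would be the expected result on a 112-CPU machine; more than one line means some CPUs are still on a different governor.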
Performance test
Single-core performance test
$ sysbench cpu --cpu-max-prime=20000 --time=30 run
Multi-core performance test
$ sysbench cpu --cpu-max-prime=20000 --threads=112 --time=30 run
The results include events per second, per-event latency, and thread fairness. The output below is from the 112-thread run.
CPU speed:
events per second: 50015.22
General statistics:
total time: 30.0023s
total number of events: 1500650
Latency (ms):
min: 0.98
avg: 2.24
max: 26.24
95th percentile: 2.26
sum: 3358727.27
Threads fairness:
events (avg/stddev): 13398.6607/139.68
execution time (avg/stddev): 29.9886/0.01
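To compare the single-core and multi-core runs, it is convenient to normalize throughput by the thread count. The following is a minimal sketch that assumes the output format shown above; eps_per_thread is a hypothetical helper name.

```shell
# Read sysbench cpu output on stdin and print events per second per thread.
# $1: the --threads value used for the run (defaults to 1).
eps_per_thread() {
    awk -F: -v threads="${1:-1}" '
        /events per second/ { printf "%.2f\n", ($2 + 0) / threads }
    '
}
```

For example, sysbench cpu --cpu-max-prime=20000 --threads=112 --time=30 run | eps_per_thread 112. For the output above this gives about 446.56 events per second per thread, which can then be set against the figure from the single-thread run.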
GPU
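Checking memory size, model, and whether the driver is installed correctly
For NVIDIA cards, use nvidia-smi: the Name column shows the model, and the Memory-Usage column shows used and total GPU memory. A table like the one below, with a Driver Version in the header, indicates that the driver is working.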
$ nvidia-smi
Tue Feb 18 06:41:55 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          Off |   00000000:52:00.0 Off |                    0 |
| N/A   44C    P0             62W /  300W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100 80GB PCIe          Off |   00000000:56:00.0 Off |                    0 |
| N/A   45C    P0             68W /  300W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100 80GB PCIe          Off |   00000000:D1:00.0 Off |                    0 |
| N/A   41C    P0             66W /  300W |       1MiB /  81920MiB |      2%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100 80GB PCIe          Off |   00000000:D5:00.0 Off |                    0 |
| N/A   42C    P0             65W /  300W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
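Single-GPU and multi-GPU performance
Use gpu_burn for the test (official repository).
However, the dash-prefixed options listed by the official project did not seem to work, so the run below passes only the duration in seconds.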
$ cd /home/hx/gpu_burn
$ ./gpu_burn 100
GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-751c75f1-b612-d705-c571-9173e4969f8b)
GPU 1: NVIDIA A100 80GB PCIe (UUID: GPU-569f62c0-3b97-4d25-a7fc-70b2a2724478)
GPU 2: NVIDIA A100 80GB PCIe (UUID: GPU-2be55775-2295-c096-411d-4f28a4b50ec4)
GPU 3: NVIDIA A100 80GB PCIe (UUID: GPU-4f6920dc-f153-925e-a803-181dd91a232f)
Initialized device 0 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
Initialized device 3 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
Initialized device 2 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
Initialized device 1 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
11.0% proc'd: 9062 (17018 Gflop/s) - 9062 (16879 Gflop/s) - 9062 (17042 Gflop/s) - 4531 (14077 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 72 C - 73 C - 68 C - 81 C
Summary at: Wed Feb 19 04:23:49 AM UTC 2025
24.0% proc'd: 22655 (16849 Gflop/s) - 18124 (16789 Gflop/s) - 18124 (16967 Gflop/s) - 18124 (15897 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 80 C - 79 C - 75 C - 85 C
Summary at: Wed Feb 19 04:24:02 AM UTC 2025
36.0% proc'd: 31717 (16763 Gflop/s) - 31717 (16672 Gflop/s) - 31717 (16856 Gflop/s) - 27186 (14160 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 83 C - 82 C - 78 C - 84 C
Summary at: Wed Feb 19 04:24:14 AM UTC 2025
47.0% proc'd: 40779 (15897 Gflop/s) - 40779 (16435 Gflop/s) - 45310 (16754 Gflop/s) - 36248 (12967 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 85 C - 84 C - 82 C - 85 C
Summary at: Wed Feb 19 04:24:25 AM UTC 2025
58.0% proc'd: 49841 (14627 Gflop/s) - 54372 (14800 Gflop/s) - 54372 (16656 Gflop/s) - 45310 (11643 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 85 C - 85 C - 84 C - 84 C
Summary at: Wed Feb 19 04:24:36 AM UTC 2025
69.0% proc'd: 58903 (13925 Gflop/s) - 63434 (14173 Gflop/s) - 63434 (15784 Gflop/s) - 49841 (10948 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Summary at: Wed Feb 19 04:24:47 AM UTC 2025
80.0% proc'd: 67965 (13543 Gflop/s) - 72496 (13905 Gflop/s) - 72496 (15110 Gflop/s) - 58903 (10935 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Summary at: Wed Feb 19 04:24:58 AM UTC 2025
91.0% proc'd: 77027 (13270 Gflop/s) - 77027 (13868 Gflop/s) - 81558 (14820 Gflop/s) - 63434 (10594 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Summary at: Wed Feb 19 04:25:09 AM UTC 2025
100.0% proc'd: 86089 (13270 Gflop/s) - 86089 (13676 Gflop/s) - 90620 (14577 Gflop/s) - 72496 (10184 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Killing processes.. done
Tested 4 GPUs:
GPU 0: OK
GPU 1: OK
GPU 2: OK
GPU 3: OK
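In the log above, the per-GPU throughput drops as the temperatures climb (GPU 0 goes from 17018 to 13270 Gflop/s while reaching 84 C), which may point to thermal throttling. To compare progress lines more easily, the Gflop/s figures can be pulled out of a single line. The following is a minimal sketch: gpu_burn_gflops is a hypothetical helper name, and it assumes the figures appear in GPU index order, as the log above suggests.

```shell
# Read one gpu_burn progress line on stdin and print the Gflop/s figure per GPU.
gpu_burn_gflops() {
    # Extract every "(NNNN Gflop/s)" group, drop the parentheses, number the rows.
    grep -o '([0-9]* Gflop/s)' | tr -d '()' |
        awk '{ print "GPU " (NR - 1) ": " $1 " Gflop/s" }'
}
```

Feeding it the first and the last progress line of a run gives two short lists that can be compared side by side.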
Memory
Checking size
$ sudo lshw -C memory
*-memory
description: System Memory
physical id: 4f
slot: System board or motherboard
size: 256GiB
$ free -h
total used free shared buff/cache available
Mem: 251Gi 1.0Gi 247Gi 2.0Mi 2.9Gi 249Gi
Swap: 8.0Gi 0B 8.0Gi
Throughput
Multi-threaded random write throughput
$ sysbench memory --memory-block-size=1M --memory-total-size=200G --threads=50 --memory-access-mode=rnd run
Multi-threaded random read throughput
$ sysbench memory --memory-block-size=1M --memory-total-size=200G --memory-access-mode=rnd --threads=50 --memory-oper=read run
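Disk
Checking size and partition layout
lsblk, fdisk, df -h, and similar commands report capacity in binary units (1 GiB = 1024 MiB), while disk vendors usually use decimal units (1 GB = 1000 MB), so the reported capacity looks noticeably smaller than expected. Only parted shows a size that matches the labeled capacity.
In the lsblk output, a ROTA value of 1 indicates an HDD and 0 an SSD.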
$ lsblk --output NAME,ROTA,SIZE,TYPE,RM,RO,MOUNTPOINTS
nvme0n1 0 3.5T disk 0 0
├─nvme0n1p1 0 1G part 0 0 /boot/efi
├─nvme0n1p2 0 2G part 0 0 /boot
└─nvme0n1p3 0 3.5T part 0 0
└─ubuntu--vg-ubuntu--lv 0 100G lvm 0 0 /
$ sudo parted /dev/nvme0n1 print
[sudo] password for hx:
Model: SAMSUNG MZQL23T8HCLS-00A07 (nvme)
Disk /dev/nvme0n1: 3841GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 1128MB 1127MB fat32 boot, esp
2 1128MB 3276MB 2147MB ext4
3 3276MB 3841GB 3837GB
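The gap between the two tools is just the unit conversion described above: parted reports 3841GB in decimal units, while lsblk reports the same disk in binary units. The following is a minimal sketch of that arithmetic; gb_to_tib is a hypothetical helper name.

```shell
# Convert a decimal size in GB (10^9 bytes) to TiB (2^40 bytes), two decimals.
gb_to_tib() {
    awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb * 1e9 / (1024 ^ 4) }'
}
```

gb_to_tib 3841 gives 3.49, which lsblk rounds to the 3.5T shown for nvme0n1.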
I/O performance
To evaluate disk reads and writes, sysbench first needs prepare to create the test files, then run to execute the test, and finally cleanup to remove the test data.
Multi-threaded random write test
$ sysbench fileio --file-total-size=25G --file-test-mode=rndwr --threads=10 --file-num=100 prepare
$ sysbench fileio --file-total-size=25G --file-test-mode=rndwr --threads=10 --file-num=100 --report-interval=1 run
$ sysbench fileio --file-total-size=25G --file-test-mode=rndwr --threads=10 --file-num=100 cleanup
Multi-threaded random read test
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrd --threads=10 --file-num=100 prepare
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrd --threads=10 --file-num=100 --report-interval=1 run
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrd --threads=10 --file-num=100 cleanup
Multi-threaded mixed random read/write test with a read/write ratio of 6:4 (--file-rw-ratio=1.5)
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrw --threads=10 --file-num=100 --file-rw-ratio=1.5 prepare
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrw --threads=10 --file-num=100 --file-rw-ratio=1.5 --report-interval=1 run
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrw --threads=10 --file-num=100 --file-rw-ratio=1.5 cleanup
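The three tests above repeat the same options across prepare, run, and cleanup, so a typo in one phase is easy to miss. The following is a minimal sketch of a helper that only prints the three command lines for a given test mode. fileio_cmds is a hypothetical name, and the option values are copied from the commands above.

```shell
# Print the prepare/run/cleanup command lines for one sysbench fileio test.
# $1: value for --file-test-mode (rndwr, rndrd, rndrw)
# $2 (optional): one extra option, e.g. --file-rw-ratio=1.5
fileio_cmds() {
    opts="--file-total-size=25G --file-test-mode=$1 --threads=10 --file-num=100${2:+ $2}"
    printf '%s\n' "sysbench fileio $opts prepare"
    printf '%s\n' "sysbench fileio $opts --report-interval=1 run"
    printf '%s\n' "sysbench fileio $opts cleanup"
}
```

Printing rather than executing keeps the helper safe to try; once the output looks right, fileio_cmds rndrw --file-rw-ratio=1.5 | sh would run the three phases in order.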