NUMA's effect on swap behavior in database environments is well known, but NUMA also affects memory latency and memory bandwidth. I recently came across MLC (Intel Memory Latency Checker) and used it to measure just how large the cross-NUMA penalty is.
# Preparation

MLC tries to use huge pages so that TLB misses do not skew the latency numbers, so check the transparent hugepage settings and reserve some explicit huge pages first:
```shell
# cat /sys/kernel/mm/transparent_hugepage/defrag
always defer defer+madvise [madvise] never
# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
# cat /proc/sys/vm/nr_hugepages
0
# echo 4000 > /proc/sys/vm/nr_hugepages
```
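After writing to `nr_hugepages` it is worth confirming the kernel actually reserved the pages, since the allocation can silently fall short when memory is fragmented. A quick check, assuming the default 2 MiB page size:

```shell
# HugePages_Total should now report 4000 (8 GiB with 2 MiB pages);
# a smaller number means the kernel could not reserve the full amount.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo
```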
# Memory latency test
```shell
# ./mlc --latency_matrix
Intel(R) Memory Latency Checker - v3.11b
Command line parameters: --latency_matrix

Using buffer size of 1800.000MiB
Measuring idle latencies for random access (in ns)...
                Numa node
Numa node            0       1
       0         104.7   287.5
       1         292.6   104.6
```
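From the matrix above, the remote/local penalty can be computed directly (the values are hard-coded from this run and will differ on other machines):

```shell
# Remote vs. local idle latency per node, using the measured values
# (node 0: 104.7 ns local, 287.5 ns remote; node 1: 104.6 / 292.6 ns)
awk 'BEGIN {
    printf "node0: %.1fx slower remote\n", 287.5 / 104.7
    printf "node1: %.1fx slower remote\n", 292.6 / 104.6
}'
```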
# Memory bandwidth test
```shell
[root@k8s-10-128-64-5 Linux]# ./mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.11b
Command line parameters: --bandwidth_matrix

Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0        143101.8   78546.8
       1         78872.6  143302.3
```
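The same arithmetic applied to the bandwidth matrix (again with this run's values hard-coded):

```shell
# Fraction of local read bandwidth that survives a cross-node hop
awk 'BEGIN {
    printf "node0 remote keeps %.0f%% of local bandwidth\n", 100 * 78546.8 / 143101.8
    printf "node1 remote keeps %.0f%% of local bandwidth\n", 100 * 78872.6 / 143302.3
}'
```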
From the two matrices, cross-NUMA access costs roughly 2.7~2.8x in idle latency and drops read bandwidth to about 55% of local. So if you are chasing peak performance, bind a process's CPUs and memory to the same NUMA node, e.g. with `numactl --cpunodebind=0 --membind=0` or `numactl --cpunodebind=1 --membind=1`.
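As a sketch of that binding (`mysqld` and node 0 are placeholders here; pick the node the workload's memory should stay on):

```shell
# Pin both the threads and the allocations of a process to NUMA
# node 0, so every memory access stays node-local.
numactl --cpunodebind=0 --membind=0 ./mysqld

# Afterwards, numastat shows how much of the process's memory
# actually landed on each node.
numastat -p $(pidof mysqld)
```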