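Supercomputing clusters often ship an old MPI installation that fails to launch or cannot be used at all, so it is usually necessary to build an MPI environment under your own user directory.
Building MPI generally requires:
- the gcc/g++ compilers
- the gfortran compiler (optional; needed for parallel Fortran)
Before configuring MPI, confirm that the compilers above are installed on the system.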
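The compiler check can be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name check_tools is made up for illustration and is not part of any MPI distribution.

```shell
# Report which of the given tools can be found on PATH.
# Returns 0 when every tool is found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (gfortran is only needed for Fortran MPI programs):
# check_tools gcc g++ gfortran
```

If gfortran is reported missing and you do not need Fortran, the build can still go ahead with the C and C++ compilers only.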
Download the MPI source
- MPICH 3.3 (mpich-3.3.tar.gz)
- Extract it in the current directory and enter the extracted directory
- Create a build directory and enter it
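The extract-and-build-directory steps can be sketched as a small helper. This assumes the tarball is gzip-compressed and contains a single top-level directory such as mpich-3.3/; the function name prepare_build_dir is hypothetical.

```shell
# Unpack a source tarball next to itself and create an out-of-tree
# build directory inside the unpacked source tree.
# Prints the path of the build directory on success.
prepare_build_dir() {
    tarball=$1
    # Name of the top-level directory inside the archive, e.g. "mpich-3.3"
    topdir=$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1) || return 1
    workdir=$(dirname "$tarball")
    tar -xzf "$tarball" -C "$workdir" || return 1
    mkdir -p "$workdir/$topdir/build" || return 1
    echo "$workdir/$topdir/build"
}

# Example call:
# cd "$(prepare_build_dir mpich-3.3.tar.gz)"
```

Building in a separate build directory keeps generated files out of the source tree, so a failed attempt can be discarded by deleting that one directory.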
Building and running MPI on the university supercomputer
../configure --prefix=/public/home/zhankang/miaozhaohui/software/install/mpich3.3 # set the install directory
make
make install
All of the steps above completed without errors.
The build results then appear under the path given by --prefix:
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3]ls
bin include lib share
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3]cd bin/
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]ls
hydra_nameserver hydra_persist hydra_pmi_proxy mpic++ mpicc mpichversion mpicxx mpiexec mpiexec.hydra mpif77 mpif90 mpifort mpirun mpivars parkill
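The commonly used compiler wrappers mpicc and mpic++ are present, which indicates that the installation succeeded.
Next, add the bin directory to the PATH environment variable.
# Open ~/.bashrc with vim and append this line at the end of the file:
# export PATH=/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin:$PATH
# The leading directory is the bin directory of the self-built MPI. Then run the commands below.
# If `which mpiexec` prints a path inside that bin directory, the setup worked.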
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]vim ~/.bashrc
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]source ~/.bashrc
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]which mpiexec
/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin/mpiexec
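A plain export line in ~/.bashrc adds the directory again every time the file is sourced. As an optional variation, here is a sketch of an idempotent version. The function name prepend_path and the MPI_PREFIX value are placeholders for illustration; substitute your own --prefix.

```shell
# Prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Hypothetical install prefix; replace with the value passed to --prefix.
MPI_PREFIX="$HOME/software/install/mpich3.3"
prepend_path "$MPI_PREFIX/bin"
```

Because the self-built bin directory comes first in PATH, which mpiexec resolves to it rather than to the system-wide MPI.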
Testing a parallel program
- Single-node test (run on the university cluster's login node)
#include <mpi.h>
#include <stdio.h>
// Check that the parallel environment works
int main(int argc, char* argv[])
{
    int rank;
    int size;
    int namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    // Initialize the MPI environment
    MPI_Init(&argc, &argv);
    // Get the rank of the current process
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    // Get the total number of processes running this program
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    // Get the name of the host the current process runs on
    MPI_Get_processor_name(processor_name, &namelen);
    printf("hello world from process %i of size %i -- name %s .\n", rank, size, processor_name);
    // Shut down the MPI environment
    MPI_Finalize();
    return 0;
}
Compile and run:
zhankang@login2:[/public/home/zhankang/miaozhaohui/code]mpicc main.c
zhankang@login2:[/public/home/zhankang/miaozhaohui/code]mpiexec -n 3 ./a.out
hello world from process 0 of size 3 -- name login2 .
hello world from process 1 of size 3 -- name login2 .
hello world from process 2 of size 3 -- name login2 .
Success.
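Reading three lines by eye is easy, but with more processes an automated check helps. The sketch below assumes the exact output format of the test program above; the function name check_hello_output is hypothetical.

```shell
# Read "hello world from process R of size N ..." lines on stdin and
# succeed only if every rank 0..N-1 appears exactly once.
check_hello_output() {
    awk '
        /^hello world from process [0-9]+ of size [0-9]+/ {
            seen[$5]++      # field 5 is the rank
            size = $8       # field 8 is the communicator size
        }
        END {
            if (size == 0) exit 1          # no matching line at all
            for (r = 0; r < size; r++)
                if (seen[r] != 1) exit 1   # missing or duplicated rank
        }
    '
}

# Example call:
# mpiexec -n 3 ./a.out | check_hello_output && echo "all ranks reported"
```

The lines can arrive in any order, so the check counts ranks instead of comparing against a fixed text.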
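- Multi-node run (using the university's free queue)
A supercomputing cluster usually has many users submitting many jobs. To keep the system running efficiently, the PBS job management system manages and schedules all compute jobs according to the compute resources available on the cluster's nodes.
Compute resources on the supercomputer are scarce, so this test uses the no-cost queue: free.
The corresponding PBS file: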
#PBS -N test
#PBS -l nodes=2:ppn=2
#PBS -j oe
#PBS -q free
#PBS -l walltime=0:05:0
cd $PBS_O_WORKDIR
JOBID=`echo $PBS_JOBID | awk -F. '{print $1}'`
echo This job id is $JOBID | tee job_info.log
echo Working directory is $PBS_O_WORKDIR | tee -a job_info.log
echo Start time is `date` | tee -a job_info.log
echo This job runs on the following nodes: | tee -a job_info.log
echo `cat $PBS_NODEFILE | sort | uniq` | tee -a job_info.log
# NNODES is not set anywhere else in this script, so derive it from the node file
NNODES=`sort -u $PBS_NODEFILE | wc -l`
NPROCS=`cat $PBS_NODEFILE | wc -l`
PPROCS=$(($NPROCS/$NNODES))
echo This job has allocated $NNODES nodes, $NPROCS processors.| tee -a job_info.log
# One hostfile line per node; the "i" appended to each host name is kept as in the original
uniq $PBS_NODEFILE | sort | sed s/$/i:$PPROCS/ > $PBS_O_WORKDIR/hostfile
#source your profile
# Note: I_MPI_DEVICE is an Intel MPI variable; it is kept here as in the original
MPIRUN="mpiexec -np $NPROCS -f $PBS_O_WORKDIR/hostfile -env I_MPI_DEVICE=rdma"
JOBCMD="./a.out"
{ time $MPIRUN $JOBCMD; } >$PBS_O_WORKDIR/output_$JOBID.log 2>&1
echo End time is `date`| tee -a job_info.log
rm -f $PBS_O_WORKDIR/hostfile
pkill -P $$
exit 0
The job requests two nodes with 2 cores each, 4 cores in total. The output:
hello world from process 1 of size 4 -- name c1137 .
hello world from process 3 of size 4 -- name c1138 .
hello world from process 2 of size 4 -- name c1138 .
hello world from process 0 of size 4 -- name c1137 .
This shows that the environment is configured correctly.
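The hostfile step of the PBS script can be tried outside a job. The sketch below assumes that the node file lists one host name per allocated core, which is how $PBS_NODEFILE is used above. It writes plain host:count lines, the per-host format that MPICH's mpiexec -f reads, and leaves out any site-specific host name suffix. The function names make_hostfile and count_nodes are hypothetical.

```shell
# Turn a PBS-style node list (one line per allocated core) into
# "host:count" lines.
make_hostfile() {
    sort "$1" | uniq -c | awk '{ print $2 ":" $1 }'
}

# Count the distinct hosts in the node list.
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Example calls inside a job script:
# make_hostfile "$PBS_NODEFILE" > "$PBS_O_WORKDIR/hostfile"
# NNODES=$(count_nodes "$PBS_NODEFILE")
```

Counting the lines per host, instead of dividing the total by the node count, also gives the right numbers when the scheduler hands out an uneven number of cores per node.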
A build problem on the faculty supercomputer
The following error occurred:
checking size of bool... 0
configure: error: unable to determine matching C type for C++ bool
No amount of debugging made the error go away.
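The original notes do not identify the cause. One possibility, stated here only as an assumption, is that configure reports a size of 0 because its C++ test program could not be compiled or run, for example when g++ is missing, does not match gcc, or cannot find its runtime library. The config.log file in the build directory records the failing command. The sketch below runs the same kind of check by hand; the function name cxx_bool_check is hypothetical.

```shell
# Try to compile and run a one-line C++ program with the given compiler.
# Prints sizeof(bool) on success; returns non-zero (and shows the
# compiler's error output) if compiling or running fails.
cxx_bool_check() {
    cxx=$1
    dir=$(mktemp -d) || return 1
    cat > "$dir/t.cpp" <<'EOF'
#include <cstdio>
int main() { std::printf("%d\n", (int)sizeof(bool)); return 0; }
EOF
    "$cxx" "$dir/t.cpp" -o "$dir/t" 2>"$dir/err.log" && "$dir/t"
    status=$?
    [ "$status" -eq 0 ] || cat "$dir/err.log" >&2
    rm -rf "$dir"
    return "$status"
}

# Example call:
# cxx_bool_check g++
```

If this check fails for the compiler that configure picked up, the problem lies in the compiler setup and not in MPICH itself.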
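Workaround
When PETSc is configured on a system that has no MPI, it can download and install MPICH by itself. I therefore used PETSc's own configure command to install MPI, as a comparison against the problems hit in the manual build.
Tools:
- PETSc 3.10.2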
./configure --download-mpich=/public/home2/kangzang/zhmiao/software/mpich-3.3.tar.gz --download-fblaslapack
The build succeeded; the results are under the arch-linux2-c-debug directory.
The test program above also ran in parallel successfully:
hello world from process 0 of size 3 -- name clusadm .
hello world from process 1 of size 3 -- name clusadm .
hello world from process 2 of size 3 -- name clusadm .