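This article is mainly for systems older than CentOS 7 (including Linux systems running a 2.6 or older kernel) and shows how to install a recent version of TensorFlow (r1.3 here) from source. On newer systems, simply follow the official installation guide: https://www.tensorflow.org/install/.
Tutorials on installing the GPU version of TensorFlow on CentOS 6 are almost impossible to find online, and I worked this one out the hard way, so if it helps you, remember to leave a Like. If you run into problems during installation, leave a comment below and I will reply as soon as I can. If you want to discuss deep learning together, follow me :)
First, keep in mind that newer software is not always better; compatibility is what matters. Some versions also hit bugs while building TensorFlow, so I am listing my environment first. Experienced readers can reproduce the setup from it without reading the rest of the tutorial.
Environment: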
Centos 6.5
gcc 4.8.2
bazel 0.5.2
tensorflow r1.3
cuda 8.0
cudnn 5.1.10
Python 3.6.2
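Step 1: Upgrade GCC to 4.8
This step provides the build environment for compiling Python 3.6, Bazel and TensorFlow from source later on; the GCC that ships with the system is too old for that (note: in my experience, TensorFlow would not build with GCC 5.0 or newer).
First, import CERN's GPG key: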
sudo rpm --import http://ftp.scientificlinux.org/linux/scientific/5x/x86_64/RPM-GPG-KEYs/RPM-GPG-KEY-cern
Add the repository (writing to /etc/yum.repos.d requires root):
sudo wget -O /etc/yum.repos.d/slc6-devtoolset.repo http://linuxsoft.cern.ch/cern/devtoolset/slc6-devtoolset.repo
Install the devtoolset package:
sudo yum install devtoolset-2
Start a shell with the new toolchain enabled:
scl enable devtoolset-2 bash
Test:
$ gcc --version
gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
...
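The version requirement above can be checked with a small script. This is only a sketch: the helper names version_lt and gcc_version_ok are made up for illustration, the 4.8 to 5.0 window is this article's working assumption rather than an official limit, and GNU sort with the -V option is assumed to be available.

```shell
# Returns 0 if version $1 is strictly lower than version $2 (relies on GNU sort -V).
version_lt() {
    [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Returns 0 if the given GCC version is >= 4.8 and < 5.0.
# That window is an assumption taken from this article, not an official limit.
gcc_version_ok() {
    ! version_lt "$1" "4.8" && version_lt "$1" "5.0"
}

# Only query gcc when it is actually installed.
if command -v gcc >/dev/null 2>&1; then
    current="$(gcc -dumpversion)"
    if gcc_version_ok "$current"; then
        echo "gcc $current is inside the expected range"
    else
        echo "gcc $current is outside the expected range"
    fi
fi
```

Run it inside the shell started by scl enable, otherwise it will report the system's original GCC.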
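Step 2: Install Python
CentOS 6 ships only Python 2.6, which is outdated.
If you use Python 2.x, upgrade to Python 2.7; if you use Python 3.x, install Python 3.6.
Either one has to be built from source. Pick the version you need at https://www.python.org/downloads/source/
Python 3.6.2 is used as the example here: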
wget https://www.python.org/ftp/python/3.6.2/Python-3.6.2.tgz
After the download finishes, unpack it:
tar xf Python-3.6.2.tgz
cd Python-3.6.2
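Next, configure, build and install by running these three commands in order (see also the README file in the source tree; make install needs root when installing to the default /usr/local prefix):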
./configure
make
make install
If all three finish without errors, Python is installed. If you are not sure whether a command failed, run the following right after each command to check (pay attention, this matters):
$ echo $?
0
0 means the command succeeded; any other value means it failed (the number is an exit status code, not a count of errors).
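Checking $? by hand after every command is easy to forget, so the check can be wrapped in a small helper. This is a minimal sketch; run_step is a hypothetical name, and the two demonstration commands are harmless placeholders rather than the real build commands.

```shell
# run_step: run a command, report its exit status, and return that status.
run_step() {
    "$@"
    local status=$?
    if [ "$status" -eq 0 ]; then
        echo "OK: $*"
    else
        echo "FAILED (exit status $status): $*" >&2
    fi
    return "$status"
}

# Harmless demonstration commands; swap in ./configure, make and so on.
run_step true
run_step sh -c 'exit 3' || echo "stopping here; the status was $?"
```

For the Python build this could be chained as run_step ./configure && run_step make && run_step make install, so that a failing step stops the chain.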
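Step 3: Set up a virtual environment
A virtual environment isolates your packages from the system environment, which is essential on machines with several projects or several users. Projects usually depend on different package versions, and a virtual environment keeps them from interfering with each other.
Install the package: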
sudo yum install python-virtualenv
Create the virtual environment:
virtualenv --system-site-packages tensorflow # for Python 2.7 users
virtualenv --system-site-packages -p python3 tensorflow # for Python 3 users
Activate the virtual environment:
source tensorflow/bin/activate
To leave the virtual environment, run the deactivate command.
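Later steps assume the virtual environment is active, so it can help to check before running pip. The sketch below relies on the VIRTUAL_ENV variable that virtualenv's activate script exports; the function name in_virtualenv is made up for illustration.

```shell
# Returns 0 when a virtualenv is active (its activate script exports VIRTUAL_ENV).
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "active virtualenv: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: source tensorflow/bin/activate"
fi
```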
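Step 4: Install Bazel
Bazel is needed to build the TensorFlow source into a whl file that pip can install. The official whl packages are built against a newer glibc, whereas we need one that matches the glibc on this machine.
Bazel requires JDK 8, so install that first:
sudo yum install java-1.8.0-openjdk
sudo yum install java-1.8.0-openjdk-devel
Then download the Bazel 0.5.2 source distribution: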
wget https://github.com/bazelbuild/bazel/releases/download/0.5.2/bazel-0.5.2-dist.zip
Unpack it into a directory named bazel:
unzip bazel-0.5.2-dist.zip -d bazel
Build:
cd bazel/
bash ./compile.sh
When the build finishes, copy output/bazel to /usr/local/bin:
sudo cp output/bazel /usr/local/bin
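Before moving on, it is worth confirming that the tools installed so far are on PATH. The following is a small sketch; missing_cmds is a hypothetical helper, and the list of tool names simply reflects what this guide has used up to this point.

```shell
# Print each command from the argument list that is not found on PATH.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Tools this guide relies on up to this point.
missing="$(missing_cmds gcc java unzip bazel)"
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "missing tools:" $missing
else
    echo "all build tools found"
fi
```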
Step 5: Install CUDA 8.0 and cuDNN 5.1
First make sure your machine has an NVIDIA graphics card, because TensorFlow currently supports only NVIDIA GPUs.
Check whether an NVIDIA card is present:
lspci | grep -i nvidia
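Download CUDA 8.0 (the link below actually points to the CUDA 9.0 package for RHEL 6; to match the CUDA 8.0 environment listed above, take the corresponding 8.0 installer from NVIDIA's CUDA download archive and adjust the file names). The -O option saves the file under the .rpm name that the install command expects:
wget -O cuda-repo-rhel6-9-0-local-9.0.176-1.x86_64.rpm https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-rhel6-9-0-local-9.0.176-1.x86_64-rpm
Run the following commands to install it:
sudo rpm -i cuda-repo-rhel6-9-0-local-9.0.176-1.x86_64.rpm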
sudo yum clean all
sudo yum install cuda
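Since the TensorFlow configuration later asks for the CUDA version, it helps to confirm which release actually got installed. The sketch below parses the "release X.Y" field printed by nvcc --version; the function name cuda_release is hypothetical, and the path assumes the default /usr/local/cuda prefix.

```shell
# Extract the toolkit release (for example 8.0) from `nvcc --version` output on stdin.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Assumes the default install prefix /usr/local/cuda.
NVCC=/usr/local/cuda/bin/nvcc
if [ -x "$NVCC" ]; then
    echo "installed CUDA release: $("$NVCC" --version | cuda_release)"
else
    echo "nvcc not found at $NVCC"
fi
```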
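Next, install cuDNN 5.1, which only means copying the cuDNN header and library files into the CUDA directory. The download requires an NVIDIA developer login, so plain wget may fail; in that case download the archive in a browser and save it under the same file name:
wget -O cudnn-8.0-linux-x64-v5.1.tgz https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v5.1/prod_20161129/8.0/cudnn-8.0-linux-x64-v5.1-tgz
tar xf cudnn-8.0-linux-x64-v5.1.tgz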
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
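The exact cuDNN version is needed when configuring TensorFlow, and it can be read from the header that was just copied. This is a sketch that assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain numbers; cudnn_version is a made-up helper name.

```shell
# Print MAJOR.MINOR.PATCHLEVEL as defined in a cudnn.h header.
cudnn_version() {
    awk '/^#define CUDNN_MAJOR/      {major = $3}
         /^#define CUDNN_MINOR/      {minor = $3}
         /^#define CUDNN_PATCHLEVEL/ {patch = $3}
         END {print major "." minor "." patch}' "$1"
}

# Assumes the header was copied to the default CUDA prefix.
HEADER=/usr/local/cuda/include/cudnn.h
if [ -r "$HEADER" ]; then
    echo "cuDNN $(cudnn_version "$HEADER")"
else
    echo "cannot read $HEADER"
fi
```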
Step 6: Install TensorFlow
With the previous steps done, we can start installing TensorFlow. First install numpy and wheel with pip. (Remember to activate the virtualenv first; inside it, run pip without sudo so the packages go into the virtual environment.)
pip install numpy wheel # for Python 2.7
pip3 install numpy wheel # for Python 3.6
For Python 2.7 you also need to upgrade pip and setuptools:
pip install pip setuptools --upgrade
The TensorFlow website says libcupti-dev is also required. That appears to apply to Ubuntu; on Red Hat systems CUPTI is already installed along with CUDA, as you can see in the /usr/local/cuda-8.0/extras/CUPTI/ directory.
Next comes an important step, setting the environment variables:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cudnn/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
Add them to ~/.bashrc so you do not have to set them again in every new shell.
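Appending to ~/.bashrc by hand tends to leave duplicate lines after a few attempts. One way to avoid that is a helper that only appends a line when it is not already present. This is a sketch; append_once is a hypothetical name, and the usage line is left as a comment so the block does not touch any file by itself.

```shell
# Append a line to a file unless an identical line is already there.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (single quotes keep $PATH unexpanded until .bashrc is sourced):
#   append_once 'export PATH=/usr/local/cuda/bin:$PATH' ~/.bashrc
```

The same call works for the LD_LIBRARY_PATH line; open a new shell or source ~/.bashrc afterwards for the change to take effect.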
Download the TensorFlow source:
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.3
After the checkout comes the configuration, which matters a lot. One crucial point: when asked whether to use clang as the CUDA compiler, answer No.
$ ./configure
Do you wish to build TensorFlow with MKL support? [y/N]
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] n
jemalloc disabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
No XLA JIT support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N] n
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N] n
nvcc will be used as CUDA compiler
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify which gcc should be used by nvcc as the host compiler. [Default is /opt/rh/devtoolset-2/root/usr/bin/gcc]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 5.1.10
Please specify the location where cuDNN 5.1.10 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,3.5,3.5,3.5"]:
Do you wish to build TensorFlow with MPI support? [y/N] n
MPI support will not be enabled for TensorFlow
Configuration finished
Next, use Bazel to build the source and generate the whl file:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
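Finally, install the generated whl file with pip (the file name depends on your Python version and platform, so check which whl file is in /tmp/tensorflow_pkg/; with the virtualenv active, run pip without sudo):
pip install /tmp/tensorflow_pkg/tensorflow-1.3.0-cp27-cp27m-linux_x86_64.whl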
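Because the wheel's file name varies, it can be looked up instead of typed. The sketch below picks the most recently modified tensorflow-*.whl in a directory; latest_wheel is a made-up helper, and the usage line is a comment so nothing gets installed just by loading the function.

```shell
# Print the most recently modified TensorFlow wheel in a directory; fail if none exists.
latest_wheel() {
    local dir="$1" wheel
    wheel="$(ls -t "$dir"/tensorflow-*.whl 2>/dev/null | head -n 1)"
    [ -n "$wheel" ] && printf '%s\n' "$wheel"
}

# Example:
#   pip install "$(latest_wheel /tmp/tensorflow_pkg)"
```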
Test (if the import succeeds, you are done):
$ python
>>> import tensorflow as tf
In addition, while installing the CPU version of TensorFlow r1.2 I ran into a common problem that requires editing tensorflow/tensorflow.bzl in the source tree:
Change the tf_extension_linkopts function from:
def tf_extension_linkopts():
  return []
to:
def tf_extension_linkopts():
  return ["-lrt"]
Readers installing the CPU version may find this useful.
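The edit to tensorflow.bzl can also be applied with sed instead of an editor. This is only a sketch: patch_linkopts is a hypothetical helper, and it assumes the function body contains a literal `return []` as shown above. It keeps a .bak copy of the original file.

```shell
# Rewrite `return []` to `return ["-lrt"]` inside tf_extension_linkopts() only.
# The sed range runs from the function's def line to the next top-level def (or end of file).
patch_linkopts() {
    sed -i.bak '/^def tf_extension_linkopts():/,/^def /s/return \[]/return ["-lrt"]/' "$1"
}

# Example, from the root of the TensorFlow source tree:
#   patch_linkopts tensorflow/tensorflow.bzl
```

Restricting the substitution to that range leaves any other function returning an empty list untouched.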
Finally, thanks to Google, Stack Overflow and the GitHub community for their help!
Note: do not repost without my permission!