Installing the GPU Version of TensorFlow on Linux (CentOS): A Tutorial

╰半夏微凉° 2023-07-13 04:25




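Note: the NVIDIA GPU driver cannot be installed in an ordinary virtual machine (one without GPU passthrough), so Linux (Ubuntu or CentOS) must be installed directly on the machine's own disk, for example as one side of a dual-boot setup.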


CUDA and TensorFlow version compatibility list: https://tensorflow.google.cn/install/source#linux


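TensorFlow GPU installation guide

The GPU version of TensorFlow uses the compute acceleration of NVIDIA GPUs to run more efficiently and, in particular, can speed up model training severalfold. The installation consists of the following steps:

  1. Disable the nouveau driver on CentOS
  2. Install the CUDA Toolkit and cuDNN
  3. Install the tensorflow-gpu package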

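1. Disabling the nouveau driver on CentOS

On CentOS, the open-source nouveau driver must be disabled first; otherwise the NVIDIA driver installation that follows cannot proceed.

The commands are as follows: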

    sudo vim /etc/modprobe.d/blacklist-nouveau.conf
    # 1. Add the following two lines to that file
    blacklist nouveau
    options nouveau modeset=0
    # 2. Rebuild the initramfs, then reboot
    sudo dracut --force
    sudo reboot
    # After the reboot: no output here means nouveau was disabled successfully
    lsmod | grep nouveau
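
The final check (lsmod | grep nouveau) can also be scripted. The following is a minimal Python sketch, not part of the original procedure. The function names module_loaded and nouveau_is_loaded are made up for illustration, and the code assumes the usual lsmod layout: one header line, then one module per line with the module name in the first column. Unlike grep, it only matches the module name itself, not mentions of nouveau in the "Used by" column.

```python
import shutil
import subprocess


def module_loaded(lsmod_text, name):
    """Return True if `name` is a module name (first column) in lsmod output."""
    for line in lsmod_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def nouveau_is_loaded():
    """Run lsmod and check for nouveau; return None if lsmod is unavailable."""
    if shutil.which("lsmod") is None:
        return None
    result = subprocess.run(["lsmod"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return module_loaded(result.stdout, "nouveau")
```

Calling nouveau_is_loaded() after the reboot should return False if the blacklist took effect.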

Check whether a graphics driver is already installed on the machine:

    # Note: if a driver is already installed, this command prints output like the following
    [root@localhost ~]# nvidia-smi
    Wed Sep 25 04:24:35 2019
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
    | 20%   38C    P0    60W / 250W |      0MiB / 11178MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 108...  Off  | 00000000:04:00.0 Off |                  N/A |
    | 22%   43C    P0    54W / 250W |      0MiB / 11178MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
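
For a scripted version of this check, nvidia-smi can print selected fields as CSV instead of the table. The sketch below is an illustration rather than part of the original tutorial: the helper names parse_gpu_csv and query_gpus are hypothetical, and the choice of fields (name, driver_version, memory.total) is just one reasonable selection.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse lines of 'name, driver_version, memory.total' into dictionaries."""
    gpus = []
    for line in text.strip().splitlines():
        name, driver, memory = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "driver": driver, "memory": memory})
    return gpus


def query_gpus():
    """Return the parsed GPU list, or None if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None  # the driver tools are not installed
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True)
    if result.returncode != 0:
        return None
    return parse_gpu_csv(result.stdout)
```

A return value of None from query_gpus() suggests that the driver still needs to be installed, as described in the next section.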

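Note: if no driver has been installed, it can be installed on its own or together with CUDA, as described in the steps that follow.

2. Installing the CUDA Toolkit and cuDNN

The tensorflow-gpu version must match the CUDA version. The mapping for Linux, along with the other platforms, is listed on the official site: https://tensorflow.google.cn/install/source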
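
To make the version matching concrete, here is a small lookup sketch. The rows were copied by hand from the "tested build configurations" table on the page above and cover only a few releases, so treat the page itself as authoritative. The names TESTED_CONFIGS and required_cuda are made up for this example.

```python
# (CUDA, cuDNN) versions the official tensorflow_gpu builds were tested with.
# Hand-copied subset of the table on the TensorFlow install page; verify
# against that page before relying on it.
TESTED_CONFIGS = {
    "2.1.0": ("10.1", "7.6"),
    "2.0.0": ("10.0", "7.4"),
    "1.15.0": ("10.0", "7.4"),
    "1.14.0": ("10.0", "7.4"),
    "1.12.0": ("9", "7"),
}


def required_cuda(tf_version):
    """Return (cuda, cudnn) for a tensorflow_gpu version, or None if not listed."""
    return TESTED_CONFIGS.get(tf_version)


if __name__ == "__main__":
    print(required_cuda("2.0.0"))
```

For the version installed in this tutorial, 2.0.0, the lookup gives CUDA 10.0, which is why the steps below use the CUDA 10.0 installer.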


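Note: to install the graphics driver, go to https://www.nvidia.cn/Download/index.aspx?lang=cn and download the driver installer script that matches your GPU model. For CUDA itself, version 10.0 must be used: with the newer 10.1, TensorFlow fails with an error about not being able to load the CUDA dynamic libraries, because TensorFlow 2.0.0 is built against CUDA 10.0.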


    # Run the script; at each prompt, press Enter to accept the default
    bash NVIDIA-Linux-x86_64-430.50.run
    # Check whether the installation succeeded
    nvidia-smi
    Thu Oct 17 04:43:21 2019
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 430.50                 Driver Version: 430.50                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
    | 17%   35C    P0    59W / 250W |      0MiB / 11178MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 108...  Off  | 00000000:04:00.0 Off |                  N/A |
    | 24%   43C    P0    58W / 250W |      0MiB / 11178MiB |      2%      Default |
    +-------------------------------+----------------------+----------------------+
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
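
Each CUDA release needs a minimum driver version, so it is worth confirming that the driver just installed is new enough for CUDA 10.0. The sketch below is only an illustration: the minimum versions are taken from NVIDIA's CUDA release notes for the initial 10.0 and 10.1 releases on Linux and should be double-checked there, and the names MIN_DRIVER, version_tuple and driver_supports are hypothetical.

```python
# Minimum Linux driver versions per CUDA release, as listed in NVIDIA's CUDA
# release notes (assumption: initial releases; verify for your exact build).
MIN_DRIVER = {"10.0": "410.48", "10.1": "418.39"}


def version_tuple(text):
    """Turn '430.50' into (430, 50) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(driver_version, cuda_version):
    """Return True if the driver meets the minimum for the given CUDA release."""
    minimum = MIN_DRIVER[cuda_version]
    return version_tuple(driver_version) >= version_tuple(minimum)


if __name__ == "__main__":
    print(driver_supports("430.50", "10.0"))
```

With the 430.50 driver shown in the output above, the check passes for CUDA 10.0.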

Next, install CUDA.

[CUDA 10.0] https://developer.nvidia.com/cuda-10.0-download-archive
[Latest version, CUDA 10.1 at the time of writing] https://developer.nvidia.com/cuda-downloads

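(1) For example, here we use 64-bit CentOS 7 on Linux. On the download page, select the matching platform options and then the installer type. The file run in the commands that follow, cuda_10.0.130_410.48_linux.run, is the runfile (local) installer.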


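    # Download the package from the page above (about 1.9 GB),
    # then run the installer
    sudo sh cuda_10.0.130_410.48_linux.run

Note: the installation is fairly slow and takes a while. Accept the defaults at the prompts that appear along the way.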

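(2) Configure the environment. Here the variables are added to the system-wide environment rather than to the per-user ~/.bash_profile.

Open the file:

    vi /etc/profile

Append the following settings to the end of the file:

    export PATH=/usr/local/cuda-10.0/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH

Apply the changes to the current shell:

    source /etc/profile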
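
To confirm that both variables actually took effect in the current session, a check along these lines can help. This is a minimal sketch under the assumption that CUDA lives in /usr/local/cuda-10.0, as in the exports above; the function name missing_cuda_entries is made up for illustration.

```python
import os


def missing_cuda_entries(environ, cuda_home="/usr/local/cuda-10.0"):
    """Return the CUDA directories absent from PATH and LD_LIBRARY_PATH."""
    wanted = {
        "PATH": cuda_home + "/bin",
        "LD_LIBRARY_PATH": cuda_home + "/lib64",
    }
    missing = {}
    for variable, directory in wanted.items():
        # Both variables are colon-separated lists of directories on Linux.
        entries = environ.get(variable, "").split(":")
        if directory not in entries:
            missing[variable] = directory
    return missing


if __name__ == "__main__":
    # An empty dictionary means both directories are present.
    print(missing_cuda_entries(os.environ))
```

If the result is not empty, the shell has probably not re-read /etc/profile yet; opening a new login shell or sourcing the file again usually resolves that.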

(3) Verify the installation by running the following in a terminal: nvcc -V

Output like the following indicates success:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sat_Aug_25_21:08:01_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.130
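
The release number in that output is the part that has to match what TensorFlow expects. As a small illustration (the function name nvcc_release is hypothetical, and the pattern assumes the "release X.Y" wording shown above), it can be extracted like this:

```python
import re


def nvcc_release(nvcc_output):
    """Extract the 'release X.Y' number from `nvcc -V` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 10.0, V10.0.130"
    print(nvcc_release(sample))
```

In a real check, the text passed in would be the captured output of nvcc -V rather than a hard-coded sample line.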

2. (continued) Installing the cuDNN acceleration library

Download page: https://developer.nvidia.com/rdp/cudnn-download


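Here we download cudnn-10.0-linux-x64-v7.6.4.38.tgz.

Note: you must register and log in before you can choose a version and download it.

Next, run the following commands to unpack and install it:

    # Unpack the archive; it extracts into a directory named cuda
    tar -xzvf cudnn-10.0-linux-x64-v7.6.4.38.tgz
    # Copy the header and the libraries into the CUDA installation
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    # Make the copied files readable by all users
    sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*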
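
One way to confirm which cuDNN version ended up in place is to read the version macros from the copied header. In cuDNN 7.x these macros (CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL) are defined in cudnn.h. The sketch below is illustrative only; the function name cudnn_version is made up, and the header path in the usage part assumes the copy destination used above.

```python
import re


def cudnn_version(header_text):
    """Return 'MAJOR.MINOR.PATCHLEVEL' from the text of cudnn.h, or None."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+%s\s+(\d+)" % macro, header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    path = "/usr/local/cuda/include/cudnn.h"  # assumed install location
    try:
        with open(path) as handle:
            print(cudnn_version(handle.read()))
    except OSError:
        print("cudnn.h not found at", path)
```

For the archive used in this tutorial, the first three components of the version should come out as 7.6.4.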

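3. Installing the tensorflow-gpu package

First, activate a conda virtual environment. Here it is one created for a CV course, named cv_dl:

    [root@localhost ~]# source activate cv_dl
    (cv_dl) [root@localhost ~]#

Inside the Python virtual environment, a plain pip install tensorflow-gpu downloads the latest stable release of TensorFlow. Here we pin version 2.0.0, which was the latest stable release at the time of writing:

    # Adding a mirror in mainland China with -i https://pypi.tuna.tsinghua.edu.cn/simple speeds up the download
    (cv_dl) [root@localhost ~]# pip install tensorflow-gpu==2.0.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

Run a test. The output shows that this machine has two GTX 1080 Ti cards: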

    import tensorflow as tf
    con = tf.constant('hello world')
    print(con)
    # Output like the following indicates success; it shows the two GTX 1080 Ti cards in this CentOS machine
    ...
    2019-10-17 00:25:36.794546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10481 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
    2019-10-17 00:25:36.796989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10481 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
    tf.Tensor(b'hello world', shape=(), dtype=string)
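
Instead of reading the log lines, the GPUs can also be listed explicitly. The sketch below uses tf.config.experimental.list_physical_devices, which is available in TensorFlow 2.0. The helper names describe_gpus and visible_gpu_names are made up for this example, and the import is guarded so the script also runs, reporting that TensorFlow is missing, on a machine where it is not installed.

```python
def describe_gpus(device_names):
    """Return a one-line summary for a list of device name strings."""
    if not device_names:
        return "no GPU visible to TensorFlow"
    return "%d GPU(s): %s" % (len(device_names), ", ".join(device_names))


def visible_gpu_names():
    """Return the names of GPUs TensorFlow can see, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    devices = tf.config.experimental.list_physical_devices("GPU")
    return [device.name for device in devices]


if __name__ == "__main__":
    names = visible_gpu_names()
    if names is None:
        print("TensorFlow is not installed in this environment")
    else:
        print(describe_gpus(names))
```

On the machine used in this tutorial, the summary would be expected to list two devices. An empty list usually points back to a CUDA or cuDNN version mismatch from the earlier steps.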
