
Problems and fixes when installing TensorFlow on Ubuntu 20.04

 2 years ago
source link: https://allenwind.github.io/blog/12238/

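I installed the newly released Ubuntu 20.04 to try it out and ran into a number of problems installing TensorFlow locally. This post records the fixes for the two main issues: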

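  • CUDA 10.1 requires gcc <= 8, but Ubuntu 20.04 ships gcc 9.3.0
  • Python 3.8 (the system default on Ubuntu 20.04)

The CUDA installer log shows the compiler check failing:

$ cat /var/log/cuda-installer.log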
...
[ERROR]: unsupported compiler version: 9.3.0. Use --override to override this check.
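
The same compiler check can be run before starting the installer. The following is a minimal sketch, assuming `gcc --version` prints the version number at the end of its first line (as it does on Ubuntu). The helper names `parse_gcc_major` and `gcc_ok_for_cuda` are hypothetical, and the limit of 8 is taken from the CUDA 10.1 requirement listed above.

```python
import re
import shutil
import subprocess

MAX_GCC_MAJOR = 8  # CUDA 10.1 requires gcc <= 8 (from the list above)


def parse_gcc_major(version_text):
    """Return the gcc major version from `gcc --version` output, or None."""
    lines = version_text.splitlines()
    if not lines:
        return None
    # Take the last dotted version number on the first line, e.g. "9.3.0".
    matches = re.findall(r"(\d+)\.\d+(?:\.\d+)?", lines[0])
    return int(matches[-1]) if matches else None


def gcc_ok_for_cuda(version_text, max_major=MAX_GCC_MAJOR):
    """True if the reported gcc major version is at most max_major."""
    major = parse_gcc_major(version_text)
    return major is not None and major <= max_major


if __name__ == "__main__":
    if shutil.which("gcc") is None:
        print("gcc not found on PATH")
    else:
        out = subprocess.run(
            ["gcc", "--version"], capture_output=True, text=True
        ).stdout
        print("gcc major version:", parse_gcc_major(out))
        print("accepted by CUDA 10.1 installer:", gcc_ok_for_cuda(out))
```

If the check reports False, either switch the default gcc to version 8 or lower, or pass `--override` to the installer as the log message suggests.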

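Fixing the Python version problem

Use Conda or Docker to create an environment with the Python version you need; the original post links to a reference for this.

Fixing the gcc version problem

For this, see the earlier post "Linux系统中安装多版本gcc" (installing multiple gcc versions on Linux).

Installing the CUDA Toolkit

Download the runfile (.run) version of CUDA.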

$ chmod a+x cuda_10.1.243_418.87.00_linux.run
$ sudo ./cuda_10.1.243_418.87.00_linux.run --silent --toolkit --samples --librarypath=/usr/local/cuda

You can also run ./cuda_10.1.243_418.87.00_linux.run --help to see the other options.

Check the CUDA version:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
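
When scripting the setup, the release number can be pulled out of this output instead of being read by eye. A minimal sketch, assuming the `release X.Y, VX.Y.Z` line format shown above; `parse_nvcc_release` is a hypothetical helper name.

```python
import re

# Sample text copied from the `nvcc --version` output shown above.
SAMPLE_NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
"""


def parse_nvcc_release(nvcc_output):
    """Return (major, minor, full_version) from nvcc output, or None."""
    match = re.search(r"release (\d+)\.(\d+), V(\d+\.\d+\.\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE_NVCC_OUTPUT))  # (10, 1, '10.1.243')
```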

cuDNN

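Download cuDNN from the link given in the original post (cudnnlib), then unpack it and copy the files into the CUDA directory: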

$ tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

Add the following to .bashrc:

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
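
After opening a new shell, it is worth confirming that the copied cuDNN files are actually reachable through `LD_LIBRARY_PATH`. The sketch below is one way to do that; `find_libs` is a hypothetical helper, and the `libcudnn*` and `libcudart*` patterns are assumptions based on the file names used in the copy step above.

```python
import glob
import os


def find_libs(search_path, pattern="libcudnn*"):
    """Map each directory in a colon-separated path to the files matching pattern."""
    found = {}
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        matches = sorted(glob.glob(os.path.join(directory, pattern)))
        if matches:
            found[directory] = matches
    return found


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    for pattern in ("libcudnn*", "libcudart*"):
        print(pattern, "->", find_libs(path, pattern) or "not found")
```

An empty result for `libcudnn*` usually means either the copy step was skipped or `.bashrc` has not been reloaded in the current shell.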

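Installing TensorFlow

Create and activate a conda environment (the name tf and Python 3.7 are examples; pick a version your TensorFlow release supports):

$ conda create -n tf python=3.7
$ conda activate tf

Then install TensorFlow from the Tsinghua PyPI mirror:

$ pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow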
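
Before training anything, a quick check can confirm that TensorFlow imports and sees the GPU. This is a minimal sketch, assuming TensorFlow 2.1 or later (where `tf.config.list_physical_devices` is available); `tensorflow_gpu_report` is a hypothetical helper name.

```python
def tensorflow_gpu_report():
    """Return basic facts about the TensorFlow install, or an error note."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        return {"installed": False, "error": str(exc)}
    return {
        "installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty list here means TensorFlow cannot see any GPU.
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If `gpus` comes back empty while `built_with_cuda` is True, the usual suspects are the CUDA and cuDNN library paths set in .bashrc.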

Write a simple model to test the installation:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D

num_classes = 10
img_rows, img_cols = 28, 28

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# channels-last layout: (samples, rows, cols, channels)
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test = tf.keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=tf.keras.losses.categorical_crossentropy,
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=1024,
          epochs=20,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

While the model trains, monitor GPU usage in another terminal:

$ watch nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0  On |                  N/A |
| 37%   59C    P2   212W / 255W |   3770MiB /  7974MiB |     91%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1232      G   /usr/lib/xorg/Xorg                            60MiB |
|    0      1943      G   /usr/lib/xorg/Xorg                           283MiB |
|    0      2145      G   /usr/bin/gnome-shell                         135MiB |
|    0      2551      G   ...AAAAAAAAAAAACAAAAAAAAAA= --shared-files   236MiB |
|    0      8349      G   ...quest-channel-token=1436292411171661387   296MiB |
|    0     27901      G   /usr/bin/totem                                18MiB |
|    0     56093      C   python3                                     2715MiB |
+-----------------------------------------------------------------------------+
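
The same utilization and memory figures can be read from a script using nvidia-smi's CSV query mode. This is a minimal sketch, not part of the original post: `parse_gpu_csv` and `query_gpus` are hypothetical helper names, and the sample values in the comment are the ones from the table above.

```python
import shutil
import subprocess

QUERY_FIELDS = "utilization.gpu,memory.used,memory.total"


def _to_int(field):
    """Convert a CSV field to int; return None for values like '[N/A]'."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_gpu_csv(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        util, used, total = [field.strip() for field in line.split(",")]
        rows.append({
            "util_percent": _to_int(util),
            "mem_used_mib": _to_int(used),
            "mem_total_mib": _to_int(total),
        })
    return rows


def query_gpus():
    """Run nvidia-smi and return parsed rows, or [] if it is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)


if __name__ == "__main__":
    # For the GPU shown above, the CSV line would read "91, 3770, 7974".
    print(query_gpus())
```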

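You can also test with the source code from https://github.com/tensorflow/benchmarks.

That covers installing TensorFlow on Ubuntu 20.04. The release is still new, though, so it is hard to say what other problems will come up, and many tools do not support it yet. Treat it as something to try out for now rather than migrating your development environment to it.

When reposting, please include this post's URL: https://allenwind.github.io/blog/12238/
For more posts, see: https://allenwind.github.io/blog/archives/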

