
Instructions for CUDA v11.3 and cuDNN 8.2 installation on Ubuntu 20.04 for PyTor...

 1 year ago
source link: https://gist.github.com/Mahedi-61/2a2f1579d4271717d421065168ce6a73

Instructions for CUDA v11.3 and cuDNN 8.2 installation on Ubuntu 20.04 for PyTorch 1.11


Sorry, but this is shell malware that no one should be using, namely:

  1. Lack of sanitization: code like https://gist.github.com/Mahedi-61/2a2f1579d4271717d421065168ce6a73#file-cuda_10-1_installation_on_ubuntu_18-04-L45 appends these lines on every invocation, resulting in overwhelming spam.

You should be using something like:

if ! grep -q pattern path/to/file; then printf '%s\n' "expected content" >> path/to/file; fi

  2. Various other issues mentioned on https://shellcheck.net/?id=cb40763 -> ALWAYS use shellcheck (linting) if you are writing a shell/bash script.

I am willing to contribute if you provide an abstract of what needs to be done and why, as in-code documentation, and assuming this is released under an FSF-approved license.
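
A minimal sketch of the append-only-if-missing idea from the comment above. The helper name `append_once`, the temporary file standing in for `~/.bashrc`, and the `/usr/local/cuda-11.3/bin` path are assumptions for illustration, not taken from the gist.

```shell
#!/usr/bin/env bash
# append_once is a hypothetical helper: add a line to a file only if
# that exact line is not already there.
append_once() {
    local line=$1 file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

rc_file=$(mktemp)   # stand-in for ~/.bashrc so nothing real is modified
# Single quotes keep ${PATH...} literal; the install path is an assumed example.
cuda_line='export PATH=/usr/local/cuda-11.3/bin${PATH:+:${PATH}}'

append_once "$cuda_line" "$rc_file"
append_once "$cuda_line" "$rc_file"   # second call changes nothing

grep -cxF -- "$cuda_line" "$rc_file"  # prints 1
```

Matching the whole line as a fixed string avoids both regex surprises and false hits on lines that merely contain the text.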

Thanks for this great guide!

@Kreyren - where can I download your much better version of this? I can not find it. Thanks!

@supersexy I would need an abstract for that version, meaning what the script should do, how, and when. That hasn't been provided yet, and I am not motivated enough to try to reverse-engineer it.

Thanks! Help me a lot!

Jesus! NVIDIA is messing with us! Like seriously? I had to go through so many installation tutorials, and finally a golden gem of a gist linked in some answer on Stack Overflow works? That's just messed up, man! The official documentation just doesn't work. Anyway, I found out from here that I had to add another line to make it work when using CUDA 10.1 ...

export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Any guesses why, though? Because NVIDIA installs some 10.2 components when installing 10.1. Who would have thunk?
Anyway, peace out and many thanks, my man @Mahedi-61
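
For readers puzzled by the `${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}` part of that export: it appends a colon plus the old value only when the variable is already set and non-empty. A small sketch using a made-up variable `DEMO_LIBS` (and a made-up `/opt/example/lib` directory) so the real environment is left alone:

```shell
cuda_lib=/usr/local/cuda-10.2/lib64

# Case 1: variable unset, so the ${VAR:+...} part expands to nothing.
unset DEMO_LIBS
first="${cuda_lib}${DEMO_LIBS:+:${DEMO_LIBS}}"
echo "$first"    # /usr/local/cuda-10.2/lib64

# Case 2: variable set, so ":<old value>" is appended.
DEMO_LIBS=/opt/example/lib
second="${cuda_lib}${DEMO_LIBS:+:${DEMO_LIBS}}"
echo "$second"   # /usr/local/cuda-10.2/lib64:/opt/example/lib
```

The point of the idiom is to avoid a trailing colon when the variable was empty, since an empty entry in a search path is generally interpreted as the current directory.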

@supersexy I would need an abstract for that version, meaning what the script should do, how, and when. That hasn't been provided yet, and I am not motivated enough to try to reverse-engineer it.

Wow, zero willingness to help, zero willingness to give more detail, just complaining about linter warnings on some code you found on the internet. And malware, talk about crying wolf.

I'll address your complaints:

  • the paths will be reappended every time the script runs source ~/.bashrc. So, once. Hardly overwhelming, and not at all spam. If you care so much about having an aesthetically pleasing $PATH variable, there are scripts on the internet that can remove duplicates, and it gets reset to $(getconf PATH) anyway (probably on reboot).
  • the expressions in lines 45-46 are supposed to be non-expanding since they are written to .bashrc, not executed

There is zero reverse-engineering needed here as you have full access to the sources and documentation (if we're taking ourselves so seriously about a 71-line script).
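
To illustrate the "non-expanding" point, here is a small sketch, assuming the script writes its export lines with single quotes. The temporary file stands in for `~/.bashrc`, and the CUDA path is an assumed example rather than a quote from the gist.

```shell
rc_file=$(mktemp)   # stand-in for ~/.bashrc so nothing real is modified

# Single quotes: the text is written as-is; ${PATH...} is NOT expanded here.
echo 'export PATH=/usr/local/cuda-11.3/bin${PATH:+:${PATH}}' >> "$rc_file"
cat "$rc_file"

# The expansion only happens later, when the file is sourced.
# A subshell with a tiny PATH keeps the demonstration isolated.
result=$(PATH=/usr/bin; . "$rc_file"; echo "$PATH")
echo "$result"   # /usr/local/cuda-11.3/bin:/usr/bin
```

A linter cannot tell that the literal text is intentional, which is why it flags unexpanded expressions inside single quotes.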

What if I want to install CUDA 10.0? Is it the same sequence of instructions? Thanks.

Unpacking xfonts-base (1:1.0.4+nmu1) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-0cZNOW/106-libnvidia-compute-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/107-libnvidia-decode-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/108-libnvidia-encode-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/109-libnvidia-fbc1-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/110-libnvidia-gl-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/111-libnvidia-ifr1-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/112-nvidia-compute-utils-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/117-libnvidia-extra-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/118-nvidia-utils-460_460.27.04-0ubuntu1_amd64.deb
/tmp/apt-dpkg-install-0cZNOW/119-libnvidia-cfg1-460_460.27.04-0ubuntu1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

the best tutorial that saves my life!!!!!

Yes, it's the best TensorFlow installation guide. It resolved all my previous issues.

If you run this in a shell, does TensorFlow recognize the GPUs?
I ran this shell script and it seemed to run without problems, but tensorflow-gpu still doesn't recognize the GPUs.
The tensorflow-gpu version is 2.3.0, which should be compatible with CUDA 10.1 and cuDNN 7.6.

Try this:
sudo apt-get install -y --no-install-recommends \
    cuda-10-1 \
    libcudnn7=7.6.0.64-1+cuda10.1 \
    libcudnn7-dev=7.6.0.64-1+cuda10.1

sudo apt-get install -y --no-install-recommends \
    libnvinfer6=6.0.1-1+cuda10.1 \
    libnvinfer-dev=6.0.1-1+cuda10.1 \
    libnvinfer-plugin6=6.0.1-1+cuda10.1

Also, as someone above said, CUDA 10.1 installs some CUDA 10.2 components.
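
Before reinstalling anything, it may help to see which pieces are visible at all. A rough diagnostic sketch; `report` is a hypothetical helper name, and a library missing from the linker cache can still be found through `LD_LIBRARY_PATH`, so treat the output as a hint only.

```shell
# report is a hypothetical helper: says whether a command is on PATH.
report() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

report nvidia-smi   # driver utility
report nvcc         # CUDA toolkit compiler

# List cuDNN libraries known to the dynamic linker, if any.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | grep libcudnn || echo "libcudnn: not in the linker cache"
fi
```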

it works on my pc. Thanks!

Can you tell me what changes are needed for CUDA 11.1 on an Ubuntu 18.04 system?

Thanks

I was having issues getting the TensorFlow Object Detection API to work without errors. This guide worked for Ubuntu 20.04, CUDA 11.2, cuDNN 8.1.0 and TensorFlow 2.6.

Thanks a lot!

@mnielsen There is an extra i in sudo apt install libnividia-gl-470. I think it should be sudo apt install libnvidia-gl-470

Author

@yummyKnight Thanks for your correction.

Can someone tell me whether sudo ubuntu-drivers autoinstall is the same as the following three commands? Do they do the same job?

sudo apt install libnvidia-common-470
sudo apt install libnvidia-gl-470
sudo apt install nvidia-driver-470

After installing this I was getting the following (non-fatal) warning

>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-11-24 09:01:58.877869: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 09:01:58.899255: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 09:01:58.900051: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Num GPUs Available:  1

I resolved it by following tensorflow/tensorflow#53184
for a in /sys/bus/pci/devices/*; do echo 0 | sudo tee -a "$a/numa_node"; done
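
A read-only sketch for checking which devices actually report a negative NUMA node before writing to them. The function name `list_negative_numa` is made up for this example, and the optional directory argument exists only so it can be tried against a test directory. As far as I know, values written under /sys do not survive a reboot, so the loop above would need to be rerun after each boot.

```shell
# list_negative_numa is a hypothetical helper: print every PCI device
# whose numa_node value is negative. Read-only, no sudo needed.
list_negative_numa() {
    local base=${1:-/sys/bus/pci/devices} f node
    for f in "$base"/*/numa_node; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        node=$(cat "$f")
        if [ "$node" -lt 0 ] 2>/dev/null; then
            echo "${f%/numa_node}: numa_node=$node"
        fi
    done
}

list_negative_numa
```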

Thanks for the nice repo.
I installed everything using your instructions, but when I type nvidia-smi it shows 11.5. Why, and how can I install 11.2?

Works great up till cuDNN, and then I get the following

$ wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
--2022-02-13 13:24:40--  https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
Resolving developer.nvidia.com (developer.nvidia.com)... 152.195.19.142
Connecting to developer.nvidia.com (developer.nvidia.com)|152.195.19.142|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-02-13 13:24:41 ERROR 403: Forbidden.

EDIT: This link worked: wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.1.1/cudnn-11.2-linux-x64-v8.1.1.33.tgz

I want to install CUDA 11.3 or higher version on Ubuntu 18.04 (which is installed using a Virtual Machine). Which instructions should I follow?

Thanks for the nice repo. I installed everything using your instructions, but when I type nvidia-smi it shows 11.5. Why, and how can I install 11.2?

I had to implement the end of this tutorial:
https://towardsdatascience.com/installing-multiple-cuda-cudnn-versions-in-ubuntu-fcb6aa5194e2

I used his .bashrc edit so that TensorFlow (in my case) can choose which CUDA toolkit to use, and it worked.
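
Some background on the question: nvidia-smi reports the highest CUDA version the installed driver supports, not the toolkit version that is installed; `nvcc --version` shows the toolkit. Below is a minimal sketch of a version switcher in the spirit of that tutorial, not a copy of it. The function name `use_cuda` and the `CUDA_BASE` override are hypothetical, and `/usr/local/cuda-<version>` is assumed to be the install location.

```shell
# use_cuda is a hypothetical helper: point the current shell at one
# installed toolkit, assumed to live in /usr/local/cuda-<version>.
use_cuda() {
    local version=$1
    local root=${CUDA_BASE:-/usr/local}/cuda-$version
    if [ ! -d "$root" ]; then
        echo "no toolkit found at $root" >&2
        return 1
    fi
    export CUDA_HOME=$root
    export PATH=$root/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=$root/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
}

# Example: select 11.2 if it is installed, otherwise just say so.
use_cuda 11.2 || echo "CUDA 11.2 toolkit not installed here"
```

Putting the function in `~/.bashrc` and calling it per shell keeps several toolkits side by side without reinstalling.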

Thank you very much @Mahedi-61, much appreciated

RTX 3090 requires driver version of 515 (not 470).

# install nvidia driver with dependencies
sudo apt install libnvidia-common-515
sudo apt install libnvidia-gl-515
sudo apt install nvidia-driver-515

I am wondering whether these work for installing cuda 11.3 on ubuntu 22.04 also?

Will it work for nvidia-server on Ubuntu 20.04 server?

# install nvidia driver with dependencies

sudo apt install libnvidia-common-470-server
sudo apt install libnvidia-gl-470-server
sudo apt install nvidia-driver-470-server

@saravananpsg It works on the server. I tested it. I also changed 470 to 515 to support the 3090.

I also had to change the version from 470 to 515 for a 1070 Ti.

sudo apt install libnvidia-common-515
sudo apt install libnvidia-gl-515
sudo apt install nvidia-driver-515

After installing, if nvidia-smi gives a kernel/client version mismatch error, reboot.
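
The driver commands posted in this thread differ only in the version suffix (470, 515, 470-server). A small sketch that builds the three package names from one value; `driver_packages` is a hypothetical helper, and which version a given card needs is not something this snippet decides.

```shell
# driver_packages is a hypothetical helper: print the three package
# names used in this thread for a given driver version suffix.
driver_packages() {
    local v=$1
    printf '%s\n' "libnvidia-common-$v" "libnvidia-gl-$v" "nvidia-driver-$v"
}

driver_packages 515
driver_packages 470-server
```

The output can be passed to apt in one go, for example `sudo apt install $(driver_packages 515)`.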

This helped A LOT! Thanks!
