This tutorial will get you a fresh build of PyTorch v0.4.1 on Fedora 28 with the latest versions of CUDA and cuDNN. You should be able to complete this tutorial in under 30 minutes.
First you will need to install CUDA 9.2 and cuDNN 7.2 from Nvidia's website; both are required by PyTorch. I suggest installing them in a dedicated directory, for example /opt/nvidia/cuda-9.2/, so that you can keep your different dependencies organized.

Install CUDA 9.2 and cuDNN 7.2

You can install CUDA using the interactive installer with sudo ./cuda_9.2.148_396.37_linux.run, or directly in silent mode with sudo ./cuda_9.2.148_396.37_linux.run --toolkitpath=/opt/nvidia/cuda-9.2/ --override --no-drm --silent --toolkit --verbose. Complete the procedure by applying the 9.2 patch: sudo ./cuda_9.2.148.1_linux.run.
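
Put together, the unattended variant looks like this (a sketch based on the commands above; the installer filenames correspond to the 9.2.148 release, adjust them to the files you downloaded):

# create the target directory and run the toolkit installer unattended
sudo mkdir -p /opt/nvidia/cuda-9.2
sudo ./cuda_9.2.148_396.37_linux.run --toolkitpath=/opt/nvidia/cuda-9.2/ --override --no-drm --silent --toolkit --verbose
# apply the 9.2 patch; its installer asks for the toolkit location, point it at /opt/nvidia/cuda-9.2/
sudo ./cuda_9.2.148.1_linux.run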

Once CUDA is installed, it should look like this:

[fedora28@iv-ms-593 nvidia]$ ll /opt/nvidia/cuda-9.2/
total 68
drwxr-xr-x  3 root root 4096 Aug 23 13:27 bin
drwxr-xr-x  5 root root 4096 Aug 23 12:12 doc
drwxr-xr-x  5 root root 4096 Aug 23 12:12 extras
drwxr-xr-x  5 root root 4096 Aug 23 14:57 include
drwxr-xr-x  5 root root 4096 Aug 23 12:12 jre
drwxr-xr-x  3 root root 4096 Aug 23 14:58 lib64
drwxr-xr-x  8 root root 4096 Aug 23 12:12 libnsight
drwxr-xr-x  7 root root 4096 Aug 23 12:12 libnvvp
drwxr-xr-x  2 root root 4096 Aug 23 12:12 nsightee_plugins
drwxr-xr-x  3 root root 4096 Aug 23 12:12 nvml
drwxr-xr-x  7 root root 4096 Aug 23 12:12 nvvm
drwxr-xr-x  2 root root 4096 Aug 23 12:12 pkgconfig
drwxr-xr-x 11 root root 4096 Aug 23 12:12 samples
drwxr-xr-x  3 root root 4096 Aug 23 12:12 share
drwxr-xr-x  2 root root 4096 Aug 23 12:12 src
drwxr-xr-x  2 root root 4096 Aug 23 12:12 tools
-rw-r--r--  1 root root   50 Aug 23 13:27 version.txt

Then copy the contents of the cuDNN archive into the corresponding CUDA directories:

[fedora28@iv-ms-593 nvidia]$ ll /opt/nvidia/cudnn/
total 48
drwxr-xr-x 2 root root  4096 Aug 23 14:57 include
drwxr-xr-x 2 root root  4096 Aug 23 14:58 lib64
-r--r--r-- 1 root root 38963 Jun 30 03:40 NVIDIA_SLA_cuDNN_Support.txt
[fedora28@iv-ms-593 nvidia]$ cp -r /opt/nvidia/cudnn/include/* /opt/nvidia/cuda-9.2/include
[fedora28@iv-ms-593 nvidia]$ cp -r /opt/nvidia/cudnn/lib64/* /opt/nvidia/cuda-9.2/lib64
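
You can quickly verify that the cuDNN headers and libraries ended up in the right place (an optional sanity check):

# the libraries should sit next to the CUDA ones,
# and cudnn.h should report version 7.2.x
ls /opt/nvidia/cuda-9.2/lib64/libcudnn*
grep -A 2 "define CUDNN_MAJOR" /opt/nvidia/cuda-9.2/include/cudnn.h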

Export environment variables

In your ~/.bashrc file, add the variables required for CUDA to be detected during the compilation of PyTorch, and for the CUDA tools to be available on your PATH if you wish to use them:

# CUDA
MY_CUDA=/opt/nvidia/cuda-9.2
export CUDA_INC_PATH=$MY_CUDA/include
export CUDA_INCLUDE_DIRS=$MY_CUDA/include
export LD_LIBRARY_PATH=$MY_CUDA/lib64:$LD_LIBRARY_PATH
export PATH=$MY_CUDA/bin:$PATH
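
After reloading your shell, a quick check that the 9.2 toolkit is the one being picked up (optional):

# nvcc should resolve to /opt/nvidia/cuda-9.2/bin/nvcc and report release 9.2
source ~/.bashrc
which nvcc
nvcc --version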

Install Anaconda

PyTorch officially relies on conda, both for its prebuilt binaries and for managing the dependencies of a source build. I recommend using Miniconda, which is lighter than Anaconda. During the installation, specify a dedicated directory where your virtual environments (libraries, binaries, etc.) will be located.
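
For example, a non-interactive Miniconda install into a dedicated directory could look like the following sketch (the download URL is an assumption, and ~/deps/miniconda3 simply matches the path used later in this tutorial; adapt both to your setup):

# download the installer and install it silently (-b) into a chosen prefix (-p)
curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p ~/deps/miniconda3
# make conda available in the current shell
source ~/deps/miniconda3/etc/profile.d/conda.sh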

Dedicated PyTorch virtual environment

It’s now time to install the dependencies required by PyTorch. Let’s first create and activate a dedicated environment for compiling this version.

conda create --name pytorch-v0.4.1
# or, if you need a specific Python version
conda create --name pytorch-v0.4.1 python=3.6
# Load the virtual env
source activate pytorch-v0.4.1
conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing
conda install -c mingfeima mkldnn
conda install -c pytorch magma-cuda92

CMake needs to know where these dependencies are located so that it can pick them up during compilation. You can get the root of your virtual env with:

(pytorch-v0.4.1) [fedora28@iv-ms-593]$ which python
~/deps/miniconda3/envs/pytorch-v0.4.1/bin/python

Here ~/deps/miniconda3/envs/pytorch-v0.4.1/ is the root directory of the pytorch-v0.4.1 env. Set the CMAKE_PREFIX_PATH variable accordingly:

# update the dir with your path
export CMAKE_PREFIX_PATH=~/deps/miniconda3/envs/pytorch-v0.4.1/

Download PyTorch source

git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# check out version 0.4.1
git checkout remotes/origin/v0.4.1
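
After switching to the 0.4.1 revision, it is worth re-syncing the submodules so that the bundled third-party code (including Eigen) matches that revision; a short optional step:

# make sure the third-party submodules match the checked-out revision
git submodule sync
git submodule update --init --recursive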

GCC and G++

On most systems, the default gcc and g++ versions are too recent to be supported by CUDA. To solve this issue, we will rely on two packages maintained in the negativo17 repository.

sudo dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo
# install GCC and G++
sudo dnf install cuda-gcc-7.3.0-1.fc28.x86_64 cuda-gcc-c++-7.3.0-1.fc28.x86_64
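
You can confirm that the CUDA-compatible compilers are now available (optional):

# the negativo17 packages install the compilers as cuda-gcc / cuda-g++
cuda-gcc --version
cuda-g++ --version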

Eigen dependency

Eigen is required at various levels in PyTorch as well as in Caffe2 (on which PyTorch now relies). The problem is that CUDA 9.1 introduced a bug that is only fixed in recent versions of Eigen. Chances are your system does not ship the patched release, or you don't have root permission to update it. No problem: PyTorch pulls the correct Eigen version as a third-party dependency, but compiles it only if Eigen is not already detected on your system. To block detection of the system version of Eigen, there is one hack: add a CMake option to pytorch/tools/build_pytorch_libs.sh:

# line 282

## before update
-DCMAKE_SHARED_LINKER_FLAGS="$LDFLAGS $USER_LDFLAGS" ${EXTRA_CAFFE2_CMAKE_FLAGS[@]}
# STOP!!! Are you trying to add a C or CXX flag?  Add it

## after update
-DCMAKE_SHARED_LINKER_FLAGS="$LDFLAGS $USER_LDFLAGS" \
-DCMAKE_DISABLE_FIND_PACKAGE_Eigen3=1 ${EXTRA_CAFFE2_CMAKE_FLAGS[@]}
# STOP!!! Are you trying to add a C or CXX flag?  Add it
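
Assuming you applied the edit above, you can double-check from the root of the pytorch repository that the new option is in place:

# should print the line containing the added CMake flag
grep -n "CMAKE_DISABLE_FIND_PACKAGE_Eigen3" tools/build_pytorch_libs.sh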

Build

(pytorch-v0.4.1) [fedora28@iv-ms-593 pytorch]$ CC=cuda-gcc CXX=cuda-g++ python setup.py install
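
The build takes a while; keeping a log makes it easier to diagnose a failure, should one occur (optional):

# same build command, with the output also captured to a log file
CC=cuda-gcc CXX=cuda-g++ python setup.py install 2>&1 | tee build.log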

Enjoy

(pytorch-v0.4.1) [fedora28@iv-ms-593 pytorch]$ cd
(pytorch-v0.4.1) [fedora28@iv-ms-593 ~]$ python
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 17:14:51) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> x=torch.Tensor(10,10).random_(10).cuda()
>>> x.sum()
tensor(425., device='cuda:0')
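
A couple of extra checks confirm which CUDA and cuDNN versions the package was compiled against (these attributes exist in PyTorch 0.4.1):

# prints the CUDA toolkit version, the cuDNN build number and GPU availability
python -c "import torch; print(torch.version.cuda); print(torch.backends.cudnn.version()); print(torch.cuda.is_available())"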

Acknowledgment

This tutorial would not have been possible without the incredible help of Thomas Grenier. Many thanks to him!