Assumptions
You have access to a Linux server on which the following are installed:
- GPUs,
- CUDA Toolkit (for details: NVIDIA’s documentation),
- NVIDIA drivers compatible with the CUDA Toolkit,
- cuDNN (for details: NVIDIA’s documentation),
- Python 3.x, and
- Jupyter.
You want to install TensorFlow with Anaconda and run a Jupyter notebook remotely.
Installation
- Download Anaconda
wget http://repo.continuum.io/archive/Anaconda3-4.3.0-Linux-x86_64.sh
- Install Anaconda
sh Anaconda3-4.3.0-Linux-x86_64.sh
- Create a new virtual environment
conda create -n ENVIRONMENT-NAME
Replace ENVIRONMENT-NAME with a name of your choice.
- Activate the new environment
source activate ENVIRONMENT-NAME
To see which environment is active, run the following command and look for the entry marked with *
conda info --envs
- Set environment variables
export CUDA_HOME=/usr/local/cuda-9.0
export PATH=$PATH:/usr/local/cuda-9.0/bin
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64
These paths are for CUDA 9.0; adjust them if you use a different version.
- Install tensorflow-gpu
pip install --ignore-installed --upgrade tensorflow-gpu
- Or, for a CPU-only setup, install tensorflow
pip install --ignore-installed --upgrade tensorflow
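The environment-variable step above can also be sketched in Python, which makes it easier to switch between CUDA versions. This is only an illustration, not part of the original setup: the helper name cuda_env is hypothetical, and it assumes the toolkit is installed under /usr/local/cuda-<version>, with executables in bin and shared libraries in lib64 and extras/CUPTI/lib64.

```python
import os


def cuda_env(version, base_env=None):
    """Return a copy of base_env extended with CUDA variables.

    Assumption: the toolkit lives under /usr/local/cuda-<version>.
    """
    env = dict(os.environ if base_env is None else base_env)
    home = "/usr/local/cuda-" + version
    env["CUDA_HOME"] = home

    # Executables such as nvcc live in bin; append it to the existing PATH.
    path_parts = [p for p in (env.get("PATH", ""), home + "/bin") if p]
    env["PATH"] = ":".join(path_parts)

    # Shared libraries (libcublas, libcupti, ...) live in the lib64 directories.
    lib_parts = [home + "/lib64", home + "/extras/CUPTI/lib64"]
    existing = env.get("LD_LIBRARY_PATH", "")
    if existing:
        lib_parts.append(existing)
    env["LD_LIBRARY_PATH"] = ":".join(lib_parts)
    return env


if __name__ == "__main__":
    env = cuda_env("9.0")
    for name in ("CUDA_HOME", "PATH", "LD_LIBRARY_PATH"):
        print("export {}={}".format(name, env[name]))
```

The returned dictionary could be passed as the env argument of subprocess.run when launching a program that needs these variables.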
Checking Installation
1. Run nvcc and you should see release 9.0 on the last line
nvcc --version
2. Run nvidia-smi to see what is happening on the GPU(s)
nvidia-smi
3. Run python
python
and then paste this code (taken from here)
# Python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
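The first check can be automated with a small script. The sketch below is an illustration under assumptions: the function names are hypothetical, and it assumes nvcc prints a line containing "release X.Y" (for example "Cuda compilation tools, release 9.0, V9.0.176").

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Extract the release number (e.g. '9.0') from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc and return its release string, or None if nvcc cannot be run."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        # nvcc is not on PATH (or is not executable).
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", installed_cuda_release())
```

If the printed release is None, nvcc is probably not on PATH, which usually means the environment variables were not set in the current shell.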
Start a jupyter notebook
- Here is the script I use to start a jupyter notebook after activating the new environment
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-9.0
export PATH=$PATH:/usr/local/cuda-9.0/bin
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64
# after setting the environment variables start the notebook
jupyter-notebook --port=8899 --no-browser
You will see something like:
http://localhost:8899/?token=YOUR-TOKEN
- Tunnel the port from your local machine
ssh -L 8899:127.0.0.1:8899 your-username@server-address
- Open a browser on your local machine and paste the address printed by the notebook server (the http://localhost:8899/?token=YOUR-TOKEN line above).
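The tunnel command and the notebook address share the same port, so it can help to build both from one value. The following is a minimal sketch with hypothetical helper names (tunnel_command and notebook_token); it only assembles strings and does not open a connection.

```python
from urllib.parse import parse_qs, urlsplit


def tunnel_command(username, server, port=8899):
    """Build the ssh command that forwards a local port to the same port on the server."""
    forward = "{0}:127.0.0.1:{0}".format(port)
    return ["ssh", "-L", forward, "{}@{}".format(username, server)]


def notebook_token(url):
    """Return the token query parameter of a notebook URL, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("token")
    return values[0] if values else None


if __name__ == "__main__":
    # Placeholders as used in the text; replace them with your own values.
    print(" ".join(tunnel_command("your-username", "server-address")))
    print(notebook_token("http://localhost:8899/?token=YOUR-TOKEN"))
```

Keeping the port in one place avoids a mismatch between the --port option of the notebook and the -L option of ssh.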
Possible error
I ran into a libcublas.so.9.0 error. In my case the problem was that
/usr/local/cuda was pointing at /usr/local/cuda-8.0/
You can check that with
ls -lh /usr/local/cuda
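The same check can be done from Python. This is a sketch with hypothetical helper names; it assumes, as in my case, that /usr/local/cuda is a symbolic link to a versioned directory such as /usr/local/cuda-9.0.

```python
import os


def cuda_link_target(link="/usr/local/cuda"):
    """Return the directory the CUDA link resolves to, or None if the link does not exist."""
    if not os.path.lexists(link):
        return None
    return os.path.realpath(link)


def points_at_version(link, version):
    """True if the resolved directory is named cuda-<version>, e.g. cuda-9.0."""
    target = cuda_link_target(link)
    return target is not None and os.path.basename(target) == "cuda-" + version


if __name__ == "__main__":
    print("/usr/local/cuda ->", cuda_link_target())
    print("matches 9.0:", points_at_version("/usr/local/cuda", "9.0"))
```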
If you are using my script, be sure to start it with source, which runs the script in your current shell rather than in a new process.