Docker GPU
How to get Docker working with GPU support
Ref:
https://github.com/NVIDIA/nvidia-docker
https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/docker
https://github.com/floydhub/dl-docker
First, clone the tensorflow repository. The directory tensorflow/tools/docker inside it contains the Dockerfile we will use to build TensorFlow for GPU.
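Concretely, the clone step looks like this (assumes git is installed; the repo URL is from the references above):

```shell
# Clone the TensorFlow repo and move to the Docker tooling directory
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/tensorflow/tools/docker
```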
However, the TensorFlow installation via pip in the stock Dockerfile does not work, so comment out those lines as shown:
# Install TensorFlow GPU version.
#RUN pip --no-cache-dir install \
#http://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.0.0-cp27-none-linux_x86_64.whl
And add this at the top of the file:
FROM gcr.io/tensorflow/tensorflow:latest-gpu
Latest setup for an AWS p2.xlarge instance (run as root, or prefix each command with sudo):
apt-get remove docker docker-engine docker.io
apt-get update
apt-get install \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
apt-key fingerprint 0EBFCD88
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
apt-get update
apt-get install docker-ce
./aws_gpu_instance_setup.sh
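After the install, a quick sanity check that Docker and the NVIDIA runtime can see the GPU (this assumes nvidia-docker was installed by the setup script above):

```shell
docker --version
# should print the nvidia-smi GPU table from inside a container
nvidia-docker run --rm nvidia/cuda nvidia-smi
```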
The modified Dockerfile for GPU looks like the following. Note that with two FROM lines, the instructions that follow are applied on top of the last base image, gcr.io/tensorflow/tensorflow:latest-gpu:
FROM nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04
FROM gcr.io/tensorflow/tensorflow:latest-gpu
MAINTAINER Craig Citro <craigcitro@google.com>
ARG DEBIAN_FRONTEND=noninteractive
# Pick up some TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
curl \
libfreetype6-dev \
libpng12-dev \
libzmq3-dev \
pkg-config \
python \
python-dev \
rsync \
software-properties-common \
unzip \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
#==================================
# Amit - custom installs
#==================================
#RUN apt-get install -y apt-utils
# Install nodejs
RUN curl -sL https://deb.nodesource.com/setup_7.x | bash -
RUN apt-get install -y nodejs
# for scp
RUN apt-get install -y openssh-client
# for git
RUN apt-get install -y git
# Install mysql-server
RUN apt-get install -y mysql-server
# Note: a service started during "docker build" does not persist into the image;
# run "service mysql start" inside the container at runtime instead.
#RUN service mysql start
# Amit - custom installs
RUN apt-get install -y vim
RUN apt-get install -y cython
RUN apt-get install -y python-pandas
RUN apt-get install -y python-cairosvg
RUN apt-get install -y python-pydot
RUN apt-get install -y python-pygraphviz
RUN apt-get install -y s3cmd
RUN apt-get install -y python-boto
RUN apt-get install -y python-mysqldb
RUN pip install --upgrade pip
RUN pip install pydotplus
RUN pip install graphviz
RUN pip install keras
RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py && \
rm get-pip.py
RUN pip --no-cache-dir install \
ipykernel \
jupyter \
matplotlib \
numpy \
scipy \
sklearn \
pandas \
Pillow \
&& \
python -m ipykernel.kernelspec
# --- DO NOT EDIT OR DELETE BETWEEN THE LINES --- #
# These lines will be edited automatically by parameterized_docker_build.sh. #
# COPY _PIP_FILE_ /
# RUN pip --no-cache-dir install /_PIP_FILE_
# RUN rm -f /_PIP_FILE_
# Install TensorFlow GPU version.
#RUN pip --no-cache-dir install \
#http://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.0.0-cp27-none-linux_x86_64.whl
# --- ~ DO NOT EDIT OR DELETE BETWEEN THE LINES --- #
# RUN ln -s /usr/bin/python3 /usr/bin/python
# Set up our notebook config.
COPY jupyter_notebook_config.py /root/.jupyter/
# Copy sample notebooks.
COPY notebooks /notebooks
# Jupyter has issues with being run directly:
# https://github.com/ipython/ipython/issues/7062
# We just add a little wrapper script.
COPY run_jupyter.sh /
# For CUDA profiling, TensorFlow requires CUPTI.
ENV LD_LIBRARY_PATH /usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
# TensorBoard
EXPOSE 6006
# IPython
EXPOSE 8888
WORKDIR "/notebooks"
#CMD ["/run_jupyter.sh", "--allow-root"]
CMD ["/bin/bash"]
#CMD ["nohup" "./run_jupyter.sh" "--allow-root" ">" "tf_files/nohup.out" "2>&1" "<" "/dev/null" "&"]
Next execute:
BUILD THE IMAGE:
docker build -t custom-agshift-docker-image-gpu .
RUN THE IMAGE:
nvidia-docker run -p 8888:8888 -p 6006:6006 --name custy-agshift-gpu -it -v /home/ubuntu/tf_files:/tf_files custom-agshift-docker-image-gpu
THEN, FROM THE DOCKER BASH SHELL, RUN:
nohup ./run_jupyter.sh --allow-root > tf_files/nohup.out 2>&1 < /dev/null &
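The redirections in the nohup line are what let Jupyter survive the shell exiting: stdout and stderr are captured in tf_files/nohup.out, and stdin is detached from the terminal. The same pattern, demonstrated with a harmless placeholder command in place of run_jupyter.sh:

```shell
# nohup + redirections, with an echo standing in for run_jupyter.sh
nohup sh -c 'echo jupyter-placeholder' > /tmp/nohup_demo.out 2>&1 < /dev/null &
wait                                  # wait for the background job to finish
captured=$(cat /tmp/nohup_demo.out)   # the command's output ended up in the log file
echo "$captured"
```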
INSTALL OPENCV
Follow the instructions here: http://milq.github.io/install-opencv-ubuntu-debian/
The above did not work for me, so I created an install script following the instructions on the OpenCV website.
Make a directory called /tf_files/opencv_install and change into it.
Create the bash script below, then execute it.
#install_opencv_mine.sh
=========================
git clone https://github.com/Itseez/opencv.git
git clone https://github.com/Itseez/opencv_contrib.git
cd opencv
mkdir build
cd build
# point cmake at the contrib modules cloned above (otherwise they are unused)
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules ..
make -j7 # runs 7 jobs in parallel
make install
After this, install the Python OpenCV bindings:
apt-get install -y python-opencv
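To verify the bindings work (run inside the container):

```shell
# import the cv2 module and print the installed OpenCV version
python -c "import cv2; print(cv2.__version__)"
```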
AWS GPU instance (example Jupyter URL with token):
http://ec2-35-164-187-208.us-west-2.compute.amazonaws.com:8888/?token=927d17004747094e5d7aa123342e99e5afcd3ee6a65ea1ea
Note:
If the web browser cannot connect to the Jupyter session, make sure the EC2 instance has the correct 'security group' with the required ports opened up.
Example security group (at least these ports should be open): launch-wizard-3
===================================================================================
Type              Protocol   Port    Source
Custom TCP Rule   TCP        8888    0.0.0.0/0
Custom TCP Rule   TCP        8888    ::/0
Custom TCP Rule   TCP        6006    0.0.0.0/0
Custom TCP Rule   TCP        6006    ::/0
SSH               TCP        22      0.0.0.0/0
HTTPS             TCP        443     0.0.0.0/0
HTTPS             TCP        443     ::/0
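These rules can also be added from the AWS CLI rather than the console. A sketch, assuming the CLI is configured with credentials and the security group is named launch-wizard-3 as above:

```shell
# open the Jupyter (8888) and TensorBoard (6006) ports to IPv4 traffic
aws ec2 authorize-security-group-ingress --group-name launch-wizard-3 \
    --protocol tcp --port 8888 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name launch-wizard-3 \
    --protocol tcp --port 6006 --cidr 0.0.0.0/0
```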
STOP INSTANCE, RESTART INSTANCE, START DOCKER, then RE-RUN JUPYTER
When the GPU instance is stopped and restarted, it is assigned a new public DNS, so the previous way of connecting to the EC2 instance won't work; just substitute the new DNS name in the ssh command.
Everything else remains the same: all files and the Docker container stay intact. Running the following will show that everything is unchanged:
sudo su
cd /home/ubuntu/workdir/tensorflow/tensorflow/tools/docker
docker images -a
docker ps -a # will show that the previous docker process has exited. Copy the container id from here
# NOW JUST START THE DOCKER CONTAINER AND THEN START jupyter
docker start <CONTAINER ID> # for example: f5e826b541b4
docker attach <CONTAINER ID> # for example: f5e826b541b4
cd /
nohup ./run_jupyter.sh --allow-root > tf_files/nohup.out 2>&1 < /dev/null &
To start another docker shell using the same container id, avoiding clashes with a co-user:
docker exec -it f5e8 bash
This opens another bash shell in the same container.
# To start jupyter inside docker from an xterm (say on your own Mac) and then
# detach silently without stopping docker:
> cd /
> nohup ./run_jupyter.sh --allow-root > tf_files/nohup.out 2>&1 < /dev/null &
> Ctrl-P, then Ctrl-Q   # detach key sequence; the container keeps running