In this article, we leverage RAPIDS as the base image, a framework bundling a collection of libraries (including TensorFlow 2.13), to run end-to-end data science pipelines entirely on the GPU. The interface is designed to have a familiar look and feel for anyone working in Python, but uses optimized NVIDIA® CUDA® primitives and high-bandwidth GPU memory under the hood.
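To illustrate the "familiar Python look and feel" claim, here is a minimal sketch using cuDF, the RAPIDS DataFrame library, which mirrors the pandas API; the `try`/`except` fallback to pandas is an assumption for machines without a GPU and is not part of the original article.

```python
# cuDF mirrors the pandas DataFrame interface, so the same code runs
# on the GPU when cuDF is available and on the CPU via pandas otherwise.
try:
    import cudf as xdf  # GPU-accelerated DataFrame (RAPIDS)
except ImportError:
    import pandas as xdf  # CPU fallback with the same API

df = xdf.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
totals = df.groupby("key").sum()
print(totals)
```

The point of the shared API is that pipelines prototyped in pandas can often be moved to the GPU by changing only the import.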
dockerfile
# Ref https://docs.rapids.ai/install
FROM nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10
# Run everything as root
USER root
# Set our locale to en_US.UTF-8.
ENV LANG en_US.UTF-8
ENV LC_CTYPE en_US.UTF-8
# install python packages per your requirements
COPY requirements.txt /
RUN pip3 install --requirement /requirements.txt
# For TensorFlow 2 to run on the GPU, cuDNN and the CUDA toolkit must be installed.
# Moreover, the cuDNN and CUDA toolkit versions must be compatible with the drivers of the GPU you are using.
RUN apt-get update && apt-get install -y software-properties-common
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin && \
mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub && \
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" && \
apt-get update && \
apt-get install -y libcudnn8 && \
apt-get install -y libcudnn8-dev && \
apt-get install -y kmod && \
apt-get install -y nvidia-cuda-toolkit
# check list of available cudnn packages
# apt list -a libcudnn8-dev
# download cudnn for Linux from official Nvidia site
# refer https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html?ncid=em-prod-337416
COPY cudnn-local-repo-ubuntu2204-8.9.5.29_1.0-1_amd64.deb /var
# install cudnn
RUN dpkg -i /var/cudnn-local-repo-ubuntu2204-8.9.5.29_1.0-1_amd64.deb && \
cp /var/cudnn-local-repo-ubuntu2204-8.9.5.29/cudnn-local-535C49CB-keyring.gpg /usr/share/keyrings/ && \
apt-get update && \
apt-get install -y libcudnn8=8.9.5.29-1+cuda11.8 --allow-downgrades && \
apt-get install -y libcudnn8-dev=8.9.5.29-1+cuda11.8 --allow-downgrades && \
apt-get install -y libcudnn8-samples=8.9.5.29-1+cuda11.8 --allow-downgrades
# Note: `export` inside a RUN step does not persist to later layers or to the
# running container; LIBRARY_PATH and XLA_FLAGS are therefore set via ENV below.
# setting up Cloud SDK
RUN export CLOUD_SDK_REPO="cloud-sdk" \
&& echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list \
&& curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - \
&& apt-get update \
&& apt-get install -y \
google-cloud-sdk
# Setting LIBRARY_PATH to include the path to the libdevice library is necessary for compiling CUDA code that uses the NVVM compiler
ENV LIBRARY_PATH="/usr/lib/cuda/nvvm/libdevice:$LIBRARY_PATH"
ENV XLA_FLAGS="--xla_gpu_cuda_data_dir=/usr/lib/cuda"
2. Add the Python package requirements file
requirements.txt
tensorflow>=2.12.0
Cygnus==0.0.1
numpy==1.23.5
pandas==2.0.1
matplotlib==3.7.1
scipy==1.10.1
setuptools==67.7.2
pyhocon==0.3.60
tensorboard>=2.12.3
google-cloud-bigquery>=3.10.0
google-cloud-storage>=2.9.0
3. Build the image
docker build -t tensorflow-gpu:latest .
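As a quick sanity check, you can run the freshly built image and ask TensorFlow to list visible GPUs. This is a sketch that assumes the NVIDIA Container Toolkit is installed on the host; it is not part of the original article.

```shell
# List the GPUs TensorFlow can see inside the container.
# An empty list [] means the container has no GPU access.
docker run --rm --gpus all tensorflow-gpu:latest \
  python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```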
4. Once the image is built, you can push it to your local or cloud Artifact Registry for use in your application
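For example, pushing to Google Artifact Registry might look like the following sketch; the region, project, and repository names here are placeholders to replace with your own values.

```shell
# Tag the local image with a hypothetical Artifact Registry path,
# then push it. Requires prior `gcloud auth configure-docker`.
docker tag tensorflow-gpu:latest \
  us-central1-docker.pkg.dev/my-project/my-repo/tensorflow-gpu:latest
docker push us-central1-docker.pkg.dev/my-project/my-repo/tensorflow-gpu:latest
```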
Example: Run a simple program in Jupyter to demonstrate how to use TensorFlow to train a simple neural network on random data. The code imports NumPy as np and TensorFlow as tf, defines the input dimension, generates random training and test data, and creates random training and test labels.
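The description above can be sketched as the following self-contained script. The layer sizes, epoch count, and dataset shapes are illustrative assumptions, not taken from the original article; TensorFlow will place the computation on the GPU automatically when one is visible, and fall back to the CPU otherwise.

```python
import numpy as np
import tensorflow as tf

# Define the input dimension and generate random data and labels,
# as described in the article.
input_dim = 32
x_train = np.random.random((1000, input_dim)).astype("float32")
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((200, input_dim)).astype("float32")
y_test = np.random.randint(2, size=(200, 1))

# A small dense network for binary classification.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(input_dim,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

history = model.fit(x_train, y_train, epochs=2, batch_size=32, verbose=0)
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print(f"test loss={loss:.3f} accuracy={acc:.3f}")
```

Because the labels are random, accuracy near 0.5 is expected; the script only demonstrates that the training loop executes on your hardware.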
Memory usage reported by the GPU driver:
Note: performance on a GPU depends on the specific hardware and drivers in use. If you encounter issues or errors when running the code on a GPU, you may need to update your GPU drivers or adjust the code for your specific hardware.