This post records the procedure for setting up a TensorFlow custom-op development environment.
Conda
First, we recommend using Conda to isolate the Python version and its dependencies.
Conda configuration file ~/.condarc
cat > ~/.condarc <<EOF
## setup proxy if needed
#proxy_servers:
# http: http://172.17.0.1:8123
# https: https://172.17.0.1:8123
report_errors: false
ssl_verify: false
## do not enter default conda environment
auto_activate_base: false
#offline: True
EOF
Conda init script inside a shell rc file such as ~/.bashrc
Run a command such as conda init zsh to set up Conda automatically, or place the following snippet in a common shell rc file to support multiple shells:
cat <<'EOF'
SHELL_EXE=$(readlink /proc/$$/exe)
SHELL_TYPE=$(echo 'bash:bash,zsh:zsh,dash:sh,zsh5:zsh' | awk -F: -v RS=, -v s=$(basename ${SHELL_EXE:-zsh}) '$1==s{print $2}')
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/conda/bin/conda' shell.${SHELL_TYPE} 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/opt/conda/etc/profile.d/conda.sh" ]; then
. "/opt/conda/etc/profile.d/conda.sh"
else
export PATH="/opt/conda/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
EOF
Get familiar with Conda commands
CONDA_ENV_NAME=tf22_py38
# create environment with specific package versions
CUDA_VERSION=$(nvcc --version | awk -F',' '/release/{split($(NF-1),v," "); print v[2]}')
conda create -p /opt/conda/envs/${CONDA_ENV_NAME} \
python=3.8 'tensorflow<2.4' \
cudatoolkit=${CUDA_VERSION:-10.1} cudnn
# enter Conda environment by name (or the full path)
conda activate ${CONDA_ENV_NAME}
# optionally save all dependencies for later reconstruction
conda env export > conda_env.yaml
# one day we may destroy the env and rebuild
conda deactivate
conda env remove -n ${CONDA_ENV_NAME}
# restore the environment from configuration file
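# (a YAML file written by `conda env export` is read by `conda env create`, not `conda create --file`)
conda env create -f conda_env.yaml -n ${CONDA_ENV_NAME}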
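The awk one-liner above pulls the release number out of the nvcc --version banner and falls back to 10.1 when nothing is found. The same logic can be sketched in Python, which may be easier to reuse from a build script. The helper names cuda_release and detect_cuda_release are illustrative, not part of the original setup, and the banner format is assumed to contain a phrase like "release 10.1".

```python
import re
import shutil
import subprocess
from typing import Optional


def cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the CUDA release (e.g. '10.1') found in `nvcc --version` output."""
    # Assumption: the banner contains "release <major>.<minor>".
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def detect_cuda_release(default: str = "10.1") -> str:
    """Run nvcc if it is on PATH; otherwise return `default`."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return default
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    # Fall back to the default when the banner cannot be parsed.
    return cuda_release(result.stdout) or default


if __name__ == "__main__":
    print(detect_cuda_release())
```

The printed value can be passed to conda in the same way as ${CUDA_VERSION:-10.1} in the commands above.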
Conda with TF2.3 under CUDA 10
TF 2.3 is the last version that supports CUDA 10; TF 2.4 and later require upgrading the NVIDIA driver. However, Conda does not provide TF 2.3 out of the box because of some issues.
Though it is not the recommended way, we use pip to install TF 2.3 and conda to install the required cudnn and cudatoolkit packages.
CONDA_ENV_PATH=/opt/conda/envs/tf23_py36
## Use exactly the following package versions, otherwise strange things may happen.
## A higher version of certifi introduces an issue with pip.
conda create -p ${CONDA_ENV_PATH} \
python=3.6 \
cudnn=7.6.5 cudatoolkit=10.1 \
numpy=1.18.5 scipy=1.4.1 requests=2.27.1 h5py=2.10.0 \
certifi=2021.5.30
# then install tensorflow 2.3 with pip
conda activate ${CONDA_ENV_PATH}
python -m pip install tensorflow-gpu==2.3
# maybe install some data science packages
conda install -y matplotlib scikit-learn scikit-image pandas
The environment configurations for TF 2.3 and for the additional data science packages were exported.
Tensorflow Custom-op
tfopgen is a scaffold project that generates a starter codebase for a custom op.
The TensorFlow package root path can be found with
TF_ROOT=$(python -c 'import pkgutil as p; from pathlib import Path; print(Path(p.get_loader("tensorflow").get_filename()).parent)')
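As a variation on the one-liner above, the package root can also be located with importlib, which avoids pkgutil.get_loader (deprecated in newer Python releases). This is only a sketch: the function name package_root is hypothetical, and it assumes a regular package whose spec has a file origin, not a namespace or built-in module.

```python
import importlib.util
from pathlib import Path


def package_root(name: str) -> Path:
    """Return the directory that holds the package's __init__.py."""
    # find_spec locates a top-level package without executing its code.
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate package {name!r}")
    return Path(spec.origin).parent


if __name__ == "__main__":
    # "tensorflow" is the package of interest here; any installed package works.
    try:
        print(package_root("tensorflow"))
    except ModuleNotFoundError as err:
        print(err)
```

Because the lookup does not import TensorFlow, it returns quickly and prints nothing but the path, which makes it convenient inside TF_ROOT=$(...).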
Watch out for a few inconsistencies:
- Header files are missing under third_party. Create a link such as
ln -s "$(dirname $(command -v nvcc))/../targets/x86_64-linux/include" ${TF_ROOT}/include/third_party/gpus/cuda/include
- The CUDA toolkit (e.g. nvcc) should live at a fixed path, otherwise many changes are needed. Create a link with
ln -s "$(dirname $(command -v nvcc))/.." /usr/local/cuda
- Make sure the nvcc option --gpu-architecture matches your GPU’s compute capability. Otherwise strange things can happen at runtime, such as the error "no kernel image is available for execution on the device", followed by a segmentation fault that crashes the process. One can place
GPU_ARCH := $(shell python3 -c "from tensorflow.python.client import device_lib; print(next(d.physical_device_desc for d in device_lib.list_local_devices() if d.device_type=='GPU'))" 2> /dev/null | awk -F: -v RS=, '/compute capability/{print "sm_"int($$2*10)}')
inside the Makefile (the doubled $$2 stops make from expanding $2 before awk sees it) and then use --gpu-architecture=$(GPU_ARCH) in NVCCFLAGS.
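The mapping from a device description to an nvcc architecture name, done above with awk, can be sketched in Python as well. The function name gpu_arch_from_desc and the sample description string are assumptions for illustration; the parsing relies only on the "compute capability: X.Y" fragment that the awk pattern also matches.

```python
import re
from typing import Optional


def gpu_arch_from_desc(desc: str) -> Optional[str]:
    """Map a device description to an nvcc architecture name such as 'sm_75'."""
    match = re.search(r"compute capability:\s*(\d+)\.(\d+)", desc)
    if match is None:
        return None
    # Join the digits as strings to avoid any floating-point rounding.
    major, minor = match.groups()
    return f"sm_{major}{minor}"


if __name__ == "__main__":
    # Assumed sample of a physical_device_desc string; real values vary by GPU.
    sample = "device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5"
    print(gpu_arch_from_desc(sample))
```

Returning None when no capability is found lets a build script stop early with a clear message before an empty --gpu-architecture value reaches nvcc.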