Merge pull request #4 from AI-ASMR/docker-impl
feat!: docker cuda improvements
StiliyanKushev authored Dec 12, 2023
2 parents bd2c392 + e1fa5be commit 1982b7d
Showing 4 changed files with 151 additions and 4 deletions.
3 changes: 2 additions & 1 deletion .dockerignore
@@ -1,2 +1,3 @@
node_modules
tensorboard
tensorboard
Dockerfile
21 changes: 20 additions & 1 deletion Dockerfile
@@ -1,5 +1,24 @@
# CUDA base image
FROM nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04 as cuda-base

# install curl
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && apt install -y curl && rm -rf /var/lib/apt/lists/*
RUN curl https://raw.githubusercontent.com/creationix/nvm/master/install.sh | bash

# node version from .npmrc and .nvmrc
FROM node:18.19.0
SHELL ["/bin/bash", "--login", "-i", "-c"]
RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.2/install.sh | bash
RUN source /root/.bashrc && nvm install v18.19.0

# Set PATH to include Node.js binaries installed via nvm
ENV PATH="/root/.nvm/versions/node/v18.19.0/bin:${PATH}"

# link CUDA-related libs to /lib64 so TensorFlow
# can find all of them.
RUN ln -s /usr/lib/x86_64-linux-gnu/libcud* /lib64
RUN ln -s /usr/local/cuda/lib64/* /lib64
RUN ln -s /usr/local/cuda-11.6/compat/* /lib64

WORKDIR /container

70 changes: 70 additions & 0 deletions PKGBUILD
@@ -0,0 +1,70 @@
pkgbase=libnvidia-container
pkgname=(libnvidia-container1
libnvidia-container-tools
nvidia-container-runtime
nvidia-container-toolkit
nvidia-container-toolkit-base
nvidia-docker2)

pkgver=1.14.3
pkgrel=1
_elfver=0.7.1
_nvmpver=495.44
_pkgname_tools=libnvidia-container-tools
pkgdesc='NVIDIA container runtime library'
arch=('x86_64')
url='https://github.com/NVIDIA/libnvidia-container'
license=('Apache')
depends=(libcap libseccomp libtirpc)

source=("https://nvidia.github.io/libnvidia-container/stable/deb/amd64/./${pkgname[0]}_${pkgver}-1_amd64.deb"
"https://nvidia.github.io/libnvidia-container/stable/deb/amd64/./${pkgname[1]}_${pkgver}-1_amd64.deb"
"https://nvidia.github.io/libnvidia-container/stable/deb/amd64/./${pkgname[2]}_3.14.0-1_all.deb"
"https://nvidia.github.io/libnvidia-container/stable/deb/amd64/./${pkgname[3]}_${pkgver}-1_amd64.deb"
"https://nvidia.github.io/libnvidia-container/stable/deb/amd64/./${pkgname[4]}_${pkgver}-1_amd64.deb"
"https://nvidia.github.io/libnvidia-container/stable/deb/amd64/./${pkgname[5]}_2.14.0-1_all.deb")
sha256sums=('45fbd94f30bed5bca491ef8e893291e9e946a7ea0fe667be55357a250af9fb25'
'736138e919ada12d7fffcffa226bb3577e4cb29df013ff45da01dc08d33b4764'
'fe425ba3a1008748b123ce0cc50835b4a29f4000df5a88d7b22479514a8fb795'
'a0faabb6633bffb4115f8912de1c21727c9b8cb35fd3c943f0ffd6dd2a021d0f'
'65187a56fe483e0146c6c613da4e5d113be1f5b18c2ea607b8c1a30edd971afc'
'f4d01406e7e38ce810c0b3ba44c56842abc1ee38affa4c6a8a56da7989f17b2e')

prepare() {
for package in "${pkgname[@]}"; do
mkdir -p "$srcdir/${package}/"
if [[ "${package}" -eq "nvidia-container-toolkit" ]]; then
ar -xv "${package}"_*.deb; ls -al; tar -xf "data.tar.xz" -C "$srcdir/${package}/"
else
tar -xf "${package}"_*.tar.zst -C "$srcdir/${package}/"
fi
done
}

install_pkg() {
cd "${1}"
find usr/bin -type f -exec install -Dm755 "{}" "$pkgdir/{}" \; || true
find usr/lib -type f -exec install -Dm755 "{}" "$pkgdir/{}" \; || true
find usr/share -type f -exec install -Dm755 "{}" "$pkgdir/{}" \; || true
find etc -type f -exec install -Dm755 "{}" "$pkgdir/{}" \; || true
}

package_libnvidia-container1() {
install_pkg "$srcdir/${pkgname[0]}"
}

package_libnvidia-container-tools() {
install_pkg "$srcdir/${pkgname[1]}"
}
package_nvidia-container-runtime(){
install_pkg "$srcdir/${pkgname[2]}"
}
package_nvidia-container-toolkit(){
install_pkg "$srcdir/${pkgname[3]}"
}
package_nvidia-container-toolkit-base(){
install_pkg "$srcdir/${pkgname[4]}"
}
package_nvidia-docker2(){
install_pkg "$srcdir/${pkgname[5]}"
}
61 changes: 59 additions & 2 deletions README.md
@@ -14,6 +14,7 @@
3. [Build from source.](#build-from-source)
4. [Repo's file structure.](#file-structure)
5. [Versioning and automation.](#versioning)
6. [Arch Linux NVIDIA Container Toolkit.](#arch-nvidia-container)

### What is this? <a id="introduction"></a>

@@ -40,8 +41,16 @@ For better performance you can also use the binaries via docker like so:
```shell
# pull the latest version
sudo docker pull stiliyankushev/aimr-asmr-gan:latest
# run the docker instance (pass arguments at the end)
sudo docker run -ti stiliyankushev/aimr-asmr-gan:latest --help
sudo docker run --gpus all -ti stiliyankushev/aimr-asmr-gan:latest --help
```

#### (Optional) Docker Prerequisites.
Running the above docker container automatically uses a version of TensorFlow backed by native C bindings, and it will try to take advantage of any CUDA-enabled GPUs on the system. The container already pre-configures CUDA and cuDNN to work with TensorFlow.js. What you need on your end is:
- An Nvidia GPU with CUDA support.
- A Linux distro.
- The Nvidia proprietary drivers installed.
- The NVIDIA Container Toolkit installed and configured (for Arch Linux, [follow my guide](#arch-nvidia-container)); a quick sanity check is sketched below.
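
As a rough sanity check (exact output will differ between setups), you can confirm that the driver and the container toolkit are wired up before pulling the image:
```shell
# the host driver is working if this lists your GPU
nvidia-smi
# docker is hooked up to the toolkit if "nvidia" shows up among the runtimes
sudo docker info | grep -i runtime
```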

### Build from source. <a id="build-from-source"></a>

You can build both the library and the binary from source using short predefined npm-scripts.
Expand All @@ -65,7 +74,7 @@ same as with the docker container:
```shell
npm start -- --help
```
#### Requirements for CUDA enabled model training
#### Requirements for CUDA enabled model training.
Running the above command will work but might not automatically pick up your GPU.
That's why it's advised to use the docker image which comes pre-configured. However, if you'd like to run this locally without docker, here's what you need:
- Nvidia GPU with Cuda support.
@@ -105,3 +114,51 @@ CI/CD implementation can be found here:
The repository hosts a minimal, scripted, cross-platform build tool used by all GitHub actions, as well as by users (via npm-scripts).

For more details, [read the documented source](https://github.com/AI-ASMR/asmr-gan-core/blob/main/scripts.js).

### Arch Linux NVIDIA Container Toolkit. <a id="arch-nvidia-container"></a>

This is a short guide on how to install the NVIDIA Container Toolkit on Arch Linux. For other Linux distros, take a look at NVIDIA's official [guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).

I've created a custom PKGBUILD that you need to build and install. The steps below walk you through it.

Make a fresh directory:
```shell
mkdir ./temp-nvidia
cd ./temp-nvidia
```
Download the PKGBUILD file:
```shell
wget https://raw.githubusercontent.com/AI-ASMR/asmr-gan-core/main/PKGBUILD
```
Build the package:
```shell
makepkg
```
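If makepkg complains about missing build dependencies (the PKGBUILD lists `libcap`, `libseccomp` and `libtirpc`), you can let it resolve them for you:
```shell
makepkg -s
```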
Install all of the resulting .pkg.tar.zst packages:
```shell
sudo pacman -U \
./libnvidia-container1-1.14.3-1-x86_64.pkg.tar.zst \
./libnvidia-container-tools-1.14.3-1-x86_64.pkg.tar.zst \
./nvidia-container-runtime-1.14.3-1-x86_64.pkg.tar.zst \
./nvidia-container-toolkit-1.14.3-1-x86_64.pkg.tar.zst \
./nvidia-container-toolkit-base-1.14.3-1-x86_64.pkg.tar.zst \
./nvidia-docker2-1.14.3-1-x86_64.pkg.tar.zst
```
Install `libnvidia-container-tools` manually:
```shell
sudo pacman -Syu libnvidia-container-tools
```
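To double-check that everything landed, query pacman for the packages listed above:
```shell
pacman -Qs nvidia-container
```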
Configure docker:
```shell
sudo nvidia-ctk runtime configure --runtime=docker
```
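For reference, `nvidia-ctk` edits `/etc/docker/daemon.json` to register the `nvidia` runtime; the result should look roughly like the snippet below (exact fields may differ between toolkit versions):
```shell
cat /etc/docker/daemon.json
# {
#     "runtimes": {
#         "nvidia": {
#             "args": [],
#             "path": "nvidia-container-runtime"
#         }
#     }
# }
```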
Restart docker afterwards:
```shell
sudo systemctl restart docker
```
At this point docker should be configured. Test like so:
```shell
sudo docker run --gpus all ubuntu nvidia-smi
```
If `nvidia-smi` works, then everything is working as expected.
