This article is the Day 3 entry of the Kindai University Advent Calendar 2019 on Qiita.

Introduction
Kubernetes, Nvidia-Docker2, and NVIDIA-device-plugin-for-Kubernetes
Building the Kubernetes environment
Building a deep learning environment on the Kubernetes Cluster
- Installing NVIDIA-device-plugin-for-Kubernetes
Creating the deep learning Deployment
Visualizing compute resources
- metrics server
- Kubernetes Dashboard 2.0
Conclusion

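Introduction

To make it easy to manage and operate the several machine learning servers in our lab for deep learning, I built an environment using Kubernetes, Nvidia-Docker2, and NVIDIA-device-plugin-for-Kubernetes.

There are many blog posts and Qiita articles on building Kubernetes, but I found few in Japanese that cover building a Kubernetes environment with GPUs, so I am leaving this as a note for myself. I referred to the following blog.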

blog.sky-net.pw

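The end goal is the configuration shown in the figure below. This article builds the Kubernetes Cluster enclosed by the yellow wavy line on the right. I plan to cover the integration with VS Code on the left in a separate article.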

Kubernetes, Nvidia-Docker2, and NVIDIA-device-plugin-for-Kubernetes

Kubernetes

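Roughly speaking, Kubernetes is a container orchestration engine: a tool for managing and operating multiple containers across multiple hosts. For details, read my earlier article or pick up the background from a book. This article assumes some familiarity with Docker and Kubernetes, so it does not explain how Kubernetes works or what its features are.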

tenzen.hatenablog.com

Nvidia-Docker2

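Nvidia-Docker2 is open-source software published by NVIDIA for use together with Docker.

With Docker alone, you can use the GPU inside a container if you install the same versions of the Nvidia Driver and CUDA inside the container as on the host OS.

However, that makes the container completely dependent on the host, so for GPUs the convenience of containers is lost.

With Nvidia-Docker2, the host's Nvidia Driver is made available to the container at run time, so the image does not have to mirror the host's setup and you can install the CUDA version you want inside the container, as long as the host driver is new enough to support it.

As a side note, at the time of writing the newest Docker version supported by Kubernetes is 18.09. Docker 19.03 (not supported by Kubernetes yet) added native support for Nvidia GPUs, so with Docker alone you can create GPU containers in the same way as with Nvidia-Docker2.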

NVIDIA-device-plugin-for-Kubernetes

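NVIDIA-device-plugin-for-Kubernetes is a Kubernetes plugin provided by NVIDIA. As the official GitHub repository says, it offers the following features.

Exposing the number of GPUs on each node of the cluster
Tracking the health of the GPUs
Running GPU-enabled containers in the Kubernetes cluster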

github.com

This article uses Beta4 (1.0.0-beta4), the latest version published at the time of writing.

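Building the Kubernetes environment

The machines used for this build are listed below. The steps refer to them by the hostnames given here, so refer back as needed. The Worker steps are mostly shown on Utaha. If you have multiple Workers, repeating the same steps as on Utaha on every Worker should work.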

・Master

Hostname	CPU	Memory	GPU	OS
Master	Intel Core i7 3770	8GB (non-ECC)	None	Ubuntu 16.04 LTS

・Worker

Hostname	CPU	Memory	GPU	OS
Eriri	Intel Core i9 9900K	128GB (non-ECC)	Nvidia GeForce RTX 2080 Ti x2	Ubuntu 16.04 LTS
Megumi	Intel Core i9 9900K	128GB (non-ECC)	Nvidia GeForce RTX 2080 Ti x2	Ubuntu 16.04 LTS
Utaha	Xeon Bronze 3104 (2 sockets)	192GB (ECC Registered)	Nvidia Quadro GP100 x4	Ubuntu 16.04 LTS

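Step01 [all nodes]: Preparation

This chapter covers the preparation needed before installing the software. Unless stated otherwise, apply the same settings on every Worker and on the Master.

Disabling swap

Kubernetes does not support swap, so swap has to be turned off.

First, run the following command to check whether a swap entry is configured.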

$ cat /etc/fstab

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda2 during installation
UUID=hoge hoge hoge /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/sda1 during installation
UUID=hoge hoge  /boot/efi       vfat    umask=0077      0       1
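# swap was on /dev/sda3 during installation
#UUID=hoge hoge hoge none            swap    sw              0       0

In the example above, the swap entry (the line of type swap that follows "swap was on ~") is already commented out. If it is not commented out on your machine, swap is enabled at boot, so comment it out. Then run the following command to check whether swap is currently off.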

$ swapon

If anything is printed, run the following command to turn swap off.

$ sudo swapoff -a
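
The fstab check above can also be scripted. The following is a minimal sketch that assumes the usual six-column fstab layout. The function name active_swap_entries is made up for this example and is not part of any tool.

```shell
# Print every fstab line that still enables swap at boot.
# A line counts as active when it is not a comment and its third
# column (the file system type) is "swap".
active_swap_entries() {
    # $1: path to an fstab-style file (defaults to /etc/fstab)
    awk '$1 !~ /^#/ && $3 == "swap"' "${1:-/etc/fstab}"
}

# Example run against a throwaway file rather than the real /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
UUID=aaaa /    ext4 errors=remount-ro 0 1
UUID=bbbb none swap sw                0 0
#UUID=cccc none swap sw               0 0
EOF
active_swap_entries "$sample"   # prints only the UUID=bbbb line
rm -f "$sample"
```

Calling active_swap_entries with no argument reads /etc/fstab. No output means no swap entry will be activated at the next boot.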

Opening ports

Next, open the ports that Kubernetes uses.

The required ports are listed in the official Kubernetes installation guide.

kubernetes.io kubernetes.io

This article uses ufw for port control, so enable ufw and check its status.

$ sudo ufw enable
$ sudo ufw status
Status: active

Next, open the ports below, following the official Kubernetes installation guide.

Because this article uses flannel for the cluster-internal network, some additional ports are opened as well. kubernetes.io

Node	Ports
Master	53, 6443, 2379-2380, 8285, 8472, 9153, 10250-10252
Worker	53, 8285, 8472, 9153, 10250, 30000-32767

Master

$ sudo ufw allow 53
$ sudo ufw allow 6443
$ sudo ufw allow 2379:2380/tcp
$ sudo ufw allow 8285
$ sudo ufw allow 8472
$ sudo ufw allow 9153
$ sudo ufw allow 10250:10252/tcp

Worker

$ sudo ufw allow 53
$ sudo ufw allow 8285
$ sudo ufw allow 8472
$ sudo ufw allow 9153
$ sudo ufw allow 10250
$ sudo ufw allow 30000:32767/tcp
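
With several Workers, typing these rules by hand gets repetitive. The sketch below generates the same commands from the port table above. It only prints them (a dry run), and the function name ufw_rules_for is made up for this example.

```shell
# Print the ufw commands for one node role, using the port lists from
# the table above. Nothing is executed: review the output first, then
# pipe it to "sh" once it looks right.
ufw_rules_for() {
    case "$1" in
        master) ports="53 6443 2379:2380/tcp 8285 8472 9153 10250:10252/tcp" ;;
        worker) ports="53 8285 8472 9153 10250 30000:32767/tcp" ;;
        *) echo "usage: ufw_rules_for master|worker" >&2; return 1 ;;
    esac
    for p in $ports; do
        echo "sudo ufw allow $p"
    done
}

ufw_rules_for worker
```

Printing instead of executing keeps the sketch safe to try on any machine. To apply the rules, run something like ufw_rules_for worker | sh on the node in question.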

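Changing the name server

Finally, dnsmasq can misbehave and break Kubernetes' CoreDNS, as described in this GitHub issue, github.com

or make it impossible to reach the network from inside containers, so change the name server.

First, run the following command to check whether the setting needs to change.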

$ cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1

If nameserver is "127.0.1.1" as above, the setting needs to change. Otherwise you should be able to skip the steps below.

Display NetworkManager.conf with the commands below and check whether the "dns=dnsmasq" line is commented out. If it is not, comment it out.

Then restart NetworkManager and check resolv.conf again to confirm that the nameserver has changed. If the 127.0.1.1 entry is still there, comment it out or delete it.

$ cat /etc/NetworkManager/NetworkManager.conf 
[main]
plugins=ifupdown,keyfile,ofono
dns=dnsmasq

[ifupdown]
managed=false
$ sudo systemctl restart network-manager
$ cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver xxx.xxx.xxx.xxx
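
The resolv.conf check in this step can be scripted so that it is easy to repeat on every node. This is a minimal sketch. It treats any 127.x.x.x name server as the problematic case, which is an assumption that goes slightly beyond the 127.0.1.1 example above. The function name uses_loopback_nameserver is made up for this example.

```shell
# Succeed (exit status 0) when the given resolv.conf lists a loopback
# name server such as 127.0.1.1, which is the case that needs fixing.
# Commented-out entries are ignored.
uses_loopback_nameserver() {
    # $1: path to a resolv.conf-style file (defaults to /etc/resolv.conf)
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.' "${1:-/etc/resolv.conf}"
}

# Example run against a throwaway file.
sample=$(mktemp)
echo 'nameserver 127.0.1.1' > "$sample"
if uses_loopback_nameserver "$sample"; then
    echo "loopback name server found: change the dnsmasq setting"
else
    echo "no loopback name server: nothing to change"
fi
rm -f "$sample"
```

Called without an argument, the function checks /etc/resolv.conf on the current host.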

That completes the preparation.

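Step02 [all nodes]: Installing software

This chapter installs the software. Unless stated otherwise, apply the same settings on every Worker and on the Master. For the reasons given in the article below, this article uses "kubectl apply" rather than "kubectl create", even when creating Pods.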

tenzen.hatenablog.com

Nvidia Driver

I installed nvidia-418, which is relatively stable among the recent Nvidia Drivers.

Install the Nvidia Driver only on the Worker Nodes. (If the Master Node also has a GPU and doubles as a Worker Node, install it there too.)

Run the following commands in order.

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
$ sudo apt-get upgrade
$ apt-cache search 'nvidia-[0-9]+'
$ sudo apt-get install nvidia-418
$ sudo reboot

Docker

Next, install Docker by running the following commands in order. Along the way, confirm that version "18.06.3~ce~3-0~ubuntu" is listed in the output of "sudo apt-cache policy docker-ce". The latest Docker release is usually not yet supported by Kubernetes, and installing Docker without pinning a version gives you the latest release, so choose one of the supported versions. I chose 18.06.3.

$ sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo apt-key fingerprint 0EBFCD88
$ sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
$ sudo apt-get update
$ sudo apt-cache policy docker-ce
$ sudo apt-get install -y docker-ce=18.06.3~ce~3-0~ubuntu

Now confirm that the specified Docker version was installed, and configure Docker to start when the machine boots.

$ docker -v
Docker version 18.06.3-ce, build d7080c1
$ sudo systemctl enable docker && sudo systemctl start docker && sudo systemctl status docker

Nvidia-Docker2

Next, install Nvidia-Docker2.

Install Nvidia-Docker2 only on the Workers. (If the Master also has a GPU and doubles as a Worker, install it there too.)

There is a similarly named package, Nvidia-Docker, but it is a different thing, so be careful not to install it by mistake.

Run the following commands in order.

$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo pkill -SIGHUP dockerd

That completes the installation of Nvidia-Docker2.

Next, check the nvidia-docker2 version and switch Docker's runtime to the one provided by nvidia-docker2.

First, run the following command to confirm that nvidia-docker2, not nvidia-docker, is installed.

$ sudo nvidia-docker version
NVIDIA Docker: 2.2.2
Client:
 Version:           18.06.3-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        d7080c1
 Built:             Wed Feb 20 02:27:18 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.3-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       d7080c1
  Built:            Wed Feb 20 02:26:20 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Next, switch Docker's runtime to the nvidia one.

Run the commands below to install the nvidia runtime and then set it as the default runtime in /etc/docker/daemon.json.

$ sudo apt-get install nvidia-container-runtime
$ sudo tee /etc/docker/daemon.json <<EOF
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
$ sudo pkill -SIGHUP dockerd
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
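
Before running a GPU container, it can help to confirm that daemon.json really names nvidia as the default runtime. The sketch below is a plain text match, not a JSON parser, and it assumes the key and its value sit on one line as in the file written above. The function name default_runtime_of is made up for this example.

```shell
# Print the value of "default-runtime" from a Docker daemon.json.
default_runtime_of() {
    # $1: path to daemon.json (defaults to /etc/docker/daemon.json)
    sed -n 's/.*"default-runtime"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
        "${1:-/etc/docker/daemon.json}"
}

# Example run against a throwaway copy of the file shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
default_runtime_of "$sample"   # prints: nvidia
rm -f "$sample"
```

This only inspects the file. Whether the running daemon has picked the setting up is a separate question, which the "Default Runtime" line in the output of sudo docker info should answer.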

Finally, confirm that the nvidia runtime is loaded correctly. If the GPU information is printed after running the command below, it is.

$ sudo docker run --rm nvidia/cuda nvidia-smi
Sun Nov 28 01:08:22 2019           
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GP100        Off  | 00000000:3B:00.0  On |                  Off |
| 28%   44C    P0    32W / 235W |    324MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro GP100        Off  | 00000000:5E:00.0 Off |                  Off |
| 26%   41C    P0    31W / 235W |      2MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Quadro GP100        Off  | 00000000:86:00.0 Off |                  Off |
| 26%   40C    P0    25W / 235W |      2MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Quadro GP100        Off  | 00000000:D8:00.0 Off |                  Off |
| 26%   39C    P0    25W / 235W |      2MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Kubernetes

Finally, install Kubernetes. Installing it without any tooling is very difficult, so this article uses kubeadm, which the official Kubernetes documentation also recommends.

kubernetes.io

Run the following commands to register the Google repository and then install the packages.

$ sudo apt-get update && sudo apt-get install -y apt-transport-https
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
# Register the apt source list for Kubernetes
$ sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
deb http://apt.kubernetes.io/ kubernetes-xenial main
$ sudo apt-get update
$ sudo apt-get install -y kubectl kubelet kubeadm
$ sudo apt-mark hold kubectl kubeadm kubelet docker-ce docker-ce-cli containerd.io nvidia-docker docker

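Step03: Setting up the Master

Swap was turned off in Step01, but turn it off once more to be safe, then set up the Master with kubeadm. The "--pod-network-cidr" option of "sudo kubeadm init" is the setting required when Flannel is used for the cluster-internal network. If you use another network add-on, configure it following the official Kubernetes page below.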

kubernetes.io

Setup with kubeadm

$ sudo swapoff -a
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.16.3
[preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [Master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.21.159]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [Master localhost] and IPs [xxx.xxx.xxx.xxx 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [Master localhost] and IPs [xxx.xxx.xxx.xxx 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.506152 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node Master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node Master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: hogehoge.hogehogehoge
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join xxx.xxx.xxx.xxx:6443 --token hogehoge.hogehogehoge \
    --discovery-token-ca-cert-hash sha256:hogehogehoge

If "Your Kubernetes control-plane has initialized successfully!" appears near the bottom, the setup succeeded. Save the last command, "kubeadm join xxx.xxx.xxx.xxx:6443 --token hogehoge.hogehogehoge --discovery-token-ca-cert-hash sha256:hogehogehoge", somewhere, because it contains the token used to add Workers to the Kubernetes Cluster managed by this Master. The token is valid for 24 hours. If it has expired, use the commands below to check for existing tokens and then issue a new one.

$ sudo kubeadm token list
TOKEN                     TTL       EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
hogehogehoge   23h       2019-11-29T20:53:35+09:00   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
$ sudo kubeadm token create --print-join-command
kubeadm join xxx.xxx.xxx.xxx:6443 --token hogehogehoge     --discovery-token-ca-cert-hash sha256:hogehogehoge

If you made a configuration mistake and need to run "sudo kubeadm init" again, first run "sudo kubeadm reset" to reset the kubeadm state, then run "sudo kubeadm init" again.

Next, run the following commands to prepare the credentials file that kubectl uses.

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

Configuring the network inside the Kubernetes Cluster

This article uses Flannel to build the network inside the Kubernetes Cluster. I will cover the details of that network in a separate article.

Run the following commands to deploy Flannel.

$ cd
# Create a working directory.
$ mkdir ./k8s
$ cd ./k8s
$ wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
--2019-11-28 23:59:89--  https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.228.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.228.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14416 (14K) [text/plain]
Saving to: `kube-flannel.yml'

kube-flannel.yml                        100%[===============================================================================>]  14.08K  --.-KB/s    in 0.007s

2019-11-28 23:59:99 (1.91 MB/s) - `kube-flannel.yml' saved [14416/14416]
$ kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
$ sudo systemctl restart kubelet

Just in case, check that everything is set up correctly.

$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                            READY   STATUS    RESTARTS   AGE     IP               NODE    NOMINATED NODE   READINESS GATES
kube-system   coredns-5644d7b6d9-48spc        1/1     Running   0          46m     10.244.0.2       Master   <none>           <none>
kube-system   coredns-5644d7b6d9-q924l        1/1     Running   0          46m     10.244.0.4       Master   <none>           <none>
kube-system   etcd-k8s-m                      1/1     Running   0          46m     xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-apiserver-k8s-m            1/1     Running   0          45m     xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-controller-manager-k8s-m   1/1     Running   0          45m     xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-flannel-ds-amd64-g99fj     1/1     Running   0          5m59s   xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-proxy-nc5wf                1/1     Running   0          46m     xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-scheduler-k8s-m            1/1     Running   0          46m     xxx.xxx.xxx.xxx   Master   <none>           <none>

Confirm that the STATUS of every Pod is Running. In particular, if coredns does not start properly, the Step01 settings are probably not correct, so check them carefully.
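
As the Pod list grows, scanning the STATUS column by eye becomes error-prone. The sketch below filters the table for Pods that are not Running. It assumes the column order printed by "kubectl get pods --all-namespaces" (STATUS is the fourth column), and the function name pods_not_running is made up for this example.

```shell
# Read the table printed by "kubectl get pods --all-namespaces" on
# stdin and print "namespace/name status" for every Pod whose STATUS
# column (the fourth one) is not Running.
pods_not_running() {
    awk 'NR > 1 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Example with canned output instead of a live cluster.
pods_not_running <<'EOF'
NAMESPACE     NAME                       READY   STATUS             RESTARTS   AGE
kube-system   coredns-5644d7b6d9-48spc   0/1     CrashLoopBackOff   7          46m
kube-system   kube-proxy-nc5wf           1/1     Running            0          46m
EOF
```

On the Master this would be used as kubectl get pods --all-namespaces | pods_not_running. Empty output means every Pod is Running. Note that Pods of finished Jobs (STATUS Completed) are listed too, so a non-empty result is a prompt to look, not necessarily an error.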

Step04: Setting up the Workers

On the Worker as well, turn swap off once more to be safe. Then run the join command with the token that was issued on the Master and saved earlier.

$ sudo swapoff -a
$ sudo kubeadm join xxx.xxx.xxx.xxx:6443 --token hoge.hogehoge --discovery-token-ca-cert-hash sha256:hogehogehoge
[preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

If "Run 'kubectl get nodes' on the control-plane to see this node join the cluster." is printed at the end, the node was added to the Kubernetes Cluster created earlier.

Step05: Final check

Finally, run the following command on the Master to confirm that the Workers were added to the Kubernetes Cluster.

$ kubectl get nodes -o wide
NAME    STATUS   ROLES    AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
Eriri   Ready    <none>   63m   v1.16.3   xxx.xxx.xxx.xxx   <none>        Ubuntu 16.04.6 LTS   4.4.0-166-generic   docker://18.6.3
Megumi   Ready    <none>   42m   v1.16.3   xxx.xxx.xxx.xxx   <none>        Ubuntu 16.04.6 LTS   4.4.0-166-generic   docker://18.6.3
Utaha   Ready    <none>   21m   v1.16.3   xxx.xxx.xxx.xxx   <none>        Ubuntu 16.04.6 LTS   4.4.0-166-generic   docker://18.6.3
Master   Ready    master   95m   v1.16.3   xxx.xxx.xxx.xxx   <none>        Ubuntu 16.04.6 LTS   4.4.0-169-generic   docker://18.6.3
$

If all of the Master and Worker nodes you added are listed like this, the Kubernetes setup is complete. The next chapter finally starts building the machine learning environment.

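Building a deep learning environment on the Kubernetes Cluster

This chapter builds the machine learning environment on the Kubernetes Cluster set up above. Each step states where it should be run, so run it there.

Installing NVIDIA-device-plugin-for-Kubernetes

Deploy the NVIDIA-device-plugin-for-Kubernetes DaemonSet. Do the following on the Master.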

$ cd k8s
$ wget https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
--2019-11-28 23:11:44--  https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.228.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.228.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2320 (2.3K) [text/plain]
Saving to: `nvidia-device-plugin.yml'

nvidia-device-plugin.yml            100%[===================================================================>]   2.27K  --.-KB/s    in 0s

2019-11-28 23:11:44 (33.3 MB/s) - `nvidia-device-plugin.yml' saved [2320/2320]
$ kubectl apply -f nvidia-device-plugin.yml
daemonset.apps/nvidia-device-plugin-daemonset created

Next, check that the NVIDIA-device-plugin-for-Kubernetes DaemonSet was deployed correctly. If the STATUS of "nvidia-device-plugin-daemonset-xxxxx" is Running, it was. Also confirm that the NODE it runs on is a Worker.

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE     IP               NODE    NOMINATED NODE   READINESS GATES
kube-system   coredns-5644d7b6d9-48spc               1/1     Running   0          148m    10.244.0.2       Master   <none>           <none>
kube-system   coredns-5644d7b6d9-q924l               1/1     Running   0          148m    10.244.0.4       Master   <none>           <none>
kube-system   etcd-k8s-m                             1/1     Running   0          147m    xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-apiserver-k8s-m                   1/1     Running   0          147m    xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-controller-manager-k8s-m          1/1     Running   0          147m    xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-flannel-ds-amd64-g99fj            1/1     Running   0          107m    xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-flannel-ds-amd64-r92b4            1/1     Running   2          85m     xxx.xxx.xxx.xxx   Utaha   <none>           <none>
kube-system   kube-proxy-7fvl8                       1/1     Running   0          85m     xxx.xxx.xxx.xxx   Utaha   <none>           <none>
kube-system   kube-proxy-nc5wf                       1/1     Running   0          148m    xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   kube-scheduler-k8s-m                   1/1     Running   0          147m    xxx.xxx.xxx.xxx   Master   <none>           <none>
kube-system   nvidia-device-plugin-daemonset-2bwxt   1/1     Running   0          6m37s   10.244.1.2       Utaha   <none>           <none>
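
A Running plugin Pod does not by itself show that the GPUs were registered. One way to check is to look at the nvidia.com/gpu resource that each node advertises. The sketch below parses a two-column NAME/GPU table. The kubectl invocation in the comment is how I would expect to produce that table, but treat it as an assumption and adjust it to your cluster. The function name nodes_without_gpu is made up for this example.

```shell
# Read a two-column NAME/GPU table on stdin and print the nodes that
# advertise no GPU. kubectl prints "<none>" when the field is missing.
nodes_without_gpu() {
    awk 'NR > 1 && ($2 == "<none>" || $2 == "0") { print $1 }'
}

# Intended use on the Master (requires a working kubectl):
#   kubectl get nodes \
#     -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
#     | nodes_without_gpu

# Example with canned output.
nodes_without_gpu <<'EOF'
NAME     GPU
Master   <none>
Utaha    4
EOF
```

In this setup the Master is expected to appear in the output because it has no GPU. A Worker appearing there suggests that the driver, the nvidia runtime, or the plugin is not working on that node.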

The following steps are not necessary (corrected 2020/02/22). ~~Next, pull the image from Docker Hub. Run the following on a Worker.~~

$ docker pull nvidia/k8s-device-plugin:1.0.0-beta4
$ sudo docker run --security-opt=no-new-privileges --cap-drop=ALL --network=none -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.0.0-beta4
2019/12/1 05:18:26 Loading NVML
2019/12/1 05:18:26 Fetching devices.
2019/12/1 05:18:26 Starting FS watcher.
2019/12/1 05:18:26 Starting OS watcher.
2019/12/1 05:18:26 Starting GRPC server
2019/12/1 05:18:26 Starting to serve on /var/lib/kubelet/device-plugins/nvidia.sock
2019/12/1 05:18:27 Registered device plugin with Kubelet

Now that the setup is complete, deploy a test Pod to check it. Since this is a deployment, do the following on the Master. Either write your own manifest for the Pod or create a "sample.yaml" file with the following content.

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:10.0-devel
      tty: true
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPUs

Once the manifest is ready, run the following commands to deploy gpu-pod and check that the STATUS of the deployed Pod is Running.

$ kubectl apply -f sample.yaml
pod/gpu-pod created
$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP               NODE    NOMINATED NODE   READINESS GATES
default       gpu-pod                                1/1     Running   0          10s   10.244.1.5       Utaha   <none>           <none>
(omitted)

That completes a basic deep learning setup on the Kubernetes Cluster.
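
To try the same test with a different number of GPUs, the manifest can be generated instead of edited by hand. This is a minimal sketch. The function name gpu_pod_manifest and the Pod name gpu-pod-2 are made up for this example, and the image is the one used in sample.yaml.

```shell
# Print a test Pod manifest that requests the given number of GPUs.
gpu_pod_manifest() {
    # $1: Pod name, $2: number of GPUs to request
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:10.0-devel
      tty: true
      resources:
        limits:
          nvidia.com/gpu: $2
EOF
}

gpu_pod_manifest gpu-pod-2 2
```

The output can be piped straight into kubectl, for example gpu_pod_manifest gpu-pod-2 2 | kubectl apply -f -. Once the Pod is Running, kubectl exec gpu-pod-2 -- nvidia-smi should list only the GPUs assigned to it.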

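Creating the deep learning Deployment

Next, create a Deployment on the Kubernetes Cluster from an image built with a Dockerfile. The Docker image provides the basic packages, and the Deployment built from that image is easy to scale across hosts. The earlier "sample.yaml" deployed a bare Pod, but since this is the production environment, we deploy a Deployment, which Kubernetes recommends. I will cover the differences between Deployments and Pods in a separate article.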

hub.docker.com

Preparing the Docker image

The Dockerfile bundles the following packages, which are likely to be used for deep learning.

curl
wget
git
unzip
imagemagick
bzip2
vim
pyenv
anaconda3-4.4.0
libsm6
cuda 10.0
cudnn 7.6.5
opencv-python 3.4.7.28
tensorflow-gpu 1.13.1
keras
torch
torchvision
libgl1-mesa-dev
tqdm
torchsummary
progressbar

To build an image containing the packages above, create the following Dockerfile on the Master.

FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04

#preparation
RUN apt-get update
RUN apt-get install -y curl wget git unzip imagemagick bzip2 vim
RUN git clone https://github.com/pyenv/pyenv.git .pyenv

WORKDIR /
ENV HOME  /
ENV PYENV_ROOT /.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH

RUN pyenv install anaconda3-4.4.0
RUN pyenv global anaconda3-4.4.0
RUN pyenv rehash

RUN pip install --upgrade pip
RUN apt-get update && apt-get install -y libsm6
RUN pip install opencv-python==3.4.7.28
RUN pip install tensorflow-gpu==1.13.1 --ignore-installed --user
RUN pip install keras  
RUN pip install torch torchvision && apt-get install -y libgl1-mesa-dev
RUN pip install tqdm
RUN pip install torchsummary
RUN pip install progressbar

Build the Dockerfile. This takes a while, so wait for it to finish.

$ sudo docker build -t dl_env .
Sending build context to Docker daemon  45.06kB
Step 1/20 : FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
10.0-cudnn7-devel-ubuntu16.04: Pulling from nvidia/cuda
976a760c94fc: Pull complete 
c58992f3c37b: Pull complete 
0ca0e5e7f12e: Pull complete 
f2a274cc00ca: Pull complete 
708a53113e13: Pull complete 
465b2edc87fb: Pull complete 
4189f57a58ef: Pull complete 
35de2d1091bb: Pull complete 
719d77537fdc: Pull complete 
3745e7bcc1b3: Pull complete 
Digest: sha256:6cd48444de35a2aa8fa8652da86769205f6e895167d304537403e169bcee1fd8
Status: Downloaded newer image for nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
 ---> 7bb1f0b039e1
Step 2/20 : RUN apt-get update
 ---> Running in 35fe3e24dc9d
(omitted)
Successfully built b0eeafa8df6d
Successfully tagged dl_env:latest

Next, upload the image you built to a registry. Either use a service such as Docker Hub or set up a private registry. This article uses a private registry. I covered how to build one in an earlier article, so refer to that. Here the registry container runs on the Master.

tenzen.hatenablog.com

If you want to upload to Docker Hub, create a Docker Hub account beforehand, then log in by running the following command on the Master.

$ sudo docker login

Note that with Docker Hub the upload procedure and the tag naming differ slightly from what is shown below.

Tag the image you built and then upload it to the registry. Here the "dl_env" image built above is uploaded.

$ docker images
REPOSITORY                              TAG                             IMAGE ID            CREATED             SIZE
dl_env                                  latest                          b0eeafa8df6d        15 hours ago        9.3GB
nvidia/cuda                             10.0-cudnn7-devel-ubuntu16.04   7bb1f0b039e1        2 days ago          3.04GB
k8s.gcr.io/kube-proxy                   v1.16.3                         9b65a0f78b09        2 weeks ago         86.1MB
k8s.gcr.io/kube-apiserver               v1.16.3                         df60c7526a3d        2 weeks ago         217MB
k8s.gcr.io/kube-controller-manager      v1.16.3                         bb16442bcd94        2 weeks ago         163MB
k8s.gcr.io/kube-scheduler               v1.16.3                         98fecf43a54f        2 weeks ago         87.3MB
k8s.gcr.io/etcd                         3.3.15-0                        b2756210eeab        2 months ago        247MB
k8s.gcr.io/coredns                      1.6.2                           bf261d157914        3 months ago        44.1MB
quay.io/coreos/flannel                  v0.11.0-amd64                   ff281650a721        10 months ago       52.6MB
k8s.gcr.io/pause                        3.1                             da86e6ba6ca1        23 months ago       742kB
konradkleine/docker-registry-frontend   v2                              60d4b91e68fa        2 years ago         266MB
registry                                2.3.0                           5eaced67751b        3 years ago         166MB
$ sudo docker tag dl_env:latest xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge
$ sudo docker push xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge
The push refers to repository [xxx.xxx.xxx.xxx:5000/k8s/dl_env]
e428433c3f7c: Pushed 
334a36d32e08: Pushed 
02329ea76a91: Pushed 
e3078c762ee2: Pushed 
aa04522b497b: Pushed 
11966f162c9e: Pushed 
78b5aab0476b: Pushed 
f8f213d08b00: Pushed 
f8dda259a125: Pushed 
5159c055acb8: Pushed 
b8d56ef47379: Pushed 
4b894dd79dd5: Pushed 
1f8956b49964: Pushed 
9154b69eed62: Pushed 
33d281c6c667: Pushed 
ca1122f45652: Pushed 
585032098094: Pushed 
cb0856d33862: Pushed 
03a1cbfdd831: Pushed 
e579c4e796e4: Pushed 
b7ee80f86be3: Pushed 
aa7f8c8d5f39: Pushed 
48817fbd6c92: Pushed 
1b039d138968: Pushed 
7082d7d696f8: Pushed 
hoge: digest: sha256:hogehogehogehogehoge size: 5579
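
The name given to docker tag and docker push above has the form "<registry host:port>/<repository>:<tag>". As a small sketch (the helper names registry_image_ref and tag_and_push_commands are my own and not part of Docker), the reference and the two commands can be assembled like this, which makes it easier to keep the local image name and the registry name consistent:

```python
def registry_image_ref(registry: str, repository: str, tag: str = "latest") -> str:
    """Build '<registry host:port>/<repository>:<tag>' for a private registry."""
    if not registry or not repository:
        raise ValueError("registry and repository must be non-empty")
    return f"{registry.rstrip('/')}/{repository.strip('/')}:{tag}"


def tag_and_push_commands(local_image: str, remote_ref: str) -> list:
    """Return the argv lists for 'docker tag' followed by 'docker push'."""
    return [
        ["docker", "tag", local_image, remote_ref],
        ["docker", "push", remote_ref],
    ]


if __name__ == "__main__":
    # Values taken from the transcript above; the registry address is masked there too.
    ref = registry_image_ref("xxx.xxx.xxx.xxx:5000", "k8s/dl_env", "hoge")
    for argv in tag_and_push_commands("dl_env:latest", ref):
        print(" ".join(argv))
```

The local image name must match what docker images lists (dl_env here), otherwise docker tag fails because it cannot find the image.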

Preparing the manifest

As mentioned earlier, this article builds the environment with a Deployment. Use the gpu-deployment.yaml manifest that I prepared below, or write your own. Run everything from here on on the Master.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-dlenv
  template:
    metadata:
      labels:
        app: gpu-dlenv
    spec:
      containers:
        - name: gpu-dlenv-container
          image: xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge

# Add the line below when using an image that exists only on the Node being deployed to.
#          imagePullPolicy: Never

          tty: true
          volumeMounts:
          - mountPath: /srv
            name: host-share
          resources:
            limits:
              nvidia.com/gpu: 2 # Limit the number of GPUs.
      volumes:
      - name: host-share
        hostPath:
          path: /host_hoge
          type: DirectoryOrCreate
# To pin the Deployment to a specific node, add the following.
#      nodeSelector:
#          type: <label value>

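The Deployment created here sets the replica count to 1. Because the Deployment keeps one Pod running at all times, auto-healing applies: if a host fails for some reason, the same Pod is automatically recreated on another host (provided that host can satisfy the GPU request).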

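spec.template.spec.containers.image specifies the image uploaded to the Private Docker Registry earlier. You can specify any other image you want to use, but if the image exists only on the Node and not in a registry, set spec.template.spec.containers.imagePullPolicy to Never. Without it, the kubelet may try to pull the image from a registry (this is the default for the latest tag), the pull fails, and the Pod cannot start.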

Ideally the Volume would be a persistentVolume backed by a shared drive such as an NFS server. My network is too slow to transfer the large training datasets used for DeepLearning, though, so I mount a drive on the host instead. In the example, spec.template.spec.containers.volumeMounts sets the path inside the Pod and spec.template.spec.volumes.hostPath.path sets the path on the host, so the host's /host_hoge is mounted at /srv in the Pod.

The example also lets the scheduler pick any Node whose compute resources satisfy the request, but you can force the Deployment onto specific hosts by setting spec.template.spec.nodeSelector.
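
If you need several variants of this manifest (a different GPU count, a pinned Node, a local-only image), generating it from a script can be less error-prone than editing YAML by hand. The following is only a sketch under my own assumptions: the function build_gpu_deployment and its parameters are hypothetical names, and it emits JSON rather than YAML because kubectl apply accepts JSON manifests and the Python standard library has no YAML writer. The field values mirror the manifest above.

```python
import json


def build_gpu_deployment(image, gpu_count, host_path, *, name="gpu-deployment",
                         app_label="gpu-dlenv", mount_path="/srv",
                         node_selector=None, image_pull_policy=None):
    """Return a Deployment manifest (as a dict) equivalent to gpu-deployment.yaml."""
    if not isinstance(gpu_count, int) or gpu_count < 1:
        raise ValueError("gpu_count must be a positive integer")

    container = {
        "name": f"{app_label}-container",
        "image": image,
        "tty": True,
        "volumeMounts": [{"mountPath": mount_path, "name": "host-share"}],
        # Extended resources such as nvidia.com/gpu are requested via limits.
        "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
    }
    if image_pull_policy is not None:
        # e.g. "Never" for an image that exists only on the Node.
        container["imagePullPolicy"] = image_pull_policy

    pod_spec = {
        "containers": [container],
        "volumes": [{
            "name": "host-share",
            "hostPath": {"path": host_path, "type": "DirectoryOrCreate"},
        }],
    }
    if node_selector:
        # Only Nodes carrying all of these labels are eligible.
        pod_spec["nodeSelector"] = dict(node_selector)

    labels = {"app": app_label}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {"metadata": {"labels": labels}, "spec": pod_spec},
        },
    }


if __name__ == "__main__":
    manifest = build_gpu_deployment(
        "xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge", gpu_count=2, host_path="/host_hoge")
    print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file such as gpu-deployment.json and passing it to kubectl apply -f should behave the same as the YAML version.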

Deploy

Everything is ready, so let's create the Deployment. Run the following command on the Master.

$ kubectl apply -f gpu-deployment.yaml --record
deployment.apps/gpu-deployment created

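The "--record" option stores the executed command in the Deployment's change-cause annotation, so the update history is kept. Next, check that the Deployment, ReplicaSet, and Pod were created correctly.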

$ kubectl get deployments --all-namespaces -o wide
NAMESPACE     NAME             READY   UP-TO-DATE   AVAILABLE   AGE    CONTAINERS            IMAGES                                SELECTOR
default       gpu-deployment   1/1     1            1           4m4s   gpu-dlenv-container   xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge   app=gpu-dlenv
kube-system   coredns          2/2     2            2           3d4h   coredns               k8s.gcr.io/coredns:1.6.2              k8s-app=kube-dns
$
$
$ kubectl get replicasets --all-namespaces -o wide
NAMESPACE     NAME                        DESIRED   CURRENT   READY   AGE     CONTAINERS            IMAGES                                SELECTOR
default       gpu-deployment-7ffd6c86d5   1         1         1       7m44s   gpu-dlenv-container   xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge   app=gpu-dlenv,pod-template-hash=7ffd6c86d5
kube-system   coredns-5644d7b6d9          2         2         2       3d5h    coredns               k8s.gcr.io/coredns:1.6.2              k8s-app=kube-dns,pod-template-hash=5644d7b6d9
$
$
$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE     IP               NODE    NOMINATED NODE   READINESS GATES
default       gpu-deployment-7ffd6c86d5-q7gbg        1/1     Running   0          2m56s   10.244.1.29      utaha   <none>           <none>
(omitted)

Each is Running, so everything started correctly. Running the following command also shows that the image uploaded to the Private Docker Registry earlier was pulled onto the Worker. After pod, enter the Pod name you confirmed above.

$ kubectl describe pod gpu-deployment-7ffd6c86d5-q7gbg
(omitted)
Events:
  Type    Reason     Age        From               Message
  ----    ------     ----       ----               -------
  Normal  Scheduled  <unknown>  default-scheduler  Successfully assigned default/gpu-deployment-7ffd6c86d5-q7gbg to utaha
  Normal  Pulling    15m        kubelet, utaha     Pulling image "xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge"
  Normal  Pulled     15m        kubelet, utaha     Successfully pulled image "xxx.xxx.xxx.xxx:5000/k8s/dl_env:hoge"
  Normal  Created    15m        kubelet, utaha     Created container gpu-dlenv-container
  Normal  Started    15m        kubelet, utaha     Started container gpu-dlenv-container
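
Reading the STATUS column by eye is fine for a handful of Pods, but the same check can be scripted. The sketch below assumes the JSON printed by kubectl get pods --all-namespaces -o json (a list whose entries carry status.phase and status.containerStatuses[].ready); the function name pods_not_ready is my own. It treats anything that is not Running with all containers ready as a problem, which is a simplification (a finished Job Pod in phase Succeeded would also be reported).

```python
import json


def pods_not_ready(pod_list_json: str) -> list:
    """Return (namespace, name, phase) for every Pod that is not Running and ready."""
    problems = []
    for pod in json.loads(pod_list_json).get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        phase = status.get("phase")
        containers = status.get("containerStatuses", [])
        # A Pod counts as healthy only if every container reports ready.
        all_ready = bool(containers) and all(c.get("ready") for c in containers)
        if phase != "Running" or not all_ready:
            problems.append((meta.get("namespace"), meta.get("name"), phase))
    return problems


if __name__ == "__main__":
    import subprocess

    # Requires kubectl and access to the cluster, e.g. on the Master.
    out = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    for namespace, name, phase in pods_not_ready(out):
        print(f"{namespace}/{name}: {phase}")
```

An empty result corresponds to every Pod showing Running with READY n/n in the table above.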

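This completes the environment setup.

Checking the container

Next, look inside the container to check that the DeepLearning environment was built and that the Volume works.

First, open a shell inside the container with the following command. Only one container was created this time, so none is specified, but if a Pod has multiple containers, choose the one to enter with the "-c <container name>" option. Unless stated otherwise, run the following on the Master.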

$ kubectl exec -it gpu-deployment-7ffd6c86d5-q7gbg /bin/bash
root@gpu-deployment-7ffd6c86d5-q7gbg:/# nvidia-smi 
Sun Dec  1 17:52:33 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GP100        Off  | 00000000:86:00.0 Off |                  Off |
| 26%   36C    P0    25W / 235W |      2MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro GP100        Off  | 00000000:D8:00.0 Off |                  Off |
| 26%   35C    P0    24W / 235W |      2MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

root@gpu-deployment-7ffd6c86d5-q7gbg:/# python3
Python 3.6.1 |Anaconda 4.4.0 (64-bit)| (default, May 11 2017, 13:09:58) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2   
>>> cv2.__version__
'3.4.7'
>>>
>>> import tensorflow as tf 
>>> tf.__version__
'1.13.1'
>>>
>>> import keras
Using TensorFlow backend.
>>> keras.__version__
'2.3.1'
>>>
>>> import torch
>>> torch.__version__
'1.3.1'
>>> 
>>> import torchvision
>>> torchvision.__version__
'0.4.2'
>>> exit()
root@gpu-deployment-7ffd6c86d5-q7gbg:/# 
root@gpu-deployment-7ffd6c86d5-q7gbg:/# conda -V
conda 4.3.21

As shown above, the packages specified in the Dockerfile are installed, and both GPUs are recognized.
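
Typing each import into the REPL gets tedious when the image is rebuilt often. As a rough sketch, the version check can be collected into one function; package_versions is a name I made up, and the package list in the demo is simply the set checked above.

```python
import importlib


def package_versions(names):
    """Map each module name to its __version__, 'unknown', or None if not importable."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing from this image.
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in package_versions(
            ["cv2", "tensorflow", "keras", "torch", "torchvision"]).items():
        print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Saved as a file under the mounted directory, it can be run inside the Pod with python3 in the same way as the interactive session above.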

Next, check the Volume. The host's /host_hoge is mounted at /srv in the Pod, so first create a sample file in /host_hoge. Do only this step on the Worker where the Deployment was created.

$ cd /host_hoge
$ touch host_test.txt

Next, check that the file created on the Worker host appears under /srv.

$ kubectl exec -it gpu-deployment-7ffd6c86d5-q7gbg /bin/bash
root@gpu-deployment-7ffd6c86d5-q7gbg:/# cd /srv
root@gpu-deployment-7ffd6c86d5-q7gbg:/srv# ls
host_test.txt
root@gpu-deployment-7ffd6c86d5-q7gbg:/srv#

Finally, train on the GPU using the Keras MNIST sample (with the TensorFlow backend).

$ kubectl exec -it gpu-deployment-7ffd6c86d5-q7gbg /bin/bash
root@gpu-deployment-7ffd6c86d5-q7gbg:/# cd /srv
root@gpu-deployment-7ffd6c86d5-q7gbg:/srv# wget https://raw.githubusercontent.com/fchollet/keras/master/examples/mnist_cnn.py
--2019-12-01 18:42:06--  https://raw.githubusercontent.com/fchollet/keras/master/examples/mnist_cnn.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.228.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.228.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2257 (2.2K) [text/plain]
Saving to: 'mnist_cnn.py'

mnist_cnn.py              100%[=====================================>]   2.20K  --.-KB/s    in 0s      

2019-12-01 18:42:06 (35.2 MB/s) - 'mnist_cnn.py' saved [2257/2257]
root@gpu-deployment-7ffd6c86d5-q7gbg:/srv# python3 mnist_cnn.py 
Using TensorFlow backend.
/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 13s 1us/step
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
WARNING:tensorflow:From /.local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /.local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-12-01 18:45:08.955276: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-12-01 18:45:09.416060: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5978410 executing computations on platform CUDA. Devices:
2019-12-01 18:45:09.416135: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): Quadro GP100, Compute Capability 6.0
2019-12-01 18:45:09.416154: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (1): Quadro GP100, Compute Capability 6.0
2019-12-01 18:45:09.421518: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1699985000 Hz
2019-12-01 18:45:09.422753: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x59ea1a0 executing computations on platform Host. Devices:
2019-12-01 18:45:09.422798: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-12-01 18:45:09.423150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Quadro GP100 major: 6 minor: 0 memoryClockRate(GHz): 1.4425
pciBusID: 0000:86:00.0
totalMemory: 15.90GiB freeMemory: 15.64GiB
2019-12-01 18:45:09.423319: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: 
name: Quadro GP100 major: 6 minor: 0 memoryClockRate(GHz): 1.4425
pciBusID: 0000:d8:00.0
totalMemory: 15.90GiB freeMemory: 15.64GiB
2019-12-01 18:45:09.426778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-12-01 18:45:09.432828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-01 18:45:09.432890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1 
2019-12-01 18:45:09.432909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N Y 
2019-12-01 18:45:09.432926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   Y N 
2019-12-01 18:45:09.433319: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15212 MB memory) -> physical GPU (device: 0, name: Quadro GP100, pci bus id: 0000:86:00.0, compute capability: 6.0)
2019-12-01 18:45:09.434461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15212 MB memory) -> physical GPU (device: 1, name: Quadro GP100, pci bus id: 0000:d8:00.0, compute capability: 6.0)
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2019-12-01 18:45:11.671905: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
60000/60000 [==============================] - 7s 121us/step - loss: 0.2637 - accuracy: 0.9186 - val_loss: 0.0531 - val_accuracy: 0.9823
Epoch 2/12
60000/60000 [==============================] - 4s 72us/step - loss: 0.0891 - accuracy: 0.9738 - val_loss: 0.0430 - val_accuracy: 0.9857
Epoch 3/12
60000/60000 [==============================] - 4s 70us/step - loss: 0.0662 - accuracy: 0.9801 - val_loss: 0.0315 - val_accuracy: 0.9891
Epoch 4/12
60000/60000 [==============================] - 4s 71us/step - loss: 0.0550 - accuracy: 0.9833 - val_loss: 0.0297 - val_accuracy: 0.9895
Epoch 5/12
60000/60000 [==============================] - 4s 69us/step - loss: 0.0462 - accuracy: 0.9859 - val_loss: 0.0295 - val_accuracy: 0.9900
Epoch 6/12
60000/60000 [==============================] - 4s 70us/step - loss: 0.0419 - accuracy: 0.9870 - val_loss: 0.0283 - val_accuracy: 0.9900
Epoch 7/12
60000/60000 [==============================] - 4s 71us/step - loss: 0.0363 - accuracy: 0.9884 - val_loss: 0.0266 - val_accuracy: 0.9903
Epoch 8/12
60000/60000 [==============================] - 4s 71us/step - loss: 0.0335 - accuracy: 0.9897 - val_loss: 0.0267 - val_accuracy: 0.9913
Epoch 9/12
60000/60000 [==============================] - 4s 70us/step - loss: 0.0317 - accuracy: 0.9900 - val_loss: 0.0311 - val_accuracy: 0.9893
Epoch 10/12
60000/60000 [==============================] - 4s 70us/step - loss: 0.0287 - accuracy: 0.9910 - val_loss: 0.0270 - val_accuracy: 0.9910
Epoch 11/12
60000/60000 [==============================] - 4s 71us/step - loss: 0.0286 - accuracy: 0.9911 - val_loss: 0.0285 - val_accuracy: 0.9912
Epoch 12/12
60000/60000 [==============================] - 4s 71us/step - loss: 0.0264 - accuracy: 0.9918 - val_loss: 0.0290 - val_accuracy: 0.9904
Test loss: 0.028990692665684947
Test accuracy: 0.9904000163078308
root@gpu-deployment-7ffd6c86d5-q7gbg:/srv#

While training is running, open another terminal and run the following command to confirm that the GPU is being used.

$ kubectl exec gpu-deployment-7ffd6c86d5-q7gbg -- watch -n1 "nvidia-smi"
Sun Dec  1 18:50:58 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GP100        Off  | 00000000:86:00.0 Off |                  Off |
| 33%   48C    P0    95W / 235W |  15745MiB / 16278MiB |     45%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro GP100        Off  | 00000000:D8:00.0 Off |                  Off |
| 26%   36C    P0    34W / 235W |    267MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
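
The table is convenient to watch, but for logging it is easier to ask nvidia-smi for CSV. The sketch below assumes the output of nvidia-smi --query-gpu=name,utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits, which prints one comma-separated line per GPU; parse_gpu_csv is my own helper name, and the numbers in the demo are copied from the table above.

```python
import csv
import io


def parse_gpu_csv(text: str) -> list:
    """Parse 'name, util %, used MiB, total MiB' lines into a list of dicts."""
    gpus = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for record in reader:
        if not record:
            continue  # skip blank lines
        name, util, used, total = (field.strip() for field in record)
        gpus.append({
            "name": name,
            "util_percent": int(util),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
        })
    return gpus


if __name__ == "__main__":
    # Values taken from the nvidia-smi table shown during training.
    sample = "Quadro GP100, 45, 15745, 16278\nQuadro GP100, 0, 267, 16278\n"
    for index, gpu in enumerate(parse_gpu_csv(sample)):
        print(index, gpu["name"], f'{gpu["util_percent"]}%',
              f'{gpu["memory_used_mib"]}/{gpu["memory_total_mib"]} MiB')
```

In this run only GPU 0 shows non-zero utilization, which matches the table.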

Visualizing compute resources

We use metrics server and Kubernetes Dashboard 2.0 to visualize the compute resources of the Kubernetes cluster, which makes monitoring easier. I referred to the following article. Do this work on the Master.

qiita.com

metrics server

Heapster used to be the well-known way to collect compute resource utilization on Kubernetes, but it has been decided that its support will end.

github.com

RETIRED: Heapster is now retired. See the deprecation timeline for more information on support. We will not be making changes to Heapster.

So this time we use metrics server instead.

github.com

Run the following command to download metrics server.

$ git clone https://github.com/kubernetes-incubator/metrics-server.git

As downloaded, metrics server does not seem to start correctly, so edit "metrics-server/deploy/1.8+/metrics-server-deployment.yaml" as follows (lines marked with + are additions).

github.com

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        args:
          - --cert-dir=/tmp
          - --secure-port=4443
        ports:
        - name: main-port
          containerPort: 4443
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        imagePullPolicy: Always
+       command:
+       - /metrics-server
+       - --kubelet-insecure-tls
+       - --kubelet-preferred-address-types=InternalDNS,InternalIP,ExternalDNS,ExternalIP,Hostname
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        beta.kubernetes.io/os: linux

It should now start correctly, so deploy metrics server.

$ kubectl apply -f metrics-server/deploy/1.8+/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
$
$
$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE     IP               NODE    NOMINATED NODE   READINESS GATES
kube-system   metrics-server-795b774c76-7k6tp        1/1     Running   0          3m29s   10.244.1.30      utaha   <none>           <none>
(omitted)

Confirm that metrics-server is running on a Worker. If it is not running on a Worker, resource information cannot be collected.
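
Once metrics server has been up for a minute or so, kubectl top nodes should print usage, and the raw numbers are available from the metrics API with kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes. The sketch below converts that JSON into cores and bytes. The helper names are mine, and it only handles the suffixes I have seen in this output (n, u, m for CPU and Ki, Mi, Gi, Ti for memory), so treat it as an illustration and not a full Kubernetes quantity parser.

```python
import json

_CPU_FACTORS = {"n": 1e-9, "u": 1e-6, "m": 1e-3}
_MEMORY_FACTORS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_cores(quantity: str) -> float:
    """Convert a CPU quantity such as '156865713n' or '250m' to cores."""
    if quantity and quantity[-1] in _CPU_FACTORS:
        return float(quantity[:-1]) * _CPU_FACTORS[quantity[-1]]
    return float(quantity)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '4026356Ki' to bytes."""
    for suffix, factor in _MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


def node_usage(metrics_json: str) -> dict:
    """Map node name -> (cpu cores, memory bytes) from a NodeMetricsList."""
    usage = {}
    for item in json.loads(metrics_json).get("items", []):
        usage[item["metadata"]["name"]] = (
            cpu_cores(item["usage"]["cpu"]),
            memory_bytes(item["usage"]["memory"]),
        )
    return usage


if __name__ == "__main__":
    # Hypothetical numbers for illustration; only the node name comes from this article.
    sample = json.dumps({"items": [
        {"metadata": {"name": "utaha"},
         "usage": {"cpu": "156865713n", "memory": "4026356Ki"}},
    ]})
    for node, (cores, mem) in node_usage(sample).items():
        print(f"{node}: {cores:.3f} cores, {mem / 2**30:.2f} GiB")
```

If the raw call returns an error instead of JSON, metrics server is usually still starting or cannot reach the kubelets.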

Kubernetes Dashboard 2.0

Many blogs cover Kubernetes Dashboard 1.x, and the README on the official GitHub also describes how to install v1.x, but note that it does not work on Kubernetes 1.16.

github.com

At the time of writing, only Beta releases of Kubernetes Dashboard 2.0 are available. Here we use the latest one, Beta6.

$ wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta6/aio/deploy/recommended.yaml
--2019-12-02 05:30:40--  https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta6/aio/deploy/recommended.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.228.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.228.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7568 (7.4K) [text/plain]
Saving to: 'recommended.yaml'

recommended.yaml                    100%[==================================================================>]   7.39K  --.-KB/s    in 0s    

2019-12-02 05:30:41 (28.6 MB/s) - 'recommended.yaml' saved [7568/7568]

By default, the Service is deployed so that it can only be reached from inside the Kubernetes Cluster. Edit recommended.yaml so that it can also be reached from outside the cluster (lines marked with + are additions).

(omitted)
---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
+ type: NodePort
  ports:
    - port: 443
      targetPort: 8443
+     nodePort: 30794
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---
(omitted)
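
The nodePort value cannot be chosen freely: unless the API server was started with a different --service-node-port-range, it must fall within 30000-32767. As a sketch of the same edit done programmatically (expose_as_node_port is a hypothetical helper, and it works on the Service as a dict, not on the YAML file itself):

```python
import copy

# Default NodePort range of the Kubernetes API server.
NODE_PORT_MIN, NODE_PORT_MAX = 30000, 32767


def expose_as_node_port(service: dict, node_port: int) -> dict:
    """Return a copy of a single-port Service switched to type NodePort."""
    if not NODE_PORT_MIN <= node_port <= NODE_PORT_MAX:
        raise ValueError(
            f"nodePort {node_port} is outside {NODE_PORT_MIN}-{NODE_PORT_MAX}")
    patched = copy.deepcopy(service)  # leave the caller's dict untouched
    spec = patched.setdefault("spec", {})
    ports = spec.get("ports") or []
    if len(ports) != 1:
        raise ValueError("this sketch expects exactly one port entry")
    spec["type"] = "NodePort"
    ports[0]["nodePort"] = node_port
    return patched


if __name__ == "__main__":
    dashboard_service = {
        "kind": "Service",
        "apiVersion": "v1",
        "metadata": {"name": "kubernetes-dashboard",
                     "namespace": "kubernetes-dashboard"},
        "spec": {
            "ports": [{"port": 443, "targetPort": 8443}],
            "selector": {"k8s-app": "kubernetes-dashboard"},
        },
    }
    print(expose_as_node_port(dashboard_service, 30794)["spec"])
```

Choosing a port outside the range makes kubectl apply reject the Service, so validating it up front saves a round trip.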

Deploy Kubernetes Dashboard 2.0 with the following command and check the result.

$ kubectl apply -f recommended.yaml 
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
$
$
$ kubectl get svc -n kubernetes-dashboard
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
dashboard-metrics-scraper   ClusterIP   10.104.216.215   <none>        8000/TCP        2m28s
kubernetes-dashboard        NodePort    10.110.229.237   <none>        443:30794/TCP   2m28s

Kubernetes Dashboard is now running, so open "https://<Master IP address>:30794" in a browser. A page like the following should appear, so prepare a token.

[Figure: Kubernetes Dashboard 2.0 login page]

Create a dashboard-admin-serviceaccount.yaml like the following.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: dashboard-admin
    namespace: kube-system

Deploy dashboard-admin-serviceaccount.

$ kubectl apply -f dashboard-admin-serviceaccount.yaml
serviceaccount/dashboard-admin created
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
$
$
$ kubectl get secret -n kube-system | grep admin
dashboard-admin-token-bvcfp                      kubernetes.io/service-account-token   3      6m2s

Display the token of the secret whose name the last command returned.

$ kubectl describe secret -n kube-system dashboard-admin-token-bvcfp
Name:         dashboard-admin-token-bvcfp
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: 70c51d53-2424-48f7-8f1c-67b860541baa

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  11 bytes
token:      hogehogehoge...
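
kubectl describe prints the token already decoded. If you fetch the secret as JSON instead (kubectl get secret dashboard-admin-token-bvcfp -n kube-system -o json), the values under data are base64-encoded and need one decoding step before they can be pasted into the login form. A minimal sketch, with token_from_secret being a name of my own choosing:

```python
import base64
import json


def token_from_secret(secret_json: str) -> str:
    """Decode the bearer token from a service-account-token Secret in JSON form."""
    data = json.loads(secret_json).get("data", {})
    if "token" not in data:
        raise KeyError("secret has no 'token' field")
    return base64.b64decode(data["token"]).decode("utf-8")


if __name__ == "__main__":
    # Placeholder secret; a real token is a long JWT, masked as hogehogehoge above.
    sample = json.dumps({
        "kind": "Secret",
        "type": "kubernetes.io/service-account-token",
        "data": {"token": base64.b64encode(b"hogehogehoge").decode("ascii")},
    })
    print(token_from_secret(sample))
```

Keep in mind that this account is bound to cluster-admin, so the token should be handled like a root password.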

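Paste the displayed token into the token field in the browser and log in. If everything works, a page like the one below is displayed (shown here in dark mode). metrics server takes a little while to start, so give the graphs some time to appear. Well done.

[Figure: Kubernetes Dashboard 2.0 after setup is complete]

Conclusion

This completes the basic DeepLearning environment on Kubernetes. It was a long procedure, so thank you to everyone who stuck with it to the end. As mentioned in the introduction, the next article will cover building a remote development environment with this Kubernetes environment, a Mac, and VScode, so please read that one too.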

tenzenの生存日誌

Building a multi-node DeepLearning environment with Kubernetes (k8s) v1.16 and Nvidia-Docker2
