Upgrading a Workload Cluster on Tanzu Kubernetes Grid (TKGm) v1.3.0
A memo for my own reference. This post upgrades a Workload Cluster on TKGm v1.3.0.
What this post covers
In an existing TKGm v1.3.0 environment, I create a Workload Cluster with Kubernetes v1.19.8 and then upgrade that Workload Cluster to the latest available Kubernetes version. The steps follow "Upgrade Tanzu Kubernetes Clusters" in the official documentation.
Creating the Workload Cluster
I create the Workload Cluster following the usual procedure, adding the customizations below at creation time. The following two files are placed in the directory ~/.tanzu/tkg/providers/infrastructure-vsphere/ytt (a sketch for sanity-checking these overlays with ytt follows after the two files).
$ cat containerd-multi-image-registry.yaml
#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #@overlay/append
    - echo ' [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]' >> /etc/containerd/config.toml
    #@overlay/append
    - echo ' endpoint = ["https://registry-1.docker.io"]' >> /etc/containerd/config.toml
    #@overlay/append
    - systemctl restart containerd

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
spec:
  template:
    spec:
      preKubeadmCommands:
      #@overlay/append
      - echo ' [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]' >> /etc/containerd/config.toml
      #@overlay/append
      - echo ' endpoint = ["https://registry-1.docker.io"]' >> /etc/containerd/config.toml
      #@overlay/append
      - systemctl restart containerd
$ cat customize-ring-buffer.yaml
#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #@overlay/append
    - ethtool -G eth0 tx 2048
    #@overlay/append
    - ethtool -G eth0 rx 2048

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
spec:
  template:
    spec:
      preKubeadmCommands:
      #@overlay/append
      - ethtool -G eth0 tx 2048
      #@overlay/append
      - ethtool -G eth0 rx 2048
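Before creating the cluster, the overlays can be sanity-checked by rendering them with the ytt CLI against a small stand-in manifest. The kcp-sample.yaml below is a made-up, minimal input used only to illustrate the overlay behavior; the real inputs are the provider templates that the tanzu CLI feeds through ytt. The rendered output should simply show the extra preKubeadmCommands entries appended to both documents.
$ cat kcp-sample.yaml
---
kind: KubeadmControlPlane
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    - echo "existing command"
---
kind: KubeadmConfigTemplate
spec:
  template:
    spec:
      preKubeadmCommands:
      - echo "existing command"
$ ytt -f kcp-sample.yaml -f containerd-multi-image-registry.yaml -f customize-ring-buffer.yaml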
The default containerd settings are described in my other memo 「メモ:Tanzu Kubernetes Grid(TKGm) で コンテナランタイムの設定を変更する」.
The ring buffer defaults are as follows.
$ ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 1024
RX Mini: 128
RX Jumbo: 256
TX: 512
Now create the Workload Cluster.
$ tanzu cluster create ibanez --tkr v1.19.8---vmware.1-tkg.1 --file .tanzu/tkg/clusterconfigs/cluster-ibanez-config.yaml -v 6
Using namespace from config:
Validating configuration...
Waiting for resource pinniped-federation-domain of type *unstructured.Unstructured to be up and running
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
Failed to configure Pinniped configuration for workload cluster. Please refer to the documentation to check if you can configure pinniped on workload cluster manually
Creating workload cluster 'ibanez'...
patch cluster object with operation status:
...(SNIP)...
Waiting for resources type *v1alpha3.MachineList to be up and running
Waiting for addons installation...
Waiting for resources type *v1alpha3.ClusterResourceSetList to be up and running
Waiting for resource antrea-controller of type *v1.Deployment to be up and running
Workload cluster 'ibanez' created
The Workload Cluster was created successfully.
$ tanzu cluster list --include-management-cluster
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN
ibanez default running 1/1 1/1 v1.19.8+vmware.1 <none> dev
schecter tkg-system running 1/1 1/1 v1.20.4+vmware.1 management dev
$ tanzu cluster kubeconfig get ibanez --export-file lab-ibanez-kubeconfig --admin
$ export KUBECONFIG=~/lab-ibanez-kubeconfig
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ibanez-control-plane-46qhr Ready master 169m v1.19.8+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready <none> 168m v1.19.8+vmware.1
Log in to the newly created control plane node and worker node and confirm that the settings have been applied (an additional runtime check with crictl follows after the outputs below).
- Control plane node
root@ibanez-control-plane-46qhr:~# cat /etc/containerd/config.toml
## template: jinja
# Use config version 2 to enable new configuration fields.
# Config file is parsed as version 1 by default.
version = 2
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "projects.registry.vmware.com/tkg/pause:3.2"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
root@ibanez-control-plane-46qhr:~# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 2048
RX Mini: 128
RX Jumbo: 256
TX: 2048
- Worker node
root@ibanez-md-0-86c8f89bc4-2dvzp:~# cat /etc/containerd/config.toml
## template: jinja
# Use config version 2 to enable new configuration fields.
# Config file is parsed as version 1 by default.
version = 2
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "projects.registry.vmware.com/tkg/pause:3.2"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
root@ibanez-md-0-86c8f89bc4-2dvzp:~# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 2048
RX Mini: 128
RX Jumbo: 256
TX: 2048
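In addition to reading config.toml, one way to check that containerd has actually picked up the mirror setting is to query it over the CRI. This assumes crictl is present on the node and that containerd listens on the default socket path; the exact layout of the crictl info output may differ by version.
root@ibanez-md-0-86c8f89bc4-2dvzp:~# crictl -r unix:///run/containerd/containerd.sock info | grep -A 3 '"docker.io"'
root@ibanez-md-0-86c8f89bc4-2dvzp:~# crictl -r unix:///run/containerd/containerd.sock pull docker.io/library/busybox:latest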
Upgrading the Workload Cluster
Upgrade the Workload Cluster by following the documented procedure. "Upgrade Tanzu Kubernetes Clusters" states:
"Before you upgrade a Tanzu Kubernetes cluster, remove all unmanaged kapp-controller deployment artifacts from the Tanzu Kubernetes cluster. An unmanaged kapp-controller deployment is a deployment that exists outside of the vmware-system-tmc namespace. You can assume it is in the kapp-controller namespace."
Accordingly, I delete the kapp-controller deployment (delete it... is that really going to be okay?).
$ kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system antrea-controller 1/1 1 1 3h7m
kube-system coredns 2/2 2 2 3h9m
kube-system metrics-server 1/1 1 1 3h7m
kube-system vsphere-csi-controller 1/1 1 1 177m
tkg-system kapp-controller 1/1 1 1 3h8m
$ k -n tkg-system delete deployment kapp-controller
$ k delete clusterrole kapp-controller-cluster-role
$ k delete clusterrolebinding kapp-controller-cluster-role-binding
$ k -n tkg-system delete serviceaccount kapp-controller-sa
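As a quick check that nothing kapp-controller related is left behind (same resource names as above; both commands should return no output):
$ k -n tkg-system get deployment,serviceaccount | grep kapp-controller
$ k get clusterrole,clusterrolebinding | grep kapp-controller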
With the preparation done, let's go ahead and upgrade.
$ tanzu kubernetes-release get
NAME VERSION COMPATIBLE UPGRADEAVAILABLE
v1.17.16---vmware.2-tkg.1 v1.17.16+vmware.2-tkg.1 True True
v1.18.16---vmware.1-tkg.1 v1.18.16+vmware.1-tkg.1 True True
v1.19.8---vmware.1-tkg.1 v1.19.8+vmware.1-tkg.1 True True
v1.20.4---vmware.1-tkg.1 v1.20.4+vmware.1-tkg.1 True False
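As a side note, the upgrade target can also be looked up and pinned explicitly. The commands below follow the TKG v1.3 CLI documentation but were not run here, so treat them as a sketch; without --tkr, tanzu cluster upgrade picks the latest compatible release, which is what happens below.
$ tanzu kubernetes-release available-upgrades get v1.19.8---vmware.1-tkg.1
$ tanzu cluster upgrade ibanez --tkr v1.20.4---vmware.1-tkg.1 --yes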
$ tanzu cluster list
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN
ibanez default running 1/1 1/1 v1.19.8+vmware.1 <none> dev
$ tanzu cluster upgrade ibanez
Upgrading workload cluster 'ibanez' to kubernetes version 'v1.20.4+vmware.1'. Are you sure? [y/N]: y
Validating configuration...
Verifying kubernetes version...
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.20.4+vmware.1...
Waiting for kubernetes version to be updated for control plane nodes
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.20.4+vmware.1...
Waiting for kubernetes version to be updated for worker nodes...
updating additional components: 'metadata/tkg,addons-management/kapp-controller' ...
Cluster 'ibanez' successfully upgraded to kubernetes version 'v1.20.4+vmware.1'
In a separate terminal, watch how the nodes behave while the Workload Cluster is being upgraded. Because the control plane is a single node, connectivity to kube-apiserver drops for a moment; with multiple control plane nodes and NSX ALB providing the load balancer function, this should not happen (a sketch for measuring the blip follows after the watch output below).
$ k get nodes -w
NAME STATUS ROLES AGE VERSION
ibanez-control-plane-46qhr Ready master 3h12m v1.19.8+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready <none> 3h11m v1.19.8+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready <none> 3h11m v1.19.8+vmware.1
ibanez-control-plane-46qhr Ready master 3h13m v1.19.8+vmware.1
ibanez-control-plane-q99sp NotReady <none> 0s v1.20.4+vmware.1
ibanez-control-plane-q99sp NotReady <none> 0s v1.20.4+vmware.1
ibanez-control-plane-q99sp NotReady <none> 0s v1.20.4+vmware.1
ibanez-control-plane-q99sp NotReady <none> 0s v1.20.4+vmware.1
$ k get nodes -w
Unable to connect to the server: dial tcp <control-plane-endpoint-ip>:6443: connect: no route to host
$ k get nodes -w
NAME STATUS ROLES AGE VERSION
ibanez-control-plane-46qhr Ready master 3h14m v1.19.8+vmware.1
ibanez-control-plane-q99sp NotReady control-plane,master 30s v1.20.4+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready <none> 3h13m v1.19.8+vmware.1
ibanez-control-plane-q99sp NotReady control-plane,master 34s v1.20.4+vmware.1
ibanez-control-plane-q99sp NotReady control-plane,master 39s v1.20.4+vmware.1
ibanez-control-plane-q99sp NotReady control-plane,master 39s v1.20.4+vmware.1
$ k get nodes -w
NAME STATUS ROLES AGE VERSION
ibanez-control-plane-46qhr Ready,SchedulingDisabled master 3h16m v1.19.8+vmware.1
ibanez-control-plane-q99sp Ready control-plane,master 101s v1.20.4+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready <none> 3h15m v1.19.8+vmware.1
ibanez-control-plane-q99sp Ready control-plane,master 101s v1.20.4+vmware.1
ibanez-control-plane-q99sp Ready control-plane,master 2m4s v1.20.4+vmware.1
ibanez-control-plane-46qhr Ready,SchedulingDisabled master 3h16m v1.19.8+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready <none> 3h16m v1.19.8+vmware.1
ibanez-md-0-8ff576557-2vgmt NotReady <none> 0s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt NotReady <none> 0s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt NotReady <none> 0s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt NotReady <none> 1s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt NotReady <none> 5s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt NotReady <none> 10s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 30s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 30s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 30s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 31s v1.20.4+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready,SchedulingDisabled <none> 3h17m v1.19.8+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready,SchedulingDisabled <none> 3h17m v1.19.8+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 35s v1.20.4+vmware.1
ibanez-md-0-86c8f89bc4-2dvzp Ready,SchedulingDisabled <none> 3h17m v1.19.8+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 90s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 118s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 2m v1.20.4+vmware.1
$ k get nodes
NAME STATUS ROLES AGE VERSION
ibanez-control-plane-q99sp Ready control-plane,master 6m45s v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt Ready <none> 3m23s v1.20.4+vmware.1
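As mentioned above, the kube-apiserver blip during the control plane roll can be roughly measured with a probe like the one below, run in yet another terminal with KUBECONFIG still pointing at the workload cluster (~/lab-ibanez-kubeconfig). This is just a sketch:
$ while true; do if kubectl get --raw /healthz --request-timeout=2s >/dev/null 2>&1; then echo "$(date '+%T') ok"; else echo "$(date '+%T') kube-apiserver unreachable"; fi; sleep 2; done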
TKGm manages the lifecycle of Kubernetes clusters with Cluster API. In that model an upgrade is not performed in place on the existing nodes; instead, nodes running the new version are created and joined to the cluster, and the nodes running the old version are discarded. In a TKGm on vSphere environment this behavior can be seen both in the kubectl get nodes output and in the status of the virtual machines in vCenter. Let's confirm that the upgrade completed successfully.
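As an aside, the same node replacement can also be watched from the management cluster side while the upgrade is running. A minimal sketch, assuming KUBECONFIG points at the management cluster and, for the vCenter part, that govc is configured with GOVC_URL and credentials:
$ kubectl get machines -w                # Machine objects roll from v1.19.8 to v1.20.4
$ govc find / -type m -name 'ibanez-*'   # the corresponding VMs in vCenter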
$ k get all -n tkg-system
NAME READY STATUS RESTARTS AGE
pod/kapp-controller-f7964d5bd-q5l62 1/1 Running 0 4m11s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kapp-controller 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/kapp-controller-f7964d5bd 1 1 1 11m
$ k get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system antrea-agent-jzmf9 2/2 Running 0 22m
kube-system antrea-agent-m4dvn 2/2 Running 0 19m
kube-system antrea-controller-6f6cc6bdc9-nwggl 1/1 Running 0 18m
kube-system coredns-5b7b55f9f8-wbx5g 1/1 Running 0 18m
kube-system coredns-5b7b55f9f8-wcw2s 1/1 Running 0 18m
kube-system etcd-ibanez-control-plane-q99sp 1/1 Running 0 22m
kube-system kube-apiserver-ibanez-control-plane-q99sp 1/1 Running 0 22m
kube-system kube-controller-manager-ibanez-control-plane-q99sp 1/1 Running 0 22m
kube-system kube-proxy-645v6 1/1 Running 0 20m
kube-system kube-proxy-wnrth 1/1 Running 0 19m
kube-system kube-scheduler-ibanez-control-plane-q99sp 1/1 Running 0 22m
kube-system kube-vip-ibanez-control-plane-q99sp 1/1 Running 0 21m
kube-system metrics-server-84b9bc9fc9-qcwdw 1/1 Running 0 18m
kube-system vsphere-cloud-controller-manager-vck48 1/1 Running 0 21m
kube-system vsphere-csi-controller-57464947d9-7ws52 5/5 Running 0 21m
kube-system vsphere-csi-node-b52t9 3/3 Running 0 22m
kube-system vsphere-csi-node-pj9jz 3/3 Running 0 19m
tkg-system kapp-controller-f7964d5bd-q5l62 1/1 Running 0 18m
Let's also check the state of the Cluster API custom resources from the Management Cluster (these commands are run with the kubeconfig pointing at the Management Cluster).
$ k get kubeadmcontrolplanes
NAME INITIALIZED API SERVER AVAILABLE VERSION REPLICAS READY UPDATED UNAVAILABLE
ibanez-control-plane true true v1.20.4+vmware.1 1 1 1
$ k get machinedeployment
NAME PHASE REPLICAS READY UPDATED UNAVAILABLE
ibanez-md-0 Running 1 1 1
$ k get machinesets
NAME REPLICAS AVAILABLE READY
ibanez-md-0-86c8f89bc4
ibanez-md-0-8ff576557 1 1 1
As for the MachineSet, the old one whose Machine was deleted still appears in the list; a compact check of its replica count is sketched after the describe output below.
$ k describe machinesets ibanez-md-0-86c8f89bc4
Name: ibanez-md-0-86c8f89bc4
Namespace: default
Labels: cluster.x-k8s.io/cluster-name=ibanez
machine-template-hash=4274945670
node-pool=ibanez-worker-pool
Annotations: machinedeployment.clusters.x-k8s.io/desired-replicas: 1
machinedeployment.clusters.x-k8s.io/max-replicas: 2
machinedeployment.clusters.x-k8s.io/revision: 1
API Version: cluster.x-k8s.io/v1alpha3
Kind: MachineSet
Metadata:
Creation Timestamp: 2021-04-20T06:58:06Z
Generation: 2
Managed Fields:
API Version: cluster.x-k8s.io/v1alpha3
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:machinedeployment.clusters.x-k8s.io/desired-replicas:
f:machinedeployment.clusters.x-k8s.io/max-replicas:
f:machinedeployment.clusters.x-k8s.io/revision:
f:labels:
.:
f:cluster.x-k8s.io/cluster-name:
f:machine-template-hash:
f:node-pool:
f:ownerReferences:
.:
k:{"uid":"3129ee77-67f3-4a91-b27a-8632fdd59a85"}:
.:
f:apiVersion:
f:blockOwnerDeletion:
f:controller:
f:kind:
f:name:
f:uid:
f:spec:
.:
f:clusterName:
f:replicas:
f:selector:
.:
f:matchLabels:
.:
f:cluster.x-k8s.io/cluster-name:
f:machine-template-hash:
f:template:
.:
f:metadata:
.:
f:labels:
.:
f:cluster.x-k8s.io/cluster-name:
f:machine-template-hash:
f:node-pool:
f:spec:
.:
f:bootstrap:
.:
f:configRef:
.:
f:apiVersion:
f:kind:
f:name:
f:clusterName:
f:infrastructureRef:
.:
f:apiVersion:
f:kind:
f:name:
f:version:
f:status:
.:
f:observedGeneration:
f:selector:
Manager: manager
Operation: Update
Time: 2021-04-20T07:15:03Z
Owner References:
API Version: cluster.x-k8s.io/v1alpha3
Block Owner Deletion: true
Controller: true
Kind: MachineDeployment
Name: ibanez-md-0
UID: 3129ee77-67f3-4a91-b27a-8632fdd59a85
Resource Version: 10913012
UID: 46f4b1de-1ed5-4a4a-be9a-7e57d235f8fb
Spec:
Cluster Name: ibanez
Delete Policy: Random
Replicas: 0
Selector:
Match Labels:
cluster.x-k8s.io/cluster-name: ibanez
Machine - Template - Hash: 4274945670
Template:
Metadata:
Labels:
cluster.x-k8s.io/cluster-name: ibanez
Machine - Template - Hash: 4274945670
Node - Pool: ibanez-worker-pool
Spec:
Bootstrap:
Config Ref:
API Version: bootstrap.cluster.x-k8s.io/v1alpha3
Kind: KubeadmConfigTemplate
Name: ibanez-md-0
Cluster Name: ibanez
Infrastructure Ref:
API Version: infrastructure.cluster.x-k8s.io/v1alpha3
Kind: VSphereMachineTemplate
Name: ibanez-worker
Version: v1.19.8+vmware.1
Status:
Observed Generation: 2
Selector: cluster.x-k8s.io/cluster-name=ibanez,machine-template-hash=4274945670
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulDelete 59m machineset-controller Deleted machine "ibanez-md-0-86c8f89bc4-2dvzp"
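As referenced above, the old MachineSet is not deleted; it is scaled down to zero replicas (the Replicas: 0 in Spec above). A compact way to confirm that:
$ k get machinesets ibanez-md-0-86c8f89bc4 -o jsonpath='{.spec.replicas}{"\n"}'
0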
$ k get machines
NAME PROVIDERID PHASE VERSION
ibanez-control-plane-q99sp vsphere://423573c5-5ac0-bdb4-c0de-b9ea63650a46 Running v1.20.4+vmware.1
ibanez-md-0-8ff576557-2vgmt vsphere://4235e856-c017-36c4-6c05-21179a5b7ac1 Running v1.20.4+vmware.1
Let's check with the tanzu CLI as well.
$ tanzu cluster get ibanez
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES
ibanez default running 1/1 1/1 v1.20.4+vmware.1 <none>
ℹ
Details:
NAME READY SEVERITY REASON SINCE MESSAGE
/ibanez True 88m
├─ClusterInfrastructure - VSphereCluster/ibanez True 4h46m
├─ControlPlane - KubeadmControlPlane/ibanez-control-plane True 88m
│ └─Machine/ibanez-control-plane-q99sp True 89m
└─Workers
└─MachineDeployment/ibanez-md-0
└─Machine/ibanez-md-0-8ff576557-2vgmt True 86m
As shown below, the customizations applied to the Kubernetes cluster have been properly carried over to the new nodes as well.
- Control plane node
root@ibanez-control-plane-q99sp:~# cat /etc/containerd/config.toml
## template: jinja
# Use config version 2 to enable new configuration fields.
# Config file is parsed as version 1 by default.
version = 2
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "projects.registry.vmware.com/tkg/pause:3.2"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
root@ibanez-control-plane-q99sp:~# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 2048
RX Mini: 128
RX Jumbo: 256
TX: 2048
- Worker node
root@ibanez-md-0-8ff576557-2vgmt:~# cat /etc/containerd/config.toml
## template: jinja
# Use config version 2 to enable new configuration fields.
# Config file is parsed as version 1 by default.
version = 2
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "projects.registry.vmware.com/tkg/pause:3.2"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
root@ibanez-md-0-8ff576557-2vgmt:~# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 2048
RX Mini: 128
RX Jumbo: 256
TX: 2048
This confirms that TKGm, using the Cluster API machinery, makes upgrading a Kubernetes cluster straightforward.