diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md
index 5fb1ee57f..c6b78face 100644
--- a/Standards/scs-0215-v1-robustness-features.md
+++ b/Standards/scs-0215-v1-robustness-features.md
@@ -77,7 +77,7 @@ different priority levels and rate limit maximums.
 The concept documentation offers a more in-depth explanation of the feature:
 [Flow Control](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/)
 
-### etcd compaction/defragmentation
+### etcd maintenance
 
 etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be
 accessed by a distributed system or cluster of machines. For these reasons, etcd was chosen as the default database
@@ -89,52 +89,13 @@ gives back disk space to the underlying file system and can help bring the clust
 ran out of space earlier.
 
 This can be achieved by providing the necessary flags/parameters to etcd, either via the KubeadmControlPlane or in the
-configuration file of the etcd cluster, if it is managed independent from the Kubernetes cluster.
+configuration file of the etcd cluster, if it is managed independent of the Kubernetes cluster.
 Possible flags, that can be set for this feature, are:
 
 * auto-compaction-mode
 * auto-compaction-retention
 
-etcd cluster defragmentation unfortunately can't be done automatically. Instead the user would need to manually call
-the defrag command on the cluster. In order to mitigate this, a systemd (or similar) job could be created, which
-periodically calls the defragmentation procedure. Unfortunately, simultaneous defragmentation of all members of an etcd
-cluster would block read and write procedures. A preferable strategy to mitigate this would be the following:
-
-* defragment the non leader etcd members first
-* change the leadership to the randomly selected and defragmentation completed etcd member
-* defragment the local (ex-leader) etcd member
-
-This example was taken from the [Maintenance and Troubleshooting page](https://github.com/SovereignCloudStack/k8s-cluster-api-provider/blob/main/doc/Maintenance_and_Troubleshooting.md#defragmentation-and-backup)
-page of the SCS documentation, which was derived in part from the [OpenShift Host Practices](https://docs.openshift.com/container-platform/4.9/scalability_and_performance/recommended-host-practices.html#automatic-defrag-etcd-data_recommended-host-practices).
-
-An example for a defragmentation job, e.g. as a systemd service, and its helpers could be the following:
-
-```bash
-[Unit]
-Description=Run etcdctl defrag
-Documentation=https://etcd.io/docs/v3.3.12/op-guide/maintenance/#defragmentation
-After=network.target
-[Service]
-Type=oneshot
-Environment="LOG_DIR=/var/log"
-Environment="ETCDCTL_API=3"
-ExecStart=/usr/local/sbin/etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt defrag
-[Install]
-WantedBy=multi-user.target
-```
-
-```bash
-[Unit]
-Description=Run etcd-defrag.service every day
-After=network.target
-[Timer]
-OnCalendar=*-*-* 02:00:0
-RandomizedDelaySec=10m
-[Install]
-WantedBy=multi-user.target
-```
-
-More information about compaction and defragmentation can be found in the respective etcd documentation
+More information about compaction can be found in the respective etcd documentation
 [etcd maintenance](https://etcd.io/docs/v3.3/op-guide/maintenance/)
 
 ### etcd backup
@@ -227,7 +188,7 @@ It is also RECOMMENDED to activate the Kubernetes API priority and fairness feat
 which also uses the aforementioned cluster parameters to better queue, schedule and
 prioritize incoming requests.
 
-### etcd compaction/defragmentation
+### etcd compaction
 
 etcd needs to be cleaned up regularly, so that it functions correctly and doesn't take up too much space, which happens
 because of its increase of the keyspace.
@@ -237,13 +198,6 @@ To compact the etcd keyspace, the following flags/parameters MUST be set for etc
 
 * auto-compaction-mode = periodic
 * auto-compaction-retention = 8h
-OPTIONALLY, a cluster defragmentation can be carried out regularly.
-To do this, it is RECOMMENDED to create a systemd (or similar automatic job) in order
-to execute this defragmentation regularly in a fixed timeframe.
-An example for such a systemd job can be found in the chapter [Design Considerations].
-It is important to note, that such a defragmentation could lead to service interruptions.
-Therefore, such a process should at best be carried during times of low traffic in order
-to not disrupt normal workflow.
 
 ### etcd backup
 
@@ -294,4 +248,4 @@ for this, since it is dependent on the CA.
 
 ## Conformance Tests
 
-Conformance Tests, OPTIONAL
+*Conformance Tests, OPTIONAL*