📖 Add docs for cluster stack operator (#32)
Adding docs to understand, develop, and use the cluster stacks and the cluster stack operator. Signed-off-by: janiskemper <janis.kemper@syself.com>
1 parent ad039ca, commit 4c4f906
Showing 18 changed files with 431 additions and 71 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# Documentation Index | ||
|
||
## General | ||
- [Concept](concept.md) | ||
- [Terminology](terminology.md) | ||
|
||
## Quickstart | ||
- [Quickstart](topics/quickstart.md) | ||
- [Cluster API quick start](https://cluster-api.sigs.k8s.io/user/quick-start.html) | ||
|
||
## Architecture | ||
- [Overview](architecture/overview.md) | ||
- [User flow](architecture/user-flow.md) | ||
- [Workflow - Node images](architecture/node-image-flow.md) | ||
- [Workflow - Management Cluster](architecture/mgt-cluster-flow.md) | ||
- [Workflow - Workload Cluster](architecture/workload-cluster-flow.md) | ||
|
||
## Topics | ||
- [Managing ClusterStack resources](topics/managing-clusterstacks.md) | ||
- [Upgrade flow](topics/upgrade-flow.md) | ||
- [Troubleshooting](topics/troubleshoot.md) | ||
|
||
## Developing | ||
- [Development guide](develop/develop.md) | ||
- [Develop provider integrations](develop/provider-integration.md) | ||
|
||
## Reference | ||
- [General](reference/README.md) | ||
- [ClusterStack](reference/clusterstack.md) | ||
- [ClusterStackRelease](reference/clusterstackrelease.md) | ||
- [ClusterAddon](reference/clusteraddon.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# Management Cluster flow | ||
|
||
A Cluster API management cluster runs the Cluster API operators. In our setup, the management cluster also runs the Cluster Stack operators. | ||
|
||
The user controls workload clusters via custom resources. As the Cluster Stack approach uses `ClusterClasses`, the user has to create only a `Cluster` object and refer to a `ClusterClass`. | ||
|
||
For this to work, however, the `ClusterClass` has to be applied along with all other Cluster API objects that it references, such as `MachineTemplates`. | ||
|
||
These Cluster API objects are packaged in a Helm chart that is part of every cluster stack. The clusterstackrelease-controller is responsible for applying this Helm chart: it first calls `helm template` and then uses the "apply" method of the Kubernetes Go client. | ||
|
||
The main resource is always the `ClusterClass`. It follows a very specific naming pattern and has exactly the same name as the `ClusterStackRelease` object that manages it, for example `docker-ferrol-1-27-v1`. This name encodes all defining properties of a specific release of a cluster stack for a certain provider. |
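
The naming pattern can be illustrated with a minimal sketch. It assumes that the name is composed as provider, cluster stack name, Kubernetes version (with dots replaced by hyphens), and release version, which is inferred only from the example `docker-ferrol-1-27-v1`. The function `releaseName` is hypothetical and is not taken from the operator's code base.

```go
package main

import (
	"fmt"
	"strings"
)

// releaseName builds the name shared by a ClusterStackRelease and its ClusterClass.
// Assumed pattern (inferred from the example "docker-ferrol-1-27-v1"):
// <provider>-<stack name>-<Kubernetes major>-<Kubernetes minor>-<release version>.
func releaseName(provider, stack, kubernetesVersion, release string) string {
	// The example name contains no dots, so "1.27" becomes "1-27".
	k8s := strings.ReplaceAll(kubernetesVersion, ".", "-")
	return strings.Join([]string{provider, stack, k8s, release}, "-")
}

func main() {
	fmt.Println(releaseName("docker", "ferrol", "1.27", "v1"))
}
```

Because the `ClusterClass` and the `ClusterStackRelease` share this name, either object can be looked up from the other without additional references.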
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Node image flow | ||
|
||
The node image flow depends on each provider. There are various ways in which providers allow the use of custom images. We have documented the options in the [cluster stacks repo](https://github.com/SovereignCloudStack/cluster-stacks#film_strip-node-images). | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
# Architecture | ||
|
||
![Cluster Stacks](../pics/syself-cluster-stacks-web.png) | ||
|
||
## Cluster stacks | ||
|
||
The cluster stacks are opinionated templates of clusters in which all configuration and all core components are defined. They can be implemented on any provider. | ||
|
||
There can be multiple cluster stacks, which reflects the many ways in which a cluster can be set up. There is no right or wrong, and cluster stacks make sure that this flexibility is not lost. | ||
|
||
At the same time, they offer ready-made templates for users, who do not have to spend a lot of thought on how to build clusters so that everything works well together. | ||
|
||
Cluster stacks are implemented by two Helm charts. The first one contains all Cluster API objects and is applied in the management cluster. The second Helm chart contains the cluster addons, i.e. the core components every cluster needs, and is installed in the workload clusters. | ||
|
||
Furthermore, there are node images that can look quite different depending on the provider. | ||
|
||
To sum up, there are three components of a cluster stack: | ||
|
||
1. Cluster addons: The cluster addons (CNI, CSI, CCM) have to be applied in each workload cluster that the user starts. | ||
2. Cluster API objects: The `ClusterClass` object makes it easier to use Cluster API. The cluster stack contains a `ClusterClass` object and the other Cluster API objects that are necessary in order to use the `ClusterClass`. These objects have to be applied in the management cluster. | ||
3. Node images: Node images can be provided to the user in different forms. They are released and tested together with the other two components of the cluster stack. | ||
|
||
More information about cluster stacks and their three parts can be found in https://github.com/SovereignCloudStack/cluster-stacks/blob/main/README.md. | ||
|
||
## Cluster Stack Operator | ||
|
||
The Cluster Stack Operator takes care of all steps that have to be done in order to use a certain cluster stack implementation. | ||
|
||
It has to be installed in the management cluster and can be interacted with by applying custom resources. It extends the functionality of the Cluster API operators. | ||
|
||
The Cluster Stack Operator mainly applies the two Helm charts from a cluster stack implementation. It can also automatically check a remote GitHub repository for new releases of a certain cluster stack. | ||
|
||
The first two components of a cluster stack, the cluster addons and the Cluster API objects, are handled by the Cluster Stack Operator. | ||
|
||
The node images, on the other hand, have to be handled by separate provider integrations, similar to the ones that [Cluster-API uses](https://cluster-api.sigs.k8s.io/developer/providers/implementers-guide/overview). | ||
|
||
## Cluster Stack Provider Integrations | ||
|
||
The Cluster Stack Operator is accompanied by Cluster Stack Provider Integrations. A provider integration is also an operator that works together with the Cluster Stack Operator in a specific way, which is described in the docs about building [provider integrations](../develop/provider-integration.md). | ||
|
||
A provider integration makes sure that the node images are taken care of and made available to the user. | ||
|
||
If there is no work to be done for node images, then the Cluster Stack Operator can work in `noProvider` mode and this Cluster Stack Provider Integration can be omitted. | ||
|
||
## Steps to make cluster stacks ready to use | ||
|
||
There are many steps that are needed in order to make cluster stacks ready to use. In order to understand the full flow better and to get an idea of how much work there is and how many personas are involved, we will give an overview of how to start from scratch with a new cluster stack and provider. | ||
|
||
We will assume that this operator exists, but that you want to use a new cluster stack and provider. | ||
|
||
### Defining a cluster stack | ||
|
||
First, you need to define your cluster stack. Which cluster addons do you need? What do your node images look like? You need to make these decisions and write them down. | ||
|
||
### Implementing a cluster stack | ||
|
||
The next step is to implement your cluster stack for your provider. You can take existing implementations as a reference, but you need to consider how the provider-specific custom resources are named and how the respective Cluster API Provider Integration works. | ||
|
||
### Implementing a Cluster Stack Provider Integration | ||
|
||
We assume that you need to do some manual tasks in order to make node images accessible on your provider. These steps should be implemented in a Cluster Stack Provider Integration, which of course has to work together with the details of how you implemented your cluster stack. | ||
|
||
### Using everything | ||
|
||
Finally, you can use the new cluster stack you defined and implemented on the infrastructure of your provider. Enjoy! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Deep dive: User flow | ||
|
||
It is essential to understand the flow of what you have to do as a user and what happens in the background. | ||
|
||
The [Quickstart guide](../topics/quickstart.md) goes over all the small steps you have to do. If you are just interested in getting started, then have a look there. | ||
|
||
In the following, we will not go into the details of every command, but will focus on a high-level view of what you have to do and what happens in the background. | ||
|
||
## Steps to create a workload cluster | ||
|
||
### Get the right cluster stacks | ||
|
||
The first step is to make sure that the cluster stacks you want to use have been implemented. Usually, you will use cluster stacks that others have implemented for the provider that you want to use. However, you can also build your own cluster stacks. | ||
|
||
### Apply cluster stack resource | ||
|
||
If you have everything available, you can start your management cluster / bootstrap cluster. In this cluster, you have to apply the `ClusterStack` custom resource with your individual desired configuration. | ||
|
||
Depending on your configuration, you will have to wait until all steps are done in the background. | ||
|
||
The operator will perform all necessary steps to provide you with node images. If all node images are ready, it will apply the Cluster API resources that are required. | ||
|
||
At the end, you will have node images and Cluster API objects ready to use. Only one more step is needed to create a cluster. | ||
|
||
### Use the ClusterClasses | ||
|
||
You can see in the status of the `ClusterStack` object whether the previous step is done. Alternatively, you can check whether certain `ClusterClass` objects exist, because the Cluster Stack Operator applies them as well. They follow a certain naming pattern: if you have the cluster stack "ferrol" for the docker provider and Kubernetes version 1.27 in version "v1", you will see a `ClusterClass` named "docker-ferrol-1-27-v1". | ||
|
||
You can use this `ClusterClass` by referencing it in a `Cluster` object. For details, you can check out the official Cluster API documentation. | ||
|
||
### Wait until cluster addons are ready | ||
|
||
Once you have created a workload cluster by applying a `Cluster` object, the cluster addons are applied automatically. You just have to wait until everything is ready, e.g. until the CCM and CNI are installed. | ||
|
||
## Recap - how do Cluster API and Cluster Stacks work together? | ||
|
||
The user triggers the flow by configuring and applying a `ClusterStack` custom resource. This triggers some work in the background to make node images and Cluster API objects ready to use. | ||
|
||
This process is completed when a `ClusterClass` with a certain name is created. This `ClusterClass` resource can then be used to create as many clusters as you want, all of which look like the template specified in the `ClusterClass`. | ||
|
||
Upgrades of clusters are done by changing the reference to a new `ClusterClass`, e.g. from `docker-ferrol-1-27-v1` to `docker-ferrol-1-27-v2`. | ||
|
||
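
To make the upgrade step more concrete, here is a minimal sketch that parses two `ClusterClass` names and checks that they belong to the same provider and cluster stack. It assumes the naming pattern that can be inferred from `docker-ferrol-1-27-v1` (provider first, then the stack name, then two Kubernetes version segments, then the release version). The type `stackRelease` and the functions `parseClassName` and `sameStack` are hypothetical helpers for illustration and are not part of the operator.

```go
package main

import (
	"fmt"
	"strings"
)

// stackRelease holds the parts encoded in a ClusterClass name (hypothetical type).
type stackRelease struct {
	Provider          string
	Stack             string
	KubernetesVersion string
	Version           string
}

// parseClassName splits a name such as "docker-ferrol-1-27-v1".
// The last three segments are read from the right, so a stack name that
// itself contains hyphens would still be kept together.
func parseClassName(name string) (stackRelease, error) {
	parts := strings.Split(name, "-")
	n := len(parts)
	if n < 5 {
		return stackRelease{}, fmt.Errorf("unexpected ClusterClass name %q", name)
	}
	return stackRelease{
		Provider:          parts[0],
		Stack:             strings.Join(parts[1:n-3], "-"),
		KubernetesVersion: parts[n-3] + "." + parts[n-2],
		Version:           parts[n-1],
	}, nil
}

// sameStack reports whether two different ClusterClass names belong to the
// same provider and cluster stack, i.e. whether "to" is a plausible target
// when changing the reference from "from".
func sameStack(from, to string) (bool, error) {
	a, err := parseClassName(from)
	if err != nil {
		return false, err
	}
	b, err := parseClassName(to)
	if err != nil {
		return false, err
	}
	return from != to && a.Provider == b.Provider && a.Stack == b.Stack, nil
}

func main() {
	ok, err := sameStack("docker-ferrol-1-27-v1", "docker-ferrol-1-27-v2")
	fmt.Println(ok, err)
}
```

A check like this is only a convenience for tooling around the flow; the upgrade itself is still nothing more than changing the reference in the `Cluster` object.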
To sum up: The Cluster Stack Operator takes care of steps that you would otherwise have to do manually. It does not change anything in the normal Cluster API flow, except that it enforces the use of `ClusterClasses`. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# The workload cluster flow | ||
|
||
The workload cluster flow is implemented by two controllers and one custom resource. | ||
|
||
The `ClusterAddon` resource gets created by the ClusterAddonCreate controller for any `Cluster` resource that is applied. | ||
|
||
The user never interacts with the `ClusterAddon` resource as it is created, updated, and deleted automatically. | ||
|
||
It is updated by the ClusterAddon controller, which makes sure that all cluster addons are applied in the respective workload cluster. | ||
|
||
The controller follows a simple pattern. When a cluster is created, it waits until the cluster is ready and then applies all objects from the ClusterAddon Helm chart. | ||
|
||
If a cluster is updated, it checks whether the cluster addons have been updated and applies the objects again only if that is the case. It also deletes objects that existed in the previous version but are no longer present. | ||
|
||
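
The cleanup of objects that are no longer part of the cluster addons can be thought of as a set difference. The following is a minimal sketch, assuming that each applied object can be identified by a string key such as `Kind/namespace/name`. The key format and the function `removedObjects` are assumptions made for illustration, not the controller's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// removedObjects returns the keys that were part of the previous version of
// the cluster addons but are missing from the current one, in sorted order.
func removedObjects(previous, current []string) []string {
	keep := make(map[string]struct{}, len(current))
	for _, key := range current {
		keep[key] = struct{}{}
	}
	var removed []string
	for _, key := range previous {
		if _, ok := keep[key]; !ok {
			removed = append(removed, key)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	// Hypothetical object keys of two consecutive addon versions.
	previous := []string{"ConfigMap/kube-system/cni-config", "DaemonSet/kube-system/cni"}
	current := []string{"DaemonSet/kube-system/cni"}
	fmt.Println(removedObjects(previous, current))
}
```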
Applying the objects involves one additional step: we borrow the idea of value substitutions from the cluster-api-addon-provider-helm (https://github.com/kubernetes-sigs/cluster-api-addon-provider-helm/blob/main/internal/value_substitutions.go) and inject a few details about the `Cluster` and the `ProviderCluster`. | ||
|
||
This is necessary because normal templating cannot inject these values, which are only available at runtime but are very important to the resources that we apply as cluster addons. | ||
|
||
As this controller relies on the release assets to be downloaded - as do other controllers that do not download anything themselves - there is one issue after a container restart that we have to solve: | ||
|
||
If the container restarts, everything that was stored in memory, or in the container without an external volume, is lost. Therefore, after a container restart, the release assets have to be fetched from GitHub again. | ||
|
||
This takes a bit of time, even if it is just one second. If a `ClusterAddon` is reconciled within this one second, the controller notices that the desired file is not available yet. Instead of returning an error, it simply requeues. | ||
|
||
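
The requeue behaviour described above can be sketched as follows. The sketch uses only the Go standard library. The `result` type merely mimics the idea of a reconcile result with a requeue delay, and the function `reconcileAddon`, the two-second delay, and the asset path are assumptions chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"time"
)

// result is a hypothetical stand-in for a reconcile result.
// A non-zero RequeueAfter means "try again after this delay".
type result struct {
	RequeueAfter time.Duration
}

// reconcileAddon checks whether the release asset has been downloaded yet.
// A missing file is treated as a temporary state, not as an error.
func reconcileAddon(assetPath string) (result, error) {
	_, err := os.Stat(assetPath)
	if errors.Is(err, fs.ErrNotExist) {
		// The assets have not been fetched yet, e.g. right after a restart.
		return result{RequeueAfter: 2 * time.Second}, nil
	}
	if err != nil {
		return result{}, fmt.Errorf("checking release asset %q: %w", assetPath, err)
	}
	// The asset is available; applying the cluster addons would continue here.
	return result{}, nil
}

func main() {
	res, err := reconcileAddon("/nonexistent-release-assets/cluster-addon.tgz")
	fmt.Println(res.RequeueAfter, err)
}
```

Returning a requeue delay instead of an error keeps the error state reserved for real problems, while the missing file is handled as a normal waiting state.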
The same pattern is followed in all other controllers as well, if needed. | ||
|
||
This controller also sets meaningful conditions in the status of the objects so that the user can understand what is going on. |