How to use hcloud ccm with CAPH bare metal? #702

Closed
chess-knight opened this issue Jul 19, 2024 · 6 comments
Labels
question Further information is requested

Comments

@chess-knight

I heard that someone from Hetzner successfully used it in the past, but according to my tests, at least one manual step is needed to keep Cluster API happy. The issue is the providerID, which differs between the two projects for Robot servers, so CAPI cannot pair nodes with Machine objects. For more details, see SovereignCloudStack/cluster-stacks#125 (comment)

@guettli
Contributor

guettli commented Jul 29, 2024

Just for the record, this issue tracked bare-metal support. AFAIK it is implemented:

#330

We at Syself have a fork (which supported bare-metal before hcloud-ccm did):

https://github.com/syself/hetzner-cloud-controller-manager

I am not happy with the current situation to have two CCMs. Sooner or later we want to solve that.

@chess-knight
Author

I know it is implemented, and almost everything works nicely for me. But when creating clusters with CAPH, the providerID field ends up inconsistent. As I wrote in the linked comment, CAPH sets providerID to hcloud://bm-$SERVER_NUMBER on the bare-metal infrastructure machine objects, while hcloud-ccm sets hrobot://$SERVER_NUMBER on the workload cluster's nodes. See also SovereignCloudStack/cluster-stacks#125 (comment)
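
To make the mismatch concrete, here is a minimal shell sketch. It assumes that pairing boils down to an exact string comparison of the two providerID values, which is a simplification of what CAPI does. The server number and the helper `provider_ids_match` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical Robot server number, for illustration only.
server_number=1234567

# Format CAPH sets on the bare-metal infrastructure machine object.
caph_id="hcloud://bm-${server_number}"
# Format hcloud-ccm sets on the workload cluster node.
hccm_id="hrobot://${server_number}"

# Assumption: pairing is modelled as an exact string comparison.
provider_ids_match() {
  [ "$1" = "$2" ]
}

if provider_ids_match "$caph_id" "$hccm_id"; then
  echo "match"
else
  echo "mismatch: $caph_id vs $hccm_id"
fi
```

Both strings carry the same server number, but the scheme and prefix differ, so an exact comparison fails.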

@apricote
Member

apricote commented Aug 8, 2024

hcloud-cloud-controller-manager supports Robot Servers. We merged the necessary code from syself/hetzner-cloud-controller-manager at the end of last year. You can check out #523 for details on this merge, and the full design doc we have written for it. At that time, the design doc was shared with @batistein.

The design doc has the following considerations for the Provider ID. The implementation matches this plan:

> We always need to know which nodes belong to which "source". We can save this info to the ProviderID field. Our existing Cloud servers use the pattern hcloud://. For Robot, we will use hrobot://. This differs from the Syself Fork, they use hcloud://bm-. We will also allow reading the Syself format, to enable users to migrate from the fork to our HCCM.
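
The reading side of this plan can be sketched in shell. This is not the HCCM implementation, only an illustration of accepting both Robot formats while keeping Cloud IDs separate. The function name `classify_provider_id` is hypothetical, and the sketch assumes the ID after the prefix is a plain decimal number.

```shell
#!/bin/sh
# Print "<source> <id>" for a provider ID, or fail for unknown input.
# Assumption: the ID part is a decimal number.
classify_provider_id() {
  case "$1" in
    hrobot://*)    src=robot; id=${1#hrobot://} ;;
    # Legacy Syself format; must be checked before the generic hcloud:// case.
    hcloud://bm-*) src=robot; id=${1#hcloud://bm-} ;;
    hcloud://*)    src=cloud; id=${1#hcloud://} ;;
    *) echo "unknown provider ID: $1" >&2; return 1 ;;
  esac
  # Reject empty or non-numeric IDs.
  case "$id" in
    ''|*[!0-9]*) echo "invalid ID in provider ID: $1" >&2; return 1 ;;
  esac
  printf '%s %s\n' "$src" "$id"
}
```

The order of the `case` branches matters: `hcloud://bm-*` also matches the generic `hcloud://*` pattern, so the more specific prefix has to come first.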

IMO this new format should be added to CAPH if it wants to work with hcloud-cloud-controller-manager.

> I am not happy with the current situation to have two CCMs. Sooner or later we want to solve that.

I was hoping that with the merge, there was no longer a reason for the syself fork and you would migrate your users to hcloud-cloud-controller-manager.

@chess-knight
Author

Yes, migration is possible for the existing clusters. However, for the new clusters, the provider ID simply differs.

> IMO this new format should be added to CAPH if it wants to work with hcloud-cloud-controller-manager.

I think the same, for the reasons I wrote above. CAPH also recently updated its docs for hcloud clusters in syself/cluster-api-provider-hetzner#1401, but for bare-metal servers the docs still point to the Syself fork of HCCM.

@chess-knight
Author

I found that the mentioned manual workaround KUBE_EDITOR="sed -i 's#hcloud://bm-#hrobot://#'" kubectl edit hetznerbaremetalmachine works for now. However, the CAPH CSR controller also uses the "wrong" ProviderID when constant hostnames are used for bare-metal servers. With this feature enabled, CAPH cannot pair nodes, so kubelet-serving CSRs stay in a pending state (e.g. kubectl logs/exec/... then do not work in the workload cluster). This also needs to be fixed.
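
To see what the sed expression in this workaround does without touching a cluster, it can be run against a sample line. The wrapper `rewrite_provider_id` and the server number are hypothetical. Only the substitution itself is taken from the command above, and `-i` is dropped because the sketch reads from stdin instead of editing a file.

```shell
#!/bin/sh
# Apply the same substitution as the workaround, reading from stdin.
# '#' is the sed delimiter, so the slashes in the prefixes need no escaping.
rewrite_provider_id() {
  sed 's#hcloud://bm-#hrobot://#'
}

# Hypothetical manifest line with a made-up server number.
printf '  providerID: hcloud://bm-1234567\n' | rewrite_provider_id
```

This prints the line with `hrobot://1234567` as the provider ID. A Cloud server ID such as `hcloud://42` does not contain the `hcloud://bm-` prefix and passes through unchanged.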

@apricote
Member

As this is a missing feature in cluster-api-provider-hetzner, I have opened an issue on that repository: syself/cluster-api-provider-hetzner#1470

I will close this issue.

Please open a new issue if there are any features missing in hcloud-cloud-controller-manager that would be required for cluster-api-provider-hetzner.
