User is not being created from kubernetes secret #4030

Open
batulziiy opened this issue Nov 7, 2024 · 4 comments

Comments

@batulziiy

Overview

I'm experiencing a strange issue: a Postgres user is not being created from a Kubernetes secret. This worked fine for the last few months and several users were created this way, but when I recently tried to create another user, it didn't work.
At first I thought something was wrong with the kustomization, since that's the approach we use for automation. However, I found that the secret is created correctly in the Kubernetes cluster, while the actual database user is not created on the database.

Environment

Please provide the following details:

  • Platform: Kubernetes
  • Platform Version: v1.25.4+k3s1
  • PGO Image Tag: ubi8-15.2-0
  • Postgres Version: 15
  • Storage: longhorn

Steps to Reproduce

REPRO

Provide steps to get to the error condition:

  1. Create a new user via kustomization by adding an entry in the user: username format to the PostgresCluster definition YAML.
  2. Wait for the CD tool to reconcile the change from git to the Kubernetes cluster.
  3. Check whether the user has been created on the database.

EXPECTED

  1. The user is created on the database, matching the Kubernetes secret.

ACTUAL

  1. Although the secret is created in Kubernetes, the actual user doesn't exist on the database.

Logs

I haven't found any logs related to user creation in the database log, the kustomization controller log, Kubernetes events, or the pod log.

Additional Information

Please provide any additional information that may be helpful.

@benjaminjb
Contributor

Hi @batulziiy, sorry you're running into this situation, which is certainly strange.

To make sure I understand, it sounds like you're saying that when you add a user to the postgrescluster.spec.users (here), the K8s secret for that user is created by the operator, but the PG database itself doesn't get that user added. AND to complicate matters, it was working correctly at one point in the past (that is, adding a user to the spec led to the creation of a K8s secret and creating the user in the PG database). Is that the situation?

To start looking at the issue, what PGO version are you running and what does your postgrescluster.spec.users look like? Does this problem happen with any username? I would expect to see some logs about user creation in the operator logs (at least from here), so I'm curious what you're seeing (if you can reproduce this issue).

@batulziiy
Author

batulziiy commented Dec 10, 2024

Hi @benjaminjb, yes, you're correct, that's what I meant. I'm still experiencing the issue and haven't found a solution yet. To answer your questions:

  • The PGO version we're currently running is v15 with image crunchy-postgres:ubi8-15.2-0.
  • The problem happens with any username, and the usernames are quite simple, such as username or username-abc.
  • The bizarre thing is that I don't see any log about user creation in the operator log, so I'm stuck without knowing what to investigate next.

Do you think restarting PG might help resolve the issue?

@benjaminjb
Contributor

Hi @batulziiy, that's interesting. Before we try restarting anything, can you share what PGO version you're running and any logs you see in the operator pod around the time you try to create a new user through the spec?

It's odd to me that the operator would create a secret for a user and then not create the user in PG -- or at least not log the errors. I have messed up creating users once or twice and always looked for the operator log "wrote PostgreSQL users" to help debug the problem.

@batulziiy
Author

batulziiy commented Dec 11, 2024

Thanks @benjaminjb, it turns out I was looking at the wrong log. Following your suggestion, I looked at the PGO log and found something interesting.

time="2024-12-10T17:33:07Z" level=debug msg="wrote PostgreSQL users" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pg-prod-001 namespace=postgres-operator pod=pg-prod-001-instance1-4tcc-0 postgresCluster=postgres-operator/pg-prod-001 reconcileID=3934b68a-6f9b-49c3-b8e1-312f61253307 revision=5f4b7d8fb8 stderr="psql:<stdin>:58: ERROR: database \"aop\" does not exist\n" stdout= version=5.3.0-0
time="2024-12-10T17:33:07Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pg-prod-001 namespace=postgres-operator postgresCluster=postgres-operator/pg-prod-001 reconcileID=3934b68a-6f9b-49c3-b8e1-312f61253307 version=5.3.0-0
time="2024-12-10T17:33:07Z" level=error msg="Reconciler error" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="command terminated with exit code 3" file="internal/controller/postgrescluster/postgres.go:499" func="postgrescluster.(*Reconciler).reconcilePostgresUsersInPostgreSQL" name=pg-prod-001 namespace=postgres-operator postgresCluster=postgres-operator/pg-prod-001 reconcileID=3934b68a-6f9b-49c3-b8e1-312f61253307 version=5.3.0-0
time="2024-12-10T17:37:28Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pg-test-001 namespace=postgres-operator postgresCluster=postgres-operator/pg-test-001 reconcileID=8e4d5d6a-c63d-44f6-a8ac-048951eda12d stderr= stdout="Not changed\n" version=5.3.0-0
time="2024-12-10T17:37:28Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=hippo namespace=postgres-operator postgresCluster=postgres-operator/hippo reconcileID=bb4924ff-0164-4c8f-b0ea-0141c7bbb67f stderr= stdout="Not changed\n" version=5.3.0-0
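
Editor's note: in the excerpt above, the psql failure is carried in the stderr field of a level=debug entry ("wrote PostgreSQL users"), so filtering on level=error alone would miss the actual cause. The following is a minimal sketch of one way to surface such entries from a saved operator log, assuming the key=value format shown above. The function names parse_logfmt and suspicious_entries are hypothetical helpers, not part of PGO, and the sample lines are abbreviated from the log in this comment.

```python
import shlex
from typing import Dict, List


def parse_logfmt(line: str) -> Dict[str, str]:
    """Split a key=value log line into a dict, honoring double-quoted values."""
    fields: Dict[str, str] = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip any token that is not key=value
            fields[key] = value
    return fields


def suspicious_entries(lines: List[str]) -> List[Dict[str, str]]:
    """Keep entries that are logged as errors or that carry non-empty stderr."""
    hits = []
    for line in lines:
        entry = parse_logfmt(line)
        if entry.get("level") == "error" or entry.get("stderr"):
            hits.append(entry)
    return hits


# Abbreviated copies of two entries from the log above (fields trimmed for brevity).
sample = [
    'time="2024-12-10T17:33:07Z" level=debug msg="wrote PostgreSQL users" '
    'name=pg-prod-001 stderr="psql:<stdin>:58: ERROR: database \\"aop\\" does not exist\\n" stdout=',
    'time="2024-12-10T17:37:28Z" level=debug msg="replaced configuration" '
    'name=hippo stderr= stdout="Not changed\\n"',
]

for entry in suspicious_entries(sample):
    print(entry["name"], "|", entry["msg"], "|", entry["stderr"])
```

Only the pg-prod-001 entry is reported here, because it is the only one with a non-empty stderr field.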

Now I'm sure that when PGO tries to apply the changes from the kustomization, it gets stuck at this database because it doesn't exist. I will try creating an empty database with the same name and let you know what happens.
In other words, users are defined in the kustomization YAML along with their databases, as below (but the aop db was deleted at some point):

users:
  - name: user1
    databases:
      - aop
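
Editor's note: following the diagnosis in this comment (a user entry that references a database which no longer exists), a pre-flight check could flag such entries before the change is committed. This is a minimal sketch under stated assumptions: the users list has already been loaded into plain Python data, and the list of existing databases is obtained separately. The function name find_missing_databases and the existing_databases values are hypothetical and not part of PGO.

```python
from typing import Dict, Iterable, List, Set


def find_missing_databases(users: List[dict], existing: Iterable[str]) -> Dict[str, List[str]]:
    """Map each user name to the databases it references that do not exist."""
    known: Set[str] = set(existing)
    missing: Dict[str, List[str]] = {}
    for user in users:
        absent = [db for db in user.get("databases", []) if db not in known]
        if absent:
            missing[user["name"]] = absent
    return missing


# Same shape as the users fragment above, written as plain Python data.
users_spec = [{"name": "user1", "databases": ["aop"]}]

# Hypothetical list of databases currently present in the cluster.
existing_databases = ["postgres", "template1"]

print(find_missing_databases(users_spec, existing_databases))
```

With the hypothetical inputs above, user1 is reported as referencing the missing aop database. An empty result means every referenced database is present.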
