You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
If a Teleport component fails to execute its heartbeat procedure, it will enter a degraded state. Teleport will begin recovering from this state when a heartbeat completes successfully.
The first successful heartbeat will transition Teleport into a recovering state.
A second consecutive successful heartbeat will cause Teleport to transition to the OK state.
Teleport heartbeats run approximately every 60 seconds when healthy, and failed heartbeats are retried approximately every 5 seconds. This means that depending on the timing of heartbeats, it can take 60-70 seconds after connectivity is restored for /readyz to start reporting healthy again.
Instead, if a proxy instance (tested with no other roles than proxy) loses connectivity with the Auth service, it will flap between states in a loop.
I added additional logs in a test build (based on 16.4.12) to capture the behavior in more detail.
The fetchAndAnnounce() function in srv/heartbeat.go:624 is called once every 5 seconds. Depending on the current heartbeat state, it will take a different action. If it encounters an error, then a TeleportDegraded event is emitted by the caller. If there is no error, then TeleportOK is emitted.
If the state is HeartbeatStateAnnounce, then it will attempt to do the heartbeat announce. If there is a problem, it returns an error, and a TeleportDegraded event is emitted. The issue seems to stem from the fact that it will always emit a TeleportOK event if the heartbeat status is in HeartbeatStateAnnounceWait, even if it hasn't successfully emitted an event. This doesn't match the behavior as described in the documentation.
A Table showing the expected and observed states and events fired over time
Time
interval
Heartbeat State
Actual Event
Expected Event
Actual Process State
Expected Process State
actual readyz
12/26/2024 15:35:01
HeartbeatStateInit
-
-
stateStarting
stateStarting
teleport is starting and hasn't joined the cluster yet
12/26/2024 15:35:13
0:00:12
HeartbeatStateAnnounceWait
TeleportOKEvent
TeleportOKEvent
stateOK
stateOK
ok
12/26/2024 15:35:16
0:00:03
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:21
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:22
------ ERROR CONDITION INTRODUCED ---------
12/26/2024 15:35:26
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:35:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:11
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:16
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:21
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:26
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:36:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:11
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:16
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:21
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:26
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:37:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:11
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:16
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:21
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:26
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:38:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:11
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:16
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:21
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:26
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:39:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:40:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:40:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:40:11
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:40:16
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateOK
ok
12/26/2024 15:40:21
0:00:05
HeartbeatStateAnnounce
TeleportDegraded
TeleportDegraded
stateDegraded
stateDegraded
teleport is in a degraded state, check logs for details
12/26/2024 15:40:26
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateRecovering
stateDegraded
teleport is recovering from a degraded state, check logs for details
12/26/2024 15:40:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:40:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:40:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:40:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:40:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:40:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:11
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:16
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:21
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:26
0:00:05
HeartbeatStateAnnounce
TeleportDegraded
TeleportDegraded
stateDegraded
stateDegraded
teleport is in a degraded state, check logs for details
12/26/2024 15:41:31
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateRecovering
stateDegraded
teleport is recovering from a degraded state, check logs for details
12/26/2024 15:41:36
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:41
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:46
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:51
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:41:56
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:42:01
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
12/26/2024 15:42:06
0:00:05
HeartbeatStateAnnounceWait
TeleportOKEvent
None / no change
stateOK
stateDegraded
ok
A load balancer that is trying to use /readyz to determine whether to send traffic to this proxy will get inconsistent data. It will show as degraded or recovering for a brief time, and then go back to ok. This doesn't seem like the intended behavior, but if it is, the documentation should reflect it.
Steps to Reproduce:
Start a proxy-only Teleport instance.
Block access to the Teleport auth service. I used a kubectl port-forward, so I just stopped the process so the proxy would see "connection refused"
Watch the /readyz response in a loop over time. Note that it will only show "degraded" and then "recovering" in 5 second windows each, and then return to ok.
The text was updated successfully, but these errors were encountered:
According to https://goteleport.com/docs/admin-guides/management/diagnostics/monitoring/#readyz-and-heartbeats the healthcheck should behave like this:
Instead, if a proxy instance (tested with no other roles than proxy) loses connectivity with the Auth service, it will flap between states in a loop.
I added additional logs in a test build (based on 16.4.12) to capture the behavior in more detail.
The
fetchAndAnnounce()
function in srv/heartbeat.go:624 is called once every 5 seconds. Depending on the current heartbeat state, it will take a different action. If it encounters an error, then a TeleportDegraded event is emitted by the caller. If there is no error, then TeleportOK is emitted.If the state is
HeartbeatStateAnnounce
, then it will attempt to do the heartbeat announce. If there is a problem, it returns an error, and a TeleportDegraded event is emitted. The issue seems to stem from the fact that it will always emit aTeleportOK
event if the heartbeat status is inHeartbeatStateAnnounceWait
, even if it hasn't successfully emitted an event. This doesn't match the behavior as described in the documentation.A Table showing the expected and observed states and events fired over time
A load balancer that is trying to use
/readyz
to determine whether to send traffic to this proxy will get inconsistent data. It will show as degraded or recovering for a brief time, and then go back took
. This doesn't seem like the intended behavior, but if it is, the documentation should reflect it.Steps to Reproduce:
kubectl port-forward
, so I just stopped the process so the proxy would see "connection refused"/readyz
response in a loop over time. Note that it will only show "degraded" and then "recovering" in 5 second windows each, and then return took
.The text was updated successfully, but these errors were encountered: