
Add support for Container CPU Utilization Resource Monitor #37792

Open
wants to merge 33 commits into base: main

Conversation

nix1n

@nix1n nix1n commented Dec 23, 2024

Risk Level: Low
Testing: Unit tests
Docs Changes:
Release Notes:
Platform Specific Features: Today this is only implemented for Linux

In my org we use Envoy in a containerised K8s environment. This PR will allow us to trigger overload actions based on the container CPU utilization of the pod where Envoy is running.


Hi @nix1n, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.



CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).


…edding in K8s environment

Signed-off-by: nix1n <nikhil.murari@hotstar.com>
@KBaichoo KBaichoo self-assigned this Dec 23, 2024
Contributor

@KBaichoo KBaichoo left a comment


Thank you for working on this! I think we can try to reuse some of the existing components from the other cpu monitor to make this even better.

previous_envoy_container_stats_ = envoy_container_stats_reader_->getEnvoyContainerStats();
}

void EnvoyContainerCpuUtilizationMonitor::updateResourceUsage(Server::ResourceUpdateCallbacks& callbacks) {
Contributor


If we did the folding, this file would effectively merge with the existing https://github.com/envoyproxy/envoy/blob/main/source/extensions/resource_monitors/cpu_utilization/cpu_utilization_monitor.h, which looks very similar to me.

@KBaichoo
Contributor

/wait

nix1n added 2 commits January 1, 2025 14:06
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
nix1n and others added 3 commits January 6, 2025 15:06
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
@@ -13,7 +14,16 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// [#protodoc-title: CPU utilization]
// [#extension: envoy.resource_monitors.cpu_utilization]

// The CPU utilization resource monitor reports the Envoy process the CPU Utilization of the entire host.
// The CPU utilization resource monitor reports the Envoy process the,
Contributor


I'm having trouble parsing this sentence. Can you please clarify?

Contributor

@KBaichoo KBaichoo left a comment


Thanks for working on simplifying this; it looks much better.

/wait

@@ -22,13 +22,26 @@ struct CpuTimes {
uint64_t total_time;
};

struct CgroupStats {
Contributor


What is the reason we can't use the existing cpu times structure given it has effectively the same fields?

This would simplify the utilization monitor as it could then just use the CpuStatsReader interface vs the concrete implementation class.

Author


@KBaichoo actually, for host CPU utilization we only need the /proc/stat file, which gives us cpu_work and cpu_times, both fields of CpuTimes.
To calculate container CPU utilization, we need to read two different files: one for the allocated quota at a point in time and one for the total usage in nanoseconds at that time, plus the time difference, which I could only obtain from the timer. We use this method in our Ambassador Edge Stack service as well, but there the injected resource pressure is updated from a Python script running in parallel, using the same calculation strategy.

ref: google/cadvisor#2026 (comment)

So I created another class to read cgroup stats, which reads the allocated quota and the total CPU time in nanoseconds.

This could be merged into the CpuTimes class itself, but I am not sure what meaningful names would suffice for the two fields, even though both are uint64_t. The reader class would then also need access to the config mode, and would have to calculate and return stats according to that mode.

Author


The time difference can be calculated via a TimeSource only in the cgroup metrics case. We don't need it when calculating host usage from the /proc/stat file, since all the data there is already time dependent.

Contributor


This is a good point: if we were to push down the time source into the cgroup stats reader (along with some other logic) we would be able to produce the same format for CpuTimes -- e.g. https://github.com/envoyproxy/envoy/pull/37792/files#diff-1183c2c3937672e9d4c85d700d1e54ef13d360b4b517a0c643a8e46c13c3eb79R120

We could then de-dup a lot of the implementation details at the monitor layer, so the monitoring layer doesn't have to carry implementations for all of the different readers.

Author


@KBaichoo in that case, in the cgroup stats reader, I would have to derive a calculation that incorporates timing into the allocated-millicores and usage-seconds-total metrics, to make them equivalent to the CpuTimes fields total_time and work_time? Is that what you mean? And the resource monitor would then keep using the original strategy, without checking the config mode and switching calculation strategies?

Author


This is a good point, if we were to push down the time source into the cgroup stats reader (along with some other logic) we would be able to produce the same format for CpuTimes -- e.g.

I have checked the calculation strategy; it should work. I am now trying to push down the TimeSource from the context into the stats reader.

Author


@KBaichoo /proc/uptime has precision to the 100th of a second. We can use it without a TimeSource as long as refresh_interval is not set below 0.01 seconds. But where should that limit be set? How small a refresh_interval can be configured? Is that documented anywhere? We use a 5-second refresh_interval for load shedding in our Ambassador Edge Stack. Would there be any use case where devs set a refresh interval below 0.01 seconds?

Author


Using /proc/uptime metrics for the calculation gets rid of the TimeSource in the reader class and is a much simpler implementation, with one limitation: a refresh_interval below 0.01 seconds won't work properly, though resource pressure would still be updated as soon as the monitor's loop interval crosses 0.01 seconds. For a refresh_interval greater than 0.01 seconds it would be accurate. Please let me know if you will allow this; for us it is sufficient.

Author


I have implemented all tests and functionality using /proc/uptime, @KBaichoo. This should be sufficient for us.

Contributor


I might be missing something, why do we need to use uptime vs the timesource? ISTM it might be cheaper to use timesource e.g. no file open, etc. and no limitation on uptime granularity.

100% agree that for most use cases polling resource monitors faster is a great way to eat up CPU.

@@ -20,6 +22,16 @@ class LinuxCpuStatsReader : public CpuStatsReader {
const std::string cpu_stats_filename_;
};

class LinuxContainerCpuStatsReader: public CgroupStatsReader {
Contributor


as mentioned above in https://github.com/envoyproxy/envoy/pull/37792/files#diff-9281e66aafccb8196311602044a32a4ac53a877d7bae55cc591df8f30ae15810R25 if we go that route this can then just inherit directly from CpuStatsReader interface.

Author


@KBaichoo I have gotten rid of the TimeSource, removed multiple redundancies, and simplified the monitor as well. Our org would not use a refresh_interval smaller than even 1 second, and Envoy's CPU overhead increases as the refresh_interval gets smaller. So I am sticking with the /proc/uptime stats, which have a precision of 0.01 seconds; that is more than enough for most use cases. From now on the Linux CPU stats reader will send stats to the monitor based on the configured strategy.

Signed-off-by: nix1n <nikhil.murari@hotstar.com>
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
Contributor

@KBaichoo KBaichoo left a comment


/wait

@@ -35,7 +35,7 @@ class CpuUtilizationMonitor : public Server::ResourceMonitor {
std::unique_ptr <CgroupStatsReader> cgroup_stats_reader_;
TimeSource& time_source_;
MonotonicTime last_update_time_;
envoy::extensions::resource_monitors::cpu_utilization::v3::CpuUtilizationConfig_UtilizationComputeStrategy mode_;
int16_t mode_ = -1; // Will be updated in Resource Monitor Class Constructor
Contributor


To be more specific, I meant using the non-mangled name, like:

envoy::extensions::resource_monitors::cpu_utilization::v3::CpuUtilizationConfig::UtilizationComputeStrategy

Author


I did try this earlier; it won't build. Only the previous two ways compiled successfully for me. If we use this, it only builds when we add a default to the switch statement, which you said to avoid.
image

Author


image

Author


We could try something like this then: CONTAINER is the specific mode, and the rest fall through to the default computeHostUsage().
image


Signed-off-by: nix1n <nikhil.murari@hotstar.com>
nix1n and others added 6 commits January 8, 2025 17:41
Signed-off-by: nix1n <138643332+nix1n@users.noreply.github.com>
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
Author

@nix1n nix1n left a comment


/ready

nix1n and others added 4 commits January 9, 2025 14:48
Signed-off-by: nix1n <nikhil.murari@hotstar.com>
Contributor

@KBaichoo KBaichoo left a comment


This is looking much better and almost ready to land. Thank you for iterating on this @nix1n!

@@ -13,7 +14,16 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// [#protodoc-title: CPU utilization]
// [#extension: envoy.resource_monitors.cpu_utilization]

// The CPU utilization resource monitor reports the Envoy process the CPU Utilization of the entire host.
// Today, this only works on Linux and is calculated using the stats in the /proc/stat file.
// The CPU utilization resource monitor reports the Envoy process the, CPU Utilization
Contributor


Can you fix the grammar in this sentence, something like:
The CPU utilization resource monitors and reports CPU usage across different platforms.

Author


done

@@ -419,6 +419,10 @@ new_features:
change: |
Add the option to reduce the rate limit budget based on request/response contexts on stream done.
See :ref:`apply_on_stream_done <envoy_v3_api_field_config.route.v3.RateLimit.apply_on_stream_done>` for more details.
- area: resource_monitors
change: |
Added support for to monitor Container CPU utilization in Linux K8s environment using existing
Contributor


s/for//

Author


done

@@ -366,7 +366,9 @@ this overload action can be used to ensure the fleet does not get into a cascadi
mode.
Some platform owners may choose to install this overload action by default to protect the fleet,
since it is easier to configure a target CPU utilization percentage than to configure a request rate per
workload.
workload. This supports monitoring both HOST CPU Utilization and K8s Container CPU Utilization.
By default it's using mode: HOST , to trigger overload actions on Container CPU usage,
Contributor


s/HOST ,/HOST,/.

Author


done


CpuTimes LinuxCpuStatsReader::getCpuTimes() {
if (mode_ ==
Contributor


I see you used kind of a "strategy" pattern here. I think we could simplify this by having two different implementations, e.g.

LinuxHostCpuStatReader which inherits and implement CpuStatsReader and
LinuxContainerCpuStatsReader which inherits and implements CpuStatsReader.

The implementations would just be getCpuTimes, containing the corresponding code from getHostCpuTimes or getContainerCpuTimes, and we'd avoid the check on every call, instead doing the check once in the factory that creates the CPU utilization monitor.

The other nice part about that is that only the relevant files would need to be provided to the different implementations.

Since both return the same underlying CpuTimes, and the CPU utilization monitor uses the interface to interact with them, there shouldn't be any friction there.

Author


Done, and the TimeSource is now implemented in the reader itself.

@@ -22,13 +22,26 @@ struct CpuTimes {
uint64_t total_time;
};

struct CgroupStats {

uint64_t work_time;
uint64_t total_time;
double work_time;
double total_time;
Contributor


Why did we need to switch these to doubles?

Author


I updated the calculation strategy. work_time is now a double, because of the division by the allocated CPU millicores for averaging in container mode; the type has to accommodate that calculation. total_time is uint64_t again: since I have now incorporated the TimeSource within LinuxContainerCpuStatsReader, we can return the time in nanoseconds as an integer. Previously it used /proc/uptime, so total_time was a double to accommodate its 1/100th-second precision; I have changed it back to uint64_t.
So now it's:
double work_time;
uint64_t total_time;

Can you get it released soon, @KBaichoo?

Author


Will there be a problem merging this if a coverage test is failing, but for a different extension, @KBaichoo?

Per-extension coverage failed:
Code coverage for source/common/quic is lower than limit of 93.4 (93.3)
Yesterday it was fine.

@KBaichoo
Contributor

KBaichoo commented Jan 9, 2025

/wait

nix1n added 7 commits January 10, 2025 11:41
Signed-off-by: nix1n <nikhil.murari@hotstar.com>