Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incomplete URL substring sanitization #2789

Conversation

AshitaSingamsetty
Copy link

There is CODEQL Scan issue in the following

The string system.network.io may be at an arbitrary position in the sanitized URL.
The string system.disk.io may be at an arbitrary position in the sanitized URL.

Type of change
Bug fix (non-breaking change which fixes an issue) in file

instrumentation/opentelemetry-instrumentation-system-metrics/src/opentelemetry/instrumentation/system_metrics/init.py

Copy link

linux-foundation-easycla bot commented Aug 9, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Member

@emdneto emdneto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate a bit more on how this is fixing an issue?

@AshitaSingamsetty
Copy link
Author

Could you please elaborate a bit more on how this is fixing an issue?

The Code snippet is designed to check if the strings "system.disk.io" and "system.network.io" are present in any of the URLs in the configuration. The issue identified by the CODEQL scan is that these strings may appear at arbitrary positions within the sanitized URLs, which could lead to false positives or missed detections.

To address this issue, we need to ensure that the checks for "system.disk.io" and "system.network.io" are more precise.

Sanitization: The configuration is sanitized by decoding each URL.

Check for "system.disk.io" in URLs: The code checks if the string "system.disk.io" is present in any of the sanitized URLs.
any("system.disk.io" in url for urls in sanitized_config.values() for url in urls) iterates through all URLs in sanitized_config and returns True if "system.disk.io" is found in any URL.
If the string is found, an observable counter named "system.disk.io" is created using self._meter.create_observable_counter.
The counter is configured with a callback to self._get_system_disk_io, a description, and a unit of "bytes".

Check for "system.network.io" in URLs: Similarly, the code checks if the string "system.network.io" is present in any of the sanitized URLs.
If the string is found, an observable counter named "system.network.io" is created with a callback to self._get_system_network_io, a description, and a unit of "bytes".

Summary
The highlighted code snippet sanitizes the configuration URLs and checks for the presence of specific strings ("system.disk.io" and "system.network.io") within these URLs. If these strings are found, it creates observable counters for monitoring disk and network I/O, respectively. This ensures that the system can track and report on disk and network I/O metrics based on the configuration provided.

@emdneto
Copy link
Member

emdneto commented Aug 13, 2024

Could you please elaborate a bit more on how this is fixing an issue?

The Code snippet is designed to check if the strings "system.disk.io" and "system.network.io" are present in any of the URLs in the configuration. The issue identified by the CODEQL scan is that these strings may appear at arbitrary positions within the sanitized URLs, which could lead to false positives or missed detections.

To address this issue, we need to ensure that the checks for "system.disk.io" and "system.network.io" are more precise.

Sanitization: The configuration is sanitized by decoding each URL.

Check for "system.disk.io" in URLs: The code checks if the string "system.disk.io" is present in any of the sanitized URLs. any("system.disk.io" in url for urls in sanitized_config.values() for url in urls) iterates through all URLs in sanitized_config and returns True if "system.disk.io" is found in any URL. If the string is found, an observable counter named "system.disk.io" is created using self._meter.create_observable_counter. The counter is configured with a callback to self._get_system_disk_io, a description, and a unit of "bytes".

Check for "system.network.io" in URLs: Similarly, the code checks if the string "system.network.io" is present in any of the sanitized URLs. If the string is found, an observable counter named "system.network.io" is created with a callback to self._get_system_network_io, a description, and a unit of "bytes".

Summary The highlighted code snippet sanitizes the configuration URLs and checks for the presence of specific strings ("system.disk.io" and "system.network.io") within these URLs. If these strings are found, it creates observable counters for monitoring disk and network I/O, respectively. This ensures that the system can track and report on disk and network I/O metrics based on the configuration provided.

Well, what I understand is that the problem is that your codeql thinks "system.disk.io" is a URL, but I think that isn't the case here. "system.disk.io" and "system.network.io" are metric names.

@xrmx
Copy link
Contributor

xrmx commented Aug 19, 2024

I think this is a false positive reported by an automatic tool that is confused by the code. Closing.

@xrmx xrmx closed this Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants