controller: Add cloudwatch logging for pods #882
Conversation
Err(e) => {
    let service_error = e.into_service_error();
    if service_error.is_resource_already_exists_exception() {
        info!("Log group already exists.")
Maybe a little noisy?
info!("Log group already exists.") | |
debug!("testsys Cloudwatch Log group already exists.") |
        .context(error::NoLogsSnafu { pod: pod_name })
}

pub(crate) async fn archive_logs(k8s_client: kube::Client, job_name: &str) -> JobResult<()> {
When does this happen? Is it only after the test has completed or failed? Or can this be kicked off at the start of the pod so the logs stream in realtime?
It happens when a job is deleted. I wasn't able to find a solution for realtime logs.
It might not be something "inside" the system, but there are likely plenty of existing solutions for streaming pod logs in realtime, so we should consider those as well. Not going to block the PR, though, since we need to get something going for this.
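As a sketch of what realtime streaming could look like with recent versions of the kube crate (the function name and error handling are illustrative; the PR does not implement this):

```rust
use futures::{AsyncBufReadExt, TryStreamExt};
use k8s_openapi::api::core::v1::Pod;
use kube::{api::LogParams, Api, Client};

/// Follow a pod's logs as they are produced, instead of fetching them
/// once at deletion time. Illustrative only; not part of this PR.
async fn stream_pod_logs(client: Client, pod_name: &str) -> anyhow::Result<()> {
    let pods: Api<Pod> = Api::default_namespaced(client);
    let params = LogParams {
        follow: true, // keep the connection open and stream new lines
        ..LogParams::default()
    };
    let mut lines = pods.log_stream(pod_name, &params).await?.lines();
    while let Some(line) = lines.try_next().await? {
        // Forward each line to CloudWatch (or any other sink) here.
        println!("{line}");
    }
    Ok(())
}
```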
#[snafu(display("Unable to create log event '{}': {:?}", log_event, source))]
CreateLogEvent {
    log_event: String,
    source: aws_sdk_cloudwatchlogs::types::SdkError<
Just to confirm, since we've run into it before, will this unwrap the error fully to something useful?
Yes, it will show the underlying error, since the display string uses the `Debug` format (`{:?}`) on `source`.
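For reference, a completed sketch of that variant; the concrete operation error (`PutLogEventsError`) is an assumption, since the snippet above is truncated mid-type:

```rust
use snafu::Snafu;

#[derive(Debug, Snafu)]
pub(crate) enum Error {
    // `{:?}` on `source` prints the full SdkError, including the
    // underlying service error, rather than Display's shorter summary.
    #[snafu(display("Unable to create log event '{}': {:?}", log_event, source))]
    CreateLogEvent {
        log_event: String,
        source: aws_sdk_cloudwatchlogs::types::SdkError<
            // Assumed operation error type; the PR snippet cuts off here.
            aws_sdk_cloudwatchlogs::error::PutLogEventsError,
        >,
    },
}
```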
    Ok(name)
}

async fn pod_logs(k8s_client: kube::Client, pod_name: &str) -> JobResult<String> {
Doesn't the testsys model already have a way to get logs? Why do we need a new, different way here?
We are accessing the logs from different interfaces (ResourceInterface vs. TestManager).
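For context, a one-shot fetch like `pod_logs` can be done with the kube crate's `logs` call; the helper name and error handling here are illustrative, not the PR's exact code:

```rust
use k8s_openapi::api::core::v1::Pod;
use kube::{api::LogParams, Api, Client};

/// Fetch a pod's current log contents once, as a single String.
/// Illustrative sketch of what a helper like `pod_logs` might do.
async fn fetch_pod_logs(client: Client, pod_name: &str) -> anyhow::Result<String> {
    let pods: Api<Pod> = Api::default_namespaced(client);
    // `logs` returns the pod's log contents in one response, unlike
    // `log_stream`, which follows new output as it arrives.
    let logs = pods.logs(pod_name, &LogParams::default()).await?;
    Ok(logs)
}
```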
We should backlog an issue to unify this so that log-fetching updates don't require changes in two places. Not necessary for this change, though.
Issue number:
N/A
Description of changes:
Add an optional flag to the controller for archiving pod logs with CloudWatch.
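One way such an opt-in flag could be declared with clap; the flag name and struct are assumptions for illustration, not the PR's actual arguments:

```rust
use clap::Parser;

/// Controller arguments (illustrative subset).
#[derive(Parser, Debug)]
struct Args {
    /// If set, archive pod logs to CloudWatch Logs when jobs are deleted.
    /// The flag name is an assumption, not the PR's actual flag.
    #[arg(long)]
    archive_logs_to_cloudwatch: bool,
}

fn main() {
    let args = Args::parse();
    if args.archive_logs_to_cloudwatch {
        // Enable the CloudWatch archival path in the controller.
    }
}
```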
Testing done:
Verified that logs are archived when the flag is set and are not archived when it is unset. Tested archival for both completed and interrupted pods.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.