-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11511. Introduce metrics in deletion services of OM #7377
Conversation
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
@ashishkumar50 thank you for the reviews, I have addressed the comments, could you please review the update? |
...src/main/java/org/apache/hadoop/ozone/om/response/key/OMDirectoriesPurgeResponseWithFSO.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
Outdated
Show resolved
Hide resolved
@ashishkumar50 I have addressed all the comments, thank you for the review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plz check for review comments
public void setDirectoryDeletionIterationMetrics(long runcount, long startTime, long duration, | ||
long dirDel, long subdirDel, | ||
long filesMove, long subdirMove) { | ||
setIterationDirRunCount(runcount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do not need capture runCount and startTime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anything wrong with capturing run count. It has been useful in the past to detect when the service has completed a run after a restart, and can be divided over past time intervals to get average time taken per run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@errose28 I am not getting what we are going to achieve with runCount as metrics. If there are 10 threads running, it will be increase by 10 times as default.
We can use duration
as metrics for performance, but not the startTime. This will give better metrics. So startTime may not be useful, this can be easily identified as Timer task.
For task running, based on delete count increase, and other data can show the progress. So Above metric is not useful.
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
Outdated
Show resolved
Hide resolved
...ne/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyPurgeRequest.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java
Outdated
Show resolved
Hide resolved
...one-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java
Outdated
Show resolved
Hide resolved
...one-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java
Outdated
Show resolved
Hide resolved
...ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
Outdated
Show resolved
Hide resolved
@sumitagrawl @ashishkumar50 @errose28 I have updated the PR to have only restart metrics and removed the iteration specific ones. |
@@ -456,6 +460,7 @@ public long optimizeDirDeletesAndSubmitRequest(long remainNum, | |||
" totalRunCount: {}", | |||
dirNum, subdirDelNum, subFileNum, (subDirNum - subdirDelNum), | |||
Time.monotonicNow() - startTime, rnCnt); | |||
metrics.incrementDirectoryDeletionTotalMetrics(dirNum + subdirDelNum, subdirMoved, subFileNum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate if {deletedDirsCount, movedDirsCount, movedFilesCount} as variable, PurgeDir metrics {NumSubDirectoriesMoved, NumSubFilesMoved, NumDirPurged} and Metrics here {dirDel, dirMove, fileMove} are same value, else this will create confusion between metrics, and also with logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! I have verified the numbers using integration tests and updated the PR with some checks in the tests.
@sumitagrawl @ashishkumar50 When a delete is performed with skipTrash, the logs and metrics show the exact number of deletes as the user perceives. But in the case we don't skipTrash, due to a checkpoint directory created which is also deleted, the count of deleted directories is +1 -- which is reflected in both the pre-existing logs and metrics added in this PR. All comments have been addressed. Can you please review the updated PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What changes were proposed in this pull request?
The following metrics are added in OM's deletingServices:
KeyDeleting
No of keys processed for delete since last restart
No of keys sent to scm for delete and response(success/fail) since last restart
DirDeleting
No of dirs/empty dirs processed for delete since last restart
No of subfiles processed for delete since last restart
Dirpurge
No of dirs purged since last restart
No of subfiles purged since last restart
Keypurge
No of keys purged since last restart
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11511
How was this patch tested?
Tested manually in docker at the jmx endpoint
http://localhost:9874/jmx?qry=Hadoop:service=OzoneManager,name=DeletingServiceMetrics
: