session: skip creating indexes on the analyze_jobs table for older clusters #58608
base: master
Conversation
…usters Signed-off-by: Rustin170506 <techregister@pm.me>
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
## master #58608 +/- ##
================================================
+ Coverage 73.5500% 74.6977% +1.1477%
================================================
Files 1680 1695 +15
Lines 464730 464785 +55
================================================
+ Hits 341809 347184 +5375
+ Misses 102055 96025 -6030
- Partials 20866 21576 +710
/retest
🔢 Self-check (PR reviewed by myself and ready for feedback.)
doReentrantDDL(s, addAnalyzeJobsSchemaTableStateIndex, dbterror.ErrDupKeyName)
doReentrantDDL(s, addAnalyzeJobsSchemaTablePartitionStateIndex, dbterror.ErrDupKeyName)
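For readers following along, below is a minimal, self-contained sketch of the gating this PR implements: the ADD INDEX DDLs are skipped when upgrading an existing cluster, and only a freshly bootstrapped cluster ends up with the indexes (in TiDB itself the new cluster gets them from the table definition rather than from these DDLs). The version number, index names, and column lists are illustrative assumptions, not the exact identifiers in the bootstrap code.

```go
package main

import "fmt"

// Assumed placeholder for the bootstrap version that introduces the indexes.
const versionWithAnalyzeJobsIndexes = int64(240)

// ADD INDEX DDLs corresponding to addAnalyzeJobsSchemaTableStateIndex and
// addAnalyzeJobsSchemaTablePartitionStateIndex; names and columns are illustrative.
var analyzeJobsIndexDDLs = []string{
	"ALTER TABLE mysql.analyze_jobs ADD INDEX idx_schema_table_state (table_schema, table_name, state)",
	"ALTER TABLE mysql.analyze_jobs ADD INDEX idx_schema_table_partition_state (table_schema, table_name, partition_name, state)",
}

// maybeAddAnalyzeJobsIndexes mimics the upgrade gating: a cluster being upgraded
// from an older version skips the DDLs so a huge mysql.analyze_jobs table cannot
// stall the upgrade; a brand-new cluster (oldVersion == 0) creates the indexes.
func maybeAddAnalyzeJobsIndexes(oldVersion int64, runDDL func(sql string)) {
	if oldVersion > 0 && oldVersion < versionWithAnalyzeJobsIndexes {
		return // old cluster: leave the table as-is; indexes can be added manually later
	}
	for _, ddl := range analyzeJobsIndexDDLs {
		runDDL(ddl)
	}
}

func main() {
	run := func(sql string) { fmt.Println("executing:", sql) }
	maybeAddAnalyzeJobsIndexes(180, run) // upgrade path: prints nothing
	maybeAddAnalyzeJobsIndexes(0, run)   // fresh bootstrap: prints both DDLs
}
```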
Personally, I don't like the idea of two clusters of the same version ending up with different schemas. It introduces more maintenance burden: if I didn't know about this PR, hit an issue, and saw this difference, I would take it as a bug at first glance.
For a production cluster with many tables, say 1M, the upgrade duration is already quite long in most cases: not just the TiDB upgrade itself, but also the rolling upgrade of the other components, and it's even longer when there is a lot of online traffic, taking hours or even days. I think it's acceptable for this add-index to be slower; that's the tradeoff we have to make.
If we can reduce the size of this table, or make the index creation faster, that would certainly be better.
Personally, I don't like the idea of two clusters of the same version ending up with different schemas. It introduces more maintenance burden: if I didn't know about this PR, hit an issue, and saw this difference, I would take it as a bug at first glance.
Yes, I agree that having different schemas is annoying. But for most users the full table scan is OK, so introducing a potential risk of slowing down the upgrade is not worth it. We don't want to introduce that risk when 99% of users don't have the problem.
I also asked @tangenta; he suggested that it is better not to do this kind of operation on a large table.
99% of users don't have the problem.
99% of users don't have 1M tables, so the upgrade is fast even with the index. This PR is for the 1%.
/hold
I am not in a hurry with this PR, so let's discuss it further.
99% of users don't have 1M tables, so the upgrade is fast even with the index. This PR is for the 1%.
My point here is exactly that: since 99% of users do not have 1M tables, there is no need to add this index for them and expose them to the potential risk.
IMU, 99% of users do not have 1M tables
-> 99% of users don't have 100K rows in this table
-> add-index is very fast
-> no such risk
We may also need to add new indexes on tables like mysql.stats_histograms.
That table's size is related to the column count and index count, so it's even more likely to be a big table.
So the problem will still exist.
I think we can add the related operation to our upgrade guide.
I think we can add the related operation to our upgrade guide.
You mean asking users to manually create indexes on system tables? System tables had better be managed by TiDB itself IMO, and most production clusters have very strict permission control. If we ask a DBA, or someone else with root permission, to manage what TiDB itself should do, I'm not sure how much they will buy this idea.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Leavrth, time-and-fate, winoros, yudongusa.
What problem does this PR solve?
Issue Number: close #57996
Problem Summary:
What changed and how does it work?
I tested #58134 again locally to evaluate the performance of creating the indexes. For 100k rows, it takes 16 seconds to create them; that is not terribly slow, but it still takes time. So I decided to undo part of that change: we will only create the new indexes for new clusters, and we will not create them for old clusters during the upgrade process.
Normally, for smaller clusters, this should not be a problem. But for some huge clusters, we can ask users to manually create the indexes instead of blocking the upgrade process.
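As a reference for that manual step, here is a hedged sketch of how a DBA could create the indexes after the upgrade finishes. The DSN, index names, and column lists are assumptions for illustration and should be replaced with the values matching the actual cluster and the DDLs shipped in this PR.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

// Index DDLs to apply by hand on a huge cluster after the upgrade; the index names
// and column lists are illustrative assumptions, not necessarily the exact ones in this PR.
var indexDDLs = []string{
	"ALTER TABLE mysql.analyze_jobs ADD INDEX idx_schema_table_state (table_schema, table_name, state)",
	"ALTER TABLE mysql.analyze_jobs ADD INDEX idx_schema_table_partition_state (table_schema, table_name, partition_name, state)",
}

func main() {
	// Placeholder DSN: point this at the upgraded TiDB cluster with a user allowed to alter mysql.* tables.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/mysql")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	for _, ddl := range indexDDLs {
		// If the index already exists (e.g. on a cluster bootstrapped at the new version),
		// the server returns a "Duplicate key name" error; log it and move on.
		if _, err := db.Exec(ddl); err != nil {
			log.Printf("skipped %q: %v", ddl, err)
			continue
		}
		log.Printf("created: %q", ddl)
	}
}
```

Running the same ALTER TABLE statements directly from a mysql client works just as well; the point is only that the operation can be scheduled outside the upgrade window.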
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.