-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Iceberg]Support non-identity partition predicate for thoroughly push down #21999
[Iceberg]Support non-identity partition predicate for thoroughly push down #21999
Conversation
Is this a cherry pick? At first glance, I don't see substantial similarities with the referenced PR. |
I referenced the implementation of trino, which may evolve gradually through a series of PRs. Is it suitable to be described as a "cherry-pick"? |
Based on my reading of the code, this wouldn't qualify as a cherry pick as it's substantially different, and even the way it's performed is different between the two systems. I think we can remove the cherry pick reference. |
OK, fixed! Thanks for the suggestion. |
bb7daeb
to
90f6f9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some initial comments, I'll need more time to review this.
presto-common/src/main/java/com/facebook/presto/common/type/Type.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of making changes directly to Type
, can we just have a new utility method in the Iceberg connector which takes in the Type, the Object, and from those two things infers the primtive type of the Object from the SQL Type? The helper methods on Type don't directly relate to Block usage, which is what we expect for Types.
presto-common/src/main/java/com/facebook/presto/common/type/AbstractIntType.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/optimizer/IcebergPlanOptimizer.java
Outdated
Show resolved
Hide resolved
Very reasonable, I agree that we should not modify common Should we move the method |
658b0f1
to
18366d6
Compare
I think we can start in the Iceberg connector, and extract if or when an additional use case comes along. |
Good idea! Have done this, please take a look, thanks a lot! |
18366d6
to
1873dc3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff! The core changes LGTM. Mainly a few nits on the tests
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergLogicalPlanner.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergLogicalPlanner.java
Outdated
Show resolved
Hide resolved
1873dc3
to
faee439
Compare
Hi @ZacBlanco, I have combined |
faee439
to
198d79f
Compare
@tdcmeehan @ZacBlanco just resolved the conflict in |
Don't forget to squash your commits |
198d79f
to
d3dd4fc
Compare
OK, done! |
@@ -172,11 +247,21 @@ private static Block bucketInteger(Block block, int count) | |||
return bucketBlock(block, count, position -> bucketHash(INTEGER.getLong(block, position))); | |||
} | |||
|
|||
private static int bucketValueInteger(Block block, int position, int count) | |||
{ | |||
return bucketValue(block, count, position, pos -> bucketHash(INTEGER.getLong(block, pos))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and for some of the other newly added methods: Is it intended that this calls bucketValue(..., count, position, ...)
even though the signature of the method is bucketValue(..., int position, int count, ...)
(that is, the arguments are switched)?
Sorry for this drive-by and late review comment, hopefully it is useful though. I wasn't sure whether a separate issue or discussion was justified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. I think in some cases the code will still work but bucketing will technically be incorrect. We should switch the arguments. I'll open a PR later today to fix this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch @Marcono1234, thanks for pointing out this, seems we didn't have sufficient test cases to cover these codes. So perhaps we need to supplement some test cases that can cover these code logics in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#24309 fixes this
Description
This PR extends Iceberg capabilities to pushdown thoroughly predicates on partitioning columns. Besides pushdown thoroughly predicates on identity partitioning columns, it also pushdown thoroughly predicates if they align with partitioning boundaries.
It is also helpful for DELETE, as it allows to do metadata-only delete in more cases, where we don't really needing to do a row-level delete.
Test Plan
Contributor checklist
Release Notes