[improvement](External Catalog) Remove unnecessary conjuncts handling in External Catalog by zy-kkk · Pull Request #41218 · apache/doris

zy-kkk · 2024-09-24T08:51:51Z

In the previous FileScanNode, some of the logic that used conjuncts for predicate conversion ran in the init phase. With the Nereids planner, however, filters are pushed down to the scan in the Translator, so the ScanNode only has the complete set of conjuncts in the finalize phase. This PR therefore removes all conjuncts variables in the External Catalog scan nodes for the Nereids planner: they no longer need to store conjuncts themselves or add them to the ScanNode. Instead, every place in the ScanNode that uses conjuncts is moved to the finalize phase.
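A minimal sketch of the ordering problem described above, not the actual Doris code. `ToyScanNode`, `addConjunct`, `finalizePhase`, and `convert` are hypothetical names chosen for illustration; the only assumption taken from this PR is that filters reach the scan node after init has already run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these classes or methods are Doris APIs.
class DeferredConjunctsSketch {

    /** Toy scan node whose filters arrive only after init() has run. */
    static class ToyScanNode {
        private final List<String> conjuncts = new ArrayList<>();
        String queryBuiltInInit;      // what an init-time conversion would see
        String queryBuiltInFinalize;  // what a finalize-time conversion sees

        /** init phase: the translator has not pushed any filter down yet. */
        void init() {
            queryBuiltInInit = convert(conjuncts);
        }

        /** Called by the (hypothetical) translator when it pushes a filter into the scan. */
        void addConjunct(String conjunct) {
            conjuncts.add(conjunct);
        }

        /** finalize phase: the conjunct list is complete, so conversion is safe here. */
        void finalizePhase() {
            queryBuiltInFinalize = convert(conjuncts);
        }

        /** Stand-in for predicate conversion (for example, building a pushed-down query). */
        private static String convert(List<String> predicates) {
            return predicates.isEmpty() ? "<no filter>" : String.join(" AND ", predicates);
        }
    }

    /** Runs the phases in the order assumed above: init, filter pushdown, finalize. */
    static ToyScanNode plan(List<String> filters) {
        ToyScanNode scan = new ToyScanNode();
        scan.init();
        for (String filter : filters) {
            scan.addConjunct(filter);
        }
        scan.finalizePhase();
        return scan;
    }

    public static void main(String[] args) {
        ToyScanNode scan = plan(Arrays.asList("dt = '2024-09-24'", "id > 10"));
        System.out.println("init-time query:     " + scan.queryBuiltInInit);
        System.out.println("finalize-time query: " + scan.queryBuiltInFinalize);
    }
}
```

In this toy model, the conversion done in `init()` never sees the filters, while the one done in `finalizePhase()` sees all of them, which is the reason for moving conjunct-dependent work to the later phase.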

This refactor also fixes a performance issue introduced by #40176.
After #40176 started generating a SelectNode for consecutive projects or filters, FileScan still added conjuncts too early, in the init phase. When the upper layer continued translating, it saw consecutive filters and unexpectedly generated a SelectNode on top of the ScanNode, which prevented the project from pruning the ScanNode's columns. Because the Project node trims the columns of a SelectNode and a ScanNode differently, the ScanNode ended up scanning unnecessary columns.

This change no longer adds conjuncts at the ScanNode step, so the plan keeps the ScanNode-to-Project structure and columns are trimmed correctly.
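The effect on column trimming can be pictured with a deliberately simplified toy model. It is only a sketch under the assumption stated in this description (a project directly above the scan prunes the scan's columns, while an intervening SelectNode blocks that pruning); `scannedColumns` and its parameters are hypothetical and do not mirror the planner's real data structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the plan shapes discussed above; not Doris code.
class ColumnPruningSketch {

    /**
     * Returns the columns the toy scan reads.
     *
     * @param tableColumns        all columns of the (hypothetical) external table
     * @param neededColumns       columns the query actually needs
     * @param selectNodeAboveScan true if a SelectNode sits between the project and the scan
     */
    static List<String> scannedColumns(List<String> tableColumns,
                                       List<String> neededColumns,
                                       boolean selectNodeAboveScan) {
        if (selectNodeAboveScan) {
            // Toy assumption: the project trims the SelectNode's output instead,
            // so the scan keeps reading every column.
            return new ArrayList<>(tableColumns);
        }
        // Project directly above the scan: only the needed columns are read.
        List<String> pruned = new ArrayList<>();
        for (String column : tableColumns) {
            if (neededColumns.contains(column)) {
                pruned.add(column);
            }
        }
        return pruned;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("id", "name", "dt");
        List<String> needed = Arrays.asList("id");
        System.out.println("with SelectNode:    " + scannedColumns(table, needed, true));
        System.out.println("without SelectNode: " + scannedColumns(table, needed, false));
    }
}
```

Under these assumptions, keeping the project directly above the scan is what lets the scan skip the columns the query does not need.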

doris-robot · 2024-09-24T08:51:56Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

zy-kkk · 2024-09-24T08:52:10Z

run buildall

morrySnow · 2024-09-24T08:53:41Z

fe/fe-core/src/main/java/org/apache/doris/datasource/es/source/EsScanNode.java

    @Override
    public void init(Analyzer analyzer) throws UserException {
        super.init(analyzer);
-        buildQuery();


do not modify legacy planner

zy-kkk · 2024-09-25T14:58:53Z

run buildall

zy-kkk · 2024-09-26T11:25:59Z

run buildall

morningman · 2024-10-10T14:30:37Z

run buildall

morningman

LGTM

github-actions · 2024-10-10T14:37:27Z

PR approved by at least one committer and no changes requested.

github-actions · 2024-10-10T14:37:29Z

PR approved by anyone and no changes requested.

zy-kkk · 2024-10-11T13:12:14Z

run buildall

zy-kkk · 2024-10-12T09:47:15Z

run buildall

zy-kkk · 2024-10-14T04:05:51Z

run buildall

zy-kkk · 2024-10-14T08:37:30Z

run buildall

zy-kkk · 2024-10-15T06:39:49Z

run buildall

zy-kkk · 2024-10-16T07:59:22Z

run buildall

morningman

LGTM

github-actions · 2024-10-16T11:23:20Z

PR approved by at least one committer and no changes requested.

morrySnow

Do we have an external table partition pruning test case in the regression tests?

zy-kkk · 2024-10-16T12:19:27Z

> Do we have an external table partition pruning test case in the regression tests?

Yes, for example: test_hive_partition and test_hive_default_partition.

) ### What problem does this PR solve?

Problem Summary: PR #41218 changed some partition pruning logic, which caused Hudi partition pruning to fail. This PR fixes that problem.

### Release note

[fix](hudi) fix hudi partition prune issue

morrySnow reviewed Sep 24, 2024

morrySnow added the dev/2.1.x label Sep 24, 2024

zy-kkk force-pushed the del_useless_conjuncts_for_external branch from 897bda4 to db088ff Compare September 25, 2024 14:58

zy-kkk force-pushed the del_useless_conjuncts_for_external branch from db088ff to 7e36a14 Compare September 26, 2024 11:25

zy-kkk force-pushed the del_useless_conjuncts_for_external branch 2 times, most recently from 6d91294 to 7bec2a1 Compare September 27, 2024 08:00

morningman force-pushed the del_useless_conjuncts_for_external branch from 7bec2a1 to 34701b2 Compare October 10, 2024 14:25

morningman added the dev/3.0.x label Oct 10, 2024

morningman previously approved these changes Oct 10, 2024

morningman added the p0_b label Oct 10, 2024

github-actions bot added the approved label Oct 10, 2024

github-actions bot added the reviewed label Oct 10, 2024

zy-kkk force-pushed the del_useless_conjuncts_for_external branch from 34701b2 to 1f6d21b Compare October 11, 2024 13:12

zy-kkk dismissed morningman’s stale review via 59f2acb October 12, 2024 09:39

github-actions bot removed the approved label Oct 12, 2024

[improvement](External Catalog) Remove unnecessary conjuncts handling…

0828a41

… in External Catalog

zy-kkk force-pushed the del_useless_conjuncts_for_external branch from 81ebfa3 to 0828a41 Compare October 15, 2024 06:39

fix ut

9d0839b

morningman approved these changes Oct 16, 2024

github-actions bot added the approved label Oct 16, 2024

wuwenchi approved these changes Oct 16, 2024

CalvinKirs approved these changes Oct 16, 2024

morrySnow reviewed Oct 16, 2024

morrySnow merged commit 22aabb5 into apache:master Oct 17, 2024

zy-kkk deleted the del_useless_conjuncts_for_external branch October 22, 2024 09:49

zy-kkk mentioned this pull request Oct 22, 2024

[2.1][opt](Catalog) Remove unnecessary conjuncts handling on External Scan #42261

Merged

morningman added dev/2.1.7-merged and removed dev/2.1.x labels Oct 22, 2024

zy-kkk mentioned this pull request Oct 31, 2024

[3.0][opt](Catalog) Remove unnecessary conjuncts handling on External Scan #43018

Merged

zy-kkk added dev/3.0.3-merged and removed dev/3.0.x labels Nov 4, 2024

hubgeter mentioned this pull request Nov 27, 2024

[fix](hudi)Add hudi catalog read partition table partition prune #44669

Merged

16 tasks

6 participants
