[enhancement](nereids) improve lots of values in insert into values statement #40202
Merged
morrySnow merged 17 commits into apache:master on Dec 23, 2024
TPC-H: Total hot run time: 38038 ms
TPC-DS: Total hot run time: 192483 ms
ClickBench: Total hot run time: 31.62 s
c221b92 to 1b6854e
2a30e59 to 4fe7f08
61d85d6 to 5c8f0a7
13e1f0f to bee14a8
This was referenced Mar 6, 2025
yiguolei pushed a commit that referenced this pull request on Mar 11, 2025

…ion (#48780)

### What problem does this PR solve?

When Nereids casts an invalid date literal to a date-like type, it throws an exception:

```
select '' = cast('2020-10-20' as date);
(1105, 'errCode = 2, detailMessage = date/datetime literal [] is invalid')
```

The old planner does not throw an exception in this case, so Nereids should not throw one either.

This PR picks code from: #40202

### Check List (For Author)

- Test
  - [x] Regression test
morrySnow pushed a commit that referenced this pull request on Mar 14, 2025

### What problem does this PR solve?

When parsing a date literal fails, throw AnalysisException in every case instead of DateTimeException. When casting a date literal raises an exception, skip parsing it on the FE and hand it to the BE for processing.

Related PR: #40202
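The behavior described in this commit message can be sketched as follows. This is a minimal illustration, not the Doris code: `DateCastFallbackSketch`, `SketchAnalysisException`, `parseDateLiteral`, and `tryFoldCastToDate` are hypothetical names, and `java.time.LocalDate.parse` stands in for the real literal parser. An empty `Optional` models "leave the cast unevaluated so a later stage (the BE) handles it".

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Doris code base.
final class DateCastFallbackSketch {
    /** Stand-in for an analysis-level exception type (assumption, not Doris's class). */
    static final class SketchAnalysisException extends RuntimeException {
        SketchAnalysisException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Parses a date literal; every parse failure surfaces as one exception type. */
    static LocalDate parseDateLiteral(String text) {
        try {
            return LocalDate.parse(text); // expects ISO "yyyy-MM-dd"
        } catch (DateTimeException e) {
            throw new SketchAnalysisException(
                    "date/datetime literal [" + text + "] is invalid", e);
        }
    }

    /**
     * Tries to evaluate cast(text as date) up front. On failure it does not
     * propagate the error; an empty result means "keep the cast as-is and let
     * a later stage evaluate it".
     */
    static Optional<LocalDate> tryFoldCastToDate(String text) {
        try {
            return Optional.of(parseDateLiteral(text));
        } catch (SketchAnalysisException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryFoldCastToDate("2020-10-20").isPresent());
        System.out.println(tryFoldCastToDate("").isPresent());
    }
}
```

Separating "parse and fail loudly" from "try to fold and fall back quietly" keeps literal validation strict while a comparison such as `'' = cast('2020-10-20' as date)` no longer fails during planning.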
deardeng pushed a commit to deardeng/incubator-doris that referenced this pull request on Apr 30, 2025

)

When Nereids casts an invalid date literal to a date-like type, it throws an exception:

```
select '' = cast('2020-10-20' as date);
(1105, 'errCode = 2, detailMessage = date/datetime literal [] is invalid')
```

The old planner does not throw an exception in this case, so Nereids should not throw one either.

This PR picks code from: apache#40202

Co-Authored-By: yujun <yu.jun.reach@gmail.com>
koarz pushed a commit to koarz/doris that referenced this pull request on Jun 4, 2025

)

### What problem does this PR solve?

When parsing a date literal fails, throw AnalysisException in every case instead of DateTimeException. When casting a date literal raises an exception, skip parsing it on the FE and hand it to the BE for processing.

Related PR: apache#40202
924060929 added a commit to 924060929/incubator-doris that referenced this pull request on Jun 19, 2025

… statement (apache#40202)

Improve the handling of `insert into values` statements with many values by bypassing NereidsPlanner.

The main logic is:

1. `InsertUtils.normalizePlan` uses `FoldConstantRuleOnFE` to reduce expressions, e.g. `values(date(now()))`.
2. `FastInsertIntoValuesPlanner` skips most analysis and rewrite rules and rewrites `LogicalInlineTable` to `LogicalUnion` or `LogicalOneRowRelation`.
3. Date-time strings are parsed on a fast path, without a date format.
4. `getHintMap` and the normal lexer share the same tokens.
5. `set enable_fast_analyze_into_values=false` forces all optimization rules to run, as a fallback if we hit bugs in `FastInsertIntoValuesPlanner`.

Test: insert 1000 rows with 1000 columns; the columns contain int, bigint, decimal(26,7), date, datetime, and varchar (10 Chinese characters).

+---------------------------------+------------------------------------------------------+--------------------------+--------------------------+
|FastInsertIntoValuesPlanner      |NereidsPlanner(enable_fast_analyze_into_values=false) |Legacy optimizer in 2.1.6 | Nereids planner in 2.1.6 |
+---------------------------------+------------------------------------------------------+--------------------------+--------------------------+
|16s(bottleneck is antlr's lexer) |32s                                                   |16s                       |80s                       |
+---------------------------------+------------------------------------------------------+--------------------------+--------------------------+

If you use FastInsertIntoValuesPlanner with group commit in a transaction, the time drops to 12s.

TODO: build a custom lexer. In my test with a hand-written lexer, FastInsertIntoValuesPlanner without group commit drops from 16s to 12s, but a proper implementation takes more effort: regular expression -> NFA -> DFA -> minimal DFA -> lexer codegen.

(cherry picked from commit 81f3c48)
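To illustrate the idea behind step 1 (reducing constant expressions before planning), here is a minimal constant-folding sketch over a toy expression tree. It is not `FoldConstantRuleOnFE`: the `Expr`, `IntLiteral`, `ColumnRef`, and `Add` types are hypothetical, and integer addition stands in for functions such as `date(now())`.

```java
// Hypothetical toy expression tree; not the Nereids expression classes.
final class ConstantFoldSketch {
    interface Expr {}

    static final class IntLiteral implements Expr {
        final long value;
        IntLiteral(long value) { this.value = value; }
    }

    /** A non-constant leaf: its value is unknown at planning time. */
    static final class ColumnRef implements Expr {
        final String name;
        ColumnRef(String name) { this.name = name; }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    /** Folds bottom-up: an Add whose children are both literals becomes one literal. */
    static Expr fold(Expr e) {
        if (e instanceof Add) {
            Add add = (Add) e;
            Expr l = fold(add.left);
            Expr r = fold(add.right);
            if (l instanceof IntLiteral && r instanceof IntLiteral) {
                return new IntLiteral(((IntLiteral) l).value + ((IntLiteral) r).value);
            }
            return new Add(l, r); // keep the node, but with folded children
        }
        return e; // literals and column references are already in simplest form
    }

    public static void main(String[] args) {
        Expr folded = fold(new Add(new IntLiteral(1), new Add(new IntLiteral(2), new IntLiteral(3))));
        System.out.println(((IntLiteral) folded).value);
    }
}
```

Once every cell of a `values` row has been reduced to a literal this way, the remaining planning work per row is small, which is presumably what allows most analysis and rewrite rules to be skipped.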
morrySnow pushed a commit that referenced this pull request on Jun 20, 2025
This was referenced Jul 2, 2025
Merged
yiguolei pushed a commit that referenced this pull request on Jul 3, 2025

…n collect hint map (#52627)

Cherry-pick of part of the code from PR #40202, commitId: 81f3c48

### Release note

None
924060929 added a commit that referenced this pull request on Jul 10, 2025

…nbound object' and 'Insert has filtered data in strict mode' exception (#52802)

1. fix `Invalid call to sql on unbound object` when use `interval`, introduced by #40202

   ```sql
   CREATE TABLE `test_insert_cast_interval` (
     `id` int NULL,
     `dt` date NULL
   ) ENGINE=OLAP
   DISTRIBUTED BY HASH(`id`) BUCKETS 10
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1"
   );

   INSERT INTO test_insert_cast_interval values(1, date_floor('2020-02-02', interval 1 second));

   (1105, 'errCode = 2, detailMessage = Invalid call to sql on unbound object')
   ```

2. fix `Insert has filtered data in strict mode`, introduced by #49116

   ```sql
   CREATE TABLE `test_insert_more_string` (
     `r_regionkey` int NULL,
     `r_name` varchar(25) NULL,
     `r_comment` varchar(152) NULL
   ) ENGINE=OLAP
   DISTRIBUTED BY HASH(`r_regionkey`) BUCKETS 1
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1"
   );

   insert into test_insert_more_string values (3, "akljalkjbalkjsldkrjewokjfalksdjflaksjfdlaskjfalsdkfjalsdfjkasfdl", "aa");

   (1105, 'errCode = 2, detailMessage = Insert has filtered data in strict mode')
   ```
924060929 added a commit to 924060929/incubator-doris that referenced this pull request on Jul 10, 2025

…nbound object' and 'Insert has filtered data in strict mode' exception (apache#52802)

1. fix `Invalid call to sql on unbound object` when use `interval`, introduced by apache#40202

   ```sql
   CREATE TABLE `test_insert_cast_interval` (
     `id` int NULL,
     `dt` date NULL
   ) ENGINE=OLAP
   DISTRIBUTED BY HASH(`id`) BUCKETS 10
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1"
   );

   INSERT INTO test_insert_cast_interval values(1, date_floor('2020-02-02', interval 1 second));

   (1105, 'errCode = 2, detailMessage = Invalid call to sql on unbound object')
   ```

2. fix `Insert has filtered data in strict mode`, introduced by apache#49116

   ```sql
   CREATE TABLE `test_insert_more_string` (
     `r_regionkey` int NULL,
     `r_name` varchar(25) NULL,
     `r_comment` varchar(152) NULL
   ) ENGINE=OLAP
   DISTRIBUTED BY HASH(`r_regionkey`) BUCKETS 1
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1"
   );

   insert into test_insert_more_string values (3, "akljalkjbalkjsldkrjewokjfalksdjflaksjfdlaskjfalsdkfjalsdfjkasfdl", "aa");

   (1105, 'errCode = 2, detailMessage = Insert has filtered data in strict mode')
   ```

(cherry picked from commit 2c01f69)
924060929 added a commit to 924060929/incubator-doris that referenced this pull request on Jul 14, 2025

…nto values statement (apache#40202) (apache#51925)

cherry pick from apache#40202 and apache#51925
924060929 added a commit to 924060929/incubator-doris that referenced this pull request on Jul 14, 2025

…nbound object' and 'Insert has filtered data in strict mode' exception (apache#52802)

1. fix `Invalid call to sql on unbound object` when use `interval`, introduced by apache#40202

   ```sql
   CREATE TABLE `test_insert_cast_interval` (
     `id` int NULL,
     `dt` date NULL
   ) ENGINE=OLAP
   DISTRIBUTED BY HASH(`id`) BUCKETS 10
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1"
   );

   INSERT INTO test_insert_cast_interval values(1, date_floor('2020-02-02', interval 1 second));

   (1105, 'errCode = 2, detailMessage = Invalid call to sql on unbound object')
   ```

2. fix `Insert has filtered data in strict mode`, introduced by apache#49116

   ```sql
   CREATE TABLE `test_insert_more_string` (
     `r_regionkey` int NULL,
     `r_name` varchar(25) NULL,
     `r_comment` varchar(152) NULL
   ) ENGINE=OLAP
   DISTRIBUTED BY HASH(`r_regionkey`) BUCKETS 1
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1"
   );

   insert into test_insert_more_string values (3, "akljalkjbalkjsldkrjewokjfalksdjflaksjfdlaskjfalsdkfjalsdfjkasfdl", "aa");

   (1105, 'errCode = 2, detailMessage = Insert has filtered data in strict mode')
   ```

(cherry picked from commit 2c01f69)
dataroaring pushed a commit that referenced this pull request on Aug 12, 2025

…n collect hint map (#52629)

Cherry-pick of part of the code from PR #40202, commitId: 81f3c48

### Release note

None
Proposed changes
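Improve the handling of `insert into values` statements with many values by bypassing NereidsPlanner.

The main logic is:

1. `InsertUtils.normalizePlan` uses `FoldConstantRuleOnFE` to reduce expressions, e.g. `values(date(now()))`.
2. `FastInsertIntoValuesPlanner` skips most analysis and rewrite rules and rewrites `LogicalInlineTable` to `LogicalUnion` or `LogicalOneRowRelation`.
3. Date-time strings are parsed on a fast path, without a date format.
4. `getHintMap` and the normal lexer share the same tokens.
5. `set enable_fast_analyze_into_values=false` forces all optimization rules to run, as a fallback if we hit bugs in `FastInsertIntoValuesPlanner`.

Test: insert 1000 rows with 1000 columns; the columns contain int, bigint, decimal(26,7), date, datetime, and varchar (10 Chinese characters).

| FastInsertIntoValuesPlanner | NereidsPlanner (enable_fast_analyze_into_values=false) | Legacy optimizer in 2.1.6 | Nereids planner in 2.1.6 |
|---|---|---|---|
| 16s (bottleneck is antlr's lexer) | 32s | 16s | 80s |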
If you use FastInsertIntoValuesPlanner with group commit in a transaction, the time drops to 12s.
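As an illustration of what parsing a date string without a date format can look like, here is a minimal sketch that reads the digits of a strict `yyyy-MM-dd` string directly instead of going through a formatter. It is an assumption about the general technique, not the parser used in this PR; `FastDateParseSketch`, `parseIsoDate`, and `digits` are hypothetical names, and real date literals accept more shapes than this.

```java
import java.time.LocalDate;
import java.time.YearMonth;

// Hypothetical helper; not the Doris implementation.
final class FastDateParseSketch {
    /**
     * Parses a strict "yyyy-MM-dd" string by reading digits at fixed offsets.
     * Returns null when the shape or the calendar date is invalid.
     */
    static LocalDate parseIsoDate(String s) {
        if (s == null || s.length() != 10 || s.charAt(4) != '-' || s.charAt(7) != '-') {
            return null;
        }
        int year = digits(s, 0, 4);
        int month = digits(s, 5, 7);
        int day = digits(s, 8, 10);
        if (year < 0 || month < 1 || month > 12 || day < 1) {
            return null;
        }
        // Reject days past the end of the month (handles leap years).
        if (day > YearMonth.of(year, month).lengthOfMonth()) {
            return null;
        }
        return LocalDate.of(year, month, day);
    }

    /** Returns the non-negative integer in s[from, to), or -1 if a character is not an ASCII digit. */
    private static int digits(String s, int from, int to) {
        int value = 0;
        for (int i = from; i < to; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseIsoDate("2020-10-20"));
        System.out.println(parseIsoDate(""));
    }
}
```

The point of the fixed-offset approach is that it avoids building and interpreting a format pattern for every cell, which matters when a single statement carries many date and datetime literals.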
TODO: build a custom lexer. In my test with a hand-written lexer, FastInsertIntoValuesPlanner without group commit drops from 16s to 12s, but a proper implementation takes more effort: regular expression -> NFA -> DFA -> minimal DFA -> lexer codegen.
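For a sense of what a hand-written lexer for a `values` row involves, here is a minimal sketch. It is not the author's test lexer and not generated through the regular expression -> NFA -> DFA pipeline named in the TODO; it is a direct character-by-character scanner over a tiny, hypothetical token set (`ValuesLexerSketch`, `Kind`, and `Token` are illustrative names). It handles no escape sequences and matches numbers only loosely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a hand-written scanner; not the Doris or ANTLR lexer.
final class ValuesLexerSketch {
    enum Kind { LPAREN, RPAREN, COMMA, NUMBER, STRING }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Scans input left to right, branching on the current character like a small DFA. */
    static List<Token> lex(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        int n = input.length();
        while (i < n) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '(') {
                tokens.add(new Token(Kind.LPAREN, "("));
                i++;
            } else if (c == ')') {
                tokens.add(new Token(Kind.RPAREN, ")"));
                i++;
            } else if (c == ',') {
                tokens.add(new Token(Kind.COMMA, ","));
                i++;
            } else if (c == '\'') {
                int start = ++i; // first character after the opening quote
                while (i < n && input.charAt(i) != '\'') {
                    i++;
                }
                if (i == n) {
                    throw new IllegalArgumentException("unterminated string at " + (start - 1));
                }
                tokens.add(new Token(Kind.STRING, input.substring(start, i)));
                i++; // skip the closing quote
            } else if (c == '-' || isAsciiDigit(c)) {
                int start = i++;
                // Loose match: digits and dots; a real lexer would validate the shape.
                while (i < n && (isAsciiDigit(input.charAt(i)) || input.charAt(i) == '.')) {
                    i++;
                }
                tokens.add(new Token(Kind.NUMBER, input.substring(start, i)));
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("(1, 'ab', -2.5)"));
    }
}
```

A scanner like this avoids the general-purpose machinery of a generated lexer, which is one plausible source of the speedup mentioned above. Covering the full SQL token set correctly is where the regular expression -> NFA -> DFA -> codegen effort comes in.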