fix: move coercion of union from builder to TypeCoercion #11961
Merged
alamb merged 16 commits into apache:main on Aug 14, 2024
jonahgao
commented
Aug 13, 2024
}

#[test]
fn plan_builder_union_different_num_columns_error() -> Result<()> {
jonahgao
commented
Aug 13, 2024
fn assert_optimized_plan_equal(plan: LogicalPlan, expected: &str) -> Result<()> {
    assert_optimized_plan_eq(Arc::new(EliminateNestedUnion::new()), plan, expected)
    let options = ConfigOptions::default();
    let analyzed_plan = Analyzer::with_rules(vec![Arc::new(TypeCoercion::new())])
Member
Author
Add TypeCoercion to avoid breaking the tests.
jonahgao
commented
Aug 13, 2024
05)--------Projection: Int64(1) AS a
06)----------EmptyRelation
07)--Projection: Float64(2.1) + x.a AS Float64(0) + x.a
08)----Aggregate: groupBy=[[Float64(2.1) + CAST(x.a AS Float64)]], aggr=[[]]
Member
Author
This is the only difference I observed from the previous test results: because type coercion now runs, x.a gets an additional cast.
jonahgao
commented
Aug 13, 2024
}

#[test]
fn union_with_different_column_names() {
Member
Author
Converting these union tests to SLT is easier than moving them to TypeCoercion.
alamb
approved these changes
Aug 13, 2024
    return plan_err!("Empty UNION");
}

// Temporarily use the schema from the left input and later rely on the analyzer to
05)----EmptyRelation

# union_with_incompatible_data_type()
query error Error during planning: UNION Column 'Int64\(1\)' \(type: Int64\) is not compatible with other type: Interval\(MonthDayNano\)
Contributor
I noticed that the error lists the types in the reverse of the order in which they appear in the query.
The error message might be better if it were something more like:
Incompatible inputs for Union. Previous inputs were of type Interval\(MonthDayNano\), got incompatible type 'Int64\(1\)' \(type: Int64\)
alamb
approved these changes
Aug 14, 2024
/// Get a common schema that is compatible with all inputs of UNION.
fn coerce_union_schema(inputs: Vec<Arc<LogicalPlan>>) -> Result<DFSchema> {
fn coerce_union_schema(inputs: &Vec<Arc<LogicalPlan>>) -> Result<DFSchema> {
Contributor
Looks great @jonahgao -- thank you
Member
Author
Thanks for the review @alamb
wiedld
pushed a commit
to influxdata/arrow-datafusion
that referenced
this pull request
Aug 19, 2024
…pache#11961)" This reverts commit afa23ab.
wiedld
pushed a commit
to influxdata/arrow-datafusion
that referenced
this pull request
Aug 21, 2024
…pache#11961)" This reverts commit afa23ab.
Which issue does this PR close?
Closes #11742.
Rationale for this change
Currently, in the LogicalPlan builder, we coerce the inputs of a union plan to a common schema.
But this common schema is inaccurate because it lacks type coercion for expressions.
For example, in #11742, the return type of the expression nvl(v1, 0.5) is INT before type coercion and becomes Float64 afterward.
The fix is to place coercion of the union after the type coercion of expressions.
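To make the ordering issue concrete, here is a minimal, self-contained sketch. It is a toy model, not DataFusion code: `Ty`, `Nvl`, `common_type`, and this simplified `coerce_union_schema` are all hypothetical stand-ins, and the rule that an uncoerced `nvl` reports its first argument's type is an assumption made only to mirror the INT-before, Float64-after behavior described above.

```rust
// Toy model only: none of these types or functions are DataFusion's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Int64,
    Float64,
}

/// Toy coercion rule: Float64 wins over Int64.
fn common_type(a: Ty, b: Ty) -> Ty {
    if a == Ty::Float64 || b == Ty::Float64 {
        Ty::Float64
    } else {
        Ty::Int64
    }
}

/// Hypothetical stand-in for an expression such as `nvl(v1, 0.5)`.
struct Nvl {
    column: Ty,
    literal: Ty,
}

impl Nvl {
    /// Assumption: before expression coercion, the reported type is the
    /// first argument's type.
    fn type_before_coercion(&self) -> Ty {
        self.column
    }

    /// After expression coercion, both arguments share a common type.
    fn type_after_coercion(&self) -> Ty {
        common_type(self.column, self.literal)
    }
}

/// Toy union coercion: compute a common type column by column.
fn coerce_union_schema(inputs: &[Vec<Ty>]) -> Result<Vec<Ty>, String> {
    let first = inputs.first().ok_or_else(|| "Empty UNION".to_string())?;
    let mut out = first.clone();
    for schema in &inputs[1..] {
        if schema.len() != out.len() {
            return Err("UNION inputs have different numbers of columns".to_string());
        }
        for (o, t) in out.iter_mut().zip(schema) {
            *o = common_type(*o, *t);
        }
    }
    Ok(out)
}

fn main() {
    let expr = Nvl { column: Ty::Int64, literal: Ty::Float64 };
    let other_input = vec![Ty::Int64];

    // Union schema computed in the builder, before expression coercion.
    let early = coerce_union_schema(&[vec![expr.type_before_coercion()], other_input.clone()]);
    // Union schema computed after expression coercion has run.
    let late = coerce_union_schema(&[vec![expr.type_after_coercion()], other_input]);

    println!("early: {:?}, late: {:?}", early, late);
}
```

In this toy model the early schema settles on Int64 while the late one settles on Float64, which is the kind of mismatch that motivates running union coercion only after expression types are final.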
What changes are included in this PR?
Move the coercion of union from the builder to TypeCoercion.
Are these changes tested?
Yes
Are there any user-facing changes?
Yes
project_with_column_index has become private. It was introduced for UNION in #2108, and I think it should only be used internally.