Average groups accumulator doesn't coerce to return type in presence of null values #10113

@gruuya

Describe the bug

We observed TPC-DS q26 failing with the following error:
Arrow error: Invalid argument error: column types must match schema types, expected Decimal128(11, 6) but found Decimal128(38, 10) at column index 2

The source files were generated using DuckDB, and the original column type is Decimal128(7, 2). The type coercion for AVG then widens this to Decimal128(11, 6), see https://github.com/apache/arrow-datafusion/blob/4ad4f90d86c57226a4e0fb1f79dfaaf0d404c273/datafusion/expr/src/type_coercion/aggregates.rs#L457-L462
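
The widening from Decimal128(7, 2) to Decimal128(11, 6) is consistent with the rule I understand the linked coercion code to apply for AVG over decimals: add 4 to both precision and scale, capping each at 38. Below is a minimal sketch of that arithmetic under that assumption. `avg_decimal_return_type` and the two constants are illustrative names, not DataFusion's API.

```python
# Illustrative model of the AVG return-type rule for Decimal128 inputs.
# Assumption: precision and scale each grow by 4 and are capped at 38.
DECIMAL128_MAX_PRECISION = 38
DECIMAL128_MAX_SCALE = 38


def avg_decimal_return_type(precision: int, scale: int) -> tuple[int, int]:
    """Return the (precision, scale) AVG would produce under the assumed rule."""
    new_precision = min(DECIMAL128_MAX_PRECISION, precision + 4)
    new_scale = min(DECIMAL128_MAX_SCALE, scale + 4)
    return new_precision, new_scale


# The column type from this report: Decimal128(7, 2)
print(avg_decimal_return_type(7, 2))  # (11, 6)
```

Under this rule the expected output type depends only on the input type, so it should be the same whether or not a group contains nulls.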

To Reproduce

❯ create table t as values ('a', arrow_cast(1, 'Decimal128(7,2)')), ('b', arrow_cast(NULL, 'Decimal128(7,2)'));
0 rows in set. Query took 0.045 seconds.

❯ select column1, avg(column2) from t group by column1;
Arrow error: Invalid argument error: column types must match schema types, expected Decimal128(11, 6) but found Decimal128(38, 10) at column index 1
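
The error text says that a column in the output batch carries a different type from the one the schema declares. A toy model of that check may make the failure concrete. `check_columns_match_schema` is a hypothetical helper, not Arrow's API, and `Utf8` for column1 is an assumption; the decimal type strings are copied from the error above. As far as I know, Decimal128(38, 10) is the default precision and scale Arrow gives a Decimal128 array that was never assigned an explicit type, which would fit the title's claim that the result is not coerced to the return type.

```python
def check_columns_match_schema(schema_types, column_types):
    """Raise if any column's type differs from the schema's type at that index."""
    for index, (expected, found) in enumerate(zip(schema_types, column_types)):
        if expected != found:
            raise ValueError(
                "column types must match schema types, "
                f"expected {expected} but found {found} at column index {index}"
            )


# Schema declares the coerced AVG type; the array still has the default type.
schema = ["Utf8", "Decimal128(11, 6)"]   # "Utf8" for column1 is assumed
columns = ["Utf8", "Decimal128(38, 10)"]
try:
    check_columns_match_schema(schema, columns)
except ValueError as err:
    print(err)
```

The column index in the message is zero-based, which is why the reproduction reports index 1 for the second column while q26 reports index 2.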

Expected behavior

❯ create table t as values ('a', arrow_cast(1, 'Decimal128(7,2)')), ('b', arrow_cast(NULL, 'Decimal128(7,2)'));
0 rows in set. Query took 0.045 seconds.

❯ select column1, avg(column2) from t group by column1;
+---------+----------------+
| column1 | AVG(t.column2) |
+---------+----------------+
| a       | 1.000000       |
| b       |                |
+---------+----------------+
2 row(s) fetched. 
Elapsed 0.019 seconds.

Additional context

No response

Labels

bug: Something isn't working
