[EPIC] Support TPC-DS benchmarks #4763

@andygrove

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to be able to run all TPC-DS queries with DataFusion, but some are not yet supported.

Old description:

I am testing with [SQLBench-DS](https://github.com/sql-benchmarks/sqlbench-ds) and I am seeing some failures. Many of these affect multiple queries, but I have listed only a single example query for each type of error.

- https://github.com/apache/arrow-datafusion/issues/4794
- https://github.com/apache/arrow-datafusion/issues/123
- `At least two values are needed to calculate variance` (q17)
- `The type of Int32 = Int64 of binary physical should be same` (q72)
- `physical plan is not yet implemented for GROUPING aggregate function` (q27)
- `Projections require unique expression names but the expression "MAX(customer_demographics.cd_dep_count)" at position 6 and "MAX(customer_demographics.cd_dep_count)" at position 7 have the same name. Consider aliasing ("AS") one of them.` (q35)
- `The function Stddev does not support inputs of type Decimal128(7, 2).` (q74)
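
The q17 message is consistent with how sample variance is usually defined: the divisor is `n - 1`, so a group containing a single value has no defined sample variance. The following is a minimal sketch of that arithmetic in plain Python. It is not DataFusion's implementation, and `sample_variance` is a hypothetical helper introduced only for illustration.

```python
from typing import Optional, Sequence


def sample_variance(values: Sequence[float]) -> Optional[float]:
    """Sample variance with an n - 1 divisor (hypothetical helper).

    Returns None, rather than raising, when fewer than two values are present.
    """
    n = len(values)
    if n < 2:
        # With n == 1 the divisor n - 1 is zero, so the result is undefined.
        return None
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)
```

Returning a null value instead of raising an error is one possible way for an engine to handle single-row groups. This sketch does not claim that this is what DataFusion does or should do.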

Describe the solution you'd like
Support all the queries.

Describe alternatives you've considered
N/A

Additional context
N/A

Labels

- PROPOSAL EPIC: A proposal being discussed that is not yet fully underway
- enhancement: New feature or request
