Skip to content

Add support for non-correlated subqueries #3266

@andygrove

Description

@andygrove

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I would like to be able to run this query:

CREATE TABLE paintings AS SELECT 'Mona Lisa' as name, 1000000 as listed_price;

SELECT name, listed_price
FROM paintings
WHERE listed_price > (
    SELECT AVG(listed_price)
    FROM paintings
);

It currently fails with:

Skipping optimizer rule decorrelate_scalar_subquery due to unexpected error: scalar subqueries must have a filter to be correlated at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/decorrelate_scalar_subquery.rs:177
caused by
Error during planning: Could not coerce into Filter! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1127
NotImplemented("Physical plan does not support logical expression (<subquery>)")

This is because the existing subquery optimizer rules fail to rewrite the query.

It should be possible to rewrite the query as a join:

SELECT name, listed_price
FROM paintings
CROSS JOIN (SELECT AVG(listed_price) AS avg_price FROM paintings) temp
WHERE listed_price > temp.avg_price;

Describe the solution you'd like
☝️

Describe alternatives you've considered
None

Additional context
None

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestsqlSQL Planner

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions