Skip to content

Uncorrelated subquery in JOIN panics (SQLStorm) #17772

@2010YOUY01

Description

@2010YOUY01

Describe the bug

datafusion-cli is compiled from the latest main commit 5bbdb7e

To reproduce, tpch dataset can be generated via https://github.com/clflushopt/tpchgen-rs/tree/main/tpchgen-cli, and update the following paths

CREATE EXTERNAL TABLE customer
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/customer.parquet';

CREATE EXTERNAL TABLE nation
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/nation.parquet';

CREATE EXTERNAL TABLE part
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/part.parquet';

CREATE EXTERNAL TABLE region
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/region.parquet';

CREATE EXTERNAL TABLE lineitem
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/lineitem.parquet';

CREATE EXTERNAL TABLE orders
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/orders.parquet';

CREATE EXTERNAL TABLE partsupp
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/partsupp.parquet';

CREATE EXTERNAL TABLE supplier
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/supplier.parquet';

WITH RankedOrders AS (
  SELECT
    o_orderkey,
    o_orderdate,
    DENSE_RANK() OVER (
      PARTITION BY o_orderpriority
      ORDER BY o_totalprice DESC
    ) AS priority_rank
  FROM orders
  WHERE o_orderstatus = 'F'
),
CustomerNation AS (
  SELECT c_custkey, c_name
  FROM customer
),
OrderLineStats AS (
  SELECT
    l_orderkey,
    SUM(l_extendedprice) AS total_line_value
  FROM lineitem
  GROUP BY l_orderkey
),
TopOrder AS (
  -- Pick one top-priced order (priority_rank = 1)
  SELECT o_orderkey
  FROM RankedOrders
  WHERE priority_rank = 1
  LIMIT 1
)
SELECT DISTINCT
  c.c_name,
  r.o_orderkey,
  COALESCE(o.total_line_value, 0) AS total_order_value
FROM CustomerNation AS c
LEFT JOIN RankedOrders AS r
  ON r.o_orderkey = (SELECT o_orderkey FROM TopOrder)
LEFT JOIN OrderLineStats AS o
  ON o.l_orderkey = r.o_orderkey
WHERE r.o_orderdate >= DATE '1995-01-01'
  AND r.o_orderdate <  DATE '1996-01-01';

thread 'main' panicked at datafusion/common/src/dfschema.rs:365:31:
index out of bounds: the len is 2 but the index is 2
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

To Reproduce

No response

Expected behavior

No response

Additional context

Found by SQLStorm #17698

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions