Describe the bug
max_distinct_count in datafusion/physical-plan/src/joins/utils.rs panics with "attempt to subtract with overflow" in the Precision::Exact branch (line 725):
Precision::Exact(count) => {
let count = count - stats.null_count.get_value().unwrap_or(&0); // <-- panic
This happens when num_rows (Exact) is smaller than null_count. That state became possible after #20228 added fetch support to HashJoinExec: when a limit is pushed down, HashJoinExec::partition_statistics() calls stats.with_fetch(self.fetch, 0, 1), which reduces num_rows to Exact(fetch_value) but leaves null_count in the column statistics unchanged.
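The arithmetic behind the panic can be sketched in isolation. The snippet below is a minimal, standalone illustration and not DataFusion code: the `Precision` enum is a simplified stand-in for the real type, and `non_null_rows_checked` and `non_null_rows` are hypothetical helper names. Clamping with `saturating_sub` is only one possible guard, assumed here for illustration; it is not necessarily the fix that should land upstream.

```rust
/// Simplified stand-in for DataFusion's `Precision<usize>` (hypothetical, illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Same subtraction as in the report, but checked: returns `None` in the case
/// where a plain `num_rows - null_count` would overflow (and panic in debug builds).
fn non_null_rows_checked(num_rows: usize, null_count: usize) -> Option<usize> {
    num_rows.checked_sub(null_count)
}

/// One possible guard (an assumption, not the upstream fix): clamp the result at
/// zero when the stale null count exceeds the fetch-limited row count.
fn non_null_rows(num_rows: Precision, null_count: Precision) -> Precision {
    // Treat a missing null count as zero, like `unwrap_or(&0)` in the report.
    let nulls = match null_count {
        Precision::Exact(n) | Precision::Inexact(n) => n,
        Precision::Absent => 0,
    };
    match num_rows {
        Precision::Exact(rows) => Precision::Exact(rows.saturating_sub(nulls)),
        Precision::Inexact(rows) => Precision::Inexact(rows.saturating_sub(nulls)),
        Precision::Absent => Precision::Absent,
    }
}
```

With a fetch of 10 applied to statistics that still report 25 nulls, `non_null_rows_checked(10, 25)` returns `None`, which is the case that panics today, while `non_null_rows(Precision::Exact(10), Precision::Exact(25))` yields `Precision::Exact(0)`. Whether a clamped value should stay `Exact` or be downgraded to `Inexact` is a design question this sketch does not settle.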
Example failing pipeline:
https://github.com/datafusion-contrib/datafusion-distributed/actions/runs/22798285744/job/66136064932?pr=366
To Reproduce
git clone https://github.com/datafusion-contrib/datafusion-distributed
cd datafusion-distributed
git checkout branch-53
cargo test --test tpcds_plans_test tests::test_tpcds_19 --all-features
Expected behavior
No subtraction overflow.
Additional context
No response