Describe the bug
When upgrading to DataFusion 52, we hit a bug in our test cases where pre-sorted data was being re-sorted.
I spent a while (with Codex's help) tracking it down, and it seems to be caused by the FileScanConfig projection not being applied to existing orderings.
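To illustrate what "applying the projection to an ordering" means here, below is a minimal standalone sketch. It is not DataFusion's implementation: `SortColumn` and `project_ordering` are hypothetical names, and it assumes the ordering is expressed against file-schema indices (the unit test in this issue instead specifies the ordering in projected-schema indices, so the real code path may differ).

```rust
/// Hypothetical stand-in for a sort key: a column name plus its index
/// in some schema. Not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct SortColumn {
    name: String,
    index: usize,
}

/// Rewrites an ordering expressed against the file schema so that each key
/// refers to its position in the projected schema.
///
/// `projection[i]` is the file-schema index of output column `i`.
/// The ordering is truncated at the first key that is missing from the
/// projection, because only a leading prefix of sort keys stays valid.
fn project_ordering(ordering: &[SortColumn], projection: &[usize]) -> Vec<SortColumn> {
    let mut projected = Vec::new();
    for key in ordering {
        match projection.iter().position(|&file_idx| file_idx == key.index) {
            Some(out_idx) => projected.push(SortColumn {
                name: key.name.clone(),
                index: out_idx,
            }),
            // A sort key was projected away: later keys no longer describe
            // a usable ordering on their own.
            None => break,
        }
    }
    projected
}
```

With a file schema of `[a, b, c]` and a projection of `[2, 0]`, an ordering of `[c@2, a@0]` becomes `[c@0, a@1]`. If that rewrite is skipped, `c@2` points past the end of the two-column output schema, which is one way an existing ordering could go unrecognized and trigger an extra sort.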
To Reproduce
Here is a unit test. I am working on an end-to-end .slt test.
#[test]
fn equivalence_properties_projection_reorders_schema() {
    // This test ensures `project_orderings` is applied even when there are no
    // partition columns: a projection reorders the schema, and the output ordering
    // is specified in projected schema indices.
    let file_schema = Arc::new(Schema::new(vec![
        Field::new("a", DataType::Int32, false),
        Field::new("b", DataType::Int64, false),
        Field::new("c", DataType::Utf8, true),
    ]));
    let object_store_url = ObjectStoreUrl::parse("test:///").unwrap();
    let table_schema = TableSchema::new(Arc::clone(&file_schema), vec![]);
    let file_source: Arc<dyn FileSource> =
        Arc::new(MockSource::new(table_schema.clone()));
    let config = FileScanConfigBuilder::new(
        object_store_url.clone(),
        Arc::clone(&file_source),
    )
    .with_projection_indices(Some(vec![2, 0]))
    .unwrap()
    // Indices are in the projected schema: [c, a] -> [0, 1].
    .with_output_ordering(vec![[sort_expr("c", 0), sort_expr("a", 1)].into()])
    .build();

    let eq_properties = config.eq_properties();
    let ordering = eq_properties
        .output_ordering()
        .expect("expected output ordering");

    // First sort key: column `c` at projected index 0.
    let first_col = ordering[0].expr.as_any().downcast_ref::<Column>().unwrap();
    assert_eq!(first_col.name(), "c");
    assert_eq!(first_col.index(), 0);

    // Second sort key: column `a` at projected index 1.
    let second_col = ordering[1].expr.as_any().downcast_ref::<Column>().unwrap();
    assert_eq!(second_col.name(), "a");
    assert_eq!(second_col.index(), 1);
}
Expected behavior
No response
Additional context
No response