Skip to content

Equivalence class projection does not find new equivalent classes correctly #14326

@askalt

Description

@askalt

Consider the following case:

We have an exec with schema = [a,b,c] and an equivalent class: a == b. We project [a+c,b+c] from it. As a==b we can conclude that a+c==b+c, but that case does not work for now, i.e. the next test fails:

#[tokio::test]
async fn test_stat_projection_equivalent_classes() -> Result<()> {
    // - columns: [a, b, c].
    // - "a" and "b" in the same equivalence class.
    // - then after a+c, b+c projection col(0) and col(1) must be
    // in the same class too.
    let schema = Arc::new(Schema::new(vec![
        Field::new("a", DataType::Int32, false),
        Field::new("b", DataType::Int32, false),
        Field::new("c", DataType::Int32, false),
    ]));
    let mut eq_properties = EquivalenceProperties::new(Arc::clone(&schema));
    eq_properties.add_equal_conditions(&col("a", &schema)?, &col("b", &schema)?)?;

    let properties = PlanProperties::new(
        eq_properties,
        Partitioning::UnknownPartitioning(1),
        EmissionType::Both,
        Boundedness::Bounded,
    );

    let projection = ProjectionExec::try_new(
        vec![
            (
                binary(
                    col("a", &schema)?,
                    Operator::Plus,
                    col("c", &schema)?,
                    &schema,
                )?,
                "a+c".to_owned(),
            ),
            (
                binary(
                    col("b", &schema)?,
                    Operator::Plus,
                    col("c", &schema)?,
                    &schema,
                )?,
                "b+c".to_owned(),
            ),
        ],
        Arc::new(PropertiesExec {
            properties,
            schema: Arc::clone(&schema),
        }),
    )?;

    let actual_properties = projection.properties();
    let eq_group = actual_properties.eq_properties.eq_group();

    assert!(!eq_group.is_empty());
    Ok(())
}

It is because current code that builds new classes just compares source expressions:

let mut new_classes: IndexMap<Arc<dyn PhysicalExpr>, EquivalenceClass> =
IndexMap::new();
mapping.iter().for_each(|(source, target)| {
new_classes
.entry(Arc::clone(source))
.or_insert_with(EquivalenceClass::new_empty)
.push(Arc::clone(target));
});

Instead, I propose to build normalized mapping (where each source expression is normalized against existing equivalence classes) and then and then make a comparison.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions