Add LogicalPlanBuilder::join_on  #7766

@alamb

Is your feature request related to a problem or challenge?

While working on #7612 with @nseekhao, I found the LogicalPlanBuilder::join* interfaces confusing.

Specifically, they all take the join columns as two parallel lists (left and right), plus a separate optional filter:

join_keys: (Vec<impl Into<Column>>, Vec<impl Into<Column>>),
filter: Option<Expr>,

However, the ExtractEquijoinPredicate optimizer pass already splits join predicates into equijoin predicates and "other" predicates, so callers should not have to do that split by hand.
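To make the split concrete, here is a minimal, self-contained sketch of the idea. It is not DataFusion's ExtractEquijoinPredicate implementation: the `Side` and `Expr` types and the `extract_equijoin` function are hypothetical stand-ins invented for illustration, and the sketch only recognizes plain `left column = right column` conjuncts.

```rust
// Toy types for illustration only; these are NOT DataFusion's `Expr` or `Column`.
#[derive(Debug, Clone, PartialEq)]
enum Side {
    Left,
    Right,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Col(Side, &'static str),
    Lit(i64),
    Eq(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Flatten nested ANDs into a flat list of conjuncts.
fn split_conjunction(expr: Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(l, r) => {
            split_conjunction(*l, out);
            split_conjunction(*r, out);
        }
        other => out.push(other),
    }
}

/// Separate `left_col = right_col` conjuncts (equijoin keys) from all
/// other conjuncts (the residual join filter).
fn extract_equijoin(expr: Expr) -> (Vec<(Expr, Expr)>, Vec<Expr>) {
    let mut conjuncts = Vec::new();
    split_conjunction(expr, &mut conjuncts);

    let mut keys = Vec::new();
    let mut other = Vec::new();
    for conjunct in conjuncts {
        match conjunct {
            Expr::Eq(l, r) => match (*l, *r) {
                // Already in (left, right) order.
                (l @ Expr::Col(Side::Left, _), r @ Expr::Col(Side::Right, _)) => {
                    keys.push((l, r))
                }
                // Written as `right = left`; normalize to (left, right).
                (r @ Expr::Col(Side::Right, _), l @ Expr::Col(Side::Left, _)) => {
                    keys.push((l, r))
                }
                // Equality that is not between the two sides stays a filter.
                (l, r) => other.push(Expr::Eq(Box::new(l), Box::new(r))),
            },
            c => other.push(c),
        }
    }
    (keys, other)
}
```

Given a predicate such as `l.a = r.a AND l.b > 5`, this sketch returns one key pair for `a` and leaves `l.b > 5` as the residual filter, which is the shape the existing `join_keys` / `filter` arguments expect.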

DataFrame::join_on takes advantage of this to nice effect by accepting a single list of join expressions:

pub fn join_on(
    self,
    right: DataFrame,
    join_type: JoinType,
    on_exprs: impl IntoIterator<Item = Expr>
) -> Result<DataFrame>

Describe the solution you'd like

I would like someone to

  1. Add LogicalPlanBuilder::join_on, which does the same thing as DataFrame::join_on
  2. Add documentation explaining that subsequent optimizer passes will split the expressions into equijoin keys and other predicates, so callers do not need to split them unless they want to
  3. Add a doc example to LogicalPlanBuilder::join_on that shows how to use it

Something like

impl LogicalPlanBuilder {

    pub fn join_on(
        self,
        right: LogicalPlan,
        join_type: JoinType,
        on_exprs: impl IntoIterator<Item = Expr>
    ) -> Result<Self> {
        ...
    }
}
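One way such a method could work is to AND all of the `on_exprs` together and pass the result as the filter of the existing join call, with empty key lists. The sketch below shows that idea on toy types. `Plan`, `PlanBuilder`, and this `Expr` are hypothetical stand-ins, not DataFusion's API, and error handling (the `Result` in the proposed signature) is left out for brevity.

```rust
// Toy types for illustration only; these are NOT DataFusion's types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Opaque predicate text, e.g. "l.a = r.a".
    Pred(&'static str),
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinType {
    Inner,
    Left,
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(&'static str),
    Join {
        left: Box<Plan>,
        right: Box<Plan>,
        join_type: JoinType,
        keys: (Vec<String>, Vec<String>),
        filter: Option<Expr>,
    },
}

struct PlanBuilder {
    plan: Plan,
}

impl PlanBuilder {
    fn scan(name: &'static str) -> Self {
        Self { plan: Plan::Scan(name) }
    }

    /// Existing-style interface: parallel key lists plus an optional filter.
    fn join(
        self,
        right: Plan,
        join_type: JoinType,
        keys: (Vec<String>, Vec<String>),
        filter: Option<Expr>,
    ) -> Self {
        Self {
            plan: Plan::Join {
                left: Box::new(self.plan),
                right: Box::new(right),
                join_type,
                keys,
                filter,
            },
        }
    }

    /// Proposed-style interface: a single list of expressions, ANDed
    /// together and passed as the filter with empty key lists. A later
    /// pass is assumed to pull equijoin keys back out of the filter.
    fn join_on(
        self,
        right: Plan,
        join_type: JoinType,
        on_exprs: impl IntoIterator<Item = Expr>,
    ) -> Self {
        let filter = on_exprs
            .into_iter()
            .reduce(|a, b| Expr::And(Box::new(a), Box::new(b)));
        self.join(right, join_type, (Vec::new(), Vec::new()), filter)
    }

    fn build(self) -> Plan {
        self.plan
    }
}
```

With this shape, a caller would write something like `PlanBuilder::scan("l").join_on(Plan::Scan("r"), JoinType::Inner, vec![...])` and never touch the parallel key lists. An empty `on_exprs` yields no filter at all.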

Describe alternatives you've considered

No response

Additional context

No response
