ARROW-10808: [Rust][DataFusion] Support nested expressions in aggregations. #8836
@@ -56,6 +56,12 @@ impl Debug for ScalarUDF {
     }
 }

+impl PartialEq for ScalarUDF {
+    fn eq(&self, other: &Self) -> bool {
+        self.name == other.name && self.signature == other.signature
Member
I am afraid that this may not be sufficient: two UDFs may have the same name and signature and represent different semantics. In general UDFs are anonymous functions and thus do not have a Note that while this can't happen in SQL, because UDFs are registered via In the `let expr1 = pow.call(vec![col("a"), col("b")]);` demonstrates how to use a UDF without registering it on the context, in which case its name is only used when printing the plan.
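To make the concern concrete, here is a minimal sketch. It does not use DataFusion's types: `MiniUdf`, `ScalarFn`, `make_udf`, and `colliding_udfs` are hypothetical stand-ins, and `arity` is a simplified placeholder for a signature. It only assumes that a UDF carries a name, a signature, and an opaque closure, and that equality is defined as in the diff above.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the function a scalar UDF wraps.
type ScalarFn = Arc<dyn Fn(f64, f64) -> f64 + Send + Sync>;

// Hypothetical, simplified UDF: not DataFusion's ScalarUDF.
#[derive(Clone)]
struct MiniUdf {
    name: String,
    arity: usize, // simplified placeholder for a signature
    fun: ScalarFn,
}

impl PartialEq for MiniUdf {
    // Mirrors the comparison under review: name and signature only.
    fn eq(&self, other: &Self) -> bool {
        self.name == other.name && self.arity == other.arity
    }
}

fn make_udf(name: &str, fun: ScalarFn) -> MiniUdf {
    MiniUdf { name: name.to_string(), arity: 2, fun }
}

/// Builds two UDFs that share a name and signature but compute different things.
fn colliding_udfs() -> (MiniUdf, MiniUdf) {
    let pow = make_udf("f", Arc::new(|a: f64, b: f64| a.powf(b)));
    let add = make_udf("f", Arc::new(|a: f64, b: f64| a + b));
    (pow, add)
}
```

Under this definition the two values returned by `colliding_udfs()` compare equal even though calling them on the same inputs gives different results, which is the situation described above for UDFs that are never registered under a unique name.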
Contributor
Author
Thanks for pointing this out, I was afraid there might be an issue like this. I'm happy to back out the Before I do that, I'm wondering if you think there is a more suitable implementation of
I would advocate for some definition of equality for Let me know what you think. Thanks!
Member
I do not have a good answer here. Both options that you presented are reasonable. IMO we should just proceed with one, but let's double-check with @andygrove and @alamb ^_^
Member
I agree that if we are performing comparison of expressions, we should have
Contributor
I think to handle it in the general case we would have to require that the user-defined function / aggregate itself define equality (perhaps with a default implementation that compares function pointers, as suggested by @drusso). I think user-defined functions in this kind of framework are also tagged with other properties (like whether they have side effects, and thus can't be optimized away). I personally suggest filing a ticket for this issue in the future -- it falls into the category of "more mature user defined function support".
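A default based on function pointers could look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: `MiniUdf`, `ScalarFn`, and `sample_udfs` are hypothetical names, and the comparison uses the address of the shared `Arc` allocation as the notion of "same function".

```rust
use std::sync::Arc;

// Hypothetical stand-in for the function a scalar UDF wraps.
type ScalarFn = Arc<dyn Fn(f64) -> f64 + Send + Sync>;

// Hypothetical, simplified UDF: not DataFusion's ScalarUDF.
#[derive(Clone)]
struct MiniUdf {
    name: String,
    fun: ScalarFn,
}

impl PartialEq for MiniUdf {
    // Two UDFs are equal only if they share a name AND point at the very
    // same function allocation. Casting to a thin pointer compares the data
    // address only and ignores the vtable part of the trait-object pointer.
    fn eq(&self, other: &Self) -> bool {
        let lhs = Arc::as_ptr(&self.fun) as *const ();
        let rhs = Arc::as_ptr(&other.fun) as *const ();
        self.name == other.name && lhs == rhs
    }
}

/// Returns a UDF, a clone of it, and an independently built look-alike.
fn sample_udfs() -> (MiniUdf, MiniUdf, MiniUdf) {
    let double = MiniUdf {
        name: "double".to_string(),
        fun: Arc::new(|x: f64| x * 2.0),
    };
    let shared = double.clone(); // shares the same Arc allocation
    let rebuilt = MiniUdf {
        name: "double".to_string(),
        fun: Arc::new(|x: f64| x * 2.0), // same body, different allocation
    };
    (double, shared, rebuilt)
}
```

The trade-off is that pointer identity is conservative: it never equates two different functions, but it also treats two separately constructed, semantically identical functions as different, which is why letting the UDF author override equality is suggested as the general answer.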
Contributor
Author
I've filed ARROW-10963.
+    }
+}
+
 impl ScalarUDF {
     /// Create a new ScalarUDF
     pub fn new(
@@ -20,3 +20,4 @@

 pub mod parser;
 pub mod planner;
+mod utils;
I am wondering, is this the `PartialEq` implementation we always want, or just for this purpose? Otherwise it might be better to create a normal function outside the `PartialEq` for it? Or do we need it now for a map/set somewhere?
I wasn't sure if it was a suitable implementation, and Jorge raised a good reason why it's not. See thread below.
The changes in this pull request depend heavily on determining equivalency of two expressions. For example, `rebase_expr()` and `find_exprs_in_expr()` use `Vec::contains()` (which requires the contained items implement the `PartialEq` trait).
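As a rough illustration of that dependency, the sketch below uses a hypothetical `MiniExpr` enum (not DataFusion's `Expr`) with a derived `PartialEq`, and a hypothetical helper `is_known` that relies on `contains()` the same way the helpers mentioned above are described as doing.

```rust
// Hypothetical, simplified expression tree: not DataFusion's Expr.
#[derive(Debug, Clone, PartialEq)]
enum MiniExpr {
    Column(String),
    Literal(i64),
    Add(Box<MiniExpr>, Box<MiniExpr>),
    Sum(Box<MiniExpr>),
}

/// `contains` compiles only because `MiniExpr` implements `PartialEq`;
/// the derived implementation compares the trees structurally.
fn is_known(exprs: &[MiniExpr], candidate: &MiniExpr) -> bool {
    exprs.contains(candidate)
}

/// Builds the aggregates `SUM(a + 1)` and `SUM(b)`.
fn sample_aggregates() -> Vec<MiniExpr> {
    let a_plus_one = MiniExpr::Add(
        Box::new(MiniExpr::Column("a".to_string())),
        Box::new(MiniExpr::Literal(1)),
    );
    vec![
        MiniExpr::Sum(Box::new(a_plus_one)),
        MiniExpr::Sum(Box::new(MiniExpr::Column("b".to_string()))),
    ]
}
```

If one variant wrapped a type without `PartialEq` (as a UDF holding a closure would), the derive would fail to compile, which is why some definition of equality for UDFs is needed for this approach.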