Remove struct UDF, and use named_struct everywhere #9839

@alamb

Is your feature request related to a problem or challenge?

This is a follow-on to #9743, where @gstvg added a great named_struct function to construct StructArrays ❤️

As part of that PR, @yyy1000 noted that the existing code in the struct udf is now never called: #9743 (comment)

Describe the solution you'd like

  1. Make the `invoke()` function return a "not yet implemented" error: https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/functions/src/core/struct.rs#L90-L92

  2. Implement the simplify API to rewrite calls to struct() to a call to named_struct

https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/expr/src/udf.rs#L372-L378

  3. Update the SQL planner to call struct rather than building up the `c0`, `c1`, etc. field names and calling named_struct
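
To make the simplify rewrite described above concrete, here is a minimal, self-contained sketch. It does not use the real DataFusion API: the `Expr` enum and the `simplify_struct` function are hypothetical stand-ins invented for illustration, and the only detail taken from this issue is the `c0`, `c1`, ... field naming. The actual implementation would live in the UDF's simplify hook linked above and would operate on DataFusion's own expression type.

```rust
/// Toy expression type standing in for DataFusion's `Expr`.
/// Hypothetical: the real type has many more variants and a different shape.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Call { name: String, args: Vec<Expr> },
}

/// Rewrite `struct(a, b, ...)` into `named_struct('c0', a, 'c1', b, ...)`.
/// Any expression that is not a call to `struct` is returned unchanged.
fn simplify_struct(expr: Expr) -> Expr {
    match expr {
        Expr::Call { name, args } if name == "struct" => {
            // Interleave a generated field name (c0, c1, ...) before each argument.
            let named_args = args
                .into_iter()
                .enumerate()
                .flat_map(|(i, arg)| [Expr::Literal(format!("c{i}")), arg])
                .collect();
            Expr::Call {
                name: "named_struct".to_string(),
                args: named_args,
            }
        }
        other => other,
    }
}
```

With a rewrite along these lines, the planner would only need to emit a plain `struct(...)` call, and the field naming would be generated in one place.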

Describe alternatives you've considered

We could also just remove the struct udf entirely, though in that case it is important to keep the struct expr_fn function for backwards compatibility:

https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/functions/src/core/mod.rs#L44

I think it could be implemented as its own function like

Additional context

No response
