Interval coercion: date_bin('1 hour', ...) does not work but date_bin(interval '1 hour', ...) does #4853

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Users of IOx are now using date_bin and have hit a UX issue.

Calling date_bin with a string as the first argument does not work and gives a hard-to-understand error message:

select date_bin('1 hour', column1, TIMESTAMP '2001-01-01 00:00:00Z') 
from (values 
  (timestamp '2022-01-01 00:00:00'), 
  (timestamp '2022-01-01 01:00:00'), 
  (timestamp '2022-01-02 00:00:00')
) as sq;
Plan("Coercion from [Utf8, Timestamp(Nanosecond, None), Timestamp(Nanosecond, None)] to the signature Exact([Interval(DayTime), Timestamp(Nanosecond, None), Timestamp(Nanosecond, None)]) failed.")

But it does work when the first argument is written explicitly as an interval, with interval '1 hour':

select date_bin(interval '1 hour', column1, TIMESTAMP '2001-01-01 00:00:00Z') 
from (values 
  (timestamp '2022-01-01 00:00:00'), 
  (timestamp '2022-01-01 01:00:00'), 
  (timestamp '2022-01-02 00:00:00')
) as sq;
+-----------------------------------------------------------------------------+
| datebin(IntervalDayTime("3600000"),sq.column1,Utf8("2001-01-01 00:00:00Z")) |
+-----------------------------------------------------------------------------+
| 2022-01-01T00:00:00                                                         |
| 2022-01-01T01:00:00                                                         |
| 2022-01-02T00:00:00                                                         |
+-----------------------------------------------------------------------------+
3 rows in set. Query took 0.015 seconds.

Describe the solution you'd like

I would like the string to be coerced to an interval, as Postgres does:

postgres=# select date_bin('1 hour', column1, TIMESTAMP '2001-01-01 00:00:00Z') 
from (values 
  (timestamp '2022-01-01 00:00:00'), 
  (timestamp '2022-01-01 01:00:00'), 
  (timestamp '2022-01-02 00:00:00')
) as sq;
      date_bin       
---------------------
 2022-01-01 00:00:00
 2022-01-01 01:00:00
 2022-01-02 00:00:00
(3 rows)

Additional context
It looks to me like Utf8 --> Interval coercion simply needs to be added to the function coercion rules:

https://github.com/apache/arrow-datafusion/blob/f9b72f4230687b884a92f79d21762578d3d56281/datafusion/expr/src/type_coercion/functions.rs#L177-L179
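To illustrate the idea, here is a minimal, self-contained sketch of what such a rule could look like. It is not the actual DataFusion code: the `DataType` enum is a toy stand-in for Arrow's type, and the names `can_coerce_from` and `coerce_to_exact` and the existing Utf8 to Timestamp rule are assumptions made for the example.

```rust
// Toy stand-in for Arrow's DataType; NOT the real arrow/DataFusion definition.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Timestamp,
    IntervalDayTime,
}

/// Hypothetical helper: may an argument of type `from` be implicitly
/// coerced to `to` when matching a function signature?
fn can_coerce_from(to: &DataType, from: &DataType) -> bool {
    if to == from {
        return true;
    }
    match to {
        // Assumed pre-existing rule: strings may be coerced to timestamps.
        DataType::Timestamp => matches!(from, DataType::Utf8),
        // The rule this issue proposes: strings may also be coerced to intervals.
        DataType::IntervalDayTime => matches!(from, DataType::Utf8),
        _ => false,
    }
}

/// Coerce the argument types to an exact signature, or return None
/// if any argument cannot be coerced.
fn coerce_to_exact(args: &[DataType], signature: &[DataType]) -> Option<Vec<DataType>> {
    if args.len() != signature.len() {
        return None;
    }
    args.iter()
        .zip(signature)
        .map(|(arg, want)| can_coerce_from(want, arg).then(|| want.clone()))
        .collect()
}
```

With the `IntervalDayTime` arm in place, the argument types `[Utf8, Timestamp, Timestamp]` from the failing query would match the `[IntervalDayTime, Timestamp, Timestamp]` signature in this sketch; without that arm, `coerce_to_exact` returns `None`, which corresponds to the "Coercion ... failed" plan error above. A real change would also need to parse the string into an interval value when the cast is applied.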

Marking this as a good first issue, as it should be straightforward and a good way to learn about the DataFusion codebase.
