Skip to content

[Bug] data is lost when inserting into using nereids planner when type is inconsistent(DATETIME TO DATE) #34626

@yx-keith

Description

@yx-keith

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.3

What's Wrong?

when I insert data into sink_table from source_table, START_CALL_TIME filed is of DATE type in sink_table and is of DATETYPE type in source_table, the inconsistent types lead to data loss when inserting data using nereids planner, but it is ok using old planner。

this is my sql:
INSERT INTO sink_table SELECT *
FROM (
SELECT DISTINCT acct_month, call_date, device_number, OPPOSE_NUMBER, org_trm_id
, START_CALL_TIME, call_duration
FROM source_table
WHERE acct_month = SUBSTR("20240131", 1, 6) AND call_date = SUBSTR("20240131", -2)
UNION
SELECT DISTINCT acct_month, call_date, OPPOSE_NUMBER AS device_number, device_number AS OPPOSE_NUMBER, org_trm_id
, START_CALL_TIME, call_duration
FROM source_table
WHERE acct_month = SUBSTR("20240131", 1, 6) AND call_date = SUBSTR("20240131", -2)
) alias_17125550933410;

correct result is 27837957 rows in old planner, but the wrong result is 27531867 rows in nereids planner.
but when change START_CALL_TIME to DATETIME type in sink_table, nereids planner is right.

What You Expected?

is there something wrong in nereids optimizer?

How to Reproduce?

Because this problem occurs in a production environment, we can mock table to get similiar plan:

sink_table:
image

source_table:
image

data in test.csv:
10000,2017-10-01,北京,20,0,2017-10-01 06:00:00,20,10,10
10000,2017-10-01,北京,20,0,2017-10-01 07:00:00,15,2,2
10001,2017-10-01,北京,30,1,2017-10-01 17:05:45,2,22,22
10002,2017-10-02,上海,20,1,2017-10-02 12:59:12,200,5,5
10003,2017-10-02,广州,32,0,2017-10-02 11:20:00,30,11,11
10004,2017-10-01,深圳,35,0,2017-10-01 10:00:15,100,3,3
10004,2017-10-03,深圳,35,0,2017-10-03 10:20:22,11,6,6

load test.csv to source_table:
curl --location-trusted -u root: -T test.csv -H "column_separator:," http://127.0.0.1:8030/api/demo/source_table/_stream_load

insert into sink_table from source_table:
insert into sink_table
select * from (
select DISTINCT user_id, date, city, age, sex,last_visit_date,cost,max_dwell_time,min_dwell_time from source_table where city='北京'
union
select DISTINCT user_id, date, city, age, sex,last_visit_date,cost,max_dwell_time,min_dwell_time from source_table where city='深圳'
) t;

nereids plan:
image
image

old planner:
image
image
image
image
image

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions