Describe the bug
I found this in the sqlite sqllogictest repository, random/expr/slt_good_3.slt:
query I rowsort label-676
SELECT ALL - CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END + 93 + - - 44 * 91 + CASE + 44 WHEN - - 21 * 69 - 12 THEN 58 ELSE - 3 END * + + 23 * + 84 * - - 59
----
-337914
Just to confirm with sqlite3:
$ sqlite3
SQLite version 3.43.2 2023-10-10 12:14:04
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> SELECT ALL - CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END + 93 + - - 44 * 91 + CASE + 44 WHEN - - 21 * 69 - 12 THEN 58 ELSE - 3 END * + + 23 * + 84 * - - 59;
-337914
sqlite>
Datafusion disagrees:
$ datafusion-cli
DataFusion CLI v33.0.0
❯ SELECT ALL - CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END + 93 + - - 44 * 91 + CASE + 44 WHEN - - 21 * 69 - 12 THEN 58 ELSE - 3 END * + + 23 * + 84 * - - 59;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| (- CASE WHEN NOT (- AVG(Int64(-41))) IS NULL THEN Int64(47) WHEN NULL IS NULL THEN COUNT(*) END) + Int64(93) + (- Int64(-44)) * Int64(91) + CASE Int64(44) WHEN (- Int64(-21)) * Int64(69) - Int64(12) THEN Int64(58) ELSE Int64(-3) END * Int64(23) * Int64(84) * (- Int64(-59)) |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -337868 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set. Query took 0.017 seconds.
❯
With SQLite, the expression simplifies to
SELECT - 47 + 93 + - - 44 * 91 + CASE + 44 WHEN - - 21 * 69 - 12 THEN 58 ELSE - 3 END * + + 23 * + 84 * - - 59;
and then
SELECT - 47 + 93 + - - 44 * 91 + - 3 * + + 23 * + 84 * - - 59;
which Datafusion agrees is -337914. Hence, Datafusion must have evaluated one of the cases differently!
sqlite> SELECT CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END;
47
❯ SELECT CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END;
+----------------------------------------------------------------------------------------------+
| CASE WHEN NOT (- AVG(Int64(-41))) IS NULL THEN Int64(47) WHEN NULL IS NULL THEN COUNT(*) END |
+----------------------------------------------------------------------------------------------+
| 1 |
+----------------------------------------------------------------------------------------------+
1 row in set. Query took 0.015 seconds.
Trying to isolate that further, SQLite thinks NOT - AVG ( - 41 ) IS NULL is 1, Datafusion thinks it's true which is the same with types, and Datafusion handles that CASE in isolation just fine:
sqlite> SELECT CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END;
47
sqlite> select NOT - AVG ( - 41 ) IS NULL;
1
sqlite> select case when NOT - AVG ( - 41 ) IS NULL then 'yes' else 'no' end;
yes
❯ SELECT CASE WHEN NOT - AVG ( - 41 ) IS NULL THEN 47 WHEN NULL IS NULL THEN COUNT ( * ) END;
+----------------------------------------------------------------------------------------------+
| CASE WHEN NOT (- AVG(Int64(-41))) IS NULL THEN Int64(47) WHEN NULL IS NULL THEN COUNT(*) END |
+----------------------------------------------------------------------------------------------+
| 1 |
+----------------------------------------------------------------------------------------------+
1 row in set. Query took 0.014 seconds.
❯ select NOT - AVG ( - 41 ) IS NULL;
+---------------------------------+
| NOT (- AVG(Int64(-41))) IS NULL |
+---------------------------------+
| true |
+---------------------------------+
1 row in set. Query took 0.011 seconds.
❯ select case when NOT - AVG ( - 41 ) IS NULL then 'yes' else 'no' end;
+--------------------------------------------------------------------------------+
| CASE WHEN NOT (- AVG(Int64(-41))) IS NULL THEN Utf8("yes") ELSE Utf8("no") END |
+--------------------------------------------------------------------------------+
| yes |
+--------------------------------------------------------------------------------+
1 row in set. Query took 0.014 seconds.
But when it's part of that larger statement something goes wrong.
And this is it:
SELECT CASE WHEN true THEN 'sqlite and postgres' WHEN true THEN 'datafusion' END;
sqlite and postgres
❯ SELECT CASE WHEN true THEN 'sqlite and postgres' WHEN true THEN 'datafusion' END;
+---------------------------------------------------------------------------------------------------------+
| CASE WHEN Boolean(true) THEN Utf8("sqlite and postgres") WHEN Boolean(true) THEN Utf8("datafusion") END |
+---------------------------------------------------------------------------------------------------------+
| datafusion |
+---------------------------------------------------------------------------------------------------------+
1 row in set. Query took 0.006 seconds.
To Reproduce
❯ SELECT CASE WHEN true THEN 'sqlite and postgres' WHEN true THEN 'datafusion' END;
+---------------------------------------------------------------------------------------------------------+
| CASE WHEN Boolean(true) THEN Utf8("sqlite and postgres") WHEN Boolean(true) THEN Utf8("datafusion") END |
+---------------------------------------------------------------------------------------------------------+
| datafusion |
+---------------------------------------------------------------------------------------------------------+
1 row in set. Query took 0.006 seconds.
Expected behavior
sqlite> SELECT CASE WHEN true THEN 'sqlite and postgres' WHEN true THEN 'datafusion' END;
sqlite and postgres
Additional context
No response
Describe the bug
I found this in the sqlite sqllogictest repository,
random/expr/slt_good_3.slt:Just to confirm with sqlite3:
Datafusion disagrees:
With SQLite, the expression simplifies to
and then
which Datafusion agrees is -337914. Hence, Datafusion must have evaluated one of the cases differently!
Trying to isolate that further, SQLite thinks
NOT - AVG ( - 41 ) IS NULLis1, Datafusion thinks it'struewhich is the same with types, and Datafusion handles thatCASEin isolation just fine:But when it's part of that larger statement something goes wrong.
And this is it:
To Reproduce
Expected behavior
Additional context
No response