[fix](csv reader) fix csv parser incorrect if enclosing line_delimiter #38347
dataroaring merged 1 commit into apache:master from sollhui:fix_csv_reader
if (_idx != _total_len) {
    len = update_reading_bound(start);
} else {
    _result = nullptr;
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
Co-authored-by: Xin Liao <liaoxinbit@126.com>
TPC-H: Total hot run time: 39273 ms
TPC-DS: Total hot run time: 173216 ms
ClickBench: Total hot run time: 30.82 s
#38347) (#38446)
#38347) (#38445)
#38347)

The CSV reader parses data incorrectly when an enclosed value contains the line_delimiter. For example, with line_delimiter set to \n and enclose set to ', given the following data:

```
'aaaaaaaaaaaa
bbbb'
```

it is parsed as two columns, `'aaaaaaaaaaaa` and `bbbb'`, rather than as one column:

```
'aaaaaaaaaaaa
bbbb'
```

This happens because the CSV reader does not reset the result when the enclose character is not matched within the current `output_buf_read`, so the line is truncated at the wrong position.

Co-authored-by: Xin Liao <liaoxinbit@126.com>
Does anybody know how to strip the enclose symbol from the column value itself when using stream load with CSV?
Can you give an example?
The CSV reader parses data incorrectly when an enclosed value contains the line_delimiter. For example, with line_delimiter set to \n and enclose set to ', consider a single enclosed value that spans a line break: `'aaaaaaaaaaaa` on one line and `bbbb'` on the next.
It is parsed as two columns, `'aaaaaaaaaaaa` and `bbbb'`, rather than as one column.
This happens because the CSV reader does not reset the result when the enclose character is not matched within the current `output_buf_read`, so the line is truncated at the wrong position.
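To illustrate the behavior the description expects, here is a minimal standalone sketch of enclosure-aware record splitting. It is not the Doris implementation: `split_records` is a hypothetical helper, it works on a complete in-memory string rather than on successive `output_buf_read` buffers, and it ignores escape characters and column separators.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, NOT Doris code: split `data` into records on
// `line_delim`, treating any delimiter found between a pair of `enclose`
// characters as ordinary data.
std::vector<std::string> split_records(const std::string& data,
                                       char line_delim, char enclose) {
    std::vector<std::string> records;
    std::string current;
    bool in_enclose = false;  // true while inside an open enclosure

    for (char c : data) {
        if (c == enclose) {
            // Entering or leaving an enclosed region; keep the quote itself.
            in_enclose = !in_enclose;
            current.push_back(c);
        } else if (c == line_delim && !in_enclose) {
            // A delimiter outside any enclosure ends the record.
            records.push_back(current);
            current.clear();
        } else {
            // Ordinary byte, or a delimiter protected by the enclosure.
            current.push_back(c);
        }
    }
    // Keep a trailing record that has no final delimiter.
    if (!current.empty()) {
        records.push_back(current);
    }
    return records;
}
```

Under these assumptions, calling `split_records("'aaaaaaaaaaaa\nbbbb'\n", '\n', '\'')` yields a single record that still contains the embedded line break, which is the outcome the fix is meant to restore. A streaming reader has the extra obligation the description points at: when a buffer ends while an enclosure is still open, any tentative line boundary found so far has to be discarded before more data is read.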