[feature](csv) Supports reading CSV data using LF and CRLF as line separators. #37687
Merged
morningman merged 9 commits into apache:master on Jul 18, 2024
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall

  }

- [[nodiscard]] inline size_t line_delimiter_length() const final { return line_delimiter_len; }
+ [[nodiscard]] inline size_t line_delimiter_length() const {
warning: annotate this function with 'override' or (rarely) 'final' [modernize-use-override]
Suggested change
- [[nodiscard]] inline size_t line_delimiter_length() const {
+ [[nodiscard]] inline size_t line_delimiter_length() const override {
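To illustrate why the check asks for `override` here, the following is a minimal sketch. The class names `LineReaderContext` and `PlainContext` and the member `_len` are hypothetical stand-ins, not the actual Doris types.

```cpp
#include <cstddef>

// Hypothetical base class standing in for the reader context interface.
struct LineReaderContext {
    virtual ~LineReaderContext() = default;
    [[nodiscard]] virtual std::size_t line_delimiter_length() const = 0;
};

// Hypothetical derived class. `override` makes the compiler verify that this
// declaration really matches a virtual function in the base class.
struct PlainContext : LineReaderContext {
    explicit PlainContext(std::size_t len) : _len(len) {}

    [[nodiscard]] std::size_t line_delimiter_length() const override { return _len; }

private:
    std::size_t _len;  // delimiter length in bytes, e.g. 1 for LF, 2 for CRLF
};
```

With `override` in place, the compiler rejects the declaration if it ever stops matching the base signature (for example, if `const` is dropped), instead of silently introducing a new, unrelated function.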
4a05198 to 6fc9f08
run buildall

run buildall
TPC-H: Total hot run time: 40112 ms
TPC-DS: Total hot run time: 174019 ms
ClickBench: Total hot run time: 30.32 s

run buildall
TPC-H: Total hot run time: 39956 ms
TPC-DS: Total hot run time: 174024 ms
ClickBench: Total hot run time: 31.51 s

run buildall
TPC-H: Total hot run time: 40221 ms
TPC-DS: Total hot run time: 172871 ms
ClickBench: Total hot run time: 31.28 s

run buildall
TPC-H: Total hot run time: 39651 ms
TPC-DS: Total hot run time: 173674 ms
ClickBench: Total hot run time: 30.17 s

run buildall
TPC-H: Total hot run time: 39877 ms
TPC-DS: Total hot run time: 174213 ms
ClickBench: Total hot run time: 31.23 s
run p0

run p1

PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.
hubgeter added a commit to hubgeter/doris that referenced this pull request on Jul 18, 2024

…arators. (apache#37687)

Supports reading CSV data using LF and CRLF as line separators.

csv file:
```
1,abc
2,def\r
3,qwe
4,hello\r
```

If you `set keep_carriage_return = false`, you will get:
```mysql
1 abc
2 def
3 qwe
4 hello
```
Here, both \r\n and \n are used as delimiters.

If you `set keep_carriage_return = true`, you will get:
```mysql
1 abc
2 def\r
3 qwe
4 hello\r
```
Here, only \n is used as a delimiter.

Note that `set keep_carriage_return = true` takes effect for TVF (table-valued function) queries, but not for stream load or MySQL load. For stream load and MySQL load, both CRLF and LF are always used as delimiters, even if you `set keep_carriage_return = true`.
morningman pushed a commit that referenced this pull request on Jul 22, 2024

morningman pushed a commit to morningman/doris that referenced this pull request on Jul 23, 2024
…arators. (apache#37687)

morningman pushed a commit that referenced this pull request on Aug 28, 2024
Proposed changes
Supports reading CSV data using LF and CRLF as line separators.
csv file (four lines, where `\r` marks a carriage return before the line feed): `1,abc`, `2,def\r`, `3,qwe`, `4,hello\r`
If you `set keep_carriage_return = false`, you will get the rows `1 abc`, `2 def`, `3 qwe`, `4 hello`.
Here, both \r\n and \n are used as delimiters.
If you `set keep_carriage_return = true`, you will get the rows `1 abc`, `2 def\r`, `3 qwe`, `4 hello\r`.
Here, only \n is used as a delimiter.
warning
Note that `set keep_carriage_return = true` takes effect for TVF (table-valued function) queries, but not for stream load or MySQL load. For stream load and MySQL load, both CRLF and LF are always used as delimiters, even if you `set keep_carriage_return = true`.
Issue Number: close #xxx
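
The splitting rule described above can be sketched roughly as follows. This is an illustration only, not the Doris implementation: `split_lines` and its `keep_carriage_return` parameter are hypothetical names, and leaving a final line that has no trailing `\n` untouched is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split `data` into lines on '\n'.
// If keep_carriage_return is false, a '\r' directly before the '\n' is treated
// as part of the separator (CRLF) and dropped from the line.
// If keep_carriage_return is true, only '\n' separates lines and '\r' stays.
std::vector<std::string> split_lines(std::string_view data, bool keep_carriage_return) {
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start < data.size()) {
        std::size_t end = data.find('\n', start);
        const bool has_newline = (end != std::string_view::npos);
        if (!has_newline) {
            end = data.size();  // final line without a trailing newline
        }
        std::string_view line = data.substr(start, end - start);
        // Assumption: only strip '\r' when it is immediately followed by '\n'.
        if (!keep_carriage_return && has_newline && !line.empty() && line.back() == '\r') {
            line.remove_suffix(1);
        }
        lines.emplace_back(line);
        start = end + 1;
    }
    return lines;
}
```

Called on the sample file above, `split_lines(data, false)` yields the second field values `abc`, `def`, `qwe`, `hello`, while `split_lines(data, true)` leaves the `\r` attached to `def` and `hello`.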