feat: implement sql style 'substr_index' string function #8272
alamb merged 14 commits into apache:main
----
NULL

query ?
awesome, can we also have the same tests with empty strings as input and search token?
Added empty-string tests and 0-count tests.
/// SUBSTRING_INDEX('www.apache.org', '.', -2) = apache.org
/// SUBSTRING_INDEX('www.apache.org', '.', -1) = org
pub fn substr_index<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
    let string_array = as_generic_string_array::<T>(&args[0])?;
We need to add a defensive check that args has exactly 3 elements.
Thanks, I added the args length check.
comphead
left a comment
Thanks @Syleechan, please take a look at the minor comments.
Ted-Jiang
left a comment
LGTM 👍. I tested the cases below in MySQL; it may be worth adding them as tests:
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 4); -> www.mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 0); -> (empty string)
just the same as SELECT substr_index('www.apache.org', 'ac', 2)
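To make the cases discussed in this thread concrete, here is a minimal scalar sketch of the SUBSTRING_INDEX semantics on plain `&str` values. It is not the code from this PR: the name `substr_index_sketch` is hypothetical, it works on single strings rather than Arrow arrays, and it assumes (following the MySQL behaviour tested above) that an empty input, an empty delimiter, or a count of 0 yields an empty string.

```rust
/// Scalar sketch of SUBSTRING_INDEX(str, delim, count) on plain `&str` values.
/// Hypothetical helper for illustration only; not the function added in this PR.
fn substr_index_sketch(s: &str, delim: &str, count: i64) -> String {
    // Assumption: empty input, empty delimiter, or a zero count returns "".
    if s.is_empty() || delim.is_empty() || count == 0 {
        return String::new();
    }
    let n = count.unsigned_abs() as usize;
    if count > 0 {
        // Keep everything to the left of the n-th delimiter, counting from the left.
        match s.match_indices(delim).nth(n - 1) {
            Some((idx, _)) => s[..idx].to_string(),
            // Fewer than n delimiters: return the whole string.
            None => s.to_string(),
        }
    } else {
        // Keep everything to the right of the n-th delimiter, counting from the right.
        match s.rmatch_indices(delim).nth(n - 1) {
            Some((idx, _)) => s[idx + delim.len()..].to_string(),
            // Fewer than n delimiters: return the whole string.
            None => s.to_string(),
        }
    }
}
```

Under these assumptions, `substr_index_sketch("www.apache.org", ".", -2)` gives `apache.org`, a count of 4 on `www.mysql.com` gives the whole string because there are only two delimiters, and the `'ac'` example with count 2 behaves the same way because `ac` occurs only once.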
/// SUBSTRING_INDEX('www.apache.org', '.', -1) = org
pub fn substr_index<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
    if args.len() != 3 {
        return Err(DataFusionError::Internal(format!(
Please use the internal_err! macro.
Thanks, changed to internal_err!.
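For readers unfamiliar with the pattern being requested, the following self-contained sketch shows the general shape of an argument-count check that reports failure through an error macro. Everything here is a stand-in: `SketchError`, `sketch_internal_err!`, and `check_three_args` are hypothetical names invented for illustration, and the macro only mimics the idea of wrapping a formatted message in an internal error. It is not DataFusion's `internal_err!` and does not capture a backtrace.

```rust
/// Hypothetical stand-in for an internal error type; not DataFusion's error type.
#[derive(Debug, PartialEq)]
enum SketchError {
    Internal(String),
}

/// Hypothetical macro that builds `Err(SketchError::Internal(..))` from a format string.
macro_rules! sketch_internal_err {
    ($($arg:tt)*) => {
        Err(SketchError::Internal(format!($($arg)*)))
    };
}

/// Returns the three arguments only when exactly three were supplied.
fn check_three_args<'a>(
    args: &'a [&'a str],
) -> Result<(&'a str, &'a str, &'a str), SketchError> {
    // Defensive check: indexing args[0..=2] below is only safe with exactly 3 elements.
    if args.len() != 3 {
        return sketch_internal_err!(
            "substr_index was called with {} arguments. It requires 3.",
            args.len()
        );
    }
    Ok((args[0], args[1], args[2]))
}
```

Routing every internal error through one macro keeps the construction in a single place, which is what makes it practical to attach extra context such as a backtrace later.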
Thanks @Syleechan, we are very close. Please address the minor comment about using the error macros; they make it possible to get a backtrace for the error.
@alamb please help to merge if there are no other concerns, thanks.
Thanks @Syleechan, as well as @comphead and @Ted-Jiang, for the review. Since @Ted-Jiang is a committer as well, in the future please feel free to merge PRs once they have been reviewed and follow https://arrow.apache.org/datafusion/contributor-guide/index.html#pull-request-overview (as this one has).
Which issue does this PR close?
Closes #.
Rationale for this change
https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_substring-index:~:text=SUBSTRING_INDEX(str%2Cdelim%2Ccount)
In Calcite, the function expression name is substr_index.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?