Is your feature request related to a problem or challenge?
There seem to be some opportunities to optimize `ArrowBytesViewMap` with a bit more cleverness.
For example, in ClickBench query 5, more than 50% of CPU time is spent in `intern`.
Much of that time goes to fetching and comparing bytes from the buffers (`append_value`, `get_value`, `memcmp`, `make_view`, etc.).
Describe the solution you'd like
We should be able to avoid (re)creating views on every call and comparing against slices, by storing and comparing the views directly and bypassing the overhead of the `GenericByteViewBuilder` methods.
To do so, I think we need:
- Not use `values.iter()`, but use the view buffer and get the buffer index from the view
- Compare against the original view (and the buffer at that index, if needed)
- Update the new view with the new index (don't create it again)
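
To illustrate the three steps above, here is a minimal, standard-library-only sketch of comparing and patching raw 16-byte views. It assumes the Arrow view layout (4-byte length, then either up to 12 inline bytes, or a 4-byte prefix, 4-byte buffer index and 4-byte offset) stored as a little-endian `u128`. The helpers `make_view`, `long_bytes`, `views_equal` and `with_location` are hypothetical names for this example only; they are not arrow-rs or DataFusion APIs, and a real change would have to fit into the existing `ArrowBytesViewMap` code.

```rust
/// Build a 16-byte view for `data` (hypothetical helper, illustration only).
/// For long values, `buffer_index`/`offset` say where the bytes live.
fn make_view(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let len = data.len() as u32;
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&len.to_le_bytes());
    if len <= 12 {
        // Short value: stored inline, zero padded.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long value: 4-byte prefix, then buffer index and offset.
        bytes[4..8].copy_from_slice(&data[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// Resolve the bytes of a long (> 12 bytes) view from its data buffers.
fn long_bytes<'a>(view: u128, buffers: &'a [Vec<u8>]) -> &'a [u8] {
    let len = (view as u32) as usize;
    let buffer_index = ((view >> 64) as u32) as usize;
    let offset = ((view >> 96) as u32) as usize;
    &buffers[buffer_index][offset..offset + len]
}

/// Compare two views directly, touching the data buffers only when needed.
fn views_equal(a: u128, a_buffers: &[Vec<u8>], b: u128, b_buffers: &[Vec<u8>]) -> bool {
    // The low 64 bits hold the length and the first 4 bytes:
    // a mismatch here rules out equality without any buffer access.
    if a as u64 != b as u64 {
        return false;
    }
    if (a as u32) <= 12 {
        // Inline values: the whole view is the value.
        return a == b;
    }
    // Long values with matching length and prefix: compare the actual bytes.
    long_bytes(a, a_buffers) == long_bytes(b, b_buffers)
}

/// Reuse an existing long view, replacing only buffer index and offset.
/// Inline views need no patching and can be stored as-is.
fn with_location(view: u128, buffer_index: u32, offset: u32) -> u128 {
    let low = view & (u64::MAX as u128); // keep length + prefix
    low | ((buffer_index as u128) << 64) | ((offset as u128) << 96)
}
```

With helpers like these, interning a value would read the input view, use `views_equal` against the stored view on a hash match, and on insert copy the bytes into the map's buffer and store `with_location(view, new_index, new_offset)` instead of building a new view from a slice.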
Describe alternatives you've considered
No response
Additional context
No response