GH-38865 [C++][Parquet] support passing a RowRange to RecordBatchReader #39731
binmahone wants to merge 3 commits into apache:main
one thread for each logical range
fix prebuffer conflict with data page filter
fix coalesce problem
adding tests
fix style
I'm wondering about the status of this PR. Is this row range API still considered valid, or has it been superseded by another PR?
I was told that the author is no longer working on this. If no one is interested in continuing the effort, I can pick it up.
I'd like to spend my spare time on my other unfinished patches, so you're welcome to pick this up, @wgtmac.
It'd be great if you choose to pick this up. This is a very useful API that would also benefit other wrapper APIs based on the C++ implementation.
Thank you for your contribution. Unfortunately, this pull request has been marked as stale because it has had no activity in the past 365 days. Please remove the stale label or comment below, or this PR will be closed in 14 days. Feel free to re-open this if it has been closed in error. If you do not have repository permissions to reopen the PR, please tag a maintainer.
Rationale for this change
This PR is a superset of #39608.
Skipping page I/O is supported in pre-buffer read mode.
What changes are included in this PR?
In this PR, the RangeCacheEntry stored in the range cache is modified: instead of caching the whole ReadRange, we exclude the bytes that fall within holes, which are calculated from the user-specified RowRange.
Are these changes tested?
Yes, range_read_test.cc
Are there any user-facing changes?
A new GetRecordBatchReader API overload is added. No existing API is broken.
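
To illustrate the hole-exclusion idea described under "What changes are included in this PR?", here is a minimal standalone sketch. It is not the code from this PR: the `RowRange`, `PageInfo`, and `ByteRange` structs and the `RangesToCache` function are hypothetical names made up for the example, and it assumes that per-page byte offsets and row counts are already known (for instance from a page index). Given the pages of one column chunk and the requested row ranges, it returns only the byte ranges worth caching, merging pages that sit next to each other in the file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; these are NOT the PR's types.
struct RowRange {
  int64_t first_row;  // inclusive
  int64_t last_row;   // inclusive
};

struct PageInfo {
  int64_t offset;     // byte offset of the page in the file
  int64_t length;     // page size in bytes
  int64_t first_row;  // index of the first row stored in the page
  int64_t num_rows;   // number of rows stored in the page (> 0)
};

struct ByteRange {
  int64_t offset;
  int64_t length;
};

// Returns the byte ranges of pages that overlap at least one requested row
// range. Pages whose rows all fall into a "hole" are skipped, so their bytes
// are never read or cached.
// Preconditions (assumed): `pages` is sorted by first_row, and `rows` is
// sorted by first_row and non-overlapping.
std::vector<ByteRange> RangesToCache(const std::vector<PageInfo>& pages,
                                     const std::vector<RowRange>& rows) {
  std::vector<ByteRange> out;
  std::size_t r = 0;
  for (const PageInfo& page : pages) {
    assert(page.num_rows > 0);
    const int64_t page_last_row = page.first_row + page.num_rows - 1;

    // Drop row ranges that end before this page starts; they cannot match
    // this page or any later one.
    while (r < rows.size() && rows[r].last_row < page.first_row) ++r;
    if (r == rows.size()) break;  // no requested rows remain

    // The next row range starts after this page ends: the page is a hole.
    if (rows[r].first_row > page_last_row) continue;

    // Merge with the previous byte range when the pages are contiguous.
    if (!out.empty() && out.back().offset + out.back().length == page.offset) {
      out.back().length += page.length;
    } else {
      out.push_back({page.offset, page.length});
    }
  }
  return out;
}
```

For example, with four 10-row pages at offsets 100, 150, 190, and 250 and requested rows 5 to 12 and 35 to 36, the first two pages are merged into one byte range, the third page is skipped as a hole, and the fourth page becomes a second byte range. A real implementation would also have to handle dictionary pages and multiple columns, which this sketch ignores.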