GH-38865 [C++][Parquet] support passing a RowRange to RecordBatchReader #39731

Closed
binmahone wants to merge 3 commits into apache:main from binmahone:20240117_skipio2

@binmahone binmahone commented Jan 22, 2024

Rationale for this change

This PR is a superset of #39608.
With it, skipping page I/O is supported in pre-buffer read mode.

What changes are included in this PR?

This PR changes what a RangeCacheEntry in the range cache holds:
instead of caching the whole ReadRange, it excludes the bytes that fall within holes, which are computed from the user-specified RowRange.
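
To illustrate the idea of excluding holes from a cached range, here is a minimal standalone sketch. It is not the code from this PR: `ByteRange` and `SubtractHoles` are hypothetical names introduced only for this example, and the sketch assumes the holes are already expressed as byte ranges, sorted by offset and non-overlapping.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a byte range in a file (not Arrow's ReadRange).
struct ByteRange {
  int64_t offset;
  int64_t length;
  int64_t end() const { return offset + length; }
};

// Returns the parts of `range` that are not covered by any hole.
// Assumption: `holes` is sorted by offset and the holes do not overlap.
std::vector<ByteRange> SubtractHoles(const ByteRange& range,
                                     const std::vector<ByteRange>& holes) {
  std::vector<ByteRange> kept;
  int64_t cursor = range.offset;  // first byte not yet classified
  for (const ByteRange& hole : holes) {
    // Clamp the hole to the enclosing range.
    const int64_t hole_begin = std::max(hole.offset, range.offset);
    const int64_t hole_end = std::min(hole.end(), range.end());
    if (hole_begin >= hole_end) continue;  // hole lies outside the range
    if (hole_begin > cursor) {
      // Bytes between the cursor and the hole still need to be read.
      kept.push_back({cursor, hole_begin - cursor});
    }
    cursor = std::max(cursor, hole_end);
  }
  if (cursor < range.end()) {
    // Trailing bytes after the last hole.
    kept.push_back({cursor, range.end() - cursor});
  }
  return kept;
}
```

For example, subtracting the holes [300, 500) and [700, 800) from the range [100, 1000) leaves three sub-ranges to read and cache: [100, 300), [500, 700) and [800, 1000).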

Are these changes tested?

Yes, in range_read_test.cc.

Are there any user-facing changes?

A new GetRecordBatchReader API overload is added. No existing API is broken.

one thread for each logical range

fix prebuffer conflict with data page filter

fix coalesce problem

adding tests

fix style
@hellishfire
Contributor

I'm wondering about the status of this PR. Is this row range API still considered valid, or has it been superseded by another PR?
Either way, I believe the Parquet reader really needs a seek-to-row API to facilitate a variety of operations.
@binmahone @wgtmac

@wgtmac
Member

wgtmac commented Nov 10, 2024

I was told that the author no longer works on this. If no one is interested in continuing the effort, I can pick it up.

@mapleFU
Member

mapleFU commented Nov 10, 2024

I'd like to spend my spare time on my other unfinished patches; you're welcome to pick this up, @wgtmac.

@hellishfire
Contributor

hellishfire commented Nov 11, 2024

> I was told that the author no longer works on this. If no one is interested in continuing the effort, I can pick it up.

It'd be great if you chose to pick this up. This is a very useful API that would also benefit other wrapper APIs built on the C++ implementation.

@github-actions

Thank you for your contribution. Unfortunately, this pull request has been marked as stale because it has had no activity in the past 365 days. Please remove the stale label or comment below, or this PR will be closed in 14 days. Feel free to re-open this if it has been closed in error. If you do not have repository permissions to reopen the PR, please tag a maintainer.

github-actions bot added the Status: stale-warning label Nov 18, 2025
github-actions bot closed this Jan 8, 2026

Labels

awaiting review, Component: C++, Component: Parquet, Status: stale-warning