feat: get_ranges #3925
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split the overlong first line into a short numpydoc summary plus an extended description. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the input split at the top of coalesced_get, merged groups only ever contain RangeByteRequest members. Replace the per-element isinstance filters (and the defensive ``else 0`` sort-key branch) with a single assertion at the top of the merged-group block and direct attribute access. Also remove the unreachable ``if total == 0: return`` guard (``indexed`` is non-empty by construction once we pass the earlier guard). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
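A minimal sketch of the simplification this commit describes: once the input split guarantees a merged group holds only `RangeByteRequest` members, a single assertion plus direct attribute access replaces per-element `isinstance` filtering. The class and function here are illustrative stand-ins, not the PR's actual code.

```python
from dataclasses import dataclass


@dataclass
class RangeByteRequest:
    # Illustrative stand-in for the store request type discussed above.
    start: int
    end: int


def sort_group(group: list[RangeByteRequest]) -> list[RangeByteRequest]:
    # One assertion at the top of the merged-group block...
    assert all(isinstance(r, RangeByteRequest) for r in group)
    # ...then direct attribute access, with no defensive ``else 0`` branch.
    return sorted(group, key=lambda r: r.start)


print(sort_group([RangeByteRequest(10, 12), RangeByteRequest(0, 4)]))
```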
Exercise the ``kind == "missing"`` branch in the uncoalescable single-fetch arm for Offset/Suffix/None inputs, which was not hit by existing tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related correctness issues in coalesced_get's drain loop:

1. When the consumer breaks out of the async-for (early exit), the generator's finally block only awaited in-flight tasks rather than cancelling them. That wasted I/O. Cancel first, then gather.
2. The drain loop waited on completion_queue for ``total`` entries, but after a "missing" or "error" we cancel pending tasks -- and cancelled tasks never enqueue a completion. With max_concurrency > 1 this could hang. Rework the drain loop to break out immediately on the first miss/error; the finally block handles cleanup.

The new structure also collapses the redundant miss/error branches and removes the now-unused ``total``/``drained``/``stopped`` bookkeeping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
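The cancel-then-gather cleanup pattern described in point 1 can be sketched as follows. This is a toy illustration, not the PR's implementation; the generator and fetch names are hypothetical.

```python
import asyncio
from collections.abc import AsyncGenerator


async def coalesced_get(n: int) -> AsyncGenerator[int, None]:
    # Hypothetical stand-in for the PR's generator: launch n fetches,
    # yield results, and clean up in-flight work on early exit.
    async def fetch(i: int) -> int:
        await asyncio.sleep(0.05)
        return i

    tasks = [asyncio.create_task(fetch(i)) for i in range(n)]
    try:
        for t in tasks:
            yield await t
    finally:
        # Cancel first so abandoned fetches stop doing I/O, then gather
        # so every task is finished before the generator exits.
        for t in tasks:
            t.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def main() -> list[int]:
    results = []
    agen = coalesced_get(5)
    async for value in agen:
        results.append(value)
        break  # early exit: remaining fetches should be cancelled
    await agen.aclose()  # drives the finally block deterministically
    return results


print(asyncio.run(main()))
```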
Exercises the concurrent path where a missing key is observed while other fetches are still in flight. Uses an asyncio.Event to gate late arrivals until after the miss has been processed, giving the drain loop an opportunity to observe and discard post-stop completions, and verifies the iterator terminates cleanly without hanging or raising. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
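The Event-gating technique this test relies on can be sketched in miniature: a miss is processed first while an in-flight fetch is blocked on the event, then the late arrival is released. All names here are illustrative, not the PR's actual test code.

```python
import asyncio


async def main() -> list[str]:
    release = asyncio.Event()
    log: list[str] = []

    async def slow_fetch() -> None:
        # Gated: cannot complete until the miss has been handled.
        await release.wait()
        log.append("late arrival")

    task = asyncio.create_task(slow_fetch())
    log.append("miss observed")  # the miss is processed first
    release.set()                # now let the in-flight fetch finish
    await task
    return log


print(asyncio.run(main()))
```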
Drives many slow ranges with a small max_concurrency, breaks out of the async-for after the first yield, and verifies that at least one still-running fetch was cancelled rather than being left to run to completion. Cancellation is observed via a counter in the fetch's CancelledError branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coalesced_get is implemented as an async generator (uses yield) and callers need access to aclose() to drive its finally block deterministically. Declaring the return type as AsyncGenerator instead of AsyncIterator exposes aclose()/asend()/athrow() through the type system, matches the runtime object, and lets consumers (e.g. the consumer-break test) avoid type-ignore escape hatches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pyproject asyncio_mode=auto already covers async test dispatch; the explicit pytestmark was a vestige. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Used 0000.feature.md as a placeholder; rename to {pr-number}.feature.md once the PR is opened.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The SupportsGetRanges protocol is private; a user-facing release note shouldn't advertise it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
for context, we do already have a
The min_deps CI job pins fsspec to 2023.10.0, which predates AsyncFileSystemWrapper. Wrapping a sync MemoryFileSystem fails there at fixture setup. Guard the affected tests with the same skipif pattern already used in test_fsspec.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report ❌ Patch coverage is

Additional details and impacted files:

@@ Coverage Diff @@
## main #3925 +/- ##
==========================================
+ Coverage 93.26% 93.30% +0.03%
==========================================
Files 87 89 +2
Lines 11721 11829 +108
==========================================
+ Hits 10932 11037 +105
- Misses 789 792 +3
I think making it iterable adds complexity and adds confusion as to whether request coalescing is expected to be applied here or not. Suppose you have requests where the first three are adjacent and the last one is far away. Then of course we should coalesce the first 3 requests into one. But then the async iterable implies that we might want to use one of the first responses before the last one arrives... but the last request could arrive first. So if you really want the response type to be async iterable, the responses should probably carry an index identifying which input request each one answers. But I think it would be much simpler to take in a sequence of byte ranges and return a sequence of results. Just like object-store/obstore/obspec.
In the design in this PR we are iterating over the IO calls the reader actually did, which is less than or equal to the number of byte ranges requested. So, assuming the first three are fused, we would either see the fused group's results first or the lone request's result first, depending on which IO call completes first. Which, for sharding, is useful: you can start decoding chunks immediately while you wait for the rest of the sub-chunks to come in. Does that make sense? Or am I misunderstanding something.
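A toy consumer of this iterate-over-IO-calls design, decoding each group as soon as it arrives while later fetches are still pending. The generator and its shape here are illustrative assumptions, not the PR's API.

```python
import asyncio


async def fake_get_ranges(key: str, n_groups: int):
    # Stand-in for a store's get_ranges: yields one sequence per IO call,
    # each item pairing the input index with the fetched bytes.
    for g in range(n_groups):
        await asyncio.sleep(0.01)  # simulated IO latency
        yield [(g, bytes([g]))]


async def main():
    decoded = []
    async for group in fake_get_ranges("shard-0", 3):
        for index, payload in group:
            # Decode immediately instead of waiting for all groups.
            decoded.append((index, payload))
    return decoded


print(asyncio.run(main()))
```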
ilan-gold
left a comment
Nice!
At an architecture level, it's not immediately clear to me how this will be exposed to users. But maybe that is out of the scope of this PR? I think we don't want to turn this on by default without some rough estimates.
Co-authored-by: Ilan Gold <ilanbassgold@gmail.com>
@ilan-gold have another look, I cleaned things up here a bit. one notable change that I plan on re-using: define subfunctions that explicitly depend on a shared state, instead of relying implicitly on variables defined in the outer scope (like the queue and the semaphore)
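The explicit-shared-state pattern mentioned here can be sketched as follows. The `_State` fields and function names are assumptions for illustration, not the PR's actual code.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class _State:
    # Shared state passed explicitly, instead of being closed over.
    semaphore: asyncio.Semaphore
    queue: asyncio.Queue = field(default_factory=asyncio.Queue)


async def _fetch(state: _State, i: int) -> None:
    # Depends only on the state it receives, not on enclosing scope.
    async with state.semaphore:
        await state.queue.put(i)


async def main() -> list[int]:
    state = _State(semaphore=asyncio.Semaphore(2))
    await asyncio.gather(*(_fetch(state, i) for i in range(3)))
    return sorted(state.queue.get_nowait() for _ in range(3))


print(asyncio.run(main()))
```

Making the dependencies explicit in each subfunction's signature keeps them independently testable, which is harder when they implicitly capture the queue and semaphore from an outer scope.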
chuckwondo
left a comment
Is there any reason you chose not to implement a range coalescing function independent of fetching? Perhaps I'm missing something, particularly since I'm still very much getting my feet wet with this codebase, but it seems to me that having an independent range coalescing function would allow for far simpler and easier testing.
As currently written, your tests must include fetching since you have coupled coalescing and fetching, which seems to have added quite a bit of verbosity/complexity to your tests that you might otherwise be able to avoid, since existing tests should already be covering the fetching.
No good reason! I think factoring the range coalescing into a separate routine is a great idea for the reasons you mention. I'll implement that
I would also argue that offset-style requests could be coalesced. Although you may not know how many total bytes are available, you certainly know, by definition, where the range starts (the offset), so as you coalesce you can fold nearby explicit ranges into the offset request. You can also coalesce beyond that point. For example, all explicit ranges that start at or beyond the offset are already covered by it. This, of course, would seem to add a bit of complexity to the coalescing logic, so I suspect you could exclude such logic from this PR, and separately consider whether or not it would be worth the additional complexity at a later date (or never).
Agree, but the only consumer of this method is the sharding codec which exclusively requests explicit byte ranges. So if we implemented coalescing for other byte range requests, it would never be used.
I'm a bit confused then. Why do you accept all types of byte requests as input if you expect only byte range requests?
The coalescing is a performance optimization, and we don't want to push branching on the request type to the caller. A non-coalesced request falls through to a regular fetch, which is submitted in the same batch as the coalesced fetches. The alternative requires two batches. And we can't predict the requirements of future codecs. We know today what sharding needs, so we optimize for that. If a future caller needs to coalesce offset range requests, we can implement it inside this function without changing the signature.
That clarifies a bit for me, but I suspect I might need more time getting familiar with the library before that will be much more clear to me. After looking at the PR code a bit more, I see that you group the range requests, but don't actually collapse them (except when you actually fetch the bytes) so that once the bytes are fetched, you can slice them up according to the original range requests that you preserved.
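The group-then-slice step described here can be shown in miniature: one fused read covers a group, and the preserved original ranges slice the fused buffer back apart. The function name and shapes are illustrative assumptions.

```python
def split_fused(
    fused: bytes, group_start: int, ranges: list[tuple[int, int]]
) -> list[bytes]:
    # Each (start, end) is in object coordinates; rebase onto the fused
    # buffer before slicing.
    return [fused[start - group_start : end - group_start] for start, end in ranges]


data = bytes(range(32))              # the stored object
ranges = [(4, 8), (10, 12)]          # two original, preserved requests
group_start, group_end = 4, 12       # fused read spans min(start)..max(end)
fused = data[group_start:group_end]  # one IO call for the whole group

print(split_fused(fused, group_start, ranges))
```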
```python
DEFAULT_COALESCE_OPTIONS: CoalesceOptions = {
    "max_gap_bytes": 1 << 20,  # 1 MiB
    "max_coalesced_bytes": 16 << 20,  # 16 MiB
    "max_concurrency": 10,
}
```
Why does this need its own maximal concurrency? How will this interact with the global setting?
this needs a max concurrency because implementations may launch concurrent range reads. Right now this setting doesn't interact at all with the global setting, and that's hard to support given our current concurrency limit design -- codecs like sharding call concurrent_map recursively, so the global concurrency limit is already on shaky ground for sharding. We should do some testing with this parameter to see if there are any pathologies possible with too much concurrency.
this PR adds a `get_ranges` protocol for stores. The protocol defines the shape of a function that fetches multiple byte ranges within the same stored object. The purpose is to define a method stores can opt into if they offer an efficient way to fetch multiple byte ranges from the same object, which would be immediately useful for the sharding codec.

The return type is an async iterator over sequences, where each sequence is the result of an IO operation the store performed. This provides some observability to the caller about the actual coalescing, if any, that occurred. Results are returned in completion order, so the inner result type is `tuple[int, Buffer | None]`, where the `int` is the index into the input `byte_ranges` for that result.

Only byte range requests that declare an explicit interval (`RangeByteRequest`) are coalesced. Any other byte range, or `None`, results in no coalescing, and so those ranges will be fetched separately. I assume here that we do not care about coalescing overlapping suffix or prefix range requests, but we could add support for that if we need to.

In addition to this protocol, there's a freestanding function that takes, among other arguments, an `f(byte range) -> Awaitable[Buffer]` function (which we would generate by combining `Store.get` with `functools.partial`). This function contains basic byte range coalescing logic, and it can be re-used for multiple stores. This is a non-abstract-base-class alternative to a default implementation on an abc.

That freestanding function is used to implement `get_ranges` on the `FsspecStore`. This is probably not useful for local- or memory-backed storage, but is useful for remote storage. The actual implementation is lightweight.

cc @aldenks, the idea here is to build a basis for your range coalescing work for the sharding codec
cc @kylebarron, would love your feedback on this design.
related issues/PRs:
#1758
#3004