Fix remote file system (get/put) #1955
Signed-off-by: Kevin Su <pingsutw@apache.org>
Codecov ReportAttention:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1955      +/-   ##
==========================================
- Coverage   62.70%   62.58%   -0.13%
==========================================
  Files         313      310       -3
  Lines       23181    23108      -73
  Branches     3511     3513       +2
==========================================
- Hits        14536    14462      -74
- Misses       8223     8224       +1
  Partials      422      422
wild-endeavor
left a comment
is there a unit test we can add?
      )
      print(f"Getting {from_path} to {to_path}")
-     return file_system.get(from_path, to_path, recursive=recursive, **kwargs)
+     dst = file_system.get(from_path, to_path, recursive=recursive, **kwargs)
what does dst stand for?
maybe add a test like
, just making a dummy transformer, if that's easy to do. If not, just add the below as a comment in both places.
dst is short for destination, right?
yes, since fsspec uses dst as well
@eapolinario basically, with the PR that went in we kind of added a new API without being able to make it explicit. When an fsspec filesystem handles get/put, it now has the option of returning a value, and flytekit uses that value as the URI in most places. This is useful because the Flyte fs is not a real filesystem, so users don't actually pick the to_path they want to write to; instead, the filesystem returns where the data was actually written. In this case, however, the s3 filesystem was returning
      print(f"Getting {from_path} to {to_path}")
-     return file_system.get(from_path, to_path, recursive=recursive, **kwargs)
+     dst = file_system.get(from_path, to_path, recursive=recursive, **kwargs)
+     if isinstance(dst, (str, pathlib.Path)):
IIUC, [None] is a guard value returned by fsspec, right? If that's the case, why don't we handle those explicitly?
oh really? didn't see that. link me?
and yes we should, but the action should still be to return the to_path.
I don't think it's part of the spec and I fear this is specific to s3fs. Doing what we're doing in the PR is probably safer (as it might apply to other fsspec-compliant implementations).
AsyncFileSystem.put or AsyncFileSystem.get always return a list.
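To make the behaviour discussed in this thread concrete, here is a minimal sketch of the fallback, not the actual flytekit code. `resolve_destination`, `put_data`, `ListReturningFS`, and `PathReturningFS` are hypothetical names invented for illustration; the two stand-in classes only mimic the return values described above (a `[None]` list from an async backend, and a real path from a filesystem that picks its own destination).

```python
import pathlib
from typing import Any


def resolve_destination(result: Any, to_path: str):
    """Use the filesystem's return value only if it looks like a path.

    Anything else (None, [None], other lists) falls back to to_path.
    """
    if isinstance(result, (str, pathlib.Path)):
        return result
    return to_path


class ListReturningFS:
    """Hypothetical stand-in for a backend whose put() returns [None]."""

    def put(self, from_path, to_path, recursive=False, **kwargs):
        return [None]


class PathReturningFS:
    """Hypothetical stand-in for a filesystem that chooses the destination."""

    def put(self, from_path, to_path, recursive=False, **kwargs):
        return "flyte://uploads/generated-name"  # made-up URI for illustration


def put_data(file_system, from_path, to_path, recursive=False, **kwargs):
    # Mirrors the shape of the change in this PR: capture the result, then normalize it.
    dst = file_system.put(from_path, to_path, recursive=recursive, **kwargs)
    return resolve_destination(dst, to_path)
```

Checking the type of the result, rather than matching `[None]` specifically, keeps the fallback independent of what any one fsspec implementation happens to return.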
Signed-off-by: Kevin Su <pingsutw@apache.org>
* test
Signed-off-by: Kevin Su <pingsutw@apache.org>
* test
Signed-off-by: Kevin Su <pingsutw@apache.org>
* test
Signed-off-by: Kevin Su <pingsutw@apache.org>
* lint
Signed-off-by: Kevin Su <pingsutw@apache.org>
* update-get
Signed-off-by: Kevin Su <pingsutw@apache.org>
* add unit test
Signed-off-by: Kevin Su <pingsutw@apache.org>
---------
Signed-off-by: Kevin Su <pingsutw@apache.org>
* test
Signed-off-by: Kevin Su <pingsutw@apache.org>
* test
Signed-off-by: Kevin Su <pingsutw@apache.org>
* test
Signed-off-by: Kevin Su <pingsutw@apache.org>
* lint
Signed-off-by: Kevin Su <pingsutw@apache.org>
* update-get
Signed-off-by: Kevin Su <pingsutw@apache.org>
* add unit test
Signed-off-by: Kevin Su <pingsutw@apache.org>
---------
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Rafael Raposo <rafaelraposo@spotify.com>
TL;DR
Failed to serialize the FlyteFile due to the error below:
[3/3] currentAttempt done. Last Error: SYSTEM::Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flytekit/exceptions/scopes.py", line 165, in system_entry_point
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flytekit/core/base_task.py", line 662, in dispatch_execute
    literals_map, native_outputs_as_map = self._output_to_literal_map(native_outputs, exec_ctx)
  File "/usr/local/lib/python3.10/site-packages/flytekit/core/base_task.py", line 561, in _output_to_literal_map
    raise TypeError(msg)
Message: Failed to convert outputs of task 'test.t1' at position 0: 'list' object has no attribute 'startswith'
SYSTEM ERROR! Contact platform administrators.
That's because AsyncFileSystem (s3fs) returns [None] when calling file_system.put.
Type
Are all requirements met?
Complete description
Tracking Issue
NA
Follow-up issue
NA
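For reference, the `'list' object has no attribute 'startswith'` message in the TL;DR can be reproduced without s3fs or flytekit. The sketch below assumes only that some downstream code treats the returned value as a string URI; `looks_remote` is a hypothetical helper, not a flytekit function.

```python
def looks_remote(uri):
    # Hypothetical downstream check that assumes `uri` is a str.
    return uri.startswith("s3://")


try:
    # [None] stands in for the value returned by put() on an async backend.
    looks_remote([None])
except AttributeError as exc:
    message = str(exc)
    print(message)
```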