[pr into #785] Turn structured dataset into dataclass #802
Signed-off-by: Kevin Su <pingsutw@apache.org>
Codecov Report
@@ Coverage Diff @@
## structured-dataset-proposal #802 +/- ##
===============================================================
+ Coverage 85.60% 85.71% +0.10%
===============================================================
Files 353 356 +3
Lines 30465 30585 +120
Branches 3674 3679 +5
===============================================================
+ Hits 26080 26216 +136
+ Misses 3716 3700 -16
Partials 669 669
  # If the type signature has the StructuredDataset class, it will, or at least should, also be a
  # StructuredDataset instance.
- if issubclass(python_type, StructuredDataset):
+ if inspect.isclass(python_type) and issubclass(python_type, StructuredDataset):
This works for Python 3.7-3.10, right?
Yeah, I've tested it with Python 3.7 through 3.10.
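For context on why the guard matters: an `Annotated[...]` alias is not a class, so passing it straight to `issubclass` raises `TypeError`. The sketch below is only an illustration of that behavior. `StructuredDataset` and `MyDataset` here are hypothetical stand-ins, not the real classes from this PR, and the helper names are made up for the example.

```python
import inspect
from typing import Annotated  # Python 3.9+; older versions need typing_extensions


class StructuredDataset:
    """Hypothetical stand-in for the real StructuredDataset class."""


class MyDataset(StructuredDataset):
    """Hypothetical subclass, used only for this illustration."""


def is_structured_dataset_type(python_type) -> bool:
    # inspect.isclass short-circuits for non-class objects such as
    # Annotated[...] aliases, so issubclass is never handed one.
    return inspect.isclass(python_type) and issubclass(python_type, StructuredDataset)


def unguarded_check_raises(python_type) -> bool:
    # Calls issubclass without the guard and reports whether it raised TypeError.
    try:
        issubclass(python_type, StructuredDataset)
    except TypeError:
        return True
    return False
```

With the guard, `is_structured_dataset_type(Annotated[int, "meta"])` simply evaluates to `False` instead of raising.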
| ) | ||
|
|
||
| # Otherwise assume it's a dataframe instance. Wrap it with some defaults | ||
| if get_origin(python_type) is Annotated: |
Should we have done this at the top of this function, or is it okay here?
We extract python_type in get_transformer rather than in to_literal, so python_type can still be Annotated here.
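As a rough illustration of what that check sees: `get_origin` identifies an `Annotated` alias, and `get_args` returns the base type followed by the metadata. This is a minimal sketch, not the code in this PR. `DataFrame`, `MY_COLS`, and `split_annotated` are hypothetical names introduced only for the example.

```python
from typing import Annotated, get_args, get_origin  # Python 3.9+


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame."""


# Hypothetical column metadata attached to the type.
MY_COLS = ("name", "age")


def split_annotated(python_type):
    # For Annotated[Base, meta1, ...], get_args returns (Base, meta1, ...).
    if get_origin(python_type) is Annotated:
        base, *metadata = get_args(python_type)
        return base, tuple(metadata)
    # Plain types carry no metadata.
    return python_type, ()
```

So a signature written as `Annotated[DataFrame, MY_COLS]` still exposes both the dataframe type and the column information at this point in the function.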
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
  # 2. Fill in columns by checking for StructuredDataset metadata. For example, StructuredDataset[my_cols, parquet]
- elif issubclass(t, StructuredDataset):
+ elif inspect.isclass(t) and issubclass(t, StructuredDataset):
Remind me again what this inspect.isclass is supposed to catch? Can you add a comment? I keep forgetting.
That's for Annotated[pd.DataFrame, my_col]. I just moved expected_python_type = get_args(expected_python_type)[0] to the beginning of to_python and to_literal, so we don't need inspect.isclass(t) any more and I removed it.
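A minimal sketch of the unwrap-first approach described in this reply, assuming the unwrapped type is always a plain class. `StructuredDataset`, `DataFrame`, `unwrap_annotated`, and `classify` are hypothetical names for illustration. The real to_python and to_literal do far more than this.

```python
from typing import Annotated, get_args, get_origin  # Python 3.9+


class StructuredDataset:
    """Hypothetical stand-in for the real StructuredDataset class."""


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame."""


def unwrap_annotated(expected_python_type):
    # Strip the Annotated wrapper once, up front, keeping only the base type.
    if get_origin(expected_python_type) is Annotated:
        return get_args(expected_python_type)[0]
    return expected_python_type


def classify(expected_python_type) -> str:
    expected_python_type = unwrap_annotated(expected_python_type)
    # After unwrapping, a plain issubclass check no longer sees an Annotated
    # alias, so the inspect.isclass guard is not needed for that case.
    if issubclass(expected_python_type, StructuredDataset):
        return "structured_dataset"
    return "dataframe"
```

The trade-off is that the metadata is dropped at the unwrap step, so anything that needs the column information has to read it before or during unwrapping.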
Signed-off-by: Kevin Su <pingsutw@apache.org>