Skip to content

Add encoder/decoder in structureDataset for snowflake.#1811

Closed
pohanhuang wants to merge 16 commits into
flyteorg:masterfrom
pohanhuang:snowflake-structure-dataset
Closed

Add encoder/decoder in structureDataset for snowflake.#1811
pohanhuang wants to merge 16 commits into
flyteorg:masterfrom
pohanhuang:snowflake-structure-dataset

Conversation

@pohanhuang

@pohanhuang pohanhuang commented Aug 30, 2023

Copy link
Copy Markdown
Contributor

TL;DR

  1. Add encoder/decoder in structureDataset for the snowflake.
  2. Currently support the new protocol, snowflake, and sf.
    we can add them here, the default is a list of additional protocols, like the following.
image

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Example code for user, we can parse the uro with the following format of uri,
"snowflake://{dummy_user}:{dummy_account}/{dummy_warehouse}/{dummy_database}/{dummy_schema}/{dummy_table}"

import mock
import pandas as pd
import pytest
from typing_extensions import Annotated

from flytekit import StructuredDataset, kwtypes, task, workflow, Secret

@task(secret_requests=[Secret(group="snowflake", key="private_key.pem")])
def t1() -> StructuredDataset:
    df = pd.DataFrame({"NAME": ["Tom", "Joseph"], "AGE": [220, 27]})
    return StructuredDataset(
        dataframe=df, uri="sf://phuang9527:rq06467.us-east4.gcp/COMPUTE_WH/AGENT/PEOPLE/PEOPLE"
    )

@task(secret_requests=[Secret(group="snowflake", key="private_key.pem")])
def t2() -> StructuredDataset:
    df = pd.DataFrame({"NAME": ["Tom", "Joseph"], "AGE": [235, 27]})
    return StructuredDataset(
        dataframe=df, uri="snowflake://phuang9527:rq06467.us-east4.gcp/COMPUTE_WH/AGENT/PEOPLE/PEOPLE"
    )

@task(secret_requests=[Secret(group="snowflake", key="private_key.pem")])
def t3(sd: StructuredDataset) -> pd.DataFrame:
    return sd.open(pd.DataFrame).all() # we will load the whole table for user


@workflow
def wf() -> pd.DataFrame:
    t1()
    return t3(sd=t2())

if __name__ == "__main__":
    wf()

Tracking Issue

Flyte Agent Ecosystem

@codecov

codecov Bot commented Aug 30, 2023

Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 59.72222% with 29 lines in your changes missing coverage. Please review.

Project coverage is 54.84%. Comparing base (660f5aa) to head (0b47872).
Report is 671 commits behind head on master.

Files with missing lines Patch % Lines
flytekit/types/structured/snowflake.py 45.65% 25 Missing ⚠️
flytekit/core/type_engine.py 50.00% 2 Missing ⚠️
flytekit/types/structured/__init__.py 71.42% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1811      +/-   ##
==========================================
+ Coverage   54.33%   54.84%   +0.51%     
==========================================
  Files         288      295       +7     
  Lines       21934    22282     +348     
  Branches     3344     3357      +13     
==========================================
+ Hits        11917    12221     +304     
- Misses       9855     9902      +47     
+ Partials      162      159       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pohanhuang pohanhuang mentioned this pull request Aug 30, 2023
8 tasks
@pohanhuang pohanhuang marked this pull request as ready for review August 31, 2023 09:48
@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw , just finished add ingencoder/decoder in structureDataset for snowflake and unit test, please approve the testing and take a look.

Comment thread flytekit/types/structured/__init__.py Outdated
@pohanhuang pohanhuang requested a review from pingsutw September 1, 2023 04:38
@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw, please approve the CI test.

@pohanhuang pohanhuang force-pushed the snowflake-structure-dataset branch from 7fac349 to 09052cb Compare September 10, 2023 09:39
@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw, please approve the CI test.

@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw, just finished the changes, Please take a look.

@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw, just finished the changes to get, please review.
Thanks a lot!

pingsutw
pingsutw previously approved these changes Sep 14, 2023
Comment thread flytekit/types/structured/__init__.py Outdated
Comment thread flytekit/types/structured/snowflake.py Outdated
@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw, just finished the changes to get, please review.
Thanks a lot!

pingsutw
pingsutw previously approved these changes Sep 15, 2023
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
@pohanhuang pohanhuang force-pushed the snowflake-structure-dataset branch from 13c6d5b to ba3d2fc Compare September 27, 2023 16:59
Signed-off-by: HH <hhcs9527@gmail.com>
@pohanhuang pohanhuang requested a review from pingsutw September 27, 2023 17:38
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
@pohanhuang

Copy link
Copy Markdown
Contributor Author

Hi @pingsutw,
I finished the code change to support multiple protocols and fixed the URI.

I will create the new PR to support the new URI.
new URI: uri="snowflake://dummy_user:dummy_account/dummy_warehouse/dummy_database/dummy_schema/dummy_table

Thanks a lot

Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
Signed-off-by: HH <hhcs9527@gmail.com>
@Future-Outlier Future-Outlier mentioned this pull request Jul 25, 2024
3 tasks
@Future-Outlier

Copy link
Copy Markdown
Member

This is done by other PR, so I'm going to close it, thank you @hhcs9527 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants