MLO-70 multi partition support#59

Open
MDobransky wants to merge 4 commits into develop from MLO-70_multi_partition_support

Conversation

@MDobransky
Collaborator

No description provided.

@MDobransky MDobransky requested a review from vvancak as a code owner March 20, 2026 20:53
  df = self._execute(feature_group, run_date, pipeline)
  self.writer.write(df, info_date, target)
- records = self._check_written(info_date, target)
+ records = self._check_written(info_date, target, df)

This `df` comes from `self._execute`, and calling `.collect()` on it triggers the whole computation again. The whole point of using `_check_written` instead of `df.count()` was to avoid computing the same DataFrame twice, since some of our computations are heavy.
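A minimal sketch of the concern, using a toy lazy frame rather than Spark. `LazyFrame`, `ToyWriter`, and `check_written` are hypothetical stand-ins for illustration, not the classes in this PR. It counts how often the computation runs when the record count is read back from what was written, and when the frame is collected a second time.

```python
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame; not the Spark API."""

    def __init__(self, compute):
        self._compute = compute
        self.runs = 0  # how many times the full computation was triggered

    def collect(self):
        self.runs += 1
        return self._compute()


class ToyWriter:
    """Hypothetical writer that materializes the frame once into storage."""

    def __init__(self):
        self.storage = {}

    def write(self, df, info_date, target):
        self.storage[(target, info_date)] = df.collect()


def check_written(storage, info_date, target):
    # Count rows from what was already written; the lazy frame is not touched.
    return len(storage[(target, info_date)])


df = LazyFrame(lambda: [{"id": i} for i in range(3)])
writer = ToyWriter()
writer.write(df, "2026-03-20", "features")

records = check_written(writer.storage, "2026-03-20", "features")
runs_after_check = df.runs    # only the write ran the computation

df.collect()                  # what collecting df inside the check would do
runs_after_collect = df.runs  # the computation ran a second time
```

If the check needs values from the freshly computed data, reading them from the written partition keeps the computation at a single run.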


df = self.reader.get_table(
table.get_table_path(), date_column=table.partition, date_from=date, date_to=date, filters=filters
)

date_from=info_date, date_to=info_date


2 participants