[EPIC] Streaming partitioned writes #6569

@alamb

Is your feature request related to a problem or challenge?

This is a tracking epic for a collection of features related to writing data.

The basic idea is better / full support for writing data:

  1. To multiple (possibly partitioned by value) files
  2. To different file types (parquet, csv, json, avro, arrow)
  3. In a streaming fashion (input doesn't need to be entirely buffered)
  4. From SQL (via INSERT, INSERT INTO, COPY, etc.)
  5. Streamed to a target object_store (aka multi-part S3 upload)

This is partially supported today programmatically (see SessionContext::write_csv, etc.).
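
To make items 1 and 3 concrete, here is a minimal, language-agnostic sketch in plain Python (not DataFusion code) of a streaming partitioned write. It consumes rows one at a time and routes each to a file chosen by the value of a partition column. The function name `write_partitioned_csv`, the Hive-style `col=value` directory layout, and the `part-0.csv` file name are assumptions made for illustration; nothing here reflects how DataFusion does or will implement this.

```python
import csv
import os
from typing import Dict, Iterable, List


def write_partitioned_csv(
    rows: Iterable[Dict[str, str]],
    base_dir: str,
    partition_col: str,
) -> List[str]:
    """Stream rows into one CSV file per distinct value of partition_col.

    Rows are consumed one at a time, so the input is never fully buffered.
    Files are written to base_dir/<col>=<value>/part-0.csv (assumed layout).
    Returns the sorted list of file paths that were written.
    """
    writers = {}  # partition value -> (file handle, csv.DictWriter)
    paths = []
    try:
        for row in rows:
            value = row[partition_col]
            if value not in writers:
                # First row for this partition: create its directory and file.
                part_dir = os.path.join(base_dir, f"{partition_col}={value}")
                os.makedirs(part_dir, exist_ok=True)
                path = os.path.join(part_dir, "part-0.csv")
                handle = open(path, "w", newline="")
                # The partition value is encoded in the directory name,
                # so the column is left out of the file contents.
                fields = [k for k in row if k != partition_col]
                writer = csv.DictWriter(
                    handle, fieldnames=fields, extrasaction="ignore"
                )
                writer.writeheader()
                writers[value] = (handle, writer)
                paths.append(path)
            writers[value][1].writerow(row)
    finally:
        # Close every open partition file, even if the input raised an error.
        for handle, _ in writers.values():
            handle.close()
    return sorted(paths)
```

The sketch keeps one open file per partition value, which is only reasonable for a small number of partitions; a real implementation would likely need to bound the number of open writers, and for item 5 would replace the local file handle with a multi-part upload to the object store.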

Subtasks:

Labels: enhancement (New feature or request)