[Ballista] Introduce QueryStageScheduler for better managing the stage-based task scheduling #1704

@yahoNanJing

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The current implementation of get_next_schedulable_task() is inefficient because tasks are not classified by stage or by status, which hurts most when there are thousands of tasks to schedule. It also makes the following hard to support:

  • stage-level priority-based scheduling
  • stage-level retry
  • speculative task scheduling
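To make the cost concrete, here is a minimal sketch contrasting a flat scan with a lookup over tasks already classified by stage. The types `Task`, `TaskStatus`, and `PendingByStage` are hypothetical and written only for illustration; they are not Ballista's actual data structures, and the real get_next_schedulable_task() may differ in detail.

```rust
use std::collections::{BTreeMap, VecDeque};

// Hypothetical task model, assumed for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TaskStatus {
    Pending,
    Running,
    Completed,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Task {
    stage_id: usize,
    partition: usize,
    status: TaskStatus,
}

/// Unclassified approach: every call walks the whole task list until it
/// finds a pending task that belongs to a runnable stage.
fn next_by_scan(tasks: &[Task], runnable_stages: &[usize]) -> Option<Task> {
    tasks
        .iter()
        .copied()
        .find(|t| t.status == TaskStatus::Pending && runnable_stages.contains(&t.stage_id))
}

/// Classified approach: pending partitions are grouped per stage up front,
/// so a lookup only touches the queues of the runnable stages.
struct PendingByStage {
    pending: BTreeMap<usize, VecDeque<usize>>,
}

impl PendingByStage {
    fn from_tasks(tasks: &[Task]) -> Self {
        let mut pending: BTreeMap<usize, VecDeque<usize>> = BTreeMap::new();
        for t in tasks.iter().filter(|t| t.status == TaskStatus::Pending) {
            pending.entry(t.stage_id).or_default().push_back(t.partition);
        }
        Self { pending }
    }

    /// Returns (stage_id, partition) of the next pending task, if any.
    fn pop_next(&mut self, runnable_stages: &[usize]) -> Option<(usize, usize)> {
        for stage_id in runnable_stages {
            if let Some(queue) = self.pending.get_mut(stage_id) {
                if let Some(partition) = queue.pop_front() {
                    return Some((*stage_id, partition));
                }
            }
        }
        None
    }
}
```

With the classified layout, the work per lookup depends on the number of runnable stages rather than on the total number of tasks, and a per-stage queue is also a natural place to hang priorities or retries later.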

Describe the solution you'd like

Drawing on Spark's design, it would be better to split task scheduling into two levels, stage scheduling and task scheduling:

  • introduce QueryStageScheduler to schedule stages
  • introduce StageManager to manage job stages and the state machine of each stage
  • let TaskScheduler fetch tasks from the running stages in StageManager

An example sequence diagram is as follows:
[Image: Picture1]
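The proposed split could be sketched roughly as below. This is only an illustration under stated assumptions: the `StageState` variants, the `Stage` fields, and every method name are hypothetical and not taken from Ballista, and the stage-scheduling and task-fetching roles are folded into one `StageManager` type for brevity. Each stage is assumed to have at least one partition.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

// Hypothetical per-stage state machine: Pending -> Running -> Completed.
// Failed is a placeholder hook for stage-level retry and is not exercised here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StageState {
    Pending,
    Running,
    Completed,
    Failed,
}

struct Stage {
    state: StageState,
    parents: Vec<usize>,                // stages whose output this stage reads
    pending_partitions: VecDeque<usize>, // tasks not yet handed out
    unfinished: usize,                   // tasks not yet reported complete
}

struct StageManager {
    stages: BTreeMap<usize, Stage>,
    running: BTreeSet<usize>, // ids of stages currently in Running state
}

impl StageManager {
    fn new() -> Self {
        Self { stages: BTreeMap::new(), running: BTreeSet::new() }
    }

    fn add_stage(&mut self, id: usize, parents: Vec<usize>, partitions: usize) {
        let stage = Stage {
            state: StageState::Pending,
            parents,
            pending_partitions: (0..partitions).collect(),
            unfinished: partitions,
        };
        self.stages.insert(id, stage);
    }

    /// Stage-level scheduling: move every pending stage whose parents have
    /// all completed into Running, and return the promoted stage ids.
    fn schedule_ready_stages(&mut self) -> Vec<usize> {
        let ready: Vec<usize> = self
            .stages
            .iter()
            .filter(|(_, s)| {
                s.state == StageState::Pending
                    && s.parents.iter().all(|p| {
                        self.stages
                            .get(p)
                            .map_or(false, |ps| ps.state == StageState::Completed)
                    })
            })
            .map(|(id, _)| *id)
            .collect();
        for id in &ready {
            if let Some(stage) = self.stages.get_mut(id) {
                stage.state = StageState::Running;
            }
            self.running.insert(*id);
        }
        ready
    }

    /// Task-level scheduling: hand out the next task, looking only at
    /// running stages. Returns (stage_id, partition).
    fn fetch_task(&mut self) -> Option<(usize, usize)> {
        for id in &self.running {
            if let Some(stage) = self.stages.get_mut(id) {
                if let Some(partition) = stage.pending_partitions.pop_front() {
                    return Some((*id, partition));
                }
            }
        }
        None
    }

    /// Record one finished task; returns true when this completes the stage.
    fn task_completed(&mut self, stage_id: usize) -> bool {
        if let Some(stage) = self.stages.get_mut(&stage_id) {
            if stage.state == StageState::Running && stage.unfinished > 0 {
                stage.unfinished -= 1;
                if stage.unfinished == 0 {
                    stage.state = StageState::Completed;
                    self.running.remove(&stage_id);
                    return true;
                }
            }
        }
        false
    }
}
```

In this sketch a caller would add stages, call `schedule_ready_stages()` once at job start and again whenever `task_completed()` returns true, and serve executor requests with `fetch_task()`. Keeping the set of running stages explicit is what would let priorities, retries, or speculative tasks be decided per stage instead of per task.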

Labels: enhancement (New feature or request)
