Skip to content

CLI: clarify backend selection (no --backend flag; CUDA not wired to LLM path) #4

@pekkah

Description

@pekkah

Summary

Users may try --backend cuda based on the README mentioning a CUDA backend, but no such flag exists on the RunCommand. The flag is silently ignored by Spectre.Console and inference falls back to CPU. The CUDA backend (SharpInference.Cuda) is wired only to ImageCommand (image generation), not to LLM inference.

What exists

  • -g 0 (default): CPU only
  • -g N: hybrid - N layers on Vulkan GPU + rest on CPU
  • -g -1: all layers on Vulkan GPU (auto-detect via TierPlanner)

Vulkan only. Requesting CUDA for chat inference is not supported.

Suggested fixes

  1. Reject unknown options (Spectre.Console has StrictParsing mode) instead of silently ignoring --backend cuda
  2. Add a --backend {cpu,vulkan} option that maps onto -g semantics, and emit an explicit error if cuda is requested for chat inference
  3. README cleanup: clarify that CUDA is image-only

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions