CLI: clarify backend selection (no --backend flag; CUDA not wired to LLM path)

## Summary

Users may try `--backend cuda` based on the README mentioning a CUDA backend, but no such flag exists on the `RunCommand`. The flag is silently ignored by Spectre.Console and inference falls back to CPU. The CUDA backend (`SharpInference.Cuda`) is wired only to `ImageCommand` (image generation), not to LLM inference.

## What exists

- `-g 0` (default): CPU only
- `-g N`: hybrid - N layers on Vulkan GPU + rest on CPU
- `-g -1`: all layers on Vulkan GPU (auto-detect via `TierPlanner`)

Vulkan only. Requesting CUDA for chat inference is not supported.

## Suggested fixes

1. Reject unknown options (Spectre.Console has `StrictParsing` mode) instead of silently ignoring `--backend cuda`
2. Add a `--backend {cpu,vulkan}` option that maps onto `-g` semantics, and emit an explicit error if `cuda` is requested for chat inference
3. README cleanup: clarify that CUDA is image-only

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLI: clarify backend selection (no --backend flag; CUDA not wired to LLM path) #4

Summary

What exists

Suggested fixes

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

CLI: clarify backend selection (no --backend flag; CUDA not wired to LLM path) #4

Description

Summary

What exists

Suggested fixes

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions