Server: surface SHARPI_CPU_MOE as a SharpInferenceServerOptions field (mirror #80)

## Background

After the server refactor (PR landing this session) most CLI MoE-cache knobs reach
the engine via options → env vars in `InferenceEngineLoader.ApplyMoeEnvironment`:
`SHARPI_MOE_WARMPIN`, `SHARPI_MOE_WARMPIN_AFTER`, `SHARPI_MOE_PREDICT_PREFETCH`,
`SHARPI_EXPERT_STATS`. One placement knob is conspicuously missing:
**`SHARPI_CPU_MOE`** — the all-or-nothing override that forces routed experts to
the CPU side.

It matters in practice: per the README perf table, `Qwen3.6-35B-A3B-MTP` only lands
its 22.9 t/s on the CUDA hybrid path when `SHARPI_CPU_MOE=1`. An operator running
the server has to know to export the env var before `dotnet run`, which defeats the
purpose of the options surface that mirrors the CLI for everything else.

## Cause

`SharpInferenceServerOptions` has no field for this; `ApplyMoeEnvironment` never
writes `SHARPI_CPU_MOE`. The engine's `CudaHybridGdnForwardPass` reads the env var
directly at construction time, so the option has to be translated before model load.

## Scope

1. Add `bool? CpuMoe { get; set; }` to `SharpInferenceServerOptions` (nullable so
   default behaviour — engine auto-selects from SLRU sizing — is preserved when
   unset). XML doc cross-references the CLI's `--cpu-moe` (issue #80).
2. In `InferenceEngineLoader.ApplyMoeEnvironment`, when the option is set, write
   `SHARPI_CPU_MOE=1` (true) or `SHARPI_CPU_MOE=0` (false) before model load.
3. One unit test that verifies the option round-trips into the env var ahead of
   model load (matches the existing MoE-knob coverage pattern).
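
The three steps above could look roughly like the following. This is a minimal sketch, not the actual implementation: the class bodies are stand-ins that show only the new field, `InferenceEngineLoaderSketch` is a hypothetical name, and the real `ApplyMoeEnvironment` also handles the other MoE knobs, with a signature that may differ.

```csharp
using System;

// Stand-in for the real options class; only the proposed field is shown.
public sealed class SharpInferenceServerOptions
{
    /// <summary>
    /// true writes SHARPI_CPU_MOE=1, false writes SHARPI_CPU_MOE=0.
    /// null leaves the variable alone so the engine keeps auto-selecting.
    /// Mirrors the CLI's --cpu-moe flag (issue #80).
    /// </summary>
    public bool? CpuMoe { get; set; }
}

// Hypothetical name; stands in for InferenceEngineLoader.
public static class InferenceEngineLoaderSketch
{
    public const string CpuMoeVar = "SHARPI_CPU_MOE";

    // Assumed shape of the translation step. It must run before model load,
    // because the forward pass reads the variable at construction time.
    public static void ApplyMoeEnvironment(SharpInferenceServerOptions options)
    {
        if (options.CpuMoe is bool cpuMoe)
        {
            Environment.SetEnvironmentVariable(CpuMoeVar, cpuMoe ? "1" : "0");
        }
        // null: no write, so an operator-exported value or the default stands.
    }
}

public static class Program
{
    private static string Current() =>
        Environment.GetEnvironmentVariable(InferenceEngineLoaderSketch.CpuMoeVar) ?? "(unset)";

    public static void Main()
    {
        // Start from a clean slate; passing null deletes the variable.
        Environment.SetEnvironmentVariable(InferenceEngineLoaderSketch.CpuMoeVar, null);

        InferenceEngineLoaderSketch.ApplyMoeEnvironment(new SharpInferenceServerOptions());
        Console.WriteLine(Current()); // null option: variable stays unset

        InferenceEngineLoaderSketch.ApplyMoeEnvironment(new SharpInferenceServerOptions { CpuMoe = true });
        Console.WriteLine(Current()); // true option: variable is "1"

        InferenceEngineLoaderSketch.ApplyMoeEnvironment(new SharpInferenceServerOptions { CpuMoe = false });
        Console.WriteLine(Current()); // false option: variable is "0"
    }
}
```

The `Main` body doubles as the shape of the unit test in step 3: clear the variable, apply the options, then assert on the environment before anything that would construct the forward pass.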

## Acceptance

- [ ] `SharpInference:CpuMoe=true` in `appsettings.json` produces the same
      forward-pass routing as `SHARPI_CPU_MOE=1`.
- [ ] Unset / null preserves the auto-select default (no env-var write).
- [ ] CLI `--cpu-moe` (issue #80) and the server option resolve to the same engine
      behaviour.
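
For reference, the first acceptance item would correspond to a config fragment along these lines. It assumes the options are bound from a `SharpInference` section, as the `SharpInference:CpuMoe` key path suggests; any other keys in that section are omitted here.

```json
{
  "SharpInference": {
    "CpuMoe": true
  }
}
```

Leaving the `CpuMoe` key out entirely corresponds to the unset / null case in the second item.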

## Related

- #80 — CLI `--cpu-moe` flag wrapping the same env var
- `src/SharpInference.Server/InferenceEngineLoader.cs` — `ApplyMoeEnvironment`
- `src/SharpInference.Server/SharpInferenceServerOptions.cs` — MoE knobs section
- `CudaHybridGdnForwardPass` (engine) — reads `SHARPI_CPU_MOE` at construction
