app stop/start fails with HTTP 500 on RabbitMQ 4.3+: CLI declares a transient non-exclusive observer queue #156

@perelin

Hi! $LLM (claude4.8) and I did some bug hunting. Here are the results.


Describe the bug

runtipi-cli app stop|start <app> exits with HTTP 500 on instances whose runtipi-queue runs RabbitMQ 4.3+, even though the actual app lifecycle action succeeds in the backend.

Root cause is in the CLI itself, not the server. ListenForMessage declares a per-invocation observer queue that is transient and non-exclusive:

https://github.com/runtipi/cli/blob/develop/internal/utils/queue.go#L76-L84

observerQueueName := fmt.Sprintf("go_app_events_observer_%d", time.Now().UnixNano())
q, err := ch.QueueDeclare(
    observerQueueName,
    false, // durable    = false  -> transient
    true,  // autoDelete = true
    false, // exclusive  = false  -> non-exclusive
    false, // noWait     = false
    nil,   // args
)
if err != nil {
    log.Fatalf("Failed to declare queue '%s': %v", observerQueueName, err)
}

The combination durable=false and exclusive=false declares a transient, non-exclusive classic queue. RabbitMQ deprecated this queue type (deprecated feature transient_nonexcl_queues) and made it deny-by-default in 4.3.0. As a result, QueueDeclare returns Exception (541) INTERNAL_ERROR - Feature `transient_nonexcl_queues` is deprecated, log.Fatalf aborts the CLI, and the wrapper reports HTTP 500.

Because the image tag used for the queue (rabbitmq:4-alpine) floats across minor versions, existing instances drift onto 4.3 on a normal image pull, and the CLI starts failing without any config change.

To Reproduce

  1. Run an instance with runtipi-queue on rabbitmq:4-alpine resolving to 4.3.x (e.g. 4.3.1).
  2. ./runtipi-cli app stop <app>:<store> (or start).
  3. Observe:
Failed to declare queue 'go_app_events_observer_<nanos>': Exception (541) Reason:
"INTERNAL_ERROR - Feature `transient_nonexcl_queues` is deprecated. By default, this feature is not permitted anymore. ..."
Error code: 500
Response: {"statusCode":500,"message":"Internal server error","path":"/api/app-lifecycle/<app>/stop"}
✗ Failed to stop app <app>. See logs/error.log for more details.

On the broker side, every CLI call logs:

operation queue.declare caused a connection exception internal_error:
"Feature `transient_nonexcl_queues` is deprecated. ..."

Expected behavior

app stop/start should succeed (exit 0) on RabbitMQ 4.3+.

Note: the action actually runs

Importantly, the backend still performs the stop/start. The server-side RPC queues were made durable in runtipi/runtipi#2595 (shipped in v4.10.0), so system-events-queue, app-events-queue, and repo-queue are unaffected and the Web UI works. Only the CLI's own observer queue still uses the deprecated declaration, so the CLI reports a spurious failure while the container is actually recreated (verified: container uptime resets). The failure therefore looks fatal to anyone driving Runtipi via the CLI or scripts.

Suggested fix

Declare the observer queue as exclusive (it is a short-lived, per-connection reply/observer queue that should be auto-removed when the CLI disconnects). Exclusive queues are exempt from the transient_nonexcl_queues deprecation:

q, err := ch.QueueDeclare(
    observerQueueName,
    false, // durable
    true,  // autoDelete
    true,  // exclusive  <-- changed from false
    false, // noWait
    nil,   // args
)

(Alternatively, set durable=true, mirroring the approach in runtipi/runtipi#2595, but exclusive=true matches the queue's actual lifecycle better.)
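
The effect of the flag change can be sketched without a broker. The following is a minimal Go stand-in, not the CLI's or RabbitMQ's actual code: declareOpts, stubDeclare, and errDeprecated are hypothetical names, and the stub only models the rule described in this report (a queue is refused when it is neither durable nor exclusive).

```go
package main

import (
	"errors"
	"fmt"
)

// declareOpts holds the two QueueDeclare flags that matter here.
// Hypothetical type for illustration; not part of the CLI or of amqp091-go.
type declareOpts struct {
	Durable   bool
	Exclusive bool
}

// errDeprecated is a stand-in for the broker's refusal; the text is made up
// for this sketch and is not a real RabbitMQ message.
var errDeprecated = errors.New("stub broker: transient non-exclusive queue refused")

// stubDeclare imitates the deny-by-default rule: a declaration is refused
// only when the queue is neither durable nor exclusive.
func stubDeclare(o declareOpts) error {
	if !o.Durable && !o.Exclusive {
		return errDeprecated
	}
	return nil
}

func main() {
	cases := []struct {
		label string
		opts  declareOpts
	}{
		{"current CLI (transient, non-exclusive)", declareOpts{Durable: false, Exclusive: false}},
		{"suggested fix (exclusive)", declareOpts{Durable: false, Exclusive: true}},
		{"alternative (durable)", declareOpts{Durable: true, Exclusive: false}},
	}
	for _, c := range cases {
		fmt.Printf("%-40s -> err=%v\n", c.label, stubDeclare(c.opts))
	}
}
```

Under these assumptions only the first case returns an error, which is consistent with either exclusive=true or durable=true avoiding the deprecation. A real check would still need to run against a RabbitMQ 4.3+ broker.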

Environment

  • CLI: v4.2.1
  • Runtipi: v4.10.0
  • runtipi-queue: rabbitmq:4-alpine → 4.3.1
  • OS: Ubuntu (Linux 6.x)