feat: Optimize for small sites by default to cap memory usage by tanmoysrt · Pull Request #60 · frappe/bench-cli

tanmoysrt · 2026-06-17T06:30:39Z

The admin service takes ~40 MB of memory, yet it is rarely accessed. It can be shut down after a period of inactivity and woken up only when a new request arrives.
Trim malloc memory periodically (default: every 5 minutes) or after a specified number of requests (default: 100) to reclaim memory.
Socket.IO was broken on production setups; this is now fixed.

Defaults are tuned for small sites. They can be disabled or raised for large production sites.
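A minimal sketch of the idle-shutdown idea, assuming the admin service is started by systemd socket activation (the listening socket is inherited as fd 3, announced through `LISTEN_PID` and `LISTEN_FDS`). The names `inherited_socket`, `IdleTracker` and `serve_until_idle` are hypothetical and not taken from this PR:

```python
import os
import socket
import time

SD_LISTEN_FDS_START = 3  # first fd systemd hands to a socket-activated service


def inherited_socket(environ=os.environ, pid=None):
    """Return the listening socket passed by systemd, or None if not activated."""
    pid = os.getpid() if pid is None else pid
    if environ.get("LISTEN_PID") != str(pid):
        return None
    if int(environ.get("LISTEN_FDS", "0")) < 1:
        return None
    return socket.socket(fileno=SD_LISTEN_FDS_START)


class IdleTracker:
    """Tracks the time since the last request (hypothetical helper)."""

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self._clock = clock
        self._last = clock()

    def touch(self):
        self._last = self._clock()

    def should_exit(self):
        return self._clock() - self._last >= self.idle_timeout


def serve_until_idle(sock, handle, idle_timeout):
    """Accept connections until none has arrived for idle_timeout seconds."""
    tracker = IdleTracker(idle_timeout)
    sock.settimeout(1.0)  # wake up regularly to check the idle deadline
    while not tracker.should_exit():
        try:
            conn, _ = sock.accept()
        except socket.timeout:
            continue
        tracker.touch()
        with conn:
            handle(conn)
    # Returning lets the process exit; systemd keeps the socket open and
    # starts the service again on the next incoming connection.
```

The process only pays its memory cost while someone is actually using the admin UI; the first request after an idle period pays the startup latency instead.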


Adds a throttled post_request hook that calls malloc_trim(0) after N requests or N seconds, returning freed glibc arena memory to the OS so transient spikes don't pin the web worker's RSS. Controlled by the new malloc_trim_requests (default 100) and malloc_trim_interval (default 300) gunicorn config keys; setting both to 0 disables the hook. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
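Roughly, such a throttled hook could look like the sketch below. `post_request(worker, req, environ, resp)` is gunicorn's server hook; `TrimThrottle` and `load_malloc_trim` are hypothetical names, and the defaults mirror the config keys described above. This is an illustration of the idea, not the code from the commit:

```python
import ctypes
import ctypes.util
import time


def load_malloc_trim():
    """Return a callable wrapping glibc's malloc_trim(0), or a no-op elsewhere."""
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
        trim = libc.malloc_trim  # AttributeError on non-glibc libcs
    except (OSError, AttributeError):
        return lambda: 0
    trim.argtypes = [ctypes.c_size_t]
    trim.restype = ctypes.c_int
    return lambda: trim(0)


class TrimThrottle:
    """Trim after max_requests requests or interval seconds, whichever is first."""

    def __init__(self, max_requests=100, interval=300, trim=None,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.interval = interval
        self._trim = trim or load_malloc_trim()
        self._clock = clock
        self._count = 0
        self._last = clock()

    def on_request(self):
        if not self.max_requests and not self.interval:
            return False  # both 0: hook disabled
        self._count += 1
        now = self._clock()
        due_by_count = self.max_requests and self._count >= self.max_requests
        due_by_time = self.interval and now - self._last >= self.interval
        if not (due_by_count or due_by_time):
            return False
        self._trim()
        self._count = 0
        self._last = now
        return True


_throttle = TrimThrottle()


def post_request(worker, req, environ, resp):
    # gunicorn calls this in the worker after each request has been processed.
    _throttle.on_request()
```

Note that the time check only runs when a request arrives, so an idle worker never trims.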

The previous hook only ran inside post_request, so the time-based trim fired only when a request happened to arrive — an idle worker never reclaimed after a spike, and under load the request count tripped first, leaving the interval effectively dead. Replace it with a daemon timer thread started per worker in post_worker_init: it trims every malloc_trim_interval seconds regardless of traffic, and post_request wakes it early once malloc_trim_requests requests have accrued. Verified live: trims fire every interval on a fully idle worker. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
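The timer-thread variant can be sketched with a `threading.Event`: the thread sleeps on the event with a timeout, so it wakes either when the interval elapses or when the request counter sets the event. `post_worker_init(worker)` and `post_request(...)` are gunicorn hooks; `TrimTimer` is a hypothetical name, and the `trim` callable is a placeholder for the real `malloc_trim(0)` call:

```python
import threading
import time


class TrimTimer:
    """Per-worker daemon thread that trims on a timer or when woken early."""

    def __init__(self, trim, interval=300, max_requests=100):
        self._trim = trim
        self.interval = interval
        self.max_requests = max_requests
        self._wake = threading.Event()
        self._lock = threading.Lock()
        self._count = 0
        self.trims = 0
        self.last_trim = None

    def start(self):
        thread = threading.Thread(target=self._run, name="malloc-trim", daemon=True)
        thread.start()
        return thread

    def _run(self):
        while True:
            # Returns on timeout (idle worker) or when note_request() sets the
            # event. interval == 0 means "wait for the request counter only".
            self._wake.wait(timeout=self.interval or None)
            self._wake.clear()
            with self._lock:
                self._count = 0
            self._trim()
            self.trims += 1
            self.last_trim = time.monotonic()

    def note_request(self):
        with self._lock:
            self._count += 1
            if self.max_requests and self._count >= self.max_requests:
                self._wake.set()  # wake the timer thread early


_timer = None


def post_worker_init(worker):
    # Runs once in each worker after it has initialised.
    global _timer
    _timer = TrimTimer(trim=lambda: None)  # placeholder trim callable
    _timer.start()


def post_request(worker, req, environ, resp):
    if _timer is not None:
        _timer.note_request()
```

Because the thread is a daemon, it never blocks worker shutdown or recycling.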


Replace the separate scheduler companion and per-group worker companions with one run_worker_pool companion. The pool runs every queue (deduped union across [workers] groups) with num_workers equal to the summed group counts, and the Frappe scheduler runs as a thread inside the pool workers — so the dedicated scheduler process is gone, one fewer process per bench. Verified live on a companion-manager bench: pool processes jobs, the embedded scheduler thread runs, and web-tree PSS drops vs the legacy layout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
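The pool sizing rule described here (deduplicated union of queues, summed worker counts) can be illustrated as follows. The shape of the `[workers]` groups, a list of dicts with `queues` and `count`, is an assumption made for the example, as is the name `plan_worker_pool`:

```python
def plan_worker_pool(groups):
    """Collapse per-group worker settings into one pool definition.

    `groups` is assumed to look like
    [{"queues": ["default", "short"], "count": 1}, ...].
    """
    queues = []
    for group in groups:
        for queue in group["queues"]:
            if queue not in queues:  # dedupe while keeping first-seen order
                queues.append(queue)
    return {
        "queues": queues,
        "num_workers": sum(group["count"] for group in groups),
    }


pool = plan_worker_pool([
    {"queues": ["default", "short"], "count": 1},
    {"queues": ["short", "long"], "count": 2},
])
print(pool)  # {'queues': ['default', 'short', 'long'], 'num_workers': 3}
```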

Replace the malloc_trim timer (redundant — gunicorn worker recycling / max_requests covers post-spike reclaim) with a memory_allocator choice in [gunicorn]: "auto" (default) LD_PRELOADs jemalloc when libjemalloc is on the host, else uses stock pymalloc/glibc; "jemalloc"/"pymalloc" force it. jemalloc keeps pymalloc on top (no PYTHONMALLOC override) so small-object pooling is preserved. malloc_arena_max now applies only on the pymalloc path. Allocator env flows through _py_memory_env to web + companions (via fork) + standalone workers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Reframe memory_allocator as a two-way switch:
- "pymalloc" (default): stock CPython on glibc, best throughput; for prod.
- "jemalloc": LD_PRELOAD jemalloc with MALLOC_CONF=dirty_decay_ms:0, muzzy_decay_ms:0 so freed pages return to the OS immediately via MADV_DONTNEED (not lazy MADV_FREE); for small/demo benches and memory-overcommitted hosts (Firecracker). Falls back to pymalloc if libjemalloc is absent.

Drop the "auto" mode. Verified live: under jemalloc a post-spike worker settles back near baseline with LazyFree=0 (memory truly returned). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
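As an illustration of the two-way switch, the environment for a launched process might be built like this. `allocator_env` and its parameters are hypothetical (the commit's own helper is `_py_memory_env`, whose signature is not shown here), and the jemalloc path in the usage line is only an example location:

```python
import os


def allocator_env(memory_allocator="pymalloc", jemalloc_path=None,
                  exists=os.path.exists):
    """Return extra environment variables for the chosen allocator."""
    if memory_allocator != "jemalloc":
        return {}  # stock pymalloc on glibc: nothing to preload
    if not jemalloc_path or not exists(jemalloc_path):
        return {}  # libjemalloc absent: fall back to pymalloc
    return {
        "LD_PRELOAD": jemalloc_path,
        # Zero decay times make jemalloc purge freed pages immediately.
        "MALLOC_CONF": "dirty_decay_ms:0,muzzy_decay_ms:0",
    }


# Example path only; the real location depends on the distribution.
env = dict(os.environ)
env.update(allocator_env("jemalloc", "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2"))
```

`LD_PRELOAD` has to be in the environment before the Python process starts, which is why it is passed to the child rather than set from inside a running worker.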

Worker recycling is the only reliable way to return the heap a web worker accretes under load: testing showed neither jemalloc nor pymalloc (nor arena tuning) releases it on idle, since CPython's obmalloc retains the arenas. Drop the marginal memory_allocator option and add max_requests/max_requests_jitter. Keep malloc_arena_max for idle arena capping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
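For reference, worker recycling in a gunicorn config file looks roughly like this. `max_requests` and `max_requests_jitter` are real gunicorn settings; the numeric values below are placeholders, not the defaults chosen in this PR:

```python
# gunicorn.conf.py (sketch; values are illustrative)
workers = 1
threads = 8

# Restart a worker after it has served this many requests, so the heap it
# accumulated under load is returned to the OS when the process exits.
max_requests = 1000

# Each worker adds a random 0..jitter to max_requests, so that several
# workers do not all restart at the same moment.
max_requests_jitter = 100
```

MALLOC_ARENA_MAX is read by glibc when the process starts, so it has to be set in the environment that launches gunicorn rather than from inside this file.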

tanmoysrt · 2026-06-17T08:56:38Z

Minimal config:

- 1 Gunicorn worker x 8 threads
- 1 worker pool [default, short, long]
- 1 socketio

Memory usage after start: ~180 MB
Memory usage when warmed up: 200-220 MB

Realtime auth derives its get_user_info callback URL from the Origin header. HTTPS clients (TLS terminated upstream) sent Origin: https://..., so the callback hit a non-existent local :443 and every connection was rejected as Unauthorized, churning into 'Session is disconnected'. The /socket.io block now sets X-Frappe-Site-Name and rewrites Origin to $scheme://$http_host so the callback uses the scheme nginx actually serves. Also stop+disable a bench's dropped systemd units (e.g. socketio after enabling companion mode) before reload, so an orphaned process no longer holds :9000 and crash-loops the gunicorn companion on bind. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
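The failure mode can be reproduced on paper: a URL built from `Origin: https://...` with no explicit port implies port 443, which nothing listens on locally when TLS is terminated upstream. The sketch below only shows that derivation; `callback_target` and `rewritten_origin` are hypothetical helpers, and the real realtime server code may differ:

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def callback_target(origin):
    """Return the (host, port) a callback URL built from `origin` would hit."""
    parts = urlsplit(origin)
    return parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme]


def rewritten_origin(scheme, http_host):
    """Mimic nginx setting Origin to $scheme://$http_host."""
    return f"{scheme}://{http_host}"


# Browser-supplied Origin behind an upstream TLS terminator:
print(callback_target("https://site.example"))  # ('site.example', 443)

# Origin rewritten to the scheme nginx itself serves (assumed http here):
print(callback_target(rewritten_origin("http", "site.example")))  # ('site.example', 80)
```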

tanmoysrt and others added 5 commits June 16, 2026 20:52

feat(admin): Add socket activation for admin service

1394e20

The admin service takes ~40 MB of memory, yet it is rarely accessed. It can be shut down after a period of inactivity and woken up only when a new request arrives.

docs: document malloc_trim_requests/malloc_trim_interval gunicorn opt…

1129062

…ions Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tanmoysrt changed the title ~~feat(admin): Add socket activation for admin service~~ feat: Optimize for small sites by default to cap memory usage Jun 17, 2026

tanmoysrt and others added 3 commits June 17, 2026 07:26

tanmoysrt merged commit 4d3fed1 into frappe:main Jun 17, 2026
2 checks passed

tanmoysrt deleted the admin_socket_activation branch June 17, 2026 12:13


tanmoysrt merged 9 commits into frappe:main from tanmoysrt:admin_socket_activation
