feat: Optimize for small sites by default to cap memory usage#60

Merged
tanmoysrt merged 9 commits into
frappe:mainfrom
tanmoysrt:admin_socket_activation
Jun 17, 2026

@tanmoysrt tanmoysrt commented Jun 17, 2026

Member
  • The admin service takes ~40 MB of memory, and nobody accesses it all the time. After a specified idle period the service can shut down, and it wakes up only when a new request arrives.
  • Run malloc trimming periodically (default: 5 minutes) or after a specified number of requests (default: 100) to reclaim memory.
  • Socketio was broken on the prod setup; this is fixed.

Defaults are set for small sites. These could be disabled or increased for large prod sites.

tanmoysrt and others added 5 commits June 16, 2026 20:52
The admin service takes ~40 MB of memory.
Nobody accesses it all the time.
After a specified idle period, the service can be shut down.

It wakes up only when a new request arrives.
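
The idle shutdown described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `IdleShutdown` class, its method names, and the 300-second window in the usage note are hypothetical, and it assumes that something outside the process (for example systemd socket activation, as the original PR title suggests) keeps the listening socket open and restarts the service on the next request.

```python
import threading
import time


class IdleShutdown:
    """Fire on_idle once no request has been seen for idle_seconds.

    Hypothetical helper: the names and the shutdown callback are
    assumptions made for illustration only.
    """

    def __init__(self, idle_seconds, on_idle, clock=time.monotonic):
        self.idle_seconds = idle_seconds
        self.on_idle = on_idle
        self.clock = clock
        self._last_request = clock()
        self._lock = threading.Lock()

    def touch(self):
        # Call at the start of every request to reset the idle window.
        with self._lock:
            self._last_request = self.clock()

    def check(self):
        # Call periodically; returns True if the shutdown callback fired.
        with self._lock:
            idle_for = self.clock() - self._last_request
        if idle_for >= self.idle_seconds:
            self.on_idle()
            return True
        return False
```

A service would call `touch()` in its request handler and `check()` from a periodic timer, with `on_idle` set to whatever stops the process cleanly, for instance `IdleShutdown(300, stop_server)` where `stop_server` is the service's own shutdown function.
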
Adds a throttled post_request hook that calls malloc_trim(0) after N
requests or N seconds, returning freed glibc arena memory to the OS so
transient spikes don't pin the web worker's RSS. Controlled by new
malloc_trim_requests (default 100) and malloc_trim_interval (default
300) gunicorn config keys; both 0 disables the hook.
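
As a rough sketch of the throttling described in this commit message: the hook trims once either the request count or the elapsed time crosses its threshold, and does nothing when both are 0. The `TrimThrottle` class and `load_malloc_trim` helper are hypothetical names, not the PR's code. Only `malloc_trim` itself is a real glibc function, and the loader returns `None` on platforms where it is missing.

```python
import ctypes
import ctypes.util
import time


def load_malloc_trim():
    """Return a zero-argument callable for glibc's malloc_trim(0), or None."""
    name = ctypes.util.find_library("c")
    if not name:
        return None
    try:
        fn = ctypes.CDLL(name).malloc_trim
    except (OSError, AttributeError):
        return None  # not glibc (e.g. musl or macOS)
    fn.argtypes = [ctypes.c_size_t]
    fn.restype = ctypes.c_int
    return lambda: fn(0)


class TrimThrottle:
    """Hypothetical throttle: trim after N requests or N seconds."""

    def __init__(self, trim, max_requests=100, interval=300,
                 clock=time.monotonic):
        self.trim = trim
        self.max_requests = max_requests
        self.interval = interval
        self.clock = clock
        self.count = 0
        self.last_trim = clock()

    def post_request(self):
        # Both thresholds set to 0 disables the hook entirely.
        if not self.max_requests and not self.interval:
            return False
        self.count += 1
        due_by_count = self.max_requests and self.count >= self.max_requests
        due_by_time = (self.interval
                       and self.clock() - self.last_trim >= self.interval)
        if not (due_by_count or due_by_time):
            return False
        self.trim()
        self.count = 0
        self.last_trim = self.clock()
        return True
```

In a gunicorn config file, the `post_request(worker, req, environ, resp)` server hook could call `throttle.post_request()` on an instance built with `load_malloc_trim()`, skipping the setup when the loader returns `None`.
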

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The previous hook only ran inside post_request, so the time-based trim
fired only when a request happened to arrive — an idle worker never
reclaimed after a spike, and under load the request count tripped first,
leaving the interval effectively dead.

Replace it with a daemon timer thread started per worker in
post_worker_init: it trims every malloc_trim_interval seconds regardless
of traffic, and post_request wakes it early once malloc_trim_requests
requests have accrued. Verified live: trims fire every interval on a
fully idle worker.
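
The timer-thread design in this commit message can be sketched with a `threading.Event` that doubles as the early wake-up signal. This is an assumption-laden illustration rather than the PR's implementation: `TrimTimer` and its methods are hypothetical, and `trim` stands in for whatever performs the actual `malloc_trim(0)` call.

```python
import threading


class TrimTimer:
    """Hypothetical per-worker timer: trims every `interval` seconds,
    or earlier once `max_requests` requests have accrued."""

    def __init__(self, trim, interval=300, max_requests=100):
        self.trim = trim
        self.interval = interval
        self.max_requests = max_requests
        self._count = 0
        self._lock = threading.Lock()
        self._wake = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._wake.set()  # unblock the wait so the thread can exit
        self._thread.join()

    def note_request(self):
        # Called after each request; wakes the timer early on the Nth one.
        with self._lock:
            self._count += 1
            if self.max_requests and self._count >= self.max_requests:
                self._wake.set()

    def _run(self):
        while not self._stop.is_set():
            # Returns on timeout (idle worker) or when woken early.
            # An interval of 0 means "no time-based trim": wait until woken.
            self._wake.wait(timeout=self.interval or None)
            if self._stop.is_set():
                break
            self._wake.clear()
            with self._lock:
                self._count = 0
            self.trim()
```

With gunicorn, `start()` would be called from the `post_worker_init(worker)` hook and `note_request()` from `post_request`, which matches the split of responsibilities the commit message describes.
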

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ions

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the separate scheduler companion and per-group worker companions
with one run_worker_pool companion. The pool runs every queue (deduped
union across [workers] groups) with num_workers equal to the summed group
counts, and the Frappe scheduler runs as a thread inside the pool workers
— so the dedicated scheduler process is gone, one fewer process per bench.

Verified live on a companion-manager bench: pool processes jobs, the
embedded scheduler thread runs, and web-tree PSS drops vs the legacy
layout.
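
To make the merge rule concrete, the sketch below computes the pool's queue list (deduped union, first-seen order) and its worker count (sum of group counts). The shape of the `[workers]` groups used here, a mapping with `queues` and `count` keys, is an assumption for illustration, and `merge_worker_groups` is a hypothetical helper, not a function from this PR.

```python
def merge_worker_groups(groups):
    """Collapse worker groups into one pool definition.

    groups: mapping of group name -> {"queues": [...], "count": int}
    (assumed shape). Returns (queues, num_workers).
    """
    queues = []
    seen = set()
    num_workers = 0
    for group in groups.values():
        num_workers += group["count"]
        for queue in group["queues"]:
            if queue not in seen:  # keep each queue once, in first-seen order
                seen.add(queue)
                queues.append(queue)
    return queues, num_workers
```

For example, a group serving `default` and `short` with one worker and another serving `short` and `long` with two workers would merge into one pool of three workers covering `default`, `short`, and `long`.
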

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tanmoysrt tanmoysrt changed the title feat(admin): Add socket activation for admin service feat: Optimize for small sites by default to cap memory usage Jun 17, 2026
tanmoysrt and others added 3 commits June 17, 2026 07:26
Replace the malloc_trim timer (redundant — gunicorn worker recycling /
max_requests covers post-spike reclaim) with a memory_allocator choice in
[gunicorn]: "auto" (default) LD_PRELOADs jemalloc when libjemalloc is on
the host, else uses stock pymalloc/glibc; "jemalloc"/"pymalloc" force it.
jemalloc keeps pymalloc on top (no PYTHONMALLOC override) so small-object
pooling is preserved. malloc_arena_max now applies only on the pymalloc
path. Allocator env flows through _py_memory_env to web + companions
(via fork) + standalone workers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reframe memory_allocator as a two-way switch:
- "pymalloc" (default): stock CPython on glibc, best throughput — for prod.
- "jemalloc": LD_PRELOAD jemalloc with MALLOC_CONF=dirty_decay_ms:0,
  muzzy_decay_ms:0 so freed pages return to the OS immediately via
  MADV_DONTNEED (not lazy MADV_FREE) — for small/demo benches and
  memory-overcommitted hosts (Firecracker). Falls back to pymalloc if
  libjemalloc is absent.

Drop the "auto" mode. Verified live: under jemalloc a post-spike worker
settles back near baseline with LazyFree=0 (memory truly returned).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Worker recycling is the only reliable way to return the heap a web worker
accretes under load: testing showed neither jemalloc nor pymalloc (nor arena
tuning) releases it on idle, since CPython's obmalloc retains the arenas. Drop
the marginal memory_allocator option and add max_requests/max_requests_jitter.
Keep malloc_arena_max for idle arena capping.
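
For reference, this is roughly what worker recycling looks like in a plain gunicorn config file. `max_requests` and `max_requests_jitter` are real gunicorn settings, but the values below are illustrative assumptions and not the defaults chosen in this PR, which exposes the keys through its own `[gunicorn]` config section.

```python
# gunicorn.conf.py (illustrative values, not this PR's defaults)

workers = 1   # matches the minimal layout discussed in this PR
threads = 8

# Restart a worker after it has handled this many requests, so the heap
# it accumulated under load is returned to the OS when the process exits.
max_requests = 1000

# Add a random per-worker offset in [0, max_requests_jitter] so that
# multiple workers do not all restart at the same moment.
max_requests_jitter = 100
```

The jitter matters mostly with several workers; with a single worker it only varies when the restart happens.
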

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tanmoysrt commented Jun 17, 2026

Member Author

Minimal config:

  • 1 Gunicorn worker x 8 threads
  • 1 worker-pool [default, short, long]
  • 1 socketio

Memory usage after start: ~180 MB
Memory usage when warmed up: 200-220 MB

Realtime auth derives its get_user_info callback URL from the Origin
header. HTTPS clients (TLS terminated upstream) sent Origin: https://...,
so the callback hit a non-existent local :443 and every connection was
rejected as Unauthorized, churning into 'Session is disconnected'. The
/socket.io block now sets X-Frappe-Site-Name and rewrites Origin to
$scheme://$http_host so the callback uses the scheme nginx actually serves.
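
The failure mode can be illustrated with a small sketch of how a URL derived from `Origin` picks its port. This is not the realtime server's code: `callback_target` is a hypothetical function written only to show why an `https://` origin with no explicit port resolves to 443.

```python
from urllib.parse import urlsplit

# Default ports implied by the scheme when the Origin carries no port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def callback_target(origin):
    """Return (scheme, host, port) a callback built from `origin` would use.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(origin)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port
```

With TLS terminated upstream, a browser sends `Origin: https://...`, so a callback derived from it targets port 443 even though nothing listens there locally. Rewriting `Origin` to the scheme and host nginx actually serves makes the derived target reachable.
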

Also stop+disable a bench's dropped systemd units (e.g. socketio after
enabling companion mode) before reload, so an orphaned process no longer
holds :9000 and crash-loops the gunicorn companion on bind.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tanmoysrt tanmoysrt merged commit 4d3fed1 into frappe:main Jun 17, 2026
2 checks passed
@tanmoysrt tanmoysrt deleted the admin_socket_activation branch June 17, 2026 12:13