feat: Optimize for small sites by default to cap memory usage #60
Merged
The admin service takes ~40 MB of memory, and nobody accesses it all the time. After a set period it can be shut down, and it only needs to wake up when a new request arrives.
Adds a throttled post_request hook that calls malloc_trim(0) after N requests or N seconds, returning freed glibc arena memory to the OS so transient spikes don't pin the web worker's RSS. Controlled by new malloc_trim_requests (default 100) and malloc_trim_interval (default 300) gunicorn config keys; both 0 disables the hook. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
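The throttling described in this commit could look roughly like the sketch below. It is not the code from the PR: the `TrimThrottle` class, its attribute names, and the treatment of a single zero value (that trigger alone is skipped) are assumptions made for illustration. Only `malloc_trim(0)`, the gunicorn `post_request` hook signature, and the defaults of 100 requests / 300 seconds come from the text above.

```python
import ctypes
import ctypes.util
import time


def _load_malloc_trim():
    """Return glibc's malloc_trim, or a no-op where it is unavailable."""
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
        return libc.malloc_trim
    except (OSError, AttributeError):
        # Non-glibc platforms (musl, macOS, Windows): nothing to trim.
        return lambda pad: 0


class TrimThrottle:
    """Hypothetical helper: trim after N requests or N seconds, whichever comes first."""

    def __init__(self, max_requests=100, interval=300, trim=None, clock=time.monotonic):
        self.max_requests = max_requests
        self.interval = interval
        self.trim = trim or _load_malloc_trim()
        self.clock = clock
        self.count = 0
        self.last = clock()

    def on_request(self):
        # Both thresholds at 0 disables the hook entirely.
        if not self.max_requests and not self.interval:
            return False
        self.count += 1
        now = self.clock()
        due_count = self.max_requests and self.count >= self.max_requests
        due_time = self.interval and now - self.last >= self.interval
        if not (due_count or due_time):
            return False
        self.trim(0)  # hand freed arena memory back to the OS
        self.count = 0
        self.last = now
        return True


_throttle = TrimThrottle()


def post_request(worker, req, environ, resp):
    """gunicorn server hook, called after each request is handled."""
    _throttle.on_request()
```

Injecting `trim` and `clock` keeps the throttle testable without touching libc or waiting for real time to pass.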
The previous hook only ran inside post_request, so the time-based trim fired only when a request happened to arrive — an idle worker never reclaimed after a spike, and under load the request count tripped first, leaving the interval effectively dead. Replace it with a daemon timer thread started per worker in post_worker_init: it trims every malloc_trim_interval seconds regardless of traffic, and post_request wakes it early once malloc_trim_requests requests have accrued. Verified live: trims fire every interval on a fully idle worker. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
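A minimal sketch of the timer-thread design this commit describes, assuming a hypothetical `TrimTimer` class and a `trim` callable passed in by the caller (in the PR this would be `malloc_trim`). The real implementation may differ. The sketch only shows the two wake-up paths: the interval timeout, and an early wake from the request counter.

```python
import threading


class TrimTimer:
    """Hypothetical per-worker timer: trims every `interval` seconds, or earlier on demand."""

    def __init__(self, trim, interval=300, max_requests=100):
        self.trim = trim
        self.interval = interval
        self.max_requests = max_requests
        self._wake = threading.Event()
        self._lock = threading.Lock()
        self._count = 0

    def start(self):
        # Daemon thread: it never blocks worker shutdown.
        thread = threading.Thread(target=self._run, name="malloc-trim", daemon=True)
        thread.start()
        return thread

    def _run(self):
        while True:
            # Returns on timeout (idle worker) or when note_request() sets the event.
            # An interval of 0 means "no time-based trim": wait until woken.
            self._wake.wait(timeout=self.interval or None)
            self._wake.clear()
            with self._lock:
                self._count = 0
            self.trim(0)

    def note_request(self):
        """Call from post_request; wakes the timer once enough requests accrue."""
        with self._lock:
            self._count += 1
            if self.max_requests and self._count >= self.max_requests:
                self._wake.set()
```

In a gunicorn config this would be wired up by creating and starting one `TrimTimer` in `post_worker_init(worker)` and calling `note_request()` from `post_request(worker, req, environ, resp)`. Because the thread sleeps on an `Event` rather than polling, an idle worker costs nothing between trims.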
…ions Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the separate scheduler companion and per-group worker companions with one run_worker_pool companion. The pool runs every queue (deduped union across [workers] groups) with num_workers equal to the summed group counts, and the Frappe scheduler runs as a thread inside the pool workers — so the dedicated scheduler process is gone, one fewer process per bench. Verified live on a companion-manager bench: pool processes jobs, the embedded scheduler thread runs, and web-tree PSS drops vs the legacy layout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
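The pool sizing rule in this commit (deduped union of queues, summed worker counts) can be sketched as below. The shape of the `[workers]` groups, with `queues` and `count` keys, and the `plan_worker_pool` name are assumptions for illustration. The PR's actual config schema is not shown on this page.

```python
def plan_worker_pool(groups):
    """Collapse per-group worker settings into one pool.

    `groups` maps a group name to {"queues": [...], "count": int} (assumed shape).
    Returns (queues, num_workers): queues in first-seen order without
    duplicates, and num_workers as the sum of all group counts.
    """
    queues = []
    num_workers = 0
    for group in groups.values():
        num_workers += group.get("count", 0)
        for queue in group.get("queues", []):
            if queue not in queues:
                queues.append(queue)
    return queues, num_workers


groups = {
    "default": {"queues": ["default", "short"], "count": 2},
    "long": {"queues": ["long", "short"], "count": 1},
}
print(plan_worker_pool(groups))  # (['default', 'short', 'long'], 3)
```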
Replace the malloc_trim timer (redundant — gunicorn worker recycling / max_requests covers post-spike reclaim) with a memory_allocator choice in [gunicorn]: "auto" (default) LD_PRELOADs jemalloc when libjemalloc is on the host, else uses stock pymalloc/glibc; "jemalloc"/"pymalloc" force it. jemalloc keeps pymalloc on top (no PYTHONMALLOC override) so small-object pooling is preserved. malloc_arena_max now applies only on the pymalloc path. Allocator env flows through _py_memory_env to web + companions (via fork) + standalone workers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reframe memory_allocator as a two-way switch:
- "pymalloc" (default): stock CPython on glibc, best throughput; for prod.
- "jemalloc": LD_PRELOAD jemalloc with MALLOC_CONF=dirty_decay_ms:0, muzzy_decay_ms:0 so freed pages return to the OS immediately via MADV_DONTNEED (not lazy MADV_FREE); for small/demo benches and memory-overcommitted hosts (Firecracker). Falls back to pymalloc if libjemalloc is absent.

Drop the "auto" mode. Verified live: under jemalloc a post-spike worker settles back near baseline with LazyFree=0 (memory truly returned). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
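One way the two-way switch could translate into process environment is sketched below. The `allocator_env` function and its `find_library` parameter are hypothetical. The PR routes this through `_py_memory_env`, whose code is not shown here. `LD_PRELOAD`, `MALLOC_CONF`, and the two decay options are the ones named in the commit message. Note that jemalloc expects the option string to be comma-separated with no spaces.

```python
import ctypes.util


def allocator_env(memory_allocator="pymalloc", find_library=ctypes.util.find_library):
    """Return extra environment variables for the chosen allocator.

    "pymalloc" (the default) adds nothing. "jemalloc" preloads libjemalloc
    with zero decay times, and falls back to pymalloc (an empty dict) when
    the library cannot be found on the host.
    """
    if memory_allocator != "jemalloc":
        return {}
    lib = find_library("jemalloc")
    if not lib:
        return {}  # fallback: libjemalloc is absent
    return {
        "LD_PRELOAD": lib,
        # Zero decay: purge freed dirty/muzzy pages immediately instead of lazily.
        "MALLOC_CONF": "dirty_decay_ms:0,muzzy_decay_ms:0",
    }
```

The result would be merged into the environment of the process being launched (for example `env={**os.environ, **allocator_env("jemalloc")}` with `subprocess`), since `LD_PRELOAD` only takes effect at exec time.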
Worker recycling is the only reliable way to return the heap a web worker accretes under load: testing showed neither jemalloc nor pymalloc (nor arena tuning) releases it on idle, since CPython's obmalloc retains the arenas. Drop the marginal memory_allocator option and add max_requests/max_requests_jitter. Keep malloc_arena_max for idle arena capping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
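For reference, worker recycling in a plain gunicorn config file looks like the fragment below. `max_requests` and `max_requests_jitter` are standard gunicorn settings. The numbers are illustrative placeholders, not the defaults chosen in this PR, which are not stated on this page.

```python
# gunicorn.conf.py (illustrative values, not the PR's defaults)

# Restart a worker after it has handled this many requests, so heap it
# accreted under load is returned when the old process exits.
max_requests = 1000

# Add a random per-worker offset (0..jitter) to max_requests so that all
# workers do not restart at the same moment.
max_requests_jitter = 100
```

`MALLOC_ARENA_MAX` is a glibc environment variable read at process start, so it has to be present in the environment before gunicorn is launched rather than set from inside this file.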
Minimal config:
Memory usage after start:
Realtime auth derives its get_user_info callback URL from the Origin header. HTTPS clients (TLS terminated upstream) sent Origin: https://..., so the callback hit a non-existent local :443 and every connection was rejected as Unauthorized, churning into 'Session is disconnected'. The /socket.io block now sets X-Frappe-Site-Name and rewrites Origin to $scheme://$http_host so the callback uses the scheme nginx actually serves. Also stop+disable a bench's dropped systemd units (e.g. socketio after enabling companion mode) before reload, so an orphaned process no longer holds :9000 and crash-loops the gunicorn companion on bind. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Defaults are set for small sites. These could be disabled or increased for large prod sites.