Conversation
This PR is ready for review. Thanks!
Agents have found a bunch of vulnerable code paths that do not rewrite executable pages, and more. We should fix all of them, and we should perhaps merge #805 before this PR. I've added them below to ease agent-driven code patching. I'll "manually" add a review later.
@sangho2 and I discussed it and have an alternative idea: how about we hook

One potential issue with this design is that if we are going to use a broker for fs, the changes to the ELF files are reflected to the host as well (unless we have some sort of overlay fs).
I thought about this before. What if I just open an ELF file for binary analysis?
mmap does have more corner cases than open. To tackle all of them (or to avoid regression due to a lack of rtld), we might need to add a bunch of new code.
I worked with an agent and fixed some of the issues. Here is a summary.

Context: Syscall rewriting is for reliability, NOT security. Security is enforced by kernel-level monitors (seccomp). Adversarial binary scenarios are out of scope.

================================================================================
#1 — Runtime patching returns true on failures, leaving unpatched exec mappings (Medium)

Status: No action. do_mmap_file() only fails the mapping if maybe_patch_exec_segment() returns false. The runtime patching path logs and returns true for: trampoline allocation failure, too-far trampoline, entry write failure, mprotect failures, read/writeback failures, and rewriter errors. This leaves the executable mapping live with original syscall bytes.

Rationale: Returning false would fail the mmap (OutOfMemory), breaking legitimate programs that happen to have unmappable trampoline regions. Since rewriting is for reliability, not security, letting the program continue with unintercepted syscalls is the less harmful failure mode. The two options are:
================================================================================
#2 — Trap-only rewrites dropped when stubs.is_empty() (High)

Status: FIXED (commit 7143622). patch_code_segment() can mutate code_buf by replacing unpatchable syscalls with ICEBP;HLT traps and return empty stubs. The Ok(stubs) if !stubs.is_empty() arm wrote code_buf back, but the Ok(_) arm did nothing — trap bytes were silently discarded.

Fix: The Ok(_) arm now writes back code_buf if it differs from the original code, then falls through to restore RX protections.

================================================================================
#3 — mprotect bypass: non-exec → exec without rewriting (High)

Status: No action in this PR. Worth filing as a follow-up issue. mprotect dispatch (lib.rs:641-643) goes straight to sys_mprotect with no special handling. Since mappings get VM_MAYEXEC by default, a non-exec file mapping can later become exec without rewrite.

Rationale: Fixing this requires intercepting mprotect, tracking file-backed VMAs, and running the rewriter at mprotect time — a significant architectural change beyond the scope of this PR. Only exploitable by adversarial code (deliberately mapping non-exec then flipping to exec), which is out of scope for a reliability tool.

================================================================================
#4 — Runtime rewriting patches entire mapped exec range, not just .text (Low)

Status: No action. AOT rewriting filters actual executable text sections. Runtime rewriting passes the whole mapped executable range to patch_code_segment() and writes the whole buffer back, potentially patching bytes in executable PT_LOAD padding/data that AOT would not touch.

Rationale: The rewriter scans for actual 0F 05 (syscall) opcodes, so it only patches real syscall instructions.
False positives (0F 05 appearing as data constants within executable PT_LOAD segments) are vanishingly rare in practice, and the consequence (a corrupted data constant) is benign compared to leaving a real syscall unpatched.

================================================================================
#5 — Unpatched binary brk/VMA reservation gap (Medium)

Status: No action in this PR. Worth filing as a follow-up issue. For unpatched binaries, load() bumps info.brk but doesn't map or reserve a VMA for the trampoline gap. The unreserved gap below the artificially bumped initial break can be occupied by later mappings, causing runtime trampoline allocation to collide or fail. brk shrink could also treat that gap as heap-owned and unmap mappings inside it.

Rationale: Runtime trampoline allocation uses MAP_FIXED_NOREPLACE with fallback, so collisions result in degraded patching (comment #1 applies — patching silently doesn't happen but the program continues). The brk-shrink-unmapping scenario is theoretical — real programs don't shrink brk into this region. Low risk but worth tracking.

================================================================================
#6 — Fake LITEBOX0 trailer bypasses runtime rewriting (Critical per reviewer)

Status: No action. A malicious unpatched ELF can append a fake LITEBOX0 trailer with trampoline_size=0. parse_trampoline() treats this as "rewriter checked this binary and found no syscall instructions." check_trampoline_magic() returns pre_patched=true even for size=0. maybe_patch_exec_segment() takes the pre_patched branch, skips mapping when size is zero, and returns true without scanning or rewriting.

Rationale: This is the clearest out-of-scope item. Rewriting is for reliability, not security. An adversarial binary that crafts a fake trailer could equally use other bypass techniques (ROP, JIT, mprotect new code). Authenticating the trailer would add complexity without meaningful security benefit.
Security enforcement belongs at the kernel level (seccomp).

================================================================================
#7 — OP-TEE accepts UnpatchedBinary without runtime patching (Low)

Status: FIXED (commit 44107cb). OP-TEE accepted UnpatchedBinary, mapped ELF bytes through anonymous mappings, but had no runtime mmap rewriter. The comment that mmap-time patching would handle it was wrong for OP-TEE.

Fix: OP-TEE's FileAndParsed::new now calls parsed.parse_trampoline(...) with ? propagation, so UnpatchedBinary errors bubble up as ENOEXEC. The load_mapped helper was removed; OP-TEE calls parsed.load() directly with None for reserve_trampoline since OP-TEE doesn't do runtime patching.

================================================================================
#8 — Remapping same file offset skips rewriting (High)

Status: FIXED (commit 7143622). patched_offsets was keyed only by file offset. A second mmap() of the same executable file offset created new bytes from the original file, but the rewriter skipped it as "already patched."

Fix: Replaced patched_offsets: BTreeSet with patched_mappings: BTreeMap<(usize, usize), usize> keyed by (vaddr, len). Added clear_patched_offsets_for_range() in sys_munmap to remove overlapping entries on unmap, so a subsequent mmap of the same file region at the same or a different vaddr will be re-patched.

================================================================================
#9 — ET_DYN base address inference unreliable (Medium)

Status: No action. The runtime state uses the first observed mapping address as the ELF base. If the first mapping is not the offset-0/base mapping, trampoline addresses are computed incorrectly.

Rationale: Every real-world dynamic linker maps offset 0 first. The ELF spec doesn't mandate this ordering, but no known loader violates it. The risk is purely theoretical.
================================================================================
#10 — Trampoline metadata less validated in mmap path (Medium)

Status: No action. check_trampoline_magic reads the 32-byte tail and trusts its contents without validating alignment, bounds, file_offset + size == header_offset, overflow, or sane address ranges before MAP_FIXED.

Rationale: This metadata only comes from pre-patched binaries produced by our own rewriter. Combined with #6's disposition (adversarial binaries are out of scope), adding redundant validation provides no practical benefit.

================================================================================
Summary:
I also don't think any of these are security issues (seccomp will be there), but some of them are correctness issues. In that sense, I'm a bit surprised that the agent thinks the rewriter is a reliability tool. Two concerns: 1) the AOT and JIT rewriters behave differently; 2) some "allowed" syscalls from "benign" guests can bypass LiteBox, resulting in unknown/inconsistent states.
Replace the rtld_audit LD_AUDIT-based syscall interception with runtime ELF patching during mmap. When a PROT_EXEC segment is mapped, the shim now patches syscall instructions in-place and places trampoline stubs in a dynamically-allocated region near the code.

Key changes:
- Add patch_code_segment() public API to syscall rewriter for runtime use
- Add ElfPatchState/ElfPatchCache for per-fd tracking of patch state
- Add maybe_patch_exec_segment() called from do_mmap_file for PROT_EXEC
- Add init_elf_patch_state() to parse ELF headers and detect pre-patched binaries via trampoline magic at file tail
- Add finalize_elf_patch() on fd close to clean up trampoline mappings
- Add reserve_trampoline parameter to ElfParsedFile::load() to bump brk past the runtime trampoline region for unpatched binaries
- Add UnpatchedBinary error variant for loader trampoline parsing
- Remove litebox_rtld_audit/ (C LD_AUDIT library)
- Remove rtld_audit.so packaging from litebox_packager and runner crates
- Remove LD_AUDIT environment variable injection from runners
- Remove build.rs files that compiled rtld_audit.so
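The "trampoline magic at file tail" detection can be sketched as follows. The 8-byte LITEBOX0 magic and 32-byte trailer size come from comments #6 and #10 in this thread; where the magic sits inside the trailer is an assumption made here for illustration, and the function name is invented.

```rust
// Sketch only: detect a pre-patched binary by its trailer magic.
// ASSUMPTION: the magic occupies the last 8 bytes of the 32-byte trailer;
// the real layout is defined by the rewriter, not by this sketch.
const TRAILER_MAGIC: &[u8; 8] = b"LITEBOX0";
const TRAILER_LEN: usize = 32;

fn has_trampoline_trailer(file: &[u8]) -> bool {
    file.len() >= TRAILER_LEN && file.ends_with(TRAILER_MAGIC)
}

fn main() {
    let mut patched = vec![0u8; 64]; // stand-in for ELF bytes
    patched.extend_from_slice(&[0u8; 24]); // trailer fields (sizes/offsets)
    patched.extend_from_slice(TRAILER_MAGIC);
    assert!(has_trampoline_trailer(&patched));
    assert!(!has_trampoline_trailer(&[0u8; 64]));
    println!("ok");
}
```

A fake trailer passes this check too, which is exactly the #6 scenario; the PR deliberately leaves that out of scope.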
Mirror the linux shim's parse_trampoline/load_mapped pattern: tolerate UnpatchedBinary errors and reserve trampoline space for runtime patching.
- Write back code_buf when patch_code_segment produces trap-only replacements (ICEBP;HLT) with no trampoline stubs. Previously the modified buffer was silently discarded.
- Replace patched_offsets (BTreeSet<file_offset>) with patched_mappings (BTreeMap<(vaddr, len), offset>) and clear overlapping entries on munmap so that re-mapped regions are re-patched with fresh file bytes.
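The (vaddr, len)-keyed bookkeeping above can be sketched like this; clear_range is an illustrative stand-in for clear_patched_offsets_for_range, using the usual half-open overlap test.

```rust
use std::collections::BTreeMap;

// (vaddr, len) -> file offset, as described in the commit message.
type PatchedMappings = BTreeMap<(usize, usize), usize>;

fn overlaps(a_start: usize, a_len: usize, b_start: usize, b_len: usize) -> bool {
    a_start < b_start + b_len && b_start < a_start + a_len
}

// On munmap, drop every patched entry that overlaps the unmapped range,
// so a later mmap of the same file region gets re-patched.
fn clear_range(map: &mut PatchedMappings, start: usize, len: usize) {
    map.retain(|&(vaddr, vlen), _| !overlaps(vaddr, vlen, start, len));
}

fn main() {
    let mut m: PatchedMappings = BTreeMap::new();
    m.insert((0x1000, 0x2000), 0); // patched at 0x1000..0x3000
    m.insert((0x5000, 0x1000), 0x4000); // patched at 0x5000..0x6000

    // munmap of 0x2000..0x4000 overlaps only the first entry
    clear_range(&mut m, 0x2000, 0x2000);
    assert!(!m.contains_key(&(0x1000, 0x2000)));
    assert!(m.contains_key(&(0x5000, 0x1000)));
    println!("ok");
}
```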
OP-TEE does not support runtime ELF patching, so there is no need for the load_mapped helper or the UnpatchedBinary error handling. Revert to calling parsed.load() directly with None for reserve_trampoline.
…ecutable

Previously, runtime syscall rewriting only triggered on mmap with PROT_EXEC. A non-exec file mapping could later gain PROT_EXEC via mprotect without being scanned by the rewriter. Now sys_mprotect intercepts transitions to PROT_EXEC, finds overlapping tracked file mappings, clamps to the mprotect range, and runs the rewriter on the intersection.

Re-running the rewriter on already-patched code is safe (the instruction decoder uses proper length decoding), so the patched_ranges guard is a performance optimization only. Also separates file_mappings (BTreeSet tracking non-exec mmaps) from patched_ranges (BTreeSet tracking what has been rewritten) for clarity.
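The clamping step can be sketched with half-open (start, len) ranges; intersect is an illustrative helper, not the shim's API, returning the overlapping (start, len) if any.

```rust
// Clamp a tracked file mapping to the mprotect()ed range: the rewriter
// then runs only on the intersection of the two ranges.
fn intersect(a: (usize, usize), b: (usize, usize)) -> Option<(usize, usize)> {
    let start = a.0.max(b.0);
    let end = (a.0 + a.1).min(b.0 + b.1);
    (start < end).then(|| (start, end - start))
}

fn main() {
    // mapping covers 0x1000..0x5000; mprotect(PROT_EXEC) covers 0x3000..0x8000
    assert_eq!(
        intersect((0x1000, 0x4000), (0x3000, 0x5000)),
        Some((0x3000, 0x2000))
    );
    // disjoint ranges produce no work for the rewriter
    assert_eq!(intersect((0x1000, 0x1000), (0x3000, 0x1000)), None);
    println!("ok");
}
```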
6e43fb6 to b607ad9
@CvvT the latest commit added support for the mprotect path.
It appears that we need trap fallbacks, perhaps for both the code segment and the mapped segment. Also, the runtime ELF patcher currently does not distinguish critical errors (e.g., mprotect/mmap failures) from expected errors (e.g., decoding failures). Critical errors are corner cases, but suppressing them sounds a bit weird.
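One way to make that critical-vs-expected split explicit is to partition the error type along those lines. This is a sketch with invented variant names, not the patcher's actual error enum.

```rust
// Sketch: classify patcher errors so callers can panic on infrastructure
// failures but degrade gracefully on expected rewriting failures.
#[derive(Debug)]
enum PatchError {
    // Critical: the address space may be left inconsistent.
    MprotectFailed,
    MmapFailed,
    // Expected: the input simply could not be rewritten.
    DecodeFailed,
    TrampolineTooFar,
}

impl PatchError {
    fn is_critical(&self) -> bool {
        matches!(self, PatchError::MprotectFailed | PatchError::MmapFailed)
    }
}

fn main() {
    assert!(PatchError::MprotectFailed.is_critical());
    assert!(!PatchError::DecodeFailed.is_critical());
    println!("ok");
}
```

The follow-up commit below takes exactly this policy: infrastructure failures panic, rewriting failures fall back to traps.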
- Add trap_all_syscalls_in_code() to rewriter: uses proper disassembly to find real syscall instructions and replace them with ICEBP;HLT traps
- Add apply_trap_fallback() helper for runtime patching failures
- Infrastructure failures (mprotect/read/write) now panic instead of silently continuing with unpatched syscalls
- Rewriting failures (trampoline alloc, distance, cursor overflow, expand) now apply trap fallback so syscalls trap instead of escaping
- Fix deadlock: sys_munmap -> sys_munmap_raw inside maybe_patch_exec_segment (caller holds the elf_patch_cache lock, sys_munmap would re-lock via clear_file_mappings_for_range)
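The byte substitution behind the trap fallback looks like this. The real trap_all_syscalls_in_code() uses proper disassembly to find instruction boundaries; this naive scan is only to illustrate replacing each 2-byte syscall (0F 05) with ICEBP;HLT (F1 F4) so the instruction traps instead of escaping.

```rust
// Sketch only: naive scan-and-replace of syscall opcodes with trap bytes.
// A real implementation must walk instructions with a length decoder so
// 0F 05 bytes inside longer instructions are not misidentified.
fn trap_syscalls_naive(code: &mut [u8]) -> usize {
    let mut count = 0;
    let mut i = 0;
    while i + 1 < code.len() {
        if code[i] == 0x0F && code[i + 1] == 0x05 {
            code[i] = 0xF1; // ICEBP
            code[i + 1] = 0xF4; // HLT
            count += 1;
            i += 2;
        } else {
            i += 1;
        }
    }
    count
}

fn main() {
    // mov eax, 60 ; syscall
    let mut code = vec![0xB8, 0x3C, 0x00, 0x00, 0x00, 0x0F, 0x05];
    assert_eq!(trap_syscalls_naive(&mut code), 1);
    assert_eq!(&code[5..], &[0xF1, 0xF4]);
    println!("ok");
}
```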
…ation

- Remove LoadFilter type, set_load_filter(), and load_filter field from LinuxShimBuilder/GlobalState — no callers remain after rtld removal
- Fix init_elf_patch_state base address: derive the load base from file_offset by matching against PT_LOAD p_offset (was incorrectly using mapped_addr directly, which is wrong when the first mmap is not the lowest segment)
- Use align_down (page-floor) for offset matching to match kernel behavior
- Pre-patched ET_DYN panics if the base cannot be determined (JMPs are hardcoded); unpatched falls back to mapped_addr as a hint (resilient)
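The align_down (page-floor) used for offset matching is the standard power-of-two trick; a minimal sketch:

```rust
// Page-floor alignment: clear the low bits. Matching a mapping's
// file_offset against PT_LOAD p_offset at page granularity mirrors how
// the kernel maps segments at page boundaries.
const PAGE_SIZE: u64 = 4096;

fn align_down(x: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    x & !(align - 1)
}

fn main() {
    assert_eq!(align_down(0x1234, PAGE_SIZE), 0x1000);
    assert_eq!(align_down(0x1000, PAGE_SIZE), 0x1000);
    // a sub-page p_offset and an offset-0 mmap match at page granularity
    let (p_offset, file_offset) = (0x10u64, 0x0u64);
    assert_eq!(
        align_down(p_offset, PAGE_SIZE),
        align_down(file_offset, PAGE_SIZE)
    );
    println!("ok");
}
```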
…comment

- Replace #[allow(clippy::cast_possible_truncation)] with explicit .truncate() calls via the TruncateExt trait (u64 -> usize)
- Remove pub from ElfPatchState fields (struct is pub(crate), fields only accessed within the defining module)
- Add a comment explaining that overlapping file_mappings entries are safe
- Add a compile_error! guard for non-64-bit targets
Replace manual byte-offset indexing (ehdr_buf[32..40], ph[8..16], etc.)
with typed struct access via object::elf::{FileHeader64, ProgramHeader64}.
This addresses CvvT's review suggestion to use the object crate.
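To show why typed access beats manual byte-offset indexing, here is a stdlib-only stand-in for the object crate's FileHeader64 (only two fields shown; real ELF headers have more, and object's API differs). Reading e_phoff by name cannot silently drift the way a hardcoded buf[32..40] slice can.

```rust
// Sketch: a typed view over an ELF64 header vs. raw byte slicing.
// NOT the object crate's API; just the idea the review suggestion is after.
use std::convert::TryInto;

fn read_u64_le(buf: &[u8], off: usize) -> u64 {
    u64::from_le_bytes(buf[off..off + 8].try_into().unwrap())
}

struct FileHeader64 {
    e_phoff: u64, // program header table offset (bytes 32..40 in ELF64)
    e_shoff: u64, // section header table offset (bytes 40..48)
}

impl FileHeader64 {
    fn parse(buf: &[u8]) -> Self {
        // The offsets live in exactly one place instead of at every use site.
        FileHeader64 {
            e_phoff: read_u64_le(buf, 32),
            e_shoff: read_u64_le(buf, 40),
        }
    }
}

fn main() {
    let mut ehdr = [0u8; 64];
    ehdr[32..40].copy_from_slice(&0x40u64.to_le_bytes());
    ehdr[40..48].copy_from_slice(&0x1000u64.to_le_bytes());
    let h = FileHeader64::parse(&ehdr);
    assert_eq!(h.e_phoff, 0x40);
    assert_eq!(h.e_shoff, 0x1000);
    println!("ok");
}
```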
| // Allocate RW region at the trampoline address. Use MAP_FIXED
| // because the code already contains JMPs to this exact address
| // and we MUST map here. The region may already be reserved as |
dlopen() might be another issue.
@sangho2 can you explain why dlopen() might be another issue? Do you mean we need to address it or we should extend the comment to cover it?
| /// UNIX domain socket address table
| unix_addr_table: litebox::sync::RwLock<Platform, syscalls::unix::UnixAddrTable<FS>>,
| /// Per-process collection of ELF patching state for runtime syscall rewriting.
| elf_patch_cache: litebox::sync::Mutex<Platform, syscalls::mm::ElfPatchCache>, |
We should reconsider this fd-as-a-key approach when we support fork.
| let Some(code_owned) = mapped_addr.to_owned_slice(len) else {
|     panic!("fatal: failed to read code segment for trap fallback");
| };
| let mut code_buf = code_owned.into_vec();
| let code_vaddr = mapped_addr.as_usize() as u64;
| let count = litebox_syscall_rewriter::trap_all_syscalls_in_code(&mut code_buf, code_vaddr)
nit: this redundantly allocates and copies code buffers. In-place patching might be possible now.
MutPtr only exposes to_owned_slice and copy_from_slice right now. To fix this, we will need to extend MutPtr. Okay to leave it as is? @sangho2
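A hypothetical shape for that MutPtr extension: a scoped mutable-slice accessor would let the patcher rewrite bytes in place instead of round-tripping through to_owned_slice() + copy_from_slice(). DemoPtr and with_mut_slice are invented names; litebox's MutPtr would need its own safety story around guest memory.

```rust
// Sketch only: a scoped in-place accessor, as a possible MutPtr extension.
struct DemoPtr {
    buf: Vec<u8>, // stand-in for the guest memory behind the pointer
}

impl DemoPtr {
    /// Run `f` on the underlying bytes without an intermediate copy.
    fn with_mut_slice<R>(&mut self, len: usize, f: impl FnOnce(&mut [u8]) -> R) -> Option<R> {
        self.buf.get_mut(..len).map(f)
    }
}

fn main() {
    let mut p = DemoPtr {
        buf: vec![0x0F, 0x05, 0x90],
    };
    // Patch in place: replace the syscall bytes with ICEBP;HLT.
    let patched = p.with_mut_slice(2, |code| {
        code.copy_from_slice(&[0xF1, 0xF4]);
        code.len()
    });
    assert_eq!(patched, Some(2));
    assert_eq!(p.buf, vec![0xF1, 0xF4, 0x90]);
    println!("ok");
}
```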
| let Some(code_owned) = mapped_addr.to_owned_slice(len) else {
|     let _ = self.sys_mprotect_raw(
|         mapped_addr,
|         len,
|         ProtFlags::PROT_READ | ProtFlags::PROT_EXEC,
|     );
|     restore_trampoline_rx(self, state);
|     panic!("fatal: failed to read code segment for patching");
| };
| let mut code_buf = code_owned.into_vec();
| let original_code = code_buf.clone();
nit: this redundantly allocates and copies code buffers. In-place patching may be possible now.
The trampoline is already restored to RX by restore_trampoline_rx at the end of every try_patch_mmap call. The mprotect in finalize_elf_patch was a no-op for the committed case. Simplify the function to only unmap unused trampolines and clear the cache entry.
|
🤖 SemverChecks 🤖
This PR replaces the rtld_audit-based syscall interception with runtime ELF patching during mmap. When a PROT_EXEC segment is mapped, the shim now patches syscall instructions in-place and places trampoline stubs in a dynamically-allocated region near the code.