Re-encode IP-relative instructions when relocating into trampoline #813
This PR is ready for review, but it should NOT be merged until PR #811 is merged, so that we can remove the temporary x86 test change in this PR. |
@sangho2 now the x86 removal PR is merged, I guess we can remove the "must-not-merge" tag? |
Sure. Let me remove it. |
|
A potential issue found by agent:
Should we prefer rewriting instructions without |
Maybe just ignore |
Sounds reasonable to me. |
CvvT
left a comment
LGTM, thanks!
One comment and one question: Should we prefer rewriting instructions without jmp (if it is more stable to rewrite instructions without jmp)?
As we discussed offline, we don't need this optimization in this PR right now. |
@sangho2 would you like to take another look at this PR, or is it okay for me to merge it? Thanks! |
sangho2
left a comment
Looks reasonable in general. Seems that there are still some corner cases, but we could fix them later.
When the syscall rewriter copies pre-syscall or post-syscall instructions into the trampoline, any RIP-relative memory operands become incorrect because the instruction is now at a different virtual address. Detect this with is_ip_rel_memory_operand and re-encode affected instructions via iced_x86::Encoder at the correct trampoline IP. If re-encoding fails (e.g. the instruction changes size), the pre-syscall path falls back to hook_syscall_and_after; the post-syscall path rolls back the trampoline data to a checkpoint and returns InsufficientBytesBeforeOrAfter so the syscall is trapped instead.
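To illustrate why a relocated RIP-relative operand goes stale, here is a minimal std-only sketch of the displacement arithmetic. It is not the PR's code and does not use iced_x86; the function name `relocate_disp32` and its parameters are hypothetical, and it assumes a 32-bit displacement measured from the end of the instruction.

```rust
/// Recompute a 32-bit IP-relative displacement for an instruction that is
/// moved from `old_ip` to `new_ip` (hypothetical helper, for illustration).
/// Returns `None` when the original target is no longer reachable with a
/// signed 32-bit displacement from the new location.
fn relocate_disp32(old_ip: u64, new_ip: u64, inst_len: u64, old_disp: i32) -> Option<i32> {
    // IP-relative operands are measured from the end of the instruction,
    // so the absolute target is (old_ip + len) + sign-extended displacement.
    let target = old_ip
        .wrapping_add(inst_len)
        .wrapping_add(old_disp as i64 as u64);

    // The same absolute target, seen from the instruction's new location.
    let new_next_ip = new_ip.wrapping_add(inst_len);
    let delta = target.wrapping_sub(new_next_ip) as i64;

    // The displacement field is 32 bits wide; anything else cannot be encoded.
    if delta >= i32::MIN as i64 && delta <= i32::MAX as i64 {
        Some(delta as i32)
    } else {
        None
    }
}
```

A raw byte copy keeps `old_disp` unchanged, which only addresses the right memory when `new_ip == old_ip`. The `None` case corresponds to the situation where re-encoding cannot succeed and a fallback is needed.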
Extract encode_instructions_for_trampoline() and reencode_instructions_at() so both pre-syscall and post-syscall paths use the same encode-first, append-on-success pattern. This eliminates the trampoline_data checkpoint/rollback mechanism and fixes an O(n) skip_while scan by using direct index-based slicing from the backward/forward scan loops.
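The encode-first, append-on-success idea can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: `append_relocated`, the `encode_one` callback, and the use of `String` as the error type are all hypothetical stand-ins for the real encoder and error types.

```rust
/// Encode every instruction into a staging buffer first, and touch the
/// trampoline only once all of them have encoded successfully.
/// `encode_one(bytes, ip)` is a hypothetical stand-in for the real encoder.
fn append_relocated<F>(
    trampoline: &mut Vec<u8>,
    trampoline_base: u64,
    insts: &[Vec<u8>],
    mut encode_one: F,
) -> Result<usize, String>
where
    F: FnMut(&[u8], u64) -> Result<Vec<u8>, String>,
{
    let mut staged = Vec::new();
    for inst in insts {
        // Address this instruction will occupy once appended to the trampoline.
        let ip = trampoline_base + (trampoline.len() + staged.len()) as u64;
        // On failure `?` returns early; `trampoline` has not been modified.
        staged.extend_from_slice(&encode_one(inst, ip)?);
    }
    let appended = staged.len();
    trampoline.extend_from_slice(&staged);
    Ok(appended)
}
```

Because nothing is written to the trampoline until every instruction has encoded, a failure leaves it untouched and no checkpoint or rollback is required.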
Instead of only re-encoding instructions with RIP-relative memory operands and raw-copying the rest, always run all relocated instructions through the encoder. This correctly handles IP-relative branch targets (call/jmp/jcc) in addition to RIP-relative memory, and allows the backward/forward scans to cross outgoing control transfers on x86_64 since the encoder fixes up relative displacements automatically. The x86_32 scan paths retain the control-transfer break since the encoder is 64-bit only; x86_32 support is removed in a separate PR.
Call/IndirectCall instructions must not be relocated into the trampoline because the return address pushed on the stack would point into the trampoline instead of the original code, breaking unwinding, profiling, and call/pop PIC patterns. Remove the now-unused arch parameter from hook_syscall_and_after.
Both loops previously included the syscall instruction itself in their iteration range and used code-equality guards (inst_id != i / code() != syscall_inst.code()) to skip the flow-control check on it. This was necessary because iced-x86 classifies syscall as FlowControl::Call.

Refactor:
- Backward loop: start at (0..i) instead of (0..=i), and hoist the control-transfer-target check for the syscall address outside the loop.
- Forward loop (hook_syscall_and_after): skip(inst_index + 1) instead of skip(inst_index), removing both code-equality guards.

Behavior is preserved: the syscall is still included in the replaced byte range, and real call instructions still terminate the scan.
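The refactored loop shapes can be sketched roughly as below. This is a simplified model under stated assumptions, not the PR's code: `Flow` and `Inst` are local stand-ins for the iced-x86 types, `needed` is a hypothetical count of bytes the patch must cover, and the hoisted control-transfer-target check is omitted.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Flow { Next, Branch, Call }

#[derive(Clone, Copy, Debug)]
struct Inst { len: usize, flow: Flow }

/// Scan backward from just before the syscall. Returns the index of the
/// first instruction to relocate, or None if a call is hit first.
fn backward_start(insts: &[Inst], syscall_idx: usize, needed: usize) -> Option<usize> {
    // The syscall's bytes count toward the replaced range, but its flow
    // classification is never inspected because the loop excludes it.
    let mut covered = insts[syscall_idx].len;
    if covered >= needed { return Some(syscall_idx); }
    for j in (0..syscall_idx).rev() {
        if insts[j].flow == Flow::Call { break; } // calls must stay in place
        covered += insts[j].len;
        if covered >= needed { return Some(j); }
    }
    None
}

/// Scan forward from just after the syscall. Returns the exclusive end
/// index of the instructions to relocate, or None if a call is hit first.
fn forward_end(insts: &[Inst], syscall_idx: usize, needed: usize) -> Option<usize> {
    let mut covered = insts[syscall_idx].len;
    if covered >= needed { return Some(syscall_idx + 1); }
    for (j, inst) in insts.iter().enumerate().skip(syscall_idx + 1) {
        if inst.flow == Flow::Call { break; }
        covered += inst.len;
        if covered >= needed { return Some(j + 1); }
    }
    None
}
```

Starting the ranges at `(0..syscall_idx)` and `skip(syscall_idx + 1)` means the syscall entry can carry a call classification without any special-case guard, while a real call still ends the scan. The returned indices can then be used to slice the instruction list directly.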
🤖 SemverChecks 🤖 No breaking API changes detected Note: this does not mean API is unchanged, or even that there are no breaking changes; simply, none of the detections triggered. |
This PR improves the syscall rewriter by handling IP-relative instructions in both the pre-syscall and post-syscall cases: it detects RIP-relative operands and re-encodes them at the correct trampoline virtual address.