fix(sandbox): make peer-binary resolution netns-robust #1302
Draft
laitingsheng wants to merge 1 commit into NVIDIA:main
`/proc/<entrypoint_pid>/net/tcp` is the only source the proxy queried to find the socket inode for a TCP peer. If the entrypoint PID was stale (process died, PID recycled) or sat in a different netns than the connection's actual netns, which is what happens in nested container setups such as macOS Docker Desktop with k3s, the scan returned "No ESTABLISHED TCP connection found" and the proxy denied the request with `binary=-`.

Match the connection by `(local_port, remote_port)` so the search can safely walk other netns. On primary-PID miss, walk `/proc` deduping by the netns inode of each PID and scan one `/proc/<pid>/net/tcp` per distinct netns. Socket inodes are kernel-global, so the inode found in any netns resolves to the same FD-holding processes downstream.

Threads through `client.local_addr()` (already captured but unused) so `evaluate_opa_tcp` and the bypass monitor can pass the destination port the kernel sees on the sandbox side.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
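For readers unfamiliar with the `/proc/net/tcp` layout, the port-pair match described in the commit message might look roughly like the sketch below. It is not the code in this PR: `hex_port_of` and `find_socket_inode` are hypothetical names, and the column positions assume the usual kernel layout (local address, remote address, state, then the inode in the tenth column).

```rust
/// State code that /proc/net/tcp uses for ESTABLISHED connections.
const TCP_ESTABLISHED: &str = "01";

/// Hypothetical helper: extract the port from an "ADDR:PORT" column,
/// where PORT is written in hexadecimal.
fn hex_port_of(addr: &str) -> Option<u16> {
    let (_, port) = addr.rsplit_once(':')?;
    u16::from_str_radix(port, 16).ok()
}

/// Hypothetical helper: return the socket inode of the first ESTABLISHED
/// entry whose local and remote ports both match.
fn find_socket_inode(table: &str, local_port: u16, remote_port: u16) -> Option<u64> {
    // The first line of the table is the column header.
    table.lines().skip(1).find_map(|line| {
        let cols: Vec<&str> = line.split_whitespace().collect();
        if cols.len() < 10 || cols[3] != TCP_ESTABLISHED {
            return None;
        }
        // Requiring both ports keeps an unrelated connection that merely
        // shares the local port from being reported as a match.
        if hex_port_of(cols[1])? != local_port || hex_port_of(cols[2])? != remote_port {
            return None;
        }
        cols[9].parse().ok()
    })
}
```

Matching on both ports is what makes it reasonable to read tables from other network namespaces: a lone local-port match in a foreign netns could easily belong to an unrelated connection.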
Summary
Peer-binary resolution scanned only `/proc/<entrypoint_pid>/net/tcp{,6}` to find the socket inode for a TCP peer. If `entrypoint_pid` was stale (process exited, PID recycled) or sat in a different netns than the one actually carrying the connection, which is what happens in nested-container setups like macOS Docker Desktop with k3s, the scan returned "No ESTABLISHED TCP connection found" and the proxy denied the request with `binary=-`. Add a `(local_port, remote_port)` filter and a netns-deduped `/proc` walk fallback so the inode is found wherever the connection actually lives.
Related Issue
Fixes NVIDIA/NemoClaw#1471.
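As an illustration of the netns-deduped fallback mentioned in the summary, the sketch below picks one `net/tcp` path per distinct network namespace. It is a minimal sketch under assumptions, not the PR's implementation: `net_tcp_paths_per_netns` is a hypothetical name, the proc root is a parameter so the function can be exercised against a fake directory tree, and the netns is identified solely by the inode number that `stat` reports for `ns/net`.

```rust
use std::collections::HashSet;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Hypothetical helper: return one `<proc_root>/<pid>/net/tcp` path for
/// each distinct network namespace found under `proc_root`.
fn net_tcp_paths_per_netns(proc_root: &Path) -> Vec<PathBuf> {
    let mut seen = HashSet::new();
    let mut paths = Vec::new();
    let Ok(entries) = fs::read_dir(proc_root) else {
        return paths;
    };
    for entry in entries.flatten() {
        // Only numeric directory names are PIDs; skip "self", "sys", etc.
        let name = entry.file_name();
        let Some(pid) = name.to_str().and_then(|s| s.parse::<u32>().ok()) else {
            continue;
        };
        let pid_dir = proc_root.join(pid.to_string());
        // metadata() follows the ns/net symlink; the inode number it
        // reports identifies the namespace. Processes can exit mid-walk,
        // so failures are skipped rather than treated as errors.
        let Ok(meta) = fs::metadata(pid_dir.join("ns/net")) else {
            continue;
        };
        // insert() returns true only the first time an inode is seen.
        if seen.insert(meta.ino()) {
            paths.push(pid_dir.join("net/tcp"));
        }
    }
    paths
}
```

Calling `net_tcp_paths_per_netns(Path::new("/proc"))` on a Linux host would yield at most one table per namespace, so a host with thousands of processes in a handful of namespaces reads only a handful of files.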
Changes
- `crates/openshell-sandbox/src/procfs.rs`: `parse_proc_net_tcp` now matches connections by both `local_port == peer_port` and `rem_port == remote_port`. On primary-PID miss, walk `/proc`, dedup by the netns inode of each PID via `stat /proc/<pid>/ns/net`, and scan one `/proc/<pid>/net/tcp` per distinct netns. Socket inodes are kernel-global, so a match found in any netns resolves to the correct FD-holding processes downstream. New helpers `scan_pid_net_tcp` and `parse_hex_port`.
- `crates/openshell-sandbox/src/proxy.rs`: stop dropping `client.local_addr()` on the floor; thread the proxy's accepted local address through `evaluate_opa_tcp` → `resolve_process_identity` → `resolve_tcp_peer_socket_owners`, both for CONNECT and for forward-proxy paths.
- `crates/openshell-sandbox/src/bypass_monitor.rs`: pass the kmsg event's `dst_port` so bypass-attempt identity resolution uses the same filter.
- `crates/openshell-sandbox/src/procfs.rs` (tests): two new tests. One asserts the fallback walks `/proc` and finds the connection when `entrypoint_pid` is a PID that doesn't exist; the other asserts the remote-port filter rejects a stale match when only the local port collides.
Testing
- `mise run pre-commit` passes
Checklist