fix(lup): import-safe standalone library + tool-gate primitive + behavior tests #14
Open
joy-void-joy wants to merge 23 commits into
lup.paths ran find_project_root() at import time, so a pip-installed lup could not even be imported outside a [tool.lup] project. Root, version, and base dirs now resolve on first accessor call (cached in an internal PathConfig), configure() stays the override, and configure(root=X) tolerates a missing pyproject (version falls back to 0.0.0 unless given). The from-importable mutable globals are gone; devtools read the version through the agent_version() accessor and default typer version options to None (resolved per invocation).
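A minimal sketch of the lazy-resolution pattern described above, not the actual `lup.paths` code. `PathConfig`, `configure`, `find_project_root`, and `agent_version` are named in the commit message; their signatures here, the `get_config` helper, and the error raised when no root is found are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PathConfig:
    root: Path
    version: str


# Nothing is resolved at import time; the cache starts empty.
_config: PathConfig | None = None


def find_project_root(start: Path) -> Path | None:
    """Walk upward looking for a pyproject.toml; None when there is none."""
    for candidate in (start, *start.parents):
        if (candidate / "pyproject.toml").is_file():
            return candidate
    return None


def configure(root: Path, version: str | None = None) -> PathConfig:
    """Explicit override; tolerates a root without a pyproject."""
    global _config
    _config = PathConfig(root=Path(root), version=version or "0.0.0")
    return _config


def get_config() -> PathConfig:
    """Resolve on first accessor call and cache the result (hypothetical helper)."""
    global _config
    if _config is None:
        root = find_project_root(Path.cwd())
        if root is None:
            raise RuntimeError("no project root found; call configure(root=...)")
        _config = PathConfig(root=root, version="0.0.0")
    return _config


def agent_version() -> str:
    # Accessor instead of a from-importable global frozen at import time.
    return get_config().version
```

Because callers go through `agent_version()` rather than importing a value, a later `configure()` call is visible everywhere.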
The AGENT_NOTES_PATH and AGENT_LOGS_PATH settings existed but were never applied, so session data always landed under the project root. The typer callback now configures lup.paths on every invocation, anchoring relative values at the project root and honoring absolute overrides.
query(output_type=T, options=prebuilt) computed an output_format that build_client then ignored, so structured output came back None with no error. prepare_output_format() now injects the computed format into the provided options (ValueError when they already set one), and build_client raises on any keyword argument combined with pre-built options instead of silently discarding it.
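The injection rule can be sketched as follows. `Options` is a hypothetical stand-in for the SDK's options object, and the exception type raised by `build_client` is an assumption (the commit only says it raises); only the function names come from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class Options:
    """Hypothetical stand-in for the SDK's options object."""

    model: str = "example-model"
    output_format: dict[str, Any] | None = None


def prepare_output_format(
    options: Options | None, output_format: dict[str, Any] | None
) -> Options:
    """Inject the computed format into pre-built options instead of dropping it."""
    if options is None:
        return Options(output_format=output_format)
    if output_format is None:
        return options
    if options.output_format is not None:
        raise ValueError("pre-built options already set output_format")
    return replace(options, output_format=output_format)


def build_client(options: Options | None = None, **kwargs: Any) -> Options:
    """Refuse keyword arguments that would otherwise be silently discarded."""
    if options is not None and kwargs:
        raise ValueError(f"cannot combine pre-built options with {sorted(kwargs)}")
    return options if options is not None else Options(**kwargs)
```

Failing loudly in both places turns the previous silent `None` result into an error at the call site.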
An error ResultMessage raised inside the collector before the yield, so the failing message was never seen by iteration consumers, never logged, and never reached the trace. The error is now logged, written to the trace logger, and yielded; the RuntimeError fires when the consumer resumes the generator.
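A small sketch of the yield-then-raise ordering, using a hypothetical `Message` type and a plain list as the trace sink rather than the real collector and trace logger.

```python
from collections.abc import Iterable, Iterator
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical message type; the real ResultMessage has more fields."""

    text: str
    is_error: bool = False


def collect(messages: Iterable[Message], trace: list[str]) -> Iterator[Message]:
    for message in messages:
        if message.is_error:
            # Record first, then hand the failing message to the consumer.
            trace.append(f"error: {message.text}")
            yield message
            # Runs only when the consumer resumes the generator.
            raise RuntimeError(message.text)
        yield message
```

A consumer iterating with a `for` loop sees the error message as the last item, and the `RuntimeError` surfaces on the next iteration step.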
update_session_metadata chose its target via a lexicographic sort of full paths, which orders version directories wrong across versions (0.10.0 < 0.9.0) and could update a stale file. Candidates are now ranked by the filename timestamp via lup.paths.parse_timestamp.
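The ordering problem is easy to reproduce. In this sketch the timestamp format, the file naming, and the directory layout are all assumptions; the real format lives in `lup.paths` and is not shown in this PR description.

```python
from datetime import datetime
from pathlib import PurePosixPath

TIMESTAMP_FMT = "%Y%m%d_%H%M%S"  # assumed format, not taken from lup


def parse_timestamp(path: str) -> datetime:
    """Parse the trailing timestamp of a file name such as run_20240301_090000.json."""
    stem = PurePosixPath(path).stem
    return datetime.strptime(stem[-15:], TIMESTAMP_FMT)


candidates = [
    "traces/0.9.0/sessions/run_20240101_120000.json",
    "traces/0.10.0/sessions/run_20240301_090000.json",
]

# Lexicographic order compares "1" with "9", so 0.10.0 sorts before 0.9.0
# and the older session is picked as the "latest".
by_path = sorted(candidates)[-1]

# Ranking by the parsed filename timestamp picks the newer session.
by_time = max(candidates, key=parse_timestamp)
```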
run_code(timeout_seconds=0) is documented as no timeout, but the host still enforced a 0+5 second deadline, killing the REPL connection and losing all session state. compute_deadline() now returns None for non-positive timeouts, recv_response blocks indefinitely in that case, and the in-sandbox SIGALRM is already skipped for non-positive values.
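A sketch of the deadline rule, assuming the host adds a fixed 5 second grace period (inferred from the "0+5" above). `compute_deadline` is named in the commit; its signature and the `remaining_timeout` helper are hypothetical.

```python
from __future__ import annotations

import time

HOST_GRACE_SECONDS = 5.0  # assumed from the "0+5 second deadline" in the description


def compute_deadline(timeout_seconds: float, now: float | None = None) -> float | None:
    """Return an absolute deadline, or None when the caller asked for no timeout."""
    if timeout_seconds <= 0:
        return None
    start = time.monotonic() if now is None else now
    return start + timeout_seconds + HOST_GRACE_SECONDS


def remaining_timeout(deadline: float | None, now: float) -> float | None:
    """Value suitable for socket.settimeout(): None means block indefinitely."""
    if deadline is None:
        return None
    return max(0.0, deadline - now)
```

Passing `None` to `socket.settimeout()` puts the socket in blocking mode, which matches the "blocks indefinitely" behavior described above.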
setup_notes granted ro=[traces/<version>/], which contains logs/ — contradicting the contract that trace logs are invisible to the agent. The RO grant now lists the sessions/ and outputs/ directories of every existing version (current one included), never logs/.
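One way to build such a grant list, as a sketch: enumerate version directories and allow-list only `sessions/` and `outputs/`, so `logs/` can never be included by accident. The function name and the directory layout under the traces root are assumptions.

```python
from pathlib import Path

READABLE_SUBDIRS = ("sessions", "outputs")  # logs/ is deliberately absent


def readonly_grants(traces_root: Path) -> list[Path]:
    """List the read-only directories for every existing version directory."""
    grants: list[Path] = []
    for version_dir in sorted(p for p in traces_root.iterdir() if p.is_dir()):
        for name in READABLE_SUBDIRS:
            candidate = version_dir / name
            if candidate.is_dir():
                grants.append(candidate)
    return grants
```

An allow-list keeps the contract true even if new subdirectories appear under a version directory later.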
Four hook factories shared one pattern: deny tool B (or Stop) with an agent-readable message until condition A holds. create_tool_gate in lup.hooks is the general primitive (gated tool(s), static or dynamic message, input-aware unlock predicate, optional on_unlock_tool tracked via PostToolUse, deny/block styles, PreToolUse or Stop). The existing factories — create_reflection_gate, create_stop_guard, create_pending_event_guard, create_meta_before_sleep_guard — keep their public signatures and behavior as thin presets over it.
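The shape of the pattern can be sketched in a few lines. This is a simplified, synchronous illustration: the class name, method names, and the returned dict shape are hypothetical and do not reproduce `create_tool_gate` or the SDK's hook payloads. It covers only the PreToolUse deny style with PostToolUse unlock tracking.

```python
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass


def _never_unlocked(tool: str, tool_input: dict) -> bool:
    return False


@dataclass
class ToolGateSketch:
    gated_tools: frozenset[str]
    message: str | Callable[[str], str]  # static or dynamic deny message
    is_unlocked: Callable[[str, dict], bool] = _never_unlocked  # input-aware predicate
    on_unlock_tool: str | None = None  # tool whose completion opens the gate
    unlocked: bool = False

    def pre_tool_use(self, tool: str, tool_input: dict) -> dict:
        """Deny a gated tool with an agent-readable reason until unlocked."""
        if tool not in self.gated_tools:
            return {"decision": "allow"}
        if self.unlocked or self.is_unlocked(tool, tool_input):
            return {"decision": "allow"}
        text = self.message(tool) if callable(self.message) else self.message
        return {"decision": "deny", "reason": text}

    def post_tool_use(self, tool: str) -> None:
        """Track the unlocking tool after it has run."""
        if tool == self.on_unlock_tool:
            self.unlocked = True
```

A preset in this style is then just a call with fixed arguments, for example a gate on a hypothetical `sleep` tool that opens once a `reflect` tool has run.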
truncate_str_fields dropped max_len_list on recursion, so nested lists inside dicts or lists reverted to the default limit.
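A sketch of the fix, assuming `max_len` caps string length and `max_len_list` caps the number of list items kept (the exact semantics and defaults are not stated here). The key point is that both limits are forwarded on every recursive call.

```python
from typing import Any


def truncate_str_fields(value: Any, max_len: int = 100, max_len_list: int = 5) -> Any:
    """Recursively shorten strings and lists, forwarding both limits at each level."""
    if isinstance(value, str):
        return value if len(value) <= max_len else value[:max_len] + "..."
    if isinstance(value, dict):
        return {
            key: truncate_str_fields(item, max_len, max_len_list)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [
            truncate_str_fields(item, max_len, max_len_list)
            for item in value[:max_len_list]
        ]
    return value
```

Omitting `max_len_list` from the recursive calls would reproduce the bug: nested lists would fall back to the default of 5.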
The lines list duplicated entries[].content and only lines fed save(), so entry edits could diverge from the saved file. save() now renders the line stream from entries; read_entries() stays for replaying recent trace context in persistent sessions.
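The single-source-of-truth idea, sketched with hypothetical class names and a made-up line format; the real `TraceLogger` API is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class TraceEntry:
    kind: str
    content: str


@dataclass
class TraceLogSketch:
    entries: list[TraceEntry] = field(default_factory=list)

    def add(self, kind: str, content: str) -> None:
        self.entries.append(TraceEntry(kind, content))

    def render_lines(self) -> list[str]:
        """Derive the line stream from entries at save time; nothing is stored twice."""
        return [f"[{entry.kind}] {entry.content}" for entry in self.entries]

    def read_entries(self, last: int) -> list[TraceEntry]:
        """Return the most recent entries, e.g. to replay context in a session."""
        return self.entries[-last:] if last > 0 else []
```

Since lines are rendered on demand, an edit to an entry always shows up in what gets saved.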
The time field was declared on SleepResult but never populated; it is now set.
history.save_session and notes.setup_notes hardcoded the format string that parse_timestamp expects; both now share the TIMESTAMP_FMT constant.
The claude: ignore marker sat inside the mcp module docstring, so it rendered in help() and did not register as a comment. It is now a bare comment line at the top of the file, where the edits hook recognizes file-level markers.
The catch-all swallowed every failure during socket teardown; only socket-layer close errors (OSError, ValueError on closed I/O) are expected there.
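Narrowing the handler looks roughly like this; `close_quietly` is a hypothetical helper name, not a function from the library.

```python
import socket


def close_quietly(sock: socket.socket) -> None:
    """Close a socket, ignoring only the errors expected during teardown."""
    try:
        sock.close()
    except (OSError, ValueError):
        # Expected socket-layer close failures (already closed, closed I/O).
        pass
    # Anything else (a bug in a wrapper, a programming error) propagates
    # instead of being swallowed by a bare `except Exception`.
```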
Renames in background (task, wake_event, running, message_generator, run, handle_response), realtime (scheduler attributes; the ideas property becomes a plain attribute), and throttle (max_concurrent, min_interval, loop_states) per the no-private-prefix convention. dict[str, object] payloads that are inherently open (JSON Schema, domain session JSON, SDK stream turns) carry claude: ignore markers.
lup.mcp imports the mcp package but the dependency was undeclared (satisfied only transitively). The py.typed marker ships via setuptools package-data so consumers get type information, and the README documents the library since pyproject declares it.
Adds the genuinely public surface that was missing from __all__: scheduler and guards, background agents, session history, notes setup, throttle, retry, tool gate and hook factories, metrics tracking, and timestamp helpers. Sandbox is exported via lazy module __getattr__ so import lup works without the docker extra. In-repo imports stay direct-from-module.
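Lazy export through a module-level `__getattr__` (PEP 562) can be demonstrated without a real package. In an actual `__init__.py` the function would simply be defined at module level; here a throwaway module object is built so the sketch runs on its own. The helper name and the example mapping are hypothetical.

```python
import importlib
import types


def make_lazy_module(name: str, lazy: dict[str, tuple[str, str]]) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        if attr in lazy:
            target_module, target_attr = lazy[attr]
            # The heavy or optional dependency is imported only now.
            value = getattr(importlib.import_module(target_module), target_attr)
            setattr(module, attr, value)  # cache so __getattr__ is not hit again
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    # Module attribute lookup falls back to __getattr__ found in the module dict.
    module.__getattr__ = __getattr__
    return module


# Stand-in example: "Decoder" plays the role of an export behind an optional extra.
demo = make_lazy_module("demo_pkg", {"Decoder": ("json", "JSONDecoder")})
```

Importing the package never touches the optional dependency; the `ImportError`, if any, is deferred to the first access of the lazy name.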
Covers create_tool_gate and all four presets (deny before unlock, pass/allow after), lup_tool response paths (success, ToolError, validation failure, direct call), permission hook RW/RO enforcement, the notes RO grant excluding logs (regression), resolve_version progressive fallback, latest-session selection by parsed timestamp across versions (regression), lazy path configuration without a pyproject, TraceLogger save/slice and nested truncation limits, with_retry, tracked metrics, nudge/capture hooks, barrel drift, query option injection with pre-built options, error-result tracing, and the sandbox deadline computation.
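The barrel drift check mentioned above reduces to a few lines. This sketch uses a hypothetical helper name and runs against a standard-library module; the real test targets `lup`.

```python
import importlib


def unresolved_exports(module_name: str) -> list[str]:
    """Return the names listed in __all__ that do not resolve on the module."""
    module = importlib.import_module(module_name)
    exported = getattr(module, "__all__", [])
    # hasattr goes through a module-level __getattr__ too, so lazy exports count.
    return [name for name in exported if not hasattr(module, name)]
```

A test would then assert that the returned list is empty, so a renamed or removed symbol left behind in `__all__` fails immediately.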
Summary
Makes packages/lup genuinely standalone and adds the library's missing behavior-test layer. Stacked on #13.

Import-safety / one path system
- lup.paths resolves the project root lazily (a pip-installed lup no longer crashes at import outside a [tool.lup] project); configure() tolerates roots without a pyproject (version falls back to 0.0.0)
- AGENT_NOTES_PATH / AGENT_LOGS_PATH are wired through lup.paths.configure() at CLI startup, so they now relocate all session data, not just trace files; four devtools value-imports of AGENT_VERSION (frozen at import) migrated to the accessor

Silently-wrong-result APIs fixed
- query(output_type=T, options=prebuilt) no longer drops the structured-output request (previously returned None silently)
- update_session_metadata picks the latest session by parsed timestamp, not lexicographic path (0.10.0 < 0.9.0)
- run_code(timeout_seconds=0) means no deadline, as documented; it previously killed the REPL after 5s and lost all state
- Error ResultMessages are emitted/traced before raising, so failing sessions leave evidence

Permissions: the notes read-only grant no longer includes notes/traces/<version>/logs/, so the docstring's "agent cannot access logs" is now true (RO covers sessions/ and outputs/ across versions, logs excluded, pinned by tests).

create_tool_gate: the four existing deny-until-unlocked guards (reflection gate, stop guard, pending-event guard, meta-before-sleep) were one pattern implemented four ways; they're now thin presets over a single documented primitive in lup.hooks, with the pattern written up in PATTERNS.md.

Library hygiene: TraceLogger stores entries once (lines rendered at save; read_entries kept as the replay affordance); SleepResult.time is set; TIMESTAMP_FMT deduplicated; dated model defaults bumped to claude-sonnet-4-6; lup.lib.* ghost references removed; _-prefix sweep in background/realtime/throttle; file-level # claude: ignore moved out of the mcp docstring.

Packaging: mcp dependency declared (was only transitive), py.typed ships in the wheel, packages/lup/README.md added (pyproject already pointed at it).

Barrel: lup.__init__ now declares the complete public API with a drift test asserting every __all__ name resolves.

Test plan
- lup_tool error propagation, RW/RO permission hooks (incl. the logs exclusion), resolve_version fallback, latest-by-timestamp, lazy paths/configure, TraceLogger round-trip, truncation recursion, client option-injection, sandbox deadline computation, barrel drift