[kernel-1116] browser logging: Event Schema & Pipeline #184
archandatta wants to merge 10 commits into main
Conversation
rgarcia
left a comment
nice direction on the pipeline/schema. one thing i'd consider before cementing BrowserEvent v1: if this is eventually going to drive both capture controls (for example: "turn on cdp console only" or "turn on network request/response capture at headers-only detail") and subscriptions, it might be worth making those selector dimensions first-class in the envelope instead of encoding too much into a single type string.
concretely, i think i'd lean toward:
- keeping the primary event identity semantic, e.g. `console.log`, `network.request`, `input.click`
- adding explicit provenance fields like `source_kind` (`cdp`, `kernel_api`, `extension`, `local_process`) plus `source_name`/`source_event` (for example `Runtime.consoleAPICalled`)
- adding an explicit `detail_level` (`minimal`, `default`, `verbose`, `raw`)
- possibly making `category` first-class too instead of deriving it from the type prefix
i probably would not use raw Runtime.* / Network.* as the primary type, since that makes future non-cdp producers feel awkward/second-class. i think the semantic-type + provenance split ages better if we later want to emit events from things like:
- third-party extensions running in the browser and talking to localhost
- vm-local helper processes/programs running alongside the browser
- server/api-driven tool actions like screenshot/input/recording events
that shape also gives the system a much more natural control surface for both capture config and subscriptions, since selectors can operate directly on stable fields like category, topic, source_kind, and detail_level instead of needing to parse overloaded event names.
Sayan-
left a comment
There was a problem hiding this comment.
focused review on the pipeline since raf had some feedback on the event schema!
Cursor Bugbot has reviewed your changes and found 1 potential issue.
```go
func truncateIfNeeded(ev BrowserEvent) (BrowserEvent, []byte) {
	data, err := json.Marshal(ev)
	if err != nil {
		return ev, data
```
Marshal error silently writes corrupt JSONL line
Low Severity
truncateIfNeeded silently swallows json.Marshal errors and returns nil data (since Marshal returns nil on error). The caller in Pipeline.Publish passes this nil data to FileWriter.Write, which executes append(nil, '\n') and writes a bare newline to the JSONL file — a corrupt, non-JSON line. The function returns no error, so the pipeline cannot detect the failure. The first error path (return ev, data) and second (return ev, nil) are also inconsistent, suggesting an oversight.
```go
	r.nextSeq = oldest
	r.rb.mu.RUnlock()
	data := json.RawMessage(fmt.Sprintf(`{"dropped":%d}`, dropped))
	return BrowserEvent{Type: "events.dropped", Category: CategorySystem, Source: SourceKernelAPI, Data: data}, nil
```
On final review I'm wondering if the drop sentinel should be a separate type from BrowserEvent. A drop notification is stream metadata, not browser content.
A deeper dive a la opus:
The current sentinel has fabricated/zero fields (Seq: 0, Ts: 0, Category: "system", Source: "kernel_api") that aren't actually observed from the browser. Since there are no consumers yet, now's the cheapest time to split this:
```go
type ReadResult struct {
	Event   *BrowserEvent // nil when Dropped > 0
	Dropped uint64        // count of events lost at this point
}
```

This makes the contract compile-time safe: consumers can't accidentally serialize a drop as a real event because Event is nil. The transport layer (e.g. a future WebSocket handler) can decide how to represent drops on the wire independently.
```go
	mu    sync.Mutex
	ring  *RingBuffer
	files *FileWriter
	seq   atomic.Uint64
```
might be simpler to just make this a uint64 since we only mutate it while holding the lock. nbd
```go
	"time"
)

// Pipeline glues a RingBuffer and a FileWriter into a single write path
```
may be helpful to capture the intended use of the pipeline wrt lifecycle (e.g. can you stop and restart the same pipeline, do you need to send a terminal event before closing, etc)
```go
// Start sets the capture session ID that will be stamped on every subsequent
// published event
func (p *Pipeline) Start(captureSessionID string) {
```
i think we're still missing a separation between subscription selectors and native producer config. some producers will be native to the image server and controllable by us (like cdp), while others may be external and not controllable at all. because of that, i'm not sure subscriptions should implicitly mean "turn this producer on/reconfigure it". i think the api may want subscriptions to stay a pure consumer-side filter, with native producer enablement/config modeled explicitly and separately.
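the separation could be modeled as two unrelated types: a consumer-side filter that never touches producers, and an explicit producer config. a minimal sketch, with hypothetical names (`Subscription`, `ProducerConfig`) that are not the PR's actual API:

```go
package main

import "fmt"

// Subscription is a pure consumer-side filter: it only selects which
// already-captured events a reader sees, and never reconfigures a producer.
type Subscription struct {
	Categories []string // e.g. "console", "network"
	Types      []string // e.g. "console.log"
}

// ProducerConfig explicitly enables/configures native producers (like CDP);
// external producers (extensions, local processes) simply ignore it.
type ProducerConfig struct {
	EnableCDPConsole bool
	NetworkDetail    string // e.g. "headers_only"
}

// Matches reports whether an event category passes the filter.
// An empty filter matches everything.
func (s Subscription) Matches(category string) bool {
	if len(s.Categories) == 0 {
		return true
	}
	for _, c := range s.Categories {
		if c == category {
			return true
		}
	}
	return false
}

func main() {
	sub := Subscription{Categories: []string{"console"}}
	fmt.Println(sub.Matches("console"), sub.Matches("network"))
}
```

keeping these as separate API surfaces means subscribing can never implicitly mean "turn this producer on", which is the coupling the comment warns about.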
```go
	Ts       int64         `json:"ts"`
	Type     string        `json:"type"`
	Category EventCategory `json:"category"`
	Source   Source        `json:"source"`
```
this is starting to feel like provenance wants to be an object rather than multiple top-level fields. i'd consider making source structured and moving cdp-specific context under source.metadata, e.g.
```
{
  "type": "console.log",
  "category": "console",
  "detail_level": "standard",
  "source": {
    "kind": "cdp",
    "event": "Runtime.consoleAPICalled",
    "metadata": {
      "target_id": "...",
      "cdp_session_id": "...",
      "frame_id": "...",
      "parent_frame_id": "..."
    }
  },
  "data": { ... }
}
```

that keeps the top level focused on stable cross-producer fields and makes non-cdp producers feel first-class too.
```go
// BrowserEvent is the canonical event structure for the browser capture pipeline.
type BrowserEvent struct {
	CaptureSessionID string `json:"capture_session_id"`
```
what is capture_session_id intended to identify? this reads more like pipeline/native-producer lifecycle metadata than part of the portable event schema. if resume/subscription are meant to work across heterogeneous producers, i'd be careful about baking a producer-owned session id into every event.
```go
// If the ring has already published events, the reader will receive an
// events_dropped BrowserEvent on the first Read call if it has fallen behind
// the oldest retained event
func (rb *RingBuffer) NewReader() *Reader {
```
i think the resumable story wants to show up in the api here eventually. right now NewReader() always starts from 0, so there isn't yet an explicit way to resume from a previously persisted cursor/offset after reconnect.
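a resumable constructor could clamp a persisted cursor to the retained window, so a stale cursor degrades to "oldest available" instead of failing. a minimal sketch with simplified stand-in types; `NewReaderAt` is a hypothetical addition, not the PR's API:

```go
package main

import "fmt"

// simplified stand-ins for the PR's ring buffer and reader
type RingBuffer struct {
	written uint64 // total events ever published
	cap     uint64 // ring capacity
}

type Reader struct {
	rb      *RingBuffer
	nextSeq uint64
}

// NewReaderAt resumes from a previously persisted cursor. A cursor older
// than the retained window is clamped to the oldest retained event (the
// caller should then be told about the gap); a cursor past the newest
// event is clamped to the current tail.
func (rb *RingBuffer) NewReaderAt(seq uint64) *Reader {
	oldest := uint64(0)
	if rb.written > rb.cap {
		oldest = rb.written - rb.cap
	}
	if seq < oldest {
		seq = oldest
	}
	if seq > rb.written {
		seq = rb.written
	}
	return &Reader{rb: rb, nextSeq: seq}
}

func main() {
	rb := &RingBuffer{written: 100, cap: 16}
	fmt.Println(rb.NewReaderAt(5).nextSeq, rb.NewReaderAt(90).nextSeq)
}
```

with this shape, `NewReader()` becomes sugar for `NewReaderAt(oldest)`, and the persisted cursor can simply be the last `Event.Seq` the consumer processed.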
```go
// Reader tracks an independent read position in a RingBuffer.
type Reader struct {
	rb      *RingBuffer
	nextSeq uint64 // publish index, not BrowserEvent.Seq
```
more fundamentally, i think Event.Seq probably wants to be the public stream offset / resume token. right now the reader tracks a separate publish-index concept (nextSeq / written), which creates two seq-like notions in the model. i'd strongly consider aligning the reader/ring terminology and api around Event.Seq instead of introducing a second cursor concept.
```go
	r.nextSeq = oldest
	r.rb.mu.RUnlock()
	data := json.RawMessage(fmt.Sprintf(`{"dropped":%d}`, dropped))
	return BrowserEvent{Type: "events.dropped", Category: CategorySystem, Source: SourceKernelAPI, Data: data}, nil
```
i'm a little nervous about representing drops as a normal Event. this feels like stream/transport metadata rather than producer-emitted content. maybe Read should return a separate result type, e.g. ReadResult{Event *Event, Dropped uint64}, where Event is nil when the reader has fallen behind. then the transport layer can choose how to surface drops on the wire without baking them into the canonical event schema.
```go
const (
	DetailMinimal DetailLevel = "minimal"
	DetailDefault DetailLevel = "default"
```
nit: if this is meant to be a concrete level, i'd consider renaming default to standard. minimal/standard/verbose/raw reads like an actual ladder, whereas default feels more like a policy alias whose meaning could vary by producer.
```go
)

// BrowserEvent is the canonical event structure for the browser capture pipeline.
type BrowserEvent struct {
```
nit: i wonder if Event would be a cleaner name than BrowserEvent. it feels more future-proof if extension/local-process/server-native producers end up as first-class peers.
```go
	mu      sync.RWMutex
	buf     []BrowserEvent
	head    int    // next write position (mod cap)
	written uint64 // total ever published (monotonic)
```
nit: could we make this comment a bit more explicit? "written" took me a second to parse. something like "monotonic count of publishes into the logical stream" would make it clearer that this is not bytes-written and not Event.Seq.
```go
	buf     []BrowserEvent
	head    int    // next write position (mod cap)
	written uint64 // total ever published (monotonic)
	notify  chan struct{}
```
nit: "notify" feels a bit ambiguous here since it doesn't say who is being notified. i'd consider renaming this to something reader-scoped like readerWake or readerNotify, plus a short comment that it's closed-and-replaced on each publish to wake blocked readers.
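the close-and-replace wakeup pattern the comment describes can be illustrated standalone; the sketch below (`broadcaster`, `readerWake`) is a generic illustration of the idiom, not the PR's RingBuffer:

```go
package main

import (
	"fmt"
	"sync"
)

// broadcaster demonstrates the close-and-replace wakeup idiom: a channel
// that is closed on each publish to wake every blocked reader at once,
// then swapped for a fresh channel for the next round.
type broadcaster struct {
	mu         sync.Mutex
	readerWake chan struct{} // closed and replaced on each publish
}

func newBroadcaster() *broadcaster {
	return &broadcaster{readerWake: make(chan struct{})}
}

// wait returns the current wake channel; readers block on it instead of
// polling. Receiving succeeds once publish closes it.
func (b *broadcaster) wait() <-chan struct{} {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.readerWake
}

// publish wakes all blocked readers (close broadcasts to every receiver)
// and installs a fresh channel for future waiters.
func (b *broadcaster) publish() {
	b.mu.Lock()
	defer b.mu.Unlock()
	close(b.readerWake)
	b.readerWake = make(chan struct{})
}

func main() {
	b := newBroadcaster()
	done := make(chan struct{})
	go func() {
		<-b.wait() // blocks until publish closes the channel
		close(done)
	}()
	b.publish()
	<-done
	fmt.Println("reader woke")
}
```

naming the field something reader-scoped like `readerWake` plus this one-line lifecycle comment makes the ownership obvious at the struct definition.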


Note
Medium Risk
Introduces new concurrency-heavy event ingestion and persistence code (ring buffer fan-out, atomic sequencing, and file appends), which can impact reliability and performance under load. Also adds truncation logic that may drop payload data when events exceed the 1MB limit.
Overview
Adds a new `events` package implementing a canonical `BrowserEvent` schema (category/source/detail level fields) and a capture pipeline that stamps `capture_session_id`, a monotonic `seq`, and timestamps, then enforces a 1MB max record size by truncating oversized `data` and marking events as `truncated`.

Implements two sinks: a non-blocking in-memory `RingBuffer` with per-reader cursors and an `events.dropped` sentinel when readers fall behind, and a per-category JSONL `FileWriter` that lazily opens `<category>.log` files and serializes concurrent writes. Includes extensive unit tests covering serialization, overflow/drop behavior, concurrent readers/writers, file routing, sequencing, and truncation.

Written by Cursor Bugbot for commit 36cff2d.
RingBufferwith per-reader cursors and anevents.droppedsentinel when readers fall behind, and a per-category JSONLFileWriterthat lazily opens<category>.logfiles and serializes concurrent writes. Includes extensive unit tests covering serialization, overflow/drop behavior, concurrent readers/writers, file routing, sequencing, and truncation.Written by Cursor Bugbot for commit 36cff2d. This will update automatically on new commits. Configure here.