
in_ebpf: Implement sched trace#11743

Open
cosmo0920 wants to merge 4 commits into master from cosmo0920-implement-sched-trace

@cosmo0920 cosmo0920 commented Apr 24, 2026

Contributor

This PR implements a scheduler (sched) eBPF trace for the in_ebpf input plugin.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
% sudo bin/fluent-bit -i ebpf -ptrace=trace_sched -o stdout -v 
  • Debug log output from testing the change
Fluent Bit v5.0.8
* Copyright (C) 2015-2026 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____ 
|  ___| |                | |   | ___ (_) |         |  ___||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   _|___ \ | |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \|  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V //\__/ /\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)\___/


[2026/06/16 13:54:15.334] [ info] Configuration:
[2026/06/16 13:54:15.356] [ info]  flush time     | 1.000000 seconds
[2026/06/16 13:54:15.361] [ info]  grace          | 5 seconds
[2026/06/16 13:54:15.361] [ info]  daemon         | 0
[2026/06/16 13:54:15.361] [ info] ___________
[2026/06/16 13:54:15.361] [ info]  inputs:
[2026/06/16 13:54:15.362] [ info]      ebpf
[2026/06/16 13:54:15.362] [ info] ___________
[2026/06/16 13:54:15.362] [ info]  filters:
[2026/06/16 13:54:15.362] [ info] ___________
[2026/06/16 13:54:15.362] [ info]  outputs:
[2026/06/16 13:54:15.363] [ info]      stdout.0
[2026/06/16 13:54:15.363] [ info] ___________
[2026/06/16 13:54:15.363] [ info]  collectors:
[2026/06/16 13:54:15.426] [ info] [fluent bit] version=5.0.8, commit=93e00a7309, pid=191057
[2026/06/16 13:54:15.436] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2026/06/16 13:54:15.441] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2026/06/16 13:54:15.442] [ info] [simd    ] SSE2
[2026/06/16 13:54:15.442] [ info] [cmetrics] version=2.1.4
[2026/06/16 13:54:15.442] [ info] [ctraces ] version=0.7.1
[2026/06/16 13:54:15.457] [ info] [input:ebpf:ebpf.0] initializing
[2026/06/16 13:54:15.457] [ info] [input:ebpf:ebpf.0] storage_strategy='memory' (memory only)
[2026/06/16 13:54:15.459] [debug] [ebpf:ebpf.0] created event channels: read=21 write=22
[2026/06/16 13:54:15.459] [debug] [input:ebpf:ebpf.0] initializing eBPF input plugin
[2026/06/16 13:54:15.464] [debug] [input:ebpf:ebpf.0] processing trace: trace_sched
[2026/06/16 13:54:15.464] [debug] [input:ebpf:ebpf.0] setting up trace configuration for: trace_sched
[2026/06/16 13:54:20.383] [debug] [input:ebpf:ebpf.0] attaching BPF program for trace: trace_sched
[2026/06/16 13:54:20.396] [debug] [input:ebpf:ebpf.0] registering trace handler for: trace_sched
[2026/06/16 13:54:20.400] [ info] [input:ebpf:ebpf.0] registered trace handler for: trace_sched
[2026/06/16 13:54:20.401] [ info] [input:ebpf:ebpf.0] trace configuration completed for: trace_sched
[2026/06/16 13:54:20.402] [debug] [input:ebpf:ebpf.0] setting up collector with poll interval: 1000 ms
[2026/06/16 13:54:20.404] [ info] [input:ebpf:ebpf.0] eBPF input plugin initialized successfully
[2026/06/16 13:54:20.409] [debug] [stdout:stdout.0] created event channels: read=38 write=39
[2026/06/16 13:54:20.447] [ info] [sp] stream processor started
[2026/06/16 13:54:20.449] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2026/06/16 13:54:20.458] [ info] [output:stdout:stdout.0] worker #0 started
[2026/06/16 13:54:21.149] [debug] [input:ebpf:ebpf.0] collecting events from ring buffers
[2026/06/16 13:54:21.149] [debug] [input:ebpf:ebpf.0] consuming events from ring buffer trace_sched
[2026/06/16 13:54:21.254] [debug] [input:ebpf:ebpf.0] successfully consumed events from ring buffer trace_sched
[2026/06/16 13:54:22.150] [debug] [task] created task=0x7a62b90 id=0 OK
[2026/06/16 13:54:22.152] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2026/06/16 13:54:22.153] [debug] [input:ebpf:ebpf.0] collecting events from ring buffers
[2026/06/16 13:54:22.153] [debug] [input:ebpf:ebpf.0] consuming events from ring buffer trace_sched
[2026/06/16 13:54:22.216] [debug] [input:ebpf:ebpf.0] successfully consumed events from ring buffer trace_sched
[0] ebpf.0: [[1781585661.153101206, {}], {"event_type"=>"sched", "pid"=>182312, "tid"=>182312, "comm"=>"kworker/u48:37", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>182312, "next_prio"=>120, "runq_latency_ns"=>4142, "wakeup_tracked"=>true}]
[1] ebpf.0: [[1781585661.184334297, {}], {"event_type"=>"sched", "pid"=>191055, "tid"=>191055, "comm"=>"sudo", "cpu"=>7, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191055, "next_prio"=>120, "runq_latency_ns"=>1673, "wakeup_tracked"=>true}]
[2] ebpf.0: [[1781585661.185261201, {}], {"event_type"=>"sched", "pid"=>182312, "tid"=>182312, "comm"=>"kworker/u48:37", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>182312, "next_prio"=>120, "runq_latency_ns"=>573, "wakeup_tracked"=>true}]
[3] ebpf.0: [[1781585661.185435407, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>2000, "wakeup_tracked"=>true}]
[4] ebpf.0: [[1781585661.185582407, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>2, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>10106, "wakeup_tracked"=>true}]
[5] ebpf.0: [[1781585661.185902025, {}], {"event_type"=>"sched", "pid"=>3905, "tid"=>3905, "comm"=>"gnome-terminal-", "cpu"=>0, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>3905, "next_prio"=>120, "runq_latency_ns"=>16919, "wakeup_tracked"=>true}]
[6] ebpf.0: [[1781585661.186042958, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>911, "wakeup_tracked"=>true}]
[7] ebpf.0: [[1781585661.186180045, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>902, "wakeup_tracked"=>true}]
[8] ebpf.0: [[1781585661.186326172, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>750, "wakeup_tracked"=>true}]
[9] ebpf.0: [[1781585661.186462778, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>702, "wakeup_tracked"=>true}]
[10] ebpf.0: [[1781585661.186612450, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>1017, "wakeup_tracked"=>true}]
[11] ebpf.0: [[1781585661.186748056, {}], {"event_type"=>"sched", "pid"=>114161, "tid"=>114161, "comm"=>"kworker/0:2", "cpu"=>0, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>114161, "next_prio"=>120, "runq_latency_ns"=>2394, "wakeup_tracked"=>true}]
[12] ebpf.0: [[1781585661.186883401, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>779, "wakeup_tracked"=>true}]
[13] ebpf.0: [[1781585661.187026272, {}], {"event_type"=>"sched", "pid"=>37, "tid"=>37, "comm"=>"migration/5", "cpu"=>5, "prev_pid"=>191058, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>37, "next_prio"=>0, "runq_latency_ns"=>4154, "wakeup_tracked"=>true}]
[14] ebpf.0: [[1781585661.187164388, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191058, "comm"=>"flb-pipeline", "cpu"=>2, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>11312811, "wakeup_tracked"=>true}]
[15] ebpf.0: [[1781585661.187507879, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>3720, "wakeup_tracked"=>true}]
[16] ebpf.0: [[1781585661.187671687, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>1450, "wakeup_tracked"=>true}]
[17] ebpf.0: [[1781585661.187810012, {}], {"event_type"=>"sched", "pid"=>162963, "tid"=>162963, "comm"=>"kworker/u49:3", "cpu"=>7, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>162963, "next_prio"=>100, "runq_latency_ns"=>993, "wakeup_tracked"=>true}]
[18] ebpf.0: [[1781585661.187956038, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2948, "comm"=>"KMS thread", "cpu"=>3, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>79, "runq_latency_ns"=>2909, "wakeup_tracked"=>true}]
[19] ebpf.0: [[1781585661.188091020, {}], {"event_type"=>"sched", "pid"=>204, "tid"=>204, "comm"=>"kworker/7:1H", "cpu"=>7, "prev_pid"=>162963, "prev_prio"=>100, "prev_state"=>1026, "next_pid"=>204, "next_prio"=>100, "runq_latency_ns"=>2715, "wakeup_tracked"=>true}]
[20] ebpf.0: [[1781585661.188580396, {}], {"event_type"=>"sched", "pid"=>25, "tid"=>25, "comm"=>"migration/2", "cpu"=>2, "prev_pid"=>191058, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>25, "next_prio"=>0, "runq_latency_ns"=>1976, "wakeup_tracked"=>true}]
[21] ebpf.0: [[1781585661.188790502, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>1633, "wakeup_tracked"=>true}]
[22] ebpf.0: [[1781585661.188932113, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191058, "comm"=>"flb-pipeline", "cpu"=>7, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>0, "wakeup_tracked"=>false}]
[23] ebpf.0: [[1781585661.189386685, {}], {"event_type"=>"sched", "pid"=>182312, "tid"=>182312, "comm"=>"kworker/u48:37", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>182312, "next_prio"=>120, "runq_latency_ns"=>1310, "wakeup_tracked"=>true}]
[24] ebpf.0: [[1781585661.189909187, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>1148, "wakeup_tracked"=>true}]
[25] ebpf.0: [[1781585661.190062880, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>1691, "wakeup_tracked"=>true}]
[26] ebpf.0: [[1781585661.190214704, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>2135, "wakeup_tracked"=>true}]
[27] ebpf.0: [[1781585661.190354331, {}], {"event_type"=>"sched", "pid"=>49, "tid"=>49, "comm"=>"migration/7", "cpu"=>7, "prev_pid"=>191058, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>49, "next_prio"=>0, "runq_latency_ns"=>2826, "wakeup_tracked"=>true}]
[28] ebpf.0: [[1781585661.190490954, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191058, "comm"=>"flb-pipeline", "cpu"=>2, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>0, "wakeup_tracked"=>false}]
[29] ebpf.0: [[1781585661.190647010, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>2971, "wakeup_tracked"=>true}]
[30] ebpf.0: [[1781585661.190784685, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2948, "comm"=>"KMS thread", "cpu"=>3, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>79, "runq_latency_ns"=>2324, "wakeup_tracked"=>true}]
[31] ebpf.0: [[1781585661.190931511, {}], {"event_type"=>"sched", "pid"=>3905, "tid"=>3905, "comm"=>"gnome-terminal-", "cpu"=>5, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>3905, "next_prio"=>120, "runq_latency_ns"=>2208, "wakeup_tracked"=>true}]
[32] ebpf.0: [[1781585661.191068394, {}], {"event_type"=>"sched", "pid"=>162963, "tid"=>162963, "comm"=>"kworker/u49:3", "cpu"=>7, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>162963, "next_prio"=>100, "runq_latency_ns"=>1454, "wakeup_tracked"=>true}]
[33] ebpf.0: [[1781585661.191203469, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>1605, "wakeup_tracked"=>true}]
[34] ebpf.0: [[1781585661.191348661, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191060, "comm"=>"flb-logger", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>7155, "wakeup_tracked"=>true}]
[35] ebpf.0: [[1781585661.191483572, {}], {"event_type"=>"sched", "pid"=>162963, "tid"=>162963, "comm"=>"kworker/u49:3", "cpu"=>7, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>162963, "next_prio"=>100, "runq_latency_ns"=>2475, "wakeup_tracked"=>true}]
[36] ebpf.0: [[1781585661.191621755, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2948, "comm"=>"KMS thread", "cpu"=>3, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>79, "runq_latency_ns"=>4133, "wakeup_tracked"=>true}]
[37] ebpf.0: [[1781585661.191766596, {}], {"event_type"=>"sched", "pid"=>162963, "tid"=>162963, "comm"=>"kworker/u49:3", "cpu"=>7, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>162963, "next_prio"=>100, "runq_latency_ns"=>2603, "wakeup_tracked"=>true}]
[38] ebpf.0: [[1781585661.191902416, {}], {"event_type"=>"sched", "pid"=>204, "tid"=>204, "comm"=>"kworker/7:1H", "cpu"=>7, "prev_pid"=>162963, "prev_prio"=>100, "prev_state"=>1026, "next_pid"=>204, "next_prio"=>100, "runq_latency_ns"=>4181, "wakeup_tracked"=>true}]
[39] ebpf.0: [[1781585661.192049347, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>2610, "wakeup_tracked"=>true}]
[40] ebpf.0: [[1781585661.192184228, {}], {"event_type"=>"sched", "pid"=>182312, "tid"=>182312, "comm"=>"kworker/u48:37", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>182312, "next_prio"=>120, "runq_latency_ns"=>2321, "wakeup_tracked"=>true}]
[41] ebpf.0: [[1781585661.192318677, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>4156, "wakeup_tracked"=>true}]
[42] ebpf.0: [[1781585661.192462993, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>3552, "wakeup_tracked"=>true}]
[43] ebpf.0: [[1781585661.192600763, {}], {"event_type"=>"sched", "pid"=>182312, "tid"=>182312, "comm"=>"kworker/u48:37", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>182312, "next_prio"=>120, "runq_latency_ns"=>2775, "wakeup_tracked"=>true}]
[44] ebpf.0: [[1781585661.192736946, {}], {"event_type"=>"sched", "pid"=>3905, "tid"=>3905, "comm"=>"gnome-terminal-", "cpu"=>5, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>3905, "next_prio"=>120, "runq_latency_ns"=>4560, "wakeup_tracked"=>true}]
[45] ebpf.0: [[1781585661.192879801, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>16451, "wakeup_tracked"=>true}]
[46] ebpf.0: [[1781585661.193014721, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>6270, "wakeup_tracked"=>true}]
[47] ebpf.0: [[1781585661.193155737, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>9536, "wakeup_tracked"=>true}]
[48] ebpf.0: [[1781585661.193288763, {}], {"event_type"=>"sched", "pid"=>17, "tid"=>17, "comm"=>"rcu_preempt", "cpu"=>6, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>17, "next_prio"=>120, "runq_latency_ns"=>8038, "wakeup_tracked"=>true}]
[49] ebpf.0: [[1781585661.193427863, {}], {"event_type"=>"sched", "pid"=>31, "tid"=>31, "comm"=>"migration/4", "cpu"=>4, "prev_pid"=>2914, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>31, "next_prio"=>0, "runq_latency_ns"=>3595, "wakeup_tracked"=>true}]
[50] ebpf.0: [[1781585661.193577381, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2914, "comm"=>"gnome-shell", "cpu"=>0, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>120, "runq_latency_ns"=>0, "wakeup_tracked"=>false}]
[51] ebpf.0: [[1781585661.193720766, {}], {"event_type"=>"sched", "pid"=>2914, "tid"=>2948, "comm"=>"KMS thread", "cpu"=>3, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>2914, "next_prio"=>79, "runq_latency_ns"=>3313, "wakeup_tracked"=>true}]
[52] ebpf.0: [[1781585661.193866135, {}], {"event_type"=>"sched", "pid"=>25, "tid"=>25, "comm"=>"migration/2", "cpu"=>2, "prev_pid"=>191058, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>25, "next_prio"=>0, "runq_latency_ns"=>3542, "wakeup_tracked"=>true}]
[53] ebpf.0: [[1781585661.193998853, {}], {"event_type"=>"sched", "pid"=>191057, "tid"=>191058, "comm"=>"flb-pipeline", "cpu"=>4, "prev_pid"=>0, "prev_prio"=>120, "prev_state"=>0, "next_pid"=>191057, "next_prio"=>120, "runq_latency_ns"=>0, "wakeup_tracked"=>false}]
<snip>
[2026/06/16 13:54:22.272] [debug] [out flush] cb_destroy coro_id=0
[2026/06/16 13:54:22.280] [debug] [task] destroy task=0x7a62b90 (task_id=0)
^C[2026/06/16 13:54:22] [engine] caught signal (SIGINT)
[2026/06/16 13:54:22.688] [debug] [task] created task=0x6683e80 id=0 OK
[2026/06/16 13:54:22.689] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2026/06/16 13:54:22.689] [ warn] [engine] service will shutdown in max 5 seconds
[2026/06/16 13:54:22.690] [debug] [engine] task 0 already scheduled to run, not re-scheduling it.
[2026/06/16 13:54:22.690] [ info] [engine] pausing all inputs..
[2026/06/16 13:54:22.691] [ info] [input] pausing ebpf.0
[2026/06/16 13:54:22.693] [debug] [input:ebpf:ebpf.0] collector paused
  • Attached Valgrind output that shows no leaks or memory corruption was found
==191057== 
==191057== HEAP SUMMARY:
==191057==     in use at exit: 0 bytes in 0 blocks
==191057==   total heap usage: 12,965 allocs, 12,965 frees, 47,959,419 bytes allocated
==191057== 
==191057== All heap blocks were freed -- no leaks are possible
==191057== 

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features
    • Added scheduler trace support for capturing scheduling transitions, including CPU context, previous/next process metadata, and run-queue latency derived from wakeup timing.
    • Extended the unified event schema with a new scheduler (sched) event type and payload fields, and improved string rendering for the scheduler event type.
  • Tests
    • Added runtime coverage to verify scheduler event encoding and decoding, including validation of scheduler-specific fields (event type, CPU, PIDs, and run-queue latency).

@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Walkthrough

This PR adds comprehensive eBPF scheduler event tracing to Fluent Bit. The change defines a new EVENT_TYPE_SCHED event type with fields capturing process wakeup, context-switch, and run-queue latency; implements kernel tracepoint handlers to collect these events; adds user-space encoding logic to convert eBPF data into Fluent Bit log records; integrates into the existing trace framework; and includes runtime tests validating the encoding path.
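
The run-queue latency mentioned above can be pictured as a subtraction between a task's wakeup timestamp and the timestamp at which it is switched in. The following is a minimal userspace sketch, assuming a small fixed-size table in place of the BPF hash map. The names `demo_wakeup_table`, `demo_record_wakeup`, and `demo_on_switch_in` are hypothetical and not taken from the PR, and consuming the entry on switch-in is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_TABLE_SIZE 64  /* hypothetical capacity; the PR uses a BPF hash map */

struct demo_wakeup_slot {
    bool     used;
    uint32_t pid;
    uint64_t wakeup_ns;
};

struct demo_wakeup_table {
    struct demo_wakeup_slot slots[DEMO_TABLE_SIZE];
};

struct demo_latency {
    uint64_t runq_latency_ns;
    bool     wakeup_tracked;
};

static void demo_table_init(struct demo_wakeup_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Linear scan for an existing entry; returns NULL when the PID is unknown. */
static struct demo_wakeup_slot *demo_lookup(struct demo_wakeup_table *t,
                                            uint32_t pid)
{
    size_t i;

    for (i = 0; i < DEMO_TABLE_SIZE; i++) {
        if (t->slots[i].used && t->slots[i].pid == pid) {
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Models the wakeup handlers: remember when the task became runnable. */
static int demo_record_wakeup(struct demo_wakeup_table *t, uint32_t pid,
                              uint64_t now_ns)
{
    struct demo_wakeup_slot *slot = demo_lookup(t, pid);
    size_t i;

    for (i = 0; slot == NULL && i < DEMO_TABLE_SIZE; i++) {
        if (!t->slots[i].used) {
            slot = &t->slots[i];
        }
    }
    if (slot == NULL) {
        return -1;  /* table full: this wakeup stays untracked */
    }
    slot->used = true;
    slot->pid = pid;
    slot->wakeup_ns = now_ns;
    return 0;
}

/* Models the switch handler: latency is switch-in time minus wakeup time. */
static struct demo_latency demo_on_switch_in(struct demo_wakeup_table *t,
                                             uint32_t next_pid,
                                             uint64_t now_ns)
{
    struct demo_latency result = { 0, false };
    struct demo_wakeup_slot *slot = demo_lookup(t, next_pid);

    if (slot != NULL) {
        if (now_ns >= slot->wakeup_ns) {
            result.runq_latency_ns = now_ns - slot->wakeup_ns;
        }
        result.wakeup_tracked = true;
        slot->used = false;  /* assumption: the entry is consumed once used */
    }
    return result;
}
```

When no wakeup was recorded for the incoming task, the sketch reports a latency of 0 with `wakeup_tracked` set to false, which is the same shape as the `"runq_latency_ns"=>0, "wakeup_tracked"=>false` records in the sample output.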

Changes

Scheduler trace support

| Layer | File(s) | Summary |
|---|---|---|
| Event type and data contract | plugins/in_ebpf/traces/includes/common/events.h, plugins/in_ebpf/traces/includes/common/encoder.h | EVENT_TYPE_SCHED enum value added; struct sched_event defines process identity (prev/next PID, priority), previous state, CPU, run-queue latency, and wakeup tracking fields; struct sched_sample bundles common event header with sched event payload; struct event union extended with sched member; event_type_to_string() maps new type to "sched" string. |
| eBPF kernel program | plugins/in_ebpf/traces/sched/bpf.c | Defines wakeup_info struct and declares wakeup_by_pid hash map (PID → timestamp) and events ring buffer; implements sched_wakeup and sched_wakeup_new handlers recording task wakeup kernel timestamps per PID; sched_switch handler reads scheduling context from tracepoint, applies mount-namespace filter, reserves output event, populates sched fields with task identities/priorities/CPU, conditionally computes runq_latency_ns from stored prior wakeup time, and submits to ring buffer; declares module license. |
| User-space event encoding | plugins/in_ebpf/traces/sched/handler.h, plugins/in_ebpf/traces/sched/handler.c | New header declares function prototypes for encoding and handling; encode_sched_event() wraps sched sample into internal event, encodes common fields via shared helper, appends sched-specific fields (CPU, prev/next PID/priority/state, run-queue latency, wakeup tracking) to Fluent Bit log record with transactional rollback on error; trace_sched_handler() validates input buffer size and event type, invokes encoder, appends encoded output to Fluent Bit input, manages encoder state, and returns operation status. |
| Trace framework integration | plugins/in_ebpf/traces/traces.h | Includes generated trace_sched.skel.h skeleton header and sched/handler.h; expands DEFINE_GET_BPF_OBJECT(trace_sched) macro to generate BPF object accessor function; registers trace_sched in trace_table with trace_sched_handler as event handler. |
| Handler tests | tests/runtime/CMakeLists.txt, tests/runtime/in_ebpf_sched_handler.c | Adds CMake build rule for in_ebpf_sched_handler executable; test scaffolding allocates and initializes Fluent Bit log encoder/decoder with cleanup; helper functions validate messagepack key names and verify decoded scheduler field values; test_sched_event_encoding() constructs EVENT_TYPE_SCHED event with scheduler details, encodes through handler, decodes resulting messagepack map, and asserts expected field values/types match; test registered in TEST_LIST. |
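
The summary above notes that the handler validates the input buffer size and the event type before encoding. A minimal sketch of that kind of check follows, assuming simplified stand-in types. `demo_event_type`, `demo_sched_sample`, and `demo_validate_sched_sample` are hypothetical names; the real definitions live in events.h and handler.c in the PR and may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the event type enum and the sched sample. */
enum demo_event_type {
    DEMO_EVENT_TYPE_EXECVE = 0,
    DEMO_EVENT_TYPE_SCHED  = 1
};

struct demo_sched_sample {
    uint32_t type;
    uint32_t cpu;
    uint32_t prev_pid;
    uint32_t next_pid;
    uint64_t runq_latency_ns;
};

/*
 * Returns 0 when the raw ring-buffer bytes are large enough and carry the
 * sched event type, -1 otherwise. The sample is copied out so the caller
 * never reads through a possibly unaligned pointer.
 */
static int demo_validate_sched_sample(const void *data, size_t data_sz,
                                      struct demo_sched_sample *out)
{
    assert(out != NULL);

    if (data == NULL || data_sz < sizeof(*out)) {
        return -1;  /* truncated or missing record */
    }
    memcpy(out, data, sizeof(*out));
    if (out->type != DEMO_EVENT_TYPE_SCHED) {
        return -1;  /* record belongs to a different trace */
    }
    return 0;
}
```

Rejecting short buffers before touching any field keeps a malformed record from being decoded as a valid scheduler event.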

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • fluent/fluent-bit#11735: Adds DNS event type to shared eBPF trace infrastructure with similar pattern to this scheduler trace (event type enum, payload struct, encoder mapping).
  • fluent/fluent-bit#11646: Extends common eBPF event definitions for TCP trace by modifying the same event_type_to_string() and struct event union that this PR uses.
  • fluent/fluent-bit#11568: Adds VFS event type to the shared eBPF encoding layer following the same architectural pattern as this scheduler trace implementation.

Suggested reviewers

  • edsiper

Poem

🐰 A scheduler awakes, timestamps in paw,
Through kernelspace traces we capture each draw,
Ring buffers deliver the wakeup ballet,
Fluent Bit logs the latency each day,
With tests all encoded, the trace finds its way! 📊✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'in_ebpf: Implement sched trace' accurately and clearly summarizes the main change: implementing a new scheduler trace feature for the in_ebpf input plugin. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-sched-trace branch from a8dc2f7 to 5f38fd5 Compare April 24, 2026 08:59
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-sched-trace branch from 5f38fd5 to 8235ef9 Compare May 7, 2026 06:20
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-sched-trace branch from 8235ef9 to 4916257 Compare May 7, 2026 07:03
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-sched-trace branch from 4916257 to d98bc64 Compare May 7, 2026 07:20

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa08b76c7f

Comment thread plugins/in_ebpf/traces/sched/handler.c

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/in_ebpf/traces/sched/bpf.c`:
- Around line 79-103: The code is reading namespace/UID/GID from the current
(outgoing) task but builds the event for the next task; change the sched_switch
handler so mntns_id and uid_gid are read from the next task instead of using
gadget_get_mntns_id() and bpf_get_current_uid_gid(): locate where
next_pid/ctx->next_comm are used, obtain the next task pointer (the sched_switch
context field for the next task), use bpf_core_read to fetch next task's
nsproxy->mnt_ns inum (or other mntns id field) and its credentials (uid/gid)
into local vars, call gadget_should_discard_mntns_id() with that next-task
mntns_id, and assign event->common.uid/gid/mntns_id from those next-task values
before filling pid/tid/comm.
- Around line 28-31: The sched handler is reserving the full union-sized struct
event (via gadget_reserve_buf(&events, sizeof(*event))) which makes each
sched_switch sample huge; create a compact sched-specific record path: define a
dedicated compact struct sched_event (not the large union) and either use a
separate ringbuf map or ensure gadget_reserve_buf is called with sizeof(struct
sched_event) when handling sched_switch; update the sched event
producer/consumer functions (the code around the events map, gadget_reserve_buf
usage, and the sched_switch handler) to allocate, populate, and submit only the
small sched_event payload so the ring buffer capacity is not consumed by the
large execve union members.
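
The ring-buffer size concern in the second item can be made concrete by comparing a union-sized record with a compact one. This is only a sketch under assumed layouts: every `demo_` type and field size is hypothetical and does not reflect the actual structs in events.h, and it ignores the per-record header and alignment that the kernel ring buffer adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payloads; the sizes are illustrative, not the PR's. */
struct demo_execve_payload {
    char filename[256];
    char argv[1024];
};

struct demo_sched_payload {
    uint32_t cpu;
    uint32_t prev_pid;
    uint32_t next_pid;
    int32_t  prev_prio;
    int32_t  next_prio;
    uint32_t wakeup_tracked;
    uint64_t prev_state;
    uint64_t runq_latency_ns;
};

struct demo_common {
    uint64_t timestamp_ns;
    uint32_t pid;
    uint32_t tid;
    char     comm[16];
};

/* Union-style record: as large as its biggest member. */
struct demo_union_event {
    uint32_t           type;
    struct demo_common common;
    union {
        struct demo_execve_payload execve;
        struct demo_sched_payload  sched;
    } details;
};

/* Compact record: common header plus the sched payload only. */
struct demo_sched_sample {
    uint32_t                  type;
    struct demo_common        common;
    struct demo_sched_payload sched;
};

/* How many records of a given size fit in a buffer of buffer_bytes. */
static size_t demo_records_per_buffer(size_t buffer_bytes, size_t record_bytes)
{
    assert(record_bytes > 0);
    return buffer_bytes / record_bytes;
}
```

With these assumed sizes the compact sample is more than ten times smaller than the union, so the same buffer holds correspondingly more context-switch samples before events are dropped.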

In `@plugins/in_ebpf/traces/sched/handler.c`:
- Around line 138-153: trace_sched_handler currently returns early on failures
from encode_sched_event and flb_input_log_append without calling
flb_log_event_encoder_reset, risking leftover msgpack buffer contamination;
update the failure paths in trace_sched_handler so that immediately before each
early return (after encode_sched_event returns non-zero and after
flb_input_log_append returns -1) you call flb_log_event_encoder_reset(encoder)
to fully clear the encoder state (not just flb_log_event_encoder_reset_record),
ensuring the underlying buffer is cleared before returning.
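
One way to guarantee the reset on every failure path is a single exit label. The sketch below uses a mock encoder rather than the Fluent Bit API: `demo_encoder`, `demo_encode`, `demo_append`, and `demo_handler` are hypothetical stand-ins, and only the control flow is meant to carry over.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the log event encoder and its buffer. */
struct demo_encoder {
    char   buf[128];
    size_t len;
};

static void demo_encoder_reset(struct demo_encoder *enc)
{
    enc->len = 0;
}

/* Appends payload bytes to the encoder buffer; fails when they do not fit. */
static int demo_encode(struct demo_encoder *enc, const char *payload)
{
    size_t n = strlen(payload);

    if (enc->len + n > sizeof(enc->buf)) {
        return -1;
    }
    memcpy(enc->buf + enc->len, payload, n);
    enc->len += n;
    return 0;
}

/* Stand-in for appending to the input; fails on request to model an error. */
static int demo_append(const struct demo_encoder *enc, int simulate_failure)
{
    (void) enc;
    return simulate_failure ? -1 : 0;
}

static int demo_handler(struct demo_encoder *enc, const char *payload,
                        int simulate_append_failure)
{
    int ret = -1;

    assert(enc != NULL && payload != NULL);

    if (demo_encode(enc, payload) != 0) {
        goto done;
    }
    if (demo_append(enc, simulate_append_failure) != 0) {
        goto done;
    }
    ret = 0;

done:
    /* Single exit: the buffer is cleared on success and on every failure. */
    demo_encoder_reset(enc);
    return ret;
}
```

Routing all returns through one label means a later failure path cannot forget the reset, at the cost of a `goto`.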

In `@tests/runtime/in_ebpf_sched_handler.c`:
- Around line 97-117: The test currently only checks values for keys when they
appear in the loop inside verify_decoded_values, so missing keys will not fail
the test; add boolean flags (e.g., found_event_type, found_cpu, found_prev_pid,
found_next_pid, found_runq_latency_ns) and set the appropriate flag inside each
branch that matches key_matches(...) for "event_type", "cpu", "prev_pid",
"next_pid", and "runq_latency_ns"; after the loop, assert each flag is true
using TEST_CHECK (or equivalent) to ensure all required keys were present and
validated against original->details.sched fields.
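
The found-flag idea can be sketched without the test framework. The example below assumes a simplified decoded map of key/value pairs; `demo_kv` and `demo_count_missing` are hypothetical names, and the real test would assert each flag with TEST_CHECK rather than return a count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded pair; the real test walks a msgpack map. */
struct demo_kv {
    const char *key;
    long        value;
};

static const char *const demo_required_keys[] = {
    "event_type", "cpu", "prev_pid", "next_pid", "runq_latency_ns"
};

#define DEMO_REQUIRED_COUNT \
    (sizeof(demo_required_keys) / sizeof(demo_required_keys[0]))

/* Returns how many required keys never appeared in the decoded map. */
static size_t demo_count_missing(const struct demo_kv *map, size_t n)
{
    bool found[DEMO_REQUIRED_COUNT] = { false };
    size_t i;
    size_t k;
    size_t missing = 0;

    assert(map != NULL || n == 0);

    for (i = 0; i < n; i++) {
        for (k = 0; k < DEMO_REQUIRED_COUNT; k++) {
            if (strcmp(map[i].key, demo_required_keys[k]) == 0) {
                found[k] = true;  /* value checks would also go here */
            }
        }
    }
    for (k = 0; k < DEMO_REQUIRED_COUNT; k++) {
        if (!found[k]) {
            missing++;
        }
    }
    return missing;
}
```

A map that omits a key now produces a non-zero count instead of silently passing, which is the failure mode the comment describes.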

📥 Commits

Reviewing files that changed from the base of the PR and between 926bbe7 and fa08b76.

📒 Files selected for processing (8)
  • plugins/in_ebpf/traces/includes/common/encoder.h
  • plugins/in_ebpf/traces/includes/common/events.h
  • plugins/in_ebpf/traces/sched/bpf.c
  • plugins/in_ebpf/traces/sched/handler.c
  • plugins/in_ebpf/traces/sched/handler.h
  • plugins/in_ebpf/traces/traces.h
  • tests/runtime/CMakeLists.txt
  • tests/runtime/in_ebpf_sched_handler.c

Comment thread plugins/in_ebpf/traces/sched/bpf.c
Comment thread plugins/in_ebpf/traces/sched/bpf.c Outdated
Comment thread plugins/in_ebpf/traces/sched/handler.c
Comment thread tests/runtime/in_ebpf_sched_handler.c
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/in_ebpf/traces/sched/bpf.c`:
- Line 76: The code is reading from next->pid which represents the
thread-specific ID in the Linux kernel, but it should read from next->tgid which
is the actual process ID (thread group ID). Change the bpf_core_read call that
reads next_pid to use &next->tgid as the source address instead of &next->pid to
correctly report the process ID rather than the thread ID for multi-threaded
tasks. Apply the same fix to the similar code at lines 104-105.
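
To illustrate why the choice of field matters, here is a userspace sketch with a mocked task record. `mock_task` and both accessors are hypothetical and only mirror the two `task_struct` fields named above; this is not the BPF program itself.

```c
#include <stdint.h>

/* Hypothetical userspace mock of the two task_struct fields discussed above. */
struct mock_task {
    int32_t pid;   /* kernel per-thread ID (what userspace calls the TID) */
    int32_t tgid;  /* thread group ID (what userspace calls the PID) */
};

/* Reading ->pid yields a value that differs for every thread of a process. */
static int32_t reported_thread_id(const struct mock_task *t)
{
    return t->pid;
}

/* Reading ->tgid yields the same value for every thread of a process. */
static int32_t reported_process_id(const struct mock_task *t)
{
    return t->tgid;
}
```

For a thread group leader the two values coincide, so the difference only shows up for the additional threads of a multi-threaded task.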

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b376d7e2-f6a0-46a7-a72e-aafb90eef46c

📥 Commits

Reviewing files that changed from the base of the PR and between fa08b76 and 82e0ffa.

📒 Files selected for processing (8)
  • plugins/in_ebpf/traces/includes/common/encoder.h
  • plugins/in_ebpf/traces/includes/common/events.h
  • plugins/in_ebpf/traces/sched/bpf.c
  • plugins/in_ebpf/traces/sched/handler.c
  • plugins/in_ebpf/traces/sched/handler.h
  • plugins/in_ebpf/traces/traces.h
  • tests/runtime/CMakeLists.txt
  • tests/runtime/in_ebpf_sched_handler.c
✅ Files skipped from review due to trivial changes (1)
  • plugins/in_ebpf/traces/includes/common/encoder.h
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/runtime/CMakeLists.txt
  • plugins/in_ebpf/traces/sched/handler.h
  • plugins/in_ebpf/traces/includes/common/events.h
  • plugins/in_ebpf/traces/traces.h
  • tests/runtime/in_ebpf_sched_handler.c
  • plugins/in_ebpf/traces/sched/handler.c

Comment thread plugins/in_ebpf/traces/sched/bpf.c Outdated
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/in_ebpf/traces/sched/bpf.c (1)

113-116: ⚠️ Potential issue | 🟡 Minor

Asymmetric ID types in prev_pid vs next_pid fields.

prev_pid is populated from prev->pid (kernel thread ID), while next_pid is populated from next->tgid (process ID). This creates inconsistent semantics:

  • event->details.prev_pid contains the thread ID of the previous task
  • event->details.next_pid contains the process ID of the next task

The common struct correctly distinguishes these as separate pid and tid fields, but the details struct has no prev_tid field and no separate prev_tgid. Either populate prev_pid from prev->tgid for consistency, or add a prev_tid field to match the pattern used in common.
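
The second option (separate thread ID fields) could look roughly like the sketch below. It runs in userspace against mocked structs; `sketch_task`, `sketch_sched_details`, the `prev_tid`/`next_tid` fields, and `fill_switch_ids` are all hypothetical names, not the layout used in events.h.

```c
#include <stdint.h>

/* Hypothetical mock of the task_struct fields named in the review. */
struct sketch_task {
    int32_t pid;   /* per-thread ID */
    int32_t tgid;  /* thread group (process) ID */
};

/* Hypothetical details layout with explicit tid fields on both sides. */
struct sketch_sched_details {
    int32_t prev_pid;
    int32_t prev_tid;
    int32_t next_pid;
    int32_t next_tid;
};

/* Populate both sides symmetrically: *_pid from tgid, *_tid from pid. */
static void fill_switch_ids(struct sketch_sched_details *d,
                            const struct sketch_task *prev,
                            const struct sketch_task *next)
{
    d->prev_pid = prev->tgid;
    d->prev_tid = prev->pid;
    d->next_pid = next->tgid;
    d->next_tid = next->pid;
}
```

Keeping both identifiers avoids losing the per-thread information that a scheduler trace may need, at the cost of a larger event record.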

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_ebpf/traces/sched/bpf.c` around lines 113 - 116, The prev_pid and
next_pid fields have inconsistent semantics: prev_pid is populated from
prev->pid (thread ID) while next_pid is populated from next->tgid (process ID).
Fix this asymmetry by ensuring both fields use the same type of identifier.
Change the assignment of event->details.prev_pid to use prev->tgid instead of
prev->pid, making it consistent with how next_pid is populated from next->tgid.
This ensures both fields represent process IDs rather than mixing thread IDs and
process IDs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 944d01d6-172e-4f09-98fa-a25472270811

📥 Commits

Reviewing files that changed from the base of the PR and between 82e0ffa and 93e00a7.

📒 Files selected for processing (1)
  • plugins/in_ebpf/traces/sched/bpf.c
