perf_hooks: emit user timing marks, measures and timerify to trace events#18789
jasnell wants to merge 3 commits into nodejs:master
Didn’t review closely but tests are missing.
src/node_perf.cc
Outdated
IIRC, this is doing a floating point divide which is more expensive than an integer divide. 1000 would be an integral constant.
Confirmed. Floating point divide + 2 conversions between integer and floating point.
#include <stdint.h>
uint64_t foo(uint64_t nanos) {
return nanos / 1e3;
}
// clang -O3 -mllvm --x86-asm-syntax=intel -S foo.c
movq xmm0, rdi
punpckldq xmm0, xmmword ptr [rip + LCPI0_0] ## xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
subpd xmm0, xmmword ptr [rip + LCPI0_1]
haddpd xmm0, xmm0
divsd xmm0, qword ptr [rip + LCPI0_2]
movsd xmm1, qword ptr [rip + LCPI0_3] ## xmm1 = mem[0],zero
movapd xmm2, xmm0
subsd xmm2, xmm1
cvttsd2si rax, xmm2
movabs rcx, -9223372036854775808
xor rcx, rax
cvttsd2si rax, xmm0
ucomisd xmm0, xmm1
cmovae rax, rcx
pop rbp
ret
@ofrobots ... nit fixed!
src/node_perf.cc
Outdated
Question: what would be the correct context here? To me it seems that an AsyncResource or async_context should be created when the observer is constructed, and that should be used on the callback. Do you agree?
Aside: Do our compilers support async_context{0,0} syntax yet?
Still thinking about that. PerformanceObserver currently is not an AsyncResource and only actually exists on the JS side. Making it an AsyncResource would be a minor divergence from the spec but is certainly not out of the question.
And I don't know about the compilers question.
If you don't want to inherit from AsyncResource, you can attach an async_context to the PerformanceObserver instances for the same effect.
I'm okay if you want to do this in a follow-on, as the issue existed before this PR, and is orthogonal.
Added example to #18810 in case it is useful. If the CI passes there, then that would answer the compiler question :).
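For readers following this thread, the idea of creating an async context when the observer is constructed and entering it for every callback can be sketched in JavaScript with the public `AsyncResource` API. `ObserverLike` and its `notify()` method are hypothetical names used only for illustration; this is not the PerformanceObserver implementation from this PR.

```javascript
'use strict';
const { AsyncResource, executionAsyncId } = require('async_hooks');

// Hypothetical stand-in for an observer; not Node's PerformanceObserver.
class ObserverLike {
  constructor(callback) {
    this.callback = callback;
    // Capture an async context once, when the observer is constructed.
    this.resource = new AsyncResource('ObserverLike');
  }

  // Deliver entries to the callback inside the captured context.
  notify(entries) {
    return this.resource.runInAsyncScope(this.callback, this, entries);
  }
}

let idInsideCallback;
const observer = new ObserverLike((entries) => {
  // While the callback runs, the current execution id is the resource's id.
  idInsideCallback = executionAsyncId();
  return entries.length;
});

const delivered = observer.notify(['mark-a', 'mark-b']);
console.log(delivered, idInsideCallback === observer.resource.asyncId());
```

Holding an `AsyncResource` as a field rather than inheriting from it corresponds to the "attach an async_context" variant suggested above.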
Force-pushed from 564a74f to c6ecad5.
@ofrobots ... just went ahead and made
Force-pushed from c6ecad5 to 4108481.
Force-pushed from 4108481 to ac17e74.
CI is looking good with one unrelated failure.
src/tracing/trace_event.h
Outdated
In sync with upstream? Nope, not yet. But that reminds me...
@ofrobots ... ^^ To implement this I needed this macro implemented also.
@jasnell created https://chromium-review.googlesource.com/c/v8/v8/+/924507 that adds the same macro upstream.
Note that for the trace-event macros, a sync with upstream is no longer a necessity. Our copy of trace_event.h is derived from upstream, but has a slightly different implementation. It is okay for them to pick up features at a different pace. Please do continue to ping if we add more macros locally. There is no need to block on upstream.
doc/api/tracing.md
Outdated
The flag should be named --trace-events-categories for consistency.
This flag already exists and is not introduced by this PR.
Force-pushed from ac17e74 to f67e78d.
Adds the `node.perf.usertiming` trace events category for recording usertiming marks and measures (e.g. `perf_hooks.performance.mark()`) in the trace events timeline. Adds the `node.perf.function` trace events category for recording `perf_hooks.performance.timerify()` durations in the trace events timeline.
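As a minimal sketch of the calls this commit message refers to, the script below records two marks, a measure between them, and one call through a `performance.timerify()` wrapper. The function `sumTo` and the mark and measure names are hypothetical and chosen only for illustration.

```javascript
'use strict';
const { performance } = require('perf_hooks');

// Hypothetical workload used only to have something to time.
function sumTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  return total;
}

// User timing: two marks and a measure spanning them.
performance.mark('sum-start');

// timerify() returns a wrapper that times each call to the original function.
const timedSum = performance.timerify(sumTo);
const result = timedSum(1000);

performance.mark('sum-end');
performance.measure('sum', 'sum-start', 'sum-end');

console.log(result);
```

Running the script with trace events enabled for the relevant categories, for example `node --trace-events-enabled --trace-event-categories node.perf.usertiming script.js`, should place the marks and the measure in the generated trace log. The category that covers the `timerify()` call is the one named in the commit message above; whether that name is unchanged in a given Node.js release is an assumption worth checking against that release's tracing documentation.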
Force-pushed from f67e78d to 02adcf7.
PR-URL: #18789
Reviewed-By: Ali Ijaz Sheikh
Reviewed-By: Matteo Collina
Should this be backported to
This depends on #17640 which can't go in v9.x.
This reverts commit 009e418.

AFAIU the discussion at [1], PerformanceObserver had been made to inherit from AsyncResource more or less as a band-aid for lack of a better async_context candidate to invoke it in. In order to enable access to AsyncLocalStores from PerformanceObservers invoked synchronously through e.g. measure() or mark(), the current async_context, if any, should be retained.

Note that this is a breaking change, but
- as has been commented at [1], PerformanceObserver being derived from AsyncResource is a "minor divergence from the spec" anyway,
- to my knowledge this is an internal implementation detail which has never been documented and
- I can't think of a good reason why existing PerformanceObserver implementations would possibly rely on it.

OTOH, it's probably worthwhile to not potentially invoke before() and after() async_hooks for each and every PerformanceObserver notification.

[1] nodejs#18789

Co-Authored-By: ZauberNerd
PR-URL: nodejs#36343
Reviewed-By: James M Snell
Reviewed-By: Anna Henningsen
Reviewed-By: Rich Trott
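The difference this revert targets can be illustrated without PerformanceObserver itself. In the sketch below, a callback is invoked twice inside an `AsyncLocalStorage` scope: once directly, which retains the current context, and once through an `AsyncResource` created outside that scope, which switches to the resource's own context. The store contents and resource name are hypothetical; the example only models the behaviour described in the commit message and makes no claim about how any particular Node.js release schedules observer callbacks.

```javascript
'use strict';
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');

const als = new AsyncLocalStorage();
const seen = [];

// Records which store is visible when the callback runs.
const callback = (label) => {
  seen.push([label, als.getStore()]);
};

// Created outside any als.run() scope, like a long-lived observer would be.
const resource = new AsyncResource('ObserverLike');

als.run({ requestId: 42 }, () => {
  // Direct, synchronous invocation: the caller's store is retained.
  callback('direct');
  // Invocation through the resource: the caller's store is not visible.
  resource.runInAsyncScope(callback, null, 'via-resource');
});

console.log(seen);
```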
Emit user timing marks/measures and performance.timerify() measures to trace events.
/cc @nodejs/diagnostics @mcollina
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
perf_hooks, trace_events