Skip to content

Surface headers for Linking To#190

Merged
ms609 merged 13 commits intotransfer-consensusfrom
expose-lapjv
Mar 23, 2026
Merged

Surface headers for Linking To#190
ms609 merged 13 commits intotransfer-consensusfrom
expose-lapjv

Conversation

@ms609
Copy link
Owner

@ms609 ms609 commented Mar 20, 2026

No description provided.

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 98.13084% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.76%. Comparing base (4a60427) to head (5ae05ee).
⚠️ Report is 1 commits behind head on transfer-consensus.

Files with missing lines Patch % Lines
inst/include/TreeDist/mutual_clustering_impl.h 95.89% 6 Missing ⚠️
Additional details and impacted files
@@                  Coverage Diff                   @@
##           transfer-consensus     #190      +/-   ##
======================================================
- Coverage               95.76%   95.76%   -0.01%     
======================================================
  Files                      53       57       +4     
  Lines                    5267     5406     +139     
======================================================
+ Hits                     5044     5177     +133     
- Misses                    223      229       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link

github-actions bot commented Mar 20, 2026

⚠️ This benchmark result is outdated. See the latest comment below.

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) ⚪ NSD 0.99% 15.5 →
15.4, 15.2
ClusteringInfoDistance(tr50) ⚪ NSD 3.27% 12 →
11.7, 11.5
LAPJV(test2000) 🟣 ~same -1.78% 91.9 →
93.5, 93.7
LAPJV(test40) 🟠 Slower 🙁 -5.96% 0.0148 →
0.0157, 0.0157
LAPJV(test400) 🟢 Faster! 4.11% 3.1 →
2.96, 2.97
MutualClusteringInfo(tr200) ⚪ NSD 2.51% 22 →
21.4, 21.5
MutualClusteringInfo(tr50) ⚪ NSD -0.27% 21.1 →
20.9, 21.7
PathDist(postTrees) ⚪ NSD 3% 3.4 →
3.33, 3.28
PhylogeneticInfoDistance(tr200) 🟣 ~same 0.53% 229 →
228, 229
PhylogeneticInfoDistance(tr50) 🟣 ~same 1.43% 78.3 →
77.3, 77
RobinsonFoulds(tr200) ⚪ NSD -1.01% 2.8 →
2.83, 2.82
RobinsonFoulds(tr200) ⚪ NSD -1.02% 2.59 →
2.63, 2.61
RobinsonFoulds(tr50) ⚪ NSD -0.75% 4.29 →
4.34, 4.29

@github-actions
Copy link

github-actions bot commented Mar 20, 2026

⚠️ This benchmark result is outdated. See the latest comment below.

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) 🟢 Faster! 9.83% 16.1 →
14.5, 14.4
ClusteringInfoDistance(tr50) ⚪ NSD -2.37% 10.3 →
10.4, 10.7
LAPJV(test2000) 🟠 Slower 🙁 -14.32% 100 →
114, 115
LAPJV(test40) 🟠 Slower 🙁 -9.67% 0.0143 →
0.0157, 0.0155
LAPJV(test400) 🟠 Slower 🙁 -5.7% 3.02 →
3.2, 3.18
MutualClusteringInfo(tr200) ⚪ NSD 5.57% 22.2 →
21, 20.9
MutualClusteringInfo(tr50) ⚪ NSD -1.94% 21.4 →
21.9, 21.7
PathDist(postTrees) 🟠 Slower 🙁 -38.89% 3.4 →
4.72, 4.72
PhylogeneticInfoDistance(tr200) 🟣 ~same 1.01% 232 →
229, 230
PhylogeneticInfoDistance(tr50) 🟢 Faster! 6.86% 74.5 →
69.4, 69.4
RobinsonFoulds(tr200) ⚪ NSD 0.34% 2.68 →
2.68, 2.66
RobinsonFoulds(tr200) ⚪ NSD -0.32% 2.39 →
2.42, 2.39
RobinsonFoulds(tr50) ⚪ NSD -0.89% 4.03 →
4.07, 4.05

@github-actions
Copy link

github-actions bot commented Mar 22, 2026

⚠️ This benchmark result is outdated. See the latest comment below.

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) ⚪ NSD 0.69% 15.7 →
15.6, 15.5
ClusteringInfoDistance(tr50) ⚪ NSD -0.02% 10.2 →
10.2, 10.3
LAPJV(test2000) 🟠 Slower 🙁 -21.73% 101 →
128, 121
LAPJV(test40) 🟠 Slower 🙁 -11.68% 0.0142 →
0.0159, 0.0159
LAPJV(test400) 🟠 Slower 🙁 -11.8% 3.04 →
3.42, 3.36
MutualClusteringInfo(tr200) ⚪ NSD -1.52% 21.9 →
21.9, 22.3
MutualClusteringInfo(tr50) ⚪ NSD -0.96% 21.2 →
21.4, 21.4
PathDist(postTrees) ⚪ NSD -1.3% 3.3 →
3.27, 3.36
PhylogeneticInfoDistance(tr200) 🟣 ~same 0.12% 235 →
234, 234
PhylogeneticInfoDistance(tr50) 🟣 ~same -2.01% 71.6 →
72.4, 73.3
RobinsonFoulds(tr200) ⚪ NSD 0.47% 2.66 →
2.65, 2.65
RobinsonFoulds(tr200) ⚪ NSD 0.66% 2.38 →
2.39, 2.36
RobinsonFoulds(tr50) ⚪ NSD -0.25% 4 →
4.03, 3.98

lap_impl.h: add __attribute__((optimize("align-functions=64",
"align-loops=16"))) on lap() (GCC only) to stabilise instruction
alignment across TU layout changes.

cost_matrix.h: add setWithTranspose() and markTransposed() methods.

lap.cpp: use combined fill+transpose in lapjv() to eliminate the extra
makeTranspose() O(n²) pass introduced by the header refactor.

tree_distance_functions.cpp: add cpp_mci_impl_score() wrapper that calls
TreeDist::mutual_clustering_score() from the installable header, covering
find_exact_matches_raw() and the MCI score computation.

test-mci_impl.R: exercises exact-match early exit and partial-match + LAP
paths through the header implementation.
@github-actions
Copy link

github-actions bot commented Mar 22, 2026

⚠️ This benchmark result is outdated. See the latest comment below.

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) 🟢 Faster! 7.02% 15.6 →
14.5, 14.6
ClusteringInfoDistance(tr50) ⚪ NSD -4.63% 10.1 →
10.6, 10.7
LAPJV(test2000) 🟠 Slower 🙁 -9.78% 109 →
119, 119
LAPJV(test40) 🟠 Slower 🙁 -5.36% 0.0143 →
0.015, 0.0151
LAPJV(test400) 🟠 Slower 🙁 -9.19% 3.11 →
3.42, 3.35
MutualClusteringInfo(tr200) 🟢 Faster! 5.28% 22 →
20.8, 20.7
MutualClusteringInfo(tr50) ⚪ NSD -1.41% 21.4 →
21.7, 21.7
PathDist(postTrees) ⚪ NSD 1.15% 3.31 →
3.27, 3.28
PhylogeneticInfoDistance(tr200) 🟣 ~same 0.18% 234 →
234, 233
PhylogeneticInfoDistance(tr50) 🟠 Slower 🙁 -5.9% 71.9 →
76.3, 76
RobinsonFoulds(tr200) ⚪ NSD 1.97% 2.69 →
2.64, 2.64
RobinsonFoulds(tr200) ⚪ NSD 1.85% 2.41 →
2.37, 2.35
RobinsonFoulds(tr50) ⚪ NSD 1.36% 4.05 →
4.02, 3.97

PKG_CXXFLAGS is overridden by ASAN/ubsan ~/.R/Makevars on CI,
dropping the inst/include path.  PKG_CPPFLAGS is not overridden.
@github-actions
Copy link

github-actions bot commented Mar 22, 2026

⚠️ This benchmark result is outdated. See the latest comment below.

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) 🟣 ~same -3.87% 15.2 →
15.6, 16
ClusteringInfoDistance(tr50) ⚪ NSD 0% 12.4 →
12.5, 12.2
LAPJV(test2000) 🟠 Slower 🙁 -15.51% 90.8 →
106, 104
LAPJV(test40) 🟠 Slower 🙁 -6.72% 0.0146 →
0.0155, 0.0157
LAPJV(test400) 🟠 Slower 🙁 -4.93% 2.93 →
3.08, 3.06
MutualClusteringInfo(tr200) ⚪ NSD -1.32% 23.4 →
23.4, 23.7
MutualClusteringInfo(tr50) ⚪ NSD 1.27% 25.6 →
25.4, 25.2
PathDist(postTrees) ⚪ NSD 2.79% 3.55 →
3.48, 3.4
PhylogeneticInfoDistance(tr200) ⚪ NSD 2.88% 248 →
240, 241
PhylogeneticInfoDistance(tr50) 🟣 ~same -2.38% 79 →
81, 80.6
RobinsonFoulds(tr200) ⚪ NSD -0.83% 2.97 →
3, 2.98
RobinsonFoulds(tr200) ⚪ NSD -0.98% 2.73 →
2.74, 2.77
RobinsonFoulds(tr50) ⚪ NSD -1.24% 4.54 →
4.58, 4.6

ms609 added 2 commits March 23, 2026 06:37
lap_impl.h is for downstream LinkingTo consumers only.  Including it in
TreeDist's own lap.cpp changed the TU context enough for GCC 14's register
allocator to produce ~8% more instructions in the Dijkstra hot loop,
causing a 20-25% regression on standalone LAPJV (n >= 400).

Fix: define lap() directly in lap.cpp (matching main's pattern) and add
GCC align-functions=64 / align-loops=16 attributes.  The lapjv() wrapper
now fills the transposed buffer first (matching R's column-major storage)
then untransposes — restoring the cache-friendly construction pattern.

Residual: ~5-9% on LAPJV 1999x1999 vs main, from the different CostMatrix
class definition visible through the installed headers (different method
set, namespace wrapping).  Tree distance metrics (CID, MSD, PID, etc.)
are unaffected — they call lap() from pairwise_distances.cpp.

cost_matrix.h: add dim8() accessor (needed by lapjv() matrix fill).
@github-actions
Copy link

github-actions bot commented Mar 23, 2026

⚠️ This benchmark result is outdated. See the latest comment below.

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) ⚪ NSD -1.71% 15 →
15.3, 15.2
ClusteringInfoDistance(tr50) ⚪ NSD -1.97% 11.5 →
11.8, 11.7
LAPJV(test2000) 🟠 Slower 🙁 -17.23% 81.6 →
93.4, 103
LAPJV(test40) 🟠 Slower 🙁 -20.93% 0.0146 →
0.0176, 0.0178
LAPJV(test400) ⚪ NSD -12.5% 2.86 →
3.21, 3.75
MutualClusteringInfo(tr200) ⚪ NSD -1.43% 21.6 →
21.6, 22.1
MutualClusteringInfo(tr50) ⚪ NSD -1.89% 22.4 →
22.8, 22.8
PathDist(postTrees) 🟠 Slower 🙁 -50.98% 3.47 →
5.24, 5.23
PhylogeneticInfoDistance(tr200) 🟣 ~same 2.52% 238 →
232, 232
PhylogeneticInfoDistance(tr50) 🟣 ~same -1.6% 78.9 →
80.3, 80.1
RobinsonFoulds(tr200) ⚪ NSD -1.06% 2.8 →
2.82, 2.85
RobinsonFoulds(tr200) ⚪ NSD -2.15% 2.57 →
2.62, 2.63
RobinsonFoulds(tr50) ⚪ NSD -1.21% 4.3 →
4.37, 4.35

ms609 added 3 commits March 23, 2026 07:12
The expanded lap.o (direct lap() implementation) shifted the DLL layout
enough to regress PathDist by ~50% on Linux/GCC -O3 — a function in a
completely different TU that was a linker-level alignment casualty.

Per-function __attribute__((optimize("align-functions=64"))) only
controls alignment within the object file; it cannot prevent neighbouring
functions from being placed at unfavorable offsets when another .o changes
size.  A global -falign-functions=64 in PKG_CXXFLAGS ensures every
function entry point is 64-byte aligned regardless of link order.

Removed the per-function attribute from lap.cpp (now redundant).
Kept it in lap_impl.h for downstream LinkingTo consumers.
The 'Initialize ASan configuration' step overwrote src/Makevars with '>',
destroying PKG_CPPFLAGS = -I../inst/include added by the expose-lapjv
branch.  Use sed to replace PKG_CXXFLAGS in-place instead, preserving
PKG_CPPFLAGS and PKG_LIBS.
CRAN's R CMD check flags this as non-portable (GCC-specific).
Tree distance metrics were unaffected by the alignment regression;
only standalone LAPJV microbenchmarks showed sensitivity.
@github-actions
Copy link

Performance benchmark results

Call Status Change Time (ms)
ClusteringInfoDistance(tr200) ⚪ NSD -2.83% 15.3 →
15.6, 15.7
ClusteringInfoDistance(tr50) ⚪ NSD -4.29% 11.5 →
11.8, 12.1
LAPJV(test2000) 🟠 Slower 🙁 -12.1% 85.1 →
97.2, 95
LAPJV(test40) 🟠 Slower 🙁 -18.64% 0.0145 →
0.0172, 0.0172
LAPJV(test400) 🟠 Slower 🙁 -9.1% 2.88 →
3.14, 3.15
MutualClusteringInfo(tr200) ⚪ NSD -2.73% 22.1 →
22.5, 22.9
MutualClusteringInfo(tr50) ⚪ NSD -4.18% 23.6 →
24.3, 25.1
PathDist(postTrees) ⚪ NSD 0.19% 3.47 →
3.48, 3.4
PhylogeneticInfoDistance(tr200) 🟢 Faster! 4.06% 241 →
232, 230
PhylogeneticInfoDistance(tr50) ⚪ NSD -0.36% 78 →
78.4, 78.3
RobinsonFoulds(tr200) ⚪ NSD -0.6% 2.84 →
2.85, 2.85
RobinsonFoulds(tr200) ⚪ NSD -1.14% 2.6 →
2.63, 2.62
RobinsonFoulds(tr50) ⚪ NSD -0.77% 4.34 →
4.37, 4.38

@ms609 ms609 marked this pull request as ready for review March 23, 2026 08:04
@ms609 ms609 merged commit 9e01c6e into transfer-consensus Mar 23, 2026
22 of 23 checks passed
@ms609 ms609 deleted the expose-lapjv branch March 23, 2026 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant