From 8833d5b6e977a28a690b8207e2f820cc69181f13 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 27 Feb 2026 12:30:50 +0000
Subject: [PATCH 1/2] Initial plan


From 382778895d38c997ba76e31ae6321a45c55fba8f Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 27 Feb 2026 12:44:48 +0000
Subject: [PATCH 2/2] reduce token consumption in duplicate-code-detector
 workflow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Change schedule from daily to weekly (Sunday ~03:00) to cut baseline frequency 7×
- Restrict analysis scope to pkg/ and actions/ directories only
- Cap files analyzed per run at 20 (most recently touched first, by commit order)
- Add early-exit sample check (step 2a): scan up to 5 files first;
  stop immediately with noop if no high-confidence duplicates are found
- Update Analysis Depth section to reflect new scope constraints
- Regenerate lock file via make recompile

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
---
 .../duplicate-code-detector.lock.yml          |  6 +++---
 .github/workflows/duplicate-code-detector.md  | 19 +++++++++++++++----
 2 files changed, 18 insertions(+), 7 deletions(-)
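
Reviewer note: the scope rules this patch adds (directory restriction, 20-file cap, early exit on a sample of up to 5 files) could be sketched roughly as below. This is an illustrative sketch only, not code from the repository: the function names, the oldest-first input list, and the `has_high_confidence_duplicate` callback are hypothetical stand-ins for whatever the agent actually does when it follows the prompt, and only the Go and .cjs test-file patterns are modeled.

```python
from pathlib import PurePosixPath
from typing import Callable

ALLOWED_DIRS = ("pkg", "actions")        # directory scope from step 2
ALLOWED_SUFFIXES = (".go", ".cjs")       # file type restriction
TEST_DIR_NAMES = {"test", "tests", "__tests__", "spec"}
TEST_NAME_SUFFIXES = ("_test.go", ".test.cjs", ".spec.cjs")
MAX_FILES = 20                           # file cap per run
SAMPLE_SIZE = 5                          # early-exit sample size (step 2a)


def qualifies(path: str) -> bool:
    """Return True if a repo-relative path is in scope for analysis."""
    p = PurePosixPath(path)
    parts = p.parts
    # Must live inside pkg/ or actions/ (not merely be named that).
    if len(parts) < 2 or parts[0] not in ALLOWED_DIRS:
        return False
    if p.suffix not in ALLOWED_SUFFIXES:
        return False
    if p.name.endswith(TEST_NAME_SUFFIXES):
        return False
    # Skip anything under a test-like directory.
    return not any(part in TEST_DIR_NAMES for part in parts[:-1])


def select_files(changed_oldest_first: list[str]) -> list[str]:
    """Pick at most MAX_FILES qualifying files, most recently touched first.

    Assumes the input lists changed paths in commit order (oldest first)
    and may repeat a path that several commits touched.
    """
    selected: list[str] = []
    seen: set[str] = set()
    for path in reversed(changed_oldest_first):
        if path in seen or not qualifies(path):
            continue
        seen.add(path)
        selected.append(path)
        if len(selected) == MAX_FILES:
            break
    return selected


def plan_run(
    changed_oldest_first: list[str],
    has_high_confidence_duplicate: Callable[[str], bool],  # hypothetical scan
) -> tuple[str, list[str]]:
    """Return ("noop", []) when the sample is clean, else ("full", files)."""
    files = select_files(changed_oldest_first)
    sample = files[:SAMPLE_SIZE]
    if not any(has_high_confidence_duplicate(f) for f in sample):
        return "noop", []
    return "full", files
```

One consequence worth noting: with this reading, a duplicate that only appears in files 6 through 20 is never seen, because the run stops after the clean 5-file sample. That is the token-saving trade-off the patch describes.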

diff --git a/.github/workflows/duplicate-code-detector.lock.yml b/.github/workflows/duplicate-code-detector.lock.yml
index ac22947714a..da62649be67 100644
--- a/.github/workflows/duplicate-code-detector.lock.yml
+++ b/.github/workflows/duplicate-code-detector.lock.yml
@@ -27,13 +27,13 @@
 #   Imports:
 #     - shared/mcp/serena-go.md
 #
-# gh-aw-metadata: {"schema_version":"v1","frontmatter_hash":"fcd0677bc45a2e116662616286ca59ac108757b01a1588e6c83c767465fc9871"}
+# gh-aw-metadata: {"schema_version":"v1","frontmatter_hash":"abe5620bd6caaf355981bc59f52b3cc4752649f018ed2c5074cca826f74e07dc"}
 
 name: "Duplicate Code Detector"
 "on":
   schedule:
-  - cron: "9 3 * * *"
-    # Friendly format: daily (scattered)
+  - cron: "9 3 * * 0"
+    # Friendly format: weekly on sunday around 03:00 (scattered)
   workflow_dispatch:
 
 permissions: {}
diff --git a/.github/workflows/duplicate-code-detector.md b/.github/workflows/duplicate-code-detector.md
index ba3974afa18..e0a402dd857 100644
--- a/.github/workflows/duplicate-code-detector.md
+++ b/.github/workflows/duplicate-code-detector.md
@@ -3,7 +3,7 @@ name: Duplicate Code Detector
 description: Identifies duplicate code patterns across the codebase and suggests refactoring opportunities
 on:
   workflow_dispatch:
-  schedule: daily
+  schedule: weekly on sunday around 03:00
 permissions:
   contents: read
   issues: read
@@ -54,12 +54,21 @@ Activate the project in Serena:
 Identify and analyze modified files:
 - Determine files changed in the recent commits
 - **ONLY analyze .go and .cjs files** - exclude all other file types
+- **ONLY analyze files inside `pkg/` or `actions/` directories** - skip all files outside these directories (e.g., `vendor/`, `node_modules/`, root-level files)
 - **Exclude JavaScript files except .cjs** from analysis (files matching patterns: `*.js`, `*.mjs`, `*.jsx`, `*.ts`, `*.tsx`)
 - **Exclude test files** from analysis (files matching patterns: `*_test.go`, `*.test.js`, `*.test.cjs`, `*.spec.js`, `*.spec.cjs`, `*.test.ts`, `*.spec.ts`, `*_test.py`, `test_*.py`, or located in directories named `test`, `tests`, `__tests__`, or `spec`)
 - **Exclude workflow files** from analysis (files under `.github/workflows/*`)
+- **Cap at 20 files**: if more than 20 qualifying files changed, analyze only the 20 most recently touched, ordered by the most recent commit that modified each file (note that plain `git diff` output is sorted by path, not by recency)
 - Use `get_symbols_overview` to understand file structure
 - Use `read_file` to examine modified file contents
 
+### 2a. Early-Exit Sample Check
+
+Before performing a full analysis, apply a quick sample check:
+- Select up to 5 files from step 2 and perform a rapid duplicate scan on those files only
+- If **zero** high-confidence duplicate patterns are found in the sample, **stop immediately** and call the `noop` tool — do not continue to the full analysis. A "high-confidence" duplicate means: >10 lines of near-identical code OR 3+ instances of the same pattern across different files.
+- Only proceed to step 3 if at least one high-confidence duplicate pattern is found in the sample
+
 ### 3. Duplicate Detection
 
 Apply semantic code analysis to find duplicates:
@@ -151,10 +160,12 @@ Create separate issues for each distinct duplication pattern found (maximum 3 pa
 
 ### Analysis Depth
 
+- **Directory Scope**: ONLY analyze files inside `pkg/` and `actions/` directories — do NOT traverse any directory outside these two
 - **File Type Restriction**: ONLY analyze .go and .cjs files - ignore all other file types
-- **Primary Focus**: All .go and .cjs files changed in the current push (excluding test files and workflow files)
-- **Secondary Analysis**: Check for duplication with existing .go and .cjs codebase (excluding test files and workflow files)
-- **Cross-Reference**: Look for patterns across .go and .cjs files in the repository
+- **File Cap**: Analyze at most 20 files per run; stop early if no high-confidence duplicates are found in a sample of up to 5 files (see step 2a)
+- **Primary Focus**: All .go and .cjs files changed in the current push (excluding test files and workflow files) that are within `pkg/` or `actions/`
+- **Secondary Analysis**: Check for duplication with existing .go and .cjs codebase (excluding test files and workflow files) within `pkg/` or `actions/`
+- **Cross-Reference**: Look for patterns across .go and .cjs files within `pkg/` or `actions/`
 - **Historical Context**: Consider if duplication is new or existing
 
 ## Issue Template
