Skip to content

feat(review-pr): auto-filter low-risk files before chunking#215

Open
derekmisler wants to merge 3 commits into
mainfrom
feat/auto-filter-large-diffs
Open

feat(review-pr): auto-filter low-risk files before chunking#215
derekmisler wants to merge 3 commits into
mainfrom
feat/auto-filter-large-diffs

Conversation

@derekmisler
Copy link
Copy Markdown
Contributor

Summary

Fixes the root cause of large-PR review timeouts by making the risk-scoring pipeline automatically filter out low-risk files before diff chunking, without requiring per-repo exclude-paths configuration.

Root causes addressed

  1. Risk scoring happened after chunking — scores were computed but never used to filter the diff. Every file (including test files, generated code, etc.) got chunked and processed by the full multi-agent pipeline.

  2. TEST_FILE_RE in score-risk.ts missed Rust/Ruby test files — Rust test/bench/spec files and Ruby spec files scored non-zero and were included in chunks unnecessarily.

Changes

Change 1 — Extend TEST_FILE_RE in src/score-risk/score-risk.ts

Extended the regex to also match:

  • Rust suffix patterns: _test.rs, _bench.rs, _spec.rs
  • Ruby spec files: _spec.rb
  • Any file under a tests/, test/, benches/, __tests__/, or specs/ directory component

Added 9 new Vitest tests covering these new patterns.

Change 2 — New module src/auto-filter-diff/

Pure TypeScript module with two-phase filtering:

  • Phase 1 (auto-exclude): Removes all score-0 files. Unknown files (not in riskScores) are kept as they need review.
  • Phase 2 (progressive cap): If remaining diff > maxDiffLines (default 3000), removes lowest-risk files first until it fits. Always keeps at least 1 file.

Includes 18 Vitest unit tests covering all phases and edge cases.

Change 3 — Wire up in review-pr/action.yml

  • Added max-diff-lines input (default "3000", set "0" to disable)
  • Moved "Score file risk" step to run before "Split diff into chunks"
  • Added new "Auto-filter low-risk files" step between scoring and chunking

New step order: Filter excluded paths → Score file risk → Auto-filter → Split into chunks → Generate file history

Change 4 — Propagate through .github/workflows/review-pr.yml

Added max-diff-lines as a workflow-level number input (default 3000) and passed it through to the review-pr action call.

Change 5 — tsup.config.ts

Added 'src/auto-filter-diff/index.ts' to the entry map so tsup bundles it as dist/auto-filter-diff.js.

Test results

Test Files  20 passed (20)
     Tests  358 passed (358)

Biome formatting and TypeScript type-check pass cleanly. The pre-existing actionlint warning about the node24 runner in test-e2e.yml is unrelated to this PR (present on main before these changes).

- Extend TEST_FILE_RE in score-risk.ts to match Rust test/bench/spec files
  (_test.rs, _bench.rs, _spec.rs), Ruby spec files (_spec.rb), and any file
  under tests/, test/, benches/, __tests__/, or specs/ directory components.
  These files now score 0 and are automatically excluded from review.

- Add new src/auto-filter-diff/ module (pure TypeScript + Vitest tests):
  Phase 1 removes all score-0 files; Phase 2 progressively removes
  lowest-risk files to meet an optional maxDiffLines cap (default 3000).
  Files not in riskScores are treated as unknown and kept for review.

- Wire auto-filter into review-pr/action.yml:
  - Add max-diff-lines input (default 3000, set 0 to disable)
  - Move 'Score file risk' BEFORE 'Split diff into chunks' so scores
    are available when the new 'Auto-filter low-risk files' step runs
  - New step runs after scoring, before chunking

- Propagate max-diff-lines through .github/workflows/review-pr.yml
  (workflow-level input, passed to the review-pr action call)

- Add auto-filter-diff entry to tsup.config.ts

All 358 tests pass; Biome + tsc clean.
Copy link
Copy Markdown
Contributor

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

This PR correctly implements a two-phase auto-filter mechanism and moves risk scoring before diff chunking. The new TEST_FILE_RE additions for Rust/Ruby patterns look correct, and the tsup.config.ts entry is properly wired. One medium-severity logic issue was found in Phase 2 of autoFilterDiff.

});

// slice(0, -1) ensures we never remove the last (highest-risk) file.
const removable = sortedAsc.slice(0, -1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Phase 2 progressive cap can remove unknown files — violates "unknown = needs review" invariant

slice(0, -1) only protects the single last element of sortedAsc from the removable set. When multiple unknown files exist (not present in riskScores), they all receive Infinity as their sort key and cluster at the end of the ascending sort. Only the very last one is excluded from removable; all others are candidates for removal.

Trigger scenario: A PR has 3 new files that haven't been scored yet (common for a first PR touching a new directory). sortedAsc = [unknownA, unknownB, unknownC]. removable = sortedAsc.slice(0, -1) = [unknownA, unknownB]. If totalLines > maxDiffLines, unknownA and unknownB are dropped despite the spec requiring unknowns to be kept for review.

Similarly with 1 scored file + 2 unknowns: sortedAsc = [scored(1), unknown1, unknown2]. removable = [scored(1), unknown1]. After scored is removed, unknown1 gets removed next — a false exclusion.

Fix: Filter removable to exclude files not present in riskScores:

const removable = sortedAsc
  .slice(0, -1)
  .filter((s) => riskScores[s.path] !== undefined);

The existing test ('treats unknown files as high-priority to keep') only tests one unknown file against one scored file — the multi-unknown case is not covered.

Copy link
Copy Markdown
Contributor Author

@derekmisler derekmisler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

No new findings from this review pass the quality gate. One confirmed bug is already flagged in an existing inline comment (line 164 of src/auto-filter-diff/auto-filter-diff.ts): the Phase 2 progressive cap uses slice(0, -1) to protect the last file in the ascending-risk sort, but when multiple unscored files exist they all sort to Infinity and only the very last is protected — the others remain removable, violating the "unknown = needs review" invariant.

All other items examined were verified clean: missing set -euo pipefail in the auto-filter step (single-command step, no functional impact), remainingLines preamble accounting (no production trigger path), TEST_FILE_RE regex (works correctly), step ordering (correct: filter → score → auto-filter → chunk), lock-check guard asymmetry (harmless: pr.diff is never created when skip=true), and extractFilePath edge cases (correct for normal paths).

Docker Agent added 2 commits May 21, 2026 19:53
Phase 2 was using slice(0, -1) which only protects the single last
element of sortedAsc. When multiple files have no entry in riskScores
they all sort to the end (Infinity key), so all but the very last were
still candidates for removal — violating the 'unknown = needs review'
invariant.

Fix: additionally filter removable to only include files that have an
explicit entry in riskScores. Unknown files are now unconditionally
kept regardless of the progressive cap.

Also adds a regression test: 3 unknown files + 1 scored file with the
diff over the cap. The test asserts all three unknown files are kept
and only the scored file is a removal candidate.
Tests that the auto-filter pipeline correctly:
- Excludes 15 Rust test/bench files under crates/vm/tests/integration/
  (all score 0 via the extended TEST_FILE_RE directory-component rule)
- Keeps 2 Go files that score >0 (error-handling pattern in content)
- Produces a single chunk containing only the 2 remaining files
- The remaining diff includes a planted SQL injection in pkg/storage/db.go
  (fmt.Sprintf with unsanitised user input) for the agent to catch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants