Skip to content

feat: accurate risk scoring — differentiate Added/Modified/Deleted in detect_changes#416

Open
marxo126 wants to merge 9 commits intoabhigyanpatwari:mainfrom
marxo126:fix/detect-changes-weighted-risk
Open

feat: accurate risk scoring — differentiate Added/Modified/Deleted in detect_changes#416
marxo126 wants to merge 9 commits intoabhigyanpatwari:mainfrom
marxo126:fix/detect-changes-weighted-risk

Conversation

@marxo126
Copy link

Summary

detect_changes previously marked all symbols in changed files as "Modified", making risk scores meaningless for feature branches. A branch adding 300 new widgets and intents would score HIGH risk — the same as a branch that deletes 50 core functions.

This PR fixes that by:

  • Classifying each symbol as Added, Modified, or Deleted using tree-sitter diffing
  • Weighting risk by actual danger: score = (modified_with_callers × 3) + (deleted × 5)
  • Adding a human-readable explanation field to the summary

How it works

  1. Added files (git diff --diff-filter=A): all symbols → "Added" (zero parsing needed)
  2. Modified files (git diff --diff-filter=M): parse the old version with tree-sitter via git show base_ref:file, extract symbol names, diff against current index → each symbol classified as "Added" (new in this file) or "Modified" (existed before)
  3. Deleted files (git diff --diff-filter=D): parse old version to identify removed symbols → "Deleted"
  4. Risk formula: only modified symbols with upstream callers and deleted symbols contribute to risk. New code cannot break existing code.

Real-world results

PricePal v1.3 (iOS 26 app — added widgets, intents, services)

Before (broken) After (this PR)
Added 0 514
Modified 711 (everything) 188
Deleted 0 9
Risk HIGH (meaningless) HIGH (57) — justified
Explanation (none) "514 new symbols added (low risk — no existing code depends on them yet). 188 symbols modified, affecting 3 execution flows. 9 symbols removed — any code that calls them will break."

CuePrompt PiP rewrite (macOS/iOS teleprompter — heavy refactor)

Before (broken) After (this PR)
Added 0 2
Modified 120 (everything) 73
Deleted 0 45
Risk some process count CRITICAL (261) — correct
Explanation (none) "2 new symbols added (low risk). 73 symbols modified, affecting 26 execution flows. 45 symbols removed — any code that calls them will break."

PricePal scored HIGH because it's mostly new code with a few core files touched. CuePrompt scored CRITICAL because 45 symbols were deleted during a PiP rewrite — that's genuinely dangerous. The old tool would have given similar meaningless scores for both.

Changes

src/mcp/local/local-backend.ts

  • extractSymbolNamesFromContent() — new helper: parses file content with tree-sitter and returns Set<"Type:name"> for diffing
  • detectChanges() — rewritten:
    • Uses --diff-filter=A/M/D instead of --name-only
    • For modified files: parses old version at base_ref to diff symbol sets
    • For deleted files: parses old version to identify removed symbols
    • Extracts node type from ID prefix (LadybugDB doesn't support labels())
    • Weighted risk formula instead of simple process count
    • Human-readable explanation in summary
    • Skips process lookup for Added symbols (optimization — no callers possible)

src/mcp/tools.ts

  • Updated tool description to document change types and risk formula

New summary fields

{
  "changed_count": 711,
  "added_count": 514,
  "modified_count": 188,
  "deleted_count": 9,
  "affected_count": 3,
  "changed_files": 39,
  "risk_level": "high",
  "risk_score": 57,
  "explanation": "514 new symbols added (low risk — no existing code depends on them yet). 188 symbols modified, affecting 3 execution flows. 9 symbols removed — any code that calls them will break."
}

Backward compatibility

  • changed_symbols[].change_type was always "Modified" before — now it can be "Added", "Modified", or "Deleted". Consumers that only checked for "Modified" will still work.
  • New summary fields (added_count, modified_count, deleted_count, risk_score, explanation) are additive — existing consumers won't break.
  • risk_level values unchanged: low, medium, high, critical.

Test plan

  • Tested on PricePal (61-file iOS 26 app, 39 changed files, 711 symbols)
  • Tested on CuePrompt (19-file macOS/iOS app, 4 changed files, 120 symbols)
  • All 3,589 existing tests pass (2 pre-existing Swift return-type failures unrelated)
  • TypeScript compiles clean
  • Works across all scopes: compare, staged, all, unstaged

Fixes #415

🤖 Generated with Claude Code

marxo126 and others added 9 commits March 21, 2026 19:52
The patch script fails to parse tree-sitter-swift@0.6.0's binding.gyp
because the file contains both Python-style # comments AND trailing
commas in JSON arrays. The existing regex strips # comments but leaves
trailing commas, causing JSON.parse() to fail with:

  "Unexpected token ']'"

This silently prevents tree-sitter-swift from building, which means
Swift files are skipped entirely during analysis.

Fix: add a second regex pass to strip trailing commas before ] or }
after comment removal.

Fixes abhigyanpatwari#386, abhigyanpatwari#406

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three changes that together enable cross-file call resolution for Swift:

1. export-detection.ts: Treat internal (default) Swift symbols as exported.
   Swift's default access level is `internal` (module-scoped, visible to
   all files in the same target). Only private/fileprivate are file-scoped.
   Previously all non-public/open symbols were marked unexported.

2. import-processor.ts: Add implicit import edges between all Swift files
   in the same module/target. Swift has no file-level imports — all files
   see each other automatically. Without these edges, the tiered resolver
   can't find cross-file symbols at Tier 2a (import-scoped).
   Supports SPM targets via Package.swift; falls back to single-module
   for Xcode projects without SPM.

3. call-processor.ts: Add constructor fallback for free-form calls.
   Swift constructors look like free function calls (no `new` keyword):
   `let ocr = OCRService()`. The call form is inferred as `free`, which
   filters out Class/Struct targets. Now retries with `constructor` form
   when free-form finds no callable but the name resolves to a type.

Tested on 61-file iOS 26 project (PricePal):
- Before: 0 cross-file CALLS edges
- After: full cross-file resolution (OCRService traced from ScanViewModel)
- 3,099 nodes, 10,449 edges, 246 clusters, 243 flows

Related: abhigyanpatwari#406, abhigyanpatwari#407

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract shared addSwiftImplicitImports() helper (DRY — was duplicated
  in processImports and processImportsFromExtracted)
- Cache importMap.get(srcFile) outside inner loop (avoids redundant
  Map lookups per iteration)
- Fix export-detection: use \bprivate\b regex instead of includes()
  to avoid substring false positives
- Fix groupSwiftFilesByTarget: check path boundary with indexOf + char
  check instead of loose includes()

All 141 relevant tests pass (queries + imports + calls).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Swift extensions create multiple Class nodes with the same name
(e.g. Product.swift + ProductMatchableConformance.swift), the call
resolver gets multiple candidates and refuses to emit a CALLS edge.

Add dedup: when all candidates share the same type (Class/Struct) and
differ only by file, prefer the primary definition (shortest filepath).

Note: This fix is partial — some constructor calls inside function
bodies may still be consumed by the type-env constructor binding
scanner before reaching resolveCallTarget. Filed as known limitation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Upgrades:
- tree-sitter: ^0.21.0 → ^0.22.4
- tree-sitter-swift: ^0.6.0 → ^0.7.1

Benefits:
- tree-sitter-swift@0.7.1 ships prebuilds (no manual node-gyp needed)
- Fixes parsing of Swift 5.9+ features: #Predicate, typed throws, ~Copyable
- All 13 language parsers verified compatible with tree-sitter@0.22.4

Tested:
- All 141 query/import/call tests pass
- All 13 parsers (C, C++, C#, Go, Java, JS, Kotlin, PHP, Python, Ruby,
  Rust, TypeScript, Swift) parse correctly with 0.22.4
- PricePal (61-file iOS 26 project) indexes fully: 3,094 nodes, 10,459 edges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix assignment query: tree-sitter-swift 0.7.1 uses named fields
  (target:/result:/suffix:) instead of positional children
- Update export detection tests: Swift `internal` (default) is now
  correctly treated as exported (module-scoped visibility)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
detect_changes previously marked ALL symbols in changed files as
"Modified", making risk scores useless for feature branches (e.g.,
adding 300 new symbols scored HIGH when actual risk was LOW).

Changes:
- Use git diff --diff-filter to classify files as Added/Modified/Deleted
- For added files: all symbols → change_type "Added" (free, no parsing)
- For modified files: parse old version with tree-sitter, diff symbol
  names to classify each as Added/Modified/Deleted within the file
- For deleted files: parse old version to identify lost symbols
- New weighted risk formula:
  score = (modified_with_callers × 3) + (deleted × 5) + (added × 0.1)
- Summary now includes added_count, modified_count, deleted_count
- Skip process lookup for Added symbols (no callers can exist yet)

Fixes abhigyanpatwari#415

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LadybugDB doesn't support Cypher labels() — node types are encoded
in the ID prefix (e.g., "Struct:path:name"). Extract type from ID
to correctly match symbol keys between old and new versions.

Also move baseRef computation outside the modified-files loop and
fix deleted-files to reuse the same baseRef.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add explanation field to summary with plain-language risk description
- Fix modifiedWithCallers to only count symbols that actually participate
  in execution flows (was counting ALL modified when ANY flow existed)
- Remove added symbols from risk score — new code cannot break existing
  code, so it contributes zero risk weight
- Final formula: score = (modified_with_callers × 3) + (deleted × 5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Copy link

vercel bot commented Mar 21, 2026

@marxo126 is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

detect_changes treats new symbols as Modified — inflates risk on feature branches

1 participant