Skip to content

feat(type-resolution): cross-file binding propagation#397

Merged
magyargergo merged 7 commits intomainfrom
feat/phase14-cross-file-binding-propagation
Mar 21, 2026
Merged

feat(type-resolution): cross-file binding propagation#397
magyargergo merged 7 commits intomainfrom
feat/phase14-cross-file-binding-propagation

Conversation

@magyargergo
Copy link
Collaborator

Summary

  • Cross-file type binding propagation via ExportedTypeMap — resolved bindings from upstream files seed downstream files' scopeEnv, enabling cross-file receiver resolution (e.g., import { user } from './service'; user.save() → CALLS edge to User#save)
  • Kahn's algorithm topological sort groups files by dependency level for correct processing order with cycle detection
  • Two-pass architecture: Pass 1 (existing, unchanged) collects ExportedTypeMap; Pass 2 re-resolves affected files with seeded bindings in topological order
  • Worker path support via buildExportedTypeMapFromGraph — collects exports from graph + SymbolTable when TypeEnv isn't available in main thread
  • Security hardening: path validation against allPathSet, per-file export cap (500), type name length limit (256 chars)
  • 3% skip threshold: bypasses re-resolution when too few files benefit
  • Primary beneficiaries: TypeScript, JavaScript, Python (full namedImportMap support)

Key Design Decisions

  1. Seeding AFTER walk(), BEFORE fixpoint — local declarations always take precedence (first-writer-wins)
  2. No isExported on SymbolDefinition — uses graph node lookup, keeps SymbolTable lean
  3. Inline topological sort — ~50 lines in pipeline.ts, not a separate module
  4. processCalls re-used for re-resolution — same call resolution logic, same generateId for idempotent edge overwrite

Files Changed

File Change
type-env.ts importedBindings on BuildTypeEnvOptions, seedImportedBindings()
call-processor.ts ExportedTypeMap type, collectExportedBindings(), buildExportedTypeMapFromGraph(), importedBindingsMap param
pipeline.ts topologicalLevelSort(), re-resolution pass orchestration
8 fixture files Cross-file binding test scenarios (simple, re-export, circular)
3 test files 32 new tests (11 topo sort, 6 seeding, 15 integration)

Test plan

  • 3454 tests pass (32 net new, 0 regressions)
  • Unit tests: topological sort (DAG, cycles, diamonds, disconnected components)
  • Unit tests: scopeEnv seeding (local-wins, empty map, multiple bindings, nested lookup)
  • Integration tests: simple cross-file, re-export chain, circular imports
  • Code review: 2 critical issues found and fixed (CALLS re-resolution, worker path population)
  • TypeScript strict-mode: zero type errors

Post-Deploy Monitoring & Validation

No additional operational monitoring required: this is a build-time static analysis enhancement with no runtime component. The feature runs during npx gitnexus analyze and produces additional CALLS edges in the knowledge graph. Dev-mode logging (🔗 Cross-file re-resolution: N files re-processed in Xms) provides diagnostic output.


Compound Engineered

Add ExportedTypeMap infrastructure to propagate resolved type bindings
across file boundaries. When file A exports `const user = getUser()`
(resolved to `User`), file B importing `user` now gets seeded with
`user → User`, enabling `user.save()` to produce CALLS edges.

Key components:
- `importedBindings` option on BuildTypeEnvOptions with scopeEnv seeding
  AFTER walk() to respect first-writer-wins (local declarations win)
- `collectExportedBindings()` in call-processor using graph node
  isExported flag (no SymbolDefinition changes needed)
- Inline Kahn's algorithm topological sort with level grouping for
  parallel-safe file ordering and cycle detection
- Re-resolution pass in pipeline.ts: topological order, 3% skip
  threshold, path validation, per-file export caps (500)
- 32 new tests: 11 topological sort, 6 seeding, 15 integration
  (simple cross-file, re-export chain, circular imports)

All 3454 tests pass (32 net new, 0 regressions).
Critical fixes:
- Re-resolution pass now actually re-resolves CALLS edges by calling
  processCalls with importedBindingsMap (was building typeEnv but
  discarding it without producing edges)
- Worker path populates ExportedTypeMap via buildExportedTypeMapFromGraph
  using graph node isExported + SymbolTable returnType/declaredType
  (was dead parameter in processCallsFromExtracted)

Important fixes:
- Skip threshold denominator uses totalFiles (was exportedTypeMap.size +
  filesWithGaps which made threshold nearly useless)
- processCalls accepts importedBindingsMap parameter to thread cross-file
  bindings into buildTypeEnv during re-resolution

All 3454 tests pass.
@vercel
Copy link

vercel bot commented Mar 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gitnexus Ready Ready Preview, Comment Mar 21, 2026 8:30am

Request Review

@magyargergo magyargergo changed the title feat(type-resolution): Phase 14 — cross-file binding propagation feat(type-resolution): cross-file binding propagation Mar 20, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 20, 2026

CI Report

All checks passedfb20a3c

Pipeline

Stage Status Ubuntu Windows macOS
Typecheck success
Tests success

Tests

Metric Value
Total 3571
Passed 3551
Skipped 20
Files 1018
Duration 1m 55s

✅ All 3551 tests passed across 1018 files

20 test(s) skipped
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature
  • Python match/case as-pattern type binding > resolves u.save() to User#save via match/case as-pattern binding
  • Python match/case as-pattern type binding > does NOT resolve u.save() to Repo#save (negative disambiguation)
  • Swift constructor-inferred type resolution > detects User and Repo classes, both with save methods
  • Swift constructor-inferred type resolution > resolves user.save() to Models/User.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > resolves repo.save() to Models/Repo.swift via constructor-inferred type
  • Swift constructor-inferred type resolution > emits exactly 2 save() CALLS edges (one per receiver type)
  • Swift self resolution > detects User and Repo classes, each with a save function
  • Swift self resolution > resolves self.save() inside User.process to User.save, not Repo.save
  • Swift parent resolution > detects BaseModel and User classes plus Serializable protocol
  • Swift parent resolution > emits EXTENDS edge: User → BaseModel
  • Swift parent resolution > emits IMPLEMENTS edge: User → Serializable (protocol conformance)
  • Swift cross-file User.init() inference > resolves user.save() via User.init(name:) inference
  • Swift cross-file User.init() inference > resolves user.greet() via User.init(name:) inference
  • Swift return type inference > detects User class and getUser function
  • Swift return type inference > detects save function on User (Swift class methods are Function nodes)
  • Swift return type inference > resolves user.save() to User#save via return type of getUser() -> User
  • Swift return-type inference via function return type > resolves user.save() to User#save via return type of getUser()
  • Swift return-type inference via function return type > user.save() does NOT resolve to Repo#save
  • Swift return-type inference via function return type > resolves repo.save() to Repo#save via return type of getRepo()

Coverage

Metric Coverage Covered Base (main) Delta
Statements 68.66% 9053/13184 68.14% 📈 +0.5%
Branches 59.86% 6179/10322 59.44% 📈 +0.4%
Functions 71.2% 779/1094 70.27% 📈 +0.9%
Lines 70.97% 8073/11374 70.46% 📈 +0.5%

📋 Full run · Coverage from Ubuntu · Generated by CI

…arjan's SCC, cross-file return types

E0: Fix AST cache thrashing in re-resolution loop (was creating size-1
cache per file), batch file reads per topological level, add
MAX_CROSS_FILE_REPROCESS=2000 cap for adversarial repos.

E1: seedCrossFileReceiverTypes() — enrich ExtractedCall.receiverTypeName
from ExportedTypeMap+namedImportMap in O(1) Map lookups, eliminating
re-parse for ~80-90% of single-hop cross-file receiver types.

E2: computeImportCycleSCCs() — iterative Tarjan's SCC on cycle subgraph
from Kahn's output. Dev-mode diagnostic logging of individual import
cycle components.

E3: buildImportedReturnTypes() + ReturnTypeLookup extension — cross-file
return type propagation with corrected local-first priority (SymbolTable
checked first, cross-file fallback only on 0 matches, ambiguous 2+
returns undefined).

E4: PARALLEL_RE_RESOLUTION_THRESHOLD constant, timing metrics, worker
parallelization design comments (deferred implementation).

24 new tests (6 E1 + 6 E2 + 7 E3 unit + 5 E3 integration). All 3478
tests pass.
1. Replace fragile regex in buildExportedTypeMapFromGraph with
   extractReturnTypeName() for consistent generic unwrapping.

2. Fix gap detection precision: check upstream.has(binding.exportedName)
   instead of exportedTypeMap.has(binding.sourcePath) to avoid
   over-counting files that import only unrelated symbols.
…er fixes

Add cross-file binding propagation integration tests for 6 languages
(Python, JavaScript, Java, PHP, C#, Kotlin) with fixture repos and
HAS_METHOD edge assertions.

Fix 3 language resolver issues uncovered by tests:
- Kotlin: class methods now labeled Method (not Function) via shared
  isKotlinClassMethod() utility; non-aliased imports create NamedImportMap
  entries; 2-segment package-directory fallback for top-level function imports
- PHP: namespace-directory fallback for use-function imports; SuffixIndex
  preferred over linear scan; path traversal rejection; PSR-4 sort cached
- JVM: root-level package path matching; indexOf→lastIndexOf for correctness

Address code review findings:
- Extract runCrossFileBindingPropagation() from 810-line pipeline function
- Replace (importCtx as any) casts with typed dispose() method
- Remove Tarjan's SCC (dev-only YAGNI, ~85 lines)
- Remove dead PARALLEL_RE_RESOLUTION_THRESHOLD constant
- Fix constructor handling divergence in getLabelFromCaptures
- Optimize gap pre-scan with early exit once threshold exceeded
- Fix findEnclosingFunction any→SyntaxNode type
@magyargergo
Copy link
Collaborator Author

@claude ultra-think while identifying per-language gaps for all of our supported languages (except swift). Perfom this analysis on this pull request and verify those findings. Act as a senior compiler font-end engineer and expert with static analysis tools and review this change from that perpective and architectural fit. Also check out the previous comments. You must also cross-reference it with our type-resolution-roadmap.md and see if we need to update type-resolution-system.md after this change.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 20, 2026

Claude finished @magyargergo's task in 7m 2s —— View job


Cross-File Binding Propagation — Deep Review


Per-Language Gap Analysis

Languages with Full or Strong Phase 14 Coverage

TypeScript / JavaScript — Full coverage. All three mechanisms work: E1 (seedCrossFileReceiverTypes mutates receiverTypeName in extracted call records), E2 (ExportedTypeMap seeded into importedBindings for re-resolution), and E3 (importedReturnTypes for cross-file call-result chains). Re-export chains (export { X } from './y') and barrel files (index.ts) are handled via extractTsNamedBindings re-export clause support. 4 fixture scenarios with comprehensive assertions. ✅

Python — Strong coverage. extractPythonNamedBindings handles from models import User and aliased from models import getUser as g. E3 works because Python explicitly annotated function return types land in SymbolTable. Gap: plain import models (non-from style) produces no namedImportMap entries — models.getUser() call chains from module-qualified access are not benefited. This is minor since Python strongly favors from imports in typed code. Tests: py-cross-file fixture. ✅

Kotlin — Strong coverage. extractKotlinNamedBindings correctly handles top-level function imports (import models.getUser) including lowercase names, which is where Java falls short. Aliased imports (import com.example.User as U) handled correctly. File-scope top-level Kotlin properties and functions can appear in ExportedTypeMap via collectExportedBindings. Tests: kotlin-cross-file fixture. ✅ with one caveat (see gap below).


Languages with Limited Phase 14 Coverage

Java — Two significant gaps:

  1. import static X.Y.method is not captured. extractJavaNamedBindings at named-binding-extraction.ts:379 explicitly skips lowercase last segments:

    if (className[0] && className[0] === className[0].toLowerCase()) return undefined;

    import static models.UserFactory.getUsergetUser is lowercase → no namedImportMap entry. This means E3 (buildImportedReturnTypes) never sees getUser for App.java. The Java cross-file test passes only because Phase 9 SymbolTable lookup (lookupFuzzyCallable('getUser')) finds the single getUser definition unambiguously. Phase 14 itself contributes nothing for Java static method imports — the test title "via cross-file return type propagation" is accurate only incidentally, not mechanically.

  2. collectExportedBindings is essentially dead code for Java. It collects file-scope ('') bindings from TypeEnv, but Java has no file-scope variables — every variable lives inside a class. The ExportedTypeMap for Java files will always be empty, making E1/E2 inert.

Net: Phase 14 adds zero new resolution capability for Java. Existing Phase 9 SymbolTable lookup covers the tested scenarios. The benefit would only appear in multi-ambiguous-getUser repos where named import context is needed to disambiguate — which extractJavaNamedBindings currently cannot provide for methods.

C# — Very limited. extractCsharpNamedBindings at named-binding-extraction.ts:339 only captures alias imports (using Alias = Namespace.Type). Standard using directives (using System; using MyApp.Models;) produce zero namedImportMap entries. This means for typical C# code, E3 and E1/E2 are effectively inert. Resolution falls back entirely to Phase 9 SymbolTable + Phase P virtual dispatch. collectExportedBindings is also dead for C# (class-scoped variables, not file-scope).

PHP — Partial. Class imports (use App\Models\User) are captured by extractPhpNamedBindings. But PHP's cross-file pattern is primarily UserFactory::getUser() (static method on imported class) rather than named variable propagation. ExportedTypeMap captures would apply to file-scope PHP variables, but PHP modules rarely export variables at file scope. E3 would apply if getUser were in namedImportMap, but PHP use statements import classes, not functions. Similar to Java: Phase 14 benefits are marginal beyond Phase 9.

Rust — Named bindings extracted correctly (use crate::models::{User, Repo}, use crate::utils::helper as h). E3 should work in theory. Gap: no cross-file integration test for Rust. The Rust resolver and named binding extraction are tested in isolation but there is no rs-cross-file fixture analogous to ts-simple or py-cross-file.

Kotlin (class member imports)extractKotlinNamedBindings at line 213 skips imports where the second-to-last segment is PascalCase (class name), e.g. import com.example.UserService.helperMethod. The rationale (collision avoidance) is sound, but it means class method imports don't populate namedImportMap. In practice Kotlin rarely does this, so the gap is low-impact.


Languages Where Phase 14 Is Entirely Inert

Go — Package-level imports (import "github.com/company/models") are stored in ImportMap/PackageMap but never in namedImportMap. runCrossFileBindingPropagation early-returns at line 118: if (exportedTypeMap.size === 0 || ctx.namedImportMap.size === 0) return;. Phase 14 is completely skipped for pure-Go repos. Resolution relies entirely on Phase 9 SymbolTable.

Rubyrequire/require_relative populate ImportMap only. No namedImportMap. Phase 14 entirely inert.

C / C++#include directives populate ImportMap only. No namedImportMap. Phase 14 entirely inert.


Architecture Assessment (Compiler Front-End Perspective)

Seeding placement (AFTER walk, BEFORE fixpoint) is the single most important correctness decision, and it's right. From the code at type-env.ts:1032-1040:

walk(tree.rootNode, FILE_SCOPE);
// Phase 14: AFTER walk, local declarations already in env
if (options?.importedBindings && options.importedBindings.size > 0) {
  seedImportedBindings(env, options.importedBindings);
}
resolveFixpointBindings(pendingItems, env, returnTypeLookup, symbolTable, parentMap);

This ordering guarantees local declarations always shadow imported bindings (first-writer-wins in seedImportedBindings), and imported bindings participate in the fixpoint as initial conditions for copy-chain resolution (const alias = importedUser correctly resolves alias → User). A compiler front-end would call this "correct use of environment threading into the fixpoint lattice."

Worker path quality asymmetry — This is the most significant architectural concern. The sequential path (processCalls) calls collectExportedBindings(typeEnv, ...) which captures ALL resolved types including fixpoint-inferred ones (Tier 1/2). The worker path falls back to buildExportedTypeMapFromGraph which only reads def.returnType ?? def.declaredType from SymbolDefinition — explicitly annotated types only (Tier 0). For large repos that always use the worker pool (totalParseable >= 15 or totalBytes >= 512KB), Phase 14 E1/E2 is limited to explicitly annotated exports. Fixpoint-inferred exports like export const user = getUser() (where user → User was inferred, not annotated) are missed in the worker path. This is a silent quality split between small-repo and large-repo behavior.

Topological sort (Kahn's BFS) is correctly implemented. The in-degree calculation handles the ReadonlySet<string> deps correctly (no double-counting). Cycle nodes (positive in-degree after BFS) are correctly isolated in the final level. The grouping semantics (files in the same level are independent) are preserved. One observation: the final cycle level may intermix files from multiple distinct SCCs — Tarjan's SCC would partition them, but given the conservative "no cross-cycle propagation" policy, the grouping is fine.

resolveMethodReturnType and E3 (importedReturnTypes) — The local-first principle is correctly implemented. From type-env.ts:698-711:

if (callables.length === 1) return extractReturnTypeName(callables[0].returnType);
if (callables.length > 1) return undefined; // conservative
// Only 0 results → cross-file fallback
return options?.importedReturnTypes?.get(callee);

This correctly refuses to cross-file propagate when local SymbolTable is ambiguous. This is the right conservatism for a static analysis tool that prefers false negatives over false positives.

seedCrossFileReceiverTypes (E1) is a zero-I/O pre-seeding pass that mutates ExtractedCall.receiverTypeName in-place. It only handles callForm === 'member' and !call.receiverTypeName — direct single-hop imported receiver. Two-hop patterns (import { service }; service.getUser().save()) are not handled by E1 and fall through to the full re-resolution pass. This is the correct split: E1 handles the 80% case cheaply, E2/re-resolution handles depth.

Idempotent re-resolutionprocessCalls is called twice for cross-file candidates. The generateId-based edge IDs ensure that re-emitting the same CALLS edge is a no-op (map overwrite). This is correct, but there is a cost: the re-resolution pass re-parses and rebuilds TypeEnv for each candidate file. The astCache (cap 50) for the re-resolution pass should absorb most of this via cache hits within a level, but the AST cache is cleared between levels (astCache.clear() is not called per level — it's reused across the re-resolution loop). Actually looking at the code, astCache for the re-resolution pass is created once at line 165 with cap 50 and used throughout the entire re-resolution loop. This is correct.

Security hardening — Path validation (allPathSet.has(filePath)), per-file export cap (500), type name length (256 chars) are all sound defensive measures. The path traversal guard in resolvePhpImport (normalized.includes('..')) is also correct.


Findings Summary by Language

Language namedImportMap ExportedTypeMap (E1/E2) E3 (importedReturnTypes) Phase 14 Benefit
TypeScript ✅ Full ✅ File-scope vars ✅ Full High
JavaScript ✅ Full ✅ File-scope vars ✅ Full High
Python ✅ from-imports ✅ File-scope vars ✅ Full High
Kotlin ✅ Top-level fns ✅ Top-level props ✅ Full High
Rust ✅ use clauses Limited ✅ Theory only* Low (no test)
PHP ✅ use classes Dead (class-scope) Dead (no fn imports) Marginal
Java ⚠️ Classes only¹ Dead (no file-scope) Dead (static imports)¹ None
C# ⚠️ Alias-only² Dead (no file-scope) Dead (no alias imports)² None
Go ❌ None N/A N/A None
Ruby ❌ None N/A N/A None
C/C++ ❌ None N/A N/A None

¹ Java import static X.Y.method not captured — lowercase filter at named-binding-extraction.ts:379
² C# non-alias using directives not captured — extractCsharpNamedBindings only handles using Alias = Type
* Rust cross-file test missing


Actionable Gaps

1. Java static method imports not captured (Fix this →)

// named-binding-extraction.ts:379 — the lowercase guard is too broad
// `import static models.UserFactory.getUser` should produce {local: 'getUser', exported: 'getUser'}
if (className[0] && className[0] === className[0].toLowerCase()) return undefined; // blocks all method imports

2. Worker path ExportedTypeMap misses fixpoint-inferred exports — Large repos (worker pool active) only get Tier 0 (explicitly annotated) exports in the ExportedTypeMap. Small repos get full TypeEnv fixpoint results. This creates a silent quality split. Consider whether buildExportedTypeMapFromGraph should be augmented or whether the limitation should be documented explicitly.

3. Rust cross-file test missing — There is no rs-cross-file fixture or integration test. Given that extractRustNamedBindings handles use declarations, a basic use models::{User}; let u = get_user(); u.save() scenario should be testable.

4. C# non-alias using not capturedusing MyApp.Models; brings User into scope but extractCsharpNamedBindings only handles the alias form using U = MyApp.Models.User. C# resolution is entirely SymbolTable-dependent for the common case.


Roadmap Cross-Reference

type-resolution-roadmap.md needs these updates:

  • Phase 14 status: change from Impact: High | Effort: High (open) → ✅ Delivered with language coverage note
  • Language-Specific Gaps section currently says "All languages: Cross-file binding propagation → Phase 14" — should now list per-language delivery status
  • Milestone E — Cross-Boundary should be marked ✅
  • ## Open Phases should remove Phase 14 (it is now delivered)

type-resolution-system.md needs these updates:

  1. Current Limitations: Remove "no general cross-file propagation of inferred bindings" — now delivered for TS/JS/Python/Kotlin. Replace with the language-specific caveat that Go/Ruby/C/C++/Java-static/C#-non-alias remain without cross-file coverage.
  2. Architecture diagram: The "Per-file" description of buildTypeEnv should note that Phase 14 seeds upstream bindings into the file scope before the fixpoint.
  3. Language Feature Matrix: Add a Cross-file binding propagation row. Suggested values: Yes for TS/JS/Python/Kotlin, No for Java/C#/PHP/Go/Ruby/Rust/Swift/C++/C.
  4. Design Constraints table: "propagate inferred bindings across files as ordinary environment entries" → this is now done for supported languages, update accordingly.

Verdict

The architecture is sound. The seeding-after-walk placement is the correct invariant. Kahn's BFS is correctly implemented. E1/E2/E3 are well-separated concerns. The conservatism decisions (local-first, no cross-cycle propagation, 3% skip threshold) are appropriate for a static analysis tool.

The main risks are:

  • Java and C# cross-file tests validate Phase 9 SymbolTable behavior, not Phase 14 mechanisms. The tests pass but for the wrong reason — they'd pass identically without Phase 14.
  • Worker-path vs sequential-path quality split in ExportedTypeMap coverage — silent and untested.
  • Go/Ruby/C/C++ receive zero benefit from this phase — expected by design but worth documenting explicitly for contributors.

The type-resolution-system.md needs updating to reflect Phase 14 delivery; it still says "no general cross-file propagation of inferred bindings" in Current Limitations.


Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a Phase 14 “cross-file type binding propagation” pass to improve call receiver/type resolution across module boundaries by collecting exported type bindings, processing files in topological import order, and re-resolving downstream files with seeded bindings.

Changes:

  • Added ExportedTypeMap collection + a second re-resolution pass in pipeline.ts, using a Kahn topological-level sort to order propagation.
  • Extended buildTypeEnv to accept importedBindings (seeded into file scope) and importedReturnTypes (cross-file return-type fallback).
  • Improved language import/label handling (notably Kotlin method/function labeling + JVM/Kotlin import resolution) and added extensive unit/integration fixtures and tests.

Reviewed changes

Copilot reviewed 48 out of 48 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
gitnexus/src/core/ingestion/type-env.ts Adds imported binding/return-type options, seeding logic, and SymbolTable-vs-cross-file precedence rules.
gitnexus/src/core/ingestion/call-processor.ts Introduces ExportedTypeMap, export collection helpers, cross-file receiver seeding, and new params for call processing.
gitnexus/src/core/ingestion/pipeline.ts Adds topological level sort + orchestrates Phase 14 re-resolution pass and related thresholds/caps.
gitnexus/src/core/ingestion/import-processor.ts Adds dispose() to release import-resolution structures and improves Kotlin top-level function import resolution.
gitnexus/src/core/ingestion/named-binding-extraction.ts Refines Kotlin named binding extraction to support top-level function imports while avoiding class-member collisions.
gitnexus/src/core/ingestion/parsing-processor.ts Aligns Kotlin class-body function_declaration labeling to Method for consistent IDs/labels.
gitnexus/src/core/ingestion/workers/parse-worker.ts Adjusts label detection (constructor-without-name) and Kotlin function-vs-method labeling in worker extraction.
gitnexus/src/core/ingestion/utils.ts Adds isKotlinClassMethod + ensures extractFunctionName labels Kotlin class methods consistently.
gitnexus/src/core/ingestion/resolvers/php.ts Caches sorted PSR-4 mappings and adds function-import fallback behavior with path traversal guard.
gitnexus/src/core/ingestion/resolvers/jvm.ts Improves JVM wildcard resolution for both nested and root-level package paths.
gitnexus/test/unit/type-env.test.ts Adds unit tests for importedBindings seeding and importedReturnTypes precedence behavior.
gitnexus/test/unit/topological-sort.test.ts Adds unit coverage for topologicalLevelSort behavior (DAGs, diamonds, cycles, disconnected graphs).
gitnexus/test/unit/call-processor.test.ts Adds tests for seedCrossFileReceiverTypes enrichment behavior.
gitnexus/test/integration/resolvers/helpers.ts Adds CROSS_FILE_FIXTURES path constant used by new integration tests.
gitnexus/test/integration/resolvers/javascript.test.ts Adds JS cross-file binding propagation integration scenario.
gitnexus/test/integration/resolvers/python.test.ts Adds Python cross-file binding propagation integration scenario.
gitnexus/test/integration/resolvers/php.test.ts Adds PHP cross-file binding propagation integration scenario.
gitnexus/test/integration/resolvers/java.test.ts Adds Java cross-file binding propagation integration scenario.
gitnexus/test/integration/resolvers/csharp.test.ts Adds C# cross-file binding propagation integration scenario.
gitnexus/test/integration/resolvers/kotlin.test.ts Updates Kotlin expectations from Function→Method and adds Kotlin cross-file binding propagation scenario.
gitnexus/test/integration/cross-file-binding.test.ts Adds Phase 14 integration coverage for TS simple, TS re-export, TS return-type propagation, and TS cycles.
gitnexus/test/fixtures/cross-file-binding/ts-simple/src/models.ts Fixture for TS simple cross-file propagation (User/getUser).
gitnexus/test/fixtures/cross-file-binding/ts-simple/src/service.ts Fixture exporting user = getUser() for downstream import seeding.
gitnexus/test/fixtures/cross-file-binding/ts-simple/src/app.ts Fixture consuming imported user and calling methods.
gitnexus/test/fixtures/cross-file-binding/ts-return-type/src/api.ts Fixture for return-type propagation (getConfig(): Config).
gitnexus/test/fixtures/cross-file-binding/ts-return-type/src/consumer.ts Fixture consuming imported callable and calling method on returned value.
gitnexus/test/fixtures/cross-file-binding/ts-reexport/src/core.ts Fixture for re-export chain source module.
gitnexus/test/fixtures/cross-file-binding/ts-reexport/src/index.ts Fixture re-export barrel.
gitnexus/test/fixtures/cross-file-binding/ts-reexport/src/app.ts Fixture consuming re-exported callable.
gitnexus/test/fixtures/cross-file-binding/ts-circular/src/a.ts Fixture for circular TS imports.
gitnexus/test/fixtures/cross-file-binding/ts-circular/src/b.ts Fixture for circular TS imports.
gitnexus/test/fixtures/cross-file-binding/py-cross-file/src/models.py Fixture for Python cross-file callable return type → receiver resolution.
gitnexus/test/fixtures/cross-file-binding/py-cross-file/src/app.py Fixture consuming Python import and calling methods on returned instance.
gitnexus/test/fixtures/cross-file-binding/php-cross-file/composer.json Fixture composer PSR-4 mapping for PHP import resolution.
gitnexus/test/fixtures/cross-file-binding/php-cross-file/app/Models/User.php Fixture for PHP class/method targets.
gitnexus/test/fixtures/cross-file-binding/php-cross-file/app/Models/UserFactory.php Fixture for PHP function export returning User.
gitnexus/test/fixtures/cross-file-binding/php-cross-file/app/Main.php Fixture consuming use function import and calling methods on returned value.
gitnexus/test/fixtures/cross-file-binding/kotlin-cross-file/models/User.kt Fixture for Kotlin top-level function returning User.
gitnexus/test/fixtures/cross-file-binding/kotlin-cross-file/app/App.kt Fixture consuming Kotlin import and calling methods.
gitnexus/test/fixtures/cross-file-binding/js-cross-file/src/models.js Fixture for JS exported function returning User.
gitnexus/test/fixtures/cross-file-binding/js-cross-file/src/app.js Fixture consuming JS import and calling methods.
gitnexus/test/fixtures/cross-file-binding/java-cross-file/models/User.java Fixture for Java target methods.
gitnexus/test/fixtures/cross-file-binding/java-cross-file/models/UserFactory.java Fixture for Java static export returning User.
gitnexus/test/fixtures/cross-file-binding/java-cross-file/app/App.java Fixture consuming Java static import and calling methods.
gitnexus/test/fixtures/cross-file-binding/csharp-cross-file/CrossFile.csproj Fixture project file for C# resolution context.
gitnexus/test/fixtures/cross-file-binding/csharp-cross-file/Models/User.cs Fixture for C# target methods.
gitnexus/test/fixtures/cross-file-binding/csharp-cross-file/Models/UserFactory.cs Fixture for C# static export returning User.
gitnexus/test/fixtures/cross-file-binding/csharp-cross-file/App/Program.cs Fixture consuming C# static import and calling methods.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

@xkonjin xkonjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Cross-File Binding Propagation (Type Resolution)

This is a substantial feature addition. The core idea — propagating resolved type bindings across file boundaries using topological import ordering — is architecturally sound and addresses a real limitation in single-file type inference.

Positives

  • Topological level sort (topologicalLevelSort via Kahn's algorithm) is correct and handles cycles gracefully by dumping them into a final group. This ensures upstream bindings are resolved before downstream consumers.
  • Guard rails are well-calibrated: MAX_EXPORTS_PER_FILE (500), MAX_TYPE_NAME_LENGTH (256), MAX_CROSS_FILE_REPROCESS (2000), and CROSS_FILE_SKIP_THRESHOLD (3%) prevent runaway processing on pathological repos.
  • seedCrossFileReceiverTypes mutating in-place is pragmatic for performance; the function signature makes the mutation intent clear.
  • ImportResolutionContext.dispose() is a nice addition — nulling the suffix index and clearing caches will meaningfully reduce memory pressure on large repos.
  • Kotlin improvements (top-level function import resolution, class method detection via isKotlinClassMethod) are well-scoped fixes that correctly distinguish import models.getUser from import util.OneArg.writeAudit.

Issues

  1. Potential quadratic behavior in runCrossFileBindingPropagation: For each level, you call readFileContents then processCalls per file. processCalls builds a fresh TypeEnv per file which re-parses ASTs. With the AST cache capped at 50, a level with 500+ files will thrash the cache. Consider batching files within a level into the existing processCalls multi-file path rather than calling it once per file.

  2. buildExportedTypeMapFromGraph fallback: This runs graph.forEachNode when the worker path left exportedTypeMap empty. On a 230K-node graph, this is a full scan. The function only checks isExported + lookupExactFull, but the scan itself could be expensive. Consider filtering by node type (Function/Method/Class) before the SymbolTable lookup.

  3. buildImportedReturnTypes — SymbolTable coupling: lookupExactFull is called per import per file. If SymbolTable uses a linear scan internally, this could be slow for files with many imports. Worth verifying the lookup is O(1) or O(log n).

  4. suffixIndex nullability: After dispose(), suffixIndex becomes null, but resolveLanguageImport accesses it without null checks (the index parameter). If any code path calls import resolution after disposal, this will throw. The type change (SuffixIndex | null) is correct, but downstream consumers should be audited.

  5. Test coverage gap: The diff adds significant new logic but I don't see corresponding test files for seedCrossFileReceiverTypes, buildExportedTypeMapFromGraph, topologicalLevelSort, or buildImportedReturnTypes. These are testable pure functions — unit tests would prevent regressions in edge cases (empty maps, self-imports, cycles, max-cap behavior).

  6. Minor: extractReturnTypeName is referenced but not defined in this diff. If it's new, its behavior with complex generics (Map<string, Promise<User[]>>) should be tested.

Summary

Strong architectural direction. The topological ordering approach is the right call for cross-file resolution. Main concerns are the per-file processCalls loop performance and missing unit tests for the new pure functions. Would recommend adding tests before merge and benchmarking the re-resolution pass on a large repo (10K+ files).

- Enhance C++ tree-sitter queries to support inline class method declarations and return types.
- Introduce `importedRawReturnTypes` in `BuildTypeEnvOptions` for cross-file raw return type handling.
- Add `FileTypeEnvBindings` interface to capture file-scope type bindings for exported symbols.
- Implement logic in `parse-worker.ts` to extract and serialize file-scope type bindings for cross-file type resolution.
- Create test fixtures for C++, Go, Ruby, and Rust to validate cross-file binding propagation.
- Update integration tests to verify correct resolution of method calls across files for C++, Go, Ruby, and Rust.
- Document Phase 14: Cross-File Binding Propagation in the type resolution roadmap and system documentation.
Copy link

@xkonjin xkonjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test

Copy link

@xkonjin xkonjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Cross-file Binding Propagation (Phase 14)

Overall: This is a substantial and architecturally sound addition. The topological-level-sort approach for dependency-ordered re-resolution is the right design, and the local-first principle is correctly enforced throughout. Several items worth attention:

Bugs / Correctness:

  1. topologicalLevelSort puts all cycle files in a single final group with no ordering guarantees. Within a cycle, file A may import from file B and vice versa. Since seedImportedBindings uses first-writer-wins, the propagation result for cycle files is non-deterministic (depends on iteration order). This is documented implicitly but worth an explicit comment at the call site, or consider running a second fixpoint pass over cycle files.

  2. buildExportedTypeMapFromGraph iterates ALL graph nodes via forEachNode. For large repos (200k+ nodes), this could be slow. Consider filtering by node type (Function, Method, Property) before checking isExported, or maintaining an exported-nodes index during parsing.

  3. In seedCrossFileReceiverTypes, the enrichedCount is incremented for every call record enriched, but multiple call records in the same file may share the same receiver. The log message 'Seeded N cross-file receiver types' is slightly misleading - it is call-site enrichments, not unique type resolutions.

Memory:

  1. The ImportResolutionContext.dispose() method is a good improvement over the previous unsafe cast. However, dispose() clears resolveCache but keeps allFilePaths and allFileList alive. If allFilePaths is the largest allocation, consider nulling it too (or documenting why it is retained).

  2. The AST cache is re-created per topological level (astCache = createASTCache(AST_CACHE_CAP) before the loop, then astCache.clear() after). Files in adjacent levels that share imports may benefit from cache hits. Consider clearing only at the end of the full propagation pass.

Security:

  1. resolvePhpImport now includes a path-traversal check (normalized.includes('..')). Good defensive addition.

Tests: The fixture files for Java, C#, Kotlin, and PHP cross-file binding scenarios are well-structured and test the actual pipeline end-to-end. Consider adding a test for the cycle-detection path (two files importing each other).

@magyargergo
Copy link
Collaborator Author

@claude ultra-think while identifying per-language gaps for all of our supported languages (except swift). Perfom this analysis on this pull request and verify those findings. Act as a senior compiler font-end engineer and expert with static analysis tools and review this change from that perpective and architectural fit. Also check out the previous comments. You must also cross-reference it with our type-resolution-roadmap.md and see if we need to update type-resolution-system.md after this change.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 21, 2026

Claude finished @magyargergo's task in 5m 14s —— View job


Cross-File Binding Propagation — Deep Review

  • Read all key source files
  • Verified per-language gaps against actual code
  • Cross-referenced type-resolution-roadmap.md and type-resolution-system.md
  • Applied fix to type-resolution-system.md (stale limitation — push blocked by permissions, see note below)

TL;DR

The previous deep review from the CI claude job was accurate at that moment in time, but the author had already addressed every critical finding before this review was requested. The architecture is sound. One stale line in type-resolution-system.md still needs removing (fix committed locally, push blocked by bot permissions — apply manually).


Verification of Previous Review Findings

Every issue cited by the prior review was checked against the actual current branch code:

Finding Previous Claim Actual Code State
Java static method imports Blocked by lowercase guard Fixed: named-binding-extraction.ts:409if (!isStatic && name[0] === name[0].toLowerCase()) return undefined — static modifier correctly bypasses the lowercase filter
Worker path quality split Tier 0 only for large repos Fixed: pipeline.ts:685-712 — workers return typeEnvBindings (file-scope fixpoint results), merged into exportedTypeMap post-chunk with isExported filter. The sequential/worker paths are now equivalent in coverage.
lookupRawReturnType fallback breaks for-loop inference Cross-file fallback used processed types Fixed: type-env.ts:660,725 — separate importedRawReturnTypes map with raw type strings (e.g., User[]) for extractElementTypeFromString; processed types in importedReturnTypes are no longer used for raw lookups
Cycle log using lastLevel.length > 1 Misleading for acyclic multi-file last levels Fixed: topologicalLevelSort returns { levels, cycleCount }, log checks cycleCount > 0
Rust: no cross-file integration test Gap — missing rs-cross-file fixture Fixed: rust.test.ts:1524 — full cross-file scenario with use models::{User} and receiver resolution tests
processCallsFromExtracted unused param exportedTypeMap param silently ignored Fixed: param removed per copilot review
C# alias-only extraction Only using Alias = NS.Type captured Fixed: named-binding-extraction.ts:375-379using static NS.Type now captured as {local: lastSegment, exported: lastSegment}
Go/Ruby/C/C++ entirely inert No namedImportMap entries Resolved via new mechanism: synthesizeWildcardImportBindings() in pipeline.ts:127-206 expands graph IMPORTS edges into per-symbol namedImportMap entries from graph-exported symbols. Phase 14 benefit upgraded to Medium for these languages.

Per-Language Gap Analysis — Verified State

Language namedImportMap E1/E2 (ExportedTypeMap) E3 (importedReturnTypes) Phase 14 Benefit Notes
TypeScript ✅ Full ✅ File-scope vars ✅ Full High Including re-export chains and barrel files
JavaScript ✅ Full ✅ File-scope vars ✅ Full High
Python from imports ✅ File-scope vars ✅ Full High Gap: import models (non-from) gets no namedImportMap entries
Kotlin ✅ Top-level fns ✅ Top-level props ✅ Full High Class-member imports correctly skipped (collision avoidance per comment at line 217)
Rust use clauses Limited ✅ Full High Confirmed by new rs-cross-file integration test
Go ✅ Synthesized ✅ Exported symbols ✅ Full Medium Via synthesizeWildcardImportBindings + graph IMPORTS edges
Ruby ✅ Synthesized ✅ Exported symbols ✅ Full Medium Via synthesizeWildcardImportBindings + ctx.importMap
C/C++ ✅ Synthesized ✅ Exported symbols ✅ Full Medium Via synthesizeWildcardImportBindings; quality depends on isExported on header declarations
PHP use classes + use function Inert (class-scope) ✅ Works for use function Low-Medium See note below
Java ✅ Classes + static methods Inert (no file-scope) Via SymbolTable Low Static imports enabled; disambiguation falls to Tier 2a for ambiguous statics
C# ✅ Alias + using static Inert (no file-scope) Via SymbolTable Low Non-alias using NS; still unsupported — cannot resolve without type inference

PHP correction from previous review: extractPhpNamedBindings processes namespace_use_declaration nodes uniformly — it does not check for a function modifier keyword. use function App\Models\UserFactory\getUser produces {local: 'getUser', exported: 'getUser'} via the lastSegment extraction path. This means E3 (importedReturnTypes) does work for use function patterns, not just use class. The PHP integration test at php-cross-file demonstrates this. The roadmap.md table entry "E3: Inert (no fn imports)" is slightly inaccurate.


Architecture Review (Compiler Front-End Perspective)

What's Right

Seeding placement (type-env.ts:1039-1043) is the single most important correctness decision, and it's correct. Seeding after walk() and before the fixpoint means:

  1. Local declarations always shadow imported bindings (first-writer-wins — semantically correct)
  2. Imported bindings participate in the fixpoint as initial conditions — const alias = importedUser correctly resolves alias → User via the copy-propagation pass

This is identical to what a real compiler's module-level environment threading looks like. The comment "local declarations from walk() take precedence" accurately reflects the invariant.

Topological sort is correctly implemented. Kahn's BFS at pipeline.ts:35-83 computes in-degrees correctly and groups files by processing wave. The cycle detection (files with positive residual in-degree after BFS) is the correct Kahn's termination condition. Returning cycleCount rather than exposing cycleFiles keeps the interface minimal.

importedRawReturnTypes separation (type-env.ts:660) is an important design decision. Storing raw type strings (e.g., User[]) separately from processed names (e.g., User) is the right move — extractElementTypeFromString needs the raw form to strip container syntax. Mixing them into one map would have broken for-loop element inference silently.

Worker TypeEnv enrichment (pipeline.ts:685-712) correctly closes the sequential/worker quality gap. Workers now serialize typeEnvBindings (file-scope TypeEnv results including fixpoint-inferred types), and the main thread merges them into exportedTypeMap with isExported verification. don't overwrite existing entries (line 703) is the right precedence: Tier 0 SymbolTable annotations beat fixpoint inference.

New Concerns (Not Caught by Previous Review)

1. Worker enrichment uses hard-coded node ID prefixes (pipeline.ts:693-697):

const nodeId = `Function:${filePath}:${name}`;
const varNodeId = `Variable:${filePath}:${name}`;
const constNodeId = `Const:${filePath}:${name}`;
const node = graph.getNode(nodeId) ?? graph.getNode(varNodeId) ?? graph.getNode(constNodeId);

Class:, Interface:, Struct:, Enum:, TypeAlias:, Trait:, and Record: exported symbols are not checked. This means exported type names (e.g., export class UserService) won't be added to exportedTypeMap via the enrichment path. For receiver propagation this is mostly benign (class references don't need ExportedTypeMap), but it silently misses Const: exports with non-const labels, like TypeScript export const ROUTES = {...} which generates a Const node. Low impact, but fragile by design. Fix this →

2. CROSS_FILE_SKIP_THRESHOLD denominator is totalFiles (all scanned), not totalParseable (pipeline.ts:242):

const gapThreshold = Math.max(1, Math.ceil(totalFiles * CROSS_FILE_SKIP_THRESHOLD));

In asset-heavy repos (many JSON/image/YAML files), totalFiles can be 10× totalParseable. This means the 3% threshold is effectively 0.3% of parseable files, making the skip guard nearly useless for those repos — the re-resolution pass fires for almost any non-trivial codebase. The denominator should be totalParseable. Minor but affects intended behavior.

3. Cycle propagation is non-deterministic at file-granularity (documented implicitly, not at call site): Files in cycles share a final BFS level with no ordering guarantees. seedImportedBindings uses first-writer-wins, so the propagation result for mutually-importing files depends on JavaScript's Map/Set iteration order (insertion order for Maps, which is deterministic per run but not semantically meaningful). For the common circular import pattern (A ↔ B), this means which file "wins" a shared binding is an implementation detail. A comment at the runCrossFileBindingPropagation loop body noting this would help future contributors understand why cycle-group files may not fully propagate to each other.


Documentation Status

type-resolution-roadmap.md: ✅ Fully updated and accurate. All Phase 14 delivery details are present, per-language table is correct, worker quality split is correctly marked as resolved, Milestone E is ✅. No changes required.

type-resolution-system.md: ⚠️ One stale entry in Current Limitations (line 458 before this fix):

- no general cross-file propagation of inferred bindings

This directly contradicts line 12 ("Per-file with cross-file seeding"), lines 125-131 ("It now does (Phase 14):"), and line 448 ("cross-file binding propagation across all 13 languages"). Fix committed locally to this branch but push was blocked by bot permissions:

- no general cross-file propagation of inferred bindings
+ cross-file propagation has per-language gaps: C# non-alias `using NS;` (namespace import) not 
+ supported; Java has no file-scope exports (relies on Phase 9 SymbolTable); PHP class-method 
+ exports are class-scoped (not file-scope), so E1/E2 are inert for class methods

Please apply this manually (or grant write permissions to apply the committed diff a69c54e).


Verdict

Approve. The implementation is architecturally correct and the previous review's concerns have been comprehensively addressed. The topological ordering, seeding placement, fixpoint integration, and cycle handling are all production-grade. The two new concerns (hard-coded node label prefixes, threshold denominator) are low-severity and can be addressed in a follow-up. The one mandatory action is applying the type-resolution-system.md fix.


@magyargergo magyargergo merged commit 1af9c02 into main Mar 21, 2026
8 checks passed
@magyargergo magyargergo deleted the feat/phase14-cross-file-binding-propagation branch March 21, 2026 08:36
naicud added a commit to naicud/GitNexus that referenced this pull request Mar 21, 2026
…npatwari#397)

Merge origin/main into pr/performance-lod-rendering, resolving 22 conflict
files across backend (8), frontend (13), and tests (1).

Key resolution decisions:
- LOD endpoints, hooks, and components: kept our branch additions
- Language dispatch unification (abhigyanpatwari#409): kept main's exhaustive Record types,
  added COBOL entries to all dispatch tables (import-resolution, framework-
  detection, call-routing, entry-point-scoring, tree-sitter-queries, parser-loader)
- Cross-file binding propagation (abhigyanpatwari#397): kept our branch's Phase 14 code
- Neptune DB backend: kept main's Neptune dispatch paths in api.ts, local-backend
- LLM providers: merged minimax (ours) + custom/bedrock (main) into all provider
  lists, types, and settings
- arm64 Mac safety: preserved sequential query execution in local-backend impact
  analysis, updated to use this.runQuery for Neptune compatibility
- Schema test: updated SCHEMA_QUERIES count to 29 (28 node + 1 rel, Section added)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants