Type Resolution System

GitNexus's type resolution system maps variables to likely declared types across the supported languages so the ingestion pipeline can perform receiver-constrained call resolution.

When the code contains a call such as user.save(), the resolver tries to determine that user is a User, allowing call resolution to prefer User#save over unrelated methods such as Repo#save.

This system is designed to be:

Conservative — it prefers missing a binding over introducing a misleading one
Single-pass — bindings are collected during a single AST walk, with a limited post-pass for assignment propagation
Scope-aware — function-local bindings are isolated from file-level bindings
Per-file — the environment is built for one file at a time, though it may consult the global SymbolTable for validation in specific cases

It is not a full compiler type checker. Its job is to recover enough type information to improve call-edge accuracy during ingestion.

Purpose in the Pipeline

Type resolution sits between parsing and call resolution.

parse-worker.ts
     │
     ▼
buildTypeEnv(tree, language, symbolTable?)
     │
     ├──► TypeEnvironment.lookup(varName, callNode)
     │         │
     │         ▼
     │    call-processor.ts
     │    - resolves receiver type for method calls
     │    - filters candidates by receiver match
     │    - verifies deferred constructor / initializer bindings
     │
     └──► discarded after file processing

The TypeEnvironment is built once per file. call-processor.ts then uses lookup() to determine receiver types and narrow candidate symbols from the SymbolTable.
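
To make the narrowing step concrete, here is a minimal sketch of receiver-constrained filtering. The `MethodCandidate` shape, the `filterByReceiver` helper, and the keep-everything fallback are assumptions made for illustration; they are not the actual types or policy of call-processor.ts.

```typescript
// Hypothetical, simplified candidate shape; the real SymbolTable entries may differ.
interface MethodCandidate {
  ownerType: string;  // class that declares the method, e.g. "User"
  methodName: string; // e.g. "save"
}

// Keep only the candidates whose owner matches the receiver type.
// When the receiver type is unknown, every same-name candidate is kept.
function filterByReceiver(
  candidates: MethodCandidate[],
  methodName: string,
  receiverType: string | undefined,
): MethodCandidate[] {
  const sameName = candidates.filter((c) => c.methodName === methodName);
  if (receiverType === undefined) return sameName;
  const matching = sameName.filter((c) => c.ownerType === receiverType);
  // Assumed fallback: if no owner matches, do not discard the candidates.
  return matching.length > 0 ? matching : sameName;
}

const demoCandidates: MethodCandidate[] = [
  { ownerType: "User", methodName: "save" },
  { ownerType: "Repo", methodName: "save" },
];

// The receiver `user` was resolved to "User", so only User#save remains.
const narrowed = filterByReceiver(demoCandidates, "save", "User");
```

In this sketch an unresolved receiver simply leaves the candidate list untouched, which is one way to keep a missing binding from removing valid call targets.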

Architecture

                                 ┌──────────────────────┐
                                 │     type-env.ts      │
                                 │                      │
                                 │  buildTypeEnv()      │
                                 │  - Single AST walk   │
                                 │  - Scope tracking    │
                                 │  - Tier orchestration│
                                 └──────────┬───────────┘
                                            │ dispatches to
                    ┌───────────────────────┬┴┬────────────────────────┐
                    │                       │ │                        │
          ┌─────────▼──────────┐  ┌─────────▼─▼─────────┐  ┌──────────▼─────────┐
          │   shared.ts        │  │  <language>.ts      │  │    types.ts        │
          │                    │  │                      │  │                    │
          │  Container table   │  │  Per-language        │  │  Extractor         │
          │  Type helpers      │  │  extractors          │  │  interface defs    │
          │  Generic helpers   │  │  (shared + per-lang) │  │                    │
          └────────────────────┘  └──────────────────────┘  └────────────────────┘

Main files

File	Purpose
`type-env.ts`	Core engine. Walks the AST once, tracks scopes, collects bindings, and exposes `buildTypeEnv()` plus the `TypeEnvironment` interface.
`types.ts`	TypeScript interfaces for extractor hooks such as `TypeBindingExtractor`, `ForLoopExtractor`, and `PatternBindingExtractor`.
`shared.ts`	Language-agnostic helpers such as `extractSimpleTypeName`, `extractElementTypeFromString`, `resolveIterableElementType`, `CONTAINER_DESCRIPTORS`, and `TYPED_PARAMETER_TYPES`.
`index.ts`	Dispatch map from `SupportedLanguages` to `LanguageTypeConfig`.
`typescript.ts`	TypeScript and JavaScript extractors, including JSDoc support.
`jvm.ts`	Java and Kotlin extractors.
`csharp.ts`	C# extractors.
`go.ts`	Go extractors, including range semantics.
`rust.ts`	Rust extractors, including `if let`, match-related handling, and `Self` resolution.
`python.ts`	Python extractors, including `match` / `case` handling.
`php.ts`	PHP extractors, including PHPDoc support.
`ruby.ts`	Ruby extractors, including YARD support.
`swift.ts`	Swift extractors. Currently the most minimal configuration.
`c-cpp.ts`	Shared C / C++ extractors.

Supported Languages

The current type-resolution layer supports 13 languages:

TypeScript
JavaScript
Python
Java
Kotlin
C#
Go
Rust
PHP
Ruby
Swift
C
C++

Not all languages have the same level of coverage. Swift remains the most minimal. C and some C++ cases naturally benefit less from receiver typing than object-oriented languages.

Design Constraints

The type resolution layer is intentionally narrower than a compiler-grade type system.

It does:

resolve variable types from declarations, parameters, initializers, loops, and selected pattern constructs
normalize common wrappers such as nullable types and generic containers
improve receiver matching during call resolution
verify some ambiguous initializer bindings against the SymbolTable

It does not:

perform full semantic type checking
run fixpoint inference
propagate inferred bindings across files as ordinary environment entries
guarantee resolution for every ambiguous construct

TypeEnvironment Model

buildTypeEnv() returns a TypeEnvironment that contains:

scoped bindings collected from the current file
deferred constructor / initializer binding candidates
lookup helpers used by call resolution
pattern override data for branch-local narrowing where supported

Scope model

The environment is scope-aware so identical variable names in different functions do not collide.

File scope ('')
├── config → Config
├── users → Map
│
├── processUsers@100
│   ├── user → User
│   └── alias → User
│
└── processRepos@200
    └── repo → Repo

Scope keys

'' for file scope
functionName@startIndex for function-local scope

These scope keys are also used later when verifying deferred bindings in call processing, so any future change to scope-key format must stay consistent across both layers.
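
The key format above can be sketched as follows. The helper names (`functionScopeKey`, `bindInScope`) and the nested-`Map` storage are hypothetical; only the `''` and `functionName@startIndex` formats come from this document.

```typescript
// File scope uses the empty string as its key.
const FILE_SCOPE_KEY = "";

// Function-local scope key: functionName@startIndex.
function functionScopeKey(functionName: string, startIndex: number): string {
  return `${functionName}@${startIndex}`;
}

// Assumed storage: scope key -> (variable name -> type name).
type ScopeBindings = Map<string, Map<string, string>>;

function bindInScope(
  env: ScopeBindings,
  scopeKey: string,
  varName: string,
  typeName: string,
): void {
  let scope = env.get(scopeKey);
  if (scope === undefined) {
    scope = new Map<string, string>();
    env.set(scopeKey, scope);
  }
  scope.set(varName, typeName);
}

const scopeEnv: ScopeBindings = new Map();
bindInScope(scopeEnv, FILE_SCOPE_KEY, "config", "Config");
// The same variable name in two functions lands in two separate scopes.
bindInScope(scopeEnv, functionScopeKey("processUsers", 100), "item", "User");
bindInScope(scopeEnv, functionScopeKey("processRepos", 200), "item", "Repo");
```

Including the start index in the key keeps two functions with the same name in one file from sharing a scope.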

Lookup Semantics

TypeEnvironment.lookup() resolves types in this effective order:

1. special receivers
   - this, self, $this → enclosing class
   - super, base, parent → parent class
2. position-indexed pattern overrides
3. function-local scope
4. file-level scope

Special receivers are handled as a dedicated fast path rather than ordinary lexical bindings.
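
The lookup order can be sketched like this. The real `lookup()` takes a `callNode`; this sketch replaces it with a plain, hypothetical `LookupContext` object, and `SketchEnv` is likewise an assumed shape rather than the actual `TypeEnvironment`.

```typescript
// Hypothetical stand-in for the information the real lookup derives from callNode.
interface LookupContext {
  position: number;          // source offset of the call site
  enclosingClass?: string;   // class surrounding the call, if any
  parentClass?: string;      // its parent class, if known
  functionScopeKey?: string; // e.g. "processUsers@100"
}

interface SketchEnv {
  overrides: { varName: string; from: number; to: number; typeName: string }[];
  scopes: Map<string, Map<string, string>>;
}

const SELF_RECEIVERS = new Set(["this", "self", "$this"]);
const PARENT_RECEIVERS = new Set(["super", "base", "parent"]);

function lookupSketch(env: SketchEnv, varName: string, ctx: LookupContext): string | undefined {
  // 1. Special receivers: a fast path, not ordinary lexical bindings.
  if (SELF_RECEIVERS.has(varName)) return ctx.enclosingClass;
  if (PARENT_RECEIVERS.has(varName)) return ctx.parentClass;
  // 2. Position-indexed pattern overrides.
  const hit = env.overrides.find(
    (o) => o.varName === varName && ctx.position >= o.from && ctx.position < o.to,
  );
  if (hit !== undefined) return hit.typeName;
  // 3. Function-local scope.
  if (ctx.functionScopeKey !== undefined) {
    const local = env.scopes.get(ctx.functionScopeKey)?.get(varName);
    if (local !== undefined) return local;
  }
  // 4. File-level scope.
  return env.scopes.get("")?.get(varName);
}

const lookupEnv: SketchEnv = {
  overrides: [{ varName: "obj", from: 10, to: 30, typeName: "User" }],
  scopes: new Map([
    ["", new Map([["config", "Config"]])],
    ["processUsers@100", new Map([["user", "User"]])],
  ]),
};
```

A variable that is missing from the function scope falls through to file scope, so `config` resolves from inside `processUsers@100` in this sketch.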

Resolution Tiers

Every tier collects its bindings during the same single AST walk (Tier 2 adds a short post-pass). When more than one source could bind a variable, higher-confidence sources win over weaker inference.

Tier 0: Explicit Type Annotations

Direct extraction from AST type nodes.

// TypeScript
const user: User = getUser()

// Java
User user = getUser()

// Go
var user User

// Rust
let user: User = get_user()

// Python
user: User = get_user()

extractDeclaration() reads the declaration type node and normalizes it through extractSimpleTypeName().

Parameters are handled separately by extractParameter() using the same normalization logic. The shared TYPED_PARAMETER_TYPES set controls which AST node types are treated as typed parameters.
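
As an illustration of what this kind of normalization involves, the sketch below reduces a type written as text to a simple name. The real extractSimpleTypeName() reads AST type nodes and its exact rules are not shown in this document, so `simplifyTypeText` and every rule in it are assumptions.

```typescript
// Split "A | B" on top-level "|" only, ignoring "|" inside generic arguments.
function splitTopLevelUnion(text: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (ch === "<") depth++;
    if (ch === ">") depth--;
    if (ch === "|" && depth === 0) {
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current.trim());
  return parts;
}

// Hypothetical text-based normalizer; returns undefined when the type is ambiguous.
function simplifyTypeText(raw: string): string | undefined {
  // Drop null / undefined union members; give up on any other union.
  const members = splitTopLevelUnion(raw).filter(
    (m) => m !== "" && m !== "null" && m !== "undefined",
  );
  const only = members[0];
  if (members.length !== 1 || only === undefined) return undefined;
  let text = only;
  // Strip a nullable suffix such as "User?".
  if (text.endsWith("?")) text = text.slice(0, -1);
  // Strip generic arguments: "Map<string, User>" becomes "Map".
  const genericStart = text.indexOf("<");
  if (genericStart !== -1) text = text.slice(0, genericStart);
  // Keep the last segment of a qualified name such as "models.User".
  const segments = text.split(/::|\.|\\/);
  const last = segments[segments.length - 1]?.trim();
  return last !== undefined && /^[A-Za-z_]\w*$/.test(last) ? last : undefined;
}
```

Returning `undefined` for a genuine union such as `User | Admin` follows the conservative stance described earlier: no binding is preferred over a misleading one.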

Tier 0b: For-Loop Element Type Resolution

Also referred to as Tier 1c in Phase 6 PR and test naming.

For-each style loops often introduce a variable with no explicit type. In those cases, the resolver derives the loop variable type from the iterable's container type.

// C#
foreach (var user in users) { user.Save(); }

// TypeScript
for (const user of users) { user.save(); }

// Rust
for user in users { user.save(); }

This is handled by resolveIterableElementType() through a three-step cascade:

1. Declaration type nodes: uses raw type annotation nodes when available, including cases such as User[] or List[User].
2. Scope environment string: uses extractElementTypeFromString() to parse a stored type string.
3. AST walk fallback: walks upward to enclosing declarations or parameters when needed.
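
The second step of the cascade, parsing a stored type string, might look roughly like the sketch below. `elementTypeFromText` is a hypothetical stand-in, not the real extractElementTypeFromString(), and it takes the last generic argument as the element type, which is only the fallback heuristic.

```typescript
// Split generic arguments on top-level commas only.
function splitTopLevelArgs(text: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (ch === "<" || ch === "[") depth++;
    if (ch === ">" || ch === "]") depth--;
    if (ch === "," && depth === 0) {
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current.trim());
  return parts;
}

// Hypothetical helper: derive an element type from a container type string.
function elementTypeFromText(typeText: string): string | undefined {
  const text = typeText.trim();
  // Array suffix form: "User[]".
  if (text.endsWith("[]")) return text.slice(0, -2).trim() || undefined;
  // Generic forms: "List<User>" or Python-style "List[User]".
  const open = text.search(/[<\[]/);
  if (open === -1) return undefined;
  const closeChar = text.charAt(open) === "<" ? ">" : "]";
  if (!text.endsWith(closeChar)) return undefined;
  const args = splitTopLevelArgs(text.slice(open + 1, -1));
  const last = args[args.length - 1];
  return last !== undefined && last !== "" ? last : undefined;
}
```

For example, `elementTypeFromText("List<User>")` and `elementTypeFromText("User[]")` both yield `"User"`, while a non-container string such as `"User"` yields `undefined`.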

Tier 0c: Pattern Binding

Pattern-matching constructs may introduce a new variable or temporarily narrow an existing one.

// C#
if (obj is User user) { user.Save(); }

// Java
if (obj instanceof User user) { user.save(); }

// Rust
if let Some(user) = opt { user.save(); }

// Python
match obj:
    case User() as user:
        user.save()

Binding behavior depends on the language:

first-writer-wins is used by default
position-indexed branch overrides are used where branch-local narrowing must not leak between branches, most notably Kotlin
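
The difference between the two strategies can be sketched as follows. The names `bindFirstWriterWins`, `BranchOverride`, and `typeAtPosition` are hypothetical, and the offsets stand in for the source ranges of two branches in something like a Kotlin `when (x) { is User -> ...; is Repo -> ... }`.

```typescript
// First-writer-wins: the first binding for a name sticks, later ones are ignored.
function bindFirstWriterWins(
  scope: Map<string, string>,
  varName: string,
  typeName: string,
): boolean {
  if (scope.has(varName)) return false;
  scope.set(varName, typeName);
  return true;
}

// Position-indexed override: a narrowing that is valid only inside one branch.
interface BranchOverride {
  varName: string;
  startIndex: number; // inclusive source offset of the branch body
  endIndex: number;   // exclusive source offset of the branch body
  typeName: string;
}

function typeAtPosition(
  overrides: BranchOverride[],
  varName: string,
  position: number,
): string | undefined {
  const hit = overrides.find(
    (o) => o.varName === varName && position >= o.startIndex && position < o.endIndex,
  );
  return hit?.typeName;
}

// With first-writer-wins, the second branch cannot re-narrow x.
const flatScope = new Map<string, string>();
bindFirstWriterWins(flatScope, "x", "User");
const secondWriteAccepted = bindFirstWriterWins(flatScope, "x", "Repo"); // false

// With overrides, each branch sees its own narrowing of x.
const branchOverrides: BranchOverride[] = [
  { varName: "x", startIndex: 10, endIndex: 30, typeName: "User" },
  { varName: "x", startIndex: 30, endIndex: 50, typeName: "Repo" },
];
```

In this sketch a call at offset 35 resolves `x` to `Repo`, whereas the flat scope would still report `User` for it.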

Tier 1: Initializer / Constructor Inference

When there is no explicit annotation, the resolver can infer a type from the initializer.

// TypeScript
const user = new User()

// C#
var user = new User()

// Kotlin
val user = User()

// Go
user := User{}
ptr := &User{}
user2 := new(User)

// Ruby
user = User.new

Some languages can identify constructor-like syntax directly. Others need validation through the SymbolTable, because syntax alone cannot always distinguish User() from getUser().

In those cases the system records an unverified binding candidate and later validates it against known class / struct symbols.
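
A minimal sketch of that validation step is shown below. The `PendingCtorBinding` record, the `kindsByName` map, and the rule that every symbol with the name must be class-like are assumptions for illustration, not the actual SymbolTable API.

```typescript
// Hypothetical symbol kinds and record shapes.
type SketchSymbolKind = "class" | "struct" | "function" | "method";

interface PendingCtorBinding {
  scopeKey: string;   // e.g. "" or "processUsers@100"
  varName: string;    // e.g. "user"
  calleeName: string; // e.g. "User" in `val user = User()`
}

interface VerifiedBinding {
  scopeKey: string;
  varName: string;
  typeName: string;
}

// Confirm a candidate only when every known symbol with that name is class-like.
function verifyCtorBindings(
  pending: PendingCtorBinding[],
  kindsByName: Map<string, SketchSymbolKind[]>,
): VerifiedBinding[] {
  const verified: VerifiedBinding[] = [];
  for (const p of pending) {
    const kinds = kindsByName.get(p.calleeName) ?? [];
    const classLike =
      kinds.length > 0 && kinds.every((k) => k === "class" || k === "struct");
    if (classLike) {
      verified.push({ scopeKey: p.scopeKey, varName: p.varName, typeName: p.calleeName });
    }
  }
  return verified;
}

const sketchKinds = new Map<string, SketchSymbolKind[]>([
  ["User", ["class"]],
  ["getUser", ["function"]],
]);

const verifiedBindings = verifyCtorBindings(
  [
    { scopeKey: "main@0", varName: "user", calleeName: "User" },
    { scopeKey: "main@0", varName: "other", calleeName: "getUser" },
    { scopeKey: "main@0", varName: "mystery", calleeName: "Unknown" },
  ],
  sketchKinds,
);
```

Only `user` is confirmed here; `other` and `mystery` stay unbound because their callees are a plain function and an unknown name.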

Tier 2: Assignment Chain Propagation

Bindings can propagate through simple identifier assignments.

const user: User = getUser()
const alias = user
const other = alias

This is handled after the main walk through a single pass over pending assignments.

This supports simple forward propagation, but there is no iterative fixpoint step. For example:

const b = a
const a: User = getUser()

will not resolve b.
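
The single-pass behavior can be sketched as below, assuming the pending assignments are visited once in the order they were recorded and that bindings from the main walk are already in the map. `PendingAssignment` and `propagateAssignments` are hypothetical names, and the sketch does not model declaration order within the file.

```typescript
interface PendingAssignment {
  target: string; // variable being declared, e.g. "alias"
  source: string; // identifier on the right-hand side, e.g. "user"
}

// One pass, no iteration to a fixpoint: each pending assignment is visited once.
function propagateAssignments(
  bindings: Map<string, string>,
  pending: PendingAssignment[],
): void {
  for (const { target, source } of pending) {
    if (bindings.has(target)) continue; // never overwrite an existing binding
    const sourceType = bindings.get(source);
    if (sourceType !== undefined) bindings.set(target, sourceType);
  }
}

// Forward chain: user -> alias -> other all resolve in one pass.
const forwardBindings = new Map<string, string>([["user", "User"]]);
propagateAssignments(forwardBindings, [
  { target: "alias", source: "user" },
  { target: "other", source: "alias" },
]);

// Reversed chain: b = a is visited before a = c, so b stays unresolved.
const reversedBindings = new Map<string, string>([["c", "User"]]);
propagateAssignments(reversedBindings, [
  { target: "b", source: "a" },
  { target: "a", source: "c" },
]);
```

A second pass over the reversed list would resolve `b`, which is exactly the iterative step the design leaves out.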

Container Type Descriptors

CONTAINER_DESCRIPTORS defines the type-parameter semantics for common containers.

That allows the resolver to distinguish key-yielding methods from value-yielding methods instead of always assuming the last generic argument.

for (const key of map.keys()) { ... }    // key → string
for (const val of map.values()) { ... }  // val → User

Unknown containers fall back to heuristics, keeping the system conservative rather than fully semantic.

Examples of descriptor-driven behavior

Map<K, V> / Dictionary<K, V> / similar key-value containers
List<T> / Array<T> / Vec<T> / Set<T> / similar single-element containers
method-aware yield selection such as .keys(), .values(), .keySet(), .Values
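
A descriptor table of this kind might be modeled as in the sketch below. The `ContainerDescriptorSketch` shape and its entries are assumptions; the real `CONTAINER_DESCRIPTORS` may store different fields.

```typescript
// Hypothetical descriptor: which type-argument index each access path yields.
interface ContainerDescriptorSketch {
  directIndex?: number;                // index yielded by iterating the container itself
  methodIndex: Record<string, number>; // index yielded by a specific method or property
}

const SKETCH_DESCRIPTORS = new Map<string, ContainerDescriptorSketch>([
  ["Map", { methodIndex: { keys: 0, keySet: 0, values: 1 } }],
  ["Dictionary", { methodIndex: { Keys: 0, Values: 1 } }],
  ["List", { directIndex: 0, methodIndex: {} }],
  ["Set", { directIndex: 0, methodIndex: {} }],
]);

function yieldedType(
  container: string,
  typeArgs: string[],
  method?: string,
): string | undefined {
  const descriptor = SKETCH_DESCRIPTORS.get(container);
  // Unknown container: fall back to the last generic argument.
  if (descriptor === undefined) return typeArgs[typeArgs.length - 1];
  const index =
    method === undefined ? descriptor.directIndex : descriptor.methodIndex[method];
  // No descriptor entry for this access path: stay unresolved.
  return index === undefined ? undefined : typeArgs[index];
}

const keyType = yieldedType("Map", ["string", "User"], "keys");     // "string"
const valueType = yieldedType("Map", ["string", "User"], "values"); // "User"
```

Leaving `directIndex` unset for `Map` is a choice made in this sketch only: direct iteration over a map yields entries in several languages, so returning no element type there avoids a misleading binding.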

Comment-Based Types

For less strictly typed ecosystems, the resolver can fall back to documentation-based type information.

Supported comment systems:

JSDoc for JavaScript / TypeScript
PHPDoc for PHP
YARD for Ruby

These are used conservatively and only when AST-level type information is missing or insufficient.
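
As a rough illustration of the fallback, the sketch below pulls parameter types out of a JSDoc comment and uses them only when no AST-derived type exists. `jsdocParamTypes` and `preferAstType` are hypothetical helpers; the real extractors in typescript.ts may parse comments differently.

```typescript
// Collect `@param {Type} name` pairs from a JSDoc comment (first occurrence wins).
function jsdocParamTypes(comment: string): Map<string, string> {
  const result = new Map<string, string>();
  const pattern = /@param\s+\{([^}]+)\}\s+([A-Za-z_$][\w$]*)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(comment)) !== null) {
    const typeText = match[1];
    const paramName = match[2];
    if (typeText !== undefined && paramName !== undefined && !result.has(paramName)) {
      result.set(paramName, typeText.trim());
    }
  }
  return result;
}

// Comment-based types are a fallback: an AST-derived type always takes priority.
function preferAstType(
  astType: string | undefined,
  docType: string | undefined,
): string | undefined {
  return astType ?? docType;
}

const docComment = `/**
 * @param {User} user
 * @param {Repo} repo
 */`;

const docTypes = jsdocParamTypes(docComment);
const userType = preferAstType(undefined, docTypes.get("user")); // "User"
```

Keeping the priority rule in one small function makes it harder for a comment to override an explicit annotation by accident.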

SymbolTable Interaction

Although the environment is built per file, it may consult the global SymbolTable in specific validation paths.

This is important for languages where constructor-like syntax is ambiguous. A binding candidate such as val user = User() may need confirmation that User is a class-like symbol rather than an ordinary function.

This means the system is still per-file in binding construction, but not completely isolated from project-wide symbol knowledge.

Deferred Binding Verification in Call Processing

A key detail is that some initializer bindings are not fully resolved inside TypeEnv itself.

call-processor.ts later verifies deferred bindings and may infer receiver types from:

validated class / struct constructor candidates
uniquely resolved function or method calls that expose a usable return type

So return-type-aware receiver inference already exists today in a constrained downstream form. Phase 7.3 extended this by threading ReturnTypeLookup into TypeEnv via ForLoopExtractorContext, which lets for-loop call-expression iterables (e.g., for (const u of getUsers())) resolve element types in TS/JS, Java, Kotlin, C#, Go, Rust, Python, and PHP.

General assignment propagation (var x = f() binding the return type of f into the scope env) remains pending: the pendingCallResults infrastructure exists but is dormant until Phase 9.
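
The "uniquely resolved" condition can be sketched as follows. `CallableDef` and `uniqueReturnType` are hypothetical; they are not the `ReturnTypeLookup` interface itself, whose shape is not shown in this document.

```typescript
// Hypothetical record for one function or method definition.
interface CallableDef {
  filePath: string;
  returnType?: string; // absent when no usable return type was extracted
}

// Use a return type only when the callee name resolves to exactly one definition.
function uniqueReturnType(
  defsByName: Map<string, CallableDef[]>,
  calleeName: string,
): string | undefined {
  const defs = defsByName.get(calleeName);
  if (defs === undefined || defs.length !== 1) return undefined;
  return defs[0]?.returnType;
}

const sketchDefs = new Map<string, CallableDef[]>([
  ["getUser", [{ filePath: "users.ts", returnType: "User" }]],
  ["save", [
    { filePath: "user.ts", returnType: "void" },
    { filePath: "repo.ts", returnType: "void" },
  ]],
  ["log", [{ filePath: "log.ts" }]],
]);

// `const user = getUser(); user.save()` can treat `user` as a User receiver.
const inferredReceiver = uniqueReturnType(sketchDefs, "getUser");
```

Ambiguous names such as `save` yield no type in this sketch, which keeps the inference from guessing between several definitions.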

Language Feature Matrix

Feature	TS	JS	Java	Kotlin	C#	Go	Rust	Python	PHP	Ruby	Swift	C++	C
Declarations	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Parameters	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Initializer / constructor inference	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Constructor binding scan	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
For-loop element types	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No	Yes	Yes
Pattern binding	Yes	Yes	Yes	Yes	No	Yes	Yes	No	No	No	No	No	No
Assignment chains	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No	Yes	Yes	Yes
Field/property type resolution	Yes	No†	Yes	Yes	Yes	Yes	Yes	Yes*	Yes	YARD	No	Yes	No‡
Comment-based types	JSDoc	JSDoc	No	No	No	No	No	No	PHPDoc	YARD	No	No	No
Return type extraction	JSDoc	JSDoc	No	No	No	No	No	No	PHPDoc	YARD	No	No	No
Write access (ACCESSES write)	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes§	Yes	Yes	Yes	No

* Python class-level annotated attributes (address: Address) now resolve declaredType correctly. The self.x instance attribute pattern is not yet supported.

† JS field topology is captured (field_definition → HAS_PROPERTY edges) but declaredType is never set — JS has no AST type annotations. Disambiguation via lookupFieldByOwner requires declaredType. JSDoc @type support is a Phase 9 candidate.

‡ C has no @definition.property query pattern. Struct member fields are not captured. C++ captures class/struct member fields via field_declaration.

§ PHP write access covers instance property writes ($obj->field = value) and static property writes (ClassName::$field = value). Nullsafe writes ($obj?->field = value) are not tracked because this is invalid PHP syntax — null-safe member access on the left-hand side of assignment is a parse error.

Current Strengths

The current system provides strong value for call resolution because it combines:

explicit annotation extraction across 13 languages
generic-aware loop element typing (including call-expression iterables)
initializer-based inference with SymbolTable validation
selected pattern-based narrowing
scope-aware lookups
comment-based fallbacks for dynamic ecosystems (JSDoc, PHPDoc, YARD)
constrained return-type-aware receiver inference in call processing
deep field/property chains up to 3 levels across 9 languages
ACCESSES edge emission for field read access (via chain walking) and field write access (via assignment capture) across 12 languages
mixed field+method chain resolution (e.g. svc.getUser().address.save())
type-preserving stdlib passthrough for unwrap(), clone(), expect(), etc.

This is enough to materially improve call-edge precision even without implementing a full static type system.

Current Limitations

Important gaps still remain:

no general cross-file propagation of inferred bindings
no fixpoint inference
limited branch-sensitive narrowing outside selected pattern constructs
limited Swift support compared with other languages
no complete destructuring-based field typing
no broad expression-level return-type propagation inside TypeEnv (for-loop call-expression iterables are resolved via ReturnTypeLookup in a subset of languages, but general var x = f() assignment propagation is pending)

Contributor Notes

When modifying this system, treat the following as load-bearing invariants:

- Conservatism matters more than recall: a missed binding is usually safer than a misleading receiver type.
- Scope-key format is shared behavior: if scope keys change, constructor-binding verification and any downstream lookup using those keys must change in sync.
- Tier naming may differ across code and PR discussions: for-loop element inference may appear as "Tier 0b" in documentation and "Tier 1c" in Phase 6 PR / test naming.
- Comment-based types are fallback signals, not primary truth: they should remain lower-trust than explicit AST-derived types.
- Return-type-aware inference already exists in constrained form: future roadmap work should extend and generalize it rather than reintroduce it from scratch.
