GitNexus's type resolution system maps variables to likely declared types across the supported languages so the ingestion pipeline can perform receiver-constrained call resolution.
When the code contains a call such as user.save(), the resolver tries to determine that user is a User, allowing call resolution to prefer User#save over unrelated methods such as Repo#save.
This system is designed to be:
- Conservative: it prefers missing a binding over introducing a misleading one
- Single-pass: bindings are collected during a single AST walk, with a limited post-pass for assignment propagation
- Scope-aware: function-local bindings are isolated from file-level bindings
- Per-file: the environment is built for one file at a time, though it may consult the global SymbolTable for validation in specific cases
It is not a full compiler type checker. Its job is to recover enough type information to improve call-edge accuracy during ingestion.
Type resolution sits between parsing and call resolution.

    parse-worker.ts
        │
        ▼
    buildTypeEnv(tree, language, symbolTable?)
        │
        ├──► TypeEnvironment.lookup(varName, callNode)
        │        │
        │        ▼
        │    call-processor.ts
        │      - resolves receiver type for method calls
        │      - filters candidates by receiver match
        │      - verifies deferred constructor / initializer bindings
        │
        └──► discarded after file processing

The TypeEnvironment is built once per file. call-processor.ts then uses lookup() to determine receiver types and narrow candidate symbols from the SymbolTable.
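The narrowing step can be pictured with a minimal sketch. The MethodSymbol shape and the filterByReceiver helper are hypothetical names invented for illustration, not the actual call-processor.ts API, and the name-only fallback is an assumption about what a conservative resolver might do.

```typescript
// Hypothetical symbol shape; the real SymbolTable entries are not shown in this document.
interface MethodSymbol {
  name: string;      // method name, e.g. "save"
  ownerType: string; // declaring class, e.g. "User"
}

// Keep only candidates whose owner matches the receiver type.
// With an unknown receiver type, fall back to name-only matching.
function filterByReceiver(
  candidates: MethodSymbol[],
  methodName: string,
  receiverType: string | undefined,
): MethodSymbol[] {
  const byName = candidates.filter((c) => c.name === methodName);
  if (receiverType === undefined) return byName;
  const narrowed = byName.filter((c) => c.ownerType === receiverType);
  // Assumption: if narrowing removes every candidate, keep the name-only set.
  return narrowed.length > 0 ? narrowed : byName;
}

const symbols: MethodSymbol[] = [
  { name: "save", ownerType: "User" },
  { name: "save", ownerType: "Repo" },
];

// user.save() with user resolved to User: only User#save remains.
const resolved = filterByReceiver(symbols, "save", "User");
console.log(resolved.map((s) => `${s.ownerType}#${s.name}`));
```

Without a receiver type, both User#save and Repo#save stay in the candidate set, which is the ambiguity the type environment is meant to reduce.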

                           ┌──────────────────────┐
                           │ type-env.ts          │
                           │                      │
                           │ buildTypeEnv()       │
                           │ - Single AST walk    │
                           │ - Scope tracking     │
                           │ - Tier orchestration │
                           └──────────┬───────────┘
                                      │ dispatches to
              ┌──────────────────────┬┴┬───────────────────────┐
              │                      │ │                       │
    ┌─────────▼──────────┐ ┌─────────▼─▼──────────┐ ┌──────────▼─────────┐
    │ shared.ts          │ │ <language>.ts        │ │ types.ts           │
    │                    │ │                      │ │                    │
    │ Container table    │ │ Per-language         │ │ Extractor          │
    │ Type helpers       │ │ extractors           │ │ interface defs     │
    │ Generic helpers    │ │ (shared + per-lang)  │ │                    │
    └────────────────────┘ └──────────────────────┘ └────────────────────┘

| File | Purpose |
|---|---|
| type-env.ts | Core engine. Walks the AST once, tracks scopes, collects bindings, and exposes buildTypeEnv() plus the TypeEnvironment interface. |
| types.ts | TypeScript interfaces for extractor hooks such as TypeBindingExtractor, ForLoopExtractor, and PatternBindingExtractor. |
| shared.ts | Language-agnostic helpers such as extractSimpleTypeName, extractElementTypeFromString, resolveIterableElementType, CONTAINER_DESCRIPTORS, and TYPED_PARAMETER_TYPES. |
| index.ts | Dispatch map from SupportedLanguages to LanguageTypeConfig. |
| typescript.ts | TypeScript and JavaScript extractors, including JSDoc support. |
| jvm.ts | Java and Kotlin extractors. |
| csharp.ts | C# extractors. |
| go.ts | Go extractors, including range semantics. |
| rust.ts | Rust extractors, including if let, match-related handling, and Self resolution. |
| python.ts | Python extractors, including match / case handling. |
| php.ts | PHP extractors, including PHPDoc support. |
| ruby.ts | Ruby extractors, including YARD support. |
| swift.ts | Swift extractors. Currently the most minimal configuration. |
| c-cpp.ts | Shared C / C++ extractors. |
The current type-resolution layer supports 13 languages:
- TypeScript
- JavaScript
- Python
- Java
- Kotlin
- C#
- Go
- Rust
- PHP
- Ruby
- Swift
- C
- C++
Not all languages have the same level of coverage. Swift remains the most minimal. C and some C++ cases naturally benefit less from receiver typing than object-oriented languages.
The type resolution layer is intentionally narrower than a compiler-grade type system.
It does:
- resolve variable types from declarations, parameters, initializers, loops, and selected pattern constructs
- normalize common wrappers such as nullable types and generic containers
- improve receiver matching during call resolution
- verify some ambiguous initializer bindings against the SymbolTable
It does not:
- perform full semantic type checking
- run fixpoint inference
- propagate inferred bindings across files as ordinary environment entries
- guarantee resolution for every ambiguous construct
buildTypeEnv() returns a TypeEnvironment that contains:
- scoped bindings collected from the current file
- deferred constructor / initializer binding candidates
- lookup helpers used by call resolution
- pattern override data for branch-local narrowing where supported
The environment is scope-aware so identical variable names in different functions do not collide.

    File scope ('')
    ├── config → Config
    ├── users → Map
    │
    ├── processUsers@100
    │   ├── user → User
    │   └── alias → User
    │
    └── processRepos@200
        └── repo → Repo

Scope keys use '' for file scope and functionName@startIndex for function-local scope.
These scope keys are also used later when verifying deferred bindings in call processing, so any future change to scope-key format must stay consistent across both layers.
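A minimal sketch of how such keys could bucket bindings follows. The scopeKey and bind helpers and the nested-map layout are hypothetical; only the key format ('' and functionName@startIndex) comes from the description above.

```typescript
// Hypothetical helper that produces the scope key for a binding.
function scopeKey(functionName?: string, startIndex?: number): string {
  if (functionName === undefined || startIndex === undefined) return ""; // file scope
  return `${functionName}@${startIndex}`;
}

// scope key -> (variable name -> type name)
const env = new Map<string, Map<string, string>>();

function bind(scope: string, varName: string, typeName: string): void {
  let bindings = env.get(scope);
  if (bindings === undefined) {
    bindings = new Map<string, string>();
    env.set(scope, bindings);
  }
  // Assumption for this sketch: the first binding for a name is kept.
  if (!bindings.has(varName)) bindings.set(varName, typeName);
}

bind(scopeKey(), "config", "Config");
bind(scopeKey("processUsers", 100), "user", "User");
bind(scopeKey("processRepos", 200), "repo", "Repo");

// The same variable name in another function would land in a different bucket.
console.log(env.get("processUsers@100")?.get("user"));
```

Because the key embeds the function's start index, two functions with the same name in one file still get separate buckets in this sketch.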
TypeEnvironment.lookup() resolves types in this effective order:
- special receivers
  - this, self, $this → enclosing class
  - super, base, parent → parent class
- position-indexed pattern overrides
- function-local scope
- file-level scope
Special receivers are handled as a dedicated fast path rather than ordinary lexical bindings.
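The lookup order can be sketched as a short cascade. LookupContext and lookupType are hypothetical names; the real lookup() takes a variable name and a call node, and derives the equivalent of this context from the AST.

```typescript
// Hypothetical, pre-computed context for a single call site.
interface LookupContext {
  enclosingClass?: string;              // class that contains the call site
  parentClass?: string;                 // its parent class, when known
  overridesAtCall: Map<string, string>; // pattern overrides active at this position
  localScope: Map<string, string>;      // function-local bindings
  fileScope: Map<string, string>;       // file-level bindings
}

const SELF_RECEIVERS = new Set(["this", "self", "$this"]);
const PARENT_RECEIVERS = new Set(["super", "base", "parent"]);

function lookupType(varName: string, ctx: LookupContext): string | undefined {
  // Special receivers take a dedicated fast path.
  if (SELF_RECEIVERS.has(varName)) return ctx.enclosingClass;
  if (PARENT_RECEIVERS.has(varName)) return ctx.parentClass;
  // Then pattern overrides, function-local scope, and file scope, in that order.
  return (
    ctx.overridesAtCall.get(varName) ??
    ctx.localScope.get(varName) ??
    ctx.fileScope.get(varName)
  );
}

const ctx: LookupContext = {
  enclosingClass: "UserService",
  parentClass: "BaseService",
  overridesAtCall: new Map<string, string>(),
  localScope: new Map([["user", "User"]]),
  fileScope: new Map([["user", "Admin"], ["config", "Config"]]),
};

// The function-local binding shadows the file-level one.
console.log(lookupType("user", ctx));
```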
Bindings are collected during the same AST walk. Higher-confidence sources win over weaker inference.
Direct extraction from AST type nodes.

    // TypeScript
    const user: User = getUser()
    // Java
    User user = getUser();
    // Go
    var user User
    // Rust
    let user: User = get_user();
    # Python
    user: User = get_user()

extractDeclaration() reads the declaration type node and normalizes it through extractSimpleTypeName().
Parameters are handled separately by extractParameter() using the same normalization logic. The shared TYPED_PARAMETER_TYPES set controls which AST node types are treated as typed parameters.
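To give a feel for this kind of normalization, here is a string-based sketch. simplifyTypeName is a hypothetical stand-in: the real extractSimpleTypeName() works on AST type nodes, and the exact wrappers it unwraps are not listed here, so the rules below are assumptions.

```typescript
// Hypothetical string-based normalizer; returns undefined when the shape is not simple.
function simplifyTypeName(raw: string): string | undefined {
  let t = raw.trim();
  // Drop nullable markers: `User?`, `User | null`, `User | undefined`.
  t = t.replace(/\?$/, "").replace(/\s*\|\s*(null|undefined)\b/g, "").trim();
  // Drop leading pointer / reference sigils: `*User`, `&User`, `&mut User`.
  t = t.replace(/^(&mut\s+|[*&]+)/, "").trim();
  // Unwrap Option-like wrappers (assumed set).
  const wrapped = /^(?:Option|Optional)<(.+)>$/.exec(t);
  if (wrapped) t = wrapped[1].trim();
  // Strip remaining generic arguments, then namespace qualifiers.
  t = t.replace(/<.*$/, "");
  const last = t.split(/::|\.|\\/).pop() ?? "";
  // Conservative: unions, tuples, and function types yield no binding.
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(last) ? last : undefined;
}

console.log(simplifyTypeName("User | null"));
console.log(simplifyTypeName("*models.User"));
console.log(simplifyTypeName("User | Admin"));
```

Returning undefined for the union case reflects the conservative stance: no binding is preferred over a guessed one.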
For-loop element inference (Tier 0b in this documentation) is also referred to as Tier 1c in Phase 6 PR and test naming.
For-each style loops often introduce a variable with no explicit type. In those cases, the resolver derives the loop variable type from the iterable's container type.

    // C#
    foreach (var user in users) { user.Save(); }
    // TypeScript
    for (const user of users) { user.save(); }
    // Rust
    for user in users { user.save(); }

This is handled by resolveIterableElementType() through a three-step cascade:
- Declaration type nodes: uses raw type annotation nodes when available, including cases such as User[] or List[User].
- Scope environment string: uses extractElementTypeFromString() to parse a stored type string.
- AST walk fallback: walks upward to enclosing declarations or parameters when needed.
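The second step of the cascade, parsing a stored type string, might look roughly like this. elementTypeFromString and splitTopLevelArgs are hypothetical helpers, and the last-argument rule is a simplifying assumption; the real extractElementTypeFromString() may behave differently.

```typescript
// Split generic arguments on commas that are not nested inside brackets.
function splitTopLevelArgs(args: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = "";
  for (let i = 0; i < args.length; i++) {
    const ch = args.charAt(i);
    if (ch === "<" || ch === "[") depth++;
    else if (ch === ">" || ch === "]") depth--;
    if (ch === "," && depth === 0) {
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  if (current.trim() !== "") parts.push(current.trim());
  return parts;
}

// Hypothetical parser: derive the element type from a type string.
function elementTypeFromString(typeStr: string): string | undefined {
  const t = typeStr.trim();
  if (t.endsWith("[]")) return t.slice(0, -2); // User[] -> User
  const generic = /^[\w.]+\s*[<\[](.+)[>\]]$/.exec(t); // List<User>, List[User]
  if (!generic) return undefined;
  const args = splitTopLevelArgs(generic[1]);
  return args[args.length - 1]; // assumption: the last generic argument is the element
}

console.log(elementTypeFromString("User[]"));
console.log(elementTypeFromString("List[User]"));
console.log(elementTypeFromString("Map<string, User>"));
```

A plain name such as User has no element type, so the function returns undefined and the cascade would move on to its next step.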
Pattern-matching constructs may introduce a new variable or temporarily narrow an existing one.

    // C#
    if (obj is User user) { user.Save(); }
    // Java
    if (obj instanceof User user) { user.save(); }
    // Rust
    if let Some(user) = opt { user.save(); }
    # Python
    match obj:
        case User() as user:
            user.save()

Binding behavior depends on the language:
- first-writer-wins is used by default
- position-indexed branch overrides are used where branch-local narrowing must not leak between branches, most notably Kotlin
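One way to picture position-indexed overrides is as a list of byte ranges, each carrying a narrowed type for one variable. The PatternOverride shape and the overrideAt helper are hypothetical, and the offsets in the example are made up.

```typescript
// Hypothetical record of a branch-local narrowing.
interface PatternOverride {
  varName: string;
  typeName: string;
  start: number; // inclusive start offset of the branch body
  end: number;   // exclusive end offset of the branch body
}

// Return the narrowed type only when the call position falls inside the branch.
function overrideAt(
  overrides: PatternOverride[],
  varName: string,
  position: number,
): string | undefined {
  const hit = overrides.find(
    (o) => o.varName === varName && position >= o.start && position < o.end,
  );
  return hit?.typeName;
}

// Two branches narrowing the same variable, e.g. `is User -> ...` and `is Repo -> ...`.
const overrides: PatternOverride[] = [
  { varName: "x", typeName: "User", start: 20, end: 40 },
  { varName: "x", typeName: "Repo", start: 45, end: 65 },
];

console.log(overrideAt(overrides, "x", 30)); // inside the first branch
console.log(overrideAt(overrides, "x", 50)); // inside the second branch
```

Outside both ranges the helper returns undefined, so the narrowing does not leak past the branch and ordinary scope lookup applies.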
When there is no explicit annotation, the resolver can infer a type from the initializer.

    // TypeScript
    const user = new User()
    // C#
    var user = new User();
    // Kotlin
    val user = User()
    // Go
    user := User{}
    ptr := &User{}
    user2 := new(User)
    # Ruby
    user = User.new

Some languages can identify constructor-like syntax directly. Others need validation through the SymbolTable, because syntax alone cannot always distinguish User() from getUser().
In those cases the system records an unverified binding candidate and later validates it against known class / struct symbols.
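The deferred check can be sketched as a filter over pending candidates. PendingBinding, SymbolKind, and verifyPending are hypothetical names, and the map keyed by scope and variable is an illustrative layout rather than the real data structure.

```typescript
type SymbolKind = "class" | "struct" | "function";

// Hypothetical unverified candidate recorded during the AST walk.
interface PendingBinding {
  scope: string;      // scope key, e.g. "main@0"
  varName: string;    // variable being initialized
  calleeName: string; // name that looked like a constructor call
}

// Accept a candidate only when the callee is a known class-like symbol.
function verifyPending(
  pending: PendingBinding[],
  symbolKinds: Map<string, SymbolKind>,
): Map<string, string> {
  const verified = new Map<string, string>();
  for (const p of pending) {
    const kind = symbolKinds.get(p.calleeName);
    if (kind === "class" || kind === "struct") {
      verified.set(`${p.scope}/${p.varName}`, p.calleeName);
    }
  }
  return verified;
}

const symbolKinds = new Map<string, SymbolKind>([
  ["User", "class"],
  ["getUser", "function"],
]);

// val user = User() is accepted; val other = getUser() is dropped.
const verified = verifyPending(
  [
    { scope: "main@0", varName: "user", calleeName: "User" },
    { scope: "main@0", varName: "other", calleeName: "getUser" },
  ],
  symbolKinds,
);
console.log(verified.get("main@0/user"));
```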
Bindings can propagate through simple identifier assignments.

    const user: User = getUser()
    const alias = user
    const other = alias

This is handled after the main walk through a single pass over pending assignments.
This supports simple forward propagation, but there is no iterative fixpoint step. For example:

    const b = a
    const a: User = getUser()

will not resolve b.
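A single non-iterative pass can be sketched as follows. PendingAssignment and propagateOnce are hypothetical, and visiting pending assignments once in source order is an assumption about the post-pass; the sketch only illustrates why a chain listed out of order is left partly unresolved.

```typescript
// Hypothetical record of `target = source` between two plain identifiers.
interface PendingAssignment {
  target: string;
  source: string;
}

// Visit each pending assignment exactly once; never retry unresolved ones.
function propagateOnce(
  bindings: Map<string, string>,
  pending: PendingAssignment[],
): void {
  for (const { target, source } of pending) {
    const sourceType = bindings.get(source);
    if (sourceType !== undefined && !bindings.has(target)) {
      bindings.set(target, sourceType);
    }
  }
}

// Forward chain: user -> alias -> other resolves fully.
const inOrder = new Map([["user", "User"]]);
propagateOnce(inOrder, [
  { target: "alias", source: "user" },
  { target: "other", source: "alias" },
]);

// Same chain listed out of order: `other` is visited before `alias` is bound.
const outOfOrder = new Map([["user", "User"]]);
propagateOnce(outOfOrder, [
  { target: "other", source: "alias" },
  { target: "alias", source: "user" },
]);

console.log(inOrder.get("other"), outOfOrder.get("other"));
```

A fixpoint loop would repeat the pass until nothing changes; this sketch, like the system it approximates, stops after one.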
CONTAINER_DESCRIPTORS defines the type-parameter semantics for common containers.
That allows the resolver to distinguish key-yielding methods from value-yielding methods instead of always assuming the last generic argument.

    for (const key of map.keys()) { ... }   // key → string
    for (const val of map.values()) { ... } // val → User

Unknown containers fall back to heuristics, keeping the system conservative rather than fully semantic.
The descriptors currently cover:
- Map<K, V> / Dictionary<K, V> / similar key-value containers
- List<T> / Array<T> / Vec<T> / Set<T> / similar single-element containers
- method-aware yield selection such as .keys(), .values(), .keySet(), .Values
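A descriptor table of this kind could be modeled as below. ContainerShape, CONTAINER_SHAPES, and yieldedType are hypothetical and do not reproduce the real CONTAINER_DESCRIPTORS; defaulting to the value type for two-parameter containers is an assumption.

```typescript
type YieldKind = "key" | "value";

// Hypothetical descriptor: arity plus which methods yield keys or values.
interface ContainerShape {
  arity: 1 | 2;
  methodYields: Map<string, YieldKind>;
}

const CONTAINER_SHAPES = new Map<string, ContainerShape>([
  ["Map", { arity: 2, methodYields: new Map<string, YieldKind>([["keys", "key"], ["keySet", "key"], ["values", "value"]]) }],
  ["Dictionary", { arity: 2, methodYields: new Map<string, YieldKind>([["Keys", "key"], ["Values", "value"]]) }],
  ["List", { arity: 1, methodYields: new Map<string, YieldKind>() }],
  ["Set", { arity: 1, methodYields: new Map<string, YieldKind>() }],
]);

function yieldedType(
  container: string,
  typeArgs: string[],
  method?: string,
): string | undefined {
  const shape = CONTAINER_SHAPES.get(container);
  // Unknown container: fall back to the last-argument heuristic.
  if (shape === undefined) return typeArgs[typeArgs.length - 1];
  if (shape.arity === 1) return typeArgs[0];
  const kind = method === undefined ? undefined : shape.methodYields.get(method);
  // Assumption: without a key-yielding method, use the value parameter.
  return kind === "key" ? typeArgs[0] : typeArgs[1];
}

console.log(yieldedType("Map", ["string", "User"], "keys"));
console.log(yieldedType("Map", ["string", "User"], "values"));
```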
For less strictly typed ecosystems, the resolver can fall back to documentation-based type information.
Supported comment systems:
- JSDoc for JavaScript / TypeScript
- PHPDoc for PHP
- YARD for Ruby
These are used conservatively and only when AST-level type information is missing or insufficient.
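As an illustration of the fallback ordering, the sketch below reads a JSDoc @param tag only when no AST-level type is available. jsDocParamType and resolveParamType are hypothetical helpers, and the regular expression handles only the simple {Type} name form.

```typescript
// Hypothetical extractor for `@param {Type} name` tags in a JSDoc comment.
function jsDocParamType(comment: string, paramName: string): string | undefined {
  const pattern = /@param\s+\{([^}]+)\}\s+([\w$]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(comment)) !== null) {
    if (match[2] === paramName) return match[1].trim();
  }
  return undefined;
}

// AST-derived types win; the comment is consulted only when they are missing.
function resolveParamType(
  astType: string | undefined,
  comment: string,
  paramName: string,
): string | undefined {
  return astType ?? jsDocParamType(comment, paramName);
}

const docComment = [
  "/**",
  " * @param {User} user",
  " * @param {Repo} repo",
  " */",
].join("\n");

console.log(resolveParamType(undefined, docComment, "repo"));
console.log(resolveParamType("Admin", docComment, "user"));
```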
Although the environment is built per file, it may consult the global SymbolTable in specific validation paths.
This is important for languages where constructor-like syntax is ambiguous. A binding candidate such as val user = User() may need confirmation that User is a class-like symbol rather than an ordinary function.
This means the system is still per-file in binding construction, but not completely isolated from project-wide symbol knowledge.
A key detail is that some initializer bindings are not fully resolved inside TypeEnv itself.
call-processor.ts later verifies deferred bindings and may infer receiver types from:
- validated class / struct constructor candidates
- uniquely resolved function or method calls that expose a usable return type
So return-type-aware receiver inference already exists in a constrained downstream form today. Phase 7.3 extended this by threading ReturnTypeLookup into TypeEnv via ForLoopExtractorContext, enabling for-loop call-expression iterables (e.g., for (const u of getUsers())) to resolve element types in TS/JS, Java, Kotlin, C#, Go, Rust, Python, and PHP. General assignment propagation (var x = f() binding the return type of f into the scope env) remains pending: the pendingCallResults infrastructure exists but is dormant until Phase 9.
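The for-loop case can be sketched with a callback standing in for the lookup. The ReturnTypeLookup alias below is a hypothetical function shape, not the real interface, and loopVarTypeFromCall handles only T[] and single-argument generic return types.

```typescript
// Hypothetical shape: maps a callee name to its declared return type, if unique.
type ReturnTypeLookup = (calleeName: string) => string | undefined;

// Resolve the loop variable type for `for (const u of callee())`.
function loopVarTypeFromCall(
  calleeName: string,
  lookupReturnType: ReturnTypeLookup,
): string | undefined {
  const returnType = lookupReturnType(calleeName);
  if (returnType === undefined) return undefined;
  // Accept only `User[]` or `Container<User>`; anything else stays unresolved.
  const m = /^(?:(\w+)\[\]|\w+<(\w+)>)$/.exec(returnType.trim());
  return m ? (m[1] ?? m[2]) : undefined;
}

const returnTypes = new Map([
  ["getUsers", "User[]"],
  ["listRepos", "List<Repo>"],
  ["getIndex", "Map<string, User>"],
]);
const lookupReturnType: ReturnTypeLookup = (callee) => returnTypes.get(callee);

console.log(loopVarTypeFromCall("getUsers", lookupReturnType));
console.log(loopVarTypeFromCall("listRepos", lookupReturnType));
```

The multi-argument Map case returns undefined here on purpose; choosing between key and value needs container semantics, which this sketch leaves out.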
| Feature | TS | JS | Java | Kotlin | C# | Go | Rust | Python | PHP | Ruby | Swift | C++ | C |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Declarations | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Parameters | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Initializer / constructor inference | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Constructor binding scan | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| For-loop element types | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| Pattern binding | Yes | Yes | Yes | Yes | No | Yes | Yes | No | No | No | No | No | No |
| Assignment chains | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes |
| Field/property type resolution | Yes | No† | Yes | Yes | Yes | Yes | Yes | Yes* | Yes | YARD | No | Yes | No‡ |
| Comment-based types | JSDoc | JSDoc | No | No | No | No | No | No | PHPDoc | YARD | No | No | No |
| Return type extraction | JSDoc | JSDoc | No | No | No | No | No | No | PHPDoc | YARD | No | No | No |
| Write access (ACCESSES write) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes§ | Yes | Yes | Yes | No |
* Python class-level annotated attributes (address: Address) now resolve declaredType correctly. The self.x instance attribute pattern is not yet supported.
† JS field topology is captured (field_definition → HAS_PROPERTY edges) but declaredType is never set — JS has no AST type annotations. Disambiguation via lookupFieldByOwner requires declaredType. JSDoc @type support is a Phase 9 candidate.
‡ C has no @definition.property query pattern. Struct member fields are not captured. C++ captures class/struct member fields via field_declaration.
§ PHP write access covers instance property writes ($obj->field = value) and static property writes (ClassName::$field = value). Nullsafe writes ($obj?->field = value) are not tracked because this is invalid PHP syntax — null-safe member access on the left-hand side of assignment is a parse error.
The current system provides strong value for call resolution because it combines:
- explicit annotation extraction across 13 languages
- generic-aware loop element typing (including call-expression iterables)
- initializer-based inference with SymbolTable validation
- selected pattern-based narrowing
- scope-aware lookups
- comment-based fallbacks for dynamic ecosystems (JSDoc, PHPDoc, YARD)
- constrained return-type-aware receiver inference in call processing
- deep field/property chains up to 3 levels across 9 languages
- ACCESSES edge emission for field read access (via chain walking) and field write access (via assignment capture) across 12 languages
- mixed field+method chain resolution (e.g. svc.getUser().address.save())
- type-preserving stdlib passthrough for unwrap(), clone(), expect(), etc.
This is enough to materially improve call-edge precision even without implementing a full static type system.
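Mixed chain resolution with passthrough calls can be sketched as a bounded walk. TypeInfo, ChainStep, and resolveChain are hypothetical; treating the 3-level limit as a cap on the number of steps, and counting passthrough calls toward it, are assumptions.

```typescript
// Hypothetical per-type metadata: field types and method return types.
interface TypeInfo {
  fields: Map<string, string>;
  methods: Map<string, string>;
}

interface ChainStep {
  kind: "field" | "call";
  memberName: string;
}

const PASSTHROUGH_CALLS = new Set(["unwrap", "clone", "expect"]);
const MAX_CHAIN_DEPTH = 3;

// Walk the receiver chain step by step; give up on any unknown member.
function resolveChain(
  startType: string,
  steps: ChainStep[],
  typeInfos: Map<string, TypeInfo>,
): string | undefined {
  if (steps.length > MAX_CHAIN_DEPTH) return undefined;
  let current: string | undefined = startType;
  for (const step of steps) {
    if (current === undefined) return undefined;
    // Passthrough calls keep the current type unchanged.
    if (step.kind === "call" && PASSTHROUGH_CALLS.has(step.memberName)) continue;
    const info = typeInfos.get(current);
    current =
      step.kind === "field"
        ? info?.fields.get(step.memberName)
        : info?.methods.get(step.memberName);
  }
  return current;
}

const typeInfos = new Map<string, TypeInfo>([
  ["Service", { fields: new Map<string, string>(), methods: new Map([["getUser", "User"]]) }],
  ["User", { fields: new Map([["address", "Address"]]), methods: new Map<string, string>() }],
]);

// Receiver of save() in svc.getUser().address.save(), with svc bound to Service.
const receiverType = resolveChain(
  "Service",
  [
    { kind: "call", memberName: "getUser" },
    { kind: "field", memberName: "address" },
  ],
  typeInfos,
);
console.log(receiverType);
```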
Important gaps still remain:
- no general cross-file propagation of inferred bindings
- no fixpoint inference
- limited branch-sensitive narrowing outside selected pattern constructs
- limited Swift support compared with other languages
- no complete destructuring-based field typing
- no broad expression-level return-type propagation inside TypeEnv (for-loop call-expression iterables are resolved via ReturnTypeLookup in the languages listed earlier, but general var x = f() assignment propagation is pending)
When modifying this system, treat the following as load-bearing invariants:
- Conservatism matters more than recall. A missed binding is usually safer than a misleading receiver type.
- Scope-key format is shared behavior. If scope keys change, constructor-binding verification and any downstream lookup using those keys must change in sync.
- Tier naming may differ across code and PR discussions. For-loop element inference may appear as "Tier 0b" in documentation and "Tier 1c" in Phase 6 PR / test naming.
- Comment-based types are fallback signals, not primary truth. They should remain lower-trust than explicit AST-derived types.
- Return-type-aware inference already exists in constrained form. Future roadmap work should extend and generalize it rather than reintroduce it from scratch.