Description
The following bug summary was produced during a bug investigation with OpenAI Codex (GPT 5.4). I (Alex Reinking) have reviewed all of the machine-generated code and text below for accuracy. This issue was discovered in Halide's CI here: halide/Halide#9073
If you are interested in using Halide's LLVM builds, they can be easily installed via uv:
$ mkdir halide-repro-env && cd halide-repro-env
$ uv init
Initialized project `halide-repro-env`
$ uv add halide-llvm --prerelease=allow --index https://pypi.halide-lang.org/simple
Using CPython 3.14.0
Creating virtual environment at: .venv
Resolved 2 packages in 108ms
Installed 1 package in 106ms
+ halide-llvm==23.0.0.dev86417+gf014202d
Note that these builds contain (close to) the minimum necessary to build Halide.
Summary
We are seeing an LLVM assertion failure when compiling AArch64 SVE2 IR that combines:
- llvm.aarch64.sve.udot.nxv2i64
- llvm.vector.reduce.add.nxv2i64
The crash occurs in LLVM's ExpandReductions pass, which appears to assume that vector_reduce_add operands are fixed-width vectors.
This was observed while compiling Halide-generated IR, but we reduced it to a standalone LLVM reproducer.
We have now confirmed the standalone reproducer against both:
- the packaged assertions-on LLVM used by Halide in .venv/.../halide_llvm/data
- a full local LLVM build at f014202dac32
Regression Window
For us, the regression appeared after updating LLVM 23 from:
69780be1
to:
f014202d
The most likely exposing change in that range is:
- 221d2f5 [AArch64] Add partial reduce patterns for new sve dot variants (#184649)
This is only a suspicion. The actual assertion is in generic LLVM ExpandReductions, not in AArch64-specific code.
Observed Failure
Assertion:
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"),
function cast, file Casting.h, line 572.
Standalone reproducer stack:
Running pass 'Function Pass Manager' on module '/tmp/reduce_udot_min.ll'.
Running pass 'Expand reduction intrinsics' on function '@f'
...
(anonymous namespace)::expandReductions(llvm::Function&, llvm::TargetTransformInfo const*)
llvm::FPPassManager::runOnFunction
llvm::FPPassManager::runOnModule
llvm::legacy::PassManagerImpl::run
Why This Looks Wrong
In llvm/lib/CodeGen/ExpandReductions.cpp, vector_reduce_add expansion does:
cast<FixedVectorType>(Vec->getType())
However, in this reproducer, Vec has type:
<vscale x 2 x i64>
That makes the cast invalid for scalable-vector reductions.
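To illustrate why a fixed-width type matters here, the following is a small self-contained C++ model of a halving (log2) add reduction. It is a sketch under an assumption, not LLVM code: it assumes the fixed-width expansion reduces lanes by repeatedly folding the upper half onto the lower half, which this report does not confirm, and halvingReduceAdd is a hypothetical name. The point is only that the number of folding steps depends on a lane count known when the expansion is emitted, whereas <vscale x 2 x i64> has 2 * vscale lanes with vscale known only at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a halving add reduction over a fixed-width vector.
// The lane count stands in for the compile-time element count of a
// fixed-width vector type; it must be a nonzero power of two.
inline std::uint64_t halvingReduceAdd(std::vector<std::uint64_t> lanes) {
    const std::size_t n = lanes.size();
    assert(n != 0 && (n & (n - 1)) == 0 && "lane count must be a power of two");

    // Each step folds the upper half of the live lanes onto the lower half,
    // so log2(n) steps leave the total in lane 0.
    for (std::size_t width = n / 2; width >= 1; width /= 2) {
        for (std::size_t i = 0; i < width; ++i) {
            lanes[i] += lanes[i + width];
        }
    }
    return lanes[0];
}
```

Calling halvingReduceAdd({1, 2, 3, 4}) folds twice and yields 10. With a scalable type there is no fixed n to drive the outer loop, which is why a fixed-width-only expansion cannot simply be applied to it.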
Minimal IR Reproducer
This .ll is sufficient to reproduce the assertion in a standalone LLVM API driver:
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "aarch64--linux-gnueabihf"
declare <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64>, <vscale x 8 x i16>, <vscale x 8 x i16>)
declare i64 @llvm.vector.reduce.add.nxv2i64(<vscale x 2 x i64>)
define i64 @f(ptr %p) {
entry:
  %a = load <vscale x 8 x i16>, ptr %p, align 16
  %p2 = getelementptr i8, ptr %p, i64 64
  %b = load <vscale x 8 x i16>, ptr %p2, align 16
  %d = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
  %r = call i64 @llvm.vector.reduce.add.nxv2i64(<vscale x 2 x i64> %d)
  ret i64 %r
}
Standalone Reproducer
I do not currently have a pure llc file.ll reproducer.
I do have a standalone reproducer using LLVM's own APIs. It reproduces both with the packaged assertions-on LLVM used by Halide and with a full local build of LLVM at f014202dac32.
Driver source:
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/CodeGen/CommandFlags.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Pass.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/ToolOutputFile.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Target/TargetOptions.h"
#include "llvm/TargetParser/Triple.h"
#include "llvm/Transforms/IPO/AlwaysInliner.h"
#include <memory>
#include <optional>
#include <string>
#include <system_error>
using namespace llvm;
static std::string getModuleFlagString(Module &M, StringRef Name) {
    if (auto *MD = M.getModuleFlag(Name)) {
        if (auto *MDS = dyn_cast<MDString>(MD)) {
            return MDS->getString().str();
        }
    }
    return "";
}
static std::unique_ptr<TargetMachine> makeTM(Module &M) {
    Triple TT = M.getTargetTriple();
    std::string TripleStr = TT.getTriple();
    std::string Error;
    const Target *T = TargetRegistry::lookupTarget(TT, Error);
    if (!T) {
        errs() << "lookupTarget failed for " << TripleStr << ": " << Error << "\n";
        return nullptr;
    }
    std::string CPU = getModuleFlagString(M, "halide_mcpu_target");
    std::string Features = getModuleFlagString(M, "halide_mattrs");
    TargetOptions Options;
    auto RM = std::optional<Reloc::Model>(Reloc::PIC_);
    return std::unique_ptr<TargetMachine>(
        T->createTargetMachine(TT, CPU, Features, Options, RM));
}
static bool emitOne(StringRef InPath, StringRef OutPath,
                    CodeGenFileType FileType) {
    LLVMContext Ctx;
    SMDiagnostic Err;
    auto M = parseIRFile(InPath, Err, Ctx);
    if (!M) {
        Err.print("llvm_emit_repro", errs());
        return false;
    }
    auto TM = makeTM(*M);
    if (!TM) {
        return false;
    }
    M->setDataLayout(TM->createDataLayout());
    std::error_code EC;
    auto Out = std::make_unique<ToolOutputFile>(OutPath, EC, sys::fs::OF_None);
    if (EC) {
        errs() << "open failed for " << OutPath << ": " << EC.message() << "\n";
        return false;
    }
    legacy::PassManager PM;
    PM.add(new TargetLibraryInfoWrapperPass(Triple(M->getTargetTriple())));
    PM.add(createAlwaysInlinerLegacyPass());
    TM->Options.MCOptions.AsmVerbose = true;
    TM->addPassesToEmitFile(PM, Out->os(), nullptr, FileType);
    PM.run(*M);
    Out->keep();
    return true;
}
int main(int argc, char **argv) {
    InitLLVM X(argc, argv);
    if (argc < 3) {
        errs() << "usage: llvm_emit_repro outdir file1.ll [file2.ll ...]\n";
        return 2;
    }
    LLVMInitializeAArch64TargetInfo();
    LLVMInitializeAArch64Target();
    LLVMInitializeAArch64TargetMC();
    LLVMInitializeAArch64AsmPrinter();
    LLVMInitializeAArch64AsmParser();
    std::string OutDir = argv[1];
    for (int i = 2; i < argc; i++) {
        std::string InPath = argv[i];
        std::string Base = sys::path::filename(InPath).str();
        std::string OutPath = OutDir + "/" + Base + ".s";
        errs() << "emitting " << InPath << " -> " << OutPath << "\n";
        if (!emitOne(InPath, OutPath, CodeGenFileType::AssemblyFile)) {
            return 1;
        }
    }
    return 0;
}
How I Built the Reproducer
I confirmed this with two different LLVM builds:
- an assertions-on packaged LLVM used by Halide
- a full local LLVM build at f014202dac32
The same reproducer source and the same minimal .ll triggered the assertion with both.
For a generic local LLVM build rooted at $LLVM_ROOT with build directory $LLVM_BUILD, the build command is:
cat > /tmp/llvm_emit_repro.cpp <<'EOF'
// paste the reproducer source from this issue here
EOF
cat > /tmp/reduce_udot_min.ll <<'EOF'
; paste the minimal IR from this issue here
EOF
/usr/bin/clang++ \
    -O0 /tmp/llvm_emit_repro.cpp \
    -I"$LLVM_ROOT/llvm/include" \
    -I"$LLVM_BUILD/include" \
    -std=c++17 \
    $(test -d /opt/homebrew/lib && echo -L/opt/homebrew/lib) \
    $("$LLVM_BUILD/bin/llvm-config" --ldflags --system-libs --libs all) \
    -o /tmp/llvm_emit_repro
Run command:
mkdir -p /tmp/llvm_emit_repro_out
/tmp/llvm_emit_repro /tmp/llvm_emit_repro_out /tmp/reduce_udot_min.ll
Observed output:
emitting /tmp/reduce_udot_min.ll -> /tmp/llvm_emit_repro_out/reduce_udot_min.ll.s
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 572.
...
Running pass 'Expand reduction intrinsics' on function '@f'
For a packaged LLVM installation rooted at $LLVM_PKG, an equivalent build command is:
SDK=$(xcrun --show-sdk-path)
"$LLVM_PKG/bin/clang++" \
    -isysroot "$SDK" \
    -O0 /tmp/llvm_emit_repro.cpp \
    -I"$LLVM_PKG/include" \
    -std=c++17 \
    -Wl,-rpath,"$LLVM_PKG/lib" \
    $("$LLVM_PKG/bin/llvm-config" --ldflags --system-libs --libs all) \
    -o /tmp/llvm_emit_repro_pkg
Run command:
mkdir -p /tmp/llvm_emit_repro_out
/tmp/llvm_emit_repro_pkg /tmp/llvm_emit_repro_out /tmp/reduce_udot_min.ll
Observed output was the same.
llc Status
For completeness: an assertions-enabled local llc build from the same local LLVM build did not reproduce for me on this minimal .ll:
llc -mtriple=aarch64-unknown-linux-gnu -mattr=+sve2 -o /tmp/reduce_udot_min.s /tmp/reduce_udot_min.ll
That succeeded.
So the issue is currently confirmed as a standalone LLVM API reproducer, but not yet as a standalone llc command-line reproducer.
Workaround
Disabling ExpandReductions avoids the crash in Halide:
HL_LLVM_ARGS='-disable-expand-reductions'
Expected Behavior
LLVM should not assert here. It should either:
- support scalable vector_reduce_add in ExpandReductions, or
- avoid expanding such reductions when only fixed-width handling exists.
Suggested Direction
The immediate problem seems to be that ExpandReductions assumes fixed-width vector operands for vector_reduce_add and friends. A guard for scalable vectors before any cast<FixedVectorType> would avoid the assertion and likely point to the intended target-specific handling path.
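The shape of such a guard can be sketched in a self-contained C++ model. This is not a patch against ExpandReductions.cpp: VectorTy, FixedVectorTy, ScalableVectorTy, checkedCastToFixed and expandableLaneCount are hypothetical stand-ins invented for illustration, and dynamic_cast plays the role of LLVM's type queries. The idea is simply to test for the scalable case first and only then perform the asserting cast.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for a vector type hierarchy; not LLVM's classes.
struct VectorTy {
    virtual ~VectorTy() = default;
};
struct FixedVectorTy : VectorTy {
    explicit FixedVectorTy(unsigned n) : numElements(n) {}
    unsigned numElements;  // exact lane count
};
struct ScalableVectorTy : VectorTy {
    explicit ScalableVectorTy(unsigned n) : minElements(n) {}
    unsigned minElements;  // actual lane count is minElements * vscale
};

// Models an asserting cast: a type mismatch trips the assertion.
inline const FixedVectorTy &checkedCastToFixed(const VectorTy &ty) {
    const auto *fixed = dynamic_cast<const FixedVectorTy *>(&ty);
    assert(fixed && "cast<Ty>() argument of incompatible type!");
    return *fixed;
}

// Guarded query: an empty result means "do not expand this reduction here".
inline std::optional<unsigned> expandableLaneCount(const VectorTy &ty) {
    if (dynamic_cast<const ScalableVectorTy *>(&ty)) {
        return std::nullopt;  // scalable: skip before the asserting cast
    }
    return checkedCastToFixed(ty).numElements;
}
```

In this model, expandableLaneCount(FixedVectorTy(4)) yields 4, while a ScalableVectorTy yields an empty optional instead of reaching the assertion. In real LLVM code the analogous test would presumably be an isa<ScalableVectorType> check on the operand type before the cast; where exactly it belongs in the pass is left open here.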