Benchmarking SpecExit #229

@dhia680

Description

Hi,
Thank you for the great work on SpecExit! I'm trying to benchmark the method using Qwen3-8B (and Qwen3-14B) as the base model.

Issue

I attempted to use a standard EAGLE3 draft model (AngelSlim/Qwen3-4B_eagle3) with the provided inference code, but encountered a weight shape mismatch:

```
RuntimeError: Error(s) in loading state_dict for Model: size mismatch for fc.weight: copying a param with shape torch.Size([2560, 7680]) from checkpoint, the shape in current model is torch.Size([2563, 7680]).
```

This is expected since standard EAGLE3 models don't include the +3 outputs for the CPR (Confidence-Progress-Remain) prediction head required by SpecExit.
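To make the mismatch concrete, here is a minimal sketch of a possible workaround while official checkpoints are unavailable: pad the standard EAGLE3 `fc.weight` with 3 freshly initialized rows so it matches the SpecExit shape. This is purely hypothetical — the function name is mine, and it assumes the CPR outputs are appended at the end of the `fc` output dimension, which may not match the actual SpecExit layout; the CPR rows would also be untrained.

```python
import torch

def expand_fc_for_cpr(fc_weight: torch.Tensor, num_extra: int = 3) -> torch.Tensor:
    """Pad a standard EAGLE3 fc weight (e.g. [2560, 7680]) to the SpecExit
    shape (e.g. [2563, 7680]) by appending `num_extra` new rows for the
    CPR (Confidence-Progress-Remain) head.
    Assumption: CPR outputs sit at the end of fc's output dimension."""
    extra = torch.empty(num_extra, fc_weight.shape[1])
    torch.nn.init.normal_(extra, std=0.02)  # small random init; these rows are untrained
    return torch.cat([fc_weight, extra], dim=0)

# Shapes matching the error message above
eagle3_fc = torch.zeros(2560, 7680)        # standard EAGLE3 checkpoint shape
specexit_fc = expand_fc_for_cpr(eagle3_fc)
print(specexit_fc.shape)                   # torch.Size([2563, 7680])
```

Even if this loads, the CPR head would predict garbage until fine-tuned, so released checkpoints (or the training recipe) would still be needed for a meaningful benchmark.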

Request

Would it be possible to release pre-trained SpecExit draft models that include the CPR head? Specifically:

  1. Qwen3-based models (e.g., Qwen3-8B + compatible draft model)
  2. Any models used in the paper's benchmarks (for reproduction purposes)

Alternatively, if releasing full models isn't feasible, could you share details of your draft-model training setup (training dataset, hyperparameters, duration, and any useful tips)?
In particular, sharing the sharegpt_train.json and sharegpt_test.json files would be very helpful, since the codebase includes no support for building the training dataset.

Failure details

  • Codebase used: https://anonymous.4open.science/r/SpecExit-B802
  • Running inference with gen_ea_answer.py
  • Benchmark: GSM8K
  • Base model: Qwen/Qwen3-8B
  • Attempted draft model (which turns out to be a standard EAGLE3 model): AngelSlim/Qwen3-4B_eagle3
