A Gradio-based web interface for testing distilled speech-to-speech translation models.
# Install dependencies
pip install -r demo/requirements.txt
# Run the demo
python demo/app.py
# Open http://localhost:7860 in your browser

- 🎙️ Record Audio: Use your microphone to record speech
- 📁 Upload Files: Support for WAV/MP3 audio files
- 🌐 Multiple Languages: EN↔ZH, ZH↔FR translation pairs
- 🔊 Auto Playback: Hear translations immediately
- 📊 Real-time Status: See translation progress and metrics
- Select a language pair from the dropdown
- Record audio or upload a file
- Click Translate
- Listen to the translated result!
Place your ONNX models in the models/ directory:
models/
├── en_zh/
│   └── model.onnx
├── zh_en/
│   └── model.onnx
├── zh_fr/
│   └── model.onnx
└── fr_zh/
    └── model.onnx
Then run the demo:
python demo/app.py --model-dir models

--host Host to bind to (default: 127.0.0.1)
--port Port to listen on (default: 7860)
--share Create a public Gradio link
--model-dir Directory containing trained models
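
The options above map naturally onto an `argparse` parser. The following is a minimal sketch, not the actual contents of `demo/app.py`: the function name `build_parser` is hypothetical, and the `models` default for `--model-dir` is an assumption, since defaults are stated only for `--host` and `--port`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="S2ST-Distill demo")
    parser.add_argument("--host", default="127.0.0.1", help="Host to bind to")
    parser.add_argument("--port", type=int, default=7860, help="Port to listen on")
    parser.add_argument("--share", action="store_true", help="Create a public Gradio link")
    # Assumption: the README does not state a default for --model-dir.
    parser.add_argument("--model-dir", default="models", help="Directory containing trained models")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the example self-contained.
args = build_parser().parse_args(["--port", "8080", "--share"])
print(f"host={args.host} port={args.port} share={args.share} model_dir={args.model_dir}")
```

Note that `argparse` exposes `--model-dir` as the attribute `args.model_dir`, with the hyphen converted to an underscore.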
If no trained models are found, the demo runs in mock mode:
- UI is fully functional
- Audio processing works (record, upload, playback)
- "Translation" returns slightly modified audio (for UI testing)
This allows you to test the interface before training models.
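
The fallback described above could be sketched as follows. This is illustrative only: `find_models`, `mock_translate`, and the 0.9 gain are hypothetical, and the README does not say how the real demo modifies audio in mock mode. The sketch assumes the `<pair>/model.onnx` layout shown earlier.

```python
from pathlib import Path


def find_models(model_dir: str) -> dict[str, Path]:
    """Map language-pair names (e.g. "en_zh") to the model.onnx files that exist."""
    root = Path(model_dir)
    if not root.is_dir():
        return {}
    return {
        pair_dir.name: pair_dir / "model.onnx"
        for pair_dir in sorted(root.iterdir())
        if (pair_dir / "model.onnx").is_file()
    }


def mock_translate(samples: list[float], gain: float = 0.9) -> list[float]:
    """Return slightly attenuated audio so the UI has something to play back."""
    return [sample * gain for sample in samples]


models = find_models("models")
mock_mode = not models  # no model.onnx found for any pair: fall back to mock mode
print("mock mode" if mock_mode else f"found pairs: {sorted(models)}")
```

Keeping discovery separate from translation means the UI code only needs to check one flag to decide between a real model and the mock path.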
┌─────────────────────────────────────────────────────────────┐
│ 🎤 S2ST-Distill Demo │
├─────────────────────────────────────────────────────────────┤
│ Language Pair: [English → Chinese ▼] │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────────────────────┐ │
│ │ 🎙️ Record │ │ 🔊 Translated Audio │ │
│ │ │ │ │ │
│ │ [● Record] │ │ [▶ Play] │ │
│ │ │ │ │ │
│ │ — OR — │ │ Status: │ │
│ │ │ │ ✅ Translated successfully! │ │
│ │ 📁 Upload │ │ 📥 Input: 3.2s from microphone │ │
│ │ [Choose File] │ │ 🌐 Direction: English → Chinese │ │
│ └─────────────────┘ └─────────────────────────────────┘ │
│ │
│ [🔄 Translate] │
└─────────────────────────────────────────────────────────────┘
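
The layout above could be wired up in Gradio roughly as follows. This is a sketch assuming Gradio 4.x or later (where `gr.Audio` accepts `sources` and `autoplay`), not the demo's actual code: `format_status`, `translate`, and `build_demo` are hypothetical names, the dropdown labels are inferred from the supported language pairs, and `translate` simply echoes its input in place of a real model.

```python
def format_status(duration_s: float, source: str, direction: str) -> str:
    """Build the status text shown next to the translated audio."""
    return (
        "✅ Translated successfully!\n"
        f"📥 Input: {duration_s:.1f}s from {source}\n"
        f"🌐 Direction: {direction}"
    )


def translate(recorded, uploaded, direction):
    """Placeholder callback: echoes the input audio instead of running a model."""
    # Prefer the microphone recording; otherwise use the uploaded file.
    audio, source = (recorded, "microphone") if recorded is not None else (uploaded, "file")
    if audio is None:
        return None, "Please record or upload audio first."
    sample_rate, samples = audio  # gr.Audio(type="numpy") yields (rate, samples)
    return audio, format_status(len(samples) / sample_rate, source, direction)


def build_demo():
    import gradio as gr  # imported lazily so the helpers above work without Gradio

    pairs = ["English → Chinese", "Chinese → English", "Chinese → French", "French → Chinese"]
    with gr.Blocks(title="S2ST-Distill Demo") as demo:
        direction = gr.Dropdown(choices=pairs, value=pairs[0], label="Language Pair")
        with gr.Row():
            with gr.Column():
                recorded = gr.Audio(sources=["microphone"], type="numpy", label="Record")
                uploaded = gr.Audio(sources=["upload"], type="numpy", label="Upload")
            with gr.Column():
                output = gr.Audio(label="Translated Audio", autoplay=True, interactive=False)
                status = gr.Textbox(label="Status", interactive=False)
        button = gr.Button("Translate")
        button.click(translate, inputs=[recorded, uploaded, direction], outputs=[output, status])
    return demo
```

Under these assumptions, `build_demo().launch(server_name="127.0.0.1", server_port=7860)` would serve the interface at the address used in the quick-start commands.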
- Python 3.10+
- Modern web browser with microphone access
- For full functionality: PyTorch + ONNX Runtime
MIT License - see LICENSE for details.