🎤 Whisper Speaker Diarization Demo

AI-powered speaker identification and transcription

🔬 Processing Information

💻 CPU processing only (slower than GPU)
📦 No file size limits - process any audio length
🌍 For multi-language audio, use larger models (medium, large-v2, large-v3)
⚡ Larger models provide better accuracy but take longer to process
⚠️ Very large files may take significant time and memory

📁 Upload Audio File (No size limit)

Supported: MP3, WAV, M4A, FLAC, etc.

🎯 Whisper Model

🌍 Language

🎯 Processing Mode

Standard Diarization | Speaker Separation

Standard: Traditional diarization | Separation: Pre-separate speakers

🎵 Audio Enhancement

🔢 Suppress Numerals

⚡ Batch Size

Range: 1 to 4

📚 How to Use

Upload audio (any size)
Choose processing mode
Configure settings (optional)
Click process and wait
Download results

🎯 Processing Modes

Standard: Traditional speaker diarization
Speaker Separation: Pre-separate speakers

🌍 Model Selection

tiny.en/base.en/small.en: Fast, English only
medium.en: Better accuracy, English only
medium/large-v2/large-v3: Best for multi-language audio
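
The model guidance above can be sketched as a small lookup. This is only an illustration: `choose_model` is a hypothetical helper and not part of the demo, and the specific picks within each tier (`small.en` for fast English, `medium` as the quicker multi-language option) are assumptions based on the notes that larger models are more accurate but slower.

```python
from typing import Optional

# Model names are taken from the Model Selection notes; which one to pick
# within each tier is an assumption made for this sketch.
ENGLISH_ONLY = {"fast": "small.en", "accurate": "medium.en"}
MULTILINGUAL = {"fast": "medium", "accurate": "large-v3"}


def choose_model(language: Optional[str], prefer_speed: bool = False) -> str:
    """Return a Whisper model name for the given language (hypothetical helper).

    `language` is a code or name such as "en" or "English"; None means the
    language is unknown or the audio mixes several languages.
    """
    tier = "fast" if prefer_speed else "accurate"
    if language is not None and language.lower() in ("en", "english"):
        return ENGLISH_ONLY[tier]
    # Anything that is not clearly English falls back to a multilingual model.
    return MULTILINGUAL[tier]


print(choose_model("en", prefer_speed=True))
print(choose_model(None))
```

Falling back to a multilingual model when the language is unknown trades processing time for coverage, which matters on CPU-only processing.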

⚠️ Large File Warning

Large files will take longer to process
Monitor system resources during processing
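
One way to keep an eye on time and memory when scripting a long job yourself is a small wrapper like the one below. It is a minimal sketch using only the Python standard library; `run_with_metrics` is a hypothetical name, and the demo itself may track resources differently or not at all.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def run_with_metrics(task: Callable[[], Any]) -> Tuple[Any, float, int]:
    """Run `task` and return (result, elapsed seconds, peak traced bytes).

    Hypothetical helper: tracemalloc only sees allocations made through
    Python's allocator, so memory used by native model code is not counted.
    """
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = task()
        elapsed = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if the task raises.
        tracemalloc.stop()
    return result, elapsed, peak_bytes


# Stand-in workload; a real call would wrap the transcription step.
result, seconds, peak = run_with_metrics(lambda: sum(range(1000)))
print(f"finished in {seconds:.3f}s, peak traced memory {peak} bytes")
```

For whole-process memory, including native libraries, an OS-level tool such as a task manager or `top` gives a more complete picture.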

📋 Results