metadata
pipeline_tag: voice-activity-detection
license: bsd-2-clause
tags:
  - speech-processing
  - semantic-vad
  - multilingual
datasets:
  - pipecat-ai/smart-turn-data-v3-train
  - pipecat-ai/smart-turn-data-v3-test
Smart Turn v3
Smart Turn v3 is an open‑source semantic Voice Activity Detection (VAD) model that tells you whether a speaker has finished their turn by analysing the raw waveform, not the transcript.
Links
- Blog post: Smart Turn v3
- GitHub repo with training and inference code
- Datasets: pipecat-ai/smart-turn-data-v3-train and pipecat-ai/smart-turn-data-v3-test
 
Model architecture
- Backbone: Whisper Tiny encoder
- Head: shallow linear classifier
- Params: 8 M (int8)
- Checkpoint: 8 MB ONNX
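
The parameter count and checkpoint size listed above agree with each other: int8 stores one byte per weight, so about 8 M parameters occupy about 8 MB before any file-format overhead. The quick check below treats both figures as the rounded values given here; the fp32 line is only a comparison point, not a statement about any published checkpoint.

```python
PARAMS = 8_000_000      # "8 M" from the list above (rounded)
BYTES_PER_INT8 = 1      # int8 = 8 bits = 1 byte per weight
BYTES_PER_FP32 = 4      # 32-bit float, shown only for comparison

size_mb = PARAMS * BYTES_PER_INT8 / 1_000_000       # decimal megabytes
fp32_size_mb = PARAMS * BYTES_PER_FP32 / 1_000_000  # same weights stored as fp32

print(f"int8: {size_mb:.0f} MB, fp32: {fp32_size_mb:.0f} MB")
# prints "int8: 8 MB, fp32: 32 MB"
```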
 
How to use
Please see the blog post and GitHub repo for more information on using the model, either standalone or with Pipecat.
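
As a first step before wiring the model into an application, it can help to open the ONNX checkpoint and list the tensors it declares. The sketch below is not the project's inference code: it assumes `onnxruntime` is installed, and `smart-turn-v3.onnx` is a hypothetical local filename. It deliberately stops short of running inference, because the audio preprocessing the model expects is documented in the GitHub repo rather than here.

```python
from typing import Any


def load_session(model_path: str) -> Any:
    """Open an ONNX checkpoint on CPU (requires `pip install onnxruntime`)."""
    import onnxruntime as ort  # imported lazily so describe_io works without it

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def describe_io(session: Any) -> dict:
    """Return the names, shapes, and dtypes of the declared inputs and outputs."""

    def summarize(nodes):
        return [
            {"name": node.name, "shape": list(node.shape), "type": node.type}
            for node in nodes
        ]

    return {
        "inputs": summarize(session.get_inputs()),
        "outputs": summarize(session.get_outputs()),
    }


# Example usage (hypothetical filename; download the checkpoint first):
#   session = load_session("smart-turn-v3.onnx")
#   print(describe_io(session))
```

Reading the input name from the session rather than hard-coding it keeps the call to `session.run` correct even if a later checkpoint renames its tensors.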