A voice activity detection (VAD) library for Unity.
It reads voice data from any source (IVoiceSource, e.g. recording with UnityEngine.Microphone),
detects voice activity with any detection logic,
and writes the voice data to any buffer (IVoiceBuffer, e.g. a WAV file) while voice is active.
You can customize the voice source, the voice buffer, and the voice activity detection logic to fit your use case.
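
The source, detection, and buffer flow described above can be pictured with a minimal, self-contained C# sketch. Everything in it is an assumption made for illustration: ISketchSource, ISketchBuffer, ArraySource, ListBuffer, and SketchPipeline are hypothetical stand-ins, not the library's actual IVoiceSource or IVoiceBuffer definitions (whose members may differ), and the peak-volume threshold is only a placeholder for "any detection logic".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a voice source: yields chunks of audio samples.
public interface ISketchSource
{
    IEnumerable<float[]> ReadSegments();
}

// Hypothetical stand-in for a voice buffer: receives samples only while voice is active.
public interface ISketchBuffer
{
    void Write(float[] segment);
}

// A source that replays segments held in memory (in place of a microphone).
public sealed class ArraySource : ISketchSource
{
    private readonly float[][] segments;
    public ArraySource(params float[][] segments) => this.segments = segments;
    public IEnumerable<float[]> ReadSegments() => segments;
}

// A buffer that collects samples in a list (in place of a WAV file or AudioClip).
public sealed class ListBuffer : ISketchBuffer
{
    public List<float> Samples { get; } = new List<float>();
    public void Write(float[] segment) => Samples.AddRange(segment);
}

public static class SketchPipeline
{
    // Placeholder detection logic (an assumption): a segment counts as active
    // when its peak absolute amplitude reaches the threshold.
    public static bool IsActive(float[] segment, float threshold)
        => segment.Length > 0 && segment.Max(s => Math.Abs(s)) >= threshold;

    // Pulls segments from the source and forwards only the active ones to the buffer.
    public static void Run(ISketchSource source, ISketchBuffer buffer, float threshold)
    {
        foreach (var segment in source.ReadSegments())
        {
            if (IsActive(segment, threshold))
            {
                buffer.Write(segment);
            }
        }
    }
}
```

Because the source and the buffer sit behind interfaces in this sketch, either side can be swapped without touching the detection step, which is the kind of customization the description above refers to.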
- Sources
  - UnityEngine.Microphone -> UnityMicrophoneSource
  - UnityEngine.AudioSource (OnAudioFilterRead callback) -> UnityAudioSource
  - Native microphone
- Buffers
  - Null (detection only) -> NullVoiceBuffer
  - Wave file (by NAudio) -> WaveFileVoiceBuffer
  - AudioClip -> AudioClipBuffer
- Voice activity detection logics
  - Queueing-based simple VAD logic -> QueueingVoiceActivityDetector
    - Less memory usage but less stability
  - Cumulative VAD logic -> CumulativeVoiceActivityDetector
    - More stability but more memory usage and less noise robustness
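
To make the idea of a queueing-based detector more concrete, here is a rough sketch of one plausible reading: keep a fixed-size queue of recent "loud or not" flags and switch state based on the fraction of loud segments in that window. This is an assumption for illustration only. SketchQueueingDetector and all of its parameters are hypothetical, and the code is not the implementation or the API of the library's QueueingVoiceActivityDetector.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a window-based detector; not the library's class.
public sealed class SketchQueueingDetector
{
    private readonly Queue<bool> recent = new Queue<bool>();
    private readonly int capacity;
    private readonly float volumeThreshold;
    private readonly float activationRate;
    private readonly float inactivationRate;
    private int loudCount;

    public bool IsActive { get; private set; }

    public SketchQueueingDetector(
        int capacity,
        float volumeThreshold,
        float activationRate,
        float inactivationRate)
    {
        if (capacity <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(capacity));
        }

        this.capacity = capacity;
        this.volumeThreshold = volumeThreshold;
        this.activationRate = activationRate;
        this.inactivationRate = inactivationRate;
    }

    // Feeds the peak volume of one segment and returns the updated state.
    public bool Push(float volume)
    {
        var loud = volume >= volumeThreshold;
        recent.Enqueue(loud);
        if (loud)
        {
            loudCount++;
        }

        // Drop the oldest flag once the window exceeds its capacity.
        if (recent.Count > capacity && recent.Dequeue())
        {
            loudCount--;
        }

        // Make no decision until the window is full.
        if (recent.Count < capacity)
        {
            return IsActive;
        }

        var rate = (float)loudCount / capacity;
        if (!IsActive && rate >= activationRate)
        {
            IsActive = true;
        }
        else if (IsActive && rate <= inactivationRate)
        {
            IsActive = false;
        }

        return IsActive;
    }
}
```

In this sketch the two separate rates give the detector hysteresis: with a window of 4, an activation rate of 0.75, and an inactivation rate of 0.25, a single quiet segment in the middle of speech does not end the active state. Memory use is bounded by the window size, since only one flag per segment is kept.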
Add the following dependencies to your project's /Packages/manifest.json.
{
  "dependencies": {
    "com.mochineko.voice-activity-detection": "https://github.com/mochi-neko/voice-activity-detection-unity.git?path=/Assets/Mochineko/VoiceActivityDetection#0.4.2",
    "com.cysharp.unitask": "https://github.com/Cysharp/UniTask.git?path=src/UniTask/Assets/Plugins/UniTask",
    "com.neuecc.unirx": "https://github.com/neuecc/UniRx.git?path=Assets/Plugins/UniRx/Scripts",
    "com.naudio.core": "https://github.com/mochi-neko/simple-audio-codec-unity.git?path=/Assets/NAudio/NAudio.Core#0.2.0",
    ...
  }
}

- VAD as component
- VAD with echo
- VAD by AudioSource
- VAD with OpenAI/Whisper API transcription
- VAD by cumulative logic
The items listed above are the provided samples. See also Samples.
See CHANGELOG.
See NOTICE.
Licensed under the MIT license.