Lance / android-vad

Android Voice Activity Detection (VAD)

This VAD library processes audio in real time using a Gaussian Mixture Model (GMM), which helps identify the presence of human speech in an audio sample that contains a mixture of speech and noise. The VAD works offline, and all processing is done on the device.

The library is based on the WebRTC VAD from Google, which is reportedly one of the best available: it's fast, modern and free. The algorithm has been widely adopted and has become one of the gold standards for delay-sensitive scenarios such as web-based interaction.

If you are looking for higher accuracy and faster processing time, I recommend using Deep Neural Networks (DNN). For reference, please see the following paper with a DNN vs GMM comparison.

Parameters

The VAD library accepts only a 16-bit mono PCM audio stream and works with the following sample rates, frame sizes and classifiers.

Valid Sample Rate    Valid Frame Size (samples)
8000 Hz              80, 160, 240
16000 Hz             160, 320, 480
32000 Hz             320, 640, 960
48000 Hz             480, 960, 1440

Valid Classifiers
NORMAL
LOW_BITRATE
AGGRESSIVE
VERY_AGGRESSIVE
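
Every frame size listed above works out to 10, 20 or 30 ms of audio at its sample rate (for example, 160 samples at 16000 Hz is 10 ms). The sketch below checks a sample rate and frame size pair against the table before audio is captured. `VadFrameTable` and its methods are hypothetical helpers written for this illustration; they are not part of the library's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the library): mirrors the table of valid
// sample rates and frame sizes given in this README.
final class VadFrameTable {
    private static final Map<Integer, int[]> VALID = new LinkedHashMap<>();
    static {
        VALID.put(8000, new int[] {80, 160, 240});
        VALID.put(16000, new int[] {160, 320, 480});
        VALID.put(32000, new int[] {320, 640, 960});
        VALID.put(48000, new int[] {480, 960, 1440});
    }

    /** Returns true if frameSize (in samples) is listed for sampleRateHz. */
    static boolean isValid(int sampleRateHz, int frameSize) {
        int[] sizes = VALID.get(sampleRateHz);
        if (sizes == null) {
            return false; // sample rate not in the table
        }
        for (int size : sizes) {
            if (size == frameSize) {
                return true;
            }
        }
        return false;
    }

    /** Duration of one frame in milliseconds: samples * 1000 / samples per second. */
    static int frameDurationMillis(int sampleRateHz, int frameSize) {
        return frameSize * 1000 / sampleRateHz;
    }

    public static void main(String[] args) {
        System.out.println(isValid(16000, 160));             // prints "true"
        System.out.println(frameDurationMillis(16000, 160)); // prints "10"
    }
}
```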

Silence duration (ms) - used by the Continuous Speech detector. It defines the duration of negative results that is necessary and sufficient to recognize the audio as silence.

Voice duration (ms) - used by the Continuous Speech detector. It defines the duration of positive results that is necessary and sufficient to recognize the audio as speech.
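
One way to picture how the two durations interact is a small state machine that counts per-frame results and only switches state once a run is long enough. The sketch below is an illustration under stated assumptions, not the library's implementation: `SpeechDebouncer` is a hypothetical class, and resetting a run as soon as one opposite frame arrives is an assumption, since this README does not say how the library counts.

```java
// Hypothetical illustration (not the library's code): debounces per-frame
// speech/noise decisions using a voice duration and a silence duration.
final class SpeechDebouncer {
    private final int voiceFrames;   // speech frames in a row needed to report speech
    private final int silenceFrames; // noise frames in a row needed to report silence
    private int speechRun;
    private int noiseRun;
    private boolean speaking;

    SpeechDebouncer(int frameMillis, int voiceDurationMillis, int silenceDurationMillis) {
        if (frameMillis <= 0) {
            throw new IllegalArgumentException("frameMillis must be positive");
        }
        this.voiceFrames = Math.max(1, voiceDurationMillis / frameMillis);
        this.silenceFrames = Math.max(1, silenceDurationMillis / frameMillis);
    }

    /** Feeds one per-frame decision and returns the debounced state (true = speech). */
    boolean accept(boolean frameIsSpeech) {
        if (frameIsSpeech) {
            speechRun++;
            noiseRun = 0; // assumption: a single speech frame resets the noise run
            if (speechRun >= voiceFrames) {
                speaking = true;
            }
        } else {
            noiseRun++;
            speechRun = 0; // assumption: a single noise frame resets the speech run
            if (noiseRun >= silenceFrames) {
                speaking = false;
            }
        }
        return speaking;
    }

    public static void main(String[] args) {
        // 10 ms frames with 500 ms durations: 50 frames in a row flip the state.
        SpeechDebouncer debouncer = new SpeechDebouncer(10, 500, 500);
        boolean state = false;
        for (int i = 0; i < 50; i++) {
            state = debouncer.accept(true);
        }
        System.out.println(state); // prints "true"
    }
}
```

With 10 ms frames, a 500 ms duration corresponds to 50 frames in a row under this model.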

Recommended parameters:

  • Sample Rate - 16 kHz
  • Frame Size - 160
  • Mode - VERY_AGGRESSIVE
  • Silence Duration - 500 ms
  • Voice Duration - 500 ms

Usage

The VAD supports two different ways of detecting speech:

  1. The Continuous Speech listener is designed to detect long utterances without returning false positives when the user pauses between sentences.
        Vad vad = new Vad(VadConfig.newBuilder()
                .setSampleRate(VadConfig.SampleRate.SAMPLE_RATE_16K)
                .setFrameSize(VadConfig.FrameSize.FRAME_SIZE_160)
                .setMode(VadConfig.Mode.VERY_AGGRESSIVE)
                .setSilenceDurationMillis(500)
                .setVoiceDurationMillis(500)
                .build());

        vad.start();

        // audioFrame is a short[] holding one frame of 16-bit mono PCM samples
        vad.addContinuousSpeechListener(audioFrame, new VadListener() {
            @Override
            public void onSpeechDetected() {
                // speech detected!
            }

            @Override
            public void onNoiseDetected() {
                // noise detected!
            }
        });

        vad.stop();
  2. The Speech detector is designed to detect speech or noise in small audio frames and returns a result for every frame. This method will not work for long utterances.
        Vad vad = new Vad(VadConfig.newBuilder()
                .setSampleRate(VadConfig.SampleRate.SAMPLE_RATE_16K)
                .setFrameSize(VadConfig.FrameSize.FRAME_SIZE_160)
                .setMode(VadConfig.Mode.VERY_AGGRESSIVE)
                .build());

        vad.start();

        // audioFrame is a short[] holding one frame of 16-bit mono PCM samples
        boolean isSpeech = vad.isSpeech(audioFrame);

        vad.stop();
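
Both snippets take a `short[]` frame, while recorded audio often arrives as raw bytes. The sketch below shows one way to split a byte buffer of 16-bit mono PCM into frames of a given size. `PcmFrames` is a hypothetical helper, not part of the library, and the little-endian byte order is an assumption (it is the WAV convention); adjust it if your source differs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the library): turns raw 16-bit mono PCM
// bytes into fixed-size short[] frames.
final class PcmFrames {
    /**
     * Splits little-endian 16-bit PCM bytes into frames of frameSize samples.
     * A trailing partial frame is dropped.
     */
    static List<short[]> split(byte[] pcm, int frameSize) {
        if (frameSize <= 0) {
            throw new IllegalArgumentException("frameSize must be positive");
        }
        // Assumption: samples are stored little-endian.
        ByteBuffer buffer = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        List<short[]> frames = new ArrayList<>();
        while (buffer.remaining() >= frameSize * 2) { // 2 bytes per sample
            short[] frame = new short[frameSize];
            for (int i = 0; i < frameSize; i++) {
                frame[i] = buffer.getShort();
            }
            frames.add(frame);
        }
        return frames;
    }

    public static void main(String[] args) {
        byte[] oneSecondOfSilence = new byte[16000 * 2]; // 16000 samples, 2 bytes each
        System.out.println(split(oneSecondOfSilence, 160).size()); // prints "100"
    }
}
```

Each returned frame can then be passed to `vad.isSpeech(...)` as in the snippet above.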

Requirements

Android VAD supports Android 4.1 (Jelly Bean) and later.

Development

To open the project in Android Studio:

  1. Go to File menu or the Welcome Screen
  2. Click on Open...
  3. Navigate to VAD's root directory.
  4. Select settings.gradle

Download

Gradle is the only supported build configuration, so just add the dependency to your project build.gradle file:

  1. Add the JitPack repository to your root build.gradle at the end of repositories:
        allprojects {
            repositories {
                maven { url 'https://jitpack.io' }
            }
        }
  2. Add the dependency:
        dependencies {
            implementation 'com.github.gkonovalov:android-vad:1.0.1'
        }

You can also download the precompiled AAR library and APK files from the GitHub releases page.


Georgiy Konovalov 2021 (c) MIT License

Copyright 2019 Georgiy Konovalov

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
