
SGS VAD ALGORITHM USER GUIDE


REVISION HISTORY

Revision No.  Description                                                            Date
1.0           Initial release                                                        03/27/2024
1.01          Add JSON API description                                               12/09/2024
1.02          Update file description and copyright                                  04/25/2025
1.03          Add note of sampling rate                                              05/08/2025
1.04          Remove copyright                                                       05/14/2025
1.05          Corrected format                                                       10/28/2025
1.1           Modified API return value and updated description of Chapter 1        11/25/2025
1.2           Add IaaVad_GetResult API and remove vad_result param from IaaVad_Run   12/03/2025

    1. Overview

    1.1. Module Description

    The Voice Activity Detection (VAD) algorithm processes the input sound to detect whether the current input contains voice activity.


    1.2. Basic Structure

    Once the working memory has been allocated and the VAD has been initialized and configured, the algorithm requires only an input signal buffer. The VAD processes this buffer according to the configured parameters and mode, and the detection result is then written to the VAD output result pointer (see IaaVad_GetResult).


    1.3. Function Introduction

    VAD:

    Detects the intervals of the audio signal that contain speech activity.


    1.4. Application Scenarios

    In communication systems, VAD determines whether the user is speaking, so that bandwidth usage can be reduced during silent periods, which benefits low-power applications. It can also reduce the computational load: by identifying speech segments, it allows the system to process only the portions that actually contain speech.
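
As an illustration of processing only the speech portions, the sketch below gates downstream work on the per-frame VAD result and keeps the gate open for a few frames after speech ends (a "hangover"), so that word endings are not cut off. The SpeechGate type and the gate_init and gate_update functions are hypothetical helpers written for this guide, not part of the VAD library, and the hangover length is an application choice.

```c
/* Hypothetical helper, not part of the VAD library: keeps the stream
 * "active" for hangover_frames frames after the last speech frame. */
typedef struct {
    int hangover_frames; /* frames to stay active after speech ends */
    int remaining;       /* hangover frames still left */
} SpeechGate;

void gate_init(SpeechGate *g, int hangover_frames)
{
    g->hangover_frames = hangover_frames;
    g->remaining = 0;
}

/* vad_result is the per-frame value obtained from IaaVad_GetResult
 * (0: no speech, 1: speech). Returns 1 if the current frame should be
 * processed or transmitted, 0 if it can be skipped. */
int gate_update(SpeechGate *g, int vad_result)
{
    if (vad_result) {
        g->remaining = g->hangover_frames; /* re-arm the hangover */
        return 1;
    }
    if (g->remaining > 0) {
        g->remaining--;                    /* still inside the hangover */
        return 1;
    }
    return 0;
}
```

In the processing loop, the application would call gate_update once per frame with the latest VAD result and run the expensive stage (encoding, transmission, recognition) only when it returns 1.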


    1.5. Chip Difference Description

    The VAD algorithm performs consistently across different chip series, with no observable differences.


    1.6. Example Introduction

    The following example uses the VAD API to obtain the memory size required by the VAD algorithm, initialize the VAD handle, configure the parameters and the running mode of the handle, run the VAD algorithm, and release the VAD resources.

        #include <stdio.h>
        #include <string.h>
        #include <time.h>
        #include <stdlib.h>
        #ifndef OS_WINDOWS
        #include <sys/ioctl.h>
        #endif
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <sys/time.h>
    
        #include "AudioVadProcess.h"
    
    
        #define USE_MALLOC   (1)
        unsigned int WorkingBuffer2[1] = {0};
    
        typedef unsigned char               uint8;
        typedef unsigned short              uint16;
        typedef unsigned long               uint32;
    
        float AVERAGE_RUN(int a)
        {
            static unsigned int num = 0;
            static float avg = 0;
            if(num == 0) avg = 0;
            num++;
            avg = avg + ((float)a - avg) / ((float)num);
            return avg;
        }
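        /* Note: despite its name, this helper returns microseconds. The value
           wraps around in 32 bits, which is harmless for the short T1 - T0
           intervals measured below. */
        unsigned int _OsCounterGetMs(void)
        {
            struct  timeval t1;
            gettimeofday(&t1,NULL);
            unsigned int T = (unsigned int)t1.tv_sec * 1000000u + (unsigned int)t1.tv_usec;
            return T;
        }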
    
    
        int main(int argc, char *argv[])
        {
            short input[1024];
            char input_file[512];
            unsigned int T0, T1;
            float avg = 0;
            int counter=0;
            int mode = 1;
            int vad_result=0;
        #if USE_MALLOC
            char *working_buf_ptr = (char*)malloc(IaaVad_GetBufferSize());
        #else
            char working_buf_ptr[512*100*2];
        #endif
    
            FILE * fin;
            int ret1;
    
            VAD_HANDLE handle;
            VadInit vad_init;
            VadConfig vad_config;
    
            int PN=128;
    
            vad_init.point_number = PN;
            vad_init.channel = 1;
            vad_init.sample_rate = IAA_VAD_SAMPLE_RATE_8000;
    
            vad_config.vote_frame = 100;
            vad_config.sensitivity = VAD_SEN_HIGH;
            handle = IaaVad_Init((char *)working_buf_ptr, &vad_init);
            if(handle==NULL)
            {
                printf("VAD init error\r\n");
                return -1;
            }
            else
            {
                printf("VAD init succeed\r\n");
            }
    
            if(IaaVad_Config(handle, &vad_config))
            {
                printf("Config Error!\n");
                return -1;
            }
            if(IaaVad_SetMode(handle, mode))
            {
                printf("SetMode Error!\n");
                return -1;
            }
            if(argc < 2)
                sprintf(input_file,"%s","./../sample/data/merge_test3.wav");
            else
                strcpy(input_file, argv[1]);
    
            fin = fopen(input_file, "rb");
            if(!fin)
            {
                printf("the input file %s could not be opened\n",input_file);
                return -1;
            }
            fread(input, sizeof(char), 44, fin); // read header 44 bytes
            while(fread(input, sizeof(short), vad_init.point_number*vad_init.channel, fin))
            {
    
                counter++;
                T0  = (long)_OsCounterGetMs();
                ret1 = IaaVad_Run(handle, input);
                IaaVad_GetResult(handle, &vad_result);
                T1  = (long)_OsCounterGetMs();
                avg += (T1 - T0);
    
                if(counter%1000== 999)
                {
                    printf("counter = %d\n", counter);
                    printf("current time = %f\t", (float)counter*PN/vad_init.sample_rate);
                    printf("vad result = %d\t",vad_result);
                    printf("process time = %u(us)\n", T1 - T0);
                }
    
                if(ret1 < 0)
                {
                    printf("Error occurred in Voice Activity Detection\n");
                    break;
                }
    
            }
            if(counter > 0)
                avg /= counter;
            printf("AVG is %.2f us\n",avg);
            IaaVad_Free(handle);
        #if USE_MALLOC
            free(working_buf_ptr);
        #endif
            fclose(fin);
    
            printf("Done\n");
            return 0;
        }
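
The timing helper in the example above keeps microseconds in a 32-bit value, which wraps around roughly every 71 minutes. For longer measurements, a 64-bit variant can be used instead. The sketch below is one possible replacement; timeval_to_us and now_us are hypothetical names chosen for this guide, and only the POSIX gettimeofday call is assumed to be available.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Convert a timeval to microseconds using 64-bit arithmetic, so the
 * multiplication cannot overflow for any realistic timestamp. */
uint64_t timeval_to_us(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000u + (uint64_t)tv->tv_usec;
}

/* Current wall-clock time in microseconds. */
uint64_t now_us(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return timeval_to_us(&tv);
}
```

With this helper, T0 and T1 would be declared as uint64_t and the per-frame processing time is simply T1 - T0 in microseconds.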
    

    The following example uses the VAD API to read the parameters from the VAD JSON file, obtain the memory size required by the VAD algorithm, initialize the VAD handle, configure the parameters and the running mode of the handle, run the VAD algorithm, and release the VAD resources.

        #include <stdio.h>
        #include <string.h>
        #include <time.h>
        #include <stdlib.h>
        #ifndef OS_WINDOWS
        #include <sys/ioctl.h>
        #endif
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <sys/time.h>
    
        #include "AudioVadProcess.h"
    
    
        #define USE_MALLOC   (1)
        unsigned int WorkingBuffer2[1] = {0};
    
        typedef unsigned char               uint8;
        typedef unsigned short              uint16;
        typedef unsigned long               uint32;
    
        float AVERAGE_RUN(int a)
        {
            static unsigned int num = 0;
            static float avg = 0;
            if(num == 0) avg = 0;
            num++;
            avg = avg + ((float)a - avg) / ((float)num);
            return avg;
        }
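        /* Note: despite its name, this helper returns microseconds. The value
           wraps around in 32 bits, which is harmless for the short T1 - T0
           intervals measured below. */
        unsigned int _OsCounterGetMs(void)
        {
            struct  timeval t1;
            gettimeofday(&t1,NULL);
            unsigned int T = (unsigned int)t1.tv_sec * 1000000u + (unsigned int)t1.tv_usec;
            return T;
        }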
    
    
        int main(int argc, char *argv[])
        {
            short input[1024];
            char input_file[512];
            char output_file[512];
            unsigned int T0, T1;
            float avg = 0;
            int counter=0;
            FILE *fin, *fout;
            int ret1;
            int vad_result;
    
            VAD_HANDLE handle;
            VadInit vad_init;
            VadConfig vad_config;
            VadOption vad_option;
            char vad_para_json_file[512];
            sprintf(vad_para_json_file,"%s","./../sample/data/VadParamJson.json");
            unsigned int vad_para_buffersize = IaaVad_GetJsonFileSize(vad_para_json_file);
    
        #if USE_MALLOC
            char *working_buf_ptr = (char*)malloc(IaaVad_GetBufferSize());
            char *vad_para_json_buf_ptr = (char*)malloc(vad_para_buffersize);
        #else
            char working_buf_ptr[512*100*2];
            char vad_para_json_buf_ptr[512*100*2];
        #endif
    
            memset(&vad_init,0,sizeof(VadInit));
            memset(&vad_config,0,sizeof(VadConfig));
            memset(&vad_option,0,sizeof(VadOption));
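            /* Read the init, config, and option parameters from the JSON file; stop at the first failure. */
            ret1 = IaaVad_InitReadFromJson(&vad_init, vad_para_json_buf_ptr, vad_para_json_file, vad_para_buffersize);
            if(ret1 == 0)
                ret1 = IaaVad_ConfigReadFromJson(&vad_config, vad_para_json_buf_ptr, vad_para_json_file, vad_para_buffersize);
            if(ret1 == 0)
                ret1 = IaaVad_OptionReadFromJson(&vad_option, vad_para_json_buf_ptr, vad_para_json_file, vad_para_buffersize);
            if(ret1 != 0)
            {
                printf("Read JSON Error!\n");
                return -1;
            }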
    
    
            handle = IaaVad_Init((char *)working_buf_ptr, &vad_init);
            if(handle==NULL)
            {
                printf("VAD init error\r\n");
                return -1;
            }
            else
            {
                printf("VAD init succeed\r\n");
            }
    
            if(IaaVad_Config(handle, &vad_config))
            {
                printf("Config Error!\n");
                return -1;
            }
            if(IaaVad_SetMode(handle, vad_option.mode))
            {
                printf("SetMode Error!\n");
                return -1;
            }
            if(argc < 2)
                sprintf(input_file,"%s","./../sample/data/merge_test3.wav");
            else
                strcpy(input_file, argv[1]);
    
            sprintf(output_file,"%s","./../sample/data/VAD_result.txt");
    
            fin = fopen(input_file, "rb");
            if(!fin)
            {
                printf("the input file %s could not be opened\n",input_file);
                return -1;
            }
    
            fout = fopen(output_file, "w");
            if(!fout)
            {
                printf("the output file %s could not be opened\n",output_file);
                fclose(fin);
                return -1;
            }
    
            fread(input, sizeof(char), 44, fin); // read header 44 bytes
            fprintf(fout,"%s\t%s\n","time","vad result");
            while(fread(input, sizeof(short), vad_init.point_number*vad_init.channel, fin))
            {
                counter++;
                T0  = (long)_OsCounterGetMs();
                ret1 = IaaVad_Run(handle, input);
                IaaVad_GetResult(handle, &vad_result);
                T1  = (long)_OsCounterGetMs();
                avg += (T1 - T0);
    
                if(counter%1000== 999)
                {
                    printf("counter = %d\n", counter);
                    printf("current time = %f\t", (float)counter*vad_init.point_number/vad_init.sample_rate);
                    printf("vad result = %d\t",vad_result);
                    printf("process time = %u(us)\n", T1 - T0);
                }
    
                fprintf(fout,"%f\t%d\n",(float)counter*vad_init.point_number/vad_init.sample_rate,vad_result);
    
                if(ret1 < 0)
                {
                    printf("Error occurred in Voice Activity Detection\n");
                    break;
                }
            }
            if(counter > 0)
                avg /= counter;
            printf("AVG is %.2f us\n",avg);
            IaaVad_Free(handle);
        #if USE_MALLOC
            free(working_buf_ptr);
            free(vad_para_json_buf_ptr);
        #endif
            fclose(fin);
            fclose(fout);
            printf("Done\n");
            return 0;
        }
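
Both examples skip a fixed 44-byte WAV header and assume that the file matches the configured sampling rate and channel count. The sketch below shows one way an application might check this assumption before running the VAD. It handles only the canonical 44-byte PCM header layout (with the "fmt " chunk at offset 12); WavInfo and wav_parse_header are hypothetical names, not part of the VAD library.

```c
#include <stdint.h>
#include <string.h>

/* Fields read from a canonical 44-byte PCM WAV header. */
typedef struct {
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
} WavInfo;

/* WAV stores multi-byte fields in little-endian order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills info on success, or -1 if the buffer does not start
 * with a RIFF/WAVE header whose "fmt " chunk is at offset 12. */
int wav_parse_header(const unsigned char hdr[44], WavInfo *info)
{
    if (memcmp(hdr, "RIFF", 4) != 0 ||
        memcmp(hdr + 8, "WAVE", 4) != 0 ||
        memcmp(hdr + 12, "fmt ", 4) != 0)
        return -1;
    info->channels        = read_le16(hdr + 22);
    info->sample_rate     = read_le32(hdr + 24);
    info->bits_per_sample = read_le16(hdr + 34);
    return 0;
}
```

After reading the first 44 bytes into an unsigned char buffer, the application can compare info.sample_rate and info.channels with vad_init.sample_rate and vad_init.channel, and confirm that bits_per_sample is 16, since the examples read the samples as short.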
    

    2. API Reference

    API Name                    Function
    IaaVad_GetBufferSize        Get the memory size required to run the VAD algorithm
    IaaVad_Init                 Initialize the VAD algorithm
    IaaVad_Config               Set the VAD algorithm parameters
    IaaVad_Run                  Run the VAD algorithm
    IaaVad_Free                 Free the VAD algorithm resources
    IaaVad_SetMode              Set the VAD operation mode
    IaaVad_GetJsonFileSize      Get the size of the VAD JSON file
    IaaVad_InitReadFromJson     Read the JSON parameters into the VAD init structure
    IaaVad_ConfigReadFromJson   Read the JSON parameters into the VAD config structure
    IaaVad_OptionReadFromJson   Read the JSON parameters into the VAD option structure
    IaaVad_GetResult            Get the VAD algorithm result

    2.1. IaaVad_GetBufferSize

    • Function

      Get the memory size required to run the VAD algorithm.

    • Syntax

      unsigned int IaaVad_GetBufferSize(void);
      
    • Parameter

      Parameter Name Description Input/Output
      N/A
    • Return Value

      The return value is the memory size required to run the VAD algorithm.

    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Note

      This interface only returns the required memory size. The application is responsible for allocating and releasing the memory.

    • Example

      Please refer to the Example Introduction

    2.2. IaaVad_Init

    • Function

      Initialize the VAD algorithm.

    • Syntax

      VAD_HANDLE IaaVad_Init(char* const working_buffer_address, VadInit *vad_init);
      
    • Parameter

      Parameter name Description Input/Output
      working_buffer_address Memory address used by VAD algorithm Input
      vad_init Initialization structure pointer of VAD algorithm Input
    • Return value

      Return value Result
      Not NULL Successful
      NULL Failed
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Example

      Please refer to the Example Introduction

    2.3. IaaVad_Config

    • Function

      Set VAD algorithm parameters.

    • Syntax

      ALGO_VAD_RET IaaVad_Config(VAD_HANDLE handle, VadConfig *vad_config);
      
    • Parameter

      Parameter name Description Input/Output
      handle VAD algorithm handle Input
      vad_config VAD algorithm parameter setting structure Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Example

      Please refer to the Example Introduction

    2.4. IaaVad_Run

    • Function

      Run VAD algorithm.

    • Syntax

      ALGO_VAD_RET IaaVad_Run(VAD_HANDLE handle,short* pss_audio_in);
      
    • Parameter

      Parameter name Description Input/Output
      handle Algorithm handle Input
      pss_audio_in Input data pointer Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Example

      Please refer to the Example Introduction

    2.5. IaaVad_Free

    • Function

      Free the resources of the VAD algorithm.

    • Syntax

      ALGO_VAD_RET IaaVad_Free(VAD_HANDLE handle);
      
    • Parameter

      Parameter name Description Input/Output
      handle VAD algorithm handle Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Note

      Call IaaVad_Free before releasing the memory used by the VAD algorithm.

    • Example

      Please refer to the Example Introduction

    2.6. IaaVad_SetMode

    • Function

      Set VAD algorithm operation mode.

    • Syntax

      ALGO_VAD_RET IaaVad_SetMode(VAD_HANDLE handle, int mode);
      
    • Parameter

      Parameter name Description Input/Output
      handle VAD algorithm handle Input
      mode Selects the VAD algorithm. Modes 0 and 1 are two different traditional voice detection algorithms; modes 2 and 3 are voice detection methods based on deep learning. Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Example

      Please refer to the Example Introduction

    2.7. IaaVad_GetJsonFileSize

    • Function

      Get the size of the VAD JSON file.

    • Syntax

      unsigned int IaaVad_GetJsonFileSize(char* jsonfile);
      
    • Parameter

      Parameter name Description Input/Output
      jsonfile Name of the JSON file Input
    • Return Value

      The return value is the memory size required to decode the JSON file.

    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    2.8. IaaVad_InitReadFromJson

    • Function

      Read the parameters from the JSON file into the VAD init structure.

    • Syntax

      ALGO_VAD_RET IaaVad_InitReadFromJson(VadInit* vad_init, char* jsonBuffer, char* jsonfile, unsigned int buffSize);
      
    • Parameter

      Parameter name Description Input/Output
      vad_init VAD init structure to be filled Output
      jsonBuffer Memory address of the JSON buffer Input
      jsonfile Name of the JSON file Input
      buffSize Size of the JSON file, as returned by IaaVad_GetJsonFileSize Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    2.9. IaaVad_ConfigReadFromJson

    • Function

      Read the parameters from the JSON file into the VAD config structure.

    • Syntax

      ALGO_VAD_RET IaaVad_ConfigReadFromJson(VadConfig* vad_config, char* jsonBuffer, char* jsonfile, unsigned int buffSize);
      
    • Parameter

      Parameter name Description Input/Output
      vad_config VAD config structure to be filled Output
      jsonBuffer Memory address of the JSON buffer Input
      jsonfile Name of the JSON file Input
      buffSize Size of the JSON file, as returned by IaaVad_GetJsonFileSize Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    2.10. IaaVad_OptionReadFromJson

    • Function

      Read the parameters from the JSON file into the VAD option structure.

    • Syntax

      ALGO_VAD_RET IaaVad_OptionReadFromJson(VadOption* vad_option, char* jsonBuffer, char* jsonfile, unsigned int buffSize);
      
    • Parameter

      Parameter name Description Input/Output
      vad_option VAD option structure to be filled Output
      jsonBuffer Memory address of the JSON buffer Input
      jsonfile Name of the JSON file Input
      buffSize Size of the JSON file, as returned by IaaVad_GetJsonFileSize Input
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    2.11. IaaVad_GetResult

    • Function

      Get VAD algorithm result.

    • Syntax

      ALGO_VAD_RET IaaVad_GetResult(VAD_HANDLE handle, int* vad_result);
      
    • Parameter

      Parameter name Description Input/Output
      handle VAD algorithm handle Input
      vad_result VAD result. 0: no speech detected. 1: speech detected. Output
    • Return Value

      Return value Result
      0 Successful
      Non-zero Failed, refer to error code
    • Dependency

      • Header file: AudioVadProcess.h

      • Library file: libVAD_LINUX.so/ libVAD_LINUX.a

    • Example

      Please refer to the Example Introduction
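
Because IaaVad_GetResult yields one 0/1 value per processed frame, an application often wants to turn a sequence of results into speech intervals. The sketch below shows one way to do this, assuming the per-frame results have already been collected into an array. print_speech_segments is a hypothetical helper, not part of the VAD library; point_number and sample_rate correspond to the VadInit fields.

```c
#include <stdio.h>

/* Hypothetical helper: scans per-frame VAD results (0 or 1) and prints each
 * run of consecutive speech frames as a start/end time in seconds.
 * Returns the number of speech segments found. */
int print_speech_segments(const int *results, int num_frames,
                          unsigned int point_number, unsigned int sample_rate)
{
    double frame_sec = (double)point_number / (double)sample_rate;
    int segments = 0;
    int start = -1; /* first frame of the currently open segment, or -1 */

    /* Iterate one step past the end so that a trailing segment is closed. */
    for (int i = 0; i <= num_frames; i++) {
        int speech = (i < num_frames) ? results[i] : 0;
        if (speech && start < 0) {
            start = i;                 /* a speech segment begins */
        } else if (!speech && start >= 0) {
            printf("speech: %.3f s - %.3f s\n",
                   start * frame_sec, i * frame_sec);
            segments++;
            start = -1;                /* the segment has ended */
        }
    }
    return segments;
}
```

Note that the reported times are frame boundaries only. In modes where the result is delayed (see vote_frame in VadConfig), the application may need to shift them accordingly.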


    3. VAD Data Type

    The relevant data types of the VAD module are defined as follows:

    Data Type Definition
    IAA_VAD_SAMPLE_RATE Sampling rate type of VAD algorithm
    VadSensitivity Sensitivity type of VAD algorithm
    VadInit Initialization data structure type of VAD algorithm
    VadConfig Parameter setting structure type of VAD algorithm
    VAD_HANDLE Handle type of VAD algorithm
    VadOption Option setting structure type of VAD algorithm

    3.1. IAA_VAD_SAMPLE_RATE

    • Description

      Define the sampling rate type of the VAD algorithm.

    • Definition

      typedef enum {
          IAA_VAD_SAMPLE_RATE_8000  =  8000,
          IAA_VAD_SAMPLE_RATE_16000 = 16000,
          IAA_VAD_SAMPLE_RATE_48000 = 48000,
      }IAA_VAD_SAMPLE_RATE;
      
    • Member

      Member name Description
      IAA_VAD_SAMPLE_RATE_8000 Sampling rate 8000Hz
      IAA_VAD_SAMPLE_RATE_16000 Sampling rate 16000Hz
      IAA_VAD_SAMPLE_RATE_48000 Sampling rate 48000Hz
    • Note

      Modes 2 and 3 (see IaaVad_SetMode) support only the 8000 Hz sampling rate.

    • Related Data Type and Interface

      VadInit
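
The sampling-rate restriction in the note above can be checked by the application before it calls IaaVad_Init and IaaVad_SetMode. The sketch below is a minimal pre-check under the constraints stated in this guide (modes 0 to 3; modes 2 and 3 only at 8000 Hz). vad_mode_rate_ok is a hypothetical helper, and the enum is repeated locally only so that the sketch compiles on its own; a real build would include AudioVadProcess.h instead.

```c
/* Local copy of the enum from this section so the sketch is self-contained;
 * in a real build, include AudioVadProcess.h instead. */
typedef enum {
    IAA_VAD_SAMPLE_RATE_8000  =  8000,
    IAA_VAD_SAMPLE_RATE_16000 = 16000,
    IAA_VAD_SAMPLE_RATE_48000 = 48000,
} IAA_VAD_SAMPLE_RATE;

/* Hypothetical pre-check: returns 1 if the mode/sample-rate pair is allowed
 * by the constraints documented in this guide, 0 otherwise. */
int vad_mode_rate_ok(int mode, IAA_VAD_SAMPLE_RATE rate)
{
    if (mode < 0 || mode > 3)
        return 0; /* only modes 0 to 3 are documented */

    if (rate != IAA_VAD_SAMPLE_RATE_8000 &&
        rate != IAA_VAD_SAMPLE_RATE_16000 &&
        rate != IAA_VAD_SAMPLE_RATE_48000)
        return 0; /* not one of the documented sampling rates */

    if (mode >= 2)
        return rate == IAA_VAD_SAMPLE_RATE_8000; /* modes 2 and 3: 8000 Hz only */

    return 1;
}
```

The library reports its own errors (for example ALGO_VAD_RET_INVALID_SAMPLE_RATE or ALGO_VAD_RET_INVALID_MODE), so this check is only a convenience for failing early with a clearer message.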

    3.2. VadSensitivity

    • Description

      Define the sensitivity type of the VAD algorithm.

    • Definition

      typedef enum {
          VAD_SEN_LOW,
          VAD_SEN_MID,
          VAD_SEN_HIGH,
      }VadSensitivity;
      
    • Member

      Member name Description
      VAD_SEN_LOW Low sensitivity, harder to detect human voice
      VAD_SEN_MID Medium sensitivity
      VAD_SEN_HIGH High sensitivity, easier to detect human voice
    • Note

      N/A.

    • Related Data Type and Interface

      VadConfig

    3.3. VadInit

    • Description

      Define the initialization parameter type of the VAD algorithm.

    • Definition

      typedef struct
      {
          unsigned int point_number;
          unsigned int channel;
          IAA_VAD_SAMPLE_RATE sample_rate;
      }VadInit;
      
    • Member

      Member name Description
      point_number Number of sampling points the VAD algorithm processes per call
      channel Number of channels
      sample_rate Sampling rate; 8000, 16000, and 48000 Hz are currently supported
    • Note

      N/A.

    • Related Data Type and Interface

      IaaVad_Init

      IaaVad_InitReadFromJson
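
The point_number and sample_rate fields together determine how much audio each call to IaaVad_Run covers. The sketch below works through that arithmetic; frame_duration_ms and frame_end_time_s are hypothetical helper names. The second function is the same calculation the examples in Section 1.6 use for "current time".

```c
/* Duration of one processing frame in milliseconds. */
double frame_duration_ms(unsigned int point_number, unsigned int sample_rate)
{
    return 1000.0 * (double)point_number / (double)sample_rate;
}

/* Time in seconds at the end of the given number of processed frames. */
double frame_end_time_s(unsigned int frame_count, unsigned int point_number,
                        unsigned int sample_rate)
{
    return (double)frame_count * (double)point_number / (double)sample_rate;
}
```

For the values used in the first example (point_number = 128, sample_rate = 8000), each frame covers 128 / 8000 s = 16 ms, so 1000 frames correspond to 16 s of audio.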

    3.4. VadConfig

    • Description

      Define the configuration parameter structure type of the VAD algorithm.

    • Definition

      typedef struct
      {
          unsigned int vote_frame;
          VadSensitivity sensitivity;
      }VadConfig;
      
    • Member

      Member name Description
      vote_frame Used only in mode 0: the number of frames required to produce the final estimated result, which is also the number of frames by which the result is delayed. Other modes are not affected by this parameter.
      sensitivity Sensitivity setting (see VadSensitivity)
    • Related Data Type and Interface

      IaaVad_Config

      IaaVad_ConfigReadFromJson
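
This guide does not describe how mode 0 uses vote_frame internally. Purely to illustrate why a frame-voting scheme delays the result, the sketch below implements a generic sliding-window majority vote over raw per-frame decisions. It is an assumption-laden illustration and not the library's algorithm; VoteSmoother, vote_init, vote_update, and VOTE_MAX_FRAMES are hypothetical names.

```c
#include <string.h>

#define VOTE_MAX_FRAMES 256 /* hypothetical upper bound for this sketch */

typedef struct {
    int history[VOTE_MAX_FRAMES]; /* ring buffer of recent raw decisions */
    unsigned int window;          /* number of frames that take part in the vote */
    unsigned int count;           /* frames stored so far (at most window) */
    unsigned int next;            /* ring buffer write position */
    unsigned int speech_votes;    /* number of 1s currently in the window */
} VoteSmoother;

/* Returns 0 on success, -1 if the window size is out of range. */
int vote_init(VoteSmoother *v, unsigned int window)
{
    if (window == 0 || window > VOTE_MAX_FRAMES)
        return -1;
    memset(v, 0, sizeof(*v));
    v->window = window;
    return 0;
}

/* Adds one raw decision and returns 1 if more than half of the frames
 * currently in the window were classified as speech, 0 otherwise. */
int vote_update(VoteSmoother *v, int raw)
{
    raw = raw ? 1 : 0;
    if (v->count == v->window)
        v->speech_votes -= (unsigned int)v->history[v->next]; /* drop oldest */
    else
        v->count++;
    v->history[v->next] = raw;
    v->speech_votes += (unsigned int)raw;
    v->next = (v->next + 1) % v->window;
    return 2 * v->speech_votes > v->count;
}
```

In a scheme like this, a larger window suppresses isolated misdetections but reacts more slowly to the start and end of speech, which is the general trade-off behind a "number of delayed frames" parameter.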

    3.5. VAD_HANDLE

    3.6. VadOption

    • Description

      Define the option parameter structure type of the VAD algorithm.

    • Definition

      typedef struct{
          int mode;
      }VadOption;
      
    • Member

      Member name Description
      mode VAD algorithm mode; pass this value to IaaVad_SetMode
    • Note

      N/A

    • Related Data Type and Interface

      IaaVad_OptionReadFromJson

    4. Error Code

    The VAD API error codes are as follows:

    Error code   Definition                          Description
    0x00000000   ALGO_VAD_RET_SUCCESS                VAD ran successfully
    0x10000901   ALGO_VAD_RET_INIT_ERROR             VAD is not initialized
    0x10000902   ALGO_VAD_RET_INVALID_HANDLE         VAD handle is invalid
    0x10000903   ALGO_VAD_RET_INVALID_SAMPLE_RATE    Sampling rate is not supported
    0x10000904   ALGO_VAD_RET_INVALID_POINT_NUMBER   Number of points per frame is not supported
    0x10000905   ALGO_VAD_RET_INVALID_CHANNEL        Channel number is not supported
    0x10000906   ALGO_VAD_RET_INVALID_CONFIG         Config setting is incorrect
    0x10000907   ALGO_VAD_RET_INVALID_MODE           VAD mode parameter setting is invalid
    0x10000908   ALGO_VAD_RET_API_CONFLICT           Other APIs are running
    0x10000909   ALGO_VAD_RET_INVALID_CALLING        APIs are called in an incorrect order
    0x10000910   ALGO_VAD_RET_INVALID_JSONFILE       VAD failed to read the JSON file
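
When logging failures, it can help to print the symbolic name instead of the raw number. The sketch below maps the numeric values from the table above to their names. vad_error_to_string is a hypothetical helper, not part of the VAD library, and the numeric literals are copied from this table; in a real build, the ALGO_VAD_RET constants from AudioVadProcess.h would normally be used instead.

```c
/* Hypothetical helper: returns the symbolic name for a VAD return code.
 * The numeric values are copied from the error code table in this guide. */
const char *vad_error_to_string(unsigned int code)
{
    switch (code) {
    case 0x00000000u: return "ALGO_VAD_RET_SUCCESS";
    case 0x10000901u: return "ALGO_VAD_RET_INIT_ERROR";
    case 0x10000902u: return "ALGO_VAD_RET_INVALID_HANDLE";
    case 0x10000903u: return "ALGO_VAD_RET_INVALID_SAMPLE_RATE";
    case 0x10000904u: return "ALGO_VAD_RET_INVALID_POINT_NUMBER";
    case 0x10000905u: return "ALGO_VAD_RET_INVALID_CHANNEL";
    case 0x10000906u: return "ALGO_VAD_RET_INVALID_CONFIG";
    case 0x10000907u: return "ALGO_VAD_RET_INVALID_MODE";
    case 0x10000908u: return "ALGO_VAD_RET_API_CONFLICT";
    case 0x10000909u: return "ALGO_VAD_RET_INVALID_CALLING";
    case 0x10000910u: return "ALGO_VAD_RET_INVALID_JSONFILE";
    default:          return "unknown VAD return code";
    }
}
```

A typical use would be to pass the return value of IaaVad_Config, IaaVad_SetMode, or IaaVad_Run to this function (cast to unsigned int) and include the resulting string in the error message.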