SSTAR SE ALGORITHM USER GUIDE
REVISION HISTORY
| Revision No. | Description | Date |
|---|---|---|
| 1.1.0 | | 05/01/2021 |
| 1.1.1 | | 06/25/2021 |
| 1.1.2 | | 06/30/2021 |
| 1.1.3 | | 07/01/2021 |
| 1.1.4 | | 07/12/2021 |
| 1.1.5 | | 08/12/2021 |
| 1.1.6 | | 10/18/2021 |
| 2.0.0 | | 11/29/2022 |
| 2.0.1 | | 04/11/2022 |
| 2.1.0 | | 12/14/2023 |
1. OVERVIEW
1.1. Algorithm Introduction
The Speech Enhancement (SE) algorithm uses AI-based processing to enhance input speech. It suppresses both stationary and non-stationary noise.
1.2. Algorithm Specification
The algorithm operates at a sampling rate of 8 kHz or 16 kHz. Each frame contains 128 sampling points, which corresponds to 8 ms at 16 kHz.
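The relation between frame length and frame duration can be checked with a few lines of C. This is only an illustrative sketch: the names SE_FRAME_SAMPLES and se_frame_duration_ms are hypothetical and are not part of the SE API.

```c
/* Frame length taken from the specification above: 128 sampling points. */
#define SE_FRAME_SAMPLES 128

/* Duration of one frame in milliseconds at the given sampling rate in Hz.
 * Hypothetical helper for illustration, not part of the SE library. */
static double se_frame_duration_ms(int sample_rate_hz)
{
    return 1000.0 * SE_FRAME_SAMPLES / sample_rate_hz;
}
```

At 16 kHz this gives the 8 ms quoted above; at 8 kHz the same 128 points would span 16 ms.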
2. API INTRODUCTION
2.1. Function Module API
| API Name | Function |
|---|---|
| IaaSe_GetBufferSize | Get the memory size required to run SE algorithm. |
| IaaSe_Init | Initialize SE algorithm. |
| IaaSe_SetConfig | Set SE algorithm parameters. |
| IaaSe_GetConfig | Print out SE algorithm parameters. |
| IaaSe_GetInputSamples | Get the number of sampling points required as input by SE algorithm. |
| IaaSe_Run | Run SE algorithm. |
| IaaSe_Free | Release the resources of SE algorithm. |
| IaaSe_setCallbackFunc | Set the callback functions for SE algorithm verification (currently not supported). |
2.2. IaaSe_GetBufferSize
- Function
  Get the memory size required to run SE algorithm.
- Syntax
  int IaaSe_GetBufferSize(void);
- Parameter
  N/A.
- Return Value
  The return value is the memory size required to run SE algorithm.
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
- Note
  This interface only returns the memory size required to run SE algorithm; the caller must allocate and release the memory using other APIs.
- Example
  N/A.
2.3. IaaSe_Init
- Function
  Initialize the memory required to run SE algorithm.
- Syntax
  SE_HANDLE IaaSe_Init(char* workBufAddress, AudioSeInit_t *seInit);
- Parameter
  | Parameter Name | Description | Input/Output |
  |---|---|---|
  | workBufAddress | The memory address used by SE algorithm. | Input |
  | seInit | Pointer to the initialization structure of SE algorithm. | Input |
- Return Value
  | Return Value | Result |
  |---|---|
  | handle | Successful. |
  | NULL | Failed. |
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
- Note
  - The SE algorithm only supports 16 kHz sampling rate and 16-bit sampling width.
- Example
  N/A.
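As no example is given for initialization, the following is a minimal sketch of how the work buffer might be sized, allocated, passed to IaaSe_Init, and later released after IaaSe_Free. It assumes that the value returned by IaaSe_GetBufferSize is a size in bytes and that malloc/free are suitable allocators on the target. SeContext, se_open, and se_close are hypothetical helper names. The three IaaSe_* function bodies are stand-in stubs, not the vendor implementation; they exist only so the sketch builds without the library, and a real build would include AudioSeProcess.h and link libSE_LINUX instead.

```c
#include <stdlib.h>

/* Types as defined in this guide (normally provided by AudioSeProcess.h). */
typedef void *SE_HANDLE;
typedef struct {
    int sampleRate;
    int bitWidth;
    int channel;
} AudioSeInit_t;

/* Stand-in stubs, NOT the vendor implementation. They only let this sketch
 * build on its own; remove them when linking against libSE_LINUX. */
int IaaSe_GetBufferSize(void) { return 1024; /* arbitrary placeholder */ }
SE_HANDLE IaaSe_Init(char *workBufAddress, AudioSeInit_t *seInit)
{
    return (workBufAddress && seInit) ? (SE_HANDLE)workBufAddress : NULL;
}
int IaaSe_Free(SE_HANDLE handle) { return handle ? 0 : -1; }

/* Hypothetical helper type: keeps the caller-owned buffer with its handle. */
typedef struct {
    char *workBuf;
    SE_HANDLE handle;
} SeContext;

/* Allocate the work buffer and initialize SE. Returns 0 on success. */
static int se_open(SeContext *ctx, int channels)
{
    AudioSeInit_t init = { 16000, 16, channels }; /* 16 kHz, 16-bit */
    int size = IaaSe_GetBufferSize();             /* assumed to be bytes */

    ctx->workBuf = NULL;
    ctx->handle = NULL;
    if (size <= 0)
        return -1;
    ctx->workBuf = malloc((size_t)size);
    if (ctx->workBuf == NULL)
        return -1;
    ctx->handle = IaaSe_Init(ctx->workBuf, &init);
    if (ctx->handle == NULL) {                    /* NULL means failure */
        free(ctx->workBuf);
        ctx->workBuf = NULL;
        return -1;
    }
    return 0;
}

/* Release SE first, then the memory it was using. */
static void se_close(SeContext *ctx)
{
    if (ctx->handle != NULL) {
        IaaSe_Free(ctx->handle);
        ctx->handle = NULL;
    }
    free(ctx->workBuf);
    ctx->workBuf = NULL;
}
```

Keeping the buffer and the handle in one structure makes it harder to free the memory while the handle is still in use.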
2.4. IaaSe_SetConfig
- Function
  Set parameters of SE algorithm.
- Syntax
  int IaaSe_SetConfig(SE_HANDLE handle, AudioSeConfig_t seConfig);
- Parameter
  | Parameter Name | Description | Input/Output |
  |---|---|---|
  | handle | The handle of SE algorithm. | Input |
  | seConfig | Pointer to the parameter configuration structure of SE algorithm. | Input |
- Return Value
  | Return Value | Result |
  |---|---|
  | 0 | Successful. |
  | Non-Zero | Failed. Please refer to Error Codes. |
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
2.5. IaaSe_GetConfig
- Function
  Print out the current parameters of SE algorithm.
- Syntax
  int IaaSe_GetConfig(SE_HANDLE handle);
- Parameter
  | Parameter Name | Description | Input/Output |
  |---|---|---|
  | handle | The handle of SE algorithm. | Input |
- Return Value
  | Return Value | Result |
  |---|---|
  | 0 | Successful. |
  | Non-Zero | Failed. Please refer to Error Codes. |
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
2.6. IaaSe_GetInputSamples
- Function
  Get the number of sampling points required as input by SE algorithm.
- Syntax
  int IaaSe_GetInputSamples(SE_HANDLE handle, int *samples);
- Parameter
  | Parameter Name | Description | Input/Output |
  |---|---|---|
  | handle | The handle of SE algorithm. | Input |
  | samples | The number of sampling points required as input by SE algorithm. | Output |
- Return Value
  | Return Value | Result |
  |---|---|
  | 0 | Successful. |
  | Non-Zero | Failed. Please refer to Error Codes. |
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
- Note
  - Calling this function is optional. The number of sampling points it reports is 128 for single channel and 256 for dual channel. With a sampling bitwidth of 16 bits, the single-channel input array occupies 128 * 16 bits (256 bytes), and the dual-channel input array occupies 256 * 16 bits (512 bytes).
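The buffer sizes in the note follow directly from the sample count and the 16-bit sample width. The sketch below shows the arithmetic; se_input_bytes is a hypothetical helper name, not an SE API.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one input array, given the sample count reported by
 * IaaSe_GetInputSamples and 16-bit samples. Hypothetical helper. */
static size_t se_input_bytes(int samples_per_call)
{
    return (size_t)samples_per_call * sizeof(int16_t);
}
```

For 128 samples this yields 256 bytes, and for 256 samples it yields 512 bytes.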
2.7. IaaSe_Run
- Function
  Run SE algorithm.
- Syntax
  int IaaSe_Run(SE_HANDLE handle, short *input);
- Parameter
  | Parameter Name | Description | Input/Output |
  |---|---|---|
  | handle | The handle of SE algorithm. | Input |
  | input | The pointer to input data. | Input/Output |
- Return Value
  | Return Value | Result |
  |---|---|
  | 0 | Successful. |
  | Non-Zero | Failed. Please refer to Error Codes. |
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
- Example
  Please refer to Demo code.
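The demo code is not reproduced in this guide, so the following is only a sketch of how a longer PCM buffer might be fed to the algorithm one frame at a time. It assumes that each call consumes exactly the number of samples reported by IaaSe_GetInputSamples and that the data is processed in place, which the Input/Output marking of the input parameter suggests. The helper se_process_buffer takes the run function as a pointer whose signature matches IaaSe_Run (SE_HANDLE is a typedef of void*), so IaaSe_Run could be passed directly. All names here are hypothetical, and demo_run_halve is a stand-in used only to exercise the loop without the vendor library.

```c
#include <stddef.h>

/* Same signature as IaaSe_Run, since SE_HANDLE is a typedef of void*. */
typedef int (*se_run_fn)(void *handle, short *frame);

/* Process pcm in place, frame by frame. Stops at the first non-zero return
 * code and returns it; returns 0 if every full frame succeeded. Trailing
 * samples that do not fill a whole frame are left untouched. The number of
 * samples actually processed is stored in *processed. */
static int se_process_buffer(se_run_fn run, void *handle, short *pcm,
                             size_t total_samples, size_t frame_samples,
                             size_t *processed)
{
    size_t pos = 0;
    int ret = 0;

    *processed = 0;
    if (frame_samples == 0)
        return -1;                 /* guard against an endless loop */
    while (total_samples - pos >= frame_samples) {
        ret = run(handle, pcm + pos);
        if (ret != 0)
            break;
        pos += frame_samples;
    }
    *processed = pos;
    return ret;
}

/* Stand-in for IaaSe_Run, NOT the real algorithm: halves 128 samples. */
static int demo_run_halve(void *handle, short *frame)
{
    (void)handle;
    for (int i = 0; i < 128; ++i)
        frame[i] /= 2;
    return 0;
}
```

With the real library, a call might look like se_process_buffer(IaaSe_Run, handle, pcm, total, samples, &done), where samples is the value obtained from IaaSe_GetInputSamples.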
2.8. IaaSe_Free
- Function
  Release the resource from SE algorithm.
- Syntax
  int IaaSe_Free(SE_HANDLE handle);
- Parameter
  | Parameter Name | Description | Input/Output |
  |---|---|---|
  | handle | The handle of SE algorithm. | Input |
- Return Value
  | Return Value | Result |
  |---|---|
  | 0 | Successful. |
  | Non-Zero | Failed. Please refer to Error Codes. |
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
- Note
  - Please call IaaSe_Free before releasing the memory used by SE algorithm.
2.9. IaaSe_setCallbackFunc
- Function
  The callback function of SE algorithm verification (currently not supported).
- Syntax
  int IaaSe_setCallbackFunc(int(*log)(const char *szFmt, ...),
                            int(*envSet)(char *key, char *par),
                            int(*envGetString)(char *var, char *buf, unsigned int size),
                            int(*envSave)(void),
                            int(*readUuid)(unsigned long long *u64Uuid));
- Dependency
  - Header File: AudioSeProcess.h
  - Library File: libSE_LINUX.so / libSE_LINUX.a
3. SE DATA TYPE
3.1. Definition of Data Type Related to SE Module
| Data Type | Definition |
|---|---|
| AudioSeInit_t | The structure type of SE algorithm initialization data. |
| AudioSeNoiseType_e | The enum type of SE algorithm noise selection. |
| AudioSeConfig_t | The structure type of SE algorithm parameter configuration. |
| SE_HANDLE | The handle type of SE algorithm. |
3.2. AudioSeInit_t
- Description
  Define the initialization parameter structure of SE algorithm.
- Definition
  typedef struct {
      int sampleRate;
      int bitWidth;
      int channel;
  } AudioSeInit_t;
- Members
  | Member Name | Description |
  |---|---|
  | sampleRate | The sampling rate of speech. |
  | bitWidth | The bitwidth of speech sampling. |
  | channel | Number of speech channels. Value range: [1, 2]. "1" means single channel, while "2" means dual channel. |
- Note
  - The sampling bitwidth only supports 16 bits, and the sampling rate only supports 16 kHz.
  - Dual-channel data should be stored interleaved by left and right channel, in the format L, R, L, R, L, R, and so on.
- Related Data Type and Interface
  IaaSe_Init
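The interleaved layout required for dual-channel input (L, R, L, R, ...) can be produced from two separate channel buffers as sketched below. The function name se_interleave is hypothetical and not part of the SE API; it assumes the left and right channels are available as separate arrays of equal length.

```c
#include <stddef.h>

/* Interleave two mono buffers into one L, R, L, R, ... buffer.
 * out must have room for 2 * frames samples. Hypothetical helper. */
static void se_interleave(const short *left, const short *right,
                          short *out, size_t frames)
{
    for (size_t i = 0; i < frames; ++i) {
        out[2 * i]     = left[i];   /* even index: left channel  */
        out[2 * i + 1] = right[i];  /* odd index:  right channel */
    }
}
```

For a dual-channel frame of 256 samples, frames would be 128 so that the output holds 128 left and 128 right samples.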
3.3. AudioSeNoiseType_e
- Description
  Define the type of noise to be suppressed by SE algorithm.
- Definition
  typedef enum {
      IAA_SE_OFFICE_NOISE = 0,
      IAA_SE_TRAFFIC_NOISE,
      IAA_SE_ALL_NOISE,
  } AudioSeNoiseType_e;
- Members
  | Member Name | Description |
  |---|---|
  | IAA_SE_OFFICE_NOISE | Indoor setting. |
  | IAA_SE_TRAFFIC_NOISE | Outdoor setting. |
  | IAA_SE_ALL_NOISE | General setting. |
- Note
  - Only IAA_SE_ALL_NOISE is supported currently.
- Related Data Type and Interface
  AudioSeConfig_t
3.4. AudioSeConfig_t
- Description
  Define the parameter structure of SE algorithm.
- Definition
  typedef struct {
      AudioSeNoiseType_e noiseType;
      int intensity;
      int normalize;
      int smooth;
      int normalizeMode;
      int normalizeVadThreshold;
      int normalizePosition;
  } AudioSeConfig_t;
- Members
  | Member Name | Description |
  |---|---|
  | noiseType | The noise type to be suppressed. |
  | intensity | The intensity of noise suppression. Value range: [1, 10], with a step of 1. A value of 1 gives the weakest suppression, and a value of 10 gives the strongest suppression. The recommended value is 5. |
  | normalize | Normalization parameter, which should be enabled when the sound volume is low. The recommended value is 20000. Set it to 0 when not in use. |
  | smooth | Smoothing factor used when normalize is enabled. Value range: [1, 10]. The larger the value, the smoother the result. The recommended value is 5. |
  | normalizeMode | Normalization mode. 0: fixed gain normalization; 1: adaptive normalization. |
  | normalizeVadThreshold | Normalization VAD threshold. Value range: [-80, 0]. |
  | normalizePosition | Normalization position. 0: normalize before speech enhancement; 1: normalize after speech enhancement. |
- Related Data Type and Interface
  IaaSe_SetConfig, AudioSeNoiseType_e
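The value ranges above can be checked on the caller side before the configuration is handed to the library. The sketch below fills the structure with the recommended values from the member table and verifies the documented ranges. The helper names se_recommended_config and se_config_in_range are hypothetical. The values chosen for normalizeMode, normalizeVadThreshold, and normalizePosition are assumptions, since the guide gives only their allowed ranges, and the check is stricter than the text in one respect: it validates smooth even when normalize is 0. The library's own validation may differ.

```c
/* Types as defined in this guide (normally provided by AudioSeProcess.h). */
typedef enum {
    IAA_SE_OFFICE_NOISE = 0,
    IAA_SE_TRAFFIC_NOISE,
    IAA_SE_ALL_NOISE,
} AudioSeNoiseType_e;

typedef struct {
    AudioSeNoiseType_e noiseType;
    int intensity;
    int normalize;
    int smooth;
    int normalizeMode;
    int normalizeVadThreshold;
    int normalizePosition;
} AudioSeConfig_t;

/* Hypothetical helper: a starting configuration built from the
 * recommended values in the member table. */
static AudioSeConfig_t se_recommended_config(void)
{
    AudioSeConfig_t cfg;
    cfg.noiseType = IAA_SE_ALL_NOISE;   /* only type currently supported */
    cfg.intensity = 5;                  /* recommended */
    cfg.normalize = 20000;              /* recommended; 0 disables it */
    cfg.smooth = 5;                     /* recommended */
    cfg.normalizeMode = 0;              /* assumed: fixed gain */
    cfg.normalizeVadThreshold = -40;    /* assumed: middle of [-80, 0] */
    cfg.normalizePosition = 1;          /* assumed: after enhancement */
    return cfg;
}

/* Hypothetical caller-side check of the documented ranges.
 * Returns 1 if every range-limited member is within range, else 0. */
static int se_config_in_range(const AudioSeConfig_t *cfg)
{
    if (cfg->intensity < 1 || cfg->intensity > 10)
        return 0;
    if (cfg->smooth < 1 || cfg->smooth > 10)
        return 0;
    if (cfg->normalizeMode != 0 && cfg->normalizeMode != 1)
        return 0;
    if (cfg->normalizeVadThreshold < -80 || cfg->normalizeVadThreshold > 0)
        return 0;
    if (cfg->normalizePosition != 0 && cfg->normalizePosition != 1)
        return 0;
    return 1;
}
```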
3.5. SE_HANDLE
- Description
  Define the handle type of SE algorithm.
- Definition
  typedef void* SE_HANDLE;
- Members
  N/A.
- Note
  N/A.
- Related Data Type and Interface
  IaaSe_Init
4. ERROR CODES
SE API Error Codes are shown in the following table:
| Error Code | Macro Definition | Description |
|---|---|---|
| 0x00000000 | ALGO_SE_RET_SUCCESS | SE algorithm runs successfully. |
| 0x70000401 | ALGO_SE_RET_INVALID_LICENSE | Invalid license/Trial period is over. |
| 0x70000402 | ALGO_SE_RET_INVALID_HANDLE | Invalid handle. |
| 0x70000403 | ALGO_SE_RET_INVALID_SAMPLERATE | Invalid sampling rate. |
| 0x70000404 | ALGO_SE_RET_INVALID_BITWIDTH | Invalid sampling bitwidth. |
| 0x70000405 | ALGO_SE_RET_INVALID_CHANNEL | Invalid channel number. |
| 0x70000406 | ALGO_SE_RET_INVALID_INTENSTIY | Invalid noise suppression intensity. |
| 0x70000407 | ALGO_SE_RET_INVALID_NOISETYPE | Invalid noise type. |
| 0x70000408 | ALGO_SE_RET_INVALID_NORMALIZE | Invalid normalization amplitude. |
| 0x70000409 | ALGO_SE_RET_INVALID_SMOOTH | Invalid smoothing factor. |
| 0x7000040A | ALGO_SE_RET_INVALID_NORMALIZE_POS | Invalid normalization position. |
| 0x7000040B | ALGO_SE_RET_INVALID_NORMALIZE_MODE | Invalid normalization mode. |
| 0x7000040C | ALGO_SE_RET_INVALID_NORMALIZE_VADTHR | Invalid normalization VAD threshold. |
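When logging failures, it can help to translate a return code into the macro name from the table above. The following is a minimal sketch of such a lookup. The SeErrorInfo type and the se_error_name function are hypothetical helpers, not part of the SE library; the numeric values and names are copied from the table (including the spelling of ALGO_SE_RET_INVALID_INTENSTIY), and the table is kept as plain numbers so it does not clash with definitions in AudioSeProcess.h.

```c
#include <stddef.h>

/* Hypothetical lookup entry: numeric code and macro name from the table. */
typedef struct {
    unsigned int code;
    const char *name;
} SeErrorInfo;

static const SeErrorInfo se_error_table[] = {
    { 0x00000000u, "ALGO_SE_RET_SUCCESS" },
    { 0x70000401u, "ALGO_SE_RET_INVALID_LICENSE" },
    { 0x70000402u, "ALGO_SE_RET_INVALID_HANDLE" },
    { 0x70000403u, "ALGO_SE_RET_INVALID_SAMPLERATE" },
    { 0x70000404u, "ALGO_SE_RET_INVALID_BITWIDTH" },
    { 0x70000405u, "ALGO_SE_RET_INVALID_CHANNEL" },
    { 0x70000406u, "ALGO_SE_RET_INVALID_INTENSTIY" }, /* spelled as in the guide */
    { 0x70000407u, "ALGO_SE_RET_INVALID_NOISETYPE" },
    { 0x70000408u, "ALGO_SE_RET_INVALID_NORMALIZE" },
    { 0x70000409u, "ALGO_SE_RET_INVALID_SMOOTH" },
    { 0x7000040Au, "ALGO_SE_RET_INVALID_NORMALIZE_POS" },
    { 0x7000040Bu, "ALGO_SE_RET_INVALID_NORMALIZE_MODE" },
    { 0x7000040Cu, "ALGO_SE_RET_INVALID_NORMALIZE_VADTHR" },
};

/* Return the macro name for a return code, or "UNKNOWN" if the code
 * is not listed in the table. */
static const char *se_error_name(int code)
{
    size_t n = sizeof(se_error_table) / sizeof(se_error_table[0]);
    for (size_t i = 0; i < n; ++i) {
        if (se_error_table[i].code == (unsigned int)code)
            return se_error_table[i].name;
    }
    return "UNKNOWN";
}
```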