Auto Speech Recognition (ASR) Preprocessor

ASR Preprocessor Suite, speech recognition enhancement application software

Features List

  • High Definition Acoustic Echo Cancellation (HD AEC) 

  • Voice Activity Detection (VAD): detects the presence of speech

  • Robust Background Noise removal 

  • Automatic Gain Control (AGC) 

  • Auto Level Controls 

  • Packet Loss Concealment (PLC): Required in VoIP Environment
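
To make the Voice Activity Detection feature listed above concrete, here is a minimal sketch of a generic energy-threshold detector. It is not Adaptive Digital's algorithm. The function names, the 160-sample frame length, and the -40 dB threshold are all assumptions chosen for illustration.

```python
import math

def frame_energy_db(frame):
    """Mean-square energy of one frame of float samples, in dB (floored at -120 dB)."""
    if not frame:
        return -120.0
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, 1e-12))

def simple_vad(samples, frame_len=160, threshold_db=-40.0):
    """Return one True/False speech decision per full frame (hypothetical helper)."""
    decisions = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        decisions.append(frame_energy_db(frame) > threshold_db)
    return decisions

# Usage: one near-silent frame followed by one frame of a 440 Hz tone sampled at 16 kHz
quiet = [0.0001] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
print(simple_vad(quiet + tone))  # [False, True]
```

A fixed threshold like this is only workable in quiet conditions. Practical detectors typically track the noise floor and adapt the threshold over time.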

BENEFITS

» The ASR software includes Adaptive Digital’s proprietary High Definition Acoustic Echo Cancellation (HD AEC). HD AEC has configuration settings that tailor it for use with ASR.

» Superior noise reduction designed specifically for use with ASR: it removes noise while leaving the speech of interest intact.

» Supports narrowband (8 kHz), wideband (16 kHz), super-wideband (32 kHz), and full-band (44.1 and 48 kHz) sampling rates.

ASR PREPROCESSING SPECTROGRAM COMPARISON

Original speech

Speech with noise

After ASR Preprocessing

Specifications

NOTE: We specify MIPS (Millions of Instructions Per Second) as MCPS (Millions of Instruction Cycles Per Second). Unless otherwise specified, peak MIPS are indicated.

Talker Distance (inches) | Far End Level | Reverb Time | Noise Level | Recognition Rate Without Enhancement | Recognition Rate With Enhancement
2  | -10 | 256 | -96 | 0.5  | 1
12 | -10 | 256 | -96 | 0.05 | 1
24 | -10 | 256 | -96 | 0    | 0.96
36 | -10 | 256 | -96 | 0    | 0.94
42 | -10 | 256 | -96 | 0    | 0.81
48 | -10 | 256 | -96 | 0    | 0.63
2  | -10 | 256 | -25 | 0.63 | 1
12 | -10 | 256 | -25 | 0    | 1
24 | -10 | 256 | -25 | 0    | 0.98
36 | -10 | 256 | -25 | 0    | 0.83
42 | -10 | 256 | -25 | 0    | 0.76
48 | -10 | 256 | -25 | 0    | 43

VQE Algorithms – HD AEC, Noise Reduction, AGC
* ASR Preprocessor performance in a situation where there is full-duplex voice, background noise, and reverberation.

Talker Distance (inches) | Reverb Time (msec) | Recognition Rate Without Enhancement | Recognition Rate With Enhancement
2  | 400 | 1    | 1
12 | 400 | 1    | 1
24 | 400 | 1    | 1
36 | 400 | 0.96 | 1
42 | 400 | 0.94 | 1
48 | 400 | 0.47 | 0.76

VQE Algorithms – Noise Reduction, AGC

Signal To Noise | Recognition Rate Without Enhancement | Recognition Rate With Enhancement
18 | 100 | 100
16 | 100 | 100
14 | 100 | 100
12 | *54 | 100
10 | *72 | 100
8  | 61  | 85

VQE Algorithms – Noise Reduction, AGC
* Recognizer gave up part way through without ASR preprocessor enabled.

Description

Speech recognition systems are highly accurate in quiet conditions but work poorly in noise, and they are particularly challenged as the distance between the talker and the microphone increases. The accuracy of a speech recognition system may be acceptable when you talk into an ASR-enabled unit in a quiet office, yet the same unit’s performance may be unacceptable in a shopping mall.

What is needed is a more effective method of enhancing the speech of interest to improve recognition accuracy, especially when the ASR unit is positioned far from the talker in noisy environments that include background conversations and multi-person chatter.

 

Adaptive Digital’s ASR Preprocessor Software addresses such degradation issues. The field-hardened HD AEC is exceptional in handling acoustic echo. The noise reduction algorithm has been developed to achieve up to 12 dB of signal-to-noise ratio improvement with little to no degradation of the desired speech signal. The included Automatic Level Control adjusts speech levels to increase the dynamic range of the ASR, especially at the low signal level end.
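
To put the 12 dB figure in perspective, the short sketch below converts a decibel improvement into linear ratios using the standard decibel definitions. The function names and the example power values are illustrative assumptions, not part of the product.

```python
import math

def db_to_power_ratio(db):
    """Convert decibels to a linear power ratio."""
    return 10.0 ** (db / 10.0)

def db_to_amplitude_ratio(db):
    """Convert decibels to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from linear power values."""
    return 10.0 * math.log10(signal_power / noise_power)

improvement_db = 12.0
print(round(db_to_power_ratio(improvement_db), 2))      # 15.85
print(round(db_to_amplitude_ratio(improvement_db), 2))  # 3.98

# Example: a 10 dB input SNR becomes 22 dB if noise power drops by 12 dB
# while the speech power is left unchanged (assumed example values).
signal_power, noise_power = 1.0, 0.1
reduced_noise = noise_power / db_to_power_ratio(improvement_db)
print(round(snr_db(signal_power, reduced_noise), 1))    # 22.0
```

In other words, a 12 dB improvement corresponds to cutting the noise power by a factor of roughly 16, or the noise amplitude by a factor of roughly 4, relative to the speech.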

Optionally, in VoIP environments Packet Loss Concealment may be required to handle packet drop-outs that would otherwise be perceived as missing data.
The key to integrating a superior solution in an ASR environment is to put various enhancement algorithms together in such a way that maximizes speech quality. Adaptive Digital provides Preprocessor Speech Enhancement algorithms to significantly improve the robustness and accuracy of a speech recognition system.
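
As an illustration of the packet loss concealment mentioned above, the sketch below shows one of the simplest generic approaches: replaying the previous frame at reduced level when a packet is missing. It is not Adaptive Digital's PLC. The function name, the use of `None` to mark a lost frame, and the 0.5 decay factor are assumptions made for this example.

```python
def conceal_losses(frames, frame_len, decay=0.5):
    """Replace lost frames (None) with a decayed copy of the last output frame."""
    output = []
    last = [0.0] * frame_len  # silence until the first good frame arrives
    for frame in frames:
        if frame is None:
            # Lost packet: replay the previous output, attenuated so long gaps fade out
            frame = [s * decay for s in last]
        output.append(frame)
        last = frame
    return output

# Usage: two consecutive lost frames between two received frames
received = [[1.0, 1.0], None, None, [0.2, 0.2]]
print(conceal_losses(received, frame_len=2))
# [[1.0, 1.0], [0.5, 0.5], [0.25, 0.25], [0.2, 0.2]]
```

Frame repetition keeps the signal continuous for the recognizer instead of presenting a hard gap. Production concealers generally use pitch-aware waveform extrapolation rather than plain repetition.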

 

Adaptive Digital’s engineering team has over thirty years of experience in practical and theoretical aspects of voice quality software for embedded processors. We support customers from concept through deployment, helping to ensure that our products’ voice quality is carried into the end user experience.
