Auto Speech Recognition (ASR) Preprocessor
ASR Preprocessor Suite, speech recognition enhancement application software
Auto speech recognition front-end preprocessing
High Definition Acoustic Echo Cancellation (HD AEC)
Voice Activity Detection (VAD): Detects the presence of speech
Robust Background Noise removal
Automatic Gain Control (AGC)
Auto Level Controls
Packet Loss Concealment (PLC): Required in VoIP Environment
» The ASR software includes Adaptive Digital’s proprietary High Definition Acoustic Echo Cancellation (HD AEC). HD AEC has configuration settings that tailor it for use with ASR.
» Superior noise reduction designed specifically for use with ASR: it removes noise while leaving the speech signal of interest intact.
» Supports narrowband (8 kHz), wideband (16 kHz), super-wideband (32 kHz), and full-band (44.1 kHz, 48 kHz) sampling rates.
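To illustrate what voice activity detection does in a front end like this, here is a minimal energy-threshold sketch in Python. It is a generic textbook approach and not Adaptive Digital's VAD. The function names, the 20 ms frame length, and the -40 dB threshold are assumptions chosen for illustration; the default sample rate is the wideband rate (16 kHz) listed above.

```python
import math


def frame_energy_db(frame):
    """Mean-square energy of one frame in dB re full scale (samples in [-1, 1])."""
    if not frame:
        return -120.0
    mean_square = sum(s * s for s in frame) / len(frame)
    # Floor the energy so that pure digital silence does not hit log10(0).
    return 10.0 * math.log10(max(mean_square, 1e-12))


def simple_vad(samples, sample_rate=16000, frame_ms=20, threshold_db=-40.0):
    """Return one True/False speech decision per complete frame.

    Hypothetical helper: a frame counts as speech when its energy exceeds
    a fixed threshold. Real detectors usually adapt the threshold to the
    measured noise floor.
    """
    frame_len = sample_rate * frame_ms // 1000
    decisions = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        decisions.append(frame_energy_db(frame) > threshold_db)
    return decisions
```

A fixed threshold like this is easily fooled by loud background noise, which is one reason a VAD is normally paired with noise reduction in an ASR front end.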
ASR PREPROCESSING SPECTROGRAM COMPARISON
Spectrogram panels: speech with noise (before preprocessing); after ASR preprocessing.
NOTE: We specify MIPS (Millions of Instructions Per Second) as MCPS (Millions of Instruction Cycles Per Second). Unless otherwise specified, peak MIPS are indicated.
|Talker Distance (inches)||Far End Level||Reverb Time||Noise level||Recognition Rate Without Enhancement||Recognition Rate With Enhancement|
* ASR Preprocessor performance in a situation where there is full-duplex voice, background noise, and reverberation.
|Talker Distance (inches)||Reverb Time (msec)||Recognition Rate Without Enhancement||Recognition Rate With Enhancement|
|2||400||1||1|
|12||400||1||1|
|24||400||1||1|
|36||400||0.96||1|
|42||400||0.94||1|
|48||400||0.47||0.76|
VQE Algorithms – Noise Reduction, AGC
|Signal To Noise||Recognition Rate Without Enhancement||Recognition Rate With Enhancement|
|18||100||100|
|16||100||100|
|14||100||100|
|12*||54||100|
|10*||72||100|
|8||61||85|
* Without the ASR preprocessor enabled, the recognizer gave up partway through.
Speech recognition systems are highly accurate in quiet conditions but perform poorly in noise, and they are particularly challenged as the distance between the talker and the microphone increases. An ASR-enabled unit may be accurate enough when you talk into it in a quiet office, yet the same unit’s performance may be unacceptable in a shopping mall.
What is needed is a more effective method of enhancing the speech of interest to improve recognition accuracy, especially when the ASR unit is positioned far from the talker in a noisy environment that includes background conversations and multi-person chatter.
Adaptive Digital’s ASR Preprocessor Software addresses these degradation issues. The field-hardened HD AEC is exceptional at handling acoustic echo. The noise reduction algorithm was developed to achieve up to 12 dB of signal-to-noise ratio improvement with little to no degradation of the desired speech signal. The included Automatic Level Control adjusts speech levels to increase the dynamic range of the ASR, especially at the low-signal-level end.
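To put the 12 dB figure in perspective, the short Python sketch below converts a dB improvement into a power ratio and applies it to an example signal. The function names and the example power values are assumptions for illustration only; they are not measurements of the product.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from linear power values."""
    return 10.0 * math.log10(signal_power / noise_power)


def db_to_power_ratio(db):
    """Convert a dB value to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def reduce_noise_power(noise_power, improvement_db):
    """Noise power after an SNR improvement, assuming speech power is unchanged."""
    return noise_power / db_to_power_ratio(improvement_db)


# Hypothetical example: speech power 1.0, noise power 0.25 (about 6 dB SNR).
before = snr_db(1.0, 0.25)
after = snr_db(1.0, reduce_noise_power(0.25, 12.0))
print(f"SNR before: {before:.2f} dB, after: {after:.2f} dB")
```

With the speech level held constant, a 12 dB improvement corresponds to reducing the noise power by a factor of about 15.8.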
Optionally, in VoIP environments, Packet Loss Concealment may be required to handle packet drop-outs that would otherwise be perceived as missing data.
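As a rough illustration of the idea, the Python sketch below fills each lost frame by repeating the last good frame at a decaying gain. This repeat-and-fade scheme is one of the simplest concealment strategies and is not a description of Adaptive Digital's PLC. The function name, the use of `None` to mark a lost packet, and the decay factor are all assumptions.

```python
def conceal_lost_frames(frames, frame_len, decay=0.5):
    """Replace lost frames (marked as None) with a faded copy of the last good frame.

    frames:    list of frames; each frame is a list of float samples, or None if lost
    frame_len: number of samples per frame
    decay:     gain multiplier applied for each consecutive lost frame
    """
    output = []
    last_good = [0.0] * frame_len  # silence until the first good frame arrives
    gain = 1.0
    for frame in frames:
        if frame is None:
            # Lost packet: repeat the last good frame, fading further each time.
            gain *= decay
            output.append([s * gain for s in last_good])
        else:
            # Good packet: pass it through and reset the fade.
            last_good = list(frame)
            gain = 1.0
            output.append(list(frame))
    return output
```

Fading the repeated frame keeps a long burst of losses from turning into a buzzing artifact, so the recognizer sees a short, smooth gap instead of an abrupt hole in the audio.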
The key to integrating a superior solution in an ASR environment is to combine the various enhancement algorithms in a way that maximizes speech quality. Adaptive Digital provides Preprocessor Speech Enhancement algorithms to significantly improve the robustness and accuracy of a speech recognition system.
Adaptive Digital’s engineering team has over thirty years of experience in practical and theoretical aspects of voice quality software for embedded processors. We support customers from concept through deployment, helping to ensure that our products’ voice quality is carried into the end user experience.