Far-field Microphone Processing: IA Voice Signal Solutions

Microphone Array | Beamforming | Clean Voice

FAR FIELD VOICE PROCESSING

Available: TI DSP C674x, C5517, Arm Cortex-A

Near Future: ADI SHARC

Let us know your requirements.

Far-field speech and voice recognition systems use multiple microphones, arranged as a linear or circular array, to reduce the impact of noise and reverberation from the surrounding environment.

Speech recognition performance degrades drastically in noisy and reverberant environments. In any home, office, or outdoor application, sound is all around us. The farther a speaker is from the microphone, the weaker the direct speech signal becomes relative to ambient noise and reverberation, and the more distorted the captured signal is. Background noises, such as a running dishwasher, a television set, children playing, or dogs barking, need to be removed from the sound stream so that the application can distinguish the keyword from other speech signals.

CLEAR SPEECH SOLUTION

Adaptive Digital Technologies provides multi-microphone capable advanced audio processing algorithms, including high definition acoustic echo cancellation (HD AEC), beamforming, adaptive spectral noise reduction, anti-howling, adaptive filtering, nonlinear processing, and double-talk detection. Together they produce high-quality, full-duplex communication.

The algorithms written by Adaptive Digital correct the problems found in microphone array and loudspeaker applications. Beamforming (BF), noise reduction (NR), and adaptive filtering algorithms focus the processing system on speech as the critical audio signal and filter out other high-frequency and low-frequency signals in the environment.

Nonlinear processing helps account for the artifacts introduced by the acoustic characteristics of the system. HD AEC removes the echo that arises from the two-way, real-time nature of the communication. Anti-howling and double-talk detection work to balance audio signals, remove jitter in the transmitted signal, and improve the overall quality of the final signal.
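To illustrate the general idea behind acoustic echo cancellation, here is a minimal sketch that uses a textbook normalized LMS (NLMS) adaptive filter. It is not Adaptive Digital's HD AEC implementation. The function name, tap count, step size, and the short echo path are all assumptions chosen for illustration, and the sketch has no double-talk detection or nonlinear processing.

```python
import random

def nlms_echo_cancel(far_end, mic, num_taps=32, step=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    Illustrative NLMS filter only; names and parameters are hypothetical.
    Returns the echo-reduced signal and the learned filter weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history.insert(0, x)
        history.pop()
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what remains after removing the echo
        norm = sum(h * h for h in history) + eps
        gain = step * error / norm
        weights = [w + gain * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights

# Synthetic test: the "room" is a short, made-up impulse response.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.6, 0.0, -0.3, 0.1]  # assumed loudspeaker-to-mic response
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]
residual, weights = nlms_echo_cancel(far_end, mic)
```

In this noise-free simulation the learned weights approach the assumed echo path and the residual shrinks toward zero. A real system also has to freeze or slow adaptation during double talk, which is where double-talk detection comes in.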

FRONT END PROCESSING

Adaptive Digital algorithms recognize the dominant voice and suppress background chatter noise.

Far-field Voice Input Processing software first detects far-field speech, then reduces the clutter so that the voice application can send a clear voice signal or distinguish a wake word from other noise sources.

For certain environments, a microphone array may be employed for voice capture. In a microphone array, a number of microphones are arranged in either a circular or a linear pattern and used to pick up speech signals via phase steering. Essentially, the microphones, while not physically pointing in any specific direction, point acoustically in one or many directions. When a voice command arrives from a particular direction, the clutter noise outside that direction is either reduced or not picked up by the microphone array.
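The phase steering described above can be sketched with a basic delay-and-sum beamformer for a uniform linear array. This is a generic textbook model, not the Adaptive Digital beamformer. The function names, the 4-microphone geometry, the 4 cm spacing, and the 343 m/s speed of sound are assumptions for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def steering_delays(num_mics, spacing_m, angle_deg):
    """Per-microphone arrival delays (seconds) for a plane wave.

    angle_deg is measured from broadside of a uniform linear array.
    """
    sin_a = math.sin(math.radians(angle_deg))
    return [m * spacing_m * sin_a / SPEED_OF_SOUND for m in range(num_mics)]

def array_gain(num_mics, spacing_m, steer_deg, arrival_deg, freq_hz):
    """Normalized response (0 to 1) of a delay-and-sum beam.

    The beam is steered to steer_deg; the tone arrives from arrival_deg.
    """
    steer = steering_delays(num_mics, spacing_m, steer_deg)
    arrive = steering_delays(num_mics, spacing_m, arrival_deg)
    total = sum(
        cmath.exp(-2j * math.pi * freq_hz * (a - s))
        for a, s in zip(arrive, steer)
    )
    return abs(total) / num_mics

# A 2 kHz tone from the steered direction adds coherently (gain of 1),
# while the same tone from another direction is attenuated.
on_target = array_gain(4, 0.04, steer_deg=30, arrival_deg=30, freq_hz=2000)
off_target = array_gain(4, 0.04, steer_deg=30, arrival_deg=-40, freq_hz=2000)
print(f"on target: {on_target:.2f}, off target: {off_target:.2f}")
```

Compensating each microphone by its steering delay lines up sound from the chosen direction, so it sums coherently while sound from other directions partially cancels.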

The number of microphones and the distance between them in the array affect the accuracy, usable frequency range, and steering direction of the beam.
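One concrete way spacing limits the usable frequency range is spatial aliasing: for a uniform linear array, the spacing should not exceed half the shortest wavelength of interest. The short sketch below applies that standard rule of thumb. The function names and the 343 m/s speed of sound are assumptions, and real array designs weigh other factors as well.

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def max_unaliased_frequency(spacing_m, speed=SPEED_OF_SOUND):
    """Highest frequency (Hz) free of spatial aliasing for a given spacing."""
    return speed / (2.0 * spacing_m)

def max_spacing_for(freq_hz, speed=SPEED_OF_SOUND):
    """Largest spacing (m) that avoids spatial aliasing up to freq_hz."""
    return speed / (2.0 * freq_hz)

# 4 cm spacing supports unambiguous steering up to about 4.3 kHz.
print(round(max_unaliased_frequency(0.04), 1))
# Covering the full 8 kHz band of 16 kHz audio needs roughly 2.1 cm spacing.
print(round(max_spacing_for(8000) * 100, 2), "cm")
```

Closer spacing raises the aliasing limit, but for a fixed number of microphones it also shortens the array, which widens the beam at low frequencies. That is the trade-off behind choosing both the microphone count and the spacing.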

TI DSP BASED FAR-FIELD SOLUTIONS

Adaptive Digital offers TMS320C5517 and TMS320C6748 Clear Speech Solutions with Acoustic Beamforming for applications that demand high processor performance.

The sound stream is cleaned up by noise reduction, which suppresses any signal that is not voice.

The speaker's distance from the microphone affects the intensity of the voice signal, and human speech itself varies widely: deep or high-pitched, soft or loud. A gain control algorithm is applied to the voice signal to bring it to a consistent level regardless of the intensity of the original voice stream.
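As a rough illustration of level adjustment, the following sketch implements a simple frame-based automatic gain control. It is not the Adaptive Digital AGC. The function name, frame length, target level, gain cap, and smoothing factor are all assumed values.

```python
import math

def agc(samples, frame_len=160, target_rms=0.1, max_gain=20.0, smoothing=0.9):
    """Scale each frame toward target_rms using a slowly changing gain.

    Illustrative only; all parameter values are hypothetical.
    """
    gain = 1.0
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms > 1e-6:  # leave the gain alone on near-silent frames
            desired = min(target_rms / rms, max_gain)
            # Smooth the gain so it does not jump from frame to frame.
            gain = smoothing * gain + (1.0 - smoothing) * desired
        out.extend(s * gain for s in frame)
    return out

def tone(amplitude, num_samples=16000, freq_hz=200.0, rate_hz=16000.0):
    """Generate a test tone standing in for a voice at a given loudness."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / rate_hz)
            for n in range(num_samples)]

quiet_out = agc(tone(0.01))  # distant, soft talker
loud_out = agc(tone(0.8))    # close, loud talker
```

After the gain settles, both the quiet and the loud input come out at about the same level. The gain cap keeps the algorithm from amplifying background noise without limit when the talker is very faint.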

The clean and enhanced speech signal can then be recognized by the application, allowing speech detection/recognition to take place.

SOFTWARE FAST FACTS:

Adaptive Digital TI-C5517

Audio Algorithms: HD AEC, NR, Beamforming, and AGC
HD AEC Echo Tail – 128 ms
Total AEC+NR+BF < 160 MIPS
Total Memory (Data + Program) fits into internal 5517 RAM
Audio – 16 kHz sample rate, 16-bit linear sample
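To put the figures above in perspective, the arithmetic below converts them into per-sample quantities. It is only a back-of-the-envelope sketch; the variable names are illustrative, and how the HD AEC actually structures its filter is not stated here.

```python
SAMPLE_RATE_HZ = 16_000   # from the fast facts above
ECHO_TAIL_MS = 128        # from the fast facts above
MIPS_BUDGET = 160         # upper bound quoted for AEC + NR + BF
BYTES_PER_SAMPLE = 2      # 16-bit linear samples

# Number of samples spanned by the echo tail.
echo_tail_samples = SAMPLE_RATE_HZ * ECHO_TAIL_MS // 1000

# Instructions available per input sample at the quoted budget.
instructions_per_sample = MIPS_BUDGET * 1_000_000 // SAMPLE_RATE_HZ

# Raw audio data rate for one channel.
bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE

print(echo_tail_samples)        # 2048
print(instructions_per_sample)  # 10000
print(bytes_per_second)         # 32000
```

A 128 ms tail at 16 kHz therefore covers 2048 samples of echo path, and the whole chain runs in under 10,000 instructions per sample period.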

The future of voice recognition technologies will lie in the detection of inflection and emotion. Adaptive Digital’s clear speech algorithms will aid in the advancement of these technologies.

Other applications for the Adaptive Digital HD AEC clear voice solution include multiple-microphone video conferencing systems, soft phones, Bluetooth speakers, IP cameras, automobile cabins, USB headsets, voice command and control applications (voice-enabled smart speakers, voice-enabled smart gateways), and security, to name a few.