HD SAEC | Stereophonic AEC Software
HD SAEC: High Definition Stereo Acoustic Echo Cancellation Software
Full-band stereo acoustic echo cancellation under a wide dynamic range of audio levels.
Designed for next gen products: Adaptive Digital’s HD SAEC™ is a full-band, high-definition (HD), multi-mic capable, full-duplex stereo acoustic echo canceller (SAEC) that includes noise reduction (NR), anti-howling, adaptive filtering, nonlinear processing, and double-talk detection.
Stereo acoustic echo cancellation is considerably more complex to achieve than mono echo cancellation because of the correlation between the two channels, the resulting non-uniqueness of the adaptive filter solution, and convergence problems.
 
 Availability – Fullband Stereo HD AEC
| Platforms |
| Arm® devices – Armv7-M Cortex-M7 @ 48 kHz, Armv8-A, Armv9-A, Armv8-M Cortex-M33/M35, Armv8.1-M Cortex-M85 |
| Intel Core i7 Mobile – 6600U – Skylake |
ADT HD SAEC is available on the above platforms. Other configurations are available upon request.
Features List
- True full-duplex operation under a wide dynamic range of audio levels, even when microphone input signal is weak
- Programmable sampling rate, supporting narrowband (8 kHz), wideband (16 kHz), super-wideband (32 kHz), and full-band (44.1, 48 kHz)
- Improved adaptive nonlinear processor
- Handles echo tails of 500 msec and longer, with true full-duplex cancellation
- Spectrally representative comfort noise generator
- Automatically adjusts for unknown bulk (buffering/audio driver) delay
- Able to handle strong echo (speaker to microphone gains up to 20 dB)
- Anti-Howling
- Instantly adjusts to user-controlled speaker gain changes
- Handles external user-controlled volume changes
- Parameters are user configurable
- Improved fast convergence and reconvergence
- No divergence during double-talk
- Integrated Automatic Gain Control (AGC)
- Improves speech recognition performance in an echoic environment.
- Integrated Next Gen Noise Reduction (NR)
- Integrated Transmit Equalization
 
 Specifications
Note: HD SAEC Cortex-M4 MIPS generated with 0 wait state FLASH. 
Specifications measured on TI Tiva C series ARM Cortex-M4 based MCU.
SAEC ARM Cortex-M7
 CPU Utilization & Memory Requirements
All memory usage is given in bytes.
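| Platform | Sampling Rate | Tail Length (msec) | MIPS* per Mic | Per Channel Memory |
| Cortex-M7 | 48 kHz | 32 | 257 | 65k |
| Cortex-M7 | 48 kHz | 64 | 271 | 65k |
| Cortex-M7 | 48 kHz | 128 | 310 | 96k |
| Cortex-M7 | 48 kHz | 256 | 383 | 96k |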
* with Anti-howling
SAEC ARM Cortex-M33/M35 – Estimate
 CPU Utilization & Memory Requirements
All memory usage is given in bytes.
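| Platform | Sampling Rate | Tail Length (msec) | MIPS* per Mic | Per Channel Memory |
| Cortex-M33/35 | 48 kHz | 32 | 360 | 65k |
| Cortex-M33/35 | 48 kHz | 64 | 380 | 65k |
| Cortex-M33/35 | 48 kHz | 128 | 434 | 96k |
| Cortex-M33/35 | 48 kHz | 256 | 536 | 96k |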
* with Anti-howling
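To relate the tail lengths in these tables to a filter size, the echo tail can be converted to a number of samples at the given sampling rate. The sketch below only does that arithmetic. It assumes one coefficient per sample of tail, which is a common time-domain model and not something this page states about the HD SAEC implementation. The function name `tail_to_samples` is made up for illustration.

```python
def tail_to_samples(tail_ms: float, sample_rate_hz: int) -> int:
    """Number of samples spanned by an echo tail of tail_ms milliseconds."""
    return round(tail_ms * sample_rate_hz / 1000)


# Tail lengths from the tables above, all at 48 kHz.
for tail_ms in (32, 64, 128, 256):
    samples = tail_to_samples(tail_ms, 48_000)
    print(f"{tail_ms:>3} msec @ 48 kHz spans {samples} samples")
```

Doubling the tail length doubles the number of samples the canceller has to model, which is consistent with the rising MIPS figures in the tables.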
SAEC – Intel Core i7 Mobile, Skylake-U
CPU Utilization: MCPS (millions of cycles per second)
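| Platform | Tail Length | Frame Size | Sampling Rate | Mono/Stereo | MCPS Per Mic |
| Intel Core i7-6600 @ 2.60 GHz | 256 ms | | 16 kHz | mono | 23 |
| Intel Core i7-6600 @ 2.60 GHz | 256 ms | | 16 kHz | stereo | 31 |
| Intel Core i7-6600 @ 2.60 GHz | 256 ms | | 32 kHz | mono | 45 |
| Intel Core i7-6600 @ 2.60 GHz | 256 ms | | 32 kHz | stereo | 63 |
| Intel Core i7-6600 @ 2.60 GHz | 256 ms | | 48 kHz | mono | 50 |
| Intel Core i7-6600 @ 2.60 GHz | 256 ms | | 48 kHz | stereo | 85 |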
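An MCPS figure can be read as a fraction of one core's clock. The sketch below is a rough estimate only: it assumes the core runs steadily at the nominal 2.60 GHz listed in the table and that the load falls on a single core. The helper name `cpu_load_percent` is hypothetical.

```python
def cpu_load_percent(mcps: float, clock_mhz: float) -> float:
    """Share of one core consumed, given a cost in MCPS and a clock in MHz."""
    return 100.0 * mcps / clock_mhz


CLOCK_MHZ = 2600.0  # nominal 2.60 GHz from the table; an assumption about the real clock

# (sampling rate, mono/stereo, MCPS per mic) rows taken from the table above.
rows = [
    ("16 kHz", "mono", 23), ("16 kHz", "stereo", 31),
    ("32 kHz", "mono", 45), ("32 kHz", "stereo", 63),
    ("48 kHz", "mono", 50), ("48 kHz", "stereo", 85),
]
for rate, mode, mcps in rows:
    print(f"{rate} {mode}: {cpu_load_percent(mcps, CLOCK_MHZ):.2f}% of one core per mic")
```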
Stereophonic (Stereo) acoustic echo cancelling (SAEC)
Teleconferencing systems require the use of acoustic echo cancellers (AECs) to reduce echoes that result from coupling between the loudspeaker and microphone. In order to provide a more realistic conversational experience, two-channel audio is necessary. In the case of stereophonic audio transmission, the acoustic echo cancellation problem is more difficult to solve because of the necessity to uniquely identify two acoustic paths.
The disturbance caused by echo becomes more severe as the propagation delay of the channel increases.
In applications such as teleconferencing and hands-free telephony, stereophonic systems give users a sense of telepresence that monaural systems cannot: listeners can localize conference participants, even in meetings where multiple parties are conversing at the same time.
A high-end stereo teleconferencing system provides a more “natural” listening experience between conferees than monophonic systems. The stereophonic (Stereo) acoustic echo canceller (SAEC) software suppresses the echo returned to the transmission room to enable undisturbed communication between the rooms. Participants can more easily discern who is talking at the other end by means of the spatial aspect of the audio output. In such hands-free systems, stereophonic acoustic echo cancellers are absolutely necessary for full-duplex communication.
Stereophonic acoustic echo cancellation is similar to monophonic (mono) AEC in that the echo to be cancelled is due to acoustic coupling between a speaker and a microphone. In the SAEC case, each speaker couples acoustically to each microphone in the system. On the surface, the solution to the problem of cancelling multiple echoes seems relatively uncomplicated: the number of acoustic echo cancellers (AECs) needed in a system containing N speakers and M microphones is N × M. For the full-duplex stereophonic case, four (4) AECs are required. The actual echo cancellation requirements, however, are not so straightforward.
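The count of echo paths can be illustrated by listing every speaker-to-microphone pair. This is a minimal sketch of the counting argument only; the speaker and microphone labels and the function name `echo_paths` are illustrative and do not come from the product.

```python
from itertools import product


def echo_paths(speakers, microphones):
    """Every (speaker, microphone) pair; each pair is one acoustic echo path."""
    return list(product(speakers, microphones))


paths = echo_paths(["left speaker", "right speaker"], ["left mic", "right mic"])
for speaker, mic in paths:
    print(f"{speaker} -> {mic}")
print(f"{len(paths)} echo paths, one AEC per path")
```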
The challenge with SAEC is well documented. It relates to the fact that the audio from each speaker is highly correlated with (similar to) the audio from the other speakers in the system, since all speaker outputs are assumed to come from a common remote room. Because of this high correlation, the adaptive filters have no unique solution, which “confuses” them. The individual adaptive filters can lock onto the same speaker signals, producing an overall effect of slow convergence and high misalignment.
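The non-uniqueness problem can be shown with a deliberately simplified model. The sketch below assumes single-tap echo paths and a right channel that is just a scaled copy of the left, and all numeric values are hypothetical. Under those assumptions a wrong pair of filter gains cancels the echo perfectly, until the relationship between the two channels changes.

```python
def echo(path_gains, x_left, x_right):
    """Single-tap echo model: microphone sample produced by two speaker samples."""
    g_left, g_right = path_gains
    return g_left * x_left + g_right * x_right


def residual_power(channel_gain, true_paths, estimated_paths, source):
    """Mean squared echo left over when estimated_paths is used as the canceller."""
    total = 0.0
    for s in source:
        x_left, x_right = s, channel_gain * s  # right channel is a scaled copy
        error = echo(true_paths, x_left, x_right) - echo(estimated_paths, x_left, x_right)
        total += error * error
    return total / len(source)


true_paths = (0.8, 0.4)   # hypothetical room gains
wrong_paths = (1.0, 0.0)  # a different pair that happens to fit while the gain is 0.5
source = [0.3, -0.7, 0.5, 1.0, -0.2]

print("residual, far-end gain 0.5 :", residual_power(0.5, true_paths, wrong_paths, source))
print("residual, far-end gain 0.25:", residual_power(0.25, true_paths, wrong_paths, source))
```

With the channel relationship fixed, the canceller cannot tell the wrong pair from the true one. When the far-end talker moves and the relationship changes, only the true pair keeps cancelling, so the wrong pair shows up as residual echo.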
The usual solution to this problem is to perform a non-linear operation on each incoming signal. This has the effect of de-correlating the speaker signals, which in turn allows the adaptive filters to converge more quickly and with less misalignment. The downside is that the non-linear operators can cause too much audible distortion for the near-end room listeners, since the speaker signals are now distorted.
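One non-linear operator that is widely discussed in the SAEC literature adds a scaled, half-wave rectified copy of each channel to itself, using the positive half-wave on one channel and the negative half-wave on the other. The sketch below illustrates that general idea and is not a description of the operator used in HD SAEC, which this page does not specify. The value `alpha = 0.5`, the test signal, and the function names are assumptions for illustration.

```python
import math
import random


def decorrelate(left, right, alpha=0.5):
    """Add a scaled half-wave rectified copy to each channel (opposite halves)."""
    new_left = [x + alpha * (x + abs(x)) / 2 for x in left]     # positive half-wave
    new_right = [x + alpha * (x - abs(x)) / 2 for x in right]   # negative half-wave
    return new_left, new_right


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)                       # fixed seed so the run is repeatable
source = [rng.gauss(0.0, 1.0) for _ in range(5000)]
left = source
right = [0.5 * s for s in source]            # fully correlated with the left channel

before = correlation(left, right)
processed_left, processed_right = decorrelate(left, right)
after = correlation(processed_left, processed_right)
print(f"correlation before: {before:.4f}, after: {after:.4f}")
```

A larger `alpha` de-correlates the channels further but also distorts them more, which is the trade-off described above.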
Psychoacoustic non-linear operators can be a good choice, since the human ear hears sound on a non-linear Bark scale. Essentially, the ear is a non-linear listening device. So by distorting the signal in a psychoacoustic fashion, one mathematically de-correlates the signal while, if it is done correctly, making the distortion less perceptible to the ear.
—————————————
Psychoacoustics is the scientific study of sound perception. As soon as sound passes through the ears, it stops being a physical phenomenon and becomes a matter of perception. What we hear almost always differs from the sound that is physically present, due to the peculiarities and limitations of our hearing.
The Bark scale is a nonlinear audio frequency scale that corresponds to the critical bands of human hearing.
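To give a feel for the nonlinearity, the sketch below converts frequencies in Hz to Bark using the approximation published by Zwicker and Terhardt (1980). It is one of several published approximations, and nothing on this page says which one, if any, HD SAEC uses. The function name `hz_to_bark` is illustrative.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Zwicker and Terhardt (1980) approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


for freq_hz in (100, 200, 1000, 2000, 8000, 16000):
    print(f"{freq_hz:>6} Hz -> {hz_to_bark(freq_hz):5.2f} Bark")
```

A step from 100 Hz to 200 Hz covers roughly one Bark, while the same 100 Hz step near 2 kHz covers much less, which is why the scale is described as nonlinear.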