Adaptive Digital's echo canceller is a robust, carrier class, flexible software solution that is available across a wide variety of DSP platforms.
At Adaptive Digital Technologies, we are well known for our echo canceller technology. Our G.168 compliant echo canceller and its predecessor G.165 echo canceller have been deployed around the world and continue to provide carrier class voice quality.
The International Telecommunication Union (ITU), in writing the G.168 echo canceller specification, has created a common method for objectively testing echo cancellers using a set of test procedures. The ITU specifies minimum performance requirements for these tests. The goal is to ensure that G.168 compliant echo cancellers perform well in the telephone network.
While the G.168 tests have merit, G.168 compliance does not guarantee good voice quality. Consequently, the objective test results should always be supplemented with subjective voice quality testing.
This document discusses subjective testing and includes sound clips so that the reader can evaluate Adaptive Digital Technologies’ echo canceller subjectively.
Section 2 of this document provides a brief overview of echo cancellation. Section 3 discusses the tests and provides links to sound clips.
Throughout this document, you will find the following icon, which can be clicked to listen to sample speech clips:
The telephone network contains sources of electrical echo whenever a conversion is done between a 2-wire circuit and a 4-wire circuit. The most common device that connects to a 2-wire circuit is the standard analog telephone. The telephone is connected to the telephone central office by a pair of wires. This pair of wires carries speech signals from the telephone to the central office (receive) and from the central office to the telephone (transmit). At the central office, the 2-wire circuit is converted to a 4-wire circuit using a hybrid circuit. The resulting 4-wire circuit uses one pair of wires for each direction of transmission.
A hybrid circuit does its best to prevent the signal received via the 4-wire circuit from being reflected and echoed back. Hybrid circuits provide some echo isolation, but not enough when the end-to-end circuit delay is even moderate.
In order to combat the echo phenomenon, an echo canceller is employed. Today’s echo cancellers use sophisticated algorithms running on high speed Digital Signal Processors (DSPs) to combat the echo.
Figure 2-1 below is a block diagram of a telephone circuit, including the echo source and the echo canceller.
Phone A’s transmission passes through Hybrid A, through Echo Canceller A, through the Telephone Network, through Echo Canceller B, through Hybrid B to Phone B. A similar path is established in the opposite direction, from Phone B to Phone A.
When Phone A’s transmission reaches Hybrid B, part of the signal is reflected by the hybrid back towards Echo Canceller B and therefore back to Phone A. If echo cancellation is not performed (and the network delay is moderate), the speaker at Phone A will hear an echo of their own voice.
It is the responsibility of Echo Canceller B to cancel the echo that is induced by Hybrid B. Likewise, it is the responsibility of Echo Canceller A to cancel the echo that is induced by Hybrid A.
The terms Near End and Far End are usually used when referring to an echo canceller. For example, the Far End signal enters Echo Canceller A, passes through unchanged, and is sent out to the hybrid. The hybrid, which is at the Near End with respect to Echo Canceller A, reflects a portion of the Far End signal back towards the echo canceller. The Near End signal received by Echo Canceller A therefore consists of the sum of Phone A’s transmit signal and the echo of the Far End signal induced by Hybrid A.
Now that we have the network topology and terminology described, we need to discuss a few issues that are faced by echo cancellers.
Perception of echo is quite subjective. Each person has a different tolerance for echo. The tolerance tends to be a function of the relative level of the echo and the delay between the outgoing speech and the returned echo.
We define Echo Return Loss (ERL) as the ratio (in dB) between the level of outgoing speech and the level of the returned echo component. A high ERL means that the echo is small. Conversely, a small ERL means that the echo level is high.
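The ERL definition above can be made concrete with a small Python sketch. It assumes that "level" means RMS amplitude, so the ratio is expressed as 20·log10; the function names are illustrative only.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def echo_return_loss_db(outgoing, echo):
    """ERL in dB: outgoing speech level relative to the returned echo level."""
    return 20.0 * math.log10(rms(outgoing) / rms(echo))

# A test tone and an echo of it at one tenth of the amplitude.
outgoing = [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
echo = [0.1 * s for s in outgoing]
erl = echo_return_loss_db(outgoing, echo)  # about 20 dB
```

An echo at one tenth of the outgoing amplitude corresponds to an ERL of about 20 dB; a louder echo gives a smaller ERL.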
We also define the round trip delay to be the amount of time elapsed between the start of an outgoing utterance and the arrival of its echo.
A person’s tolerance of echo is hence a function of round trip delay and echo return loss. You can listen for yourself by clicking on the icons in the table below:
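To experiment with the two parameters offline, a signal with a single echo at a chosen ERL and round trip delay can be synthesized. The following is a minimal sketch, assuming an 8 kHz sample rate and a single undistorted reflection; `add_echo` is a hypothetical helper, not the method used to produce the sound clips.

```python
SAMPLE_RATE_HZ = 8000  # assumed narrowband telephony sample rate

def add_echo(speech, erl_db, round_trip_delay_ms):
    """Return what the talker hears: their speech plus one delayed, attenuated copy."""
    gain = 10.0 ** (-erl_db / 20.0)                      # ERL in dB -> linear echo gain
    delay = int(round(round_trip_delay_ms * SAMPLE_RATE_HZ / 1000.0))
    heard = list(speech) + [0.0] * delay                 # room for the echo to ring out
    for n, s in enumerate(speech):
        heard[n + delay] += gain * s
    return heard

# A single click, echoed 1 ms later (8 samples) at an ERL of 20 dB.
speech = [1.0, 0.0, 0.0, 0.0]
heard = add_echo(speech, erl_db=20.0, round_trip_delay_ms=1.0)
```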
A hybrid circuit does not create a brick-wall echo. A brick-wall echo is one where the response to a far end impulse would be a single echoed impulse. Since the hybrid is a circuit, the impulse response of the echo path is of a diffuse nature. The impulse response of the hybrid circuit is referred to as the echo tail. The duration of the echo tail is referred to as the tail length. An echo canceller must cancel the entire tail.
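The difference between a brick-wall echo and a diffuse echo tail can be illustrated by convolving the far end signal with an impulse response. In the sketch below the tail coefficients are made-up values chosen only to show a decaying response, and the 8 kHz sample rate is an assumption.

```python
SAMPLE_RATE_HZ = 8000  # assumed narrowband telephony sample rate

def convolve(signal, impulse_response):
    """Echo produced when `signal` passes through an echo path."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

def tail_length_ms(impulse_response):
    """Duration of the impulse response, i.e. the tail length."""
    return 1000.0 * len(impulse_response) / SAMPLE_RATE_HZ

brick_wall = [0.0, 0.0, 0.0, 0.5]                      # one delayed, scaled impulse
diffuse = [0.0, 0.0, 0.0, 0.5, -0.25, 0.125, -0.0625]  # hypothetical decaying tail

impulse = [1.0]
brick_wall_echo = convolve(impulse, brick_wall)  # a single echoed impulse
diffuse_echo = convolve(impulse, diffuse)        # energy smeared over several samples
```

An echo canceller whose filter covers fewer samples than the tail length leaves the remainder of the tail uncancelled.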
To make matters more interesting, multiple echo sources can be present in the network. This situation is referred to as one with multiple echo tails. A good echo canceller will cancel echo due to all the echo sources in the network.
ADT’s echo canceller finds the locations of multiple tails and cancels them.
2.3 Adaptive Filtering
As each hybrid circuit is slightly different, each echo tail is different as well. Many factors determine the echo path. It is even possible for an echo tail to change while a circuit is active. This could happen when a second telephone extension is taken off-hook in parallel with the first one.
Due to these variations in echo tails, it is necessary for an echo canceller to adapt to the tail continuously. Adaptive Filtering is employed within echo cancellers to this end. The adaptive filters should converge quickly, but not so quickly that they might diverge under some conditions.
This is especially important when a circuit is first established. The amount of time it takes the echo canceller to adapt to an echo path is referred to as the “convergence time”.
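The trade-off between convergence speed and stability can be seen in a textbook adaptive filter. The sketch below uses the normalized LMS (NLMS) algorithm, a common choice for illustration; it is not claimed to be the algorithm inside ADT's product. The step size `mu`, the echo path values, and the white-noise far end signal are all assumptions.

```python
import random

def apply_echo_path(far, echo_path):
    """Near end signal containing only the echo of the far end signal."""
    return [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
            for n in range(len(far))]

def nlms_cancel(far, near, taps, mu=0.5, eps=1e-6):
    """Adapt a `taps`-long filter; return final weights and the residual signal."""
    weights = [0.0] * taps
    history = [0.0] * taps              # most recent far end sample first
    residual = []
    for x, d in zip(far, near):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        e = d - estimate                # what is left after cancellation
        energy = sum(h * h for h in history) + eps
        # Larger mu adapts faster but reacts more strongly to interference.
        weights = [w + mu * e * h / energy for w, h in zip(weights, history)]
        residual.append(e)
    return weights, residual

random.seed(0)
echo_path = [0.0, 0.4, -0.2, 0.1]       # hypothetical short echo tail
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
near = apply_echo_path(far, echo_path)
weights, residual = nlms_cancel(far, near, taps=4)
```

With no near end speech or noise, the weights approach the echo path and the residual shrinks toward zero; the number of samples this takes corresponds to the convergence time.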
An echo canceller adapts its filter based upon a comparison of the far end signal and the echo of the far end signal that is returned via the near end signal. Ideally, the signal from the near end telephone is quiet. If this is not the case, the echo canceller cannot distinguish between near end speech and echo. The near end speech appears as interference to the filter adaptation algorithm.
The condition where both parties are speaking is referred to as double-talk. If a double-talk condition is not detected properly by the echo canceller, the near end speech will cause the adaptive filter to diverge. It is therefore important to have a reliable double-talk detector.
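One widely described approach to double-talk detection is the Geigel detector, which compares the near end level with the recent peak of the far end signal. The sketch below shows that idea only; it is not presented as ADT's detector, and the threshold of 0.5 assumes the hybrid provides at least 6 dB of echo return loss.

```python
def geigel_double_talk(far_history, near_sample, threshold=0.5):
    """Declare double-talk when the near end sample is too large to be echo alone.

    far_history: recent far end samples covering at least the echo tail.
    threshold:   assumed worst-case echo gain (0.5 corresponds to 6 dB ERL).
    """
    far_peak = max(abs(x) for x in far_history)
    return abs(near_sample) > threshold * far_peak

far_history = [0.8, -0.6, 0.2]
echo_only = geigel_double_talk(far_history, 0.3)     # plausible echo level
with_speech = geigel_double_talk(far_history, 0.7)   # too loud to be echo
```

While the detector reports double-talk, filter adaptation would typically be frozen so that near end speech does not disturb the coefficients.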
Double-talk detectors do not detect double-talk instantaneously. As a result, there is a brief period of double-talk during which the adaptive filter might diverge. In order to combat this situation, ADT’s echo canceller remembers a better, more converged, state of the adaptive filter and reverts back to it when double-talk is detected. This enables ADT’s echo canceller to incur zero divergence due to the onset of the double-talk condition.
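The idea of remembering a well-converged filter state and reverting to it can be sketched as a simple backup-and-restore scheme. The class below is a hypothetical illustration of the general technique; the source does not describe how ADT's implementation chooses or stores the remembered state, so the "save when the residual power improves" rule is an assumption.

```python
class FilterBackup:
    """Keep a copy of the best-performing coefficients seen so far."""

    def __init__(self, taps):
        self.active = [0.0] * taps          # coefficients currently adapting
        self.backup = [0.0] * taps          # last known good coefficients
        self.backup_residual = float("inf")

    def maybe_save(self, residual_power):
        """Save the active filter if it cancels better than the backup did."""
        if residual_power < self.backup_residual:
            self.backup = list(self.active)
            self.backup_residual = residual_power

    def restore(self):
        """Called when double-talk is detected: discard recent adaptation."""
        self.active = list(self.backup)

guard = FilterBackup(taps=2)
guard.active = [0.4, -0.2]                  # converged during far end single talk
guard.maybe_save(residual_power=0.001)
guard.active = [0.9, 0.3]                   # disturbed before double-talk was flagged
guard.maybe_save(residual_power=0.2)        # worse, so the backup is left untouched
guard.restore()
```

After `restore()`, adaptation resumes from the saved coefficients rather than from the disturbed ones.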
2.5 Residual Echo and Nonlinear Processing
Echo cancellers are not perfect. There is normally some residual echo that gets past the echo canceller. This residual echo is handled using a non-linear processor. The non-linear processor (NLP) acts to remove the residual echo under certain conditions.
The simplest form of nonlinear processor will mute the residual echo when the NLP is engaged. Muting the residual echo is not always desirable, because near end background noise is muted as well. This results in the background noise being pulsed on and off as the nonlinear processor activates and deactivates. In order to avoid this unwanted pulsing, a comfort noise generator is used. When using a comfort noise generator in conjunction with an NLP, the residual echo is replaced by a synthesized noise signal whose level is that of the background noise. This improves perceived voice quality.
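The difference between muting and comfort noise replacement can be sketched as follows. The decision of when the NLP is engaged is left to the caller, since the source only says it happens under certain conditions; the function name, the mode strings, and the use of Gaussian white noise are assumptions.

```python
import math
import random

def nonlinear_processor(residual, engaged, mode, noise_level, rng):
    """Return the NLP output for one block of residual samples."""
    if not engaged:
        return list(residual)                 # pass the signal through untouched
    if mode == "mute":
        return [0.0] * len(residual)          # background noise disappears too
    if mode == "cng":
        # Replace the residual with noise at the measured background level.
        return [rng.gauss(0.0, noise_level) for _ in residual]
    raise ValueError("unknown mode: " + mode)

rng = random.Random(1)
residual = [0.01 * math.sin(0.3 * n) for n in range(5000)]
muted = nonlinear_processor(residual, True, "mute", 0.02, rng)
comfort = nonlinear_processor(residual, True, "cng", 0.02, rng)
```

In "mute" mode the output drops to silence whenever the NLP engages, which is what produces the pulsing effect; in "cng" mode the output stays at roughly the background level.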
While it is necessary for the synthesized comfort noise to be generated at the same level as that of the background noise, this by itself is not sufficient for optimum voice quality. If the spectrum of the comfort noise is not similar enough to that of the background noise, the listener will be able to hear the transition between background and comfort noise. Shaped comfort noise is desirable because background noise, in general, does not have a flat spectrum.
This section discusses the various sound clips that can be used for subjective evaluation of Adaptive Digital Technologies’ G.168 echo canceller.
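As a rough illustration of spectral shaping, the sketch below fits a first-order autoregressive model to the background noise and drives it with white noise, so the synthesized noise matches both the level and the overall spectral tilt. This is a deliberately simple stand-in; practical comfort noise generators typically use higher-order models, and nothing here is taken from ADT's implementation.

```python
import math
import random

def fit_ar1(noise):
    """Estimate (lag-1 correlation coefficient, RMS level) of a noise segment."""
    power = sum(x * x for x in noise) / len(noise)
    lag1 = sum(noise[n] * noise[n - 1] for n in range(1, len(noise))) / (len(noise) - 1)
    coeff = max(-0.99, min(0.99, lag1 / power))   # keep the model stable
    return coeff, math.sqrt(power)

def shaped_comfort_noise(count, coeff, level, rng):
    """White noise through a one-pole filter, scaled so the output RMS is `level`."""
    drive = level * math.sqrt(1.0 - coeff * coeff)
    y, out = 0.0, []
    for _ in range(count):
        y = coeff * y + drive * rng.gauss(0.0, 1.0)
        out.append(y)
    return out

rng = random.Random(7)
# Stand-in for measured background noise with a low-frequency tilt.
background = shaped_comfort_noise(20000, 0.9, 0.05, rng)
coeff, level = fit_ar1(background)
comfort = shaped_comfort_noise(20000, coeff, level, rng)
```

Setting `coeff` to zero gives flat (random) noise at the same level, which corresponds to the unshaped case.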
3.1 Convergence with no Near End Speaker
This test is run with a far end speaker talking. The near end signal contains only the echo of the far end signal. No near end speaker is talking. This test demonstrates the convergence of the echo canceller. This test is run with and without the non-linear processor (NLP) enabled.
3.2 Convergence in the Presence of Background Noise
This test is run with a far end speaker talking. The near end signal contains the sum of the echo of the far end signal and additive background noise. This test demonstrates the convergence of the echo canceller in the presence of background noise. This test is run with the non-linear processor (NLP) in each of the following four configurations:
NLP Off - In this case, the residual echo is heard at the start of the speech segment while the echo canceller initially converges.
NLP Mute - In this case, the residual echo is muted along with the background noise.
NLP CNG Random - In this case, the residual is replaced by random noise. The listener will note that the comfort noise does not appear at the correct level for the first few seconds. This is due to the time it takes the comfort noise generator to determine the level of the background noise. The transitions between true background and comfort noise are quite noticeable because random noise is not spectrally similar to the background noise.
NLP CNG Shaped - In this case, the residual is replaced by shaped noise. Once again, the listener will note that the comfort noise does not appear at the correct level for the first few seconds. This is due to the time it takes the comfort noise generator to determine the level of the background noise. The transitions between true background and comfort noise are less apparent than in the random noise case because the spectrum of the shaped noise is close to that of the background noise.
3.3 Performance in the Presence of Double-Talk
This test is run with a far end speaker talking. The near end signal contains the sum of the echo of the far end signal and a simulated near end speaker with increasing and then decreasing amplitude. The simulated near end speech signal is the ITU recommended Composite Speech Signal (CSS). The simulated near end speaker provides a double-talk condition. This test is run with the NLP off.
3.4 Non-divergence in the Presence of Tones
This test is run with a far end speaker talking. A sequence of DTMF tones is inserted into the middle of the far end speech. The near end signal consists only of an echo of the far end speech. This test is run with the NLP off.
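A far end test signal of this kind can be approximated by splicing DTMF bursts into a speech recording. The sketch below uses the standard DTMF row and column frequencies; the 8 kHz sample rate, the 50 ms tone and gap durations, and the helper names are assumptions, not details of the test described here.

```python
import math

SAMPLE_RATE_HZ = 8000                      # assumed narrowband telephony rate
ROW_HZ = (697, 770, 852, 941)              # standard DTMF low-group frequencies
COL_HZ = (1209, 1336, 1477, 1633)          # standard DTMF high-group frequencies
KEYPAD = ("123A", "456B", "789C", "*0#D")

def dtmf_tone(key, duration_ms=50, amplitude=0.5):
    """Sum of the row and column sinusoids for one keypad character."""
    if len(key) != 1:
        raise ValueError("expected a single key")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col >= 0:
            low, high = ROW_HZ[row], COL_HZ[col]
            break
    else:
        raise ValueError("not a DTMF key: " + key)
    count = SAMPLE_RATE_HZ * duration_ms // 1000
    step = 2.0 * math.pi / SAMPLE_RATE_HZ
    return [0.5 * amplitude * (math.sin(low * step * n) + math.sin(high * step * n))
            for n in range(count)]

def insert_tones(speech, keys, position, gap_ms=50):
    """Splice a sequence of DTMF bursts into `speech` at sample index `position`."""
    gap = [0.0] * (SAMPLE_RATE_HZ * gap_ms // 1000)
    burst = []
    for key in keys:
        burst += dtmf_tone(key) + gap
    return speech[:position] + burst + speech[position:]

tone = dtmf_tone("1")                       # 697 Hz + 1209 Hz
far = insert_tones([0.0] * 100, "12", position=50)
```

Tones are a useful stress case because they are narrowband and periodic, unlike speech.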
