Time Domain Voicing Cutoff (TDVC) is a flexible, low-complexity vocoder that produces high-quality speech at 1.95 kbps. TDVC represents a breakthrough in high-quality, low-rate voice encoding. The algorithm was designed for low-bit-rate voice applications that require a high degree of speech intelligibility and natural voice quality.
TDVC is a parametric vocoder of the highest quality that:
1) Outperforms other commercial and military parametric coders.
2) Operates at a lower rate than other commercial and military coders.
3) Approaches the performance of the previous generation CELP-based waveform coders (the goal was to equal the MOS score of the first-generation GSM codec).
4) With additional FEC, operates in a 7% bit-error-rate Gaussian channel environment with minimal quality loss.
5) Without additional FEC, operates in a 30% frame loss environment using the internal packet loss compensation algorithm.
ADT TDVC is available on the TMS320C6000™ and C5000™ DSP families (C64x™ and C55x™ DSP generations).
TDVC C64x
All memory requirements are in bytes.
| Software | MIPS (Average) | MIPS (Max) | Program Memory | Data Memory | Scratch | Per-Channel Data Memory |
| Encode | 45.0 | 47.5 | 67k | 1k | 16384 | 3740 |
| Decode | 20.4 | 29.1 | | | | 660 |
TDVC C55x
| Software | MIPS (Average) | MIPS (Max) | Program Memory | Data Memory | Scratch | Per-Channel Data Memory |
| Encode | 55.45 | 59.0 | 28k | 512 | 16384 | 3740 |
| Decode | 14.86 | 19.0 | | | | 660 |
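The figures in the two tables can be turned into a rough capacity estimate. The sketch below assumes that a full-duplex channel runs one encoder and one decoder, that the worst-case (Max) MIPS figures are the ones to budget for, and that the 660-byte decoder figure is per-channel data memory. The function names and the 600 MIPS budget are hypothetical and are not taken from any data sheet.

```python
# MIPS and per-channel memory figures copied from the tables above (bytes).
TDVC = {
    "C64x": {"enc_max_mips": 47.5, "dec_max_mips": 29.1,
             "enc_chan_bytes": 3740, "dec_chan_bytes": 660},
    "C55x": {"enc_max_mips": 59.0, "dec_max_mips": 19.0,
             "enc_chan_bytes": 3740, "dec_chan_bytes": 660},
}


def full_duplex_channels(family, cpu_mips):
    """Whole encoder+decoder pairs that fit in a MIPS budget (worst case)."""
    f = TDVC[family]
    per_channel = f["enc_max_mips"] + f["dec_max_mips"]
    return int(cpu_mips // per_channel)


def per_channel_bytes(family):
    """Per-channel data memory for one encoder plus one decoder."""
    f = TDVC[family]
    return f["enc_chan_bytes"] + f["dec_chan_bytes"]


# Hypothetical 600 MIPS budget on a C64x-class device.
print(full_duplex_channels("C64x", 600.0))  # 600 // 76.6 -> 7
print(per_channel_bytes("C64x"))            # 3740 + 660 -> 4400
```

A real deployment would also have to budget for the shared program, data, and scratch memory, plus whatever else runs on the device.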
COMPETING ALGORITHMS
What are the competing algorithms?
Low-rate CELP derivatives
MELP-like parametric coders
AMBE-like parametric coders
MELP is most similar to TDVC in that it uses a 10th order LPC analysis for the gross spectrum, and both use a multi-stage VQ to transmit the coefficients. In fact, the MELP and TDVC spectral quantization algorithms may be interchanged between the vocoders, and both systems will perform in a similar fashion.
AMBE is also a parametric algorithm, but operates in the frequency domain. The gross spectrum is transmitted by quantizing spectral amplitudes (which are grouped into bands).
All parametric coders transmit at least 4 classes of parameters: gross spectrum, pitch, voicing, and energy. Some may send multiple estimates of the parameters per frame, while others may send additional information designed to improve some aspect of performance.
While CELP-like coders may use many of the parametric vocoder parameters, they are inherently waveform coders, because the final decision metric for the excitation parameters is ultimately an MMSE match of the input and output waveforms. CELP coders generally transmit parameters for 1) gross spectrum (using some transform of the LPC coefficients, which are determined in the same fashion as in MELP and TDVC), 2) long-term predictor lag (which is similar to the parametric coder’s pitch, but is actually a waveform coder parameter), 3) pitch lag gain (possibly multiple coefficients), 4) excitation index (or indices, as there could be several), and 5) excitation gain(s). In lieu of excitation gain, there could be a full-frame RMS measurement.
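The waveform-matching criterion that makes CELP a waveform coder can be illustrated with a toy codebook search. This is a minimal sketch and not any particular codec: it assumes the candidate excitation vectors have already been passed through the synthesis filter, and the function name `search_codebook` and the example vectors are made up for illustration.

```python
def search_codebook(target, filtered_codebook):
    """Return (index, gain, squared_error) of the minimum-squared-error entry.

    target: target waveform samples for one subframe.
    filtered_codebook: candidate excitation vectors, already passed
    through the synthesis filter (toy data in this sketch).
    """
    best = None
    energy_t = sum(t * t for t in target)
    for index, cand in enumerate(filtered_codebook):
        energy_c = sum(c * c for c in cand)
        if energy_c == 0.0:
            continue                              # silent candidate, skip it
        corr = sum(t * c for t, c in zip(target, cand))
        gain = corr / energy_c                    # least-squares optimal gain
        err = energy_t - corr * corr / energy_c   # error at that gain
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best


target = [0.0, 2.0, 0.0, -2.0]
codebook = [
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
    [1.0, 1.0, 1.0, 1.0],
]
index, gain, err = search_codebook(target, codebook)
print(index, gain, err)   # entry 1 with gain 2.0 matches the target exactly
```

The index and gain chosen this way are what a CELP coder transmits. A parametric coder such as TDVC never compares waveforms sample by sample.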
What makes TDVC better?
First, let’s describe “better”. “Better” means higher voice quality at a lower rate, so anything that reduces the rate (but keeps the quality, or improves it at the same time) is a discriminator. Likewise, an increase in quality without an increase in rate is a similar discriminator.
TDVC is a mature algorithm that has been extensively tested*; a total of 10 MOS and 2 DAM tests have been completed using 2 independent testing laboratories and 4 different speech sources, including foreign languages.
The overall high scores are achieved by the combination of many smaller algorithm improvements that either improve quality, lower rate, or perform both functions. These small improvements constitute the key “tricks of the trade” and are either patented or closely guarded trade secrets.

Voicing: TDVC voicing is transmitted with only 3 bits. MELP and AMBE both use more: 5 bits for MELP, including overall voicing, and 9 bits for AMBE, although the AMBE figure depends on the algorithm version. The key difference is that TDVC uses only one transition frequency to mark the boundary between voiced excitation and unvoiced excitation, while the others assign the excitation type per frequency band. We have found that bandpass voicing does not significantly improve the quality over single-frequency cutoff voicing. [Same quality, lower rate].
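The single-cutoff idea can be sketched in a few lines: three bits select one of eight cutoff frequencies, and every harmonic below the cutoff is treated as voiced while everything above it is unvoiced. The 8 kHz sampling rate, the uniform spacing of the eight levels, and the function names are assumptions made for this illustration. The actual TDVC quantizer table is not given in the text.

```python
SAMPLE_RATE_HZ = 8000.0        # assumed narrowband sampling rate
VOICING_BITS = 3               # from the text: voicing costs 3 bits
LEVELS = 2 ** VOICING_BITS     # 8 possible cutoff codes
NYQUIST_HZ = SAMPLE_RATE_HZ / 2.0
STEP_HZ = NYQUIST_HZ / (LEVELS - 1)   # assumed uniform spacing of the levels


def quantize_cutoff(cutoff_hz):
    """Map a cutoff frequency to a 3-bit code (0 = unvoiced, 7 = fully voiced)."""
    clamped = max(0.0, min(cutoff_hz, NYQUIST_HZ))
    return int(round(clamped / STEP_HZ))


def dequantize_cutoff(code):
    """Cutoff frequency in Hz represented by a 3-bit code."""
    return code * STEP_HZ


def harmonic_is_voiced(pitch_hz, code):
    """Per-harmonic flags: voiced at or below the cutoff, unvoiced above it."""
    cutoff = dequantize_cutoff(code)
    n_harm = int(NYQUIST_HZ // pitch_hz)
    return [k * pitch_hz <= cutoff for k in range(1, n_harm + 1)]


code = quantize_cutoff(2300.0)
flags = harmonic_is_voiced(200.0, code)
print(code, sum(flags), len(flags))
```

A band-based scheme would instead send one flag per band. The single code above carries the same kind of decision in three bits.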
Voicing determination: TDVC can use either time-domain (filter bank) or frequency domain (FFT) algorithms to determine the voicing cutoff frequency. However, as with all voicing determination algorithms, there are several repeatable scenarios where the algorithm will make an incorrect voicing decision. TDVC contains a heuristic voicing smoother that recognizes these scenarios by observing the history of several statistics over a period of 100 milliseconds. There are seven scenarios in the smoother; see US patent 6,078,880 columns 13-14 for operational details. Other parametric coders have been observed making the same voicing errors that the heuristic smoother removes (MELP is one example). [Higher quality, same rate].
Voiced excitation: TDVC uses a sum-of-sinusoids method to generate the voiced excitation, while MELP uses a time-domain impulse train and a pulse phase dispersion filter. AMBE also uses a sum-of-sinusoids method, but with different amplitude, phase, and interpolation control. TDVC creates an entire pitch period of excitation (pitch epoch) using interpolated values based on the epoch’s position within the output frame. The characteristics of the epoch (relative phase, length, and relative harmonic amplitude) are kept constant over the epoch. This gives a more distinct sound quality to the voiced speech. [Higher quality, same rate].
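A sum-of-sinusoids pitch epoch can be sketched as follows. The amplitude and phase of each harmonic are held fixed for the whole epoch, as described above. The sketch takes those values as given: interpolating them from the frame parameters is omitted, and the 8 kHz rate, the flat amplitudes, and the zero phases are placeholder assumptions rather than TDVC's actual values.

```python
import math


def synthesize_epoch(period_samples, amplitudes, phases):
    """One pitch epoch as a sum of harmonics of the fundamental.

    amplitudes[k-1] and phases[k-1] belong to harmonic k and stay
    fixed for the whole epoch.
    """
    w0 = 2.0 * math.pi / period_samples      # fundamental, radians per sample
    epoch = []
    for n in range(period_samples):
        sample = 0.0
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(k * w0 * n + phase)
        epoch.append(sample)
    return epoch


period = 80                      # 100 Hz pitch at an assumed 8 kHz rate
n_harm = period // 2 - 1         # harmonics strictly below Nyquist: 39
amps = [1.0] * n_harm            # placeholder flat amplitudes
phases = [0.0] * n_harm          # placeholder zero phases (pulse-like epoch)
epoch = synthesize_epoch(period, amps, phases)
print(len(epoch), epoch[0])
```

With all phases at zero the epoch collapses into a single sharp pulse. Spreading the phases disperses that energy across the pitch period.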
Voiced excitation relative phase: the relative phases of the harmonics are based on a patented “base phase vector” and are adapted according to the fundamental pitch of the speaker: low-pitched male speakers get wider inter-harmonic phase dispersion than high-pitched female speakers. See US patent 6,119,082, columns 17 and 18. This technique gives better sound quality than the MELP dispersion filter because it is responsive to the individual speaker’s pitch period. [Higher quality, same rate].
Voiced excitation relative amplitude: the individual relative amplitudes of the voiced excitation are controlled such that the amplitudes within the spectral valleys are reduced, while the amplitudes near the spectral peaks are unchanged. The locations of the spectral peaks can be determined by either: 1) finding the roots of the LPC polynomial (real or complex) and then determining the analog frequency of each root using the arctangent function, or 2) determining the spectral magnitude at each harmonic, and then searching for the peak values. The details of the amplitude determination can be found in US patent 6,098,036, columns 17 and 18. The results of this method are similar to those of an adaptive postfilter, but this algorithm does not affect the unvoiced spectrum, resulting in better overall speech quality. MELP uses a separate harmonic amplitude quantizer to achieve a similar effect. [Higher quality and lower rate].
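Both ways of locating spectral peaks can be illustrated on a toy second-order LPC filter with a single resonance at 1000 Hz. This is only a sketch of the peak-finding step, not of the patented amplitude control. The 8 kHz sampling rate, the function names, and the toy pole are assumptions, and the root is constructed directly rather than found by a polynomial solver.

```python
import cmath
import math

FS = 8000.0  # assumed sampling rate in Hz


def root_to_hz(root):
    """Method 1: frequency of a complex LPC root via the arctangent."""
    return math.atan2(root.imag, root.real) * FS / (2.0 * math.pi)


def lpc_magnitude(a, freq_hz):
    """|1/A(e^{jw})| for A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    w = 2.0 * math.pi * freq_hz / FS
    A = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(a))
    return 1.0 / abs(A)


def peak_harmonics(a, pitch_hz):
    """Method 2: harmonic numbers whose magnitude exceeds both neighbours."""
    n = int((FS / 2.0) // pitch_hz)
    mags = [lpc_magnitude(a, k * pitch_hz) for k in range(1, n + 1)]
    peaks = []
    for i in range(1, n - 1):                 # interior harmonics only
        if mags[i - 1] < mags[i] > mags[i + 1]:
            peaks.append(i + 1)               # mags[i] belongs to harmonic i + 1
    return peaks


# Toy filter: one conjugate pole pair at radius 0.95 and 1000 Hz.
theta = 2.0 * math.pi * 1000.0 / FS
root = cmath.rect(0.95, theta)
a = [-2.0 * 0.95 * math.cos(theta), 0.95 ** 2]

print(root_to_hz(root))          # close to 1000.0
print(peak_harmonics(a, 200.0))  # harmonic 5 (1000 Hz) is the peak
```

Harmonics that are not near a detected peak would then be the ones whose amplitudes are reduced.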
Pitch-adaptive bass restoration: TDVC uses a pitch-adaptive algorithm to increase the amplitude of the fundamental, first, second, and third harmonics for voiced speech. This significantly improves the perceived voice quality for male speakers. See US patent 6,081,777. Here again, MELP uses a separate harmonic amplitude quantizer to achieve a similar effect. [Higher quality, same rate].
Precise, quarter-frame gain control: Like MELP, the TDVC encoder transmits the RMS value of the input speech. It is the decoder’s job to match the output RMS with that of the input. TDVC uses a patented (US patent 5,138,661) quadratic gain matching algorithm that takes into account the synthesis filter state, the current subframe’s excitation and synthesis filter coefficients, and the future values of the output waveform up to the next pitch epoch. For voiced speech, gain calculations are always made over one or more full pitch epochs. This calculation is repeated four times per frame for the logarithmically interpolated RMS values and results in a much smoother output sound quality than competing algorithms. MELP transmits 2 gains per frame in pursuit of a similar quality. [Higher quality and lower rate].
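The two ingredients named above, logarithmic interpolation of the RMS and a quadratic solve for the gain, can be sketched as follows. This is one reading of the description and not the patented algorithm. It assumes the decoder output is the sum of the filter's ringing from its saved state and a gain times its response to the unit-gain excitation, so matching a target RMS gives a quadratic in the gain. All names are hypothetical.

```python
import math


def interpolated_rms(prev_rms, cur_rms, subframes=4):
    """Logarithmic (geometric) interpolation of RMS across one frame."""
    out = []
    for i in range(1, subframes + 1):
        t = i / subframes
        log_rms = (1.0 - t) * math.log(prev_rms) + t * math.log(cur_rms)
        out.append(math.exp(log_rms))
    return out


def matching_gain(ringing, driven, target_rms):
    """Solve sum((ringing + g * driven)^2) = N * target_rms^2 for g >= 0.

    ringing: synthesis-filter output from its saved state (zero input).
    driven:  synthesis-filter output for the unit-gain excitation (zero state).
    """
    n = len(driven)
    a = sum(d * d for d in driven)
    b = 2.0 * sum(r * d for r, d in zip(ringing, driven))
    c = sum(r * r for r in ringing) - n * target_rms ** 2
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return 0.0            # target unreachable: fall back to no excitation
    return max(0.0, (-b + math.sqrt(disc)) / (2.0 * a))


targets = interpolated_rms(100.0, 1600.0)       # about 200, 400, 800, 1600
g = matching_gain([1.0] * 4, [1.0] * 4, 3.0)    # (1 + g)^2 = 9, so g = 2
print(targets, g)
```

Ignoring the ringing term would reduce this to a simple ratio of RMS values. Including it is what makes the problem quadratic.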
Spectral smoothing: TDVC employs an adaptive gross spectral smoother in the speech synthesizer that adjusts the LSF parameters such that the speech output has minimal perceptual “shakiness” without muffling phoneme transitions. [Higher quality, same rate].
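The text does not say how the smoother adapts, so the following is only a generic illustration of the trade-off it describes: smooth heavily while the spectrum is nearly steady, and back off when the LSFs jump at a phoneme transition. The change-dependent weighting rule, the 150 Hz threshold, and the function name are all hypothetical.

```python
def smooth_lsf(prev_lsf, cur_lsf, transition_threshold_hz=150.0):
    """Blend toward the previous frame only while the spectrum is nearly steady."""
    n = len(cur_lsf)
    mean_change = sum(abs(c - p) for c, p in zip(cur_lsf, prev_lsf)) / n
    # Weight on the previous frame: 0.5 when nothing changed,
    # falling to 0 at or above the (hypothetical) transition threshold.
    w = 0.5 * max(0.0, 1.0 - mean_change / transition_threshold_hz)
    return [w * p + (1.0 - w) * c for p, c in zip(prev_lsf, cur_lsf)]


prev = [300.0, 1000.0, 2000.0]
steady = smooth_lsf(prev, [310.0, 1010.0, 2010.0])      # small change: smoothed
jump = smooth_lsf(prev, [700.0, 1400.0, 2400.0])        # transition: passed through
print(steady, jump)
```

Because each output is a weighted average of two ordered LSF sets, the smoothed LSFs stay in ascending order.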
Adaptive analysis window placement: If a voicing onset is detected, TDVC will advance the analysis window such that it contains all voiced speech. This prevents transmission of a transition frame that has significant perceptual artifacts. [Higher quality, same rate].
There are several new algorithmic improvements that are not yet covered under US patents.
TDVC - Time Domain Voicing Cutoff
MOS - Mean Opinion Score
CELP - Code Excited Linear Prediction
FEC - Forward Error Correction
MELP - Mixed Excitation Linear Prediction
AMBE - Advanced Multi-Band Excitation
GSM - Global System for Mobile Communications
----------------------------------------------------------------------------------------------
Parametric Coder - A parametric vocoder uses known characteristics of the human voice to encode and decode speech. It is an efficient method of speech compression: the parameters of a linear speech model are first analyzed and then transmitted across the communication channel, and the decoder uses the same linear model to synthesize a reproduction of the speech signal.
Assignee: Lockheed Martin Corporation (Bethesda, MD, US)
TDVC Algorithm: 6,078,880 (seminal); 6,119,082; 6,098,036; 6,094,629; 6,081,777; 6,081,776; 6,073,093; 6,067,511; 6,138,092.
* Testing performed by Dynastat, Inc., an independent testing laboratory.