CELT - Technology | Technology Trends

Technology

CELT is a transform codec based on the modified discrete cosine transform (MDCT) and on concepts from CELP (a codebook for the excitation, but applied in the frequency domain).

The initial PCM-coded signal is processed in relatively small, overlapping blocks: each block is multiplied by a window function and transformed to frequency coefficients with the MDCT. Choosing an especially short block size enables low latency, but it also leads to poor frequency resolution, which has to be compensated for. To reduce the algorithmic delay further at the expense of a minor sacrifice in audio quality, the 50% overlap between blocks that the MDCT inherently uses is effectively cut in half by silencing the signal during one eighth of the block at each end.
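The windowing scheme described above can be illustrated with a small sketch. It is not CELT's actual code: the function names are hypothetical, the sine-shaped slope is an assumption chosen for simplicity (the text does not say which window shape CELT uses), and the transform is a direct, slow MDCT written for readability. The window is zero for the first and last eighth of each block, yet overlapping blocks still add up to the original signal.

```python
import math

def low_overlap_window(n):
    """Window of length 2*n that is zero for one eighth of the block at each end.

    Assumes n is divisible by 4. The sine slope is an illustrative choice.
    """
    q = n // 4  # one eighth of the 2*n block
    half = []
    for i in range(n):
        if i < q:
            half.append(0.0)                      # silenced part
        elif i < 3 * q:
            half.append(math.sin(0.5 * math.pi * (i - q + 0.5) / (2 * q)))
        else:
            half.append(1.0)                      # flat part
    return half + half[::-1]                      # symmetric window

def mdct(block, window):
    """Direct MDCT: 2*n windowed samples in, n coefficients out."""
    n = len(block) // 2
    return [
        sum(window[t] * block[t]
            * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs, window):
    """Direct inverse MDCT with synthesis window: n coefficients in, 2*n samples out."""
    n = len(coeffs)
    return [
        window[t] * (2.0 / n)
        * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
              for k in range(n))
        for t in range(2 * n)
    ]

def analyse_and_resynthesise(signal, n):
    """Transform overlapping blocks (hop n) and overlap-add the inverse transforms."""
    window = low_overlap_window(n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        coeffs = mdct(signal[start:start + 2 * n], window)
        for t, value in enumerate(imdct(coeffs, window)):
            out[start + t] += value
    return out
```

Samples covered by two blocks (everything except the first and last n samples) come back unchanged, because the squared window values of neighbouring blocks sum to one. Only the sloped part, a quarter of each block, actually overlaps with its neighbour.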

The coefficients are grouped into bands that resemble the critical bands of the human auditory system. The total energy of each band is analysed, and the energy values are quantised for data reduction and then compressed through prediction: only the difference from the predicted values is transmitted (delta encoding).
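As a rough sketch of this step, the following code computes one energy value per band, quantises it coarsely and delta-encodes the result. Several details are assumptions made for illustration and are not taken from the text: the band edges, the logarithmic scale with a 6 dB step, and the predictor, which here is simply the previous band's quantised value.

```python
import math

def band_energies(coeffs, band_edges):
    """Energy (square root of the summed squares) of each band.

    band_edges lists the start index of each band plus the final end index.
    """
    return [
        math.sqrt(sum(c * c for c in coeffs[lo:hi]))
        for lo, hi in zip(band_edges, band_edges[1:])
    ]

def quantise_log_energy(energies, step_db=6.0, floor=1e-9):
    """Quantise energies to integer steps on a dB scale (assumed step size)."""
    return [round(20.0 * math.log10(max(e, floor)) / step_db) for e in energies]

def delta_encode(values):
    """Replace each value by its difference from the predicted value.

    The prediction is the previous value (0 for the first one).
    """
    deltas, prediction = [], 0
    for v in values:
        deltas.append(v - prediction)
        prediction = v
    return deltas

def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    values, prediction = [], 0
    for d in deltas:
        prediction += d
        values.append(prediction)
    return values
```

Because neighbouring bands tend to have similar energy, the differences are usually small numbers, which a subsequent entropy coder can represent with fewer bits than the raw values.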

The (unquantised) band energy values are divided out of the raw MDCT coefficients (normalisation). The coefficients of the resulting residual signal (the so-called “band shape”) are coded by Pyramid Vector Quantisation (PVQ, a spherical vector quantisation). This encoding produces code words of fixed (predictable) length, which in turn provides robustness against bit errors and leaves no need for entropy encoding. Finally, all outputs of the encoder are coded into one bitstream by a range encoder. In connection with PVQ, CELT uses a technique known as band folding, which is said to deliver an effect similar to spectral band replication (SBR) by reusing coefficients of lower bands for higher ones, while having far less impact on algorithmic delay and computational complexity than SBR. Band folding works against “birdie” artifacts by preserving more richness in the affected frequency bands.
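The normalisation and PVQ steps can be sketched as follows. This is a simplified illustration, not CELT's implementation: the function names are hypothetical, and the greedy pulse search is just one straightforward way to pick a code vector. A PVQ codebook for a band of n coefficients consists of all integer vectors whose absolute values sum to a fixed pulse count k. Since the codebook size depends only on n and k, the index length is known in advance.

```python
import math
from functools import lru_cache

def normalise(band):
    """Split a band into its energy and a unit-norm shape vector."""
    energy = math.sqrt(sum(c * c for c in band))
    if energy == 0.0:
        return 0.0, [0.0] * len(band)
    return energy, [c / energy for c in band]

def pvq_quantise(shape, k):
    """Greedily choose integer pulses y with sum(|y_i|) == k that match the shape."""
    mags = [abs(s) for s in shape]
    total = sum(mags)
    if total == 0.0:
        return [k] + [0] * (len(shape) - 1)   # arbitrary choice for a silent band
    pulses = [int(k * m / total) for m in mags]  # initial projection, rounded down
    xy = sum(m * p for m, p in zip(mags, pulses))
    yy = sum(p * p for p in pulses)
    for _ in range(k - sum(pulses)):
        # Place each remaining pulse where it maximises (x.y)^2 / (y.y).
        best_i, best_score = 0, -1.0
        for i, m in enumerate(mags):
            new_xy, new_yy = xy + m, yy + 2 * pulses[i] + 1
            score = new_xy * new_xy / new_yy
            if score > best_score:
                best_i, best_score = i, score
        xy += mags[best_i]
        yy += 2 * pulses[best_i] + 1
        pulses[best_i] += 1
    return [p if s >= 0 else -p for p, s in zip(pulses, shape)]

def pvq_dequantise(pulses):
    """Map an integer pulse vector back onto the unit sphere."""
    norm = math.sqrt(sum(p * p for p in pulses))
    return [p / norm for p in pulses]

@lru_cache(maxsize=None)
def codebook_size(n, k):
    """Number of integer vectors of length n whose absolute values sum to k."""
    if k == 0:
        return 1
    if n == 0:
        return 0
    return codebook_size(n - 1, k) + codebook_size(n, k - 1) + codebook_size(n - 1, k - 1)

def codeword_bits(n, k):
    """Bits needed for a fixed-length index into the codebook."""
    return (codebook_size(n, k) - 1).bit_length()
```

For example, a band of 3 coefficients with 2 pulses has 18 possible code vectors, so a fixed-length index for it fits in 5 bits regardless of the signal.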

The decoder unpacks the individual components from the range-coded bitstream, multiplies the band shape coefficients by the band energy and transforms the result back to PCM data (via the inverse MDCT, iMDCT). The individual blocks are rejoined using weighted overlap-add (WOLA). Many parameters are not explicitly coded, but are instead reconstructed by using the same functions as the encoder.
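Two of the decoder steps are simple enough to show in a few lines. The sketch below uses hypothetical helper names and assumes that the band energies and shapes have already been unpacked, and that each block passed to the overlap-add step has already been inverse-transformed and multiplied by the synthesis window.

```python
def denormalise(energies, shapes):
    """Scale each unit-norm band shape by its band energy and concatenate the bands."""
    coeffs = []
    for energy, shape in zip(energies, shapes):
        coeffs.extend(energy * s for s in shape)
    return coeffs

def overlap_add(blocks, hop):
    """Rejoin windowed blocks by summing them at offsets of `hop` samples."""
    if not blocks:
        return []
    out = [0.0] * (hop * (len(blocks) - 1) + len(blocks[0]))
    for index, block in enumerate(blocks):
        for t, value in enumerate(block):
            out[index * hop + t] += value
    return out
```

With a hop of half the block length, every output sample apart from the very first and last half-blocks receives contributions from two consecutive blocks.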

For channel coupling, CELT may use M/S stereo or intensity stereo. Blocks can be coded independently of adjacent frames (intra-frame coding), for example to enable a decoder to jump into a running stream. With transform codecs, so-called pre-echo artifacts can become audible, because the quantisation error of sharp, energy-heavy sounds (transients) can spread over the entire MDCT block, and a transient does not mask such errors backward in time as well as it does forward. In CELT, each block can be divided further to counter such artifacts.
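The M/S (mid/side) coupling mentioned above can be illustrated generically. This sketch shows only the basic sum and difference idea on two lists of values. The scaling by one half is an assumed convention, and the text does not describe how CELT applies the coupling per band, so none of that is modelled here.

```python
def ms_encode(left, right):
    """Convert left/right channels to mid (average) and side (half difference)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover left/right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

When both channels carry similar content, the side signal is close to zero and is cheap to code, which is the motivation for this kind of coupling.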
