From e9c86133b6239cfbb848c60fa1cd63d962b8c03e Mon Sep 17 00:00:00 2001
From: Jean-Marc Valin
Date: Tue, 23 Dec 2008 14:48:27 -0500
Subject: [PATCH] Some details on the MDCT, fixed a bunch of warnings

---
 doc/ietf/draft-valin-celt-codec.xml | 118 +++++++++++-----------------
 1 file changed, 47 insertions(+), 71 deletions(-)

diff --git a/doc/ietf/draft-valin-celt-codec.xml b/doc/ietf/draft-valin-celt-codec.xml
index c4c4eac0..65a1872d 100644
--- a/doc/ietf/draft-valin-celt-codec.xml
+++ b/doc/ietf/draft-valin-celt-codec.xml
@@ -12,7 +12,6 @@
 Octasic Semiconductor
-jean-marc.valin@octasic.com
 4101, Molson Street, suite 300
 Montreal
@@ -20,12 +19,14 @@
 H1Y 3L1
 Canada
+jean-marc.valin@octasic.com
- + @@ -37,7 +38,7 @@ CELT -CELT is an open-source voice codec suitable for use in very low delay +CELT is an open-source voice codec suitable for use in very low delay Voice over IP (VoIP) type applications. This document describes the encoding and decoding process. @@ -72,18 +73,32 @@ CELT stands for "Constrained Energy Lapped Transform". It applies some of the CE +CELT is designed for transmission over RTP +
 Insert encoder overview
-Pre-emphasis
+The input audio first goes through a pre-emphasis filter, which attenuates the
+"spectral tilt". The filter has the transfer function A(z)=1-alpha_p*z^-1, with
+alpha_p=0.8. The inverse of the pre-emphasis is applied at the decoder.
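As an illustration of the pre-emphasis step described above, here is a minimal Python sketch of the filter A(z)=1-alpha_p*z^-1 with alpha_p=0.8, together with its inverse as it would be applied at the decoder. The function names, the `mem` state argument, and the use of floating point are assumptions made for this example; they are not taken from the CELT source code.

```python
ALPHA_P = 0.8  # pre-emphasis coefficient alpha_p given in the text


def pre_emphasis(x, alpha=ALPHA_P, mem=0.0):
    """Apply A(z) = 1 - alpha*z^-1, i.e. y[n] = x[n] - alpha*x[n-1].

    `mem` is the previous input sample (hypothetical state argument,
    0.0 at the start of the stream).
    """
    y = []
    for sample in x:
        y.append(sample - alpha * mem)
        mem = sample
    return y


def de_emphasis(y, alpha=ALPHA_P, mem=0.0):
    """Apply the inverse filter 1/A(z), i.e. x[n] = y[n] + alpha*x[n-1].

    `mem` is the previous output sample (hypothetical state argument).
    """
    x = []
    for sample in y:
        mem = sample + alpha * mem
        x.append(mem)
    return x


if __name__ == "__main__":
    signal = [0.5, -0.25, 0.75, 0.0, 1.0]
    restored = de_emphasis(pre_emphasis(signal))
    print(restored)
```

In the absence of quantisation, running the de-emphasis on the pre-emphasised signal recovers the input up to rounding, which is what "the inverse of the pre-emphasis" means here.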
+
+CELT is a transform codec, based on the Modified Discrete Cosine Transform
+, which is based on a DCT-IV, with overlap and time-domain
+aliasing cancellation. The MDCT implementation has no special characteristics. The
+input is a windowed signal (after pre-emphasis) of 2*N samples and the output is N
+frequency-domain samples. A "low-overlap" window is used to reduce the algorithmic delay.
+It is composed of a smaller window with symmetric zero padding on both sides. The window
+is the same as the one used in the Vorbis codec and defined as: W(n)=[sin(pi/2*sin(pi/2*(n+.5)/L))]^2
+
+
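The window and transform described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, L is taken to be the overlap length, and the layout of the 2*N-sample window (zeros, rising slope, flat section of ones, falling slope, zeros) is one reading of "a smaller window with symmetric zero padding on both sides". The default evaluates W(n) exactly as the expression is written in the text. The slope in the Vorbis I specification places the square on the inner sine, and that form is the one that is power-complementary, so the `square_inside` flag is provided to compare the two; the placement of the square should be checked against the reference implementation. The MDCT is the textbook O(N^2) definition, not an optimised one.

```python
import math


def overlap_window(L, square_inside=False):
    """Rising slope W(n) for n = 0..L-1.

    square_inside=False: [sin(pi/2 * sin(pi/2*(n+.5)/L))]^2, as written in the text.
    square_inside=True:  sin(pi/2 * sin^2(pi/2*(n+.5)/L)), the Vorbis-spec slope.
    """
    w = []
    for n in range(L):
        s = math.sin(0.5 * math.pi * (n + 0.5) / L)
        if square_inside:
            w.append(math.sin(0.5 * math.pi * s * s))
        else:
            w.append(math.sin(0.5 * math.pi * s) ** 2)
    return w


def low_overlap_window(N, L, square_inside=False):
    """2*N-sample low-overlap window (layout is an assumption of this sketch)."""
    if not (0 < L <= N) or (N - L) % 2 != 0:
        raise ValueError("need 0 < L <= N and N - L even")
    pad = (N - L) // 2
    rise = overlap_window(L, square_inside)
    return [0.0] * pad + rise + [1.0] * (N - L) + rise[::-1] + [0.0] * pad


def mdct(x):
    """Direct MDCT: 2*N windowed input samples in, N coefficients out."""
    N = len(x) // 2
    return [
        sum(
            x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N)
        )
        for k in range(N)
    ]


if __name__ == "__main__":
    N, L = 8, 4
    window = low_overlap_window(N, L)
    frame = [1.0] * (2 * N)  # placeholder input frame
    coeffs = mdct([w * s for w, s in zip(window, frame)])
    print(len(window), len(coeffs))
```

With N=8 and L=4 the window has 16 samples, of which only the two slopes of 4 samples each overlap with neighbouring frames, and the transform returns 8 coefficients.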
@@ -101,8 +116,8 @@ that the result is always exactly the same. Any mismatch would cause an error in
-CELT uses a Pyramid Vector Quantization (PVQ) [] codebook for quantising the details -of the spectrum in each band that haven't been predicted by the pitch predictor. +CELT uses a Pyramid Vector Quantization (PVQ) codebook for quantising the details +of the spectrum in each band that haven't been predicted by the pitch predictor.
@@ -125,8 +140,8 @@ Some more text
-CELT uses a Pyramid Vector Quantization (PVQ) [] codebook for quantising the details -of the spectrum in each band that haven't been predicted by the pitch predictor. +CELT uses a Pyramid Vector Quantization (PVQ) [] codebook for quantising the details +of the spectrum in each band that haven't been predicted by the pitch predictor.
@@ -139,8 +154,6 @@ of the spectrum in each band that haven't been predicted by the pitch predictor.
-De-emphasis -
@@ -197,7 +210,7 @@ CELT and AVT communities for their input: Key words for use in RFCs to Indicate Requirement Levels - + @@ -205,69 +218,14 @@ CELT and AVT communities for their input: RTP: A Transport Protocol for real-time applications - - - - + + + + - - -Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies - - - - - - - - -SDP: Session Description Protocol - - - - - - - - - -Packet-based Multimedia Communications Systems - - - - - - - - -Control of communications between Visual Telephone Systems and Terminal Equipment - - - - - - - - -RTP Profile for Audio and Video Conferences with Minimal Control. - - - - - - - - - -The application/ogg Media Type - - - - - @@ -276,11 +234,29 @@ CELT and AVT communities for their input: The CELT ultra-low delay audio codec + - + + +Modified Discrete Cosine Transform + + + + + + + +A Pyramid Vector Quantizer + + + + + + +