eden-emu/opus: Mirror of opus - Eden Git: A Community Emulator


Timothy B. Terriberry 21af73eb21 Make collapse-detection bitexact.	2011-01-19 19:43:08 -05:00

Jean-Marc's original anti-collapse patch used a threshold on the content of a decoded band to determine whether or not it should be filled with random noise. Since this is highly sensitive to the accuracy of the implementation, it could lead to significant decoder output differences even if decoding error up to that point was relatively small.

This patch detects collapsed bands from the output of the vector quantizer, using exact integer arithmetic. It makes two simplifying assumptions:
 a) If either input to haar1() is non-zero during TF resolution adjustments, then the output will be non-zero.
 b) If the content of a block is non-zero in any of the bands that are used for folding, then the folded output will be non-zero.
b) in particular is likely to be false when SPREAD_NONE is used. It also ignores the case where mid and side are orthogonal in stereo_merge, but this is relatively unlikely. This misses just over 3% of the cases that Jean-Marc's anti-collapse detection strategy would catch, but does not mis-classify any (all detected collapses are true collapses).

This patch overloads the "fill" parameter to mark which blocks have non-zero content for folding. As a consequence, if a set of blocks on one side of a split has collapsed, _no_ folding is done: the result would be zero anyway, except for short blocks with SPREAD_AGGRESSIVE that are split down to a single block, but a) that means a lot of bits were available so a collapse is unlikely and b) anti-collapse can fill the block anyway, if it's used. This also means that if itheta==0 or itheta==16384, we no longer fold at all on that side (even with long blocks), since we'd be multiplying the result by zero anyway.

doc/ietf	Updated draft for 0.8.1	2010-07-08 15:28:08 -04:00
libcelt	Make collapse-detection bitexact.	2011-01-19 19:43:08 -05:00
tests	Minor fixes to testcases	2011-01-11 09:42:28 -05:00
tools	Fixes constrained VBR	2011-01-12 11:27:03 -05:00
.gitignore	gitignore update	2010-07-03 09:28:15 -04:00
AUTHORS	Initial commit with the autotools stuff and files taken from Speex and Vorbis.	2007-11-29 17:01:16 +11:00
autogen.sh	Added pitch analysis. Doesn't crash, but otherwise untested.	2007-11-30 12:15:49 +11:00
celt.kdevelop	Fixed parallel build	2007-12-11 18:01:22 +11:00
celt.pc.in	Misc changes for 0.7.1.	2010-01-16 19:12:06 -05:00
ChangeLog	Initial commit with the autotools stuff and files taken from Speex and Vorbis.	2007-11-29 17:01:16 +11:00
configure.ac	Use more standard test for lrintf/lrint	2011-01-11 09:32:09 -05:00
COPYING	Miscellaneous comment, copyright notice, readme updates.	2009-02-16 19:52:02 -05:00
Doxyfile	Bump to 0.10	2010-12-20 11:40:50 -05:00
Doxyfile.devel	Bump to 0.10	2010-12-20 11:40:50 -05:00
INSTALL	Nothing to see here.	2007-12-02 20:55:22 +11:00
libcelt.spec.in	Spec file	2009-01-13 13:40:26 -05:00
Makefile.am	Development documentation (internals)	2008-02-20 18:02:42 +11:00
NEWS	Initial commit with the autotools stuff and files taken from Speex and Vorbis.	2007-11-29 17:01:16 +11:00
README	Miscellaneous comment, copyright notice, readme updates.	2009-02-16 19:52:02 -05:00
README.Win32	Misc changes for 0.7.1.	2010-01-16 19:12:06 -05:00
TODO	re-enable support for resizable buffers in the range coder	2008-10-18 09:11:05 -04:00

README

CELT is a very low delay audio codec designed for high-quality communications.

Traditional full-bandwidth codecs such as Vorbis and AAC can offer high
quality, but they require codec delays of hundreds of milliseconds, which
makes them unsuitable for real-time interactive applications like
teleconferencing. Speech-targeted codecs, such as Speex or G.722, have lower
20-40ms delays, but their speech focus and limited sampling rates
restrict their quality, especially for music.

Additionally, the other mandatory components of a full network audio system
(audio interfaces, routers, jitter buffers) each add their own delay. On lower
speed networks, serializing a packet onto the network cable takes considerable
time, and over long distances the speed of light imposes a significant delay.

In teleconferencing, it is important to keep delay low so that the participants
can communicate fluidly without talking on top of each other and so that their
own voices don't return after a round trip as an annoying echo.

For network music performance, research has shown that the total one-way delay
must be kept under 25ms to avoid degrading the musicians' performance.

Since many of the sources of delay in a complete system are outside of the
user's control (such as the speed of light), it is often only possible to
reduce the total delay by reducing the codec delay.

Low delay has traditionally been considered a challenging area in audio codec
design, because as a codec is forced to work on the smaller chunks of audio
required for low delay it has access to less redundancy and less perceptual
information which it can use to reduce the size of the transmitted audio.

CELT is designed to bridge the gap between "music" and "speech" codecs,
permitting new very high quality teleconferencing applications, and to go
further, permitting latencies much lower than speech codecs normally provide
to enable applications such as remote musical collaboration even over long
distances.  

In keeping with the Xiph.Org mission, CELT is also designed to accomplish
this without copyright or patent encumbrance. Only by keeping the formats
that drive our Internet communication free and unencumbered can we maximize
innovation, collaboration, and interoperability.  Fortunately, CELT is ahead
of the adoption curve in its target application space, so there should be 
no reason for someone who needs what CELT provides to go with a proprietary
codec.

CELT has been tested on x86, x86_64, ARM, and the TI C55x DSPs, and should
be portable to any platform with a working C compiler and on the order of
100 MIPS of processing power. 

The code is still at an early stage, so it may be broken from time to time, and
the bit-stream is not frozen yet, so it may differ from one version to
another. Oh, and don't complain if it sets your house on fire.

Complaints and accolades can be directed to the CELT mailing list:
http://lists.xiph.org/mailman/listinfo/celt-dev/

To compile:
% ./configure
% make

For platforms without fast floating point support (such as ARM) use the
--enable-fixed argument to configure to build a fixed-point version of CELT.

There are Ogg-based encode/decode tools in tools/. These are quite similar to
the speexenc/speexdec tools. Use the --help option for details.

There is also a basic tool for testing the encoder and decoder called
"testcelt" located in libcelt/: 

% testcelt <rate> <channels> <frame size> <bytes per packet> input.sw output.sw

where input.sw is a 16-bit (machine endian) audio file sampled at 32000 Hz to 
96000 Hz. The output file is already decompressed.  
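
If no suitable input file is at hand, one can be generated. The sketch below
is not part of CELT: write_square_sw() is a hypothetical helper, and it
assumes that "16-bit (machine endian)" means headerless, interleaved, signed
16-bit PCM. It writes a square wave using only integer arithmetic.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of libcelt.
   Writes `frames` sample frames of a square wave as raw, headerless,
   interleaved 16-bit PCM in the machine's native byte order.
   Returns 0 on success, -1 on error. */
int write_square_sw(const char *path, int rate, int channels,
                    long frames, int freq_hz)
{
    FILE *f = fopen(path, "wb");
    long i;
    int c;

    if (!f)
        return -1;
    for (i = 0; i < frames; i++) {
        /* Position within the current period, scaled to [0, rate). */
        long phase = (i * (long)freq_hz) % rate;
        /* First half of the period is high, second half is low. */
        int16_t s = (phase * 2 < rate) ? 8000 : -8000;

        /* Same sample on every channel, interleaved. */
        for (c = 0; c < channels; c++) {
            if (fwrite(&s, sizeof s, 1, f) != 1) {
                fclose(f);
                return -1;
            }
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Calling write_square_sw("input.sw", 44100, 1, 441000, 440) would produce
ten seconds of 44.1 kHz mono audio that can be passed to testcelt as input.sw.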

For example, for a 44.1 kHz mono stream at ~64kbit/sec and with 256 sample
frames:

% testcelt 44100 1 256 46 input.sw output.sw

Since 44100/256*46*8 = 63393.75 bits/sec.
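
The same arithmetic can be used to pick <bytes per packet> for other rates and
frame sizes. The following is a minimal sketch; both function names are
hypothetical and are not part of the CELT API. It assumes every packet has the
same size and carries exactly one frame.

```c
/* Hypothetical helpers, not part of libcelt. */

/* Bits per second of a stream of fixed-size packets:
   (frames per second) * (bytes per packet) * 8. */
double packet_bitrate_bps(int rate, int frame_size, int bytes_per_packet)
{
    return (double)rate / frame_size * bytes_per_packet * 8.0;
}

/* Largest whole number of bytes per packet that does not exceed
   target_bps at the given sampling rate and frame size. */
int bytes_for_bitrate(int rate, int frame_size, double target_bps)
{
    return (int)(target_bps * frame_size / (8.0 * rate));
}
```

With the values from the example, bytes_for_bitrate(44100, 256, 64000.0)
gives 46, and packet_bitrate_bps(44100, 256, 46) gives roughly 63.4 kbit/sec.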

All even frame sizes from 64 to 512 are currently supported, although
power-of-two sizes are recommended and most CELT development is done
using a size of 256. The delay imposed by CELT is 1.25x to 1.5x the
frame duration, depending on the frame size and some details of CELT's
internal operation. For 256 sample frames the delay is 1.5x, or 384
samples, so the total codec delay in the above example is about 8.71ms
(1000/(44100/384)).
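
The delay figure can be reproduced for other settings with a short sketch like
the one below. The function names are hypothetical (not CELT API), and the
delay factor must be supplied by the caller: the text above only states that
it lies between 1.25 and 1.5, and that it is 1.5 for 256 sample frames.

```c
/* Hypothetical helpers, not part of libcelt. */

/* Codec delay in samples for a given frame size and delay factor
   (between 1.25 and 1.5 according to the README). */
double codec_delay_samples(int frame_size, double factor)
{
    return frame_size * factor;
}

/* Convert a duration in samples to milliseconds at the given rate. */
double samples_to_ms(double samples, int rate)
{
    return 1000.0 * samples / rate;
}
```

For the example above, codec_delay_samples(256, 1.5) is 384 samples, and
samples_to_ms(384.0, 44100) is roughly 8.7 ms.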