505 commits

Author SHA1 Message Date
Jean-Marc Valin
d539c6b9c5 Disabling the postfilter when complexity<5 or when CELT_SET_PREDICTION<=1 2011-02-03 13:36:03 -05:00
Jean-Marc Valin
51c786241b More Opus build work 2011-02-03 00:43:37 -05:00
Jean-Marc Valin
3a8f04db17 Enabling the post-filter and exporting the ec functions for Opus 2011-02-02 23:02:25 -05:00
Timothy B. Terriberry
ce6d0904a1 Increase caps/allocation accuracy.
This stores the caps array in 32nd bits/sample instead of 1/2 bits
 scaled by LM and the channel count, which is slightly less
 accurate for the last two bands, and much more accurate for
 all the other bands.
A constant offset is subtracted to allow it to represent values
 larger than 255 in 8 bits (the range of unoffset values is
 77...304).
In addition, this replaces the last modeline in the allocation table
 with the caps array, allowing the initial interpolation to
 allocate 8 bits/sample or more, which was otherwise impossible.
2011-02-01 21:17:57 -05:00
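
The offset trick described in the commit above can be sketched as follows. This is only an illustration: the offset value 77 and the helper names are assumptions (the message gives only the range 77...304), not the values or functions used in the code.

```python
# Hypothetical illustration: store caps in the range 77..304 in one byte
# each by subtracting a constant offset. The offset 77 is an assumption.
CAP_MIN, CAP_MAX = 77, 304   # range of unoffset values, from the commit message
OFFSET = CAP_MIN             # assumed; any offset in 49..77 would also fit in 8 bits


def pack_cap(cap):
    """Return the 8-bit stored form of a cap given in 1/32 bit per sample."""
    if not CAP_MIN <= cap <= CAP_MAX:
        raise ValueError("cap outside the representable range")
    return cap - OFFSET


def unpack_cap(stored):
    """Recover the original cap from its stored 8-bit form."""
    return stored + OFFSET


def cap_bits_per_sample(stored):
    """Convert a stored cap to bits per sample (the unit is 1/32 bit)."""
    return unpack_cap(stored) / 32.0
```

With these assumptions the largest cap, 304, is stored as 227 and corresponds to 9.5 bits/sample.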
Jean-Marc Valin
7e983194a3 Fixing the global stack -- and an overflow in collapse_mask 2011-02-01 18:00:29 -05:00
Jean-Marc Valin
7bb26e13ca Adds a generic CELT_SET_BITRATE() ctl() API for CBR and VBR 2011-02-01 17:04:27 -05:00
Jean-Marc Valin
a350bf5262 Stop collapsing the background noise channels when switching to mono 2011-01-31 17:30:15 -05:00
Timothy B. Terriberry
682b6cf1ad Don't destroy stereo history when switching to mono.
The first version of the mono decoder with stereo output collapsed
 the historic energy values stored for anti-collapse down to one
 channel (by taking the max).
This means that a subsequent switch back would continue on using
 the maximum of the two values instead of the original history,
 which would make anti-collapse produce louder noise (and
 potentially more pre-echo than otherwise).

This patch moves the max into the anti_collapse function itself,
 and does not store the values back into the source array, so the
 full stereo history is maintained if subsequent frames switch
 back.
It also fixes an encoder mismatch, which never took the max
 (assuming, apparently, that the output channel count would never
 change).
2011-01-31 17:26:02 -05:00
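
A minimal sketch of the idea in the commit above: take the max over channels at the point of use instead of writing it back into the stored history. The function name and the data layout (history[channel][band]) are assumptions for illustration, not the actual code.

```python
def history_for_band(history, band, channel, mono_stream):
    """Return the historic energy to use for one band and output channel.

    history[c][b] is the stored energy for channel c and band b (assumed
    layout). For a mono stream decoded to stereo output, the max over both
    channels is used, but it is computed on the fly and never stored, so
    the per-channel history survives a later switch back to stereo.
    """
    if mono_stream and len(history) == 2:
        return max(history[0][band], history[1][band])
    return history[channel][band]
```

Because nothing is written back, calling this for a mono frame leaves both channels' stored values exactly as they were.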
Timothy B. Terriberry
948d27c9bc Propagate balance from compute_allocation() to quant_all_bands().
Instead of just dumping excess bits into the first band after
 allocation, use them to initialize the rebalancing loop in
 quant_all_bands().
This allows these bits to be redistributed over several bands, like
 normal.
2011-01-31 15:37:01 -05:00
Jean-Marc Valin
713d7a4ce9 Fix sample type conversion when resampling 2011-01-31 13:41:01 -05:00
Jean-Marc Valin
00a98f5deb Making the stereo encoder capable of encoding in mono 2011-01-31 11:19:03 -05:00
Jean-Marc Valin
f1916a14fd Making it possible for the stereo decoder to decode a mono stream 2011-01-31 10:51:30 -05:00
Jean-Marc Valin
8cf29f0991 Custom and non-custom versions of the get_size() functions 2011-01-30 23:38:28 -05:00
Jean-Marc Valin
665da0ba4d Merge branch 'exp_api_change' 2011-01-30 13:39:56 -05:00
Timothy B. Terriberry
c564307463 Use a smarter per-band bitrate cap.
The previous "dumb cap" of (64<<LM)*(C<<BITRES) was not actually
 achievable by many (most) bands, and did not take the cost of
 coding theta for splits into account, and so was too small for some
 bands.
This patch adds code to compute a fairly accurate estimate of the
 real maximum per-band rate (an estimate only because of rounding
 effects and the fact that the bit usage for theta is variable),
 which is then truncated and stored in an 8-bit table in the mode.

This gives improved quality at all rates over 160 kbps/channel,
 prevents bits from being wasted all the way up to 255 kbps/channel
 (the maximum rate allowed, and approximately the maximum number of
 bits that can usefully be used regardless of the allocation), and
 prevents dynalloc and trim from producing enormous waste
 (eliminating the need for encoder logic to prevent this).
2011-01-30 11:42:38 -05:00
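
For reference, the old "dumb cap" formula quoted in the commit above can be evaluated directly. The value BITRES = 3 (units of 1/8 bit) is an assumption made for this sketch; the formula itself is taken from the message.

```python
BITRES = 3  # assumed fractional-bit resolution, so results are in 1/8 bit


def dumb_cap(LM, C):
    """Old per-band cap (64<<LM)*(C<<BITRES), in units of 1/(1<<BITRES) bit.

    LM is the log2 frame-size multiplier and C the channel count. The band
    index does not appear, so every band gets the same cap.
    """
    return (64 << LM) * (C << BITRES)


def dumb_cap_bits(LM, C):
    """The same cap expressed in whole bits."""
    return dumb_cap(LM, C) >> BITRES
```

The formula depends only on LM and C, which is why it could be unachievable for some bands and too small for others.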
Jean-Marc Valin
d6c3d3ceae Error handling in _create() functions 2011-01-30 11:00:24 -05:00
Jean-Marc Valin
913a1742b9 Adding resampling support
We use the MDCT as low-pass filter.
2011-01-29 11:35:19 -05:00
Jean-Marc Valin
c97b258c62 celt_encoder_create() now defaults to Opus standard mode
The old constructor is renamed celt_encoder_create_custom(). Same
for the decoder.
2011-01-28 23:07:32 -05:00
Gregory Maxwell
420c325875 Prevent VBR from shooting up to the maximum rate if set to very low target rates, and prevent the encoder VBR from producing 1 byte frames (which are no longer allowed). 2011-01-27 22:58:12 -05:00
Jean-Marc Valin
47e905dce7 Making anti-collapse a bit more conservative again
The energy memory can be lowered (not increased) during a transient
2011-01-27 18:05:47 -05:00
Jean-Marc Valin
b417d8392e Changing some double constants to float 2011-01-27 17:19:49 -05:00
Jean-Marc Valin
61f40418fa Adjusting post-filter coefficients to be exact in 13 bit precision.
That way they can be exact in 16 bits once multiplied by the gain
2011-01-27 17:14:33 -05:00
Jean-Marc Valin
65d35a35cf Only allowing silence in non-hybrid mode.
Also defining a 1-byte packet as triggering the PLC/CNG
2011-01-26 22:04:59 -05:00
Timothy B. Terriberry
a396e153b9 More anti-collapse fixes, as well as a fold fix.
This changes folding so that the LCG is never used on transients
 (either short blocks or long blocks with increased time
 resolution), except in the case that there's not enough decoded
 spectrum to fold yet.

It also now only subtracts the anti-collapse bit from the total
 allocation in quant_all_bands() when space has actually been
 reserved for it.

Finally, it cleans up some of the fill and collapse_mask tracking
 (this tracking was originally made intentionally sloppy to save
 work, but then converted to replace the existing fill flag at the
 last minute, which can have a number of logical implications).
The changes, in particular:
1) Splits of less than a block now correctly mark the second half
    as filled only if the whole block was filled (previously it
    would also mark it filled if the next block was filled).
2) Splits of less than a block now correctly mark a block as
    un-collapsed if either half was un-collapsed, instead of marking
    the next block as un-collapsed when the high half was.
3) The N=2 stereo special case now keeps its fill mask even when
    itheta==16384; previously this would have gotten cleared,
    despite the fact that we fold into the side in this case.
4) The test against fill for folding now only considers the bits
    corresponding to the current set of blocks.
   Previously it would still fold if any later block was filled.
5) The collapse mask used for the LCG fold data is now correctly
    initialized when B=16 on platforms with a 16-bit int.
6) The high bits on a collapse mask are now cleared after the TF
    resolution changes and interleaving at level 0, instead of
    waiting until the very end.
   This prevents extraneous high flags set on mid from being mixed
    into the side flags for mid-side stereo.
2011-01-26 20:54:13 -05:00
Jean-Marc Valin
4b000c37e7 Setting bandE[] to zero after log2Amp when silence=1 2011-01-26 20:30:21 -05:00
Gregory Maxwell
8b631f2c5f Fixes for silence handling in VBR mode, plus an encoder/decoder desync triggered by silent frames. 2011-01-26 20:26:31 -05:00
Jean-Marc Valin
e3e2c26dfc Removing more unused function params 2011-01-26 13:09:53 -05:00
Jean-Marc Valin
13a7c26654 Removes explicit filling of remaining bits with zeros
The initialiser already takes care of this
2011-01-26 10:58:33 -05:00
Jean-Marc Valin
c39bb8ab8c Removes unused function parameters 2011-01-26 10:50:55 -05:00
Jean-Marc Valin
de79c378bd Adding a special way to code digital silence in two or more bytes 2011-01-26 09:24:33 -05:00
Jean-Marc Valin
9ce95e0bd0 anti-collapse tuning
Using the min energy of the two last non-transient frames rather
than the min of just the two last frames. Also slightly increasing
the "thresh" upper bound coefficient to 0.5.
2011-01-25 19:12:06 -05:00
Jean-Marc Valin
d121260f38 Minimum period is now 15 2011-01-25 13:11:36 -05:00
Jean-Marc Valin
495114b755 Moving energy floor to coarse quantization
By moving the energy floor to the encoder, we can use a different
floor for prediction than for the decay level. Also, the fixed-point
dynamic range has been increased to avoid overflows when a fixed-point
decoder is used on a stream encoded in floating-point.
2011-01-24 15:53:17 -05:00
Jean-Marc Valin
3a56c9e1c6 prefilter/postfilter now forced off in Opus hybrid mode 2011-01-23 11:34:55 -05:00
Jean-Marc Valin
eafd8a7f17 Simple DTX/CNG implementation 2011-01-23 00:24:45 -05:00
Gregory Maxwell
8f02c482ba Correct an encoder/decoder mismatch at low volume levels. Relax some low level clamps so that the dynamic range can extend further below the 16bit floor. 2011-01-22 19:50:36 -05:00
Jean-Marc Valin
5c2ac2b75d Tracking the background noise level
Also a fix for the zero-ing of unused band energies.
2011-01-22 14:48:20 -05:00
Jean-Marc Valin
63fb61f176 Using previous range coder state for PRNG
This provides more entropy and allows some more flexibility on the
encoder side.
2011-01-20 23:29:05 -05:00
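
The idea in the commit above can be sketched as seeding a pseudo-random generator from a value that the encoder and decoder already share. Everything here is an assumption for illustration: the function names, the use of a 32-bit state word as the seed, and the LCG constants (the well-known Numerical Recipes pair), none of which are confirmed by the message.

```python
MASK32 = 0xFFFFFFFF


def lcg_next(seed):
    """One step of a 32-bit linear congruential generator (assumed constants)."""
    return (1664525 * seed + 1013904223) & MASK32


def noise_from_state(coder_state, n):
    """Generate n pseudo-random signs seeded from a shared coder state.

    coder_state stands in for whatever range coder state both sides hold
    after the previous frame; identical state gives identical noise.
    """
    seed = coder_state & MASK32
    out = []
    for _ in range(n):
        seed = lcg_next(seed)
        out.append(1 if seed & 0x8000 else -1)
    return out
```

Since both sides derive the seed from the same state, no extra bits need to be transmitted to keep the noise in sync.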
Timothy B. Terriberry
a363e3952c Remove useless ec_dec_tell() call. 2011-01-19 20:24:37 -05:00
Timothy B. Terriberry
21af73eb21 Make collapse-detection bitexact.
Jean-Marc's original anti-collapse patch used a threshold on the
 content of a decoded band to determine whether or not it should
 be filled with random noise.
Since this is highly sensitive to the accuracy of the
 implementation, it could lead to significant decoder output
 differences even if decoding error up to that point was relatively
 small.

This patch detects collapsed bands from the output of the vector
 quantizer, using exact integer arithmetic.
It makes two simplifying assumptions:
 a) If either input to haar1() is non-zero during TF resolution
     adjustments, then the output will be non-zero.
 b) If the content of a block is non-zero in any of the bands that
     are used for folding, then the folded output will be non-zero.
b) in particular is likely to be false when SPREAD_NONE is used.
It also ignores the case where mid and side are orthogonal in
 stereo_merge, but this is relatively unlikely.
This misses just over 3% of the cases that Jean-Marc's anti-collapse
 detection strategy would catch, but does not mis-classify any (all
 detected collapses are true collapses).

This patch overloads the "fill" parameter to mark which blocks have
 non-zero content for folding.
As a consequence, if a set of blocks on one side of a split has
 collapsed, _no_ folding is done: the result would be zero anyway,
 except for short blocks with SPREAD_AGGRESSIVE that are split down
 to a single block, but a) that means a lot of bits were available
 so a collapse is unlikely and b) anti-collapse can fill the block
 anyway, if it's used.
This also means that if itheta==0 or itheta==16384, we no longer
 fold at all on that side (even with long blocks), since we'd be
 multiplying the result by zero anyway.
2011-01-19 19:43:08 -05:00
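
A minimal sketch of detecting collapsed blocks from the integer output of the vector quantizer, as the commit above describes. The function names and the layout (B contiguous blocks of equal length) are assumptions for illustration; the point is that only exact integer tests are involved.

```python
def collapse_mask(pulses, B):
    """Return a bitmask with bit i set if block i has any non-zero pulse.

    pulses is the integer pulse vector for one band, assumed to hold B
    contiguous blocks of equal length.
    """
    N0 = len(pulses) // B
    mask = 0
    for i in range(B):
        block = pulses[i * N0:(i + 1) * N0]
        if any(p != 0 for p in block):
            mask |= 1 << i
    return mask


def collapsed_blocks(mask, B):
    """List the block indices whose bit is clear, i.e. the collapsed ones."""
    return [i for i in range(B) if not (mask >> i) & 1]
```

For example, the pulse vector [0, 0, 1, 0, 0, 0, 0, -2] with B=4 gives mask 0b1010, so blocks 0 and 2 are reported as collapsed.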
Jean-Marc Valin
87efe1df00 Adds an anti-collapse mechanism for transients
This looks for bands in each short block that have no energy. For
each of these "collapsed" bands, noise is injected to have an
energy equal to the minimum of the two previous frames for that band.
The mechanism can be used whenever there are 4 or more MDCTs (otherwise
no complete collapse is possible) and is signalled with one bit just
before the final fine energy bits.
2011-01-18 14:44:04 -05:00
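
The mechanism in the commit above can be sketched as follows, assuming linear energies (sum of squares) and random-sign noise. The function name, the noise shape, and the energy domain are assumptions; the message only states that the injected energy equals the minimum over the two previous frames.

```python
import math
import random


def fill_collapsed(block, prev1_energy, prev2_energy, rng):
    """Fill an all-zero block with noise of energy min(prev1, prev2).

    block is the decoded content of one band in one short block. Blocks
    with any non-zero content are returned unchanged.
    """
    if not block or any(x != 0.0 for x in block):
        return list(block)
    target = min(prev1_energy, prev2_energy)
    # Random signs scaled so that the sum of squares equals the target.
    scale = math.sqrt(target / len(block))
    return [scale * rng.choice((-1.0, 1.0)) for _ in block]
```

A collapsed block of 4 samples with previous energies 2.0 and 1.0 is filled with noise whose total energy is 1.0.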
Jean-Marc Valin
2ce5c63d22 Moving the tapset signalling to the beginning of the stream 2011-01-17 20:50:18 -05:00
Jean-Marc Valin
8d367029a7 Adding tapset decision logic
Based on spreading_decision()'s logic. We choose tapsets
with less roll-off when we think the HF are tonal.
2011-01-17 16:37:51 -05:00
Jean-Marc Valin
dfa847a25d Support for multiple postfilter tapsets
Supporting three different tapsets with different roll-offs. The default
is now a 5-tap post-filter with a 13 kHz cutoff frequency.
2011-01-17 11:37:08 -05:00
Gregory Maxwell
d85018cb54 In CVBR mode the rate selection was failing to add bytes which were about to fall off the end of the bitres and never be reusable, causing undershoot. 2011-01-13 16:31:50 -05:00
Jean-Marc Valin
5677e34fde Setting oldBandE to zero outside of [start,end[
In case start or end changes, we want the encoder and decoder
to be in sync and not do anything stupid.
2011-01-13 16:15:53 -05:00
Jean-Marc Valin
2b13401fe6 Allowing the tf recombining to go all the way to LM=3 2011-01-12 16:13:46 -05:00
Jean-Marc Valin
6b565268fb Fixes constrained VBR
Also removes the 8 byte/packet lower bound
2011-01-12 11:27:03 -05:00
Timothy B. Terriberry
76469c64b4 Prevent busts at low bitrates.
This patch makes all symbols conditional on whether or not there's
 enough space left in the buffer to code them, and eliminates much
 of the redundancy in the side information.

A summary of the major changes:
* The isTransient flag is moved up to before the coarse energy.
  If there are not enough bits to code the coarse energy, the flag
   would get forced to 0, meaning what energy values were coded
   would get interpreted incorrectly.
  This might not be the end of the world, and I'd be willing to
   move it back given a compelling argument.
* Coarse energy switches coding schemes when there are less than 15
   bits left in the packet:
  - With at least 2 bits remaining, the change in energy is forced
     to the range [-1...1] and coded with 1 bit (for 0) or 2 bits
     (for +/-1).
  - With only 1 bit remaining, the change in energy is forced to
     the range [-1...0] and coded with one bit.
  - If there is less than 1 bit remaining, the change in energy is
     forced to -1.
    This effectively low-passes bands whose energy is consistently
     starved; this might be undesirable, but letting the default be
     zero is unstable, which is worse.
* The tf_select flag gets moved back after the per-band tf_res
   flags again, and is now skipped entirely when none of the
   tf_res flags are set, and the default value is the same for
   either alternative.
* dynalloc boosting is now limited so that it stops once it's given
   a band all the remaining bits in the frame, or when it hits the
   "stupid cap" of (64<<LM)*(C<<BITRES) used during allocation.
* If dynalloc boosting has allocated all the remaining bits in the
   frame, the alloc trim parameter does not get encoded (it would
   have no effect).
* The intensity stereo offset is now limited to the range
   [start...codedBands], and thus doesn't get coded until after
   all of the skip decisions.
  Some space is reserved for it up front, and gradually given back
   as each band is skipped.
* The dual stereo flag is coded only if intensity>start, since
   otherwise it has no effect.
  It is now coded after the intensity flag.
* The space reserved for the final skip flag, the intensity stereo
   offset, and the dual stereo flag is now redistributed to all
   bands equally if it is unused.
  Before, the skip flag's bit was given to the band that stopped
   skipping without it (usually a dynalloc boosted band).

In order to enable simple interaction between VBR and these
 packet-size enforced limits, many of which are encountered before
 VBR is run, the maximum packet size VBR will allow is computed at
 the beginning of the encoding function, and the buffer reduced to
 that size immediately.
Later, when it is time to make the VBR decision, the minimum packet
 size is set high enough to ensure that no decision made thus far
 will have been affected by the packet size.
As long as this is smaller than the up-front maximum, all of the
 encoder's decisions will remain in-sync with the decoder.
If it is larger than the up-front maximum, the packet size is kept
 at that maximum, also ensuring sync.
The minimum used now is slightly larger than it used to be, because
 it also includes the bits added for dynalloc boosting.
Such boosting is shut off by the encoder at low rates, and so
 should not cause any serious issues at the rates where we would
 actually run out of room before compute_allocation().
2011-01-09 02:06:53 -05:00
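
Two of the rules in the commit above can be sketched in a few lines: the low-budget coarse energy fallback, and the way the VBR packet size is bounded. The function and parameter names are hypothetical, and only bit counts are modelled, not the actual symbols written to the stream.

```python
def constrain_delta(delta, bits_left):
    """Return (coded_delta, bits_used) when fewer than 15 bits remain."""
    if bits_left >= 15:
        raise ValueError("normal coarse energy coding applies")
    if bits_left >= 2:
        d = max(-1, min(1, delta))      # forced into [-1...1]
        return d, (1 if d == 0 else 2)  # 1 bit for 0, 2 bits for +/-1
    if bits_left >= 1:
        d = max(-1, min(0, delta))      # forced into [-1...0]
        return d, 1
    return -1, 0                        # nothing left: forced to -1


def final_packet_size(upfront_max, vbr_target, min_needed):
    """Pick the VBR packet size within the limits described above.

    The size is raised to min_needed so earlier decisions stay valid, but
    never exceeds the maximum that was fixed before encoding started.
    """
    return min(upfront_max, max(vbr_target, min_needed))
```

For instance, a desired change of +5 with 10 bits left is coded as +1 using 2 bits, and a minimum of 120 bytes against an up-front maximum of 100 leaves the packet at 100.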
Timothy B. Terriberry
051e044d14 Fix Jean-Marc's sqrt(0.5) constants.
There were two different ones in use, one with less precision than
 a float, and the other missing a digit in the middle.
2011-01-09 01:40:05 -05:00
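
A quick way to check whether a written-out constant is good enough for a float is to round both it and the true value to single precision and compare. The helper names and the sample literals below are hypothetical; they are not the constants that were in the code.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def matches_sqrt_half(literal):
    """True if the literal rounds to the same float32 as sqrt(0.5)."""
    return to_float32(literal) == to_float32(math.sqrt(0.5))
```

Under this check a literal such as 0.70710678 is sufficient for a float, while a shorter one such as 0.70711, or one with a digit dropped from the middle, is not.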