This unifies the byte buffer, encoder, and decoder into a single
struct.
The common encoder and decoder functions (such as ec_tell()) can
operate on either one, simplifying code which uses both.
The precision argument to ec_tell() has been removed.
Tell now comes in two precisions:
ec_tell() gives 1 bit precision in two operations, and
ec_tell_frac() gives 1/8th bit precision in... somewhat more.
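For reference, a minimal sketch of what the whole-bit query can look
like on the unified context (the field names nbits_total and rng and
the EC_ILOG() macro are assumptions about the struct layout):

    /* Sketch only: whole-bit precision from one subtraction and one
     * integer log of the current range value. */
    static inline int ec_tell(ec_ctx *ctx){
      return ctx->nbits_total - EC_ILOG(ctx->rng);
    }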
ec_{enc|dec}_bit_prob() were removed (they are no longer needed).
Some of the byte buffer access functions were made static and
removed from the cross-module API.
All of the code in rangeenc.c and rangedec.c was merged into
entenc.c and entdec.c, respectively, as we are no longer
considering alternative backends.
rangeenc.c and rangedec.c have been removed entirely.
This passes make check, after disabling the modes that we removed
support for in cf5d3a8c.
The recombine loop for cm was correct if one started at 1 block,
but was wrong otherwise (for a test case, convert 2 recombined
blocks back to 4 with an initial cm of 0x3; the result should be
0xF, but instead you get 0x7).
The recombine loop for fill was always wrong (for a test case,
combine 8 blocks down to 1 with an initial fill=0xFE; the low bit
remains unset).
This now properly interleaves and deinterleaves bits for these
steps, which avoids declaring collapses (and skipping folding)
where none, in fact, occurred.
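A minimal sketch of the splitting direction, assuming coarse block k
covers fine blocks 2k and 2k+1 (the function name is illustrative,
not the actual helper in bands.c):

    /* Spread each coarse collapse bit over both fine blocks it covers. */
    static unsigned cm_deinterleave(unsigned cm, int B)
    {
       unsigned out = 0;
       int k;
       for (k = 0; k < B; k++)
          if (cm & (1u << k))
             out |= 3u << (2*k);
       return out;
    }
    /* cm_deinterleave(0x3, 2) == 0xF, matching the first test case
     * above; the combining direction ORs pairs of bits back together,
     * so a fill of 0xFE combined down to one block still ends up with
     * its single bit set. */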
aa6fec66 added a check to reject modes with shorts longer than
3.33 ms (less than 300 per second).
However, it only rejected modes which could not be split at all.
This expands the check to also reject modes which, even after
splitting the maximum amount, still do not have shorts less than
3.33 ms.
This stores the caps array in 32nd bits/sample instead of 1/2 bits
scaled by LM and the channel count, which is slightly less
accurate for the last two bands, and much more accurate for
all the other bands.
A constant offset is subtracted to allow it to represent values
larger than 255 in 8 bits (the range of unoffset values is
77...304).
In addition, this replaces the last modeline in the allocation table
with the caps array, allowing the initial interpolation to
allocate 8 bits/sample or more, which was otherwise impossible.
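As a rough sketch of turning the stored bytes back into a per-band
limit in 1/8-bit units (the offset of 64, the field names, and the
table indexing are assumptions for illustration):

    static void init_caps(const CELTMode *m, int *cap, int LM, int C)
    {
       int i;
       for (i = 0; i < m->nbEBands; i++)
       {
          /* band width in MDCT bins at this frame size */
          int N = (m->eBands[i+1] - m->eBands[i]) << LM;
          /* stored value is in 32nd bits/sample with the offset removed;
           * add it back, scale by samples and channels, then convert
           * 1/32 bits to 1/8 bits (>>2) */
          cap[i] = (m->caps[m->nbEBands*(2*LM+C-1)+i] + 64) * C * N >> 2;
       }
    }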
We did no real error checking to see if a mode is supported when it
is created.
This patch implements checks for Jean-Marc's rules:
1) A mode must have frames at least 1ms in length (no more than
1000 per second).
2) A mode must have shorts of at most 3.33 ms (at least 300 per
second).
It also adds error checking to dump_modes so we report the error
instead of crashing when we fail to create a mode.
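A minimal sketch of the two checks (variable names and the failure
path are illustrative):

    /* Rule 1: frames at least 1 ms long, i.e. no more than 1000/s. */
    if ((celt_int32)frame_size*1000 < Fs)
       goto failure;
    /* Rule 2: shorts at most 3.33 ms long, i.e. at least 300/s. */
    if ((celt_int32)mode->shortMdctSize*300 > Fs)
       goto failure;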
The way folding is implemented requires two restrictions:
1. The last band must be the largest (so we can use its size to
allocate a temporary buffer to handle interleaving/TF changes).
2. No band can be larger than twice the size of the previous band
(so that once we have enough data to start folding, we will always
have enough data to fold).
Mode creation makes a heuristic attempt to satisfy these
conditions, but nothing actually guarantees it.
This adds some asserts to check them during mode creation.
They currently pass for all supported custom modes.
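A sketch of what the asserts can look like, assuming eBands[] holds
the nbEBands+1 band boundaries in MDCT bins:

    int i;
    /* 1. The last band must be the widest. */
    for (i = 0; i < mode->nbEBands - 1; i++)
       celt_assert(mode->eBands[i+1] - mode->eBands[i] <=
          mode->eBands[mode->nbEBands] - mode->eBands[mode->nbEBands-1]);
    /* 2. No band may be more than twice as wide as the one before it. */
    for (i = 1; i < mode->nbEBands; i++)
       celt_assert(mode->eBands[i+1] - mode->eBands[i] <=
          2*(mode->eBands[i] - mode->eBands[i-1]));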
Currently, compute_ebands()'s attempts to round bands to even sizes
and enforce size constraints on consecutive bands can leave some
bands entirely empty (e.g., Fs=8000, frame_size=64, i=11).
This adds a simple post-processing loop to remove such bands.
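A simple sketch of such a pass, assuming eBands[] has nbEBands+1
entries and nbEBands can be reduced in place:

    int i;
    /* Remove any band that rounded down to zero width. */
    for (i = 1; i <= nbEBands; i++)
    {
       if (eBands[i] == eBands[i-1])   /* band i-1 is empty */
       {
          int j;
          for (j = i; j < nbEBands; j++)
             eBands[j] = eBands[j+1];
          nbEBands--;
          i--;
       }
    }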
9b34bd83 caused serious regressions for 240-sample frame stereo,
because the previous qb limit was _always_ hit for two-phase
stereo.
Two-phase stereo really does operate with a different model (for
example, the single bit allocated to the side should probably be
thought of as a sign bit for qtheta, but we don't count it as part
of qtheta's allocation).
The old code was equivalent to a separate two-phase offset of 12;
however, Greg Maxwell's testing demonstrates that 16 performs
best.
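A sketch of where the offset plugs in when limiting qtheta's
resolution (the names and the surrounding expression are
illustrative; only the value 16 comes from the testing above):

    /* Illustrative only: two-phase (stereo, N==2) selects its own,
     * larger offset before the qb limit is computed. */
    offset = (pulse_cap>>1) -
             (stereo && N==2 ? QTHETA_OFFSET_TWOPHASE /* 16 */
                             : QTHETA_OFFSET);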
Previously, we would only split a band if it was allocated more than
32 bits.
However, the N=4 codebook can only produce about 22.5 bits, and two
N=2 bands combined can only produce 26 bits, including 8 bits for
qtheta, so if we wait until we allocate 32, we're guaranteed to fall
short.
Several of the larger bands come pretty far from filling 32 bits as
well, though their split versions will.
Greg Maxwell also suggested adding an offset to the threshold to
account for the inefficiency of using qtheta compared to another
VQ dimension.
This patch uses 1 bit as a placeholder, as it's a clear
improvement, but we may adjust this later after collecting data on
more possibilities over more files.
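A rough sketch of the adjusted split test (the bit-cache indexing is
an assumption here; 8 is the 1-bit placeholder in 1/8-bit units):

    /* Split as soon as the allocation exceeds what the unsplit PVQ
     * codebook can actually use, plus ~1 bit of qtheta headroom,
     * instead of waiting for a fixed 32-bit threshold. */
    if (N > 2 && b > cache[cache[0]] + 8)
    {
       /* ...split into two halves and code qtheta... */
    }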
The first version of the mono decoder with stereo output collapsed
the historic energy values stored for anti-collapse down to one
channel (by taking the max).
This means that a subsequent switch back would continue using
the maximum of the two values instead of the original history,
which would make anti-collapse produce louder noise (and
potentially more pre-echo than otherwise).
This patch moves the max into the anti_collapse function itself,
and does not store the values back into the source array, so the
full stereo history is maintained if subsequent frames switch
back.
It also fixes an encoder mismatch, which never took the max
(assuming, apparently, that the output channel count would never
change).
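A sketch of the max taken locally inside anti_collapse() (array
layout and names assumed from context):

    /* Use the louder channel's history when only one channel is being
     * decoded, but do not write the max back, so the full stereo
     * history survives a later switch back. */
    prev1 = prev1logE[c*nbEBands+i];
    prev2 = prev2logE[c*nbEBands+i];
    if (C == 1)
    {
       prev1 = MAX16(prev1, prev1logE[nbEBands+i]);
       prev2 = MAX16(prev2, prev2logE[nbEBands+i]);
    }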
Instead of just dumping excess bits into the first band after
allocation, use them to initialize the rebalancing loop in
quant_all_bands().
This allows these bits to be redistributed over several bands, like
normal.
The average caps over all values of LM and C are well below the
target allocations of the last two modelines.
Lower those targets to the caps, to prevent hitting the caps quite
so early.
This helps quality at medium-high rates, in the 180-192 kbps range.
Use measured cross-entropy to estimate the real cost of coding
qtheta given the allocated qb parameter, instead of the entropy of
the PDF.
This is generally much lower, and reduces waste at high rates.
This patch also removes some intermediate rounding from this
computation.
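For reference, the measured cost is essentially a cross-entropy: with
p(v) the observed distribution of qtheta values and q(v|qb) the
distribution implied by the coder at a given qb, the expected cost in
bits is roughly

    H(p, q) = -sum_v p(v) * log2 q(v|qb)

whereas the old estimate used the entropy of q itself, i.e. assumed
qtheta was distributed exactly as the coding PDF.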
The previous "dumb cap" of (64<<LM)*(C<<BITRES) was not actually
achievable by many (most) bands, and did not take the cost of
coding theta for splits into account, and so was too small for some
bands.
This patch adds code to compute a fairly accurate estimate of the
real maximum per-band rate (an estimate only because of rounding
effects and the fact that the bit usage for theta is variable),
which is then truncated and stored in an 8-bit table in the mode.
This gives improved quality at all rates over 160 kbps/channel,
prevents bits from being wasted all the way up to 255 kbps/channel
(the maximum rate allowed, and approximately the maximum number of
bits that can usefully be used regardless of the allocation), and
prevents dynalloc and trim from producing enormous waste
(eliminating the need for encoder logic to prevent this).
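A minimal sketch of how such a per-band cap is applied during
allocation (names illustrative):

    /* Clamp each band's allocation to its precomputed maximum usable
     * rate, so bits the band cannot use are not wasted on it. */
    for (j = start; j < end; j++)
       bits[j] = IMIN(bits[j], cap[j]);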