Adds a new bitexact_log2tan() function, which is much simpler and
more accurate.
The new approximation has an RMS error of 0.0038 bits from the
correctly rounded result over the range of inputs we use, compared
to an RMS error of 0.013 bits for the old log2_frac() method.
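For reference, a minimal sketch of this kind of approximation
(helper names, Q formats, and the quadratic coefficients below are
illustrative assumptions, not the exact contents of the patch):
normalize the integer sine and cosine to Q15, take the difference of
their exponents, and add a small Q15 polynomial correction; the
constant term cancels in the sin/cos difference.

  #include <stdint.h>

  /* Usual Q15 fractional multiply with rounding. */
  #define FRAC_MUL16(a,b) ((16384+((int32_t)(int16_t)(a)*(int16_t)(b)))>>15)

  /* Number of bits needed to represent x (position of the highest set bit). */
  static int ilog(uint32_t x)
  {
     int n = 0;
     while (x) { n++; x >>= 1; }
     return n;
  }

  /* Sketch: Q11 estimate of log2(isin/icos) for positive 16-bit isin, icos.
     The quadratic coefficients are an illustrative fit for 2048*log2(x)
     on [0.5, 1); their constant error cancels in the subtraction. */
  static int sketch_log2tan(int isin, int icos)
  {
     int ls = ilog(isin), lc = ilog(icos);
     isin <<= 15 - ls;   /* normalize to [16384, 32767] (Q15) */
     icos <<= 15 - lc;
     return (ls - lc)*(1 << 11)
           + FRAC_MUL16(isin, FRAC_MUL16(isin, -2597) + 7932)
           - FRAC_MUL16(icos, FRAC_MUL16(icos, -2597) + 7932);
  }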
The actual computation of delta is also changed to use FRAC_MUL16,
since this allows us to keep the full accuracy of the new method
while avoiding 16-bit overflow.
The old delta computation actually could overflow 16 bits: it needed
8 bits for the log2_frac() result, 1 for the sign of the difference,
and 8 more for N, for a total of 17.
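A sketch of what such a delta computation can look like, building on
sketch_log2tan() above (the Q formats, scaling, and names are
assumptions, not the expression used in the patch):

  /* Sketch: (N-1) promoted to Q7 times the Q11 log2-tangent estimate via a
     Q15 fractional multiply.  Both operands stay within 16 bits (assuming
     N <= 256), unlike the old N*(log2_frac difference) product, which
     needed 8 + 1 + 8 = 17 bits. */
  static int sketch_delta(int N, int iside, int imid)
  {
     return FRAC_MUL16((N - 1) << 7, sketch_log2tan(iside, imid));
  }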
B contains the number of blocks _after_ splitting.
We were using it to decide a) when to use a uniform PDF instead of a
triangular one for theta and b) whether to bias the bit allocation
towards the lower bins.
Using B0 (the number of blocks before the split) instead for a)
gives a PEAQ gain of 0.003 ODG (as high as 0.1 ODG on s02a samples
006, 083, and 097) for 240-sample frames at 96kbps mono.
Using B0 instead for b) gives a gain of only 0.00002 ODG.
This means we're "time-ordered" in all cases except when increasing
the time resolution on frames that already use short blocks.
There's no reordering when increasing the frequency resolution
on short blocks.
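As a sketch of the a) decision (hypothetical names; the real
theta-coding code is more involved), the test now reads the
pre-split block count, and the b) bias uses the analogous B0 test:

  /* Sketch with assumed names: choose the theta PDF from B0, the block
     count before the split, rather than B, the count after. */
  typedef enum { THETA_PDF_TRIANGULAR, THETA_PDF_UNIFORM } theta_pdf;

  static theta_pdf choose_theta_pdf(int B0)
  {
     /* Short blocks before the split: theta is close to uniformly
        distributed, so skip the triangular model. */
     return B0 > 1 ? THETA_PDF_UNIFORM : THETA_PDF_TRIANGULAR;
  }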
All of our usage of ec_{enc|dec}_bit_prob had the probability of a
"one" being a power of two.
This adds a new ec_{enc|dec}_bit_logp() function that takes this
explicitly into account.
It introduces less rounding error than the bit_prob version, does not
require 17-bit integers to be emulated by ec_{encode|decode}_bin(),
and does not require any multiplies or divisions at all.
It is exactly equivalent to
ec_encode_bin(enc,_val?0:(1<<_logp)-1,(1<<_logp)-(_val?1:0),1<<_logp)
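A sketch of why no multiply or divide is needed when the probability
of a "one" is 2^-_logp (the real coder's state handling,
renormalization, and carry propagation are omitted here):

  #include <stdint.h>

  /* Sketch only, not the real entropy coder: with a power-of-two
     probability the range split is a single shift. */
  static void sketch_enc_bit_logp(uint32_t *low, uint32_t *rng,
                                  int _val, unsigned _logp)
  {
     uint32_t s = *rng >> _logp;  /* width of the "one" region at the top */
     if (_val) {
        *low += *rng - s;
        *rng  = s;
     } else {
        *rng -= s;                /* "zero" keeps the remaining range */
     }
     /* ...renormalize low/rng here... */
  }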
The old ec_{enc|dec}_bit_prob functions are left in place for now,
because I am not sure whether SILK still uses them in the combined
Opus codec.
The folding decision values were stored internally in one order and
coded in the bitstream in a different order, and both orderings used
bare constants, making it unclear what either actually meant.
This changes them to use the same order, gives them named constants,
and renames all the "fold" decision stuff to "spread" instead,
since that is what it is really controlling.
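For illustration, the named constants could look like this (the
names and values here are assumptions, not quoted from the patch):

  /* Sketch: named spreading decisions, used for both the internal and the
     bitstream representation now that they share one order. */
  #define SPREAD_NONE       (0)
  #define SPREAD_LIGHT      (1)
  #define SPREAD_NORMAL     (2)
  #define SPREAD_AGGRESSIVE (3)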
The idea here is that it's better to fold a higher band -- even if it was
coded less accurately -- than a lower band that may have a different
temporal structure.