This patch makes all symbols conditional on whether or not there's
enough space left in the buffer to code them, and eliminates much
of the redundancy in the side information.
A summary of the major changes:
* The isTransient flag is moved up to before the coarse energy.
If there are not enough bits to code the coarse energy, the flag
would get forced to 0, meaning whatever energy values were coded
would get interpreted incorrectly.
This might not be the end of the world, and I'd be willing to
move it back given a compelling argument.
* Coarse energy switches coding schemes when there are fewer than 15
bits left in the packet:
- With at least 2 bits remaining, the change in energy is forced
to the range [-1...1] and coded with 1 bit (for 0) or 2 bits
(for +/-1).
- With only 1 bit remaining, the change in energy is forced to
the range [-1...0] and coded with one bit.
- If there is less than 1 bit remaining, the change in energy is
forced to -1.
This effectively low-passes bands whose energy is consistently
starved; this might be undesirable, but letting the default be
zero is unstable, which is worse.
* The tf_select flag gets moved back after the per-band tf_res
flags again, and is now skipped entirely when none of the
tf_res flags are set, and the default value is the same for
either alternative.
* dynalloc boosting is now limited so that it stops once it's given
a band all the remaining bits in the frame, or when it hits the
"stupid cap" of (64<<LM)*(C<<BITRES) used during allocation.
* If dynalloc boosting has allocated all the remaining bits in the
frame, the alloc trim parameter does not get encoded (it would
have no effect).
* The intensity stereo offset is now limited to the range
[start...codedBands], and thus doesn't get coded until after
all of the skip decisions.
Some space is reserved for it up front, and gradually given back
as each band is skipped.
* The dual stereo flag is coded only if intensity>start, since
otherwise it has no effect.
It is now coded after the intensity flag.
* The space reserved for the final skip flag, the intensity stereo
offset, and the dual stereo flag is now redistributed to all
bands equally if it is unused.
Before, the skip flag's bit was given to the band that stopped
skipping without it (usually a dynalloc boosted band).
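
For reference, the low-budget coarse energy rules from the list above can be sketched as a small helper. This is only an illustration: the names coarse_choice and clamp_coarse_delta are made up for this sketch, bits_left is assumed to be a whole number of bits, and the regular scheme used with 15 or more bits is not modeled.

```c
#include <assert.h>

/* Hypothetical result type for this sketch (not taken from the patch). */
typedef struct {
    int qi;   /* energy delta that would actually be coded */
    int bits; /* bits spent, or -1 when the regular scheme applies */
} coarse_choice;

/* Clamp a desired coarse energy delta qi to what bits_left can express. */
static coarse_choice clamp_coarse_delta(int qi, int bits_left)
{
    coarse_choice c;
    assert(bits_left >= 0);
    if (bits_left >= 15) {
        /* Enough room: the regular scheme is used (cost not modeled). */
        c.qi = qi;
        c.bits = -1;
    } else if (bits_left >= 2) {
        /* Range [-1...1]: 1 bit for 0, 2 bits for +/-1. */
        c.qi = qi < -1 ? -1 : (qi > 1 ? 1 : qi);
        c.bits = c.qi == 0 ? 1 : 2;
    } else if (bits_left == 1) {
        /* Range [-1...0], coded with one bit. */
        c.qi = qi < 0 ? -1 : 0;
        c.bits = 1;
    } else {
        /* Nothing left: the delta is forced to -1 and costs nothing. */
        c.qi = -1;
        c.bits = 0;
    }
    return c;
}
```

For example, a desired delta of +5 with 8 bits left comes back as +1 at a cost of 2 bits, and any delta with 0 bits left comes back as -1.
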
In order to enable simple interaction between VBR and these
packet-size enforced limits, many of which are encountered before
VBR is run, the maximum packet size VBR will allow is computed at
the beginning of the encoding function, and the buffer reduced to
that size immediately.
Later, when it is time to make the VBR decision, the minimum packet
size is set high enough to ensure that no decision made thus far
will have been affected by the packet size.
As long as this is smaller than the up-front maximum, all of the
encoder's decisions will remain in-sync with the decoder.
If it is larger than the up-front maximum, the packet size is kept
at that maximum, also ensuring sync.
The minimum used now is slightly larger than it used to be, because
it also includes the bits added for dynalloc boosting.
Such boosting is shut off by the encoder at low rates, and so
should not cause any serious issues at the rates where we would
actually run out of room before compute_allocation().
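
The interaction between the up-front maximum and the later minimum amounts to a clamp in which the maximum wins. A minimal sketch, using hypothetical names (vbr_packet_bytes, target, min_bytes, max_bytes) that are not taken from the encoder:

```c
/* Pick the final packet size in bytes for one frame.
 * target:    size the VBR logic would like to use
 * min_bytes: smallest size at which no earlier decision was size-limited
 * max_bytes: maximum computed at the start of encoding (the buffer has
 *            already been reduced to this size)
 */
static int vbr_packet_bytes(int target, int min_bytes, int max_bytes)
{
    if (target < min_bytes)
        target = min_bytes;
    /* Applied last, so the up-front maximum wins even if min_bytes
       is larger than max_bytes. */
    if (target > max_bytes)
        target = max_bytes;
    return target;
}
```

With min_bytes=70 and max_bytes=60, any target yields 60, which matches the "kept at that maximum" case described above.
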
The mid = (lo+hi)>>1 line in the binary search would allow hi to drop
down to the same value as lo, meaning the rounding after the search
would be choosing between the same two values.
This patch changes it to (lo+hi+1)>>1.
This will allow lo to increase up to the value hi, but only in the
case that we can't possibly allocate enough pulses to meet the
target number of bits (in which case the rounding doesn't matter).
To pay for the extra add, this moves the +1 in the comparison to bits
to the other side, which can then be taken outside the loop.
The compiler can't normally do this because it might cause overflow
which would change the results.
This rarely mattered, but gives a 0.01 PEAQ improvement on 12-byte
120 sample frames.
It also makes the search process describable with a simple
algorithm, rather than relying on this particular optimized
implementation.
I.e., the binary search loop can now be replaced with
  for(lo=0;lo+1<cache[0]&&cache[lo+1]<bits;lo++);
  hi=lo+1;
and it will give equivalent results.
This was not true before.
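
To make the equivalence claim concrete, here is a sketch that puts the two searches side by side. It assumes cache[0] holds the index of the last entry (at least 1 and at most 64, so that six halving steps are enough), that cache[1..cache[0]] is non-decreasing, and that bits already has the +1 folded in. The function names, the iteration count, and the rounding rule in pick() are assumptions for illustration, not a copy of the patched code.

```c
#define SEARCH_ITERS 6 /* assumed: enough halvings for cache[0] <= 64 */

/* Assumed rounding rule: return whichever of lo and hi has a cost closer
   to bits, treating index 0 (no pulses) as costing 0 bits. */
static int pick(const unsigned char *cache, int lo, int hi, int bits)
{
    int lo_cost = lo == 0 ? 0 : (int)cache[lo];
    return bits - lo_cost <= (int)cache[hi] - bits ? lo : hi;
}

/* Fixed-iteration binary search with the rounded-up midpoint. */
static int search_binary(const unsigned char *cache, int bits)
{
    int lo = 0, hi = cache[0], i;
    for (i = 0; i < SEARCH_ITERS; i++) {
        int mid = (lo + hi + 1) >> 1;
        if ((int)cache[mid] >= bits)
            hi = mid;
        else
            lo = mid; /* can reach hi only if cache[cache[0]] < bits */
    }
    return pick(cache, lo, hi, bits);
}

/* The simple linear scan, followed by the same rounding. */
static int search_linear(const unsigned char *cache, int bits)
{
    int lo, hi;
    for (lo = 0; lo + 1 < cache[0] && cache[lo + 1] < bits; lo++);
    hi = lo + 1;
    return pick(cache, lo, hi, bits);
}
```

With a small table such as {7, 3, 8, 8, 15, 21, 30, 42}, both functions return the same index for every value of bits, including the case where bits exceeds the last entry and the binary search ends with lo equal to hi.
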
This allows us to a) not pay a coding cost to avoid skipping bands that are
stupid to skip (e.g., the first band, or bands that have so few bits that we
wouldn't redistribute anything) and b) not reserve bits to pay that cost.
The old code allocated too many fine bits to large bands.
New allocations were derived by numerical optimization using quantization
MSE sampled from Laplacian distributed random data to within +/- 1 bit for
N=2...160 and bits per band from 0 to 64.
Those allocations could be modeled with only minor errors using a simple offset
of 19/8+log2(N), with no bits spent on fine energy when there would not be
enough bits remaining to code a single pulse.
However, PEAQ testing suggested an offset of 14/8 was better, and that it was
always worth spending at least one bit on fine energy.