441 commits

Author SHA1 Message Date
Gregory Maxwell
a65db56e54 Fix declaration after statement in fixed point. 2014-01-08 11:48:38 -08:00
Jean-Marc Valin
e775090427 pseudostack instrumentation (off by default) 2014-01-07 21:32:41 -05:00
Jean-Marc Valin
9134e96cb2 Fixes SMALL_FOOTPRINT for float 2014-01-07 17:50:46 -05:00
Jean-Marc Valin
e17ca25617 Don't allocate pulses on the stack when calling the SILK PLC.
Also minor C89 fix for the previous commit
2014-01-07 15:27:02 -05:00
Jean-Marc Valin
b63e7110cb Moves CELT PLC pitch search to a separate function to reduce peak stack 2014-01-07 15:02:43 -05:00
Jean-Marc Valin
5f807c176f Adds SMALL_FOOTPRINT hack to the CELT PLC too 2014-01-07 04:48:42 -05:00
Jean-Marc Valin
9d1b6fef2a Moves deemphasis() call out of celt_decode_lost() to reduce peak stack 2014-01-07 04:32:41 -05:00
Jean-Marc Valin
ad8371d172 Cleaning up leftovers of "freq" in celt_decode_with_ec() 2014-01-06 17:45:57 -05:00
Jean-Marc Valin
4d07b1357e Reduces the decoder stack use by removing the pcm_silk buffer in fixed-point
We only keep it when concealing less than 10 ms with SILK.
2014-01-06 17:43:20 -05:00
Jean-Marc Valin
14ca4ed682 Moves the remains of compute_inv_mdcts() to celt_synthesis() 2014-01-06 09:31:09 -05:00
Jean-Marc Valin
32454dcadc Hack that makes the SMALL_FOOTPRINT CELT decoder use only 4.25 kB of stack. 2014-01-06 09:11:52 -05:00
Jean-Marc Valin
bdc7b93358 Reduces decoder stack usage by only storing one channel of denormalized MDCT 2014-01-06 08:58:38 -05:00
Jean-Marc Valin
4a6744a446 Some cleaning up of the synthesis code. 2014-01-05 21:40:02 -05:00
Jean-Marc Valin
ed01a596dc Making exp_rotation1() use MAC16_16(), which saves a few cycles on ARM 2014-01-04 21:06:24 -05:00
Jean-Marc Valin
ccec752a38 Silences unused parameter warning 2014-01-04 00:49:46 -05:00
Jean-Marc Valin
ef0eac497f Moving the radix-2 to expose trivial twiddle factors 2014-01-03 23:55:52 -05:00
Jean-Marc Valin
c8f62a4aed Improving the accuracy of the fixed-point radix-3 and radix-5 2013-12-31 00:00:37 -05:00
Jean-Marc Valin
e1f846208e Minor cleanup -- nothing to see here 2013-12-29 18:45:49 -05:00
Jean-Marc Valin
05291fd6a6 Fixed-point: slight accuracy improvement in the comb filter 2013-12-29 18:31:17 -05:00
Jean-Marc Valin
30f52cbe2d Remove a SAVE_STACK that was pasted accidentally in the previous commit 2013-12-29 16:21:06 -05:00
Jean-Marc Valin
e1dc1e2238 Unifying scaling of fixed-point and float FFT 2013-12-29 13:34:17 -05:00
Jean-Marc Valin
dbb96ab5cc Fixes C89 issue 2013-12-29 00:09:06 -05:00
Jean-Marc Valin
4c1a90a847 Getting rid of some negations
Since we're doing two rotations, we can invert the sign on both.
Also adding a few comments for optimizing the FFT.
2013-12-28 23:14:26 -05:00
Jean-Marc Valin
cc344fb8ff Slightly improving the accuracy of the fixed-point MDCT downscale
Also simplifying the code
2013-12-28 19:10:44 -05:00
Jean-Marc Valin
e0c00e27d8 Commit 99968ab was causing us to allocate too much stack in the MDCT 2013-12-27 03:16:34 -05:00
Jean-Marc Valin
e43a0abe0a Removes the separate 1/8N rotation in the (I)MDCT and unmerges the MDCT sizes
Undoes commits f7547a4e and 72513f3c
2013-12-27 00:10:54 -05:00
Jean-Marc Valin
a5e3c8a6a6 Inverse MDCT no longer requires any scratch space 2013-12-23 02:26:03 -05:00
Jean-Marc Valin
e2bcb3fe9b Reverse the ordering of the FFT stage to optimize a degenerate radix-4 case.
This also happens to increase the accuracy since it appears that the new
ordering is optimal (at least for 20 ms frames), whereas the previous ordering
was pessimal.
2013-12-22 02:17:24 -05:00
Jean-Marc Valin
c8f4e1608a Merges the FFT scaling with the MDCT pre-rotate 2013-12-21 16:30:49 -05:00
Jean-Marc Valin
153def2884 Getting rid of the inverse FFT entirely
IMDCT now uses the forward FFT.
2013-12-21 15:45:17 -05:00
Jean-Marc Valin
99968abba8 Moving bitrev step to forward MDCT too 2013-12-21 14:29:41 -05:00
Jean-Marc Valin
bc13bbaad7 Applying the forward FFT gain up-front for fixed-point too
This makes us lose a bit of precision in the first steps, but our gain is more
precise because it's only rounded once. Overall, SNR is slightly improved.
2013-12-21 02:33:22 -05:00
Jean-Marc Valin
2e26b82ec2 Moves the bitrev step to the IMDCT pre-rotation 2013-12-20 23:13:29 -05:00
Jean-Marc Valin
306d7f5a30 fixed-point: slight (but free) accuracy improvement in compute_band_energies()
Also moves the VSHR32() condition outside the loop just in case some compilers
don't optimize that properly.
2013-12-16 01:08:21 -05:00
Jean-Marc Valin
e0f26246b0 fixed-point: adds rounding to some shifts to eliminate bias
This reduces the peak decoding error by removing small (inaudible) spikes in
the error at the frame boundaries. These were due to the frequency-domain bias
ending up as a small pulse in the middle of the IMDCT overlap. None of this
was ever audible, but fixing it is still cleaner.
2013-12-14 11:07:13 -05:00
Jean-Marc Valin
4a168eb343 Remove useless code in alloc_trim_analysis() 2013-12-11 01:34:06 -05:00
Jean-Marc Valin
5752d659fd Minor fixed-point accuracy improvements that were completely free 2013-12-11 00:21:38 -05:00
Jean-Marc Valin
91f8010108 Removing indirections 2013-12-10 22:09:33 -05:00
Jean-Marc Valin
5607d5d1c8 Annotating pointer arguments with OPUS_RESTRICT and const 2013-12-10 22:09:29 -05:00
Jean-Marc Valin
122971b8cc More NaN hardening in the analysis code 2013-12-10 13:56:38 -05:00
Jean-Marc Valin
d5553e8aca Using OPUS_COPY()/OPUS_CLEAR() in the decoder too 2013-12-10 02:32:26 -05:00
Jean-Marc Valin
15edb78b3e Making NaN detection more robust to -ffast-math. 2013-12-09 21:56:21 -05:00
Jean-Marc Valin
4fda6b0142 Using celt_inner_prod() in compute_band_energies() 2013-12-09 18:06:34 -05:00
Pedro Becerra
a9b7def9f5 s/MAX16/MAX32/ in transient_analysis()
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2013-12-09 16:08:29 -05:00
Jean-Marc Valin
57cd849cf7 Defining celt_inner_prod() and using it instead of explicit loops.
Also adds an SSE-optimized celt_inner_prod().
2013-12-09 15:26:58 -05:00
Jean-Marc Valin
ff072009fe Replaces inline copies and initialization with OPUS_*() macros.
This is a bit faster at -O2 because memcpy()/memmove()/memset() are
vectorized. The code is also cleaner.
2013-12-09 15:26:52 -05:00
Jean-Marc Valin
0f869cba0f Changes ABS16() and ABS32() to use fabs() in the float build
gcc is better at optimizing it than the ?: version
2013-12-09 15:26:43 -05:00
Jean-Marc Valin
c94e4bb103 Optimizes encoder NaN detection and clipping by only running them when needed
NaN detection should now be able to catch values that would create NaNs
further down.
2013-12-09 15:26:03 -05:00
Jean-Marc Valin
5626908ec3 Fixed-point fast-path for normal 48 kHz encoding in celt_preemphasis() 2013-12-05 16:40:59 -05:00
Jean-Marc Valin
aed1009df9 Turns a 16x32 multiply into a 16x16 one in celt_preemphasis(). 2013-12-05 13:36:48 -05:00