3269 commits

Author SHA1 Message Date
Jean-Marc Valin
0d95b3b48c SSE optimization of comb_filter()
Should make it easy to adapt to other architectures.
2013-06-17 03:58:16 -04:00
Jean-Marc Valin
71766ef7a4 Avoids symbol clashes with Speex (pitch_xcorr) and libm (y1) 2013-06-17 00:44:12 -04:00
Jean-Marc Valin
7fd98c571f Converts denormalise_bands() to use 16-bit multiplications 2013-06-16 21:56:41 -04:00
Jean-Marc Valin
ee2506b2c7 Moves log2Amp inside denormalise_bands() and gets rid of bandE[]
Also get rid of the MSE measurement code which is outdated and no longer useful
2013-06-16 20:24:52 -04:00
Jean-Marc Valin
3afc6ffff0 Don't call denormalise_bands() on silence 2013-06-16 15:40:10 -04:00
Timothy B. Terriberry
ce15e65319 Split cwrsi() by pulses vs. dimensions.
This lets us cut out a bunch of work in the large _n, small _k case
 where most of the dimensions won't have any pulses.
It also gets rid of all remaining usage of CELT_PVQ_U() in cwrsi(),
 leaving just a single test instead of lots of mins and maxes, and
 makes a bunch of the jump threading more obvious.

This is a 1.6% decoder speedup on a 96 kbps comp48-stereo encode on
 a Cortex A8.
2013-06-15 03:06:57 -04:00
Timothy B. Terriberry
63f744d583 Further speedup in cwrsi() by using the special case for n=2 2013-06-15 02:01:03 -04:00
Timothy B. Terriberry
533dbe705b Further optimization to cwrsi()
Makes it possible to skip the first loop in some cases.
2013-06-15 01:37:20 -04:00
Jean-Marc Valin
bc469b8e44 Splits cwrsi() inner loop in two to avoid the min/max and some load chains 2013-06-15 00:42:38 -04:00
Jean-Marc Valin
4e018b22bb SSE optimization of remove_doubling()
Should be trivial to adapt for Neon.
2013-06-13 23:51:58 -04:00
Jean-Marc Valin
39cbc45828 Fixes stupid tf calibration bugs introduced/exposed in f77410d 2013-06-13 16:13:51 -04:00
Jean-Marc Valin
204e70d9fa Adds a quick hack to replace the normal calls with the multistream version. 2013-06-13 15:06:42 -04:00
Jean-Marc Valin
28733d1281 Moves VBR calculations to a separate function.
Does not change the behaviour of the VBR code in most cases. The only
exception is that the VBR offset is now taken into account in the base_rate,
which will have a (very minor) impact on CVBR at low rate.
2013-06-10 03:32:12 -04:00
Ron
4a7bb1fe9b Drop the stdint size tests that we never use anywhere
These were probably cribbed from libogg, but we don't use them here,
opus_types.h instead has a list of hardcoded arch definitions.
2013-06-09 00:57:02 +09:30
Jean-Marc Valin
f22e54dca4 Fixes fixed-point on x86 (no SSE). 2013-06-07 07:21:41 -04:00
John Ridges
e50e8084a9 Improved SSE version of xcorr_kernel()
The loop no longer reads past its buffer and is slightly faster.
Also fixes RESTORE_STACK in celt_iir().
2013-06-06 23:12:57 -04:00
Jean-Marc Valin
70c9c3a482 Forgot to add assembly file 2013-06-06 12:45:17 -04:00
Jean-Marc Valin
a092aa8f80 Adds SSE support (only xcorr_kernel() for now)
There's no CPU detection for it, it only gets enabled by __SSE__
which gcc (other compilers?) defines automatically when supported
by -march=, which means at least all x86-64. For ia32, the user needs to
enable it in the CFLAGS.
2013-06-05 18:56:07 -04:00
Aurélien Zanelli
cd4c8249bc Add run-time CPU detection and support for ARM architecture
Run-time CPU detection (RTCD) is enabled by default if the target platform
supports it.
It can be disabled at compile time with the --disable-rtcd option.

Add RTCD support for ARM architecture.

Thanks to Timothy B. Terriberry for help and code review

Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
2013-06-04 16:23:22 -07:00
Ron
aa6a1a16ad Test the compiler configuration, not the assembler
With gcc-4.4 at least, the raw asm.s files will always successfully
compile even if the default -march for the compiler would not support
those instructions.  So switch to testing the inline asm versions,
where the compiler will barf if they aren't supported by the default
arch if no -march is explicitly given, or if they aren't supported by
the requested -march when it is.
2013-06-04 15:30:40 +09:30
Jean-Marc Valin
58d80ab9ea Disables all the surround mode forcing for mono/stereo 2013-05-27 20:47:47 -04:00
Aurélien Zanelli
fcecd29abf Check if opus_compare is executable in run_vectors.sh
If opus_compare doesn't exist or isn't executable, tests failed normally
which could be misleading.
So test for existence and mode to avoid this ambiguity.
2013-05-27 11:21:31 -04:00
Jean-Marc Valin
0fed074b04 C89 fix 2013-05-26 20:29:44 -04:00
Jean-Marc Valin
068cbd89bf Creates xcorr_kernel() that gets used by pitch_xcorr, celt_fir and celt_iir. 2013-05-26 20:11:44 -04:00
Jean-Marc Valin
2fe4700f76 Skip down-sampling in deemphasis() when not needed. 2013-05-26 18:54:25 -04:00
Aurélien Zanelli
faec6736cb Add an option to disable build of extra programs (demos and tests) 2013-05-26 14:01:11 -07:00
Jean-Marc Valin
319fe445e3 oops (again) 2013-05-25 21:07:48 -04:00
Jean-Marc Valin
1cdc3f5a2d oops 2013-05-25 20:32:45 -04:00
Jean-Marc Valin
64ba502e2c Optimizes remove_doubling() by avoiding redundant calculations of yy
Using a sliding window to pre-compute all yy values.
2013-05-25 20:13:49 -04:00
Jean-Marc Valin
0fa5fa88e9 Adds missing RESTORE_STACK calls 2013-05-25 18:50:01 -04:00
Jean-Marc Valin
531cf591e6 Speeds up celt_iir() by more than a factor of two.
Again, this only impacts the PLC and we assume the order is a multiple of 4.
2013-05-25 07:41:55 -04:00
Jean-Marc Valin
e2374a8ec2 Speeds up celt_fir by more than a factor of two.
Only impacts the PLC. We now assume that the order is a multiple of 4.
2013-05-25 04:25:54 -04:00
Jean-Marc Valin
319df9a836 Fixes two warnings in pitch_xcorr()
Rename y0 and y1 because of the name clash with Bessel functions.
Initialize y_3 to zero because gcc is too dumb to realize it can't
be used uninitialized.
2013-05-25 02:51:56 -04:00
Jean-Marc Valin
e8e57a32f6 Optimizes _celt_autocorr() by using pitch_xcorr()
Computes most of the auto-correlation by reusing pitch_xcorr(). We only
need lag*(lag-1)/2 MACs to complete the calculations.
To do this, pitch_xcorr() was modified so that it no longer truncates the
length to a multiple of 4. Also, the xcorr didn't need the floor at -1.
As a side benefit, this speeds up the PLC, which uses a higher order LPC
filter.
2013-05-25 02:14:25 -04:00
Jean-Marc Valin
fbf99981a6 Merges the 4th order FIR with the first order FIR in pitch_downsample()
Also creates a new hardcoded 5th order fir.
2013-05-24 17:20:08 -04:00
Ralph Giles
1b0552bf94 Try to clarify that opus maps to flac/wav but wav doesn't map to opus. 2013-05-25 01:43:06 +08:00
Ralph Giles
bd5cfda830 Reference before period. 2013-05-25 01:37:46 +08:00
Ralph Giles
4a0bf9601e Hack quoting of hanning article.
If there's no complete author tag, we need to add an opening
quote character manually. See the EBU entry.
2013-05-25 01:28:29 +08:00
Ralph Giles
b243dca30c Wrap lookahead code example in a figure. 2013-05-25 01:23:41 +08:00
Ralph Giles
9e85220f21 Add a wikipedia reference for the Hanning window. 2013-05-25 01:20:00 +08:00
Ralph Giles
6bdbd26ce1 Move the vorbis channel mapping to informative references.
The normative reference is now the channel configurations
given directly in the draft.
2013-05-25 01:18:25 +08:00
Ralph Giles
7918ac13a8 Fix Ogg draft formatting.
Previous markup was invalid.
2013-05-25 01:17:11 +08:00
Ralph Giles
5b6fe64692 Remove an unnecessary comma. 2013-05-25 00:29:16 +08:00
Ralph Giles
2ad6eafcda Merge JM's encoder suggestions.
I've done some editing for clarity, but more needs to be done.
The language needs clean-up, we should forward-reference the LPC
Extrapolation section, and we need a reference for actually
computing linear prediction coefficients.
2013-05-24 18:28:58 +08:00
Ralph Giles
25ffd5cd91 Bump Ogg draft version and date. 2013-05-24 18:03:00 +08:00
Ralph Giles
dfda81eb6e Move implementation status details to wiki.xiph.org.
More recent versions of draft-sheffer-running-code suggest referring
to a wiki. We'd like to try maintaining the implementation status
separately.
2013-05-24 17:44:43 +08:00
Jean-Marc Valin
85a6618af8 Make pitch_xcorr() work when len and max_pitch aren't multiples of 4. 2013-05-24 03:41:04 -04:00
Jean-Marc Valin
088929d1f1 oops, removed a minus sign that should never have appeared 2013-05-24 01:38:06 -04:00
Jean-Marc Valin
559fbe8b16 Unrolled version of the pitch correlation
About 30% faster on x86.
2013-05-24 01:09:31 -04:00
Timothy B. Terriberry
e3ad4ea1cd Move misplaced RESTORE_STACK.
Introduced in c152d602.

Thanks to Pedro Becerra for the report.
2013-05-23 19:33:34 -07:00