Optimize opus decode (float only) use case using ARM NE10.
Mainly effects opus_ifft and ctl_mdct_backward and related
functions.
Work based on previous Encode optimization using ARM NE10
library. See previous commit for details on how to enable
this.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
Optimize opus encode (float only) usecase using ARM NE10
library. Mainly effects opus_fft and ctl_mdct_forward
and related functions.
This optimization can be used for ARM CPUs that have NEON
VFP unit. This patch only enables optimizations for ARMv7.
Official ARM NE10 library page available at
http://projectne10.github.io/Ne10/
To enable this optimization, use
--enable-intrinsics --with-NE10=<install_prefix>
or
--enable-intrinsics --with-NE10-libraries=<NE10_lib_dir> --with-NE10-includes=<NE10_includes_dir>
Compile time checks made during configure process to make sure
optimization option available only when compiler supports NEON
instrinsics.
Runtime checks made to make sure optimized functions only called
on appropriate hardware.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
In that case, the yp0 and yp1 ended up aliasing for the last element,
despite being marked as restrict.
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
We had previously advised people to -Drestrict on
non-C99 compilers, but this creates problems for
some of the MSVC headers. Instead this just
uses a macro and defines it sanely.
Fixes two leakage problems on the wood blocks sample
- Removes DC which causes leakage with no masking
- Detect leakage by comparing short-MDCT energy to long-MDCT energy
and boost allocation for bands with leakage