Commit graph

13 commits

Author SHA1 Message Date
Timothy B. Terriberry
e095c3eb7f Move ARM asm into its own directories. 2013-05-21 12:53:33 -07:00
Timothy B. Terriberry
9880c4cdeb Fix bustage in a16cef62. 2013-05-19 20:52:55 -07:00
Timothy B. Terriberry
a16cef6225 Replace silk_CLZ functions with EC_ILOG().
In most cases these will use __builtin_clz().
In a follow-up, we should audit usage of silk_CLZ32() and convert
 the places where its argument must be non-zero to use EC_ILOG()
 directly to avoid the test for zero (which is necessary on x86).
2013-05-19 19:16:15 -07:00
Timothy B. Terriberry
80ad38370c Convert quotes in license headers to ASCII.
Since the last patch originally had them mangled (presumably by
 mailer, http server, or something else), let's just get rid of
 them.
2013-05-19 19:16:11 -07:00
Timothy B. Terriberry
972a34ec2c Add ARMv4/ARMv5E macros.
Original patch by Aurélien Zanelli <aurelien.zanelli@parrot.com>:
 http://lists.xiph.org/pipermail/opus/2013-May/002078.html

Revised version:
- Add autconf detection (ported from libtheora).
- Rename ARM5E to ARMv5E (an ARM5 is not the same thing as ARMv5!).
- Use actual macros so they can still be selectively overridden.
- Split out ARMv4 parts and add a few more ARMv4 macros.
- Label blocks to make them easy to find in generated assembly.
- Fix MULT16_32_Q15() so we can pass make check.
  The MDCT test passes in values larger than 2**30 for b.
  The new version should be just as fast (or faster, since it's
   easier to merge the shift with following instructions), and
   there's no appreciable impact on accuracy (FFT/MDCT SNR actually
   goes up in most cases).
- Fix register constraints.
  We were using early-clobber flags in a bunch of places that
   didn't need them, and commutative-pair flags in a bunch of
   places that weren't actually commutative.
  This was Jean-Marc's fault (the original code came from Speex).
- Simplify silk_CLZ16().
- Port over iFFT C_MULC asm by Andree Buschmann
   <AndreeBuschmann@t-online.de> from Rockbox.
- Speed up the C_MULC asm by using LDRD, allowing more flexible
   addressing, re-ordering instructions to avoid some stalls,
   allowing more flexible register allocation, and getting things
   out of the inline asm block so the compiler can schedule them
   better.
- Add C_MUL and C_MUL4 asm for the FFT to the encoder based, on the
   new C_MULC.

In total, this patch gives a 22.3% speed-up on test_opus_encoder on
 a 600 MHz Cortex A8 using gcc 4.2.1,
When restricted to ARMv4 optimizations, it gives a 9.6% speed-up
 on the same processor/compiler.
On the conformance test vectors:
 Average mono quality is 97.0583 %
 Average stereo quality is 97.775 %
2013-05-19 19:12:51 -07:00
Timothy B. Terriberry
c152d602aa Use dynamic stack allocation in the SILK encoder.
This makes all remaining large stack allocations use the vararray
 macros.
This continues the work of 6f2d9f50 to allow compiling with
 NONTHREADSAFE_PSEUDOSTACK to move the memory for large buffers
 off the stack for devices where it is very limited.

It also does this for some additional large buffers used by the
 PLC in the decoder.
2013-05-08 10:37:17 -07:00
Jean-Marc Valin
ab5a049705 Merge commit '390c89225d' 2012-04-24 13:39:22 -04:00
Jean-Marc Valin
ae00e60d35 License update using the IETF Trust flavour of the BSD on the Silk code 2012-04-20 16:31:04 -04:00
Jean-Marc Valin
94a4989cdb Makes silk_ADD_SAT32() conform to the C standard
This changes the saturation test to ensure that it relies on the
unsigned overflow behaviour (which is allowed) rather than the signed
overflow behaviour (which is undefined).
2012-04-12 16:35:19 -04:00
Gregory Maxwell
3e4afc6fd2 Removes a number of macro definitions which are used nowhere in the codebase. 2012-03-05 18:00:01 -05:00
Koen Vos
bf75c8ec4d SILK fixes following last codec WG meeting
decoder:
- fixed incorrect scaling of filter states for the smallest quantization
  step sizes
- NLSF2A now limits the prediction gain of LPC filters

encoder:
- increased damping of LTP coefficients in LTP analysis
- increased white noise fraction in noise shaping LPC analysis
- introduced maximum total prediction gain.  Used by Burg's method to
  exit early if prediction gain is exceeded.  This improves packet
  loss robustness and numerical robustness in Burg's method
- Prefiltered signal is now in int32 Q10 domain, from int16 Q0
- Increased max number of iterations in CBR gain control loop from 5 to 6
- Removed useless code from LTP scaling control
- Optimization: smarter LPC loop unrolling
- Switched default win32 compile mode to be floating-point

resampler:
- made resampler have constant delay of 0.75 ms; removed delay
  compensation from silk code.
- removed obsolete table entries (~850 Bytes)
- increased downsampling filter order from 16 to 18/24/36 (depending on
  frequency ratio)
- reoptimized filter coefficients
2011-12-13 14:47:31 -05:00
Ralph Giles
120800f8fa Rename '_FOO' to avoid potentional collisions with reserved identifiers.
C reserves identifiers of the from _[A-Z]+ and we have a number of
those in the code. This patch renames the various function arguments,
MACROS and preprocessor symbols to avoid the reserved form.

It also removes the CHANNELS() macro altogether. This was a
minor optimization for TI DSP to force a mono-only build,
as were the associated local 'const' versions. Since stereo
support is manditory, it wasn't worth keeping.

Thanks to John Ridges for raising the issue, and Jean-Marc Valin
and Greg Maxwell for reviewing the changes.
2011-12-02 12:31:36 -05:00
Jean-Marc Valin
1c2f5633d1 Removed all the silk_ prefixes in source file names (not symbols) 2011-09-16 01:16:53 -07:00
Renamed from silk/silk_macros.h (Browse further)