Commit graph

37 commits

Author SHA1 Message Date
Linfeng Zhang
43db56225b
Optimize silk_biquad_alt_stride2() for ARM NEON
The optimization is bit exact with C function.

Change-Id: Ifb8f04b19f2d576e79ce5dcfa7e0fc374d71d6c8

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2017-05-17 15:55:42 -04:00
Linfeng Zhang
60eb7d88b4
Update silk_biquad_alt()
Split to silk_biquad_alt_stride1() and silk_biquad_alt_stride2(),
so that it can be optimized more efficiently when stride is 2.

This change in C code is bit exact with the origin.

Change-Id: Idaefe670397016ace2a489e3435ac61b7dbe79d5

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2017-05-17 15:55:42 -04:00
Linfeng Zhang
95d4c9f960
Optimize silk_LPC_inverse_pred_gain() for ARM NEON
The optimization is bit exact with C function.

Change-Id: Ib3bdc26a5a4ebe02e7f24be85104e8e9a2a9a738

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2017-02-14 23:57:15 -05:00
Jean-Marc Valin
572d65df7e
Remove silk_LPC_inverse_pred_gain_Q24() which is no longer used anywhere 2017-02-09 23:01:26 -05:00
Linfeng Zhang
cfdaf365b9
Optimize silk_NSQ_del_dec() for ARM NEON
The optimization is bit exact with C function.

This optimization speeds up SILK encoder on NEON as following.

Fixed-point:
Complexity 0-5:  0%
Complexity 6-7:  6%
Complexity 8-9: 10%
Complexity  10:  8%

Got similar results on floating-point.

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2017-01-30 16:00:03 -05:00
Koen Vos
3e054b8e57 increase order of noise shaping filter 2016-07-19 16:11:18 -04:00
Koen Vos
07691f15d4 Clean up: alignment of comments 2016-07-17 15:05:55 -04:00
Koen Vos
90f8c5ef4d Clean up: replace tabs by spaces 2016-07-17 15:05:55 -04:00
Koen Vos
52cfffe579 slight clean up 2016-07-17 15:05:55 -04:00
Jean-Marc Valin
34da05b30d Fixes signed integer overlof in silk_ADD_POS_SAT32()
Removes unused 64-bit version
2016-07-17 15:05:54 -04:00
xiangmingzhu
c95c9a048f Cisco optimization for x86 & fixed point
1. Only for fixed point on x86 platform (32bit and 64bit, uses SIMD
   intrinsics up to SSE4.2)
2. Use "configure --enable-fixed-point --enable-intrinsics" to enable
   optimization, default is disabled.
3. Official test cases are verified and passed.

Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
2014-10-03 21:16:00 -04:00
Rhishikesh Agashe
f133bac6f9 MIPS optimizations
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2014-06-19 04:04:51 -04:00
Timothy B. Terriberry
39386e0b85 Adds Neon assembly for correlation/convolution
Optimizing celt_pitch_xcorr()/xcorr_kernel() which also speeds up
FIRs, IIRs and auto-correlations

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2013-11-18 13:41:17 -05:00
Gregory Maxwell
7830cf1bd2 Replace "inline" with OPUS_INLINE.
Newer versions of MSVC are unhappy with the strategy of the build
 environment redefining "inline" (even though they don't support the
 actual keyword). Instead we define OPUS_INLINE to the right thing
 in opus_defines.h.

This is the same approach we use for restrict.
2013-10-28 10:18:54 -07:00
Ralph Giles
f2446c25c6 Remove trailing whitespace from the license headers. 2013-09-16 14:40:04 -07:00
Timothy B. Terriberry
e095c3eb7f Move ARM asm into its own directories. 2013-05-21 12:53:33 -07:00
Timothy B. Terriberry
80ad38370c Convert quotes in license headers to ASCII.
Since the last patch originally had them mangled (presumably by
 mailer, http server, or something else), let's just get rid of
 them.
2013-05-19 19:16:11 -07:00
Timothy B. Terriberry
972a34ec2c Add ARMv4/ARMv5E macros.
Original patch by Aurélien Zanelli <aurelien.zanelli@parrot.com>:
 http://lists.xiph.org/pipermail/opus/2013-May/002078.html

Revised version:
- Add autconf detection (ported from libtheora).
- Rename ARM5E to ARMv5E (an ARM5 is not the same thing as ARMv5!).
- Use actual macros so they can still be selectively overridden.
- Split out ARMv4 parts and add a few more ARMv4 macros.
- Label blocks to make them easy to find in generated assembly.
- Fix MULT16_32_Q15() so we can pass make check.
  The MDCT test passes in values larger than 2**30 for b.
  The new version should be just as fast (or faster, since it's
   easier to merge the shift with following instructions), and
   there's no appreciable impact on accuracy (FFT/MDCT SNR actually
   goes up in most cases).
- Fix register constraints.
  We were using early-clobber flags in a bunch of places that
   didn't need them, and commutative-pair flags in a bunch of
   places that weren't actually commutative.
  This was Jean-Marc's fault (the original code came from Speex).
- Simplify silk_CLZ16().
- Port over iFFT C_MULC asm by Andree Buschmann
   <AndreeBuschmann@t-online.de> from Rockbox.
- Speed up the C_MULC asm by using LDRD, allowing more flexible
   addressing, re-ordering instructions to avoid some stalls,
   allowing more flexible register allocation, and getting things
   out of the inline asm block so the compiler can schedule them
   better.
- Add C_MUL and C_MUL4 asm for the FFT to the encoder based, on the
   new C_MULC.

In total, this patch gives a 22.3% speed-up on test_opus_encoder on
 a 600 MHz Cortex A8 using gcc 4.2.1,
When restricted to ARMv4 optimizations, it gives a 9.6% speed-up
 on the same processor/compiler.
On the conformance test vectors:
 Average mono quality is 97.0583 %
 Average stereo quality is 97.775 %
2013-05-19 19:12:51 -07:00
Koen Vos
9cbbcb53ae Improvements to the pitch search
Normalizes the cost function by (x+y) instead of sqrt(x*y)
2012-10-10 09:37:10 -04:00
Philip Jägenstedt
6d9c16d142 Fix common misspellings
I stumbled upon the typo in README.draft, so took the opportunity to
grep for common misspellings using List_of_common_misspellings.txt for
hunspell.
2012-09-27 09:16:30 -04:00
Gregory Maxwell
e052947f18 Use 'frame' instead of 'signal', take out stdlib.h in silk/.
On MacOS, stdlib.h ends up including sys/signal.h, generating
warnings about the local variables called 'signal' shadowing
the global symbol signal(3).

This was originally done in 86476906 but it missed some use
of 'signal' in prototypes in headers where it didn't cause
warnings. Later the prototypes were moved around and the
warnings came back.

This also cleans up some cases in where stdlib.h was used
but shouldn't be required.
2012-05-23 10:28:36 -04:00
Jean-Marc Valin
ab5a049705 Merge commit '390c89225d' 2012-04-24 13:39:22 -04:00
Jean-Marc Valin
ae00e60d35 License update using the IETF Trust flavour of the BSD on the Silk code 2012-04-20 16:31:04 -04:00
Ralph Giles
5f6e472c91 Remove trailing whitespace.
Also fixes a minor typo.
2012-04-02 11:21:36 -04:00
Gregory Maxwell
3e4afc6fd2 Removes a number of macro definitions which are used nowhere in the codebase. 2012-03-05 18:00:01 -05:00
Koen Vos
bf75c8ec4d SILK fixes following last codec WG meeting
decoder:
- fixed incorrect scaling of filter states for the smallest quantization
  step sizes
- NLSF2A now limits the prediction gain of LPC filters

encoder:
- increased damping of LTP coefficients in LTP analysis
- increased white noise fraction in noise shaping LPC analysis
- introduced maximum total prediction gain.  Used by Burg's method to
  exit early if prediction gain is exceeded.  This improves packet
  loss robustness and numerical robustness in Burg's method
- Prefiltered signal is now in int32 Q10 domain, from int16 Q0
- Increased max number of iterations in CBR gain control loop from 5 to 6
- Removed useless code from LTP scaling control
- Optimization: smarter LPC loop unrolling
- Switched default win32 compile mode to be floating-point

resampler:
- made resampler have constant delay of 0.75 ms; removed delay
  compensation from silk code.
- removed obsolete table entries (~850 Bytes)
- increased downsampling filter order from 16 to 18/24/36 (depending on
  frequency ratio)
- reoptimized filter coefficients
2011-12-13 14:47:31 -05:00
Ralph Giles
120800f8fa Rename '_FOO' to avoid potentional collisions with reserved identifiers.
C reserves identifiers of the from _[A-Z]+ and we have a number of
those in the code. This patch renames the various function arguments,
MACROS and preprocessor symbols to avoid the reserved form.

It also removes the CHANNELS() macro altogether. This was a
minor optimization for TI DSP to force a mono-only build,
as were the associated local 'const' versions. Since stereo
support is manditory, it wasn't worth keeping.

Thanks to John Ridges for raising the issue, and Jean-Marc Valin
and Greg Maxwell for reviewing the changes.
2011-12-02 12:31:36 -05:00
Koen Vos
acc7a6c78b Reformatting changes with an update to the MSVC project files 2011-10-28 19:44:26 -04:00
Jean-Marc Valin
4a7027b27e Allow wrap-around in silk_LPC_analysis_filter() 2011-10-27 13:51:21 -04:00
Jean-Marc Valin
294bfec27b Implements hard CBR for SILK
This is achieved by running the encoding process in a loop and
padding when we don't reach the exact rate. It also implements
VBR-with-cap, which means we no longer need to artificially decrease
the SILK bandwidth when it's close to the cap.
2011-10-20 00:39:41 -04:00
Jean-Marc Valin
4541bbcc22 Gets rid of a "safe" signed overflow in silk_DIV32_varQ() 2011-10-11 16:47:00 -04:00
Gregory Maxwell
59647fdb65 Misc. silk/ cleanups: static inline things which are only used in one file. 2011-09-28 16:28:18 -04:00
Gregory Maxwell
744e59ef01 More resampler cleanups. 2011-09-28 12:40:06 -04:00
Gregory Maxwell
96739ad35e Eliminate function pointers from the resampler. 2011-09-28 02:26:45 -04:00
Jean-Marc Valin
1ee139bca0 Making the left shift macros use unsigned to avoid undefined behaviour
Result should be bit-identical on most machines/compilers, minus the undefined
behaviour
2011-09-23 13:08:04 -04:00
Koen Vos
cc34050455 Fixes an integer overflow caused by uninitialized values in LTP scaling 2011-09-21 14:50:17 -04:00
Jean-Marc Valin
1c2f5633d1 Removed all the silk_ prefixes in source file names (not symbols) 2011-09-16 01:16:53 -07:00
Renamed from silk/silk_SigProc_FIX.h (Browse further)