From e4689464eb5f1cc33f958a119deb3669919c529d Mon Sep 17 00:00:00 2001
From: "Timothy B. Terriberry"
Date: Tue, 24 Apr 2012 00:37:04 -0400
Subject: [PATCH] Addressing AD issues

Including a description of the PVQ encoder and decoder
---
 doc/draft-ietf-codec-opus.xml | 75 +++++++++++++++++++++++++++++++----
 1 file changed, 67 insertions(+), 8 deletions(-)

diff --git a/doc/draft-ietf-codec-opus.xml b/doc/draft-ietf-codec-opus.xml
index 25743809..9aa3c4ec 100644
--- a/doc/draft-ietf-codec-opus.xml
+++ b/doc/draft-ietf-codec-opus.xml
@@ -943,7 +943,8 @@ A receiver MUST NOT process packets which violate any of the rules above as They are reserved for future applications, such as in-band headers (containing metadata, etc.). Packets which violate these constraints may cause implementations of
- this specification to treat them as malformed, and discard them.
+ this specification to treat them as malformed, and
+ discard them.
 These constraints are summarized here for reference:
@@ -1983,6 +1984,7 @@ w0_Q13 = w_Q13[wi0] ]]> N.b., w1_Q13 is computed first here, because w0_Q13 depends on it.
+The constant 6554 is approximately 0.1 in Q16.
 > ((32-i)>>1) w_Q9[k] = y + ((213*f*y)>>16) ]]>
+The constant 46214 here is approximately the square root of 2 in Q15.
 The cb1_Q8[] vector completely determines these weights, and they may be tabulated and stored as 13-bit unsigned values (with a range of 1819 to 5227, inclusive) to avoid computing them when decoding.
@@ -3453,6 +3457,7 @@ a32_Q24[d_LPC-1][n] = a32_Q12[n] << 12 . Then for each k from d_LPC-1 down to 0, if abs(a32_Q24[k][k]) > 16773022, the filter is unstable and the recurrence stops.
+The constant 16773022 here is approximately 0.99975 in Q24.
 Otherwise, row k-1 of a32_Q24 is computed from row k as
@@ -4566,7 +4571,7 @@ For unvoiced frames, the LPC residual for
@@ -5060,14 +5065,14 @@ total_bits, and set dynalloc_loop_log to 1. When the while loop finishes boost contains the boost for this band. If boost is non-zero and dynalloc_logp is greater than 2, decrease dynalloc_logp. Once this process has been executed on all bands, the band boosts have been decoded. This procedure
-is implemented around line 2352 of celt.c.
+is implemented around line 2469 of celt.c.
 At very low rates it is possible that there won't be enough available space to execute the inner loop even once. In these cases band boost is not possible but its overhead is completely eliminated. Because of the high cost of band boost when activated, a reasonable encoder should not be using it at very low rates. The reference implements its dynalloc decision
-logic around line 1269 of celt.c.
+logic around line 1299 of celt.c.
 The allocation trim is a integer value from 0-10. The default value of 5 indicates no trim. The trim parameter is entropy coded in order to
@@ -5079,7 +5084,12 @@ available in the bitstream. To decode the trim, first set the trim value to 5, then iff the count of decoded 8th bits so far (ec_tell_frac) plus 48 (6 bits) is less than or equal to the total frame size in 8th bits minus total_boost (a product of the above band boost procedure),
-decode the trim value using the inverse CDF {127, 126, 124, 119, 109, 87, 41, 19, 9, 4, 2, 0}.
+decode the trim value using the PDF in .
+
+
+PDF
+{1, 1, 2, 5, 10, 22, 46, 22, 10, 5, 2, 2}/128
+
 For 10 ms and 20 ms frames using short blocks and that have at least LM+2 bits left prior to the allocation process, then one anti-collapse bit is reserved in the allocation process so it can
@@ -5188,7 +5198,30 @@ they are equivalent to the mathematical definition.
-The decoded vector is normalized such that its
+The decoded vector X is recovered as follows.
+Let i be the index decoded with the procedure in
+ with ft = V(N,K), so that 0 <= i < V(N,K).
+Let k = K.
+Then for j = 0 to (N - 1), inclusive, do:
+
+Let p = (V(N-j-1,k) + V(N-j,k))/2.
+
+If i < p, then let sgn = 1, else let sgn = -1
+ and set i = i - p.
+
+Let k0 = k and set p = p - V(N-j-1,k).
+
+While p > i, set k = k - 1 and
+ p = p - V(N-j-1,k).
+
+
+Set X[j] = sgn*(k0 - k) and i = i - p.
+
+
+
+
+
+The decoded vector X is then normalized such that its
 L2-norm equals one.
@@ -7204,6 +7237,32 @@ codebook and the implementers MAY use any other search methods. See alg_quant()
+
+
+The vector to encode, X, is converted into an index i such that
+ 0 <= i < V(N,K) as follows.
+Let i = 0 and k = 0.
+Then for j = (N - 1) down to 0, inclusive, do:
+
+
+If k > 0, set
+ i = i + (V(N-j-1,k-1) + V(N-j,k-1))/2.
+
+Set k = k + abs(X[j]).
+
+If X[j] < 0, set
+ i = i + (V(N-j-1,k) + V(N-j,k))/2.
+
+
+
+
+
+The index i is then encoded using the procedure in
+ with ft = V(N,K).
+
+
+
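
The PVQ index conversion steps added by this patch can be sanity-checked with a short sketch. The Python below is illustrative only and is not part of the patch or of the reference implementation. The function names pvq_decode and pvq_encode are hypothetical. V(N,K) is assumed to be the codebook size given by the recurrence V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1), with V(N,0) = 1 and V(0,K) = 0 for K != 0. The range coder step (with ft = V(N,K)) and the final L2 normalization are left out.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def V(n, k):
    """Assumed codebook size: number of integer vectors of length n
    whose absolute values sum to k."""
    if k == 0:
        return 1
    if n == 0:
        return 0
    return V(n - 1, k) + V(n, k - 1) + V(n - 1, k - 1)


def pvq_decode(i, N, K):
    """Hypothetical helper: turn index i (0 <= i < V(N,K)) into the
    integer pulse vector X, following the decoder steps in the patch."""
    if not 0 <= i < V(N, K):
        raise ValueError("index out of range")
    X = [0] * N
    k = K
    for j in range(N):
        # The sum is always even here, so integer division is exact.
        p = (V(N - j - 1, k) + V(N - j, k)) // 2
        if i < p:
            sgn = 1
        else:
            sgn = -1
            i -= p
        k0 = k
        p -= V(N - j - 1, k)
        while p > i:
            k -= 1
            p -= V(N - j - 1, k)
        X[j] = sgn * (k0 - k)
        i -= p
    return X


def pvq_encode(X):
    """Hypothetical helper: turn the integer pulse vector X into its
    index, following the encoder steps in the patch."""
    N = len(X)
    i = 0
    k = 0
    for j in range(N - 1, -1, -1):
        if k > 0:
            i += (V(N - j - 1, k - 1) + V(N - j, k - 1)) // 2
        k += abs(X[j])
        if X[j] < 0:
            i += (V(N - j - 1, k) + V(N - j, k)) // 2
    return i
```

Under these assumptions, decoding every index for a small N and K and re-encoding the result should return the original index, which is a convenient way to test an implementation of either direction against the other.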