Updates from mailing list and other small fixes.
* Bump the document date.
* Mandate that the ID header must complete on the first page (to
remove any ambiguities about this requirement in RFC 3533).
* Remove redundant wording that rillian forgot to remove in 360a4117.
* Split the "Granule Position" section into subsections.
* Move the first paragraph of the "Other Implementation Notes"
section into the "Granule Position" section, add general seeking
implementation guidance, and be specific about the interaction
between pre-roll and pre-skip.
* Retitle the remaining contents of the "Other Implementation Notes"
section to "Packet Size Limits".
* Specify that all the header fields are REQUIRED (and add a
description of the Channel Mapping Table as a whole, so we can
say when it is REQUIRED).
* Specify that implementations MUST NOT reject headers with extra
data if they have an unknown minor version number.
* Add a reference to RFC 3629 (UTF-8).
* Minor formatting adjustments to vorbis-trim and vorbis-mapping
cites.
* Eliminate semicolons and terrible "Else, if" constructs.
parent 3527f9d4c4
commit b3744613b7
1 changed file with 94 additions and 28 deletions
|
@ -51,7 +51,7 @@
|
|||
</address>
|
||||
</author>
|
||||
|
||||
<date day="3" month="July" year="2012"/>
|
||||
<date day="16" month="July" year="2012"/>
|
||||
<area>RAI</area>
|
||||
<workgroup>codec</workgroup>
|
||||
|
||||
|
@ -141,7 +141,7 @@ The first packet in the logical Ogg bitstream MUST contain the identification
|
|||
(ID) header, which uniquely identifies a stream as Opus audio.
|
||||
The format of this header is defined in <xref target="id_header"/>.
|
||||
It MUST be placed alone (without any other packet data) on the first page of
|
||||
the logical Ogg bitstream.
|
||||
the logical Ogg bitstream, and MUST complete on that page.
|
||||
This page MUST have its 'beginning of stream' flag set.
|
||||
</t>
|
||||
<t>
|
||||
|
@ -164,9 +164,9 @@ The value N is specified in the ID header (see
|
|||
logical Ogg bitstream.
|
||||
</t>
|
||||
<t>
|
||||
The first N-1 Opus packets, if any, are packed one after another in sequence
|
||||
into the Ogg packet, using the self-delimiting framing from Appendix B
|
||||
of <xref target="RFCOpus"/>.
|
||||
The first N-1 Opus packets, if any, are packed one after another into the Ogg
|
||||
packet, using the self-delimiting framing from Appendix B of
|
||||
<xref target="RFCOpus"/>.
|
||||
The remaining Opus packet is packed at the end of the Ogg packet using the
|
||||
regular, undelimited framing from Section 3 of <xref target="RFCOpus"/>.
|
||||
All of the Opus packets in a single Ogg packet MUST be constrained to have the
|
||||
|
@ -244,6 +244,7 @@ In order to support capturing a stream that uses discontinuous transmission
|
|||
not transmitted.
|
||||
</t>
|
||||
|
||||
<section anchor="preskip" title="Pre-skip">
|
||||
<t>
|
||||
There is some amount of latency introduced during the decoding process, to
|
||||
allow for overlap in the MDCT modes, stereo mixing in the LP modes, and
|
||||
|
@ -269,7 +270,9 @@ It may also be used to perform sample-accurate cropping of existing encoded
|
|||
This amount need not be a multiple of 2.5 ms, may be smaller than a single
|
||||
packet, or may span the contents of several packets.
|
||||
</t>
|
||||
</section>
|
||||
|
||||
<section anchor="pcm_sample_position" title="PCM Sample Position">
|
||||
<t>
|
||||
The PCM sample position is determined from the granule position using the
|
||||
formula
|
||||
|
@ -306,7 +309,7 @@ In this case, the PCM sample position of the first audio sample to be played
|
|||
<t>
|
||||
Vorbis streams use a granule position smaller than the number of audio samples
|
||||
contained in the first audio data page to indicate that some of those samples
|
||||
must be trimmed from the output. See <xref target="vorbis-trim"/>.
|
||||
must be trimmed from the output (see <xref target="vorbis-trim"/>).
|
||||
However, to do so, Vorbis requires that the first audio data page contains
|
||||
exactly two packets, in order to allow the decoder to perform PCM position
|
||||
adjustments before needing to return any PCM data.
|
||||
|
@ -315,7 +318,9 @@ Opus uses the pre-skip mechanism for this purpose instead, since the encoder
|
|||
large packets in streams with a very large number of channels might not fit on
|
||||
a single page.
|
||||
</t>
|
||||
</section>
|
||||
|
||||
<section anchor="end_trimming" title="End Trimming">
|
||||
<t>
|
||||
The page with the 'end of stream' flag set MAY have a granule position that
|
||||
indicates the page contains less audio data than would normally be returned by
|
||||
|
@ -330,7 +335,10 @@ The remaining samples are discarded.
|
|||
The number of discarded samples SHOULD be no larger than the number decoded
|
||||
from the last packet.
|
||||
</t>
|
||||
</section>
|
||||
|
||||
<section anchor="start_granpos_restrictions"
|
||||
title="Restrictions on the Initial Granule Position">
|
||||
<t>
|
||||
The granule position of the first audio data page with a completed packet MAY
|
||||
be larger than the number of samples contained in packets that complete on
|
||||
|
@ -367,6 +375,32 @@ This would indicate that more samples should be skipped from the initial
|
|||
</t>
|
||||
</section>
|
||||
|
||||
<section anchor="seeking_and_preroll" title="Seeking and Pre-roll">
|
||||
<t>
|
||||
Seeking in Ogg files is best performed using a bisection search for a page
|
||||
whose granule position corresponds to a PCM position at or before the seek
|
||||
target.
|
||||
With appropriately weighted bisection, accurate seeking can be performed with
|
||||
just three or four bisections even in multi-gigabyte files.
|
||||
See <xref target="seeking"/> for general implementation guidance.
|
||||
</t>
|
||||
|
||||
<t>
|
||||
When seeking within an Ogg Opus stream, the decoder SHOULD start decoding (and
|
||||
discarding the output) at least 3840 samples (80 ms) prior to the
|
||||
seek target in order to ensure that the output audio is correct by the time it
|
||||
reaches the seek target.
|
||||
This 'pre-roll' is separate from, and unrelated to, the 'pre-skip' used at the
|
||||
beginning of the stream.
|
||||
If the point 80 ms prior to the seek target comes before the initial PCM
|
||||
sample position, the decoder SHOULD start decoding from the beginning of the
|
||||
stream, applying pre-skip as normal, regardless of whether the pre-skip is
|
||||
larger or smaller than 80 ms.
|
||||
</t>
|
||||
</section>
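<!--
Illustrative sketch, not part of the draft: the pre-roll rule above can be expressed as a small helper. The function name, its arguments, and the convention that all positions are PCM sample positions at 48 kHz are assumptions made for this example.

```python
from typing import Tuple

PRE_ROLL_SAMPLES = 3840  # 80 ms at 48 kHz, the minimum pre-roll named in the text


def decode_start(seek_target: int, initial_pcm_position: int = 0) -> Tuple[int, bool]:
    """Return (position to start decoding at, whether to restart from stream start).

    Hypothetical helper: both arguments are PCM sample positions.
    """
    start = seek_target - PRE_ROLL_SAMPLES
    if start < initial_pcm_position:
        # The pre-roll point lies before the first playable sample, so decode
        # from the beginning of the stream and apply pre-skip as normal.
        return initial_pcm_position, True
    return start, False
```

A player would then locate a page at or before the returned position, decode while discarding output, and begin playback at the seek target.
-->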
|
||||
|
||||
</section>
|
||||
|
||||
<section anchor="headers" title="Header Packets">
|
||||
<t>
|
||||
An Opus stream contains exactly two mandatory header packets.
|
||||
|
@ -473,12 +507,12 @@ The original sample rate of the encoder input is not preserved by the lossy
|
|||
An Ogg Opus player SHOULD select the playback sample rate according to the
|
||||
following procedure:
|
||||
<list style="numbers">
|
||||
<t>If the hardware supports 48 kHz playback, decode at 48 kHz;</t>
|
||||
<t>Else, if the hardware's highest available sample rate is a supported
|
||||
rate, decode at this sample rate;</t>
|
||||
<t>Else, if the hardware's highest available sample rate is less than
|
||||
48 kHz, decode at the highest supported rate above this and resample;</t>
|
||||
<t>Else, decode at 48 kHz and resample.</t>
|
||||
<t>If the hardware supports 48 kHz playback, decode at 48 kHz.</t>
|
||||
<t>Otherwise, if the hardware's highest available sample rate is a supported
|
||||
rate, decode at this sample rate.</t>
|
||||
<t>Otherwise, if the hardware's highest available sample rate is less than
|
||||
48 kHz, decode at the highest supported rate above this and resample.</t>
|
||||
<t>Otherwise, decode at 48 kHz and resample.</t>
|
||||
</list>
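<!--
Illustrative sketch, not part of the draft: the four-step procedure above written as a function. The set of decoder-supported rates (8, 12, 16, 24, and 48 kHz) is taken from the Opus codec specification rather than from this hunk, and the function name and return shape are assumptions made for this example.

```python
OPUS_RATES = (8000, 12000, 16000, 24000, 48000)  # assumed decoder-supported rates, in Hz


def select_decode_rate(hardware_rates):
    """Return (decode_rate_hz, needs_resample) for a non-empty set of hardware rates."""
    if 48000 in hardware_rates:
        return 48000, False              # step 1: hardware plays 48 kHz directly
    highest = max(hardware_rates)
    if highest in OPUS_RATES:
        return highest, False            # step 2: highest hardware rate is supported
    if highest < 48000:
        # step 3: decode at the highest supported rate above the hardware rate
        return min(r for r in OPUS_RATES if r > highest), True
    return 48000, True                   # step 4: decode at 48 kHz and resample
```

For hardware limited to 44.1 kHz, this picks 48 kHz decoding followed by resampling.
-->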
|
||||
However, the 'Input Sample Rate' field allows the encoder to pass the sample
|
||||
rate of the original input stream as metadata.
|
||||
|
@ -542,9 +576,28 @@ Each possible value of this octet indicates a mapping family, which defines a
|
|||
allowed channel count.
|
||||
The details are described in <xref target="channel_mapping"/>.
|
||||
</t>
|
||||
<t><spanx style="strong">Channel Mapping Table</spanx>:
|
||||
This table defines the mapping from encoded streams to output channels.
|
||||
It is omitted when the channel mapping family is 0, but REQUIRED otherwise.
|
||||
Its contents are specified in <xref target="channel_mapping"/>.
|
||||
</t>
|
||||
</list>
|
||||
</t>
|
||||
|
||||
<t>
|
||||
All fields in the ID headers are REQUIRED, except for the channel mapping
|
||||
table, which is omitted when the channel mapping family is 0.
|
||||
Implementations SHOULD reject ID headers which do not contain enough data for
|
||||
these fields, even if they contain a valid Magic Signature.
|
||||
Future versions of this specification, even backwards-compatible versions,
|
||||
might include additional fields in the ID header.
|
||||
If an ID header has a compatible major version, but a larger minor version,
|
||||
an implementation MUST NOT reject it for containing additional data not
|
||||
specified here.
|
||||
However, implementations MAY reject streams in which the ID header does not
|
||||
complete on the first page.
|
||||
</t>
|
||||
|
||||
<section anchor="channel_mapping" title="Channel Mapping">
|
||||
<t>
|
||||
An Ogg Opus stream allows mapping one number of Opus streams (N) to a possibly
|
||||
|
@ -658,9 +711,8 @@ When the 'channel mapping family' octet has this value, the channel mapping
|
|||
<vspace blankLines="1"/>
|
||||
Allowed numbers of channels: 1...8.<vspace/>
|
||||
Channel meanings depend on the number of channels.
|
||||
See <xref target="vorbis-mapping">the
|
||||
Vorbis mapping</xref> for the assignments from output channel number to
|
||||
specific speaker locations.
|
||||
See <xref target="vorbis-mapping"/> for the assignments from output channel
|
||||
number to specific speaker locations.
|
||||
<vspace blankLines="1"/>
|
||||
</t>
|
||||
<t>Family 255 (no defined channel meaning):
|
||||
|
@ -756,13 +808,13 @@ It MUST NOT indicate that the vendor string is longer than the rest of the
|
|||
<t><spanx style="strong">Vendor String</spanx> (variable length, UTF-8 vector):
|
||||
<vspace blankLines="1"/>
|
||||
This is a simple human-readable tag for vendor information, encoded as a UTF-8
|
||||
string.
|
||||
string <xref target="RFC3629"/>.
|
||||
No terminating NUL octet is required.
|
||||
<vspace blankLines="1"/>
|
||||
This tag is intended to identify the codec encoder and encapsulation
|
||||
implementations, for tracing differences in technical behavior. The
|
||||
user-facing encoding application can use the 'ENCODER' user comment
|
||||
tag name to identify themselves.
|
||||
implementations, for tracing differences in technical behavior.
|
||||
The user-facing encoding application can use the 'ENCODER' user comment tag
|
||||
name to identify itself.
|
||||
<vspace blankLines="1"/>
|
||||
</t>
|
||||
<t><spanx style="strong">User Comment List Length</spanx> (32 bits, unsigned,
|
||||
|
@ -794,6 +846,17 @@ There is one for each user comment indicated by the 'user comment list length'
|
|||
</list>
|
||||
</t>
|
||||
|
||||
<t>
|
||||
The vendor string length and user comment list length are REQUIRED, and
|
||||
implementations SHOULD reject comment headers that do not contain enough data
|
||||
for these fields, or that do not contain enough data for the corresponding
|
||||
vendor string or user comments they describe.
|
||||
Making this check before allocating the associated memory to contain the data
|
||||
may help prevent a possible Denial-of-Service (DoS) attack from small comment
|
||||
headers that claim to contain strings longer than the entire packet or more
|
||||
user comments than could possibly fit in the packet.
|
||||
</t>
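<!--
Illustrative sketch, not part of the draft: one way to make the length checks described above before allocating anything. The layout assumed here (an 8-octet "OpusTags" magic signature followed by 32-bit little-endian length fields) is not shown in this hunk, and the function name is hypothetical.

```python
import struct


def comment_header_is_plausible(packet: bytes) -> bool:
    """Check that every declared length fits inside the packet."""
    if packet[:8] != b"OpusTags":        # assumed magic signature
        return False
    pos = 8
    if len(packet) - pos < 4:
        return False
    (vendor_len,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    if vendor_len > len(packet) - pos:   # vendor string longer than the packet
        return False
    pos += vendor_len
    if len(packet) - pos < 4:
        return False
    (count,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    # Each user comment needs at least its own 4-octet length field.
    if count > (len(packet) - pos) // 4:
        return False
    for _ in range(count):
        if len(packet) - pos < 4:
            return False
        (length,) = struct.unpack_from("<I", packet, pos)
        pos += 4
        if length > len(packet) - pos:
            return False
        pos += length
    return True
```

Because every comparison is against the bytes actually present, a small packet that claims huge lengths is rejected without any allocation proportional to the claimed size.
-->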
|
||||
|
||||
<t>
|
||||
The user comment strings follow the NAME=value format described by
|
||||
<xref target="vorbis-comment"/> with the same recommended tag names.
|
||||
|
@ -836,19 +899,11 @@ There is no Opus comment tag corresponding to REPLAYGAIN_ALBUM_GAIN.
|
|||
That information should instead be stored in the ID header's 'output gain'
|
||||
field.
|
||||
</t>
|
||||
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
<section anchor="other_implementation_notes"
|
||||
title="Other Implementation Notes">
|
||||
<t>
|
||||
When seeking within an Ogg Opus stream, the decoder should start decoding (and
|
||||
discarding the output) at least 3840 samples (80 ms) prior to the
|
||||
seek point in order to ensure that the output audio is correct at the seek
|
||||
point.
|
||||
</t>
|
||||
<section anchor="packet_size_limits" title="Packet Size Limits">
|
||||
<t>
|
||||
Technically valid Opus packets can be arbitrarily large due to the padding
|
||||
format, although the amount of non-padding data they can contain is bounded.
|
||||
|
@ -978,6 +1033,7 @@ The authors agree to grant third parties the irrevocable right to copy, use,
|
|||
<references title="Normative References">
|
||||
|
||||
<?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>
|
||||
<?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3629.xml"?>
|
||||
<?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3533.xml"?>
|
||||
<?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.5334.xml"?>
|
||||
<?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.6381.xml"?>
|
||||
|
@ -1034,6 +1090,16 @@ The authors agree to grant third parties the irrevocable right to copy, use,
|
|||
<!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml"?-->
|
||||
<?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.4732.xml"?>
|
||||
|
||||
<reference anchor="seeking"
|
||||
target="http://wiki.xiph.org/Seeking">
|
||||
<front>
|
||||
<title>Granulepos Encoding and How Seeking Really Works</title>
|
||||
<author initials="S." surname="Pfeiffer" fullname="Silvia Pfeiffer"/>
|
||||
<author initials="C." surname="Parker" fullname="Conrad Parker"/>
|
||||
<author initials="G." surname="Maxwell" fullname="Greg Maxwell"/>
|
||||
</front>
|
||||
</reference>
|
||||
|
||||
<reference anchor="replay-gain"
|
||||
target="http://wiki.xiph.org/VorbisComment#Replay_Gain">
|
||||
<front>
|
||||
|
|