Copyright © 2026, Society of Motion Picture and Television Engineers. All rights reserved. No part of this material may be reproduced, by any means whatsoever, without the prior written permission of the Society of Motion Picture and Television Engineers.
The Society of Motion Picture and Television Engineers (SMPTE) is an internationally-recognized standards developing organization. Headquartered and incorporated in the United States of America, SMPTE has members in over 80 countries on six continents. SMPTE's Engineering Documents, including Standards, Recommended Practices, and Engineering Guidelines, are prepared by SMPTE's Technology Committees. Participation in these Committees is open to all with a bona fide interest in their work. SMPTE cooperates closely with other standards-developing organizations, including ISO, IEC and ITU. SMPTE Engineering Documents are drafted in accordance with the rules given in its Standards Operations Manual.
For more information, please visit www.smpte.org.
At the time of publication no notice had been received by SMPTE claiming patent rights essential to the implementation of this Engineering Document. However, attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. SMPTE shall not be held responsible for identifying any or all such patent rights.
This document was prepared by Technology Committee 35PM.
The following summarizes the substantive changes made from SMPTE ST 2067-201:2021:
The addition of the IABMaxObjectCount item and IAB Channel SubDescriptor allows for the exposure of technical metadata to IMF/MXF parsers, without needing to access and decode the Immersive Audio Bitstream. For example, use of IAB Channel SubDescriptors allows for the identification of encoded bed configurations and their content types.
Adoption of MCA Content and MCA Use Class, in combination with the associated controlled vocabularies, further enhances content identification capabilities. Use of MCA Audio Content Kind and MCA Audio Element Kind is no longer recommended.
The deviation from SMPTE ST 377-1:2019 regarding the calculation and interpretation of Index Stream Offsets reflects the actual practice amongst implementations of this specification.
This clause is entirely informative and does not form an integral part of this Engineering Document.
SMPTE ST 2098-2:2022 has proven (i) effective in representing cinematic immersive sound essence, and (ii) straightforward to wrap in MXF. The objective of this standard is to define a baseline method for the carriage of such sound essence for use with feature and episodic content in IMF compositions (Level 0).
This document specifies requirements for a plug-in mechanism to add Immersive Audio Bitstream (IAB) essence, as specified in SMPTE ST 2098-2:2022, to IMF compositions.
Normative text is text that describes elements of the design that are indispensable or contains the conformance language keywords: "shall", "should", or "may". Informative text is text that is potentially helpful to the user, but not indispensable, and can be removed, changed, or added editorially without affecting interoperability. Informative text does not contain any conformance keywords.
All text in this document is, by default, normative, except: the Introduction, any clause explicitly labeled as "Informative", or individual paragraphs that start with "Note:".
The keywords "shall" and "shall not" indicate requirements strictly to be followed in order to conform to the document and from which no deviation is permitted.
The keywords "should" and "should not" indicate that, among several possibilities, one is recommended as particularly suitable, without mentioning or excluding others; or that a certain course of action is preferred but not necessarily required; or that (in the negative form) a certain possibility or course of action is deprecated but not prohibited.
The keywords "may" and "need not" indicate courses of action permissible within the limits of the document.
The keyword "reserved" indicates a provision that is not defined at this time, shall not be used, and may be defined in the future. The keyword "forbidden" indicates "reserved" and in addition indicates that the provision will never be defined in the future.
A conformant implementation according to this document is one that includes all mandatory provisions ("shall") and, if implemented, all recommended provisions ("should") as described. A conformant implementation need not implement optional provisions ("may") and need not implement them as described.
Unless otherwise specified, the order of precedence of the types of normative information in this document shall be as follows: Normative prose shall be the authoritative definition; tables shall be next; then formal languages; then figures; and then any other language forms.
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
No terms and definitions are listed in this document.
An IAB Track File is a Track File, as specified in Subclause 5.1 of SMPTE ST 2067-2:2020, which contains SMPTE ST 2098-2:2022 Immersive Audio Bitstream essence, and is further constrained by the provisions of Clause 5.
Annex A introduces mechanisms to augment the contents of IAB Track Files with additional metadata.
The ConformsToSpecifications element specified in Annex B shall be present and contain a single instance of the Universal Label (UL) specified in Table 1.
| Kind | Leaf |
|---|---|
| Name | IMF IAB Track File Level 0 |
| Symbol | IMF_IABTrackFileLevel0 |
| UL | urn:smpte:ul:060E2B34.0401010D.01010201.02000000 |
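As an informal illustration of the requirement above, the following Python sketch converts the URN form of the Table 1 label to its 16-byte value and checks that a decoded ConformsToSpecifications batch contains it exactly once. The function names and the representation of the batch as a list of 16-byte values are assumptions made for this example, not part of this document.

```python
# UL from Table 1 (IMF IAB Track File Level 0).
IMF_IAB_TRACK_FILE_LEVEL0 = "urn:smpte:ul:060E2B34.0401010D.01010201.02000000"


def ul_to_bytes(urn: str) -> bytes:
    """Convert a 'urn:smpte:ul:' string to its 16-byte binary form."""
    prefix = "urn:smpte:ul:"
    if not urn.lower().startswith(prefix):
        raise ValueError("not a SMPTE UL URN")
    value = bytes.fromhex(urn[len(prefix):].replace(".", ""))
    if len(value) != 16:
        raise ValueError("a UL is exactly 16 bytes long")
    return value


def declares_iab_level0(conforms_to: list[bytes]) -> bool:
    """True if the (hypothetical) decoded batch holds the label exactly once."""
    return conforms_to.count(ul_to_bytes(IMF_IAB_TRACK_FILE_LEVEL0)) == 1
```

The sketch compares all 16 bytes; an implementation might choose to ignore the registry version byte when comparing ULs, which is not shown here.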
The Essence Track of the IAB Track File shall reference an Essence Container, as defined in SMPTE ST 377-1:2019, that contains a single Immersive Audio Bitstream as specified in SMPTE ST 2098-2:2022.
The following shall apply to the Essence Track referencing the Immersive Audio Bitstream essence:
NOTE: The first requirement above determines the value of the Data Definition item of the Sequence referenced by the Sound Essence Track.
The Immersive Audio Bitstream shall be contained in a single Sound Element, as defined in SMPTE ST 377-1:2019, whose Essence Element Key shall be as specified in Table 2.
| Kind | Leaf |
|---|---|
| Name | IMF IAB Essence Clip-Wrapped Element |
| Symbol | IMF_IABEssenceClipWrappedElement |
| UL | urn:smpte:ul:060E2B34.01020101.0D010301.16cc0Dnn |
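The following Python sketch shows one way a parser might recognize the Essence Element Key of Table 2. It assumes that `cc` and `nn` in the final four bytes are variable bytes (the usual Generic Container convention for element count and element number) and therefore does not compare them; that reading, and the function name, are assumptions of this example.

```python
# Fixed leading 12 bytes of the Table 2 key: 060E2B34.01020101.0D010301
FIXED_PREFIX = bytes.fromhex("060E2B34010201010D010301")


def matches_iab_element_key(key: bytes) -> bool:
    """Compare a 16-byte key against the pattern ...16cc0Dnn.

    Bytes 14 (cc) and 16 (nn) are treated as wildcards (assumption).
    """
    if len(key) != 16:
        return False
    return (
        key[:12] == FIXED_PREFIX
        and key[12] == 0x16  # byte 13: fixed value 0x16
        and key[14] == 0x0D  # byte 15: fixed value 0x0D
    )
```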
NOTE: As audio essence, the Immersive Audio Bitstream is clip-wrapped as specified in SMPTE ST 2067-5.
The following characteristics of the Immersive Audio Bitstream shall remain constant in the Essence Container:
Subclauses 5.6.3.2 through 5.6.3.5 describe IAB bed-related elements that are fixed in configuration, and fields within these elements that are fixed.
All IAFrames in a given Immersive Audio Bitstream shall have the same number of BedDefinition elements. For each BedDefinition element in a given IAFrame, there shall be exactly one BedDefinition element in each additional IAFrame that has the same MetaID value. BedDefinition elements with the same MetaID shall have the same value(s) for ConditionalBed, BedUseCase (if present), ChannelCount, ChannelID, AudioDescription, and AudioDescriptionText (if present). No element which is a child of an ObjectDefinition or BedDefinition shall be used.
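The BedDefinition constraints above can be checked mechanically once the bitstream has been parsed. The Python sketch below assumes a hypothetical, already-decoded representation in which each IAFrame is a list of `Bed` records; the record layout and names are illustrative only and are not defined by this document or by SMPTE ST 2098-2:2022. ChannelCount is represented implicitly by the length of the ChannelID tuple.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bed:
    """Hypothetical decoded view of the constrained BedDefinition fields."""
    meta_id: int
    conditional_bed: int
    bed_use_case: Optional[int]           # None when absent
    channel_ids: tuple                    # ChannelCount == len(channel_ids)
    audio_description: int
    audio_description_text: Optional[str]  # None when absent


def beds_are_constant(frames: list[list[Bed]]) -> bool:
    """True if every IAFrame carries the same beds with the same field values."""
    if not frames:
        return True
    reference = {bed.meta_id: bed for bed in frames[0]}
    if len(reference) != len(frames[0]):
        return False  # duplicate MetaID in the first IAFrame
    for frame in frames[1:]:
        if len(frame) != len(reference):
            return False  # different number of BedDefinition elements
        if {bed.meta_id: bed for bed in frame} != reference:
            return False  # missing MetaID or changed field value
    return True
```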
BedRemap elements shall not be present in the Immersive Audio Bitstream.
No conditional elements (i.e., BedDefinition elements that have ConditionalBed set to 1 or ObjectDefinition elements that have ConditionalObject set to 1) shall be used unless their UseCase is set to 0xFF (Always Use).
ObjectDefinition elements with the same MetaID shall have the same value(s) for ConditionalObject, ObjectUseCase (if present), AudioDescription, and AudioDescriptionText (if present).
As defined in Subclause 11.2.3 of SMPTE ST 377-1:2019, the Index Edit Rate shall be equal to the Edit Rate of the MXF Essence Track, as defined in 5.4.
Each Index Entry value shall mark the start of an Edit Unit of stored Essence, including the Key and Length (KL) at the start of the Essence Container in each Partition.
NOTE: This is a deviation from Subclause 11.1.4 of SMPTE ST 377-1:2019, which excludes the KL from the Stream Offset calculation to enable use of a single Index Entry to index constant byte per element essence.
EXAMPLE: If the size of the KL of the Key-Length-Value (KLV) packet that contains the clip-wrapped essence container is 20 bytes, the Stream Offset of the first edit unit is 20.
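The offset rule can be sketched in Python as follows. The sketch assumes that the Edit Units are stored back to back inside the value of a single clip-wrapped KLV packet and that their stored sizes are already known; the edit unit sizes used below are made-up illustrative values.

```python
def stream_offsets(kl_size: int, edit_unit_sizes: list[int]) -> list[int]:
    """Return the Stream Offset of each Edit Unit, counting the leading KL."""
    offsets = []
    position = kl_size  # the first Edit Unit starts right after the KL
    for size in edit_unit_sizes:
        offsets.append(position)
        position += size
    return offsets


# 16-byte Key plus a 4-byte BER Length gives the 20-byte KL of the EXAMPLE.
print(stream_offsets(20, [1000, 1200, 900]))  # [20, 1020, 2220]
```

Under the rule of SMPTE ST 377-1:2019 that this clause deviates from, the same Edit Units would instead be indexed at 0, 1000 and 2200.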
The IAB Essence Descriptor is a subclass of Generic Sound Essence Descriptor, as defined in SMPTE ST 377-1:2019 and extended in SMPTE ST 2067-2:2020, and contains the items specified in Table 3 with the Set Key specified in Table 4.
| Name | Type | Len | Local Tag | UL Designator | Req | Definition |
|---|---|---|---|---|---|---|
| IAB Essence Descriptor | Set Key | 16 | Dyn | per Table 4 | Req | Identifies an Immersive Audio Bitstream Essence Descriptor |
| Length | BER length | | | | Req | Set length |
| All items in Generic Sound Essence Descriptor, as specified in SMPTE ST 377-1:2019 and extended in SMPTE ST 2067-2:2020, except Group UL and Length | | | | | | |
| IABMaxObjectCount | Uint16 | 2 | Dyn | urn:smpte:ul:060e2b34.0101010e.0402030c.05000000 | Opt | Maximum count of Object Definitions per IAFrame in the IAB |
| Kind | Leaf |
|---|---|
| Name | IAB Essence Descriptor |
| Symbol | IABEssenceDescriptor |
| UL | urn:smpte:ul:060E2B34.027F0101.0D010101.01017B00 |
NOTE: The value of byte 6 (0x7F) in the IAB Essence Descriptor Key is a placeholder that is used in the registers, but is not used in actual implementations. Please consult SMPTE ST 377-1:2019, Clause 9 for the proper value(s).
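Because byte 6 of the registered key is a placeholder, a parser cannot compare the key found in a file against the Table 4 value byte for byte. The Python sketch below shows one possible comparison that skips byte 6. The function name is hypothetical, and the value 0x53 in the usage example is only an assumed illustration of a key as it might appear in a file; SMPTE ST 377-1:2019, Clause 9 remains the authority for the permitted values.

```python
# Table 4 key as registered, with the 0x7F placeholder in byte 6.
IAB_ESSENCE_DESCRIPTOR_KEY = bytes.fromhex("060E2B34027F01010D01010101017B00")


def same_set_key(candidate: bytes, registered: bytes) -> bool:
    """Compare two 16-byte set keys while ignoring byte 6 (index 5)."""
    if len(candidate) != 16 or len(registered) != 16:
        return False
    return candidate[:5] == registered[:5] and candidate[6:] == registered[6:]


# Key as it might appear in a file (byte 6 value is an assumption).
in_file = bytes.fromhex("060E2B34025301010D01010101017B00")
print(same_set_key(in_file, IAB_ESSENCE_DESCRIPTOR_KEY))  # True
```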
If present, the IABMaxObjectCount item shall indicate the maximum number of IAB Object Definitions in any given IAFrame within the Immersive Audio Bitstream.
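A minimal Python sketch of how a writer might derive this value follows. It assumes the number of Object Definitions in each IAFrame has already been counted by an IAB parser (not shown), and encodes the result as a big-endian Uint16 to match the 2-byte length given in Table 3; the function names are illustrative.

```python
import struct


def iab_max_object_count(objects_per_frame: list[int]) -> int:
    """Maximum Object Definition count over all IAFrames (0 if no frames)."""
    value = max(objects_per_frame, default=0)
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in a Uint16")
    return value


def encode_uint16(value: int) -> bytes:
    """MXF integers are stored big-endian."""
    return struct.pack(">H", value)


print(iab_max_object_count([3, 7, 5]))  # 7
```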
A single IAB Essence Descriptor, as specified in 5.8, shall be associated with the Essence Track.
The Sample Rate item of the IAB Essence Descriptor, as inherited from File Descriptor, shall be set to a value corresponding to the frame rate represented by the code value of the first instance of FrameRate in the Immersive Audio Bitstream, as defined in Subclause 10.2.4 of SMPTE ST 2098-2:2022.
NOTE 1: FrameRate can occur multiple times within an Immersive Audio Bitstream, but is constrained by 5.6, above, to remain constant throughout the bitstream.
The Essence Container Label item of the IAB Essence Descriptor, as inherited from File Descriptor, shall be as specified in Table 5.
| Kind | Leaf |
|---|---|
| Name | IMF Clip-Wrapped IAB Essence Container |
| Symbol | IMF_IABEssenceClipWrappedContainer |
| UL | urn:smpte:ul:060E2B34.0401010D.0D010301.021D0101 |
The IMF Clip-Wrapped IAB Essence Container label is a child of the MXF-GC IAB Audio node label, which is a child of the MXF-GC Immersive Audio node label. The MXF-GC IAB Audio and MXF-GC Immersive Audio node labels are defined in Annex D of this specification.
The Codec item of the IAB Essence Descriptor, as inherited from File Descriptor, shall not be present.
The Sound Essence Coding item of the IAB Essence Descriptor, as inherited from Generic Sound Essence Descriptor, shall be set to the UL value specified in Table 6, which is defined in SMPTE ST 429-18 and has the meaning: Immersive Audio Coding per SMPTE ST 2098-2:2022.
| urn:smpte:ul:060E2B34.04010105.0E090604.00000000 |
The Audio Sampling Rate item of the IAB Essence Descriptor shall be present and shall be set to a value corresponding to the sampling frequency of the audio sample data represented by the code value of the first instance of SampleRate in the Immersive Audio Bitstream, as defined in Subclause 10.2.2 of SMPTE ST 2098-2:2022.
NOTE 2: SampleRate can occur multiple times within an Immersive Audio Bitstream, but is constrained by 5.6, above, to remain constant throughout the bitstream.
The Locked/Unlocked item of the IAB Essence Descriptor shall be ignored.
The Audio Ref Level item of the IAB Essence Descriptor shall be ignored.
If present, the Electro-Spatial Formulation item of the IAB Essence Descriptor shall be set to a value of 15 (multi-channel mode default).
The ChannelCount item of the IAB Essence Descriptor shall be ignored.
The Quantization Bits item in the IAB Essence Descriptor shall be set to a value corresponding to the bit depth of the audio sample data represented by the code value of the first instance of BitDepth in the Immersive Audio Bitstream, as defined in Subclause 10.2.3 of SMPTE ST 2098-2:2022.
NOTE 3: BitDepth can occur multiple times within an Immersive Audio Bitstream, but is constrained by 5.6, above, to remain constant throughout the bitstream.
The Dial Norm item in the IAB Essence Descriptor shall be ignored.
The Reference Image Edit Rate item in the IAB Essence Descriptor, as defined in Clause E.2 of SMPTE ST 2067-2:2020 to extend Generic Sound Essence Descriptor, should be present.
The Reference Audio Alignment Level item in the IAB Essence Descriptor, as defined in Clause E.3 of SMPTE ST 2067-2:2020 to extend Generic Sound Essence Descriptor, should be present.
An IAB Track File shall utilize the MCA Label framework as defined in SMPTE ST 377-4:2021 and further extended in Annex C.
An IAB Track File shall contain exactly one instance of an IAB Soundfield Label SubDescriptor, as defined in Annex C.
An IAB Track File should contain one instance of an IAB Channel SubDescriptor, as defined in Annex E, for each channel of each BedDefinition.
An IAB Track File shall not contain instances of AudioChannelLabelSubDescriptor, SoundfieldGroupLabelSubDescriptor, or GroupOfSoundfieldGroupsLabelSubDescriptor.
Within a given IAB Track File, the constraints of Table 7 shall apply.
| Item | IAB Soundfield Label SubDescriptor Constraints |
|---|---|
| MCA Tag Name | Shall be present. |
| RFC 5646 Spoken Language | Shall be equal to the primary spoken language associated with the IAB soundfield. It shall be absent if and only if the IAB soundfield is not associated with a primary spoken language. |
| MCA Title | Should be present. |
| MCA Title Version | Should be present. |
| MCA Content | Should be present. If present, MCA Content shall be set according to SMPTE ST 377-41:2023, Subclause 5.4. |
| MCA Use Class | Should be present. If present, MCA Use Class shall be set according to SMPTE ST 377-41:2023, Subclause 5.5. |
| MCA Audio Content Kind | May be present. |
| MCA Audio Element Kind | May be present. |
Other items defined for IAB Soundfield Label SubDescriptor, but not required by this specification, may be present and may be safely ignored by implementations.
NOTE 1: MCA Tag Symbol and MCA Tag Name contain human-readable text intended for display to the user. The MCA Label Dictionary ID is used to unambiguously determine the nature of the underlying soundfield.
NOTE 2: MCA Title, MCA Title Version, MCA Content, MCA Use Class, MCA Audio Content Kind and MCA Audio Element Kind contain human-readable descriptive text intended for display to the user.
NOTE 3: MCA Audio Content Kind and MCA Audio Element Kind are legacy items that have been superseded by MCA Content and MCA Use Class, respectively. Use of MCA Audio Content Kind and MCA Audio Element Kind is no longer recommended.
NOTE 4: MCA Content and MCA Use Class are further constrained by SMPTE ST 377-4:2021.
The MCA Label Dictionary ID, MCA Tag Symbol and MCA Tag Name item values shall be set according to Table 8.
| Item | Value |
|---|---|
| MCA Label Dictionary ID | UL of IAB Soundfield defined in Table 9. |
| MCA Tag Symbol | "IAB" |
| MCA Tag Name | "IAB" |
| Kind | Leaf |
|---|---|
| Name | IAB Soundfield |
| Symbol | IABSoundfield |
| UL | urn:smpte:ul:060E2B34.0401010D.03020221.00000000 |
| Description | Identifies an IAB soundfield. |
The object model for the descriptors in an IAB Track File is shown in Figure 1.
A Composition, as defined in SMPTE ST 2067-3:2020, shall contain zero or more IAB Virtual Tracks.
An IAB Virtual Track consists of one or more instances of IABSequence elements as specified in Table 10.
<xs:schema targetNamespace="http://www.smpte-ra.org/ns/2067-201/2019"
    xmlns:cpl="http://www.smpte-ra.org/schemas/2067-3/2016"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:import namespace="http://www.smpte-ra.org/schemas/2067-3/2016" />
  <!-- schema definitions found in this document -->
  <xs:element name="IABSequence" type="cpl:SequenceType"/>
</xs:schema>
Each IABSequence element shall contain Resource elements of type TrackFileResourceType, as defined in Subclause 6.12 of SMPTE ST 2067-3:2020, with each Resource element referencing an IAB Track File.
The IAB Virtual Track shall reference an IAB Track File whose Edit Rate, as defined in 5.4, is an integer multiple of the Edit Rate of the Main Image Virtual Track.
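The integer-multiple relationship can be tested exactly with rational arithmetic, which avoids rounding problems with fractional rates such as 24000/1001. The following Python sketch is illustrative only; the function name is an assumption of this example.

```python
from fractions import Fraction


def is_integer_multiple(iab_edit_rate: Fraction, image_edit_rate: Fraction) -> bool:
    """True if the IAB Edit Rate is N times the image Edit Rate, N = 1, 2, 3, ..."""
    ratio = iab_edit_rate / image_edit_rate
    return ratio.denominator == 1 and ratio >= 1


print(is_integer_multiple(Fraction(48, 1), Fraction(24, 1)))  # True
print(is_integer_multiple(Fraction(24, 1), Fraction(25, 1)))  # False
```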
The EditRate of the Resource element, as defined in Subclause 6.11.3 of SMPTE ST 2067-3:2020, shall be set to a value equal to the Edit Rate of the Essence Track in the referenced MXF file, as defined in 5.4.
The following characteristics of the Immersive Audio Bitstream shall remain constant for all IAB Track Files referenced by a given IAB Virtual Track:
Clause A.2 and Clause A.3 apply to informative metadata that does not alter rendering of the Immersive Audio Bitstream essence.
XML-based informative metadata associated with the Immersive Audio Bitstream essence can be stored within the IAB Track File as specified in SMPTE RP 2057:2011 and Amendment 1:2013 to SMPTE RP 2057:2011.
Informative metadata associated with the Immersive Audio Bitstream essence can be stored separately from the IAB Track File by introducing additional Virtual Tracks to the Composition, as specified in SMPTE ST 2067-3:2020.
The ConformsToSpecifications element identifies the specification(s) to which an MXF file conforms.
NOTE 1: Other MXF Header Metadata items identify the defining specifications of specific components of the file. For instance, the Codec element of the File Descriptor identifies a codec compatible with an Essence Container. Similarly, the DM Schemes element of the Preface identifies the Descriptive Metadata Schemes used in the file.
The Preface Set specified in SMPTE ST 377-1:2019 shall contain 0 or 1 instances of the ConformsToSpecifications element specified in Table B.1.
| Item Name | Item Symbol | Type | Len | Local Tag | Item UL | Req | Meaning | Default |
|---|---|---|---|---|---|---|---|---|
| Conforms To Specifications | ConformsToSpecifications | Batch of UL | var | Dyn | urn:smpte:ul:060E2B34.0101010E.01020210.02040000 | Opt | Identifies the specification(s) to which an MXF file conforms. | |
The value of the ConformsToSpecifications element is a batch of ULs, where each UL element shall identify a specification to which the MXF file conforms in its entirety.
NOTE 2: A single file can conform to multiple specifications simultaneously.
NOTE 3: The absence of a particular UL in the ConformsToSpecifications element, or the absence of the element as a whole, is not itself an indication that the file does not conform with the specification associated with the UL.
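As an informal sketch, the value of a ConformsToSpecifications item could be decoded in Python as shown below. It assumes the usual MXF batch layout of a 4-byte big-endian item count and a 4-byte big-endian item length followed by the items themselves; SMPTE ST 377-1:2019 is the authority for that layout, and the function name is hypothetical.

```python
import struct


def parse_ul_batch(value: bytes) -> list[bytes]:
    """Split a 'Batch of UL' value into its 16-byte ULs."""
    if len(value) < 8:
        raise ValueError("batch header is 8 bytes long")
    count, item_length = struct.unpack(">II", value[:8])
    if item_length != 16:
        raise ValueError("each UL is 16 bytes long")
    if len(value) != 8 + count * item_length:
        raise ValueError("batch length does not match its header")
    return [value[8 + i * 16: 8 + (i + 1) * 16] for i in range(count)]
```

Consistent with NOTE 3, an empty result or a missing item only means that no conformance is declared, not that the file is non-conformant.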
The Immersive Audio Bitstream (IAB) defined in SMPTE ST 2098-2:2022 carries all information (bed channels, objects and metadata) necessary to generate a Soundfield, as defined in SMPTE ST 377-4:2021, and thus functions in a similar manner as a Soundfield Group, which carries all Audio Channels necessary to create a Soundfield. The IAB Soundfield Label SubDescriptor extends the MCA Label framework in accordance with Subclause 5.3 of SMPTE ST 377-4:2021 to provide labeling for immersive audio, and is comparable to the Soundfield Group Label SubDescriptor defined in that specification for channel-based audio.
The IAB Soundfield Label SubDescriptor is a concrete subclass of MCALabelSubDescriptor defined in Subclause 6.3 of SMPTE ST 377-4:2021, and contains the items specified in Table C.1 with the Set Key specified in Table C.2.
MCA Channel ID shall not be present in the IAB Soundfield Label SubDescriptor.
| Name | Type | Req | Definition |
|---|---|---|---|
| IAB Soundfield Label SubDescriptor Key | Set Key | Req | Identifies an IAB Soundfield Label SubDescriptor |
| Length | BER length | Req | Set length |
| All items in MCALabelSubDescriptor, as specified in SMPTE ST 377-4:2021, except Set Key, Length, and MCA Channel ID. | | | |
| Kind | Leaf |
|---|---|
| Name | IAB Soundfield Label SubDescriptor |
| Symbol | IABSoundfieldLabelSubDescriptor |
| UL | urn:smpte:ul:060E2B34.027F0101.0D010101.01017C00 |
NOTE: The value of byte 6 (0x7F) in the IAB Soundfield Label SubDescriptor Key is a placeholder that is used in the registers, but is not used in actual implementations. Please consult SMPTE ST 377-1:2019, Clause 9 for the proper value(s).
The MXF-GC Immersive Audio and MXF-GC IAB Audio labels are newly defined node labels under the MXF Generic Container node that provide for reasonable grouping of labels identifying essence container types related to immersive, object-based audio, generally, and SMPTE ST 2098-2:2022 Immersive Audio Bitstream essence, specifically.
The MXF-GC Immersive Audio label is a node-type label that is a child of the MXF Generic Container label (symbol: MXFGenericContainer). The MXF-GC Immersive Audio label shall be defined as specified in Table D.1.
| Kind | Node |
|---|---|
| Name | MXF-GC Immersive Audio |
| Symbol | MXFGCImmersiveAudio |
| UL | urn:smpte:ul:060E2B34.0401010D.0D010301.021D0000 |
| Definition | Identifiers for MXF-GC Mappings of immersive audio data |
The MXF-GC IAB Audio label is a node-type label that is a child of the MXF-GC Immersive Audio label (symbol: MXFGCImmersiveAudio). The MXF-GC IAB Audio label shall be defined as specified in Table D.2.
| Kind | Node |
|---|---|
| Name | MXF-GC IAB Audio |
| Symbol | MXFGCIABAudio |
| UL | urn:smpte:ul:060E2B34.0401010D.0D010301.021D0100 |
| Definition | Identifiers for MXF-GC Mappings of Immersive Audio Bitstream (SMPTE ST 2098-2:2022) audio data |
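The parent and child relationships between the labels of Table 5, Table D.1 and Table D.2 are visible in their byte values: each child extends the non-zero prefix of its parent node. The Python sketch below tests that relationship. Treating "trailing zero bytes stripped" as the node prefix is an assumption inferred from the values in these tables, not a rule stated in this document.

```python
MXF_GC_IMMERSIVE_AUDIO = bytes.fromhex("060E2B340401010D0D010301021D0000")  # Table D.1
MXF_GC_IAB_AUDIO = bytes.fromhex("060E2B340401010D0D010301021D0100")        # Table D.2
IMF_IAB_CLIP_WRAPPED = bytes.fromhex("060E2B340401010D0D010301021D0101")    # Table 5


def is_descendant(label: bytes, node: bytes) -> bool:
    """True if 'label' lies below 'node' (prefix convention is an assumption)."""
    prefix = node.rstrip(b"\x00")  # drop the trailing zero bytes of the node
    return label != node and label.startswith(prefix)


print(is_descendant(IMF_IAB_CLIP_WRAPPED, MXF_GC_IAB_AUDIO))    # True
print(is_descendant(MXF_GC_IMMERSIVE_AUDIO, MXF_GC_IAB_AUDIO))  # False
```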
The Immersive Audio Bitstream (IAB) defined in SMPTE ST 2098-2:2022 encodes metadata that identifies and describes audio channels that are organized in BedDefinition structures within the IAB. To make this metadata available to MXF parsers without IAB decoding abilities, the encoded information should be replicated in IAB Channel SubDescriptor items, as described in this Annex.
The IAB Channel SubDescriptor is a concrete subclass of SubDescriptor defined in SMPTE ST 377-1:2019, and contains the items specified in Table E.1 with the Set Key specified in Table E.2.
| Item Name | Type | Len | Local Tag | UL Designator | Req | Definition |
|---|---|---|---|---|---|---|
| IABChannelSubDescriptor Key | Set Key | 16 | Dyn | per Table E.2 | Req | Identifies an IAB Channel SubDescriptor |
| Length | BER length | | | | Req | Set length |
| IABBedMetaID | Uint32 | 4 | Dyn | urn:smpte:ul:060e2b34.0101010e.0402030c.01000000 | Req | IAB MetaID of the associated BedDefinition |
| IABChannelID | Uint32 | 4 | Dyn | urn:smpte:ul:060e2b34.0101010e.0402030c.02000000 | Req | Uniquely identifies a channel within a bed along with its routing destination |
| IABAudioDescription | Uint8 | 1 | Dyn | urn:smpte:ul:060e2b34.0101010e.0402030c.03000000 | Req | Provides a top-level description of the contents of the audio |
| IABAudioDescriptionText | UTF16String | var | Dyn | urn:smpte:ul:060e2b34.0101010e.0402030c.04000000 | Opt | Provides a custom description of the contents of the audio |
| Kind | Leaf |
|---|---|
| Name | IABChannelSubDescriptor |
| Symbol | IABChannelSubDescriptor |
| UL | urn:smpte:ul:060e2b34.027f0101.0d010101.01018115 |
NOTE: The value of byte 6 (0x7F) in the IAB Channel SubDescriptor Key is a placeholder that is used in the registers, but is not used in actual implementations. Please consult SMPTE ST 377-1:2019, Clause 9 for the proper value(s).
Some of the values stored in the SubDescriptor are derived from the Immersive Audio Bitstream as indicated below in E.2.2 through E.2.5.
The IABBedMetaID item identifies the parent BedDefinition to which the channel belongs. Its value is derived from, and is equal to, the value of the corresponding MetaID item, as defined in Subclause 10.3.1 of SMPTE ST 2098-2:2022.
The IABChannelID uniquely identifies a channel within a bed, along with its routing destination. Its value is derived from, and is equal to, the value of the corresponding ChannelID item, as defined in Subclause 10.3.5 of SMPTE ST 2098-2:2022.
The IABAudioDescription item provides a top-level description of the contents of the audio. Its value is derived from, and is equal to, the value of the corresponding AudioDescription item, as defined in Subclause 10.3.12 of SMPTE ST 2098-2:2022.
The IABAudioDescriptionText item provides a custom description of the contents of the audio. It is present if the most significant bit of AudioDescription is set. If present, its value is derived from, and is equal to, the value of the corresponding AudioDescriptionText item, as defined in Subclause 10.3.13 of SMPTE ST 2098-2:2022.
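The derivation rules of this Annex can be sketched in Python as follows. The sketch assumes the BedDefinition has already been decoded by an IAB parser (not shown) and that AudioDescription is an 8-bit value, consistent with the Uint8 type in Table E.1, so that its most significant bit is 0x80. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelSubDescriptor:
    """Illustrative in-memory form of the Table E.1 items."""
    iab_bed_meta_id: int
    iab_channel_id: int
    iab_audio_description: int
    iab_audio_description_text: Optional[str]  # None when the item is absent


def build_channel_subdescriptors(
    meta_id: int,
    channel_ids: list[int],
    audio_description: int,
    audio_description_text: Optional[str],
) -> list[ChannelSubDescriptor]:
    """Create one SubDescriptor per channel of a single BedDefinition."""
    # The text item is carried only when the MSB of AudioDescription is set.
    text = audio_description_text if audio_description & 0x80 else None
    return [
        ChannelSubDescriptor(meta_id, channel_id, audio_description, text)
        for channel_id in channel_ids
    ]
```

Calling the function once per BedDefinition yields one IAB Channel SubDescriptor for each channel of each bed.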