DCP Operational Constraints

8 Composition Constraints

8.1 General

A Composition (i.e., a Composition Playlist and referenced Track Files) may be delivered in a single DCP or it may be spread across several DCPs. Regardless of the number of DCPs used to convey a Composition, a Composition shall conform to the following constraints.

8.2 Edit Rate

The Composition shall have an Edit Rate of 24/1, 25/1, 30/1, 48/1, 50/1, or 60/1.

8.3 Picture Essence Encoding

Picture essence tracks shall be encoded as specified in SMPTE ST 428-1. The pixel array size and frame rate shall be one of the formats listed in Table 1. Monoscopic picture essence tracks shall have matching frame rate and edit rate. Stereoscopic picture essence tracks shall be limited to the 2K formats, and shall have a frame rate of 48/1 and an edit rate equal to half the frame rate ($r_e = r_f / 2$, where $r_e$ is the edit rate and $r_f$ is the frame rate). See SMPTE ST 429-10 for an explanation.

Source images having an aspect ratio not listed in Table 1 should be encoded so that the image fills either the horizontal or vertical dimension of the desired Full pixel array (2K or 4K). To fill the pixel array in the opposite dimension, the image should be padded with an equal number of black pixels on each side, i.e., "letter-box" (top side, bottom side) or "pillar-box" (left side, right side).
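The padding rule above can be illustrated with a short informative sketch. The function name, the rounding behavior, and the one-pixel adjustment that keeps the padding equal on both sides are assumptions of this example and are not specified by this document.

```python
def fit_with_padding(src_w, src_h, full_w=2048, full_h=1080):
    """Return (image_w, image_h, pad_x, pad_y) for a source image placed in
    a Full pixel array; pad_x and pad_y are the black pixels on EACH side."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if src_w * full_h >= src_h * full_w:
        # Source is at least as wide as the array: fill width, letter-box.
        image_w = full_w
        image_h = round(src_h * full_w / src_w)
    else:
        # Source is narrower than the array: fill height, pillar-box.
        image_w = round(src_w * full_h / src_h)
        image_h = full_h
    # Assumption of this sketch: shrink by one pixel when needed so the
    # remaining gap is even and splits into equal padding on both sides.
    if (full_w - image_w) % 2:
        image_w -= 1
    if (full_h - image_h) % 2:
        image_h -= 1
    return image_w, image_h, (full_w - image_w) // 2, (full_h - image_h) // 2
```

For instance, a 1920 x 1080 source placed in the 2K Full array fills the vertical dimension and receives 64 black pixels on the left and on the right.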

Table 1 –⁠ Pixel Array Dimensions
Format	Horizontal Pixels	Vertical Pixels	Frame Rate
2K Scope (2.39:1)	2048	858	24/1, 25/1, 30/1, 48/1, 50/1 or 60/1
2K Flat (1.85:1)	1998	1080	24/1, 25/1, 30/1, 48/1, 50/1 or 60/1
2K Full (1.90:1)	2048	1080	24/1, 25/1, 30/1, 48/1, 50/1 or 60/1
4K Scope (2.39:1)	4096	1716	24/1, 25/1 or 30/1
4K Flat (1.85:1)	3996	2160	24/1, 25/1 or 30/1
4K Full (1.90:1)	4096	2160	24/1, 25/1 or 30/1
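The constraints of 8.3 and Table 1 lend themselves to a mechanical check. The following informative sketch is one possible formulation; the names `TABLE_1` and `picture_format_ok` are hypothetical and the representation of rates as `Fraction` values is a choice of this example.

```python
from fractions import Fraction

LOW_RATES = {Fraction(24), Fraction(25), Fraction(30)}
ALL_RATES = LOW_RATES | {Fraction(48), Fraction(50), Fraction(60)}

# (horizontal pixels, vertical pixels) -> permitted frame rates, per Table 1.
TABLE_1 = {
    (2048, 858): ALL_RATES,    # 2K Scope
    (1998, 1080): ALL_RATES,   # 2K Flat
    (2048, 1080): ALL_RATES,   # 2K Full
    (4096, 1716): LOW_RATES,   # 4K Scope
    (3996, 2160): LOW_RATES,   # 4K Flat
    (4096, 2160): LOW_RATES,   # 4K Full
}
FORMATS_2K = {(2048, 858), (1998, 1080), (2048, 1080)}


def picture_format_ok(width, height, frame_rate, edit_rate, stereoscopic=False):
    """Check pixel array size, frame rate and edit rate against 8.3."""
    allowed = TABLE_1.get((width, height))
    if allowed is None or frame_rate not in allowed:
        return False
    if stereoscopic:
        # 2K only, 48/1 frame rate, edit rate equal to half the frame rate.
        return ((width, height) in FORMATS_2K
                and frame_rate == 48
                and edit_rate == frame_rate / 2)
    # Monoscopic: frame rate and edit rate match.
    return edit_rate == frame_rate
```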

8.4 Sound Essence Encoding

Sound essence tracks shall be encoded as specified in SMPTE ST 428-2. Means of identifying the content of these essence tracks are specified in 10.4.4 and Annex A.

8.5 Timed Text Essence Encoding

8.5.1 General

Timed Text essence shall be encoded as XML data as specified in SMPTE ST 428-7, and may be constrained per SMPTE ST 428-10. Sub-pictures shall be encoded as Portable Network Graphics (PNG) images as specified in ISO/IEC 15948.

8.5.2 Fonts for Timed Text

When Text elements are present in the Timed Text essence, one (1) LoadFont element shall be present. Timed Text essence shall not contain more than one (1) LoadFont element.

Within the scope of any given Subtitle element, all Font elements shall have the same EffectSize attribute value.

The font resource should not be larger than 10 MB.

NOTE 1 —⁠ Legacy implementations might not be able to support font resources larger than 640 KB.

NOTE 2 —⁠ Operational testing has determined that a Font size smaller than 8 pt might be difficult to read, and that, depending on the length of the subtitle, a very large Font size might take too long to appear and might go beyond the dimension of the Primary Picture.

8.5.3 Text Color Interpretation

Color values encoded in the Timed Text essence (in the Color and EffectColor attributes of the Font element) shall be encoded as sRGB values (IEC 61966-2-1).

8.5.4 Images for On-Screen Timed Text

PNG image resources used per SMPTE ST 428-7 shall have three (3) 8-bit color components (R, G, and B). An alpha channel may be present. If an alpha channel is present, the decoder shall use it when creating the composite image. PNG image resources shall contain the sRGB chunk per ISO/IEC 15948.

The width and height of a subpicture shall be equal to or less than the width and height, respectively, of the associated main picture.

8.5.5 Maximum Rate of Occurrence for On-Screen Timed Text

Up to two (2) subtitle instances may be visible on screen at any time. The visibility period of an instance shall include fade-in and fade-out times. A subtitle instance shall contain no more than six (6) Text elements or three (3) Image elements.
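The limit of two simultaneously visible subtitle instances can be verified with a sweep over the visibility periods. The sketch below is informative only; it assumes each period is given as a (start, end) pair in edit units with fade-in and fade-out already included, and it treats the end as exclusive, which is an assumption of this example rather than a provision of this document.

```python
def max_visible(instances):
    """Return the largest number of instances visible at the same time.

    instances: iterable of (start, end) visibility periods, end exclusive.
    """
    events = []
    for start, end in instances:
        events.append((start, 1))    # instance becomes visible
        events.append((end, -1))     # instance disappears
    # Tuples sort by time, then by delta, so at equal times an end (-1)
    # is processed before a start (+1) and back-to-back instances
    # are not counted as overlapping.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

A set of instances satisfies the first sentence of 8.5.5 when `max_visible(...)` does not exceed 2.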

8.5.6 Constraints on Stereoscopic Control

All Text and Image elements to be displayed at the same time shall have the same depth information specified through Zvalue within VariableZ and/or Zposition attributes.

8.5.7 IntrinsicPictureResolution Attribute

When present, the value of the IntrinsicPictureResolution attribute of the SubtitleReel element (see SMPTE ST 428-7) shall be one of the values listed in Table 2 below.

Table 2 –⁠ `IntrinsicPictureResolution` Attribute Values
Attribute Value
2K Scope
2K Flat
2K Full
4K Scope
4K Flat
4K Full

NOTE —⁠ The IntrinsicPictureResolution attribute is intended to guide the mastering operator to select the appropriate subtitle resources for the Primary Picture content.

8.6 Sound and Picture Sample Rates

The sound sample rate and the Edit Rate of a Composition shall form one of the combinations listed in Table 3.

Table 3 –⁠ Sample Rate Constraints
Sound Sample Rate	Composition Edit Rate	Samples per Edit Unit
48 kHz	24/1	2000
48 kHz	25/1	1920
48 kHz	30/1	1600
48 kHz	48/1	1000
48 kHz	50/1	960
48 kHz	60/1	800
96 kHz	24/1	4000
96 kHz	25/1	3840
96 kHz	30/1	3200
96 kHz	48/1	2000
96 kHz	50/1	1920
96 kHz	60/1	1600
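Each value in the Samples per Edit Unit column of Table 3 is the sound sample rate divided by the Composition Edit Rate, for example 48000 / 24 = 2000. The informative sketch below reproduces that arithmetic; the function name is hypothetical.

```python
from fractions import Fraction


def samples_per_edit_unit(sample_rate_hz, edit_rate):
    """Return the whole number of sound samples in one edit unit."""
    samples = Fraction(sample_rate_hz) / Fraction(edit_rate)
    if samples.denominator != 1:
        # None of the Table 3 combinations produces a fractional count.
        raise ValueError("sample rate is not a multiple of the edit rate")
    return int(samples)
```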

8.7 Track File Edit Rates

All essence tracks in a Composition shall have an identical Edit Rate.

8.8 Homogeneous Essence

Essence tracks in a Composition shall have homogeneous encoding parameter values throughout the Composition. Picture essence shall have constant frame rate and pixel array size. Sound essence shall have constant sample rate, language, channel count, and channel assignment parameters.

9 Composition Playlist Constraints

9.1 Minimum Essence Requirement

A Composition Playlist shall have one picture essence track and one sound essence track in each Reel element.

9.2 Composition Playlist Uniqueness

Two Composition Playlist documents having different contents shall have different values in the top-level Id element.

9.3 ContentVersion Id

The Id element within the ContentVersion element shall contain a URI value conforming to one of the following types:

a Basic UMID, as specified in SMPTE ST 2029
an ISAN, as specified in IETF RFC 4246
a UUID, as specified in IETF RFC 4122
an EIDR, as specified in IETF RFC 7972

NOTE —⁠ The Id element of the ContentVersion element is intended to remain constant across multiple Composition Playlist instances referencing the same underlying content. For instance, both a pre-release and a final version of a Composition Playlist associated with the same feature can have the same ContentVersion/Id, while their Id elements are different. In a typical application, ContentVersion/Id can be used as a reference to an internal booking system.
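As an informative illustration of 9.3, the sketch below classifies a ContentVersion Id value by its URN prefix. The prefix strings are this example's reading of the referenced specifications (`urn:smpte:umid:`, `urn:isan:`, `urn:uuid:`, `urn:eidr:`) and should be confirmed against them; the sketch checks only the prefix and does not validate the remainder of the identifier.

```python
# Assumed URN prefixes for the four permitted identifier types.
URN_PREFIXES = {
    "urn:smpte:umid:": "UMID",   # SMPTE ST 2029
    "urn:isan:": "ISAN",         # IETF RFC 4246
    "urn:uuid:": "UUID",         # IETF RFC 4122
    "urn:eidr:": "EIDR",         # IETF RFC 7972
}


def content_version_id_type(uri):
    """Return the identifier type name, or None if no prefix matches."""
    lowered = uri.lower()  # the leading part of a URN is case-insensitive
    for prefix, kind in URN_PREFIXES.items():
        if lowered.startswith(prefix) and len(uri) > len(prefix):
            return kind
    return None
```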

9.4 Reel Duration

The Duration element shall be present within every Asset element that refers to an external track file. The value of all Duration elements in a reel, with the exception of timed text elements, shall be equal. The Duration of the Reel shall be determined by the MainPicture element, per the provisions of SMPTE ST 429-7, or the MainStereoscopicPicture element, whichever is present.
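The provisions of 9.4 can be expressed as a simple check over the assets of one Reel. The following informative sketch assumes the caller has already extracted (element name, Duration) pairs for the Asset elements that refer to track files; the function name and the problem messages are hypothetical.

```python
TIMED_TEXT = {"MainSubtitle", "MainCaption", "ClosedSubtitle", "ClosedCaption"}
PICTURE = {"MainPicture", "MainStereoscopicPicture"}


def reel_duration_problems(assets):
    """Return a list of problem strings; an empty list means no problem found.

    assets: list of (element_name, duration) pairs for one Reel, where
    duration is None when the Duration element is absent.
    """
    pictures = [duration for name, duration in assets if name in PICTURE]
    if len(pictures) != 1:
        return ["expected exactly one picture asset"]
    reel_duration = pictures[0]  # the picture asset determines the Reel duration
    problems = []
    for name, duration in assets:
        if duration is None:
            problems.append(f"{name}: Duration missing")
        elif name not in TIMED_TEXT and duration != reel_duration:
            # Timed text assets are exempt from the equality requirement.
            problems.append(f"{name}: Duration {duration} != {reel_duration}")
    return problems
```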

9.5 Track Files

Track files referenced by a Composition Playlist shall conform to the provisions of Clause 10 of this document.

9.6 Picture Tracks

9.6.1 General

Each Reel element in a Composition Playlist document shall contain one (1) MainPicture element (SMPTE ST 429-7) or one (1) MainStereoscopicPicture element (SMPTE ST 429-10). This element shall refer to a Picture Track File as defined by SMPTE ST 429-3. If the element name is MainStereoscopicPicture, the referenced Track File shall also conform to SMPTE ST 429-10.

9.6.2 Essence Characteristics

All picture assets in a Composition Playlist shall have identical values for the following metadata items:

element name (i.e., MainPicture or MainStereoscopicPicture)
EditRate element
FrameRate element
ScreenAspectRatio element

9.7 Sound Tracks

9.7.1 General

The sound asset element of each Reel (MainSound, as defined by SMPTE ST 429-7) shall refer to a Sound Track File as defined by SMPTE ST 429-3.

9.7.2 Essence Characteristics

All sound assets in a Composition Playlist shall have identical values for the following metadata items:

EditRate element
Language element

9.8 Timed Text Tracks

A timed text track is established by the presence of a timed text asset (e.g. MainSubtitle, MainCaption, ClosedSubtitle, or ClosedCaption) in at least one Reel of a Composition. Once a timed text asset appears in one Reel, the established track shall be assumed to exist for the entire Composition, even if related timed text Asset elements are not present in all Reels.

Each Reel element in a Composition Playlist document may contain one on-screen text track, either MainSubtitle as defined by SMPTE ST 429-7 or MainCaption as defined by SMPTE ST 429-12. When present, the MainSubtitle element shall refer to a Timed Text Track File as defined by SMPTE ST 429-5, containing an XML resource conforming to SMPTE ST 428-7. When present, the MainCaption element shall refer to a Timed Text Track File as defined by SMPTE ST 429-5, containing an XML resource conforming to SMPTE ST 428-10. A Composition Playlist shall contain no more than one on-screen text track type (MainSubtitle or MainCaption).

Each Reel element in a Composition Playlist document may contain up to six (6) off-screen (closed) text tracks, using any combination of ClosedSubtitle and ClosedCaption elements as defined by SMPTE ST 429-12. When present, an off-screen text element shall refer to a Timed Text Track File as defined by SMPTE ST 429-5, containing an XML resource conforming to SMPTE ST 428-10. When more than one off-screen text track asset of the same type (ClosedSubtitle or ClosedCaption) is present, the Language attribute shall be used. The Language attribute value of each off-screen text track shall be unique among the set of similarly-typed off-screen text tracks. The value of the Language attribute shall be used to identify material of the same off-screen text track from Reel to Reel for each Asset type instance.

The maximum number of timed text tracks in a Composition Playlist document is seven (7); one (1) on-screen text track plus six (6) off-screen text tracks. Each off-screen text track with a unique combination of element name and Language shall be considered a distinct off-screen text track.

In order to illustrate the concepts in this section, the example diagram in Figure 4 shows a collection of Composition assets on the left, and a Composition with tracks on the right. Each reel shown on the left contains a number of off-screen timed text assets that appears to be within the specified limit of this standard. However, in the example, the number of off-screen text tracks possible is seven, which is more than that allowed by this standard. The Composition on the right is correctly constrained. Note that each timed text track exists for the duration of the Composition, even though it might not be represented by an asset in every reel.

Figure 4 –⁠ Example of allocating timed text assets to timed text tracks.
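The situation described for Figure 4 can be reproduced in a few lines: tracks are counted across the whole Composition, not per Reel. The informative sketch below assumes each Reel is given as a list of (element name, Language) pairs; the function names are hypothetical, and the case-insensitive comparison of language tags is an assumption of this example.

```python
ON_SCREEN = {"MainSubtitle", "MainCaption"}
OFF_SCREEN = {"ClosedSubtitle", "ClosedCaption"}


def timed_text_tracks(reels):
    """Return (on_screen_types, off_screen_tracks) for a whole Composition."""
    on_screen = set()
    off_screen = set()
    for reel in reels:
        for name, language in reel:
            if name in ON_SCREEN:
                on_screen.add(name)
            elif name in OFF_SCREEN:
                # A distinct (element name, Language) pair is a distinct track.
                off_screen.add((name, language.lower()))
    return on_screen, off_screen


def within_limits(reels):
    """True when at most one on-screen type and six off-screen tracks exist."""
    on_screen, off_screen = timed_text_tracks(reels)
    return len(on_screen) <= 1 and len(off_screen) <= 6
```

Two Reels that each carry six ClosedCaption assets, but whose language sets differ by one entry, yield seven off-screen tracks and therefore fail the check even though each Reel looks acceptable in isolation.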

9.9 Marker Tracks

When present, a MainMarkers element shall not contain either:

any Marker element with an Offset value that exceeds the duration of the parent Reel; or
an IntrinsicDuration value that exceeds the duration of the parent Reel.

NOTE —⁠ As specified in SMPTE ST 429-7, a MainMarkers element contains neither an EntryPoint element nor a Duration element since it does not reference a Track File.
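A minimal informative sketch of the two conditions in 9.9 follows. The function name and parameters are hypothetical; all values are assumed to be expressed in edit units.

```python
def markers_ok(reel_duration, marker_offsets, intrinsic_duration):
    """True when neither the IntrinsicDuration nor any Marker Offset
    exceeds the duration of the parent Reel."""
    if intrinsic_duration > reel_duration:
        return False
    return all(offset <= reel_duration for offset in marker_offsets)
```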

9.10 Cryptographic Keys

No more than 256 distinct cryptographic keys, as uniquely identified by their Key ID, shall be used to encrypt the assets referenced by a Composition Playlist.

9.11 Hash Element

The Hash element shall be present in an asset when the KeyId element is present (i.e., when the referenced Track File is encrypted).
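The provisions of 9.10 and 9.11 can be checked together. The informative sketch below assumes each asset has been reduced to a dictionary with optional `KeyId` and `Hash` entries; that representation, the function name and the message text are choices of this example.

```python
MAX_KEYS = 256


def encryption_problems(assets):
    """Return a list of problem strings for the assets of one Composition Playlist."""
    problems = []
    # Keys are identified by their Key ID, so count distinct KeyId values.
    key_ids = {asset["KeyId"] for asset in assets if asset.get("KeyId")}
    if len(key_ids) > MAX_KEYS:
        problems.append(
            f"{len(key_ids)} distinct keys exceed the limit of {MAX_KEYS}")
    for index, asset in enumerate(assets):
        # An encrypted asset (KeyId present) needs a Hash element.
        if asset.get("KeyId") and not asset.get("Hash"):
            problems.append(f"asset {index}: KeyId present but Hash missing")
    return problems
```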

9.12 Digital Signature

When a Composition Playlist document is digitally signed as specified in SMPTE ST 429-7, digital certificates in the signer's certificate chain shall conform to the provisions of SMPTE ST 430-2.

9.13 Composition Metadata

The CompositionMetadataAsset element defined in SMPTE ST 429-16 should be present.

10 Track File Constraints

10.1 General

Essence data shall be contained in MXF files (SMPTE ST 377-1) constrained according to SMPTE ST 429-20.

10.2 Encryption

When cryptographic protection is required, Track Files shall use KLV encryption per SMPTE ST 429-6. Each encrypted Track File shall be encrypted with exactly one (1) 128-bit symmetric key, which is the Cipher Key of the Track File.

The Essence Container Label urn:smpte:ul:060e2b34.04010107.0d010301.020b0100 shall be used for both frame- and clip-wrapped essence.

NOTE 1 —⁠ SMPTE ST 429-6 deprecates the Essence Container Label urn:smpte:ul:060e2b34.04010107.0d010301.020b0100 for clip-wrapped essence outside of D-Cinema applications.

If the Encrypted Track File contains MIC items, the MIC Key used to generate the MIC items shall be derived from the Cipher Key of the Track File using the Legacy MIC Key derivation algorithm specified at SMPTE ST 429-6.

NOTE 2 —⁠ SMPTE ST 429-6 no longer specifies a MIC Key derivation method as part of its Reference Decryption Processing Model. This method however remains in use when generating MIC items during Encrypted Track File authoring. The generated MIC Key is carried in the KDM as specified at SMPTE ST 430-1.

10.3 Picture Track Files

10.3.1 General

In addition to the essence encoding constraints specified in Clause 8, Picture Track Files shall have the following properties.

10.3.2 Operational Pattern

Picture Track Files shall conform to the provisions of SMPTE ST 429-3.

10.3.3 Compression

Picture essence shall consist of a sequence of codestreams that conform either to the 2K digital cinema profile or the 4K digital cinema profile specified at Rec. ITU-T T.800 | ISO/IEC 15444-1.

There shall be 5 wavelet transform levels for 2K picture essence.

There shall be 6 wavelet transform levels for 4K picture essence.

10.3.4 Wrapping

Picture essence shall be frame wrapped according to SMPTE ST 422 and SMPTE ST 429-4. Stereoscopic picture essence shall also conform to SMPTE ST 429-10.

10.4 Sound Track Files

10.4.1 General

In addition to the essence encoding constraints specified in Clause 8 above, Sound Track Files shall have the following properties.

10.4.2 Operational Pattern

Sound Track Files shall conform to the provisions of SMPTE ST 429-3.

10.4.3 Wrapping

Sound essence shall be frame wrapped per SMPTE ST 382. Sound essence shall be contained in KLV packets labeled with the Wave Frame Wrapped Element UL. A Wave Audio Essence Descriptor shall be present in the Top-Level File Package.

10.4.4 Channel Assignment

Channel assignment defines what reproduction channel is carried in each channel of the distributed track. Sound Track File channel assignment shall be indicated by a UL value in the Channel Assignment property of the Wave Audio Essence Descriptor. The UL may indicate a fixed channel assignment. Annex A defines a set of channel assignments and respective UL values based on this method. The UL may also indicate a channel assignment scheme defined in another specification. In this case, additional details regarding channel assignment shall be provided by the specification that defines the UL.

If the Channel Assignment property is not present, Channel Configuration 1 (Table A.3) shall be assumed by the decoder. Routing of the container channel to the system audio output is not in the scope of this document.

10.5 Timed Text Track Files

10.5.1 General

In addition to the essence encoding constraints specified in Clause 8 above, Timed Text Track Files shall have the following properties.

10.5.2 Timed Text Essence Format

Timed Text essence shall be encoded as XML data as specified in SMPTE ST 428-7, and may be constrained per SMPTE ST 428-10. See 8.5 and 9.8 above.

10.5.3 Track File Format

Timed Text Track Files shall be created according to SMPTE ST 429-5.

10.5.4 Timed Text Essence Descriptor

If the DCDM Subtitle file contains the IntrinsicPictureResolution attribute (see SMPTE ST 428-7), then the Intrinsic Picture Resolution property of the Timed Text Essence Descriptor, defined in Annex B, should be present in the Timed Text Track File and, when present, shall represent the same value.

If the DCDM Subtitle file contains the DisplayType element (see SMPTE ST 428-7), then the Display Type property of the Timed Text Essence Descriptor, defined in Annex B, should be present in the Timed Text Track File and, when present, shall represent the same value.

If the Timed Text Essence Descriptor property RFC 5646 Language Tag List is present, it shall contain at least the language code specified in the DCDM Subtitle file.

If at least one subtitle instance of the DCDM Subtitle file contains a Zposition attribute (as defined in SMPTE ST 428-7), the Z-Position In Use property of the Timed Text Essence Descriptor shall be non-zero.

Annex A
Audio Channel Assignment Label (Normative)

A.1 General

NOTE —⁠ Implementation behavior is undefined when a Sound Track File fails to adhere to the normative provisions specified herein.

SMPTE ST 382 carries multi-channel PCM sound samples by using sample interleave on a channel basis. Each sample position can be thought of as a channel within the container specified at SMPTE ST 382.

The number of channels within the Sound Track File shall be an even number. The inclusion of a channel of silence may be required to achieve this.

Clause A.2 and Clause A.3 each specify a method for unambiguously identifying the channels present in Sound Track Files and indicating their intended reproduction location in the theater. Each method uses the ChannelAssignment property of the WaveAudioEssence Descriptor in a Sound Track File, as specified in 10.4.4 above.

Compliant playback devices shall use the ChannelAssignment property to identify the sound channels being used.

A.2 Static Container Channel Configurations

A.2.1 General

Each table in this Annex defines a container channel configuration that has a corresponding Universal Label (UL) for use as a value of the ChannelAssignment property. Container channels are numbered in sample packing order. The first sample is carried in container channel 1, the second in container channel 2 and so on.

The number of channels contained in a Sound Track File shall be less than or equal to the number of channels defined by the table associated with the ChannelAssignment property. However, if a given container channel is present, it shall be used according to the table. The WaveAudioEssence Descriptor ChannelCount property may be used in combination with the ChannelAssignment property to determine actual channel usage. For instance, a ChannelAssignment label indicating Channel Configuration 1 may accompany a container with a ChannelCount value of 6, indicating that channels 7 and 8 (Hearing Impaired and Visually Impaired-Narrative) are not present.
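The relationship between ChannelCount and a static channel configuration can be sketched as follows. This informative example takes the channel totals from Tables A.3 to A.7 and the even-count requirement from A.1; the names used are hypothetical, and only Channel Configuration 1 is spelled out for the absent-channel lookup.

```python
# Number of container channels defined by each Channel Configuration table.
CHANNELS_DEFINED = {1: 8, 2: 10, 3: 10, 4: 16, 5: 10}

# Channel Configuration 1 (Table A.3), in container channel order.
CONFIGURATION_1 = [
    "Left", "Right", "Center", "LFE",
    "Left Surround", "Right Surround",
    "Hearing Impaired", "Visually Impaired-Narrative",
]


def channel_count_ok(configuration, channel_count):
    """True when the count is even and within the configuration's table."""
    return (0 < channel_count <= CHANNELS_DEFINED[configuration]
            and channel_count % 2 == 0)


def absent_channels_configuration_1(channel_count):
    """Return the Table A.3 channels not carried for a given ChannelCount."""
    return CONFIGURATION_1[channel_count:]
```

With a ChannelCount of 6 and Channel Configuration 1, `absent_channels_configuration_1(6)` returns the Hearing Impaired and Visually Impaired-Narrative channels, matching the example in the text.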

The special case of no specified channel configuration is also provided for (see Table A.6). The label associated with this table shall mean no configuration specified. This may be used for test or experimental purposes.

NOTE —⁠ For the purpose of setting appropriate transport flags, implementations should not assume that all audio channels in Channel Configuration 4 contain linear PCM audio samples suitable for direct conversion to an analog audio signal.

A.2.2 Channel Label Set ULs

Table A.1 –⁠ Specification of the Channel Assignment Label when Static Container Channel Configurations are used
Byte No.	Description	Value (hex)	Meaning
1-7	Registry Designator	See register
8	Registry Version Number	`0bh`	Version of the register in which this label first appears
9	Parametric	`04h`	Node used to define parametric data
10	Sound Essence	`02h`	Identifies sound essence coding
11	Sound Coding Characteristics	`02h`	Identifies sound coding characteristics
12	Sound Channel Labeling	`10h`	Identifies sound channel labeling
13	Sound Channel Labeling SMPTE ST 429-2	`03h`	Identifies sound channel labeling as defined in this document (SMPTE ST 429-2)
14	Channel Label Sets	`01h`	Identifies Static Sound Channel Label Sets
15	Channel Configuration	See Table A.2	Identifies sound Channel Configuration
16	Reserved	`00h`	Reserved

Table A.2 –⁠ Values for Table A.1, Byte 15
Channel Configuration	Byte 15 Value
Channel Configuration 1 (Table A.3)	`01h`
Channel Configuration 2 (Table A.4)	`02h`
Channel Configuration 3 (Table A.5)	`03h`
Channel Configuration 4 (Table A.6)	`04h`
Channel Configuration 5 (Table A.7)	`05h`
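Bytes 8 to 16 of the label follow directly from Table A.1 and Table A.2. The informative sketch below assembles only those nine bytes; bytes 1 to 7 (the Registry Designator) are left to the register, as in Table A.1. The function name is hypothetical.

```python
def static_label_tail(configuration):
    """Return bytes 8-16 of the Channel Assignment Label as a hex string."""
    if configuration not in range(1, 6):
        raise ValueError("Channel Configuration must be 1 to 5 (Table A.2)")
    tail = bytes([
        0x0B,           # byte 8: Registry Version Number
        0x04,           # byte 9: Parametric
        0x02,           # byte 10: Sound Essence
        0x02,           # byte 11: Sound Coding Characteristics
        0x10,           # byte 12: Sound Channel Labeling
        0x03,           # byte 13: Sound Channel Labeling SMPTE ST 429-2
        0x01,           # byte 14: Static Sound Channel Label Sets
        configuration,  # byte 15: Channel Configuration (Table A.2)
        0x00,           # byte 16: Reserved
    ])
    return tail.hex()
```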

A.2.3 Channel Configuration Tables

Table A.3 –⁠ Channel Configuration 1
Container Channel	SMPTE ST 428-12 Name
1	Left
2	Right
3	Center
4	LFE
5	Left Surround
6	Right Surround
7	Hearing Impaired
8	Visually Impaired-Narrative

Table A.4 –⁠ Channel Configuration 2
Container Channel	SMPTE ST 428-12 Name
1	Left
2	Right
3	Center
4	LFE
5	Left Surround
6	Right Surround
7	Center Surround
8	Not Used
9	Hearing Impaired
10	Visually Impaired-Narrative

Table A.5 –⁠ Channel Configuration 3
Container Channel	SMPTE ST 428-12 Name
1	Left
2	Right
3	Center
4	LFE
5	Left Surround
6	Right Surround
7	Left Center
8	Right Center
9	Hearing Impaired
10	Visually Impaired-Narrative

Table A.6 –⁠ Channel Configuration 4
Container Channel	Name
1	CH01
2	CH02
3	CH03
4	CH04
5	CH05
6	CH06
7	CH07
8	CH08
9	CH09
10	CH10
11	CH11
12	CH12
13	CH13
14	CH14
15	CH15
16	CH16

Table A.7 –⁠ Channel Configuration 5
Container Channel	SMPTE ST 428-12 Name
1	Left
2	Right
3	Center
4	LFE
5	Left Side Surround
6	Right Side Surround
7	Left Rear Surround
8	Right Rear Surround
9	Hearing Impaired
10	Visually Impaired-Narrative

NOTE —⁠ Earlier revisions of this specification used terminology from SMPTE ST 428-3, instead of SMPTE ST 428-12, to define the mappings from container channels to audio channels. Although the mappings remain unchanged, the terms used to refer to a few of the audio channels have changed. For instance, SMPTE ST 428-12 differentiates Side Surrounds (Lss/Rss) from Left and Right surrounds (Ls/Rs) and uses Lrs to refer to the Left Rear Surround channel, whereas SMPTE ST 428-3 uses Rls.

A.3 Configurations using MXF Multichannel Audio Framework

A.3.1 General

When the ChannelAssignment of the WaveAudioEssence Descriptor in a Sound Track File contains the UL defined in Table A.8, the framework specified in SMPTE ST 377-4 shall be used in conjunction with the constraints defined in A.3.2 and A.3.3 to unambiguously identify the audio channels and soundfield group carried in the Sound Track File.

NOTE —⁠ Items defined in SMPTE ST 377-4 that are not specified in this section can nevertheless be present in the Sound Track File and describe particular aspects of an audio channel or soundfield group. Implementations can safely ignore these items.

The MXF Multichannel Audio Framework (MCA Framework) associates audio channels and soundfield groups contained within a D-Cinema Sound Track File with an MXF SubDescriptor that contains metadata, including a unique identifier. This enables D-Cinema implementations to properly route and process audio channels, e.g. the Hearing Impaired and Left channels may be handled by different devices. It also enables straightforward extensibility for the purpose of both experimentation and widespread use: new standalone audio channels can be defined without impacting existing soundfield groups and new soundfield groups can be introduced with minimal effort.

Figure A.1 illustrates the use of the audio channel and soundfield group information contained in a Sound Track File, as specified here.

Figure A.1 –⁠ Illustrative use of AudioChannelLabelSubDescriptor and SoundfieldGroupLabelSubDescriptor for a Sound Track File containing 10 audio channels consisting of a 5.1 soundfield group and associated Hearing Impaired and Visually Impaired-Narrative channels (Informative). The audio channel labeling method defined in this section is not limited to this specific channel count or soundfield configuration.

A.3.2 Configuration Channel Assignment Label

Table A.8 –⁠ Specification of the Channel Assignment Label when the MCA Framework is used
Byte No.	Description	Value (hex)	Meaning
1-7	Registry Designator	See register
8	Registry Version Number	`0dh`	Version of the register in which this label first appears
9	Parametric	`04h`	Node used to define parametric data
10	Sound Essence	`02h`	Identifies sound essence coding
11	Sound Coding Characteristics	`02h`	Identifies sound coding characteristics
12	Sound Channel Labeling	`10h`	Identifies sound channel labeling
13	Sound Channel Labeling SMPTE ST 429-2	`03h`	Identifies sound channel labeling as defined in this document (SMPTE ST 429-2)
14	D-Cinema Application of the MXF Multichannel Audio Framework	`02h`	Indicates that the D-Cinema Application of the MXF Multichannel Audio Framework is used
15	Reserved	`00h`	Reserved
16	Reserved	`00h`	Reserved

A.3.3 AudioChannelLabelSubDescriptor

Each audio channel contained in the Sound Track File shall be associated with zero or one AudioChannelLabelSubDescriptor instance, and each AudioChannelLabelSubDescriptor instance shall be associated with an audio channel.

Implementations shall ignore audio channels not associated with an AudioChannelLabelSubDescriptor instance. These channels should contain silence.

NOTE —⁠ The ChannelCount property of the Wave Audio Essence Descriptor reflects the number of channels in the Sound Track File and not the number of AudioChannelLabelSubDescriptor instances.

In addition to the items required by SMPTE ST 377-4, the following items shall be present in every AudioChannelLabelSubDescriptor instance:

MCA Channel ID
MCA Tag Name
RFC 5646 Spoken Language
SoundfieldGroupLinkID, if and only if the audio channel referenced by the AudioChannelLabelSubDescriptor instance belongs to a soundfield group associated with a SoundfieldGroupLabelSubDescriptor instance. If present, SoundfieldGroupLinkID shall contain the MCA Link ID value of the associated SoundfieldGroupLabelSubDescriptor instance.

Not all audio channels present in a Sound Track File need to be associated with a soundfield group. For example, Hearing Impaired and Visually Impaired-Narrative channels, if present, do not belong to a soundfield group and, hence, their respective AudioChannelLabelSubDescriptor instances do not reference a SoundfieldGroupLabelSubDescriptor instance.

If an audio channel is associated with a soundfield group, then the value of their respective RFC 5646 Spoken Language items shall be equal.

A.3.4 Common D-Cinema Channels

Implementations shall recognize the common D-Cinema audio channels defined in SMPTE ST 428-12.

The presence of such an audio channel shall be indicated by an AudioChannelLabelSubDescriptor instance whose MCA Label Dictionary ID value is equal to the UL of the audio channel as specified at SMPTE ST 428-12.

The MCA Tag Name of such an AudioChannelLabelSubDescriptor instance shall be equal to the Name (as specified in SMPTE ST 428-12) of the audio channel associated with the UL value.

The MCA Tag Symbol item of such an AudioChannelLabelSubDescriptor instance shall be constructed by prepending the string ch to the Symbol (as specified in SMPTE ST 428-12) of the audio channel associated with the UL value.

No audio channel listed at SMPTE ST 428-12 shall appear more than once in a given Sound Track File with the exception of Hearing Impaired and Visually Impaired-Narrative channels. If there are multiple Hearing Impaired or Visually Impaired-Narrative channels in a Sound Track File, they shall be distinguished by the value of their RFC 5646 Spoken Language item.

Furthermore, the RFC 5646 Spoken Language item shall not have the same value in two or more audio channels labeled Hearing Impaired, and the RFC 5646 Spoken Language item shall not have the same value in two or more audio channels labeled Visually Impaired-Narrative.
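The repetition rules of A.3.4 can be checked as sketched below. This informative example assumes each labeled channel is given as a (Name, RFC 5646 Spoken Language) pair; the function name, the message format, and the case-insensitive comparison of language tags are assumptions of the example.

```python
from collections import Counter

# Channels that may appear more than once when their languages differ.
REPEATABLE = {"Hearing Impaired", "Visually Impaired-Narrative"}


def duplicate_channels(channels):
    """Return a list describing channels that violate the repetition rules."""
    problems = []
    name_counts = Counter(name for name, _ in channels)
    pair_counts = Counter((name, language.lower()) for name, language in channels)
    for name, count in name_counts.items():
        # Any other channel may appear only once per Sound Track File.
        if count > 1 and name not in REPEATABLE:
            problems.append(name)
    for (name, language), count in pair_counts.items():
        # Repeatable channels must differ in spoken language.
        if count > 1 and name in REPEATABLE:
            problems.append(f"{name} [{language}]")
    return problems
```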

A.3.5 Extension Channels

For extensibility, channels not defined at SMPTE ST 428-12 may be present.

Implementations shall not automatically pre-assign an audio channel with an AudioChannelLabelSubDescriptor instance having an MCA Label Dictionary ID that the implementation does not recognize and, for the purpose of setting appropriate transport flags, should not assume that such an audio channel contains linear PCM audio samples suitable for direct conversion to an analog audio signal.

Implementations may display to the user channels associated with an MCA Label Dictionary ID they do not recognize and offer the user the option to take action on such a channel based on the MCA Tag Name, MCA Tag Symbol and RFC 5646 Spoken Language of the AudioChannelLabelSubDescriptor instance that references it.

A.3.6 SoundfieldGroupLabelSubDescriptor

There shall be one and only one SoundfieldGroupLabelSubDescriptor instance in the Sound Track File.

In addition to the items required by SMPTE ST 377-4, the following items shall be present in the SoundfieldGroupLabelSubDescriptor instance:

MCA Tag Name
RFC 5646 Spoken Language

A.3.7 Common D-Cinema Soundfield Groups

Implementations shall recognize the common D-Cinema soundfield groups specified at SMPTE ST 428-12.

The presence of such a soundfield group shall be indicated by a SoundfieldGroupLabelSubDescriptor instance whose MCA Label Dictionary ID value is equal to one of the ULs specified at SMPTE ST 428-12.

The MCA Tag Name of such a SoundfieldGroupLabelSubDescriptor instance shall match the value of the Name of the soundfield group (as specified in SMPTE ST 428-12) associated with the UL value.

The MCA Tag Symbol item of such a SoundfieldGroupLabelSubDescriptor instance shall be constructed by prepending the string sg to the Symbol of the soundfield group (as specified in SMPTE ST 428-12) associated with the UL value.
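The MCA Tag Symbol construction in A.3.4 and A.3.7 amounts to a fixed prefix per descriptor type. The informative sketch below shows the rule; the function name and the `kind` parameter are hypothetical, and the Symbol values in the usage note are illustrative inputs that should be taken from SMPTE ST 428-12 in practice.

```python
# Prefix prepended to the SMPTE ST 428-12 Symbol, per descriptor type.
TAG_SYMBOL_PREFIX = {
    "channel": "ch",      # AudioChannelLabelSubDescriptor (A.3.4)
    "soundfield": "sg",   # SoundfieldGroupLabelSubDescriptor (A.3.7)
}


def mca_tag_symbol(symbol, kind):
    """Return the MCA Tag Symbol for a channel or soundfield group Symbol."""
    return TAG_SYMBOL_PREFIX[kind] + symbol
```

For example, a channel Symbol of `L` would give `chL`, and a soundfield group Symbol of `51` would give `sg51`.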

Not all channels listed in the Audio Channels column of a given soundfield group in SMPTE ST 428-12 need to be present in the Sound Track File, but only those channels listed in the Audio Channels column for a given soundfield group may reference that SoundfieldGroupLabelSubDescriptor instance. Furthermore, if a channel is listed in the Audio Channels column of a given soundfield group but absent in the Sound Track File, then implementations shall assume the channel was not intended for reproduction by the content provider.

NOTE —⁠ Implementations may indicate to the user if a channel listed in the Audio Channels column for a given soundfield group is not present.

A.3.8 Extension Soundfield Groups

For extensibility, soundfield groups not defined at SMPTE ST 428-12 may be present. However, implementations shall take no action with a SoundfieldGroupLabelSubDescriptor instance having an MCA Label Dictionary ID that the implementation does not recognize or if a channel that is not listed in the Audio Channels column for a given soundfield group references that SoundfieldGroupLabelSubDescriptor instance.

NOTE —⁠ Implementations can use the SoundfieldGroupLabelSubDescriptor instance for display to the user and to appropriately configure the B-Chain for the intended soundfield reproduction.

Item Name	Type	Len	UL Designator	Req ?	Meaning	Default
Display Type	UTF16 String	var	`06.0E.2B.34 01.01.01.0E 06.01.01.02 04.00.00.00`	Opt	A text string giving an application specific means to indicate the intended use of the content of the XML document	none
Intrinsic Picture Resolution	UTF16 String	var	`06.0E.2B.34 01.01.01.0E 06.01.01.02 05.00.00.00`	Opt	Indicates the resolution of the primary picture on which Sub-Picture Ancillary Resources are to be rendered	none
RFC 5646 Language Tag List	UTF16 String	var	`06.0E.2B.34 01.01.01.0E 03.01.01.02 02.16.00.00`	Opt	A comma-separated list of language tags, each as specified at IETF RFC 5646.	empty
Z-Position In Use	UInt8	1	`06.0E.2B.34 01.01.01.0E 06.01.01.02 06.00.00.00`	Opt	When non-zero, indicates that one or more subtitle instances in the enclosed XML resource make use of stereoscopic positioning features.	`00h`
