Packager Documentation

Shaka Packager is a tool and a media packaging SDK for DASH and HLS packaging and encryption. It can transmux input media files from one container to another container.

Note

Shaka Packager does not do transcoding. Content must be pre-encoded before passing to packager.

Packager operates in streams, described by stream_descriptor. The streams can be read from the same “file” or different “files”, which can be regular files, pipes, udp streams, etc.

This page documents how to use the packager tool. If you are interested in integrating the packager library into your own tool, please see Shaka Packager Library.

Getting Shaka Packager

There are several ways you can get Shaka Packager.

Synopsis

$ packager <stream_descriptor> ... \
           [--dump_stream_info] \
           [--quiet] \
           [Chunking Options] \
           [MP4 Output Options] \
           [encryption / decryption options] \
           [DASH options] \
           [HLS options] \
           [Ads options]

Stream descriptors

There can be multiple stream descriptors, with input from the same “file” or from multiple different “files”.

A stream descriptor is of the form:

<field>=<value>[,<field>=<value>]...
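As a rough illustration of this form, the Python sketch below joins field/value pairs into descriptor strings and places them in an argument list. The helper name stream_descriptor and the file names are hypothetical; only the comma-separated field=value syntax and the flag name come from this page. The sketch does not handle values that themselves contain commas.

```python
def stream_descriptor(fields):
    """Join field=value pairs with commas, in the order given."""
    return ",".join(f"{name}={value}" for name, value in fields.items())


# Hypothetical file names, for illustration only.
video = stream_descriptor({"in": "input.mp4", "stream": "video", "out": "video.mp4"})
audio = stream_descriptor({"in": "input.mp4", "stream": "audio", "out": "audio.mp4"})

# Argument list that could be handed to subprocess.run().
argv = ["packager", video, audio, "--mpd_output", "manifest.mpd"]
print(" ".join(argv))
```

Both descriptors read from the same input “file” and select different streams from it.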

These are the available fields:

input (in):

Input/source media “file” path, which can be a regular file, a pipe, or a UDP stream. See UDP file options for additional options for UDP files.

stream_selector (stream):

Required field with value ‘audio’, ‘video’, ‘text’ or stream number (zero based).

output (out):

Required output file path (single file).

init_segment:

Initialization segment path (multiple files).

segment_template (segment):

Optional value which specifies the naming pattern for the segment files, and that the stream should be split into multiple files. Its presence should be consistent across streams. See Segment template formatting.
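The sketch below shows, in simplified form, how a naming pattern maps to multiple segment files. It assumes the $Number$ identifier from DASH SegmentTemplate, which is not described on this page (see Segment template formatting), and assumes numbering starts at 1. The file names and the expand_template helper are hypothetical; this is not the packager's own implementation.

```python
def expand_template(template, number):
    """Substitute the segment number into a '$Number$' naming pattern."""
    return template.replace("$Number$", str(number))


# A descriptor for multi-file output: init_segment plus segment_template.
descriptor = ",".join([
    "in=input.mp4",
    "stream=video",
    "init_segment=video_init.mp4",
    "segment_template=video_$Number$.m4s",
])
print(descriptor)

# File names the pattern would yield for the first three segments.
names = [expand_template("video_$Number$.m4s", n) for n in range(1, 4)]
print(names)
```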

bandwidth (bw):

Optional value which contains a user-specified maximum bit rate for the stream, in bits/sec. If specified, this value is propagated to (HLS) EXT-X-STREAM-INF:BANDWIDTH or (DASH) Representation@bandwidth and the $Bandwidth$ template parameter for segment names. If not specified, the bandwidth value is estimated from content bitrate. Note that it only affects the generated manifests/playlists; it has no effect on the media content itself.

language (lang):

Optional value which contains a user-specified language tag. If specified, this value overrides any language metadata in the input stream.

output_format (format):

Optional value which specifies the format of the output files (MP4 or WebM). If not specified, it will be derived from the file extension of the output file.

For subtitles in MP4, you can specify ‘vtt+mp4’ or ‘ttml+mp4’ to control which text format is used.

input_format:

Optional value which specifies the format of the input files or streams. If not specified, it will be autodetected, which in some cases may fail.

For example, a live UDP WebVTT input stream may be up and streaming long before a Shaka Packager instance consumes it, so Shaka Packager never receives the initial “WEBVTT” header string. In that case, Shaka Packager cannot autodetect the stream format as WebVTT and does not process the stream. Specifying ‘input_format=webvtt’ in the stream descriptor tells Shaka Packager to skip autodetection and treat the stream as WebVTT.

trick_play_factor (tpf):

Optional value which specifies the trick play, a.k.a. trick mode, stream sampling rate among key frames. If specified, the output is a trick play stream.

cc_index:

Optional value which specifies the index/ID of the subtitle stream to use for formats where multiple exist within the same stream. For example, CEA allows specifying up to 4 streams within a single video stream. If not specified, all subtitles will be merged together.

forced_subtitle:

Optional boolean value (0|1). If set to 1, indicates that this stream is a Forced Narrative subtitle that should be displayed when subtitles are otherwise off, for example to caption short portions of the audio that are in a foreign language. For DASH this sets the role to forced-subtitle; for HLS it sets FORCED=YES and AUTOSELECT=YES. Only valid for subtitles.

DASH specific stream descriptor fields

dash_accessibilities (accessibilities):

Optional semicolon separated list of values for DASH Accessibility element. The value should be in the format: scheme_id_uri=value, which propagates to the Accessibility element in the result DASH manifest. See DASH (ISO/IEC 23009-1) specification for details.

dash_roles (roles):

Optional semicolon separated list of values for DASH Role element. Each value should be one of: caption, subtitle, main, alternate, supplementary, commentary, dub, description, sign, metadata, enhanced-audio-intelligibility, emergency, forced-subtitle, easyreader, and karaoke.

See DASH (ISO/IEC 23009-1) specification for details.
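A minimal sketch of building a dash_roles field while checking each value against the list above. The helper name dash_roles_field is hypothetical, and the check is only a convenience for scripts; it says nothing about how the packager validates the field.

```python
# Role values listed above for the dash_roles field.
ALLOWED_ROLES = {
    "caption", "subtitle", "main", "alternate", "supplementary",
    "commentary", "dub", "description", "sign", "metadata",
    "enhanced-audio-intelligibility", "emergency", "forced-subtitle",
    "easyreader", "karaoke",
}


def dash_roles_field(roles):
    """Return 'dash_roles=<a>;<b>', rejecting values not in the list above."""
    unknown = [role for role in roles if role not in ALLOWED_ROLES]
    if unknown:
        raise ValueError(f"unsupported DASH role(s): {unknown}")
    return "dash_roles=" + ";".join(roles)


field = dash_roles_field(["caption", "easyreader"])
print(field)
```

When a descriptor containing this field is typed in a shell, quote it so that the semicolon is not treated as a command separator.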

HLS specific stream descriptor fields

hls_name:

Used for HLS audio to set the NAME attribute for EXT-X-MEDIA. Defaults to the base of the playlist name.

hls_group_id:

Used for HLS audio to set the GROUP-ID attribute for EXT-X-MEDIA. Defaults to ‘audio’ if not specified.

playlist_name:

The HLS playlist file to create. Usually ends with ‘.m3u8’, and is relative to hls_master_playlist_output (see below). If unspecified, defaults to something of the form ‘stream_0.m3u8’, ‘stream_1.m3u8’, ‘stream_2.m3u8’, etc.

iframe_playlist_name:

The optional HLS I-Frames only playlist file to create. Usually ends with ‘.m3u8’, and is relative to hls_master_playlist_output (see below). Should only be set for video streams. If unspecified, no I-Frames only playlist is created.

hls_characteristics (charcs):

Optional colon or semicolon separated list of values for the CHARACTERISTICS attribute for EXT-X-MEDIA. See CHARACTERISTICS attribute in http://bit.ly/2OOUkdB for details.

Chunking options

--segment_duration <seconds>

Segment duration in seconds. If single_segment is specified, this parameter sets the duration of a subsegment; otherwise, this parameter sets the duration of a segment. Actual segment durations may not be exactly as requested.

--fragment_duration <seconds>

Fragment duration in seconds. Should not be larger than the segment duration. Actual fragment durations may not be exactly as requested.

--segment_sap_aligned

Force segments to begin with stream access points. Default enabled.

--fragment_sap_aligned

Force fragments to begin with stream access points. This flag implies segment_sap_aligned. Default enabled.

--start_segment_number

Indicates the startNumber in DASH SegmentTemplate and HLS segment name.

MP4 output options

--mp4_include_pssh_in_stream

MP4 only: include pssh in the encrypted stream. Default enabled.

--mp4_use_decoding_timestamp_in_timeline

Deprecated. Do not use.

--generate_sidx_in_media_segments, --nogenerate_sidx_in_media_segments

Indicates whether to generate ‘sidx’ box in media segments. Note that it is required for DASH on-demand profile (not using segment template).

Default enabled.

Transport stream output options

--transport_stream_timestamp_offset_ms

Transport stream only (MPEG2-TS, HLS Packed Audio): A positive value, in milliseconds, by which output timestamps are offset to compensate for possible negative timestamps in the input. For example, timestamps from ISO-BMFF can be negative after being adjusted by an EditList. In transport streams, timestamps are not allowed to be less than zero. Default: 100ms.

DASH options

--generate_static_live_mpd

If enabled, generates static mpd. If segment_template is specified in stream descriptors, shaka-packager generates dynamic mpd by default; if this flag is enabled, shaka-packager generates static mpd instead. Note that if segment_template is not specified, shaka-packager always generates static mpd regardless of the value of this flag.
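The rule above can be summarized as a small decision function. This is only a restatement of the paragraph in Python; the function and parameter names are hypothetical.

```python
def mpd_type(uses_segment_template, generate_static_live_mpd=False):
    """Return which kind of MPD is generated, per the rule described above."""
    if not uses_segment_template:
        # Without segment_template the MPD is always static.
        return "static"
    return "static" if generate_static_live_mpd else "dynamic"


for template in (False, True):
    for flag in (False, True):
        print(template, flag, mpd_type(template, flag))
```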

--mpd_output <file_path>

MPD output file name.

--base_urls <comma_separated_urls>

Comma separated BaseURLs for the MPD:

<url>[,<url>]...

The values will be added as <BaseURL> element(s) immediately under the <MPD> element.

--min_buffer_time <seconds>

Specifies, in seconds, a common duration used in the definition of the MPD Representation data rate.

--minimum_update_period <seconds>

Indicates to the player how often to refresh the media presentation description in seconds. This value is used for dynamic MPD only.

--suggested_presentation_delay <seconds>

Specifies a delay, in seconds, to be added to the media presentation time. This value is used for dynamic MPD only.

--time_shift_buffer_depth <seconds>

Guaranteed duration of the time shifting buffer for dynamic media presentations, in seconds.

--preserved_segments_outside_live_window <num_segments>

Segments outside the live window (defined by time_shift_buffer_depth above) are automatically removed except for the most recent X segments defined by this parameter. This is needed to accommodate latencies in various stages of content serving pipeline, so that the segments stay accessible as they may still be accessed by the player.

The segments are not removed if the value is zero.
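The following sketch models how time_shift_buffer_depth and preserved_segments_outside_live_window interact. It is a simplified model, not the packager's algorithm: it assumes a segment is inside the live window when it ends after now minus the buffer depth, and all names and numbers are made up for illustration.

```python
def segments_to_keep(segment_end_times, now, time_shift_buffer_depth, preserved):
    """Return indices of segments kept on disk under a simplified model."""
    window_start = now - time_shift_buffer_depth
    outside = [i for i, end in enumerate(segment_end_times) if end <= window_start]
    inside = [i for i, end in enumerate(segment_end_times) if end > window_start]
    if preserved == 0:
        # A value of zero disables removal entirely.
        return outside + inside
    # Keep only the most recent `preserved` segments outside the window.
    return outside[-preserved:] + inside


# Ten 6-second segments ending at 6, 12, ..., 60 seconds.
ends = [6 * (i + 1) for i in range(10)]
kept = segments_to_keep(ends, now=60, time_shift_buffer_depth=18, preserved=2)
print(kept)
```

With an 18-second window, the last three segments are inside the window and the two segments just before it are preserved; the five oldest would be removed.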

--utc_timings <scheme_id_uri_value_pairs>

Comma separated UTCTiming schemeIdUri and value pairs for the MPD:

<scheme_id_uri>=<value>[,<scheme_id_uri>=<value>]...

This value is used for dynamic MPD only.

--default_language <language>

Any audio/text tracks tagged with this language will have <Role … value="main" /> in the manifest. This allows the player to choose the correct default language for the content.

This applies to both audio and text tracks. The default language for text tracks can be overridden by ‘default_text_language’.

--default_text_language <text_language>

Same as above, but this applies to text tracks only, and overrides the default language for text tracks.

--allow_approximate_segment_timeline

For live profile only.

If enabled, segments with close durations (i.e. differing by less than one sample) are considered to have the same duration. This enables the MPD generator to generate fewer SegmentTimeline entries. If all segments have the same duration except the last one, a further optimization uses SegmentTemplate@duration instead and omits SegmentTimeline completely.

Ignored if $Time$ is used in segment template, since $Time$ requires accurate Segment Timeline.
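To see why treating close durations as equal reduces the number of SegmentTimeline entries, consider this sketch. It run-length encodes segment durations, optionally merging values that differ by less than one sample. The durations, the sample duration, and the function name are made up for illustration, and the merging rule is a simplification of whatever the MPD generator actually does.

```python
def timeline_entries(durations, sample_duration=0):
    """Run-length encode durations into [duration, repeat_count] pairs.

    Durations that differ from the current entry by less than
    sample_duration are merged into it; 0 requires an exact match.
    """
    entries = []
    for d in durations:
        if entries and (d == entries[-1][0] or abs(d - entries[-1][0]) < sample_duration):
            entries[-1][1] += 1
        else:
            entries.append([d, 1])
    return entries


# Durations in timescale units; the last segment is shorter.
durations = [180000, 180180, 179820, 180000, 90000]
exact = timeline_entries(durations)
approx = timeline_entries(durations, sample_duration=3003)
print(len(exact), approx)
```

With exact matching every segment needs its own entry; with the approximation only the shorter last segment differs, which is the case where SegmentTemplate@duration can be used instead.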

--dash_only=0|1

Optional. Defaults to 0 if not specified. If it is set to 1, indicates the stream is DASH only.

--allow_codec_switching

If enabled, allow adaptive switching between different codecs, if they have the same language, media type (audio, video etc) and container type.

--low_latency_dash_mode

If enabled, LL-DASH streaming will be used, reducing overall latency by decoupling latency from segment duration.

--force_cl_index

True forces the muxer to order streams in the order given on the command-line. False uses the previous unordered behavior.

--dash_label <label_name>

Optional. Adds a Label tag to the adaptation set. The label is taken into consideration along with codecs, language, media type (audio, video etc) and container type when creating different adaptation sets.

HLS options

--hls_master_playlist_output <file_path>

Output path for the master playlist for HLS. This flag must be used to output HLS.

--hls_base_url <url>

The base URL for the Media Playlists and media files listed in the playlists. This is the prefix for the files.

--hls_key_uri <uri>

The key uri for ‘identity’ and ‘com.apple.streamingkeydelivery’ (FairPlay) key formats. Ignored if the playlist is not encrypted or not using the above key formats.

--hls_playlist_type <type>

VOD, EVENT, or LIVE. This defines the EXT-X-PLAYLIST-TYPE in the HLS specification. For hls_playlist_type of LIVE, EXT-X-PLAYLIST-TYPE tag is omitted.
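The mapping from this flag to the playlist tag can be sketched as follows. The function name is hypothetical; the tag syntax follows the HLS specification, and LIVE yields no tag, as described above.

```python
def playlist_type_tag(hls_playlist_type):
    """Return the EXT-X-PLAYLIST-TYPE line for a playlist type, or None for LIVE."""
    kind = hls_playlist_type.upper()
    if kind not in {"VOD", "EVENT", "LIVE"}:
        raise ValueError(f"unknown playlist type: {hls_playlist_type}")
    if kind == "LIVE":
        # The tag is omitted for LIVE playlists.
        return None
    return f"#EXT-X-PLAYLIST-TYPE:{kind}"


for kind in ("VOD", "EVENT", "LIVE"):
    print(kind, playlist_type_tag(kind))
```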

--time_shift_buffer_depth <seconds>

Guaranteed duration of the time shifting buffer for LIVE playlists, in seconds.

--preserved_segments_outside_live_window <num_segments>

Segments outside the live window (defined by time_shift_buffer_depth above) are automatically removed except for the most recent X segments defined by this parameter. This is needed to accommodate latencies in various stages of content serving pipeline, so that the segments stay accessible as they may still be accessed by the player.

The segments are not removed if the value is zero.

--default_language <language>

The first audio/text rendition in a group tagged with this language will have ‘DEFAULT’ attribute set to ‘YES’. This allows the player to choose the correct default language for the content.

This applies to both audio and text tracks. The default language for text tracks can be overridden by ‘default_text_language’.

--default_text_language <text_language>

Same as above, but this applies to text tracks only, and overrides the default language for text tracks.

--hls_media_sequence_number <unsigned_number>

HLS uses the EXT-X-MEDIA-SEQUENCE tag at the start of a live playlist to specify the sequence number of the first segment. This is needed because a live playlist holds a limited number of segments and keeps updating, adding new segments while removing old ones. When a player refreshes the playlist, this information is important for keeping track of segment positions.

When the packager starts, it naturally starts this count from zero. However, there are many situations where the packager is restarted and the count should continue from a previous sequence rather than restart from zero. The most common cause is a problem in the encoder feeding the packager.

With those cases in mind, this parameter sets the initial EXT-X-MEDIA-SEQUENCE value, which makes it possible to continue the sequence numbering from a previous packager run.

For more information about the reasoning of this, please see issue #691.

The EXT-X-MEDIA-SEQUENCE documentation can be read here: https://tools.ietf.org/html/rfc8216#section-4.3.3.2.
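One possible way to choose the value after a restart is to read the last media playlist written by the previous run and continue from the segment that follows it. The sketch below assumes that approach; the playlist contents, segment names, and function name are made up, and whether this is the right continuation point depends on your pipeline.

```python
def next_media_sequence(playlist_text):
    """Return the sequence number that follows the last segment in a playlist."""
    first = 0  # EXT-X-MEDIA-SEQUENCE is 0 when the tag is absent
    segments = 0
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            first = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            segments += 1  # one EXTINF tag per media segment
    return first + segments


previous_playlist = """#EXTM3U
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:6.000,
seg_120.ts
#EXTINF:6.000,
seg_121.ts
#EXTINF:6.000,
seg_122.ts
"""

start = next_media_sequence(previous_playlist)
argv = ["--hls_media_sequence_number", str(start)]
print(argv)
```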

--hls_start_time_offset <seconds>

Sets EXT-X-START on the media playlists to specify the preferred point at which the player should start playing. A positive number indicates a time offset from the beginning of the playlist. A negative number indicates a negative time offset from the end of the last media segment in the playlist.
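The sign convention can be illustrated by computing the position a player would prefer to start from. This is a sketch with hypothetical names; clamping to the playlist bounds follows the EXT-X-START rules in RFC 8216 and is not stated on this page.

```python
def preferred_start(playlist_duration, hls_start_time_offset):
    """Resolve the offset to a position in seconds from the playlist start."""
    if hls_start_time_offset >= 0:
        position = hls_start_time_offset
    else:
        # Negative offsets count back from the end of the last segment.
        position = playlist_duration + hls_start_time_offset
    # Offsets beyond the playlist resolve to its beginning or end.
    return max(0.0, min(float(playlist_duration), float(position)))


print(preferred_start(60.0, 10.0))
print(preferred_start(60.0, -18.0))
```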

--hls_only=0|1

Optional. Defaults to 0 if not specified. If it is set to 1, indicates the stream is HLS only.

--force_cl_index

True forces the muxer to order streams in the order given on the command-line. False uses the previous unordered behavior.

--create_session_keys

Playback of Offline HLS assets shall use EXT-X-SESSION-KEY to declare all eligible content keys in the master playlist.

Ads options

--ad_cues <start_time[;start_time]…>

List of cuepoint markers separated by semicolon. The start_time represents the start of the cue marker in seconds (double precision) relative to the start of the program. This flag preconditions content for Dynamic Ad Insertion with Google Ad Manager. For DASH, multiple periods will be generated with period boundaries at the next key frame to the designated start times; for HLS, segments will be terminated at the next key frame to the designated start times and the ‘#EXT-X-PLACEMENT-OPPORTUNITY’ tag will be inserted after the segment in the media playlist.
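The sketch below builds an --ad_cues value and shows where the resulting boundaries would fall for a given set of key frame times. It assumes “next key frame” means the first key frame at or after the cue; the key frame times, cue times, and function name are made up for illustration.

```python
from bisect import bisect_left


def cue_boundaries(key_frame_times, ad_cues):
    """Map each cue to the first key frame at or after it (None if there is none)."""
    boundaries = []
    for cue in ad_cues:
        i = bisect_left(key_frame_times, cue)
        boundaries.append(key_frame_times[i] if i < len(key_frame_times) else None)
    return boundaries


key_frames = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]  # sorted, in seconds
cues = [3.5, 8.0]

flag_value = ";".join(str(c) for c in cues)
boundaries = cue_boundaries(key_frames, cues)
print("--ad_cues", flag_value)
print(boundaries)
```

A cue at 3.5 seconds lands on the key frame at 4.0 seconds, while a cue that coincides with a key frame stays where it is.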

Encryption / decryption options

Shaka Packager supports three different types of key providers:

  • Raw key: keys are provided in command line

  • Widevine: fetches keys from Widevine key server

  • PlayReady: fetches keys from PlayReady key server

Different key providers cannot be specified at the same time.

[--enable_widevine_encryption <Widevine Encryption Options>] \
[--enable_widevine_decryption <Widevine Decryption Options>] \
[--enable_raw_key_encryption <Raw Key Encryption Options>] \
[--enable_raw_key_decryption <Raw Key Decryption Options>] \
[--enable_playready_encryption <PlayReady Encryption Options>]

General encryption options

--protection_scheme <scheme>

Specify a protection scheme, ‘cenc’ or ‘cbc1’ or pattern-based protection schemes ‘cens’ or ‘cbcs’.

--crypt_byte_block

Specify the count of the encrypted blocks in the protection pattern, where each block is 16 bytes in size.

There are three common patterns (crypt_byte_block:skip_byte_block): 1:9 (default), 5:5, 10:0.

Applies to video streams with ‘cbcs’ and ‘cens’ protection schemes only; ignored otherwise.

--skip_byte_block

Specify the count of the unencrypted blocks in the protection pattern.

Applies to video streams with ‘cbcs’ and ‘cens’ protection schemes only; ignored otherwise.
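The crypt_byte_block:skip_byte_block pattern can be pictured as a repeating mask over 16-byte blocks. The sketch below marks which blocks of a protected range would be encrypted under a given pattern. It is a simplified picture with hypothetical names and ignores details of real subsample encryption, such as how a trailing partial block is handled.

```python
def block_pattern(total_blocks, crypt_byte_block=1, skip_byte_block=9):
    """Mark each 16-byte block as encrypted (True) or left clear (False)."""
    period = crypt_byte_block + skip_byte_block
    if period <= 0:
        raise ValueError("pattern must cover at least one block")
    return [(i % period) < crypt_byte_block for i in range(total_blocks)]


# Default 1:9 pattern over 20 blocks: one encrypted block out of every ten.
encrypted = [i for i, enc in enumerate(block_pattern(20)) if enc]
print(encrypted)
```

With 10:0 every block is encrypted, and with 5:5 the blocks alternate in runs of five.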

--vp9_subsample_encryption, --novp9_subsample_encryption

Enable / disable VP9 subsample encryption. Enabled by default.

--clear_lead <seconds>

Clear lead in seconds if encryption is enabled. Shaka Packager does not support partially encrypted segments: all segments up to and including the one that overlaps the end of the initial ‘clear_lead’ seconds are left unencrypted, and all following segments are encrypted. If segment_duration is greater than ‘clear_lead’, only the first segment is left unencrypted. Default: 5
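As a worked example of this rule, the sketch below counts how many leading segments stay unencrypted. It assumes every segment is exactly segment_duration long, which real content only approximates; the function name is hypothetical.

```python
import math


def clear_segment_count(clear_lead, segment_duration):
    """Number of leading segments left unencrypted, assuming uniform segments."""
    if clear_lead <= 0:
        return 0
    # A segment that only partly overlaps the clear lead is left fully clear.
    return math.ceil(clear_lead / segment_duration)


print(clear_segment_count(5, 6))  # segment longer than the clear lead
print(clear_segment_count(5, 2))  # third segment straddles the 5-second mark
```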

--protection_systems

Protection systems to be generated. Supported protection systems include Widevine, PlayReady, FairPlay, Marlin, and CommonSystem.

--playready_extra_header_data <string>

Extra XML data to add to PlayReady PSSH data. Can be specified even if using another key source.

Raw key encryption options

--enable_raw_key_encryption

Enable encryption with raw key (keys provided in command line). This generates Common protection system if neither --pssh nor --protection_systems is specified. Use --pssh to provide custom protection systems or use --protection_systems to generate protection systems automatically.

--enable_raw_key_decryption

Enable decryption with raw key (keys provided in command line).

--keys <key_info_string[,key_info_string][,key_info_string]…>

key_info_string is of the form:

label=<label>:key_id=<key_id>:key=<key>[:iv=<initialization_vector>]

label can be an arbitrary string or a predefined DRM label like AUDIO, SD, HD, etc. Label with an empty string indicates the default key and key_id. The drm_label in Stream descriptors, which can be implicit, determines which key info is applied to the stream by matching the drm_label with the label in key info.

key_id and key should be 32-digit hex strings.

initialization_vector is an optional IV with the same format and semantics as the parameter for the --iv option below. This is mutually exclusive with that option.
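A small sketch that assembles a key_info_string and checks the hex lengths described in this section. The helper name and the key values are placeholders for illustration only; never use them for real content.

```python
import re

HEX32 = re.compile(r"[0-9a-fA-F]{32}")
HEX16_OR_32 = re.compile(r"(?:[0-9a-fA-F]{16}){1,2}")


def key_info(label, key_id, key, iv=None):
    """Build 'label=...:key_id=...:key=...[:iv=...]' with basic length checks."""
    for name, value in (("key_id", key_id), ("key", key)):
        if not HEX32.fullmatch(value):
            raise ValueError(f"{name} must be a 32-digit hex string")
    parts = [f"label={label}", f"key_id={key_id}", f"key={key}"]
    if iv is not None:
        if not HEX16_OR_32.fullmatch(iv):
            raise ValueError("iv must be a 16-digit or 32-digit hex string")
        parts.append(f"iv={iv}")
    return ":".join(parts)


# Placeholder values; an empty label marks the default key.
info = key_info("", "11223344556677889900aabbccddeeff",
                "ffeeddccbbaa00998877665544332211")
argv = ["--enable_raw_key_encryption", "--keys", info]
print(argv)
```

Several key_info_string values can be joined with commas to form the full --keys argument.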

--iv <16-digit or 32-digit hex string>

IV in hex string format. If not specified, a random IV will be generated. This flag should only be used for testing. The IV must be either 8 bytes (16 hex digits) or 16 bytes (32 hex digits).

--pssh <hex string>

One or more concatenated PSSH boxes in hex string format. If neither this flag nor --protection_systems is specified, a v1 common PSSH box will be generated.

Widevine encryption options

--enable_widevine_encryption

Enable encryption with Widevine key server. User should provide either AES signing key (--aes_signing_key, --aes_signing_iv) or RSA signing key (--rsa_signing_key_path). This generates Widevine protection system if --protection_systems is not specified. Use --protection_systems to generate multiple protection systems.

--enable_entitlement_license

Enable entitlement license in the Widevine encryption request.

--enable_widevine_decryption

Enable decryption with Widevine key server. User should provide either AES signing key (--aes_signing_key, --aes_signing_iv) or RSA signing key (--rsa_signing_key_path).

--key_server_url <url>

Key server url. Required for Widevine encryption and decryption.

--content_id <hex>

Content identifier that uniquely identifies the content.

--policy <policy>

The name of a stored policy, which specifies DRM content rights.

--max_sd_pixels <pixels>

The video track is considered SD if its max pixels per frame is no higher than max_sd_pixels. Default: 442368 (768 x 576).

--max_hd_pixels <pixels>

The video track is considered HD if its max pixels per frame is higher than max_sd_pixels, but no higher than max_hd_pixels. Default: 2073600 (1920 x 1080).

--max_uhd1_pixels <pixels>

The video track is considered UHD1 if its max pixels per frame is higher than max_hd_pixels, but no higher than max_uhd1_pixels. Otherwise it is UHD2. Default: 8847360 (4096 x 2160).
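The three thresholds partition video tracks into four classes. The sketch below restates the rules with the defaults given above; the function name is hypothetical.

```python
def video_class(max_pixels_per_frame,
                max_sd_pixels=442368,
                max_hd_pixels=2073600,
                max_uhd1_pixels=8847360):
    """Classify a video track as SD, HD, UHD1, or UHD2 by pixels per frame."""
    if max_pixels_per_frame <= max_sd_pixels:
        return "SD"
    if max_pixels_per_frame <= max_hd_pixels:
        return "HD"
    if max_pixels_per_frame <= max_uhd1_pixels:
        return "UHD1"
    return "UHD2"


for width, height in ((720, 576), (1280, 720), (3840, 2160), (7680, 4320)):
    print(width, height, video_class(width * height))
```

Each threshold is inclusive, so a 1920 x 1080 track is still HD with the default settings.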

--signer <signer>

The name of the signer.

--aes_signing_key <hex>

AES signing key in hex string. aes_signing_iv is required if aes_signing_key is specified. This option is exclusive with rsa_signing_key_path.

--aes_signing_iv <hex>

AES signing iv in hex string.

--rsa_signing_key_path <file path>

Path to the file containing PKCS#1 RSA private key for request signing. This option is exclusive with aes_signing_key.

--crypto_period_duration <seconds>

Defines how often key rotates. If it is non-zero, key rotation is enabled.

--group_id <hex>

Identifier for a group of licenses.

PlayReady encryption options

--enable_playready_encryption

Enable encryption with PlayReady key. This generates PlayReady protection system if --protection_systems is not specified. Use --protection_systems to generate multiple protection systems.

--playready_server_url <url>

PlayReady packaging server url.

--program_identifier <program_identifier>

Program identifier for packaging request.

--ca_file <file path>

Absolute path to the certificate authority file for the server cert. PEM format. Optional, depends on server configuration.

--client_cert_file <file path>

Absolute path to client certificate file. Optional, depends on server configuration.

--client_cert_private_key_file <file path>

Absolute path to the private key file. Optional, depends on server configuration.

--client_cert_private_key_password <string>

Password to the private key file. Optional, depends on server configuration.