Kinesis Video Streams data model

The producer libraries and the stream parser library (described in Upload to Kinesis Video Streams and Watch output from cameras using parser library) send and receive video data in a format that supports embedding information alongside the video data. This format is based on the Matroska (MKV) specification.

The MKV format is an open specification for media data. All the libraries and code examples in the Amazon Kinesis Video Streams Developer Guide send or receive data in the MKV format.

The producer library (see Upload to Kinesis Video Streams) uses the StreamDefinition and Frame types to produce MKV stream headers, frame headers, and frame data.

For information about the full MKV specification, see Matroska Specifications.
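As a small illustration of what "MKV-formatted" means at the byte level: a Matroska byte stream begins with an EBML header element whose element ID is the four bytes 0x1A 0x45 0xDF 0xA3. The following is a minimal sketch of a check for that signature. The helper name startsWithEbmlHeader is hypothetical and is not part of any Kinesis Video Streams library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Element ID of the EBML header that opens a Matroska (MKV) byte stream.
constexpr std::array<std::uint8_t, 4> kEbmlHeaderId = {0x1A, 0x45, 0xDF, 0xA3};

// Hypothetical helper (not a library API): returns true if the buffer
// begins with the EBML header element ID.
bool startsWithEbmlHeader(const std::uint8_t* data, std::size_t size) {
    if (data == nullptr || size < kEbmlHeaderId.size()) {
        return false;  // too short to hold the 4-byte ID
    }
    for (std::size_t i = 0; i < kEbmlHeaderId.size(); ++i) {
        if (data[i] != kEbmlHeaderId[i]) {
            return false;
        }
    }
    return true;
}
```

A check like this can serve as a quick sanity test on received data before handing it to a full MKV parser. It doesn't validate anything beyond the first four bytes.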

The following sections describe the components of MKV-formatted data produced by the C++ producer library.

Stream header elements

The following MKV header elements are used by StreamDefinition (defined in StreamDefinition.h).

| Element | Description | Typical values |
| --- | --- | --- |
| stream_name | Corresponds to the name of the Kinesis video stream. | my-stream |
| retention_period | The duration, in hours, that stream data is persisted by Kinesis Video Streams. Specify 0 for a stream that doesn't retain data. | 24 |
| tags | A key-value collection of user data. This data is displayed in the AWS Management Console and can be read by client applications to filter or get information about a stream. | |
| kms_key_id | If present, the user-defined AWS KMS key is used to encrypt data on the stream. If absent, the data is encrypted by the Kinesis-supplied key (aws/kinesisvideo). | 01234567-89ab-cdef-0123-456789ab |
| streaming_type | Currently, the only valid streaming type is STREAMING_TYPE_REALTIME. | STREAMING_TYPE_REALTIME |
| content_type | The user-defined content type. For streaming video data to play in the console, the content type must be video/h264. | video/h264 |
| max_latency | This value isn't currently used and should be set to 0. | 0 |
| fragment_duration | An estimate of how long your fragments should be, which is used for optimization. The actual fragment duration is determined by the streaming data. | 2 |
| timecode_scale | Indicates the scale used by frame timestamps. The default is 1 millisecond. Specifying 0 also assigns the default value of 1 millisecond. This value can be between 100 nanoseconds and 1 second. For more information, see TimecodeScale in the Matroska documentation. | |
| key_frame_fragmentation | If true, the stream starts a new cluster when a key frame is received. | true |
| frame_timecodes | If true, Kinesis Video Streams uses the presentation timestamp (pts) and decoding timestamp (dts) values of the received frames. If false, Kinesis Video Streams stamps the frames with system-generated time values when they are received. | true |
| absolute_fragment_time | If true, the cluster timecodes are interpreted as using absolute time (for example, from the producer's system clock). If false, the cluster timecodes are interpreted as being relative to the start time of the stream. | true |
| fragment_acks | If true, acknowledgements (ACKs) are sent when Kinesis Video Streams receives the data. The ACKs can be received using the KinesisVideoStreamFragmentAck or KinesisVideoStreamParseFragmentAck callbacks. | true |
| restart_on_error | Indicates whether the stream should resume transmission after a stream error is raised. | true |
| nal_adaptation_flags | Indicates whether NAL (Network Abstraction Layer) adaptation or codec private data is present in the content. Valid flags include NAL_ADAPTATION_ANNEXB_NALS and NAL_ADAPTATION_ANNEXB_CPD_NALS. | NAL_ADAPTATION_ANNEXB_NALS |
| frame_rate | An estimate of the content frame rate. This value is used for optimization; the actual frame rate is determined by the rate of incoming data. Specifying 0 assigns the default of 24. | 24 |
| avg_bandwidth_bps | An estimate of the content bandwidth, in Mbps. This value is used for optimization; the actual rate is determined by the bandwidth of incoming data. For example, for a 720p resolution video stream running at 25 FPS, you can expect the average bandwidth to be 5 Mbps. | 5 |
| buffer_duration | The duration that content is to be buffered on the producer. If network latency is low, this value can be reduced. If network latency is high, increasing this value helps prevent frames from being dropped before they can be sent, which happens when a smaller buffer has no room left to store incoming frames. | |
| replay_duration | The amount of time that the video data stream is "rewound" if the connection is lost. This value can be 0 if frames lost because of connection loss are not a concern. The value can be increased if the consuming application can remove redundant frames. This value should be less than the buffer duration; otherwise, the buffer duration is used. | |
| connection_staleness | The duration that a connection is maintained when no data is received. | |
| codec_id | The codec used by the content. For more information, see CodecID in the Matroska specification. | V_MPEG2 |
| track_name | The user-defined name of the track. | my_track |
| codecPrivateData | Data provided by the encoder that is used to decode the frame data, such as the frame width and height in pixels, which is needed by many downstream consumers. In the C++ producer library, the gMkvTrackVideoBits array in MkvStatics.cpp includes pixel width and height for the frame. | |
| codecPrivateDataSize | The size of the data in the codecPrivateData parameter. | |
| track_type | The type of the track for the stream. | MKV_TRACK_INFO_TYPE_AUDIO or MKV_TRACK_INFO_TYPE_VIDEO |
| segment_uuid | User-defined segment UUID (16 bytes). | |
| default_track_id | Unique non-zero number for the track. | 1 |
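Several of the elements above have documented defaults or limits: 0 selects the default for timecode_scale and frame_rate, and replay_duration falls back to the buffer duration when it isn't smaller. The following sketch shows those rules as code. StreamSettingsSketch and normalizeSettings are hypothetical names for illustration and are not the StreamDefinition class from the producer library. Clamping an out-of-range timecode scale into the allowed range is an assumption of this sketch; the library might reject such values instead.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical, simplified stand-in for a few stream settings.
// This is NOT the StreamDefinition class from the producer library.
struct StreamSettingsSketch {
    std::uint64_t timecodeScaleNs = 0;   // 0 means "use the default"
    std::uint32_t frameRate = 0;         // 0 means "use the default"
    std::uint64_t bufferDurationMs = 0;
    std::uint64_t replayDurationMs = 0;
};

constexpr std::uint64_t kDefaultTimecodeScaleNs = 1000000;   // 1 millisecond
constexpr std::uint64_t kMinTimecodeScaleNs = 100;           // 100 nanoseconds
constexpr std::uint64_t kMaxTimecodeScaleNs = 1000000000;    // 1 second
constexpr std::uint32_t kDefaultFrameRate = 24;

// Applies the documented defaults and limits to a copy of the settings.
StreamSettingsSketch normalizeSettings(StreamSettingsSketch s) {
    if (s.timecodeScaleNs == 0) {
        s.timecodeScaleNs = kDefaultTimecodeScaleNs;
    } else {
        // Assumption: out-of-range values are clamped rather than rejected.
        s.timecodeScaleNs = std::max(kMinTimecodeScaleNs,
                                     std::min(kMaxTimecodeScaleNs, s.timecodeScaleNs));
    }
    if (s.frameRate == 0) {
        s.frameRate = kDefaultFrameRate;
    }
    // replay_duration should be less than buffer_duration;
    // otherwise the buffer duration is used.
    if (s.replayDurationMs >= s.bufferDurationMs) {
        s.replayDurationMs = s.bufferDurationMs;
    }
    return s;
}
```

For example, settings with every field left at 0 except a 120-second buffer and a 200-second replay normalize to a 1 ms timecode scale, a frame rate of 24, and a 120-second replay duration.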

Stream track data

The following MKV track elements are used by StreamDefinition (defined in StreamDefinition.h).

| Element | Description | Typical values |
| --- | --- | --- |
| track_name | User-defined track name. For example, "audio" for the audio track. | audio |
| codec_id | Codec ID for the track. For example, "A_AAC" for an audio track. | A_AAC |
| cpd | Data provided by the encoder that is used to decode the frame data. This data can include frame width and height in pixels, which is needed by many downstream consumers. In the C++ producer library, the gMkvTrackVideoBits array in MkvStatics.cpp includes pixel width and height for the frame. | |
| cpd_size | The size of the data in the cpd parameter. | |
| track_type | The type of the track. For example, you can use the enum value of MKV_TRACK_INFO_TYPE_AUDIO for audio. | MKV_TRACK_INFO_TYPE_AUDIO |
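To make the per-track elements concrete, the following sketch models a track entry and checks the rule that each track ID must be a unique non-zero number. TrackSketch, TrackTypeSketch, and trackIdsAreValid are hypothetical names for illustration; they aren't types or functions from the producer library. The codec ID strings in the usage example are standard Matroska codec IDs.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical track type, standing in for the library's track-type enum.
enum class TrackTypeSketch { Video, Audio };

// Hypothetical, simplified model of the per-track elements.
struct TrackSketch {
    std::uint64_t trackId;            // must be unique and non-zero
    std::string trackName;            // for example, "audio"
    std::string codecId;              // for example, "A_AAC"
    std::vector<std::uint8_t> cpd;    // codec private data; cpd_size is cpd.size()
    TrackTypeSketch trackType;
};

// Returns true if every track ID is non-zero and no ID repeats.
bool trackIdsAreValid(const std::vector<TrackSketch>& tracks) {
    std::set<std::uint64_t> seen;
    for (const TrackSketch& track : tracks) {
        if (track.trackId == 0) {
            return false;
        }
        if (!seen.insert(track.trackId).second) {
            return false;  // duplicate ID
        }
    }
    return true;
}
```

A stream with one video track (ID 1, codec ID "V_MPEG4/ISO/AVC") and one audio track (ID 2, codec ID "A_AAC") passes this check. Reusing ID 1 for both tracks, or using 0, fails it.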

Frame header elements

The following MKV header elements are used by Frame (defined in the KinesisVideoPic package, in mkvgen/Include.h):

  • Frame Index: A monotonically increasing value.

  • Flags: The type of frame. Valid values include the following:

    • FRAME_FLAG_NONE

    • FRAME_FLAG_KEY_FRAME: If key_frame_fragmentation is set on the stream, key frames start a new fragment.

    • FRAME_FLAG_DISCARDABLE_FRAME: Tells the decoder that it can discard this frame if decoding is slow.

    • FRAME_FLAG_INVISIBLE_FRAME: Duration of this block is 0.

  • Decoding Timestamp: The timestamp at which this frame is decoded. If frames that appear earlier in the stream depend on this frame for decoding, this timestamp might be earlier than theirs. This value is relative to the start of the fragment.

  • Presentation Timestamp: The timestamp of when this frame is displayed. This value is relative to the start of the fragment.

  • Duration: The playback duration of the frame.

  • Size: The size of the frame data, in bytes.
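The interaction between the key-frame flag and the key_frame_fragmentation stream setting can be sketched as follows. FrameSketch, the kFlag constants, and startsNewFragment are hypothetical names for illustration. The bit values are assumptions of this sketch; the authoritative definitions are in mkvgen/Include.h.

```cpp
#include <cstdint>

// Illustrative flag bits for this sketch only; see mkvgen/Include.h
// for the real definitions.
enum FrameFlagSketch : std::uint32_t {
    kFlagNone = 0,
    kFlagKeyFrame = 1u << 0,
    kFlagDiscardable = 1u << 1,
    kFlagInvisible = 1u << 2,
};

// Hypothetical, simplified model of the frame header elements.
struct FrameSketch {
    std::uint32_t index;            // monotonically increasing
    std::uint32_t flags;            // combination of FrameFlagSketch bits
    std::uint64_t decodingTs;       // relative to the start of the fragment
    std::uint64_t presentationTs;   // relative to the start of the fragment
    std::uint64_t duration;         // playback duration of the frame
    std::uint32_t size;             // size of the frame data, in bytes
};

// A key frame starts a new fragment only when key_frame_fragmentation
// is enabled on the stream.
bool startsNewFragment(const FrameSketch& frame, bool keyFrameFragmentation) {
    return keyFrameFragmentation && (frame.flags & kFlagKeyFrame) != 0;
}
```

With key_frame_fragmentation enabled, a frame flagged as a key frame starts a new fragment and a frame with no flags doesn't. With it disabled, neither does.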

MKV frame data

The data in frame.frameData might contain only media data for the frame, or it might contain further nested header information, depending on the encoding scheme used. To be displayed in the AWS Management Console, the data must be encoded with the H.264 codec, but Kinesis Video Streams can receive time-serialized data streams in any format.
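As one example of nested structure inside frame data: H.264 in Annex B byte-stream form is a sequence of NAL units, each preceded by a start code of 00 00 01 (optionally with an extra leading 00). The following sketch locates those NAL units in a buffer. findNalUnitOffsets is a hypothetical helper, not a library function, and it assumes the frame data is in Annex B form, which isn't true for every producer configuration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the offset of the first byte that follows
// each Annex B start code (00 00 01). A four-byte start code (00 00 00 01)
// is matched through its last three bytes.
std::vector<std::size_t> findNalUnitOffsets(const std::uint8_t* data, std::size_t size) {
    std::vector<std::size_t> offsets;
    std::size_t i = 0;
    // Require at least one byte after the start code.
    while (i + 3 < size) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            offsets.push_back(i + 3);
            i += 3;
        } else {
            ++i;
        }
    }
    return offsets;
}
```

For H.264, the low five bits of the byte at each returned offset give the NAL unit type (for example, 7 for a sequence parameter set and 5 for an IDR slice), which is one way a consumer can distinguish header-like data from picture data.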