
Review and monitoring - Streaming Media Lens

Review and monitoring

SM_PERF3: How do you use caching to improve content delivery performance? 
SM_PBP5 – Use a content delivery network and monitor your cache-hit-ratio
SM_PBP6 – Ensure that cache-control headers for your content are optimized
SM_PBP7 – Have a cache invalidation runbook
SM_PBP8 – Minimize negative (error) caching

A Content Delivery Network (CDN) scales video delivery by serving content from local caches nearest the user and providing optimized routes to origination services. Caching improves time-to-first-byte for clients and reduces the load on origin services. CDNs use a multi-tier architecture with two or three tiers of cache hierarchy before requests make it back to the origin server. These tiers are usually referred to as the edge tier and mid-tier caches. The edge tier is the first to receive a request from the client and responds fastest in the case of a cache match. The mid-tier has a larger cache depth, but is located only in select locations. Cache misses at the edge tier fall back to the mid-tier for another chance at a cache match.

A Cache Hit Ratio (CHR) is the ratio of requests served from cache (matches) to the total requests (misses and matches) over a period of time. Cache matches improve the client experience, while cache misses result in requests directly to your origin layer, increasing response latency and cost. Monitoring CHR will help you to improve delivery and origin layer performance over time. You can enable CHR, origin latency, and HTTP error rate metrics from your Amazon CloudFront monitoring settings.
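As an illustration of the definition above, CHR can be computed from hit and miss counts collected over a period. The helper below is hypothetical (not part of any AWS SDK) and only shows the calculation:

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Cache Hit Ratio: requests served from cache divided by total requests."""
    total = hits + misses
    if total == 0:
        # No traffic in the window; report 0 rather than dividing by zero.
        return 0.0
    return hits / total


# Example: 950 cache matches and 50 misses over the window.
ratio = cache_hit_ratio(950, 50)  # 0.95
```

In practice these counts would come from your CDN access logs or CloudFront metrics rather than being tracked by hand.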

CDNs typically employ a least recently used (LRU) caching strategy on each tier. This means that data will be maintained in caches based on the amount of traffic an object receives and the available cache size. Though you can’t guarantee caches will hold content for the next request, you can set Cache-Control headers on the origin to indicate the preferred duration for an object to be kept in a CDN cache. Your CDN should be configured to respect caching headers from your origin server to ensure that live content and manifests are only cached for the appropriate amount of time.

Live streaming manifests are frequently updated to represent the next media object in the stream and should not be cached for longer than half of your segment duration. Caching a manifest for longer than the segment duration could result in serving stale manifests, a delay for clients retrieving the next media segment, and client buffer exhaustion, which will negatively impact user experience. Live media segments and VOD content (both segments and manifests) should be cached for as long as possible to retain them in delivery caches for the maximum amount of time.

Scenario | Segment Size | Manifest Update Frequency | Segment Cache-Control Header or Cache Behavior | Manifest Cache-Control Header or Cache Behavior
Live     | 10 seconds   | 10 seconds                | 21,600 seconds or max DVR window               | 5 seconds or less
VOD      | 10 seconds   | Static                    | 86,400 seconds or longest possible             | 86,400 seconds or longest possible

Recommended cache behaviors for live and Video-on-Demand (VOD) scenarios
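The recommendations in the table above can be sketched as a small origin-side helper that picks a Cache-Control header per scenario. This is an illustrative function, not an AWS API; the 21,600 s and 86,400 s values come from the table, and a 10-second segment duration is assumed by default:

```python
def cache_control_header(scenario: str, object_type: str,
                         segment_duration: int = 10) -> str:
    """Return a Cache-Control header value following the table above.

    scenario: "live" or "vod"; object_type: "segment" or "manifest".
    """
    if scenario == "live" and object_type == "manifest":
        # Live manifests: no more than half a segment duration,
        # to avoid serving stale manifests.
        return f"public, max-age={segment_duration // 2}"
    if scenario == "live" and object_type == "segment":
        # Live segments: cache up to the DVR window (21,600 s here).
        return "public, max-age=21600"
    # VOD segments and manifests: cache as long as possible (86,400 s here).
    return "public, max-age=86400"
```

For example, `cache_control_header("live", "manifest")` yields `"public, max-age=5"`, matching the "5 seconds or less" guidance for a 10-second segment.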

There are often times when cached content needs to be modified or invalidated. Have a cache invalidation runbook in place so that you can modify cached objects and invalidate the previous content. This can be achieved by invalidating content with a CDN feature, using variable file names, or using query string parameters to “break” the cache when content is changed.
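One of the cache-breaking techniques mentioned above, using a query string parameter derived from the object's content, can be sketched as follows. The helper and the `v` parameter name are illustrative assumptions, not a CDN feature:

```python
import hashlib


def versioned_url(base_url: str, content: bytes) -> str:
    """Append a short content hash as a query-string parameter so that a
    changed object gets a new cache key, "breaking" the old cache entry."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}v={digest}"
```

Because the hash is deterministic, unchanged content keeps its URL (and its cache entry), while any content change produces a new URL that bypasses previously cached copies. For this to work, the CDN must be configured to include the query string in its cache key.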

Caching of error responses from the origin, also known as negative caching, should be minimized as some streaming clients might proactively request future segments before they are published to minimize latency. For live streaming, it should be disabled completely for manifest and segment files. At a minimum, the negative caching duration should not exceed one segment length. Amazon CloudFront caches origin errors for five minutes by default, but you can configure it to suit your needs.
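As a sketch of how the CloudFront default might be overridden, the fragment below shows the `CustomErrorResponses` portion of a distribution configuration (in the boto3 dictionary shape) with `ErrorCachingMinTTL` set to 0, disabling negative caching for two common origin errors. The choice of error codes is an assumption for illustration:

```python
# Illustrative fragment of a CloudFront DistributionConfig: disable
# negative caching for a live-streaming distribution by setting
# ErrorCachingMinTTL to 0 for selected origin error codes.
custom_error_responses = {
    "Quantity": 2,
    "Items": [
        {"ErrorCode": 404, "ErrorCachingMinTTL": 0},
        {"ErrorCode": 503, "ErrorCachingMinTTL": 0},
    ],
}
```

This fragment would be merged into the full distribution configuration passed to an update call; the surrounding required fields are omitted here for brevity.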

SM_PERF4: How do you monitor viewer experience?
SM_PBP9 – Collect and analyze real user logs and metrics
SM_PBP10 – Recognize and respond to playback anomalies

Infrastructure logging and monitoring only provides you with part of the picture. We recommend that you design a client that sends real-user data directly to monitoring and logging systems. This allows you to benchmark normal behavior, identify anomalies, and correlate events with content delivery systems. For example, session initialization information, like playback URL, user-agent, and network connection status, could help you identify issues with a specific origin, client device type, or network environment.

For streaming media, it’s especially important to monitor the health of the video decoder to determine how changes in network topology, video encoding settings, or mobile operating systems impact the end user experience. For example, capturing video buffering events, which directly impact customer satisfaction, should be a key indicator of streaming health. 

We recommend that you capture client metrics from streaming sessions with services like Amazon Kinesis and monitor for anomalies with Amazon CloudWatch. Equipped with this data, you can uncover patterns from real users, create alerting systems, and automate remediation tasks. The AWS Partner Network provides another avenue for video-specific monitoring tools that can give you actionable data from playback sessions.
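A client-side event might be serialized like this before being sent to a stream. The function and its field names are illustrative assumptions, not a prescribed schema:

```python
import json
import time


def build_playback_event(session_id: str, event_type: str, detail: dict) -> bytes:
    """Serialize a client playback event (for example, a buffering event)
    into a bytes payload suitable for a streaming ingestion service."""
    record = {
        "session_id": session_id,
        "event_type": event_type,
        "timestamp": time.time(),
        **detail,
    }
    return json.dumps(record).encode("utf-8")
```

With boto3, the resulting payload could then be sent via `kinesis_client.put_record(StreamName=..., Data=payload, PartitionKey=session_id)`, using the session ID as the partition key so that each session's events stay ordered within a shard.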

Amazon Prime Video, a streaming media service by Amazon, has many ways of monitoring customer experience. One key metric, Zero-Impact-Rate, measures the percentage of streaming sessions that are free of buffering and errors. This is used to baseline customer experience and alert when there are deviations from normal behavior. Here are other client metrics that provide valuable insights into viewer playback experience:

Metric                     | Description
Time-to-First-Frame        | Time between the client request for content and the first frame being displayed on the client
Playback Frames Per Second | Client displayed frame rate
Session Resolution         | Client displayed resolution
Session Duration           | Duration the client spent watching content
Buffering Events           | Client buffering events
Zero-Buffer-Rate           | Percentage of total sessions that had zero buffering events
Client Error Events        | Client HTTP or application errors
Zero-Error-Rate            | Percentage of total sessions that had zero error events
Zero-Impact-Rate           | Percentage of total sessions that had zero buffering or error events

Suggested metrics for measuring quality of service
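The three rate metrics in the table above can be derived from per-session event counts. The function below is a hypothetical aggregation sketch; the field names `buffering_events` and `error_events` are illustrative assumptions:

```python
def quality_rates(sessions: list[dict]) -> dict:
    """Compute Zero-Buffer-Rate, Zero-Error-Rate, and Zero-Impact-Rate as
    percentages of total sessions. Each session dict carries counts of
    'buffering_events' and 'error_events'."""
    total = len(sessions)
    if total == 0:
        return {"zero_buffer_rate": 0.0,
                "zero_error_rate": 0.0,
                "zero_impact_rate": 0.0}
    zero_buffer = sum(1 for s in sessions if s["buffering_events"] == 0)
    zero_error = sum(1 for s in sessions if s["error_events"] == 0)
    # A session counts toward Zero-Impact-Rate only if it had neither
    # buffering events nor errors.
    zero_impact = sum(1 for s in sessions
                      if s["buffering_events"] == 0 and s["error_events"] == 0)
    return {
        "zero_buffer_rate": 100 * zero_buffer / total,
        "zero_error_rate": 100 * zero_error / total,
        "zero_impact_rate": 100 * zero_impact / total,
    }
```

Note that Zero-Impact-Rate is always less than or equal to both of the other rates, since it requires a session to be clean on both dimensions.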

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.