Speech mark types


You request speech marks using the SpeechMarkTypes option of either the SynthesizeSpeech or StartSpeechSynthesisTask operation. The option lists the metadata elements that you want returned for your input text. You must specify at least one type per request, and you can specify as many as four. A speech mark request returns metadata only; no audio output is generated.

In the AWS CLI, for example:

--speech-mark-types='["sentence", "word", "viseme", "ssml"]'

Amazon Polly generates speech marks using the following elements:

  • sentence – Indicates a sentence element in the input text.

  • word – Indicates a word element in the input text.

  • viseme – Describes the face and mouth movements corresponding to each phoneme being spoken. For more information, see Visemes and Amazon Polly.

  • ssml – Describes a <mark> element from the SSML input text. For more information, see Generating speech from SSML documents.
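
The rules above (at least one type, at most the four listed) can be checked before a request is sent. The following Python sketch is illustrative only: the helper names build_speech_mark_request and cli_speech_mark_types are hypothetical, and the "json" output format and the TextType values are assumptions taken from the general Amazon Polly API rather than from this section.

```python
import json

# The four speech mark types listed above.
VALID_SPEECH_MARK_TYPES = ("sentence", "word", "viseme", "ssml")


def build_speech_mark_request(text, voice_id, mark_types, text_type="text"):
    """Return keyword arguments for a SynthesizeSpeech speech mark request.

    Hypothetical helper: it only validates and assembles parameters and
    does not call AWS.
    """
    # Drop duplicates while keeping the caller's order.
    requested = list(dict.fromkeys(mark_types))
    if not requested:
        raise ValueError("specify at least one speech mark type")
    unknown = [m for m in requested if m not in VALID_SPEECH_MARK_TYPES]
    if unknown:
        raise ValueError(f"unknown speech mark types: {unknown}")
    return {
        "Text": text,
        "TextType": text_type,      # assumption: "text" or "ssml"
        "VoiceId": voice_id,
        "OutputFormat": "json",     # assumption: speech marks are returned as JSON
        "SpeechMarkTypes": requested,
    }


def cli_speech_mark_types(mark_types):
    """Format the types as the AWS CLI option shown earlier on this page."""
    return "--speech-mark-types='" + json.dumps(list(mark_types)) + "'"
```

Assuming boto3 is installed and credentials are configured, the resulting dictionary could be passed on as boto3.client("polly").synthesize_speech(**params). The voice name is up to the caller; "Joanna" is one example.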
