

# Long audio files
<a name="asynchronous"></a>

To create TTS files for large passages of text, use Amazon Polly's *asynchronous synthesis* functionality. This uses the three `SpeechSynthesisTask` APIs: 
+ `StartSpeechSynthesisTask`: starts a new synthesis task.
+ `GetSpeechSynthesisTask`: returns details about a previously submitted synthesis task.
+ `ListSpeechSynthesisTasks`: lists all submitted synthesis tasks.

The `SynthesizeSpeech` operation produces audio in near-real time, with relatively little latency in most cases. To achieve this, the operation is limited to synthesizing 3,000 characters per request.

Amazon Polly's asynchronous synthesis feature overcomes the challenge of processing a larger text document by changing the way the document is both synthesized and returned. When you submit input text using the `StartSpeechSynthesisTask` operation, Amazon Polly queues the request and then processes it asynchronously in the background as soon as system resources are available. Amazon Polly then uploads the resulting speech or speech marks stream directly to your (required) Amazon Simple Storage Service (Amazon S3) bucket, and notifies you about the completed file's availability through your (optional) Amazon SNS topic.

In this way, all of the functionality except near-real-time processing is available for texts of up to 100,000 billable characters (or 200,000 total characters) in length.
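
The size limits above suggest a simple pre-check before choosing an operation. The following is a minimal sketch, not an official tool: the helper name `choose_polly_operation` is hypothetical, and it assumes plain-text input, so every character is treated as billable. SSML input would need its tags excluded from the count.

```shell
# Hypothetical helper: pick a Polly operation from the size of a text file.
# Assumption: plain text, so total characters approximate billable characters.
choose_polly_operation() {
  # wc -m counts characters (not bytes) in the current locale;
  # tr removes the padding that some wc implementations add.
  chars=$(wc -m < "$1" | tr -d ' ')
  if [ "$chars" -le 3000 ]; then
    echo "synthesize-speech"            # fits the real-time limit
  elif [ "$chars" -le 100000 ]; then
    echo "start-speech-synthesis-task"  # needs asynchronous synthesis
  else
    echo "split-input"                  # exceeds the asynchronous limit too
  fi
}
```

For example, `choose_polly_operation text_file.txt` prints the name of the AWS CLI command that the file's length calls for.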

To synthesize a document using this method, you must have a writable Amazon S3 bucket to which the audio file can be saved. To be notified when the synthesized audio is ready, provide an optional SNS topic identifier. When the synthesis task is complete, Amazon Polly publishes a message on that topic. This message may also contain useful error information if the synthesis task didn't succeed. For notifications to work, make sure that the user creating the synthesis task can also publish to the SNS topic. See the [Amazon SNS documentation](https://docs.aws.amazon.com/sns/latest/dg/welcome.html) for more information on how to create and subscribe to an SNS topic.
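
As an illustration of the notification setup, a topic can be created and subscribed to with the `aws sns create-topic` and `aws sns subscribe` commands. This is a sketch under assumptions: the helper name, topic name, and email address are hypothetical, and an email subscription only becomes active after the recipient confirms it.

```shell
# Hypothetical helper: create an SNS topic and subscribe an email address.
# Prints the topic ARN, which can then be supplied to the synthesis task.
# Usage: create_polly_status_topic TOPIC_NAME EMAIL_ADDRESS
create_polly_status_topic() {
  # create-topic returns the ARN of the new (or already existing) topic.
  topic_arn=$(aws sns create-topic \
    --name "$1" \
    --query 'TopicArn' \
    --output text) || return 1

  # Subscribe the address; the recipient must confirm by email.
  aws sns subscribe \
    --topic-arn "$topic_arn" \
    --protocol email \
    --notification-endpoint "$2" > /dev/null || return 1

  echo "$topic_arn"
}
```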

**Encryption**

You can store the output file in an encrypted form in your S3 bucket if desired. To do this, enable [Amazon S3 bucket encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/bucket-encryption.html), which uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256).
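
As a sketch of this step, default encryption can be set on a bucket with the `aws s3api put-bucket-encryption` command. The helper name and bucket name here are hypothetical; `AES256` selects encryption with Amazon S3 managed keys.

```shell
# Hypothetical helper: turn on default AES-256 encryption for a bucket.
# Usage: enable_bucket_encryption BUCKET_NAME
enable_bucket_encryption() {
  aws s3api put-bucket-encryption \
    --bucket "$1" \
    --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
}
```

Calling `enable_bucket_encryption your-bucket-name` before starting a task means the uploaded audio file is encrypted at rest.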

**Topics**
+ [Setting up the IAM policy for asynchronous synthesis](asynchronous-iam.md)
+ [Creating long audio files](longer-console.md)

# Setting up the IAM policy for asynchronous synthesis
<a name="asynchronous-iam"></a>

To use the asynchronous synthesis functionality, you need an IAM policy that allows the following: 
+ use of the asynchronous Amazon Polly operations (`StartSpeechSynthesisTask`, `GetSpeechSynthesisTask`, and `ListSpeechSynthesisTasks`)
+ writing to the output S3 bucket
+ publishing to the status SNS topic [optional] 

A policy that grants only the permissions required for asynchronous synthesis can be attached to the IAM user.
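
The permissions listed above might be expressed as in the following sketch, which writes a policy document to a local file. The bucket name, topic ARN, account ID, file name, and policy name are all placeholders chosen for illustration, so adjust them before use. If you don't use SNS notifications, the last statement can be dropped.

```shell
# Placeholder values (assumptions); replace with your own resources.
BUCKET_NAME="your-bucket-name"
SNS_TOPIC_ARN="arn:aws:sns:us-east-2:123456789012:your-topic-name"

# Write the policy document to a local file.
cat > polly-async-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "polly:StartSpeechSynthesisTask",
        "polly:GetSpeechSynthesisTask",
        "polly:ListSpeechSynthesisTasks"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::${BUCKET_NAME}/*"
    },
    {
      "Effect": "Allow",
      "Action": "sns:Publish",
      "Resource": "${SNS_TOPIC_ARN}"
    }
  ]
}
EOF

# To attach it as an inline policy (hypothetical user and policy names):
# aws iam put-user-policy --user-name your-user-name \
#   --policy-name PollyAsyncSynthesis \
#   --policy-document file://polly-async-policy.json
```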

# Creating long audio files
<a name="longer-console"></a>

You can use the Amazon Polly console to create long audio files using asynchronous synthesis, with the same functionality that is available through the AWS CLI. As with any other synthesis, you do this from the **Text-to-Speech** tab.

------
#### [ Console ]

The remaining asynchronous synthesis functionality is also available in the console. The **S3 synthesis tasks** tab reflects the `ListSpeechSynthesisTasks` functionality, displaying all tasks saved to the S3 bucket and letting you filter them. Choosing a single task shows its details, reflecting the `GetSpeechSynthesisTask` functionality.

**To synthesize a large text using the Amazon Polly console**

1. Sign in to the AWS Management Console and open the Amazon Polly console at [https://console.aws.amazon.com/polly/](https://console.aws.amazon.com/polly/).

1. Choose the **Text-to-Speech** tab. Select **Long Form** as the engine if appropriate.

1. With **SSML** on or off, type or paste your text into the input box.

1. Choose the language, region, and voice for your text. 

1. Choose **Save to S3**. 
**Note**  
Both the **Download** and **Listen** options are greyed out if the text length is above the 3,000 character limit for the real-time `SynthesizeSpeech` operation.

1. The console opens a form so that you can choose where to store the output file.

   1. Fill in the name of the destination Amazon S3 bucket.

   1. Optionally, fill in the prefix key of the output.
**Note**  
The output S3 bucket must be writable.

   1. If you want to be notified when the synthesis task is complete, provide an optional SNS topic identifier.
**Note**  
To use this option, the SNS topic must be open for publication by the current console user. For more information, see [Amazon Simple Notification Service (SNS)](https://aws.amazon.com/sns/).

   1. Choose **Save to S3**. 

**To retrieve information on your speech synthesis tasks**

1. In the console, choose the **S3 Synthesis Tasks** tab.

1. The tasks are displayed in date order. To filter the tasks by status, choose **All statuses** and then choose the status to use.

1. To view the details of a specific task, choose the linked **Task ID**.

------
#### [ AWS CLI ]

Amazon Polly *asynchronous synthesis* functionality uses three `SpeechSynthesisTask` APIs to work with large amounts of text: 
+ `StartSpeechSynthesisTask`: starts a new synthesis task.
+ `GetSpeechSynthesisTask`: returns details about a previously submitted synthesis task.
+ `ListSpeechSynthesisTasks`: lists all submitted synthesis tasks.

**Synthesizing large amounts of text (`StartSpeechSynthesisTask`)**

When you want to create an audio file larger than one that you can create with the real-time `SynthesizeSpeech`, use the `StartSpeechSynthesisTask` operation. In addition to the arguments needed for the `SynthesizeSpeech` operation, `StartSpeechSynthesisTask` also requires the name of an Amazon S3 bucket. Two other optional arguments are also available: a key prefix for the output file and the ARN for an SNS Topic if you want to receive status notification about the task.
+ `OutputS3BucketName`: The name of the Amazon S3 bucket where the synthesis should be uploaded. This bucket should be in the same region as the Amazon Polly service. Additionally, the IAM user being used to make the call should have access to the bucket. [Required] 
+ `OutputS3KeyPrefix`: Key prefix for the output file. Use this parameter if you want to save the output speech file in a custom directory-like key in your bucket. [Optional] 
+ `SnsTopicArn`: The SNS topic ARN to use if you want to receive notifications about status of the task. This SNS topic should be in the same region as the Amazon Polly service. Additionally, the IAM user being used to make the call should have access to the topic. [Optional] 

For example, the following command runs the `start-speech-synthesis-task` AWS CLI command in the US East (Ohio) Region: 

The following AWS CLI example is formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\\) Unix continuation character at the end of each line with a caret (^) and use full quotation marks (") around the input text with single quotes (') for interior tags.

```
aws polly start-speech-synthesis-task \
  --region us-east-2 \
  --endpoint-url "https://polly.us-east-2.amazonaws.com/" \
  --output-format mp3 \
  --output-s3-bucket-name your-bucket-name \
  --output-s3-key-prefix optional/prefix/path/file \
  --voice-id Joanna \
  --text file://text_file.txt
```

This will result in a response that looks similar to this: 

```
{
    "SynthesisTask": {
        "OutputFormat": "mp3",
        "OutputUri": "https://s3.us-east-2.amazonaws.com/your-bucket-name/optional/prefix/path/file.<task_id>.mp3",
        "TextType": "text",
        "CreationTime": [..],
        "RequestCharacters": [..],
        "TaskStatus": "scheduled",
        "TaskId": [task_id],
        "VoiceId": "Joanna"
    }
}
```

The `start-speech-synthesis-task` operation returns several new fields: 
+ `OutputUri`: the location of your output speech file. 
+ `TaskId`: a unique identifier for the speech synthesis task generated by Amazon Polly. 
+ `CreationTime`: a timestamp for when the task was initially submitted.
+ `RequestCharacters`: the number of billable characters in the task.
+ `TaskStatus`: provides information on the status of the submitted task. 

  When your task is submitted, the initial status will show `scheduled`. When Amazon Polly starts processing the task, the status will change to `inProgress` and later, to `completed` or `failed`. If the task fails, an error message will be returned when calling either the `GetSpeechSynthesisTask` or `ListSpeechSynthesisTasks` operation. 

When the task is completed, the speech file is available at the location specified in `OutputUri`.
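
Because the task runs in the background, a script typically has to wait for one of the final statuses before using the output. The following is a minimal polling sketch, assuming the AWS CLI is configured with a default Region; the helper name `wait_for_synthesis_task` is hypothetical, and a production script would likely add a timeout.

```shell
# Hypothetical helper: poll a synthesis task until it completes or fails.
# Usage: wait_for_synthesis_task TASK_ID [INTERVAL_SECONDS]
# Returns 0 on completed, 1 on failed, 2 if the AWS CLI call itself fails.
# Note: sets plain (global) shell variables to stay POSIX-compatible.
wait_for_synthesis_task() {
  task_id=$1
  interval=${2:-10}
  while :; do
    # --query extracts only the TaskStatus field from the response.
    status=$(aws polly get-speech-synthesis-task \
      --task-id "$task_id" \
      --query 'SynthesisTask.TaskStatus' \
      --output text) || return 2
    case "$status" in
      completed) echo "$status"; return 0 ;;
      failed)    echo "$status"; return 1 ;;
      *)         sleep "$interval" ;;  # scheduled or inProgress: keep waiting
    esac
  done
}
```

For example, `wait_for_synthesis_task "$task_id" 15` checks every 15 seconds and returns once the audio file is ready or the task has failed.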

**Retrieving information on your speech synthesis task**

You can get information on a task, such as its status and any errors, using the `GetSpeechSynthesisTask` operation. To do this, you need the `TaskId` returned by the `StartSpeechSynthesisTask` operation.

For example, the following command runs the `get-speech-synthesis-task` AWS CLI command: 

```
aws polly get-speech-synthesis-task \
  --region us-east-2 \
  --endpoint-url "https://polly.us-east-2.amazonaws.com/" \
  --task-id "task-identifier"
```

You can also list all speech synthesis tasks that you've run in the current region using the `ListSpeechSynthesisTasks` operation. 

For example, the following command runs the `list-speech-synthesis-tasks` AWS CLI command: 

```
aws polly list-speech-synthesis-tasks \
  --region us-east-2 \
  --endpoint-url "https://polly.us-east-2.amazonaws.com/"
```
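
The listing can also be narrowed by task status, much like the status filter in the console. The following sketch uses the command's `--status` option together with `--query` to print only each task's ID and output location; the helper name `list_tasks_by_status` is hypothetical, and the AWS CLI is assumed to be configured with a default Region.

```shell
# Hypothetical helper: print the ID and output URI of tasks with a given
# status (scheduled, inProgress, completed, or failed), one task per line.
# Usage: list_tasks_by_status STATUS
list_tasks_by_status() {
  aws polly list-speech-synthesis-tasks \
    --status "$1" \
    --query 'SynthesisTasks[].[TaskId,OutputUri]' \
    --output text
}
```

For example, `list_tasks_by_status completed` lists the tasks whose audio files are already available in the S3 bucket.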

------