Amazon Polly examples using SDK for JavaScript (v3) - AWS SDK Code Examples

There are more AWS SDK examples available in the AWS Doc SDK Examples GitHub repo.

Amazon Polly examples using SDK for JavaScript (v3)

The following code examples show you how to perform actions and implement common scenarios by using the AWS SDK for JavaScript (v3) with Amazon Polly.

Scenarios are code examples that show you how to accomplish specific tasks by calling multiple functions within a service or combined with other AWS services.

Each example includes a link to the complete source code, where you can find instructions on how to set up and run the code in context.

Topics

Scenarios

The following code example shows how to create an application that analyzes customer comment cards, translates them from their original language, determines their sentiment, and generates an audio file from the translated text.

SDK for JavaScript (v3)

This example application analyzes and stores customer feedback cards. Specifically, it fulfills the need of a fictitious hotel in New York City. The hotel receives feedback from guests in various languages in the form of physical comment cards. That feedback is uploaded into the app through a web client. After an image of a comment card is uploaded, the following steps occur:

  • Text is extracted from the image using Amazon Textract.

  • Amazon Comprehend determines the sentiment of the extracted text and its language.

  • The extracted text is translated to English using Amazon Translate.

  • Amazon Polly synthesizes an audio file from the extracted text.

The full app can be deployed with the AWS CDK. For source code and deployment instructions, see the project in GitHub. The following excerpts show how the AWS SDK for JavaScript is used inside of Lambda functions.

import { ComprehendClient, DetectDominantLanguageCommand, DetectSentimentCommand, } from "@aws-sdk/client-comprehend"; /** * Determine the language and sentiment of the extracted text. * * @param {{ source_text: string}} extractTextOutput */ export const handler = async (extractTextOutput) => { const comprehendClient = new ComprehendClient({}); const detectDominantLanguageCommand = new DetectDominantLanguageCommand({ Text: extractTextOutput.source_text, }); // The source language is required for sentiment analysis and // translation in the next step. const { Languages } = await comprehendClient.send( detectDominantLanguageCommand, ); const languageCode = Languages[0].LanguageCode; const detectSentimentCommand = new DetectSentimentCommand({ Text: extractTextOutput.source_text, LanguageCode: languageCode, }); const { Sentiment } = await comprehendClient.send(detectSentimentCommand); return { sentiment: Sentiment, language_code: languageCode, }; };
import { DetectDocumentTextCommand, TextractClient, } from "@aws-sdk/client-textract"; /** * Fetch the S3 object from the event and analyze it using Amazon Textract. * * @param {import("@types/aws-lambda").EventBridgeEvent<"Object Created">} eventBridgeS3Event */ export const handler = async (eventBridgeS3Event) => { const textractClient = new TextractClient(); const detectDocumentTextCommand = new DetectDocumentTextCommand({ Document: { S3Object: { Bucket: eventBridgeS3Event.bucket, Name: eventBridgeS3Event.object, }, }, }); // Textract returns a list of blocks. A block can be a line, a page, word, etc. // Each block also contains geometry of the detected text. // For more information on the Block type, see https://docs.aws.amazon.com/textract/latest/dg/API_Block.html. const { Blocks } = await textractClient.send(detectDocumentTextCommand); // For the purpose of this example, we are only interested in words. const extractedWords = Blocks.filter((b) => b.BlockType === "WORD").map( (b) => b.Text, ); return extractedWords.join(" "); };
import { PollyClient, SynthesizeSpeechCommand } from "@aws-sdk/client-polly"; import { S3Client } from "@aws-sdk/client-s3"; import { Upload } from "@aws-sdk/lib-storage"; /** * Synthesize an audio file from text. * * @param {{ bucket: string, translated_text: string, object: string}} sourceDestinationConfig */ export const handler = async (sourceDestinationConfig) => { const pollyClient = new PollyClient({}); const synthesizeSpeechCommand = new SynthesizeSpeechCommand({ Engine: "neural", Text: sourceDestinationConfig.translated_text, VoiceId: "Ruth", OutputFormat: "mp3", }); const { AudioStream } = await pollyClient.send(synthesizeSpeechCommand); const audioKey = `${sourceDestinationConfig.object}.mp3`; // Store the audio file in S3. const s3Client = new S3Client(); const upload = new Upload({ client: s3Client, params: { Bucket: sourceDestinationConfig.bucket, Key: audioKey, Body: AudioStream, ContentType: "audio/mp3", }, }); await upload.done(); return audioKey; };
import { TranslateClient, TranslateTextCommand, } from "@aws-sdk/client-translate"; /** * Translate the extracted text to English. * * @param {{ extracted_text: string, source_language_code: string}} textAndSourceLanguage */ export const handler = async (textAndSourceLanguage) => { const translateClient = new TranslateClient({}); const translateCommand = new TranslateTextCommand({ SourceLanguageCode: textAndSourceLanguage.source_language_code, TargetLanguageCode: "en", Text: textAndSourceLanguage.extracted_text, }); const { TranslatedText } = await translateClient.send(translateCommand); return { translated_text: TranslatedText }; };
Services used in this example
  • Amazon Comprehend

  • Lambda

  • Amazon Polly

  • Amazon Textract

  • Amazon Translate