
Using the PutLexicon operation

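With Amazon Polly, you can use PutLexicon to store pronunciation lexicons in a specific AWS Region for your account. You can then specify one or more of these stored lexicons in a SynthesizeSpeech request, and the service applies them before it begins synthesizing the text. For more information, see Managing lexicons.

This section provides example lexicons and step-by-step instructions for storing and testing them.

Note

These lexicons must conform to the Pronunciation Lexicon Specification (PLS) W3C recommendation. For more information, see Pronunciation Lexicon Specification (PLS) Version 1.0 on the W3C website.

Example 1: Lexicon with one lexeme

Consider the following W3C PLS-compliant lexicon.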


<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" 
      xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>

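Note the following:

The <lexicon> element specifies two attributes:
- The xml:lang attribute specifies the language code, en-US, to which the lexicon applies. Amazon Polly can use this example lexicon if the voice that you specify in the SynthesizeSpeech call has the same language code (en-US).
  
  Note
  You can use the DescribeVoices operation to find the language code associated with a voice.
- The alphabet attribute specifies IPA, which means that the International Phonetic Alphabet (IPA) is used for pronunciations. IPA is one of the alphabets for writing pronunciations. Amazon Polly also supports the Extended Speech Assessment Methods Phonetic Alphabet (X-SAMPA).
The <lexeme> element describes the mapping between the <grapheme> (that is, a textual representation of the word) and the <alias>.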
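
The attributes and the lexeme mapping described above can be inspected locally before uploading a lexicon. The following is a minimal sketch that uses only the Python standard library; the function name `describe_lexicon` and the trimmed `EXAMPLE_PLS` string (without the schema location attributes) are illustrative assumptions, not part of Amazon Polly or its SDKs.

```python
import xml.etree.ElementTree as ET

# Namespace of PLS elements, and the reserved XML namespace used by xml:lang.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Trimmed copy of the example lexicon (schema location attributes omitted).
EXAMPLE_PLS = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa"
      xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>
"""


def describe_lexicon(pls_text):
    """Return the language, alphabet, and lexemes found in a PLS document.

    Hypothetical helper for local inspection only; it does not validate
    the document against the PLS schema.
    """
    root = ET.fromstring(pls_text.encode("utf-8"))
    lexemes = []
    for lexeme in root.iter(f"{{{PLS_NS}}}lexeme"):
        lexemes.append((
            lexeme.findtext(f"{{{PLS_NS}}}grapheme"),
            lexeme.findtext(f"{{{PLS_NS}}}alias"),    # None if absent
            lexeme.findtext(f"{{{PLS_NS}}}phoneme"),  # None if absent
        ))
    return {
        "lang": root.get(f"{{{XML_NS}}}lang"),
        "alphabet": root.get("alphabet"),
        "lexemes": lexemes,
    }


info = describe_lexicon(EXAMPLE_PLS)
print(info["lang"], info["alphabet"], info["lexemes"])
```

Each lexeme is returned as a (grapheme, alias, phoneme) tuple, so a lexicon that uses <phoneme> instead of <alias> can be inspected with the same helper.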

To test this lexicon, do the following:

Save the lexicon as example.pls.
Run the put-lexicon AWS CLI command to store the lexicon (with the name w3c) in the us-east-2 Region.
```
aws polly put-lexicon \
--name w3c \
--content file://example.pls 
```
Run the synthesize-speech command to synthesize sample text to an audio stream (speech.mp3), and specify the optional lexicon-names parameter.
```
aws polly synthesize-speech \
--text 'W3C is a Consortium' \
--voice-id Joanna \
--output-format mp3 \
--lexicon-names="w3c" \
speech.mp3
```
Play the resulting speech.mp3, and notice that the word W3C in the text is replaced by World Wide Web Consortium.
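
The same store-and-test steps can be sketched with the AWS SDK for Python (Boto3). This is a hedged outline rather than an official sample: the helper name `store_and_synthesize` and its parameters are assumptions made for illustration, and the function expects to be handed a Boto3 Polly client. It relies on the client's `put_lexicon` and `synthesize_speech` calls and reads the returned `AudioStream`.

```python
def store_and_synthesize(polly, name, pls_path, text, out_path, voice_id="Joanna"):
    """Store a lexicon, then synthesize text with it and save the audio.

    `polly` is assumed to be a Boto3 Polly client. The helper itself is
    hypothetical and only mirrors the CLI steps shown above.
    """
    # Read the PLS document from disk (the CLI does this via file://).
    with open(pls_path, encoding="utf-8") as pls_file:
        content = pls_file.read()

    # Equivalent of: aws polly put-lexicon --name ... --content ...
    polly.put_lexicon(Name=name, Content=content)

    # Equivalent of: aws polly synthesize-speech ... --lexicon-names ...
    response = polly.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat="mp3",
        LexiconNames=[name],
    )

    # AudioStream is a file-like object; write its bytes to the output file.
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
    return out_path


# Usage (requires boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   polly = boto3.client("polly", region_name="us-east-2")
#   store_and_synthesize(polly, "w3c", "example.pls",
#                        "W3C is a Consortium", "speech.mp3")
```

Passing the client in as an argument keeps the Region choice explicit at the call site and makes the helper easy to exercise with a stub client.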

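The preceding example lexicon uses an alias. The IPA alphabet declared in the lexicon is not used. The following lexicon specifies a phonetic pronunciation using the <phoneme> element with the IPA alphabet.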


<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" 
      xml:lang="en-US">
  <lexeme>
    <grapheme>pecan</grapheme>
    <phoneme>pɪˈkɑːn</phoneme>
  </lexeme>
</lexicon>

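Follow the same steps to test this lexicon. Make sure that you specify input text that contains the word "pecan" (for example, "Pecan pie is delicious").

Example 2: Lexicon with multiple lexemes

In this example, the lexemes that you specify in a lexicon apply exclusively to the input text for the synthesis. Consider the following lexicon: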


<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">

  <lexeme> 
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <lexeme> 
    <grapheme>W3C</grapheme>
    <alias>WWW Consortium</alias>
  </lexeme>
  <lexeme> 
    <grapheme>Consortium</grapheme>
    <alias>Community</alias>
  </lexeme>
</lexicon>

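The lexicon specifies three lexemes, two of which define an alias for the grapheme W3C as follows:

The first <lexeme> element defines an alias (World Wide Web Consortium).
The second <lexeme> defines an alternative alias (WWW Consortium).

Amazon Polly uses the first replacement defined for any given grapheme in a lexicon.

The third <lexeme> defines a replacement (Community) for the word Consortium.

First, let's test this lexicon. Suppose that you want to synthesize the following sample text to an audio file (speech.mp3), and you specify the lexicon in a call to SynthesizeSpeech.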


The W3C is a Consortium

SynthesizeSpeech first applies the lexicon as follows:

As per the first lexeme, the word W3C is revised to World Wide Web Consortium. The revised text appears as follows:
```
The World Wide Web Consortium is a Consortium
```
The alias defined in the third lexeme applies only to the word Consortium that was part of the original text, resulting in the following text:
```
The World Wide Web Consortium is a Community.
```
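
The two rules described above (the first lexeme wins for a repeated grapheme, and aliases apply only to words in the original text) can be modeled in a few lines of Python. This is an illustrative model only, not how Amazon Polly is implemented: the names `build_aliases` and `apply_aliases` are hypothetical, and matching graphemes on word boundaries is a simplifying assumption.

```python
import re

# Lexemes from the example lexicon, in document order: (grapheme, alias).
LEXEMES = [
    ("W3C", "World Wide Web Consortium"),
    ("W3C", "WWW Consortium"),
    ("Consortium", "Community"),
]


def build_aliases(lexemes):
    """Keep only the first alias defined for each grapheme."""
    aliases = {}
    for grapheme, alias in lexemes:
        aliases.setdefault(grapheme, alias)  # later duplicates are ignored
    return aliases


def apply_aliases(text, aliases):
    """Replace graphemes in a single pass over the original text.

    re.sub does not rescan replaced text, so the word Consortium that is
    introduced by the W3C alias is left untouched.
    """
    if not aliases:
        return text
    alternatives = "|".join(re.escape(g) for g in aliases)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: aliases[match.group(1)], text)


print(apply_aliases("The W3C is a Consortium", build_aliases(LEXEMES)))
```

The single-pass substitution is the design choice that matters here: applying the lexemes one after another would also rewrite the Consortium inside the first alias, which contradicts the behavior described above.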

You can test this using the AWS CLI as follows:

Save the lexicon as example.pls.
Run the put-lexicon command to store the lexicon with the name w3c in the us-east-2 Region.
```
aws polly put-lexicon \
--name w3c \
--content file://example.pls
```
Run the list-lexicons command to verify that the w3c lexicon is in the list of lexicons returned.
```
aws polly list-lexicons
```
Run the synthesize-speech command to synthesize sample text to an audio file (speech.mp3), and specify the optional lexicon-names parameter.
```
aws polly synthesize-speech \
--text 'W3C is a Consortium' \
--voice-id Joanna \
--output-format mp3 \
--lexicon-names="w3c" \
speech.mp3
```
Play the resulting speech.mp3 file to verify that the synthesized speech reflects the changes to the text.

Example 3: Specifying multiple lexicons

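In a call to SynthesizeSpeech, you can specify multiple lexicons. In this case, lexicons are considered in the order specified (from left to right), and the first lexicon that defines a grapheme takes precedence over any lexicons specified after it.

Consider the following two lexicons. Note that each lexicon describes a different alias for the same grapheme, W3C.

Lexicon 1: w3c.pls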


<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>

Lexicon 2: w3cAlternate.pls


<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">

  <lexeme> 
    <grapheme>W3C</grapheme>
    <alias>WWW Consortium</alias>
  </lexeme>
</lexicon>

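Suppose that you store these lexicons as w3c and w3cAlternate, respectively. If you specify the lexicons in order (w3c followed by w3cAlternate) in a SynthesizeSpeech call, the alias for W3C defined in the first lexicon takes precedence over the one in the second. To test the lexicons, do the following:

Save the lexicons locally in files named w3c.pls and w3cAlternate.pls.
Upload these lexicons using the put-lexicon AWS CLI command.
- Upload the w3c.pls lexicon and store it as w3c.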
```
aws polly put-lexicon \
--name w3c \
--content file://w3c.pls 
```
- Upload the w3cAlternate.pls lexicon to the service and store it as w3cAlternate.
```
aws polly put-lexicon \
--name w3cAlternate \
--content file://w3cAlternate.pls 
```
Run the synthesize-speech command to synthesize sample text to an audio stream (speech.mp3), and specify both lexicons using the lexicon-names parameter.
```
aws polly synthesize-speech \
--text 'PLS is a W3C recommendation' \
--voice-id Joanna \
--output-format mp3 \
--lexicon-names '["w3c","w3cAlternate"]' \
speech.mp3
```
Test the resulting speech.mp3. The spoken text should be as follows:
```
PLS is a World Wide Web Consortium recommendation
```
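
The precedence between lexicons can be modeled the same way as precedence between lexemes. The sketch below is an illustration of the rule stated in this example, not Amazon Polly's implementation: `merge_lexicons` and `apply_aliases` are hypothetical helpers, each lexicon is reduced to a plain grapheme-to-alias dictionary, and word-boundary matching is a simplifying assumption.

```python
import re

# Each stored lexicon reduced to a grapheme -> alias mapping (assumption).
W3C = {"W3C": "World Wide Web Consortium"}
W3C_ALTERNATE = {"W3C": "WWW Consortium"}


def merge_lexicons(lexicons):
    """Merge lexicons left to right; the first definition of a grapheme wins."""
    merged = {}
    for lexicon in lexicons:
        for grapheme, alias in lexicon.items():
            merged.setdefault(grapheme, alias)
    return merged


def apply_aliases(text, aliases):
    """Replace each grapheme found in the original text, in a single pass."""
    if not aliases:
        return text
    alternatives = "|".join(re.escape(g) for g in aliases)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: aliases[match.group(1)], text)


text = "PLS is a W3C recommendation"
print(apply_aliases(text, merge_lexicons([W3C, W3C_ALTERNATE])))
print(apply_aliases(text, merge_lexicons([W3C_ALTERNATE, W3C])))
```

Swapping the order of the two lexicons in the list changes which alias is used, which mirrors the effect of reordering the names passed to lexicon-names.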

Additional code samples for the PutLexicon API

Java sample: PutLexicon
Python (Boto3) sample: PutLexicon
