Create and refine an Amazon Transcribe custom vocabulary using an AWS SDK
The following code example shows how to:
- Upload an audio file to Amazon S3.
- Run an Amazon Transcribe job to transcribe the file and get the results.
- Create and refine a custom vocabulary to improve transcription accuracy.
- Run jobs with custom vocabularies and get the results.
- Python
- SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run it in the AWS Code Examples Repository.
Transcribe an audio file that contains a reading of Jabberwocky by Lewis Carroll. Start by creating functions that wrap Amazon Transcribe actions.
import logging

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)


def start_job(
    job_name,
    media_uri,
    media_format,
    language_code,
    transcribe_client,
    vocabulary_name=None,
):
    """
    Starts a transcription job. This function returns as soon as the job is started.
    To get the current status of the job, call get_transcription_job. The job is
    successfully completed when the job status is 'COMPLETED'.

    :param job_name: The name of the transcription job. This must be unique for
                     your AWS account.
    :param media_uri: The URI where the audio file is stored. This is typically
                      in an Amazon S3 bucket.
    :param media_format: The format of the audio file. For example, mp3 or wav.
    :param language_code: The language code of the audio file.
                          For example, en-US or ja-JP
    :param transcribe_client: The Boto3 Transcribe client.
    :param vocabulary_name: The name of a custom vocabulary to use when transcribing
                            the audio file.
    :return: Data about the job.
    """
    try:
        job_args = {
            "TranscriptionJobName": job_name,
            "Media": {"MediaFileUri": media_uri},
            "MediaFormat": media_format,
            "LanguageCode": language_code,
        }
        # Only attach settings when a custom vocabulary is requested.
        if vocabulary_name is not None:
            job_args["Settings"] = {"VocabularyName": vocabulary_name}
        response = transcribe_client.start_transcription_job(**job_args)
        job = response["TranscriptionJob"]
        logger.info("Started transcription job %s.", job_name)
    except ClientError:
        logger.exception("Couldn't start transcription job %s.", job_name)
        raise
    else:
        return job


def get_job(job_name, transcribe_client):
    """
    Gets details about a transcription job.

    :param job_name: The name of the job to retrieve.
    :param transcribe_client: The Boto3 Transcribe client.
    :return: The retrieved transcription job.
    """
    try:
        response = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        job = response["TranscriptionJob"]
        logger.info("Got job %s.", job["TranscriptionJobName"])
    except ClientError:
        logger.exception("Couldn't get job %s.", job_name)
        raise
    else:
        return job


def delete_job(job_name, transcribe_client):
    """
    Deletes a transcription job. This also deletes the transcript associated with
    the job.

    :param job_name: The name of the job to delete.
    :param transcribe_client: The Boto3 Transcribe client.
    """
    try:
        transcribe_client.delete_transcription_job(TranscriptionJobName=job_name)
        logger.info("Deleted job %s.", job_name)
    except ClientError:
        logger.exception("Couldn't delete job %s.", job_name)
        raise


def create_vocabulary(
    vocabulary_name, language_code, transcribe_client, phrases=None, table_uri=None
):
    """
    Creates a custom vocabulary that can be used to improve the accuracy of
    transcription jobs. This function returns as soon as the vocabulary processing
    is started. Call get_vocabulary to get the current status of the vocabulary.
    The vocabulary is ready to use when its status is 'READY'.

    :param vocabulary_name: The name of the custom vocabulary.
    :param language_code: The language code of the vocabulary.
                          For example, en-US or nl-NL.
    :param transcribe_client: The Boto3 Transcribe client.
    :param phrases: A list of comma-separated phrases to include in the vocabulary.
    :param table_uri: A table of phrases and pronunciation hints to include in the
                      vocabulary.
    :return: Information about the newly created vocabulary.
    """
    try:
        vocab_args = {"VocabularyName": vocabulary_name, "LanguageCode": language_code}
        # A phrase list takes precedence over a table URI when both are given.
        if phrases is not None:
            vocab_args["Phrases"] = phrases
        elif table_uri is not None:
            vocab_args["VocabularyFileUri"] = table_uri
        response = transcribe_client.create_vocabulary(**vocab_args)
        logger.info("Created custom vocabulary %s.", response["VocabularyName"])
    except ClientError:
        logger.exception("Couldn't create custom vocabulary %s.", vocabulary_name)
        raise
    else:
        return response


def get_vocabulary(vocabulary_name, transcribe_client):
    """
    Gets information about a custom vocabulary.

    :param vocabulary_name: The name of the vocabulary to retrieve.
    :param transcribe_client: The Boto3 Transcribe client.
    :return: Information about the vocabulary.
    """
    try:
        response = transcribe_client.get_vocabulary(VocabularyName=vocabulary_name)
        logger.info("Got vocabulary %s.", response["VocabularyName"])
    except ClientError:
        logger.exception("Couldn't get vocabulary %s.", vocabulary_name)
        raise
    else:
        return response


def update_vocabulary(
    vocabulary_name, language_code, transcribe_client, phrases=None, table_uri=None
):
    """
    Updates an existing custom vocabulary. The entire vocabulary is replaced with
    the contents of the update.

    :param vocabulary_name: The name of the vocabulary to update.
    :param language_code: The language code of the vocabulary.
    :param transcribe_client: The Boto3 Transcribe client.
    :param phrases: A list of comma-separated phrases to include in the vocabulary.
    :param table_uri: A table of phrases and pronunciation hints to include in the
                      vocabulary.
    """
    try:
        vocab_args = {"VocabularyName": vocabulary_name, "LanguageCode": language_code}
        if phrases is not None:
            vocab_args["Phrases"] = phrases
        elif table_uri is not None:
            vocab_args["VocabularyFileUri"] = table_uri
        response = transcribe_client.update_vocabulary(**vocab_args)
        logger.info("Updated custom vocabulary %s.", response["VocabularyName"])
    except ClientError:
        logger.exception("Couldn't update custom vocabulary %s.", vocabulary_name)
        raise


def list_vocabularies(vocabulary_filter, transcribe_client):
    """
    Lists the custom vocabularies created for this AWS account.

    :param vocabulary_filter: The returned vocabularies must contain this string in
                              their names.
    :param transcribe_client: The Boto3 Transcribe client.
    :return: The list of retrieved vocabularies.
    """
    try:
        response = transcribe_client.list_vocabularies(NameContains=vocabulary_filter)
        vocabs = response["Vocabularies"]
        next_token = response.get("NextToken")
        # Follow pagination tokens until every page has been retrieved.
        while next_token is not None:
            response = transcribe_client.list_vocabularies(
                NameContains=vocabulary_filter, NextToken=next_token
            )
            vocabs += response["Vocabularies"]
            next_token = response.get("NextToken")
        logger.info(
            "Got %s vocabularies with filter %s.", len(vocabs), vocabulary_filter
        )
    except ClientError:
        logger.exception(
            "Couldn't list vocabularies with filter %s.", vocabulary_filter
        )
        raise
    else:
        return vocabs


def delete_vocabulary(vocabulary_name, transcribe_client):
    """
    Deletes a custom vocabulary.

    :param vocabulary_name: The name of the vocabulary to delete.
    :param transcribe_client: The Boto3 Transcribe client.
    """
    try:
        transcribe_client.delete_vocabulary(VocabularyName=vocabulary_name)
        logger.info("Deleted vocabulary %s.", vocabulary_name)
    except ClientError:
        logger.exception("Couldn't delete vocabulary %s.", vocabulary_name)
        raise
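The docstrings above note that a job is done when its status is 'COMPLETED' and a vocabulary is usable when its status is 'READY', but the wrappers return immediately. The following is a minimal polling sketch of that wait, not the waiter classes used by the full example. The name `wait_for_status` and its parameters are hypothetical; the status getter is passed in as a callable so the helper does not depend on Boto3 itself.

```python
import time


def wait_for_status(
    get_status, target, failure="FAILED", delay=10, max_tries=60, sleep=time.sleep
):
    """
    Hypothetical helper: calls get_status() until it returns `target`.

    :param get_status: A zero-argument callable that returns the current status.
    :param target: The status that means the resource is ready.
    :param failure: The status that means processing failed.
    :param delay: Seconds to pause between checks.
    :param max_tries: The maximum number of checks before giving up.
    :param sleep: The function used to pause (replaceable for testing).
    :return: The number of checks that were made.
    """
    for attempt in range(1, max_tries + 1):
        status = get_status()
        if status == target:
            return attempt
        if status == failure:
            raise RuntimeError(f"Resource reached status {failure!r}.")
        sleep(delay)
    raise TimeoutError(f"Status was still not {target!r} after {max_tries} tries.")


# Assumed usage with the wrappers above (requires AWS credentials):
# wait_for_status(
#     lambda: get_vocabulary(vocabulary_name, transcribe_client)["VocabularyState"],
#     "READY",
# )
# wait_for_status(
#     lambda: get_job(job_name, transcribe_client)["TranscriptionJobStatus"],
#     "COMPLETED",
# )
```

Passing the getter as a callable keeps one loop usable for both jobs and vocabularies. For production use, a waiter with backoff may be a better fit than a fixed delay.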
Call the wrapper functions to transcribe audio without a custom vocabulary, and then with different versions of a custom vocabulary, to see improved results.
import logging
import time

import boto3
import requests

# The wrapper functions above must be in scope. TranscribeCompleteWaiter,
# VocabularyReadyWaiter, and list_jobs are not shown in this excerpt; see the
# complete example on GitHub.


def usage_demo():
    """Shows how to use the Amazon Transcribe service."""
    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

    s3_resource = boto3.resource("s3")
    transcribe_client = boto3.client("transcribe")

    print("-" * 88)
    print("Welcome to the Amazon Transcribe demo!")
    print("-" * 88)

    # Create a uniquely named bucket and upload the audio file.
    bucket_name = f"jabber-bucket-{time.time_ns()}"
    print(f"Creating bucket {bucket_name}.")
    bucket = s3_resource.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={
            "LocationConstraint": transcribe_client.meta.region_name
        },
    )
    media_file_name = ".media/Jabberwocky.mp3"
    media_object_key = "Jabberwocky.mp3"
    print(f"Uploading media file {media_file_name}.")
    bucket.upload_file(media_file_name, media_object_key)
    media_uri = f"s3://{bucket.name}/{media_object_key}"

    # First pass: transcribe without a custom vocabulary.
    job_name_simple = f"Jabber-{time.time_ns()}"
    print(f"Starting transcription job {job_name_simple}.")
    start_job(
        job_name_simple,
        f"s3://{bucket_name}/{media_object_key}",
        "mp3",
        "en-US",
        transcribe_client,
    )
    transcribe_waiter = TranscribeCompleteWaiter(transcribe_client)
    transcribe_waiter.wait(job_name_simple)
    job_simple = get_job(job_name_simple, transcribe_client)
    transcript_simple = requests.get(
        job_simple["Transcript"]["TranscriptFileUri"]
    ).json()
    print(f"Transcript for job {transcript_simple['jobName']}:")
    print(transcript_simple["results"]["transcripts"][0]["transcript"])

    # Second pass: transcribe with a vocabulary built from a list of phrases.
    print("-" * 88)
    print(
        "Creating a custom vocabulary that lists the nonsense words to try to "
        "improve the transcription."
    )
    vocabulary_name = f"Jabber-vocabulary-{time.time_ns()}"
    create_vocabulary(
        vocabulary_name,
        "en-US",
        transcribe_client,
        phrases=[
            "brillig",
            "slithy",
            "borogoves",
            "mome",
            "raths",
            "Jub-Jub",
            "frumious",
            "manxome",
            "Tumtum",
            "uffish",
            "whiffling",
            "tulgey",
            "thou",
            "frabjous",
            "callooh",
            "callay",
            "chortled",
        ],
    )
    vocabulary_ready_waiter = VocabularyReadyWaiter(transcribe_client)
    vocabulary_ready_waiter.wait(vocabulary_name)

    job_name_vocabulary_list = f"Jabber-vocabulary-list-{time.time_ns()}"
    print(f"Starting transcription job {job_name_vocabulary_list}.")
    start_job(
        job_name_vocabulary_list,
        media_uri,
        "mp3",
        "en-US",
        transcribe_client,
        vocabulary_name,
    )
    transcribe_waiter.wait(job_name_vocabulary_list)
    job_vocabulary_list = get_job(job_name_vocabulary_list, transcribe_client)
    transcript_vocabulary_list = requests.get(
        job_vocabulary_list["Transcript"]["TranscriptFileUri"]
    ).json()
    print(f"Transcript for job {transcript_vocabulary_list['jobName']}:")
    print(transcript_vocabulary_list["results"]["transcripts"][0]["transcript"])

    # Third pass: replace the vocabulary with a table of pronunciation hints.
    print("-" * 88)
    print(
        "Updating the custom vocabulary with table data that provides additional "
        "pronunciation hints."
    )
    table_vocab_file = "jabber-vocabulary-table.txt"
    bucket.upload_file(table_vocab_file, table_vocab_file)
    update_vocabulary(
        vocabulary_name,
        "en-US",
        transcribe_client,
        table_uri=f"s3://{bucket.name}/{table_vocab_file}",
    )
    vocabulary_ready_waiter.wait(vocabulary_name)

    job_name_vocab_table = f"Jabber-vocab-table-{time.time_ns()}"
    print(f"Starting transcription job {job_name_vocab_table}.")
    start_job(
        job_name_vocab_table,
        media_uri,
        "mp3",
        "en-US",
        transcribe_client,
        vocabulary_name=vocabulary_name,
    )
    transcribe_waiter.wait(job_name_vocab_table)
    job_vocab_table = get_job(job_name_vocab_table, transcribe_client)
    transcript_vocab_table = requests.get(
        job_vocab_table["Transcript"]["TranscriptFileUri"]
    ).json()
    print(f"Transcript for job {transcript_vocab_table['jobName']}:")
    print(transcript_vocab_table["results"]["transcripts"][0]["transcript"])

    # Report on the jobs and vocabularies created by the demo.
    print("-" * 88)
    print("Getting data for jobs and vocabularies.")
    jabber_jobs = list_jobs("Jabber", transcribe_client)
    print(f"Found {len(jabber_jobs)} jobs:")
    for job_sum in jabber_jobs:
        job = get_job(job_sum["TranscriptionJobName"], transcribe_client)
        print(
            f"\t{job['TranscriptionJobName']}, {job['Media']['MediaFileUri']}, "
            f"{job['Settings'].get('VocabularyName')}"
        )

    jabber_vocabs = list_vocabularies("Jabber", transcribe_client)
    print(f"Found {len(jabber_vocabs)} vocabularies:")
    for vocab_sum in jabber_vocabs:
        vocab = get_vocabulary(vocab_sum["VocabularyName"], transcribe_client)
        vocab_content = requests.get(vocab["DownloadUri"]).text
        print(f"\t{vocab['VocabularyName']} contents:")
        print(vocab_content)

    # Clean up everything the demo created.
    print("-" * 88)
    print("Deleting demo jobs.")
    for job_name in [job_name_simple, job_name_vocabulary_list, job_name_vocab_table]:
        delete_job(job_name, transcribe_client)
    print("Deleting demo vocabulary.")
    delete_vocabulary(vocabulary_name, transcribe_client)
    print("Deleting demo bucket.")
    bucket.objects.delete()
    bucket.delete()
    print("Thanks for watching!")
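The demo prints three transcripts and leaves the comparison to the reader. One rough way to quantify the difference is to count how many vocabulary phrases appear in each transcript. The sketch below is an illustration only and is not part of the AWS example: `transcript_text` and `vocabulary_hits` are hypothetical helper names, and the matching is a simple whole-word check that ignores case.

```python
import re


def transcript_text(transcript_json):
    """Returns the transcript string from a downloaded transcript document."""
    # Same path the demo uses when printing each transcript.
    return transcript_json["results"]["transcripts"][0]["transcript"]


def vocabulary_hits(text, phrases):
    """
    Hypothetical helper: returns the phrases that occur in the text as whole
    words, ignoring case. Hyphenated phrases such as Jub-Jub match only when
    the transcript also contains the hyphen.
    """
    words = set(re.findall(r"[a-z'-]+", text.lower()))
    return sorted(phrase for phrase in phrases if phrase.lower() in words)


# Assumed usage with the transcripts downloaded in the demo:
# phrases = ["brillig", "slithy", "borogoves"]
# for doc in (transcript_simple, transcript_vocabulary_list, transcript_vocab_table):
#     hits = vocabulary_hits(transcript_text(doc), phrases)
#     print(f"{doc['jobName']}: {len(hits)} of {len(phrases)} phrases found")
```

A higher count for the vocabulary-backed jobs would suggest the custom vocabulary helped, although a word count alone does not measure overall transcription accuracy.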
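For API details, see the AWS SDK for Python (Boto3) API Reference topics for the operations called in this example:
- CreateVocabulary
- DeleteTranscriptionJob
- DeleteVocabulary
- GetTranscriptionJob
- GetVocabulary
- ListVocabularies
- StartTranscriptionJob
- UpdateVocabulary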
For a complete list of AWS SDK developer guides and code examples, see Using this service with an AWS SDK. This topic also includes information about getting started and details about previous SDK versions.