本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。
使用 Amazon Transcribe
以下示例显示如何使用 Amazon Transcribe 进行双向流式处理。双向流式处理意味着数据流同时转向服务且实时接收回来。此示例使用 Amazon Transcribe 流式转录发送音频流并实时收回转录的文本流。
要了解此功能的更多信息,请参阅《Amazon Transcribe Developer Guide》中的 Streaming Transcription。
要开始使用 Amazon Transcribe,请参阅《Amazon Transcribe Developer Guide》中的 Getting Started。
设置麦克风
此代码使用 javax.sound.sampled 软件包流式处理来自输入设备的音频。
代码
import javax.sound.sampled.AudioFormat; import javax.sound.sampled.AudioSystem; import javax.sound.sampled.DataLine; import javax.sound.sampled.TargetDataLine; public class Microphone { public static TargetDataLine get() throws Exception { AudioFormat format = new AudioFormat(16000, 16, 1, true, false); DataLine.Info datalineInfo = new DataLine.Info(TargetDataLine.class, format); TargetDataLine dataLine = (TargetDataLine) AudioSystem.getLine(datalineInfo); dataLine.open(format); return dataLine; } }
请参阅 GitHub 上的完整示例
创建发布者
该代码实现一个发布者,用于发布来自 Amazon Transcribe 音频流的音频数据。
代码
package com.amazonaws.transcribe; import java.io.IOException; import java.io.InputStream; import java.io.UncheckedIOException; import java.nio.ByteBuffer; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicLong; import org.reactivestreams.Publisher; import org.reactivestreams.Subscriber; import org.reactivestreams.Subscription; import software.amazon.awssdk.core.SdkBytes; import software.amazon.awssdk.services.transcribestreaming.model.AudioEvent; import software.amazon.awssdk.services.transcribestreaming.model.AudioStream; import software.amazon.awssdk.services.transcribestreaming.model.TranscribeStreamingException; public class AudioStreamPublisher implements Publisher<AudioStream> { private final InputStream inputStream; public AudioStreamPublisher(InputStream inputStream) { this.inputStream = inputStream; } @Override public void subscribe(Subscriber<? super AudioStream> s) { s.onSubscribe(new SubscriptionImpl(s, inputStream)); } private class SubscriptionImpl implements Subscription { private static final int CHUNK_SIZE_IN_BYTES = 1024 * 1; private ExecutorService executor = Executors.newFixedThreadPool(1); private AtomicLong demand = new AtomicLong(0); private final Subscriber<? super AudioStream> subscriber; private final InputStream inputStream; private SubscriptionImpl(Subscriber<? super AudioStream> s, InputStream inputStream) { this.subscriber = s; this.inputStream = inputStream; } @Override public void request(long n) { if (n <= 0) { subscriber.onError(new IllegalArgumentException("Demand must be positive")); } demand.getAndAdd(n); executor.submit(() -> { try { do { ByteBuffer audioBuffer = getNextEvent(); if (audioBuffer.remaining() > 0) { AudioEvent audioEvent = audioEventFromBuffer(audioBuffer); subscriber.onNext(audioEvent); } else { subscriber.onComplete(); break; } } while (demand.decrementAndGet() > 0); } catch (TranscribeStreamingException e) { subscriber.onError(e); } }); } @Override public void cancel() { } private ByteBuffer getNextEvent() { ByteBuffer audioBuffer; byte[] audioBytes = new byte[CHUNK_SIZE_IN_BYTES]; int len = 0; try { len = inputStream.read(audioBytes); if (len <= 0) { audioBuffer = ByteBuffer.allocate(0); } else { audioBuffer = ByteBuffer.wrap(audioBytes, 0, len); } } catch (IOException e) { throw new UncheckedIOException(e); } return audioBuffer; } private AudioEvent audioEventFromBuffer(ByteBuffer bb) { return AudioEvent.builder() .audioChunk(SdkBytes.fromByteBuffer(bb)) .build(); } } }
请参阅 GitHub 上的完整示例
创建客户端和启动流
在 main 方法中,创建请求对象,启动音频输入流,并通过音频输入实例化发布者。
您还必须创建 StartStreamTranscriptionResponseHandler
然后,使用 TranscribeStreamingAsyncClient 的 startStreamTranscription
方法启动双向流式处理。
导入
import javax.sound.sampled.AudioFormat; import javax.sound.sampled.AudioSystem; import javax.sound.sampled.DataLine; import javax.sound.sampled.TargetDataLine; import javax.sound.sampled.AudioInputStream; import software.amazon.awssdk.regions.Region; import software.amazon.awssdk.services.transcribestreaming.TranscribeStreamingAsyncClient; import software.amazon.awssdk.services.transcribestreaming.model.TranscribeStreamingException ; import software.amazon.awssdk.services.transcribestreaming.model.StartStreamTranscriptionRequest; import software.amazon.awssdk.services.transcribestreaming.model.MediaEncoding; import software.amazon.awssdk.services.transcribestreaming.model.LanguageCode; import software.amazon.awssdk.services.transcribestreaming.model.StartStreamTranscriptionResponseHandler; import software.amazon.awssdk.services.transcribestreaming.model.TranscriptEvent;
代码
public static void convertAudio(TranscribeStreamingAsyncClient client) throws Exception { try { StartStreamTranscriptionRequest request = StartStreamTranscriptionRequest.builder() .mediaEncoding(MediaEncoding.PCM) .languageCode(LanguageCode.EN_US) .mediaSampleRateHertz(16_000).build(); TargetDataLine mic = Microphone.get(); mic.start(); AudioStreamPublisher publisher = new AudioStreamPublisher(new AudioInputStream(mic)); StartStreamTranscriptionResponseHandler response = StartStreamTranscriptionResponseHandler.builder().subscriber(e -> { TranscriptEvent event = (TranscriptEvent) e; event.transcript().results().forEach(r -> r.alternatives().forEach(a -> System.out.println(a.transcript()))); }).build(); // Keeps Streaming until you end the Java program client.startStreamTranscription(request, publisher, response); } catch (TranscribeStreamingException e) { System.err.println(e.awsErrorDetails().errorMessage()); System.exit(1); } }
请参阅 GitHub 上的完整示例
更多信息
-
《Amazon Transcribe Developer Guide》中的 How It Works。
-
《Amazon Transcribe Developer Guide》中的 Getting Started With Streaming Audio。