Examples of using Amazon S3 Select on an object
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual.
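You can use S3 Select to select content from an object by using the Amazon S3 console, the REST API, or the AWS SDKs.
For more information about the SQL functions that S3 Select supports, see SQL functions.
You can use the AWS SDKs to select content from an object. However, if your application requires it, you can send REST requests directly. For more information about the request and response format, see SelectObjectContent.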
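For applications that send the REST request directly, the request body is an XML document. The following is a minimal sketch that only builds such a body for a CSV-to-CSV query. The class and method names are hypothetical, an empty `CSV` element is assumed to select the default CSV settings, and signing and sending the request are left out. Verify the element names and order against the SelectObjectContent reference before relying on it.

```java
// Hypothetical helper (not part of any AWS SDK): builds the XML body of a
// SelectObjectContent REST request for CSV input and CSV output.
final class SelectRequestBodySketch {

    /** Escapes the characters that are special inside XML element text. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /**
     * Returns the request body as a single-line XML string.
     * requestProgress controls whether periodic Progress messages are requested.
     */
    static String buildCsvRequestBody(String sqlExpression, boolean requestProgress) {
        StringBuilder xml = new StringBuilder();
        xml.append("<SelectObjectContentRequest xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">");
        xml.append("<Expression>").append(escapeXml(sqlExpression)).append("</Expression>");
        xml.append("<ExpressionType>SQL</ExpressionType>");
        xml.append("<RequestProgress><Enabled>")
           .append(requestProgress)
           .append("</Enabled></RequestProgress>");
        // Assumption: empty CSV elements mean "use the default CSV settings".
        xml.append("<InputSerialization><CompressionType>NONE</CompressionType><CSV/></InputSerialization>");
        xml.append("<OutputSerialization><CSV/></OutputSerialization>");
        xml.append("</SelectObjectContentRequest>");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildCsvRequestBody("select s._1 from S3Object s", true));
    }
}
```

The expression is XML-escaped because SQL comparisons such as `<` would otherwise produce a malformed document.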
With Amazon S3 Select, you can use the selectObjectContent method to select some of the contents of an object. If the method succeeds, it returns the results of the SQL expression.
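To make "the results of the SQL expression" concrete: the Java example on this page uses the expression `select s._1 from S3Object s`, where `s._1` refers to the first CSV column by position. The sketch below reproduces that projection locally on an in-memory CSV string, without calling Amazon S3. It only illustrates the shape of the expected result. The class name is hypothetical, and the parsing is a naive comma split that assumes no header row and no quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical local illustration: mimics what "select s._1 from S3Object s"
// returns for simple CSV data. It does not call Amazon S3.
final class FirstColumnSketch {

    /** Returns the first field of every non-empty line in the CSV text. */
    static List<String> firstColumn(String csv) {
        List<String> values = new ArrayList<>();
        for (String line : csv.split("\\r?\\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // Naive split: assumes fields contain no quoted commas.
            values.add(line.split(",", -1)[0]);
        }
        return values;
    }

    public static void main(String[] args) {
        String csv = "Jane,32,Seattle\nLi,45,Boston\n";
        for (String value : firstColumn(csv)) {
            System.out.println(value);
        }
    }
}
```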
- Java
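The following Java code returns the value of the first column of each record stored in an object that contains data in CSV format. It also handles the Stats message and the End event. The listing does not enable Progress messages, which Amazon S3 sends only when the request asks for them. You must provide a valid bucket name and an object that contains data in CSV format.
For instructions on creating and testing a working sample, see Getting Started in the AWS SDK for Java Developer Guide.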
package com.amazonaws;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.CSVInput;
import com.amazonaws.services.s3.model.CSVOutput;
import com.amazonaws.services.s3.model.CompressionType;
import com.amazonaws.services.s3.model.ExpressionType;
import com.amazonaws.services.s3.model.InputSerialization;
import com.amazonaws.services.s3.model.OutputSerialization;
import com.amazonaws.services.s3.model.SelectObjectContentEvent;
import com.amazonaws.services.s3.model.SelectObjectContentEventVisitor;
import com.amazonaws.services.s3.model.SelectObjectContentRequest;
import com.amazonaws.services.s3.model.SelectObjectContentResult;

import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

import static com.amazonaws.util.IOUtils.copy;

/**
 * This example shows how to query data from S3Select and consume the response in the form of an
 * InputStream of records and write it to a file.
 */
public class RecordInputStreamExample {

    private static final String BUCKET_NAME = "${my-s3-bucket}";
    private static final String CSV_OBJECT_KEY = "${my-csv-object-key}";
    private static final String S3_SELECT_RESULTS_PATH = "${my-s3-select-results-path}";
    private static final String QUERY = "select s._1 from S3Object s";

    public static void main(String[] args) throws Exception {
        final AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();

        SelectObjectContentRequest request = generateBaseCSVRequest(BUCKET_NAME, CSV_OBJECT_KEY, QUERY);
        final AtomicBoolean isResultComplete = new AtomicBoolean(false);

        try (OutputStream fileOutputStream = new FileOutputStream(new File(S3_SELECT_RESULTS_PATH));
             SelectObjectContentResult result = s3Client.selectObjectContent(request)) {
            InputStream resultInputStream = result.getPayload().getRecordsInputStream(
                    new SelectObjectContentEventVisitor() {
                        @Override
                        public void visit(SelectObjectContentEvent.StatsEvent event)
                        {
                            System.out.println(
                                    "Received Stats, Bytes Scanned: " + event.getDetails().getBytesScanned()
                                            + " Bytes Processed: " + event.getDetails().getBytesProcessed());
                        }

                        /*
                         * An End Event informs that the request has finished successfully.
                         */
                        @Override
                        public void visit(SelectObjectContentEvent.EndEvent event)
                        {
                            isResultComplete.set(true);
                            System.out.println("Received End Event. Result is complete.");
                        }
                    }
            );

            copy(resultInputStream, fileOutputStream);
        }

        /*
         * The End Event indicates all matching records have been transmitted.
         * If the End Event is not received, the results may be incomplete.
         */
        if (!isResultComplete.get()) {
            throw new Exception("S3 Select request was incomplete as End Event was not received.");
        }
    }

    private static SelectObjectContentRequest generateBaseCSVRequest(String bucket, String key, String query) {
        SelectObjectContentRequest request = new SelectObjectContentRequest();
        request.setBucketName(bucket);
        request.setKey(key);
        request.setExpression(query);
        request.setExpressionType(ExpressionType.SQL);

        InputSerialization inputSerialization = new InputSerialization();
        inputSerialization.setCsv(new CSVInput());
        inputSerialization.setCompressionType(CompressionType.NONE);
        request.setInputSerialization(inputSerialization);

        OutputSerialization outputSerialization = new OutputSerialization();
        outputSerialization.setCsv(new CSVOutput());
        request.setOutputSerialization(outputSerialization);

        return request;
    }
}
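The listing above treats a response as complete only if an End event arrives. That check can be illustrated in isolation with a small simulation. The sketch below uses hypothetical stand-in types (`EventType`, `Event`) rather than the SDK classes and does not contact Amazon S3; it only shows the control flow of collecting record payloads and rejecting a stream that stops before the End event.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation of the End-event check; these types are stand-ins,
// not the AWS SDK's SelectObjectContentEvent classes.
final class EndEventCheckSketch {

    enum EventType { RECORDS, STATS, END }

    static final class Event {
        final EventType type;
        final String payload;

        Event(EventType type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    /**
     * Concatenates the payloads of RECORDS events.
     * Throws IllegalStateException if the stream ends without an END event,
     * because the collected records may then be incomplete.
     */
    static String collectRecords(List<Event> events) {
        StringBuilder records = new StringBuilder();
        boolean endSeen = false;
        for (Event event : events) {
            switch (event.type) {
                case RECORDS:
                    records.append(event.payload);
                    break;
                case END:
                    endSeen = true;
                    break;
                default:
                    break; // STATS carries no record data
            }
        }
        if (!endSeen) {
            throw new IllegalStateException("End event was not received; results may be incomplete.");
        }
        return records.toString();
    }

    public static void main(String[] args) {
        List<Event> complete = Arrays.asList(
                new Event(EventType.RECORDS, "Jane\n"),
                new Event(EventType.RECORDS, "Li\n"),
                new Event(EventType.STATS, ""),
                new Event(EventType.END, ""));
        System.out.print(collectRecords(complete));
    }
}
```

Failing loudly on a missing End event keeps a truncated response from being mistaken for a full result set, which is the same reason the SDK example throws after copying the stream.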
- JavaScript
For a JavaScript example that uses the AWS SDK for JavaScript with the S3 SelectObjectContent API operation to select records from JSON and CSV files stored in Amazon S3, see the blog post Introducing support for Amazon S3 Select in the AWS SDK for JavaScript.
- Python
For a Python example that uses S3 Select to run SQL queries against data loaded into Amazon S3 as a comma-separated value (CSV) file, see the blog post Querying data without servers or databases using Amazon S3 Select.