使用分割區投影在 Athena 中建立日誌的資料表 CloudTrail - Amazon Athena

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

使用分割區投影在 Athena 中建立日誌的資料表 CloudTrail

由於 CloudTrail 日誌具有已知結構,您可以預先指定其分割區結構,因此您可以使用 Athena 分割區投影功能來縮短查詢執行時間並自動化分割區管理。分割區投影會在新增資料時自動新增分割區。因此您無需使用 ALTER TABLE ADD PARTITION 手動新增分割區。

下列範例CREATE TABLE陳述式會自動在指定日期到單一 出現的 CloudTrail 日誌上使用分割區投影 AWS 區域。在 LOCATIONstorage.location.template子句中,取代 bucket, account-idaws-region 具有對應相同值的預留位置。對於 projection.timestamp.range,請取代 2020/01/01 您想要使用的開始日期。成功執行查詢之後,您可以查詢資料表。您無須執行 ALTER TABLE ADD PARTITION 就能載入分割區。

CREATE EXTERNAL TABLE cloudtrail_logs_pp( eventversion STRING, useridentity STRUCT< type: STRING, principalid: STRING, arn: STRING, accountid: STRING, invokedby: STRING, accesskeyid: STRING, username: STRING, onbehalfof: STRUCT< userid: STRING, identitystorearn: STRING>, sessioncontext: STRUCT< attributes: STRUCT< mfaauthenticated: STRING, creationdate: STRING>, sessionissuer: STRUCT< type: STRING, principalid: STRING, arn: STRING, accountid: STRING, username: STRING>, ec2roledelivery:string, webidfederationdata: STRUCT< federatedprovider: STRING, attributes: map<string,string>> > >, eventtime STRING, eventsource STRING, eventname STRING, awsregion STRING, sourceipaddress STRING, useragent STRING, errorcode STRING, errormessage STRING, requestparameters STRING, responseelements STRING, additionaleventdata STRING, requestid STRING, eventid STRING, readonly STRING, resources ARRAY<STRUCT< arn: STRING, accountid: STRING, type: STRING>>, eventtype STRING, apiversion STRING, recipientaccountid STRING, serviceeventdetails STRING, sharedeventid STRING, vpcendpointid STRING, vpcendpointaccountid STRING, eventcategory STRING, addendum STRUCT< reason:STRING, updatedfields:STRING, originalrequestid:STRING, originaleventid:STRING>, sessioncredentialfromconsole STRING, edgedevicedetails STRING, tlsdetails STRUCT< tlsversion:STRING, ciphersuite:STRING, clientprovidedhostheader:STRING> ) PARTITIONED BY ( `timestamp` string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://amzn-s3-demo-bucket/AWSLogs/account-id/CloudTrail/aws-region' TBLPROPERTIES ( 'projection.enabled'='true', 'projection.timestamp.format'='yyyy/MM/dd', 'projection.timestamp.interval'='1', 'projection.timestamp.interval.unit'='DAYS', 'projection.timestamp.range'='2020/01/01,NOW', 'projection.timestamp.type'='date', 'storage.location.template'='s3://amzn-s3-demo-bucket/AWSLogs/account-id/CloudTrail/aws-region/${timestamp}')

如需有關分割區投影的詳細資訊,請參閱搭配 Amazon Athena 使用分割區投影