Create a table for AWS WAF logs without partitioning - Amazon Athena

Create a table for AWS WAF logs without partitioning

This section describes how to create a table for AWS WAF logs without partitioning or partition projection.

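Note

For performance and cost reasons, we do not recommend using a non-partitioned schema for queries. For more information, see Top 10 performance tuning tips for Amazon Athena on the AWS Big Data Blog.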

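To create the AWS WAF table
  1. Copy and paste the following DDL statement into the Athena console. Modify the fields as needed to match your log output. Change the LOCATION value to the Amazon S3 bucket and prefix where your logs are stored.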

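    This query uses the OpenX JSON SerDe.

    Note

    The SerDe expects each JSON document to be on a single line of text, with no line termination characters separating the fields in the record. If the JSON text is in pretty-print format, you may receive an error message like HIVE_CURSOR_ERROR: Row is not a valid JSON Object or HIVE_CURSOR_ERROR: JsonParseException: Unexpected end-of-input: expected close marker for OBJECT when you attempt to query the table after you create it. For more information, see JSON Data Files in the OpenX SerDe documentation on GitHub.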

    CREATE EXTERNAL TABLE `waf_logs`(
      `timestamp` bigint,
      `formatversion` int,
      `webaclid` string,
      `terminatingruleid` string,
      `terminatingruletype` string,
      `action` string,
      `terminatingrulematchdetails` array <
        struct <
          conditiontype: string,
          sensitivitylevel: string,
          location: string,
          matcheddata: array < string >
        >
      >,
      `httpsourcename` string,
      `httpsourceid` string,
      `rulegrouplist` array <
        struct <
          rulegroupid: string,
          terminatingrule: struct <
            ruleid: string,
            action: string,
            rulematchdetails: array <
              struct <
                conditiontype: string,
                sensitivitylevel: string,
                location: string,
                matcheddata: array < string >
              >
            >
          >,
          nonterminatingmatchingrules: array <
            struct <
              ruleid: string,
              action: string,
              overriddenaction: string,
              rulematchdetails: array <
                struct <
                  conditiontype: string,
                  sensitivitylevel: string,
                  location: string,
                  matcheddata: array < string >
                >
              >,
              challengeresponse: struct < responsecode: string, solvetimestamp: string >,
              captcharesponse: struct < responsecode: string, solvetimestamp: string >
            >
          >,
          excludedrules: string
        >
      >,
      `ratebasedrulelist` array <
        struct < ratebasedruleid: string, limitkey: string, maxrateallowed: int >
      >,
      `nonterminatingmatchingrules` array <
        struct <
          ruleid: string,
          action: string,
          rulematchdetails: array <
            struct <
              conditiontype: string,
              sensitivitylevel: string,
              location: string,
              matcheddata: array < string >
            >
          >,
          challengeresponse: struct < responsecode: string, solvetimestamp: string >,
          captcharesponse: struct < responsecode: string, solvetimestamp: string >
        >
      >,
      `requestheadersinserted` array < struct < name: string, value: string > >,
      `responsecodesent` string,
      `httprequest` struct <
        clientip: string,
        country: string,
        headers: array < struct < name: string, value: string > >,
        uri: string,
        args: string,
        httpversion: string,
        httpmethod: string,
        requestid: string
      >,
      `labels` array < struct < name: string > >,
      `captcharesponse` struct < responsecode: string, solvetimestamp: string, failureReason: string >,
      `challengeresponse` struct < responsecode: string, solvetimestamp: string, failureReason: string >,
      `ja3Fingerprint` string,
      `oversizefields` string,
      `requestbodysize` int,
      `requestbodysizeinspectedbywaf` int
    )
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION 's3://amzn-s3-demo-bucket/prefix/'
  2. Run the CREATE EXTERNAL TABLE statement in the Athena console query editor. This registers the waf_logs table and makes the data in it available for queries from Athena.
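
The note in step 1 says the SerDe needs each JSON record on a single line. If your log files were pretty-printed by some intermediate tool, one possible fix is to rewrite them before querying. The following is a minimal sketch using only the Python standard library. The function name to_json_lines is hypothetical and not part of Athena or AWS WAF, and the sketch assumes the input is a sequence of whole JSON objects separated only by whitespace.

```python
import json


def to_json_lines(text: str) -> str:
    """Re-serialize concatenated (possibly pretty-printed) JSON objects,
    writing exactly one compact object per line.

    Hypothetical helper for illustration only; assumes the input holds
    complete JSON documents separated by nothing but whitespace.
    """
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    length = len(text)
    while pos < length:
        # Skip whitespace (including newlines) between documents.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode parses one document and returns the index where it ended.
        record, pos = decoder.raw_decode(text, pos)
        # Compact separators guarantee no line breaks inside a record.
        lines.append(json.dumps(record, separators=(",", ":")))
    return "\n".join(lines) + ("\n" if lines else "")
```

After a file has been rewritten this way and uploaded back to the S3 location that the table points to, each record occupies one line, which is the layout the note describes. Reading from and writing to Amazon S3 is left out of the sketch on purpose.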