Select your cookie preferences

We use essential cookies and similar tools that are necessary to provide our site and services. We use performance cookies to collect anonymous statistics, so we can understand how customers use our site and make improvements. Essential cookies cannot be deactivated, but you can choose “Customize” or “Decline” to decline performance cookies.

If you agree, AWS and approved third parties will also use cookies to provide useful site features, remember your preferences, and display relevant content, including relevant advertising. To accept or decline all non-essential cookies, choose “Accept” or “Decline.” To make more detailed choices, choose “Customize.”

Create and query a basic table based on Amazon EMR log files

Focus mode
Create and query a basic table based on Amazon EMR log files - Amazon Athena

The following example creates a basic table, myemrlogs, based on log files saved to s3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6/elasticmapreduce/. The Amazon S3 location used in the examples below reflects the pattern of the default log location for an EMR cluster created by Amazon Web Services account 123456789012 in Region us-west-2. If you use a custom location, the pattern is s3://amzn-s3-demo-bucket/ClusterID.

For information about creating a partitioned table to potentially improve query performance and reduce data transfer, see Create and query a partitioned table based on Amazon EMR logs.

CREATE EXTERNAL TABLE `myemrlogs`( `data` string COMMENT 'from deserializer') ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6'

Example queries

The following example queries can be run on the myemrlogs table created by the previous example.

Example – Query step logs for occurrences of ERROR, WARN, INFO, EXCEPTION, FATAL, or DEBUG
SELECT data, "$PATH" FROM "default"."myemrlogs" WHERE regexp_like("$PATH",'s-86URH188Z6B1') AND regexp_like(data, 'ERROR|WARN|INFO|EXCEPTION|FATAL|DEBUG') limit 100;
Example – Query a specific instance log, i-00b3c0a839ece0a9c, for ERROR, WARN, INFO, EXCEPTION, FATAL, or DEBUG
SELECT "data", "$PATH" AS filepath FROM "default"."myemrlogs" WHERE regexp_like("$PATH",'i-00b3c0a839ece0a9c') AND regexp_like("$PATH",'state') AND regexp_like(data, 'ERROR|WARN|INFO|EXCEPTION|FATAL|DEBUG') limit 100;
Example – Query presto application logs for ERROR, WARN, INFO, EXCEPTION, FATAL, or DEBUG
SELECT "data", "$PATH" AS filepath FROM "default"."myemrlogs" WHERE regexp_like("$PATH",'presto') AND regexp_like(data, 'ERROR|WARN|INFO|EXCEPTION|FATAL|DEBUG') limit 100;
Example – Query Namenode application logs for ERROR, WARN, INFO, EXCEPTION, FATAL, or DEBUG
SELECT "data", "$PATH" AS filepath FROM "default"."myemrlogs" WHERE regexp_like("$PATH",'namenode') AND regexp_like(data, 'ERROR|WARN|INFO|EXCEPTION|FATAL|DEBUG') limit 100;
Example – Query all logs by date and hour for ERROR, WARN, INFO, EXCEPTION, FATAL, or DEBUG
SELECT distinct("$PATH") AS filepath FROM "default"."myemrlogs" WHERE regexp_like("$PATH",'2019-07-23-10') AND regexp_like(data, 'ERROR|WARN|INFO|EXCEPTION|FATAL|DEBUG') limit 100;

On this page

PrivacySite termsCookie preferences
© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.