The Amazon Athena AWS CMDB connector enables Athena to communicate with various AWS services so that you can query them with SQL.
This connector can be registered with Glue Data Catalog as a federated catalog. It supports data access controls defined in Lake Formation at the catalog, database, table, column, row, and tag levels. This connector uses Glue Connections to centralize configuration properties in Glue.
Prerequisites
Deploy the connector to your AWS account using the Athena console or the AWS Serverless Application Repository. For more information, see Create a data source connection or Use the AWS Serverless Application Repository to deploy a data source connector.
Parameters
Use the parameters in this section to configure the AWS CMDB connector.
We recommend that you configure an AWS CMDB connector by using a Glue connections object. To do this, set the glue_connection environment variable of the AWS CMDB connector Lambda to the name of the Glue connection to use.
Glue connections properties
Use the following command to get the schema for a Glue connection object. This schema contains all the parameters that you can use to control your connection.
aws glue describe-connection-type --connection-type CMDB
Lambda environment properties
glue_connection – Specifies the name of the Glue connection associated with the federated connector.
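As a minimal sketch of setting this variable programmatically, the following Python example uses boto3 to merge glue_connection into a Lambda function's existing environment variables. The function and connection names in the comment are hypothetical placeholders. The read-then-merge approach is an assumption about how you might do this, not a documented procedure for this connector. It is used here because updating a function's environment replaces the entire set of variables.

```python
def merged_environment(existing, connection_name):
    """Return a copy of the existing variables with glue_connection set."""
    if not connection_name:
        raise ValueError("connection_name must not be empty")
    merged = dict(existing or {})
    merged["glue_connection"] = connection_name
    return merged


def set_glue_connection(function_name, connection_name):
    """Point a deployed connector Lambda at a Glue connection."""
    import boto3  # imported lazily so merged_environment works without boto3

    client = boto3.client("lambda")
    # Read the current variables first so that they are not lost on update.
    config = client.get_function_configuration(FunctionName=function_name)
    current = config.get("Environment", {}).get("Variables", {})
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": merged_environment(current, connection_name)},
    )


# Example call with hypothetical names (requires AWS credentials):
# set_glue_connection("athena-cmdb-connector", "my-cmdb-connection")
```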
Databases and tables
The Athena AWS CMDB connector makes the following databases and tables available for querying your AWS resource inventory. For more information on the columns available in each table, run a DESCRIBE database.table statement using the Athena console or API.
- ec2 – This database contains Amazon EC2 related resources, including the following.
  - ebs_volumes – Contains details of your Amazon EBS volumes.
  - ec2_instances – Contains details of your EC2 Instances.
  - ec2_images – Contains details of your EC2 Instance images.
  - routing_tables – Contains details of your VPC Routing Tables.
  - security_groups – Contains details of your security groups.
  - subnets – Contains details of your VPC Subnets.
  - vpcs – Contains details of your VPCs.
- emr – This database contains Amazon EMR related resources, including the following.
  - emr_clusters – Contains details of your EMR Clusters.
- rds – This database contains Amazon RDS related resources, including the following.
  - rds_instances – Contains details of your RDS Instances.
- s3 – This database contains Amazon S3 related resources, including the following.
  - buckets – Contains details of your Amazon S3 buckets.
  - objects – Contains details of your Amazon S3 objects, excluding their contents.
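The following Python sketch shows one way to build a DESCRIBE statement for any of the tables listed above and submit it through the Athena API with boto3. The catalog name and the S3 output location are values you supply. Passing the catalog through QueryExecutionContext is an assumption about how your connector is registered, so adjust it to match your setup.

```python
# Databases and tables as listed above.
CMDB_TABLES = {
    "ec2": [
        "ebs_volumes", "ec2_instances", "ec2_images", "routing_tables",
        "security_groups", "subnets", "vpcs",
    ],
    "emr": ["emr_clusters"],
    "rds": ["rds_instances"],
    "s3": ["buckets", "objects"],
}


def describe_statement(database, table):
    """Build a DESCRIBE statement for a known CMDB table."""
    if table not in CMDB_TABLES.get(database, []):
        raise ValueError(f"unknown CMDB table: {database}.{table}")
    return f"DESCRIBE {database}.{table}"


def describe_table(catalog, database, table, output_location):
    """Start the DESCRIBE query and return its query execution ID."""
    import boto3  # imported lazily so describe_statement works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=describe_statement(database, table),
        QueryExecutionContext={"Catalog": catalog, "Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

Restricting the input to the tables listed above keeps arbitrary text out of the query string.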
Required Permissions
For full details on the IAM policies that this connector requires, review the Policies section of the athena-aws-cmdb.yaml file. The following list summarizes the required permissions.
- Amazon S3 write access – The connector requires write access to a location in Amazon S3 in order to spill results from large queries.
- Athena GetQueryExecution – The connector uses this permission to fast-fail when the upstream Athena query has terminated.
- S3 List – The connector uses this permission to list your Amazon S3 buckets and objects.
- EC2 Describe – The connector uses this permission to describe resources such as your Amazon EC2 instances, security groups, VPCs, and Amazon EBS volumes.
- EMR Describe / List – The connector uses this permission to describe your EMR clusters.
- RDS Describe – The connector uses this permission to describe your RDS Instances.
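To illustrate how the summary above might translate into an IAM policy document, the following Python sketch builds one as a dictionary. The mapping from each summary item to specific IAM actions is an assumption made for illustration, and the spill bucket name is a placeholder that you supply. The Policies section of the athena-aws-cmdb.yaml file remains the authoritative list.

```python
import json


def sketch_policy(spill_bucket):
    """Build an illustrative policy document from the permission summary."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Write access to the spill location (assumed action).
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{spill_bucket}/*"],
            },
            {
                # Read-only inventory access (assumed action names).
                "Effect": "Allow",
                "Action": [
                    "athena:GetQueryExecution",
                    "s3:ListAllMyBuckets",
                    "s3:ListBucket",
                    "ec2:Describe*",
                    "elasticmapreduce:Describe*",
                    "elasticmapreduce:List*",
                    "rds:Describe*",
                ],
                "Resource": ["*"],
            },
        ],
    }


def policy_json(spill_bucket):
    """Serialize the sketch for review or for use with other tooling."""
    return json.dumps(sketch_policy(spill_bucket), indent=2)
```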
Performance
Currently, the Athena AWS CMDB connector does not support parallel scans. Predicate pushdown is performed within the Lambda function. Where possible, partial predicates are pushed to the services being queried. For example, a query for the details of a specific Amazon EC2 instance calls the EC2 API with the specific instance ID to run a targeted describe operation.
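As an illustration of a query that could benefit from this behavior, the following Python sketch builds a SELECT statement that filters the ec2_instances table to a single instance. The column name instance_id is an assumption, so confirm it with a DESCRIBE statement before you rely on it. The ID check is added here only to keep unexpected text out of the query string.

```python
import re

# EC2 instance IDs are "i-" followed by 8 or 17 lowercase hex characters.
_INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def instance_query(instance_id):
    """Build a query for one EC2 instance (instance_id column is assumed)."""
    if not _INSTANCE_ID.fullmatch(instance_id):
        raise ValueError(f"not a valid EC2 instance ID: {instance_id!r}")
    return (
        "SELECT * FROM ec2.ec2_instances "
        f"WHERE instance_id = '{instance_id}'"
    )
```

An equality filter on the instance ID is the kind of predicate that the connector can turn into a targeted describe call. Other filters are still applied, but within the Lambda function after the service returns its results.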
License information
The Amazon Athena AWS CMDB connector project is licensed under the Apache-2.0 License.
Additional resources
For additional information about this connector, visit the corresponding site.