Automate AWS resource assessment - AWS Prescriptive Guidance

Created by Naveen Suthar (AWS), Arun Bagal (AWS), Manish Garg (AWS), and Sandeep Gawande (AWS)

Code repository: infrastructure-assessment-iac-automation

Environment: PoC or pilot

Technologies: DevOps; Infrastructure; Management & governance; Operations; Serverless

AWS services: Amazon Athena; AWS CloudTrail; AWS Lambda; Amazon S3; Amazon QuickSight

Summary

This pattern describes an automated approach for setting up resource assessment capabilities by using the AWS Cloud Development Kit (AWS CDK). By using this pattern, operations teams can gather resource auditing details in an automated manner and view the details of all resources deployed in an AWS account on a single dashboard.

The solution also helps the leadership team obtain insights about the resources and activities in an AWS account from the same dashboard.

Note: Amazon QuickSight is a paid service. Before running it to analyze data and create a dashboard, review the Amazon QuickSight pricing.

Prerequisites and limitations

Prerequisites

Limitations

  • This solution is deployed to a single AWS account.

  • The solution will not track the events that happened before its deployment unless AWS CloudTrail was already set up and storing data in an S3 bucket.

Product versions

  • AWS CDK version 2.55.1 or later

  • Python version 3.9 or later

Architecture

Target technology stack

  • Amazon Athena

  • AWS CloudTrail

  • AWS Glue

  • AWS Lambda

  • Amazon QuickSight

  • Amazon S3

Target architecture

The AWS CDK code deploys all the resources that are required to set up resource-assessment capabilities in an AWS account. The following diagram shows how CloudTrail logs flow through Amazon S3, AWS Lambda, AWS Glue, and Amazon Athena to QuickSight.

Diagram: AWS resource assessment with AWS Glue, Amazon Athena, and Amazon QuickSight in a six-step process.

  1. CloudTrail sends logs to an S3 bucket for storage.

  2. An event notification invokes a Lambda function that processes the logs and generates filtered data.

  3. The filtered data is stored in another S3 bucket.

  4. An AWS Glue crawler runs on the filtered data in the S3 bucket and creates the schema as a table in the AWS Glue Data Catalog.

  5. The filtered data is ready to be queried by Amazon Athena.

  6. The queried data is accessed by QuickSight for visualization.
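
The filtering in step 2 might look roughly like the following sketch. The pattern's actual Lambda code is in src/lambda_code in the repository; this example is not taken from it. It assumes that only events that change resources (readOnly is false) are of interest, and the function name and the selection of output fields are hypothetical.

```python
import gzip
import json


def filter_cloudtrail_records(gzipped_log: bytes) -> str:
    """Return the non-read-only events of one CloudTrail log file as JSON lines."""
    # CloudTrail delivers each log file as gzip-compressed JSON with a
    # top-level "Records" array.
    records = json.loads(gzip.decompress(gzipped_log)).get("Records", [])

    lines = []
    for record in records:
        # Assumption: read-only calls (Describe*, List*, Get*) are skipped.
        if record.get("readOnly", False):
            continue
        # Hypothetical choice of fields to keep for the dashboard.
        lines.append(json.dumps({
            "eventTime": record.get("eventTime"),
            "eventSource": record.get("eventSource"),
            "eventName": record.get("eventName"),
            "awsRegion": record.get("awsRegion"),
            "userAgent": record.get("userAgent"),
            "userArn": record.get("userIdentity", {}).get("arn"),
        }))
    # One JSON object per line keeps the output easy to crawl and query.
    return "\n".join(lines)
```

In a Lambda handler, the input bytes would come from the object named in the S3 event notification, and the returned text would be written to the output bucket.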

Automation and scale

  • This solution can be scaled from one AWS account to multiple AWS accounts if there is an organization-wide CloudTrail trail in AWS Organizations. By deploying CloudTrail at the organizational level, you can also use this solution to fetch resource-auditing details for all the required resources.

  • This pattern uses AWS serverless resources to deploy the solution.

Tools

AWS services

  • Amazon Athena is an interactive query service that helps you analyze data directly in Amazon S3 by using standard SQL.

  • AWS Cloud Development Kit (AWS CDK) is a software development framework that helps you define and provision AWS Cloud infrastructure in code.

  • AWS CloudFormation helps you set up AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle across AWS accounts and AWS Regions.

  • AWS CloudTrail helps you audit the governance, compliance, and operational risk of your AWS account.

  • AWS Glue is a fully managed extract, transform, and load (ETL) service. It helps you reliably categorize, clean, enrich, and move data between data stores and data streams. This pattern uses an AWS Glue crawler and an AWS Glue Data Catalog table.

  • AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.

  • Amazon QuickSight is a cloud-scale business intelligence (BI) service that helps you visualize, analyze, and report your data in a single dashboard.

  • Amazon Simple Storage Service (Amazon S3) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.

Code repository

The code for this pattern is available in the GitHub infrastructure-assessment-iac-automation repository.

The code repository contains the following files and folders:

  • lib folder – The AWS CDK construct Python files used to create AWS resources

  • src/lambda_code – The Python code that is run in the Lambda function

  • requirements.txt – The list of all Python dependencies that must be installed

  • cdk.json – The input file to provide values required to spin up resources

Best practices

Set up monitoring and alerting for the Lambda function. For more information, see Monitoring and troubleshooting Lambda functions. For general best practices when working with Lambda functions, see the AWS documentation.

Epics

Task | Description | Skills required

Clone the repo on your local machine.

To clone the repository, run the command git clone https://github.com/aws-samples/infrastructure-assessment-iac-automation.git.

AWS DevOps, DevOps engineer

Set up the Python virtual environment and install required dependencies.

To set up the Python virtual environment, run the following commands.

    cd infrastructure-assessment-iac-automation
    python3 -m venv .venv
    source .venv/bin/activate

To set up the required dependencies, run the command pip install -r requirements.txt.

AWS DevOps, DevOps engineer

Set up the AWS CDK environment and synthesize the AWS CDK code.

  1. To set up the AWS CDK environment in your AWS account, run the command cdk bootstrap aws://ACCOUNT-NUMBER/REGION.

  2. To convert the code to an AWS CloudFormation stack configuration, run the command cdk synth.

AWS DevOps, DevOps engineer

Task | Description | Skills required

Export variables for the account and Region where the stack will be deployed.

To specify the target AWS account and Region for AWS CDK by using environment variables, run the following commands.

    export CDK_DEFAULT_ACCOUNT=<12 Digit AWS Account Number>
    export CDK_DEFAULT_REGION=<region>

AWS DevOps, DevOps engineer
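
Before deploying, it can help to confirm that both variables are set as expected. The following is a minimal sketch of such a check; the helper name read_cdk_environment is hypothetical and is not part of the repository.

```python
import os
import re


def read_cdk_environment(environ=os.environ) -> dict:
    """Return the target account and Region, or raise if they look wrong."""
    account = environ.get("CDK_DEFAULT_ACCOUNT", "")
    region = environ.get("CDK_DEFAULT_REGION", "")
    # AWS account IDs are 12-digit numbers.
    if not re.fullmatch(r"\d{12}", account):
        raise ValueError("CDK_DEFAULT_ACCOUNT must be a 12-digit AWS account number")
    if not region:
        raise ValueError("CDK_DEFAULT_REGION is not set")
    return {"account": account, "region": region}
```

In an AWS CDK Python app, values like these are typically passed to a stack through its env argument.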

Set up the AWS CLI profile.

To set up the AWS CLI profile for the account, follow the instructions in the AWS documentation.

AWS DevOps, DevOps engineer

Task | Description | Skills required

Deploy resources in the account.

To deploy resources in the AWS account by using AWS CDK, do the following:

  1. In the root of the cloned repository, in the cdk.json file, provide inputs for the following parameters:

    • s3_context

    • ct_context

    • kms_context

    • lambda_context

    • glue_context

    • qs_context

    These values define resource configurations and nomenclature. Default values are set and can be changed if required.

    Note: To avoid an error saying that the S3 bucket already exists, make sure to provide unique names for s3_context in the ct and output sections.

  2. To deploy resources, run the command cdk deploy.

    The cdk deploy command creates a CloudTrail trail that logs events and saves the log files in the input S3 bucket. The Lambda function processes the trail's log files. The filtered results are stored in the output S3 bucket and are ready to be consumed by Amazon Athena and Amazon QuickSight.

AWS DevOps
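
A quick check that cdk.json still contains all six parameters before running cdk deploy could be sketched as follows. It assumes that the parameters sit under the top-level "context" key, which is the usual AWS CDK layout but has not been verified against this repository; the function name is hypothetical.

```python
import json

# The parameter names listed in the deployment step above.
REQUIRED_CONTEXT_KEYS = (
    "s3_context",
    "ct_context",
    "kms_context",
    "lambda_context",
    "glue_context",
    "qs_context",
)


def missing_context_keys(cdk_json_text: str) -> list:
    """Return the required parameters that are absent from cdk.json."""
    # Assumption: the parameters are stored under the "context" key.
    context = json.loads(cdk_json_text).get("context", {})
    return [key for key in REQUIRED_CONTEXT_KEYS if key not in context]
```

For example, pass in the text of the cdk.json file from the repository root; an empty list means that every parameter is present.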

Run the AWS Glue crawler and create the Data Catalog table.

An AWS Glue crawler is used to keep the data schema dynamic. The solution creates and updates partitions in the AWS Glue Data Catalog table by running the crawler periodically as defined by the AWS Glue crawler scheduler. After the data is available in the output S3 bucket, use the following steps to run the AWS Glue crawler and create the Data Catalog table schema for testing:

  1. Sign in to the AWS Management Console and navigate to the AWS Glue console.

  2. In the navigation pane, under Data Catalog, choose Crawlers.

  3. Select the iac-tool-qa-resource-iac-json-crawler crawler.

  4. Run the crawler.

  5. Confirm that the crawler run succeeds and creates an AWS Glue Data Catalog table. Amazon QuickSight uses this table to visualize the data.

Note: The AWS CDK code configures the AWS Glue crawler to run at a particular time, but you can also run it on demand.

AWS DevOps, DevOps engineer
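
The same on-demand run can also be started programmatically. The following is a minimal sketch that assumes a boto3 AWS Glue client is passed in (for example, boto3.client("glue")). The function name, polling interval, and timeout are hypothetical choices.

```python
import time


def run_crawler_and_wait(glue_client, crawler_name, poll_seconds=15, timeout_seconds=900):
    """Start an AWS Glue crawler and return the status of the finished crawl."""
    glue_client.start_crawler(Name=crawler_name)
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        crawler = glue_client.get_crawler(Name=crawler_name)["Crawler"]
        # The crawler reports READY again after the run has finished.
        if crawler["State"] == "READY":
            # LastCrawl.Status is SUCCEEDED, CANCELLED, or FAILED.
            return crawler.get("LastCrawl", {}).get("Status")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Crawler {crawler_name} did not finish in {timeout_seconds} seconds")
```

With the crawler from this pattern, the call would be run_crawler_and_wait(glue_client, "iac-tool-qa-resource-iac-json-crawler").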

Deploy the QuickSight construct.

  1. To deploy the QuickSight construct, uncomment the code between #QuickSight setup - start and #QuickSight setup – ends in resource_iac_tool_stack.py.

  2. After you uncomment the code, run the cdk deploy command to create the QuickSight DataSource and QuickSight DataSet in the QuickSight account.

AWS DevOps, DevOps engineer

Create the QuickSight dashboard.

To create the example QuickSight dashboard and analysis, do the following:

  1. Navigate to the QuickSight console and select the AWS Region where resources are deployed.

  2. In the navigation pane, choose Datasets, and verify that a dataset named ct-operations-iac-ds has been created.

    If you don't see the dataset, redeploy the QuickSight construct.

  3. Select the ct-operations-iac-ds dataset, and choose USE IN ANALYSIS.

  4. Select the default sheet.

  5. Select the respective columns from the field list on the left side.

  6. After selecting the required columns, select the appropriate visual type to view the data.

For more information, see Starting an analysis in Amazon QuickSight and Visual types in Amazon QuickSight.

AWS DevOps, DevOps engineer

Task | Description | Skills required

Remove the AWS resources.

  1. To remove AWS resources deployed by the solution, run the command cdk destroy.

  2. Delete all objects from the two S3 buckets, and then remove the buckets.

    For more information, see Deleting a bucket.

AWS DevOps, DevOps engineer
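
Step 2 can be scripted. The following sketch assumes a boto3 Amazon S3 client is passed in and that the bucket is not versioned; a versioned bucket would also need its object versions and delete markers removed. The function name is hypothetical.

```python
def empty_and_delete_bucket(s3_client, bucket_name):
    """Delete every object in a non-versioned bucket, then delete the bucket."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # Each page holds at most 1,000 keys, which is also the
        # per-request limit of delete_objects.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted += len(keys)
    # A bucket can be deleted only after it is empty.
    s3_client.delete_bucket(Bucket=bucket_name)
    return deleted
```

Run it once for the input bucket and once for the output bucket, using the names that you set in cdk.json.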

Task | Description | Skills required

Monitor and clean up manually created resources.

(Optional) If your organization has compliance requirements to create resources using IaC tools, you can achieve compliance by using AWS resource-assessment tool automation to fetch manually provisioned resources. You can also use the tool to import the resources to an IaC tool or to re-create them. To monitor manually provisioned resources, perform the following high-level tasks:

  1. Deploy AWS resource-assessment tool automation.

  2. Set up a Lambda function to query the Athena tables on a daily basis, find the relevant data about manually provisioned resources, and export it to a comma-separated values (CSV) file.

  3. After the Lambda function runs, send a notification with the required data to the respective stakeholders.

  4. For longer retention, store the .csv file in an S3 bucket.

  5. Based on the information in the .csv file, delete the manually created resources or import them to an existing IaC solution.

AWS DevOps, DevOps engineer
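
The query in step 2 could be run along the following lines. This is a sketch only: it assumes a boto3 Athena client is passed in, and the function name, polling interval, and timeout are hypothetical. The query text, database name, and output location depend on your deployment and are left as parameters. Athena writes the results of a SELECT query as a .csv file to the output location, so no separate export step is sketched here.

```python
import time


def run_athena_query(athena_client, query, database, output_location,
                     poll_seconds=2, timeout_seconds=300):
    """Run an Athena query and return the S3 location of its result file."""
    execution_id = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        execution = athena_client.get_query_execution(
            QueryExecutionId=execution_id
        )["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            # Athena reports the full path of the result file here.
            return execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get("StateChangeReason", state)
            raise RuntimeError(f"Athena query {execution_id} ended: {reason}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Athena query {execution_id} is still running")
```

A Lambda function on a daily schedule could call this helper and then include the returned location in the notification to stakeholders.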

Troubleshooting

IssueSolution

AWS CDK returns errors.

For help with AWS CDK issues, see Troubleshooting common AWS CDK issues.

Related resources

Additional information

Multiple accounts

To set up the AWS CLI credential for multiple accounts, use AWS profiles. For more information, see the Configure multiple profiles section in Set up the AWS CLI.

AWS CDK commands

When working with AWS CDK, keep in mind the following useful commands:

  • cdk ls – Lists all stacks in the app

  • cdk synth – Emits the synthesized AWS CloudFormation template

  • cdk deploy – Deploys the stack to your default AWS account and Region

  • cdk diff – Compares the deployed stack with the current state

  • cdk docs – Opens the AWS CDK documentation