Automate event-driven backups from CodeCommit to Amazon S3 using CodeBuild and CloudWatch Events - AWS Prescriptive Guidance

Created by Kirankumar Chandrashekar (AWS)

Environment: Production

Technologies: DevOps; Storage & backup

Workload: All other workloads

AWS services: Amazon S3; Amazon CloudWatch; AWS CodeBuild; AWS CodeCommit

Summary

On the Amazon Web Services (AWS) Cloud, you can use AWS CodeCommit to host secure Git-based repositories. CodeCommit is a fully managed source control service. However, if a CodeCommit repository is accidentally deleted, its contents are also deleted and cannot be restored.

This pattern describes how to automatically back up a CodeCommit repository to an Amazon Simple Storage Service (Amazon S3) bucket after a change is made to the repository. If the CodeCommit repository is later deleted, this backup strategy provides you with a point-in-time recovery option.

Prerequisites and limitations

Prerequisites 

  • An active AWS account.

  • An existing CodeCommit repository, with user access configured according to your requirements. For more information, see Setting up for AWS CodeCommit in the CodeCommit documentation.  

  • An S3 bucket for uploading the CodeCommit backups. 

Limitations

  • This pattern automatically backs up all of your CodeCommit repositories. If you want to back up individual CodeCommit repositories, you must modify the Amazon CloudWatch Events rule.

Architecture

The following diagram illustrates the workflow for this pattern.

AWS Cloud architecture showing Git push workflow from Users to S3 bucket via CodeCommit and CodeBuild.

The workflow consists of the following steps:

  1. Code is pushed to a CodeCommit repository.

  2. The CodeCommit repository notifies CloudWatch Events of a repository change (for example, a git push command).

  3. CloudWatch Events invokes AWS CodeBuild and sends it the CodeCommit repository information.

  4. CodeBuild clones the entire CodeCommit repository and packages it into a .zip file.

  5. CodeBuild uploads the .zip file to an S3 bucket.

Technology stack  

  • CloudWatch Events

  • CodeBuild

  • CodeCommit

  • Amazon S3

Tools

  • Amazon CloudWatch Events – CloudWatch Events delivers a near real-time stream of system events that describe changes in AWS resources.

  • AWS CodeBuild – CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. 

  • AWS CodeCommit – CodeCommit is a fully managed source control service that hosts secure Git-based repositories. 

  • AWS Identity and Access Management (IAM) – IAM is a web service that helps you securely control access to AWS resources.

  • Amazon S3 – Amazon Simple Storage Service (Amazon S3) is an object storage service. In this pattern, it stores the .zip backups of your CodeCommit repositories.

Epics

Task: Create a CodeBuild service role.

Description: Sign in to the AWS Management Console and open the IAM console. Choose Roles, and choose Create role. Create a service role that allows CodeBuild to clone the CodeCommit repository, upload files to the S3 bucket, and send logs to Amazon CloudWatch. For more information, see Create a CodeBuild service role in the CodeBuild documentation.

Skills required: Cloud administrator

Task: Create a CodeBuild project.

Description: On the CodeBuild console, choose Create CodeBuild project. Create a CodeBuild project by using the buildspec.yml template from the Additional information section. For more information, see Create a build project in the CodeBuild documentation.

Skills required: Cloud administrator
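The permissions that the service role needs can be sketched as two IAM policy documents. The following Python snippet is a minimal illustration, not the pattern's official policy. The bucket name, Region, account ID, and the wildcard repository resource are placeholder assumptions, and you would likely scope them more tightly for your environment.

```python
import json

# Hypothetical placeholders; replace with your own values.
BACKUP_BUCKET = "example-codecommit-backups"
REGION = "us-east-1"
ACCOUNT_ID = "111122223333"

# Trust policy: lets the CodeBuild service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "codebuild.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy: clone from CodeCommit, upload to S3, write build logs.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "codecommit:GitPull",
            "Resource": f"arn:aws:codecommit:{REGION}:{ACCOUNT_ID}:*",
        },
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BACKUP_BUCKET}/*",
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": f"arn:aws:logs:{REGION}:{ACCOUNT_ID}:*",
        },
    ],
}

# Serialize both documents so they can be pasted into the IAM console.
trust_policy_json = json.dumps(trust_policy, indent=2)
permissions_policy_json = json.dumps(permissions_policy, indent=2)
print(trust_policy_json)
print(permissions_policy_json)
```

The printed JSON can be pasted into the trust relationship and inline policy editors on the IAM console when you create the role.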
Task: Create an IAM role for CloudWatch Events.

Description: On the IAM console, choose Roles and create an IAM role for CloudWatch Events. For more information, see CloudWatch Events IAM role in the IAM documentation.

Important: You must add codebuild:StartBuild permissions to the IAM role for CloudWatch Events.

Skills required: Cloud administrator

Task: Create a CloudWatch Events rule.

Description:

  1. On the CloudWatch console, choose Events and then choose Rules. Choose Create rule, and use the CloudWatch Events rule from the Additional information section. This creates a rule that listens for repository changes (for example, a git push command that creates or updates a branch or tag) in your CodeCommit repositories. For more information, see Create a CloudWatch Events rule for a CodeCommit source in the AWS CodePipeline documentation.

  2. Choose Targets, choose the CodeBuild project that you created earlier as the target, and then choose Configure input. Choose Input transformer, and use the input path and input template from the Additional information section. This ensures that your CodeCommit repository details are parsed and sent as environment variables to the CodeBuild project. For more information, see the input transformer tutorial in the CloudWatch documentation.

  3. Choose Configure details, and enter a name and description for the rule. Choose Create rule.

Important: This CloudWatch Events rule matches changes in all of your CodeCommit repositories. You must modify the CloudWatch Events rule if you want to back up individual CodeCommit repositories or use separate S3 buckets for different repository backups.

Skills required: Cloud administrator
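As a rough sketch of the role described above, the following Python snippet builds a trust policy for CloudWatch Events, a permissions policy that allows codebuild:StartBuild on a single project, and the basic shape of a rule target that uses the role. The project name, Region, account ID, role name, and target ID are hypothetical placeholders.

```python
import json

# Hypothetical placeholders; replace with your own values.
REGION = "us-east-1"
ACCOUNT_ID = "111122223333"
PROJECT_NAME = "codecommit-backup"

project_arn = f"arn:aws:codebuild:{REGION}:{ACCOUNT_ID}:project/{PROJECT_NAME}"

# Trust policy: lets CloudWatch Events assume the role.
events_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "events.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy: the rule may start builds of the backup project only.
events_permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "codebuild:StartBuild",
            "Resource": project_arn,
        }
    ],
}


def build_target(role_arn):
    """Return a rule target that points at the CodeBuild project.

    The "Id" value is an arbitrary label chosen for this sketch.
    """
    return {"Id": "codecommit-backup-build", "Arn": project_arn, "RoleArn": role_arn}


target = build_target(f"arn:aws:iam::{ACCOUNT_ID}:role/example-events-role")
print(json.dumps(events_permissions_policy, indent=2))
print(json.dumps(target, indent=2))
```

Restricting the Resource element to one project ARN, rather than using a wildcard, limits what the rule can start if the role is reused elsewhere.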

Related resources

Creating a CodeBuild project

Creating and configuring a CloudWatch Events rule

Additional information

CodeBuild buildspec.yml template

version: 0.2
phases:
  install:
    commands:
      # git-remote-codecommit enables the codecommit:: clone URL used below
      - pip install git-remote-codecommit
  build:
    commands:
      - env
      - git clone -b $REFERENCE_NAME codecommit::$REPO_REGION://$REPOSITORY_NAME
      - dt=$(date '+%d-%m-%Y-%H:%M:%S');
      - echo "$dt"
      - zip -yr $dt-$REPOSITORY_NAME-backup.zip ./
      - aws s3 cp $dt-$REPOSITORY_NAME-backup.zip s3:// #substitute a valid S3 Bucket Name here
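To make the naming in the buildspec easier to follow, the Python sketch below reproduces how the clone URL and the backup file name are assembled from the environment variables. The function names and sample values are hypothetical. The last function is an optional variant, not part of the pattern, that puts the year first so that object keys sort chronologically.

```python
from datetime import datetime, timezone


def clone_url(region, repository):
    """Build the git-remote-codecommit URL used by the clone command."""
    return f"codecommit::{region}://{repository}"


def backup_name(repository, now):
    """Mirror the buildspec file name: <dd-mm-YYYY-HH:MM:SS>-<repo>-backup.zip."""
    return f"{now.strftime('%d-%m-%Y-%H:%M:%S')}-{repository}-backup.zip"


def sortable_backup_key(repository, now):
    """Hypothetical variant: year-first timestamp under a per-repository prefix."""
    return f"{repository}/{now.strftime('%Y-%m-%dT%H-%M-%SZ')}-backup.zip"


# Fixed sample timestamp so the output is reproducible.
moment = datetime(2024, 3, 9, 14, 5, 30, tzinfo=timezone.utc)

print(clone_url("us-east-1", "my-repo"))       # codecommit::us-east-1://my-repo
print(backup_name("my-repo", moment))          # 09-03-2024-14:05:30-my-repo-backup.zip
print(sortable_backup_key("my-repo", moment))  # my-repo/2024-03-09T14-05-30Z-backup.zip
```

Because the buildspec's timestamp is day-first, a plain alphabetical listing of the bucket does not show backups in chronological order. Whether that matters depends on how you plan to locate a backup during recovery.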

CloudWatch Events rule 

{
  "source": [ "aws.codecommit" ],
  "detail-type": [ "CodeCommit Repository State Change" ],
  "detail": {
    "event": [ "referenceCreated", "referenceUpdated" ]
  }
}
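The rule above matches every repository in the account. As a minimal sketch of how the rule could be narrowed to individual repositories, the Python snippet below adds a repositoryName filter to the detail section and checks both patterns against a sample event. The matcher is a deliberately simplified stand-in for the service's matching logic, and the sample event and repository names are hypothetical.

```python
import copy

# The event pattern from this pattern's Additional information section.
RULE = {
    "source": ["aws.codecommit"],
    "detail-type": ["CodeCommit Repository State Change"],
    "detail": {"event": ["referenceCreated", "referenceUpdated"]},
}


def narrow_to_repositories(pattern, names):
    """Return a copy of the pattern that matches only the named repositories."""
    narrowed = copy.deepcopy(pattern)
    narrowed["detail"]["repositoryName"] = list(names)
    return narrowed


def matches(pattern, event):
    """Simplified matcher: a list means "one of these values", a dict recurses."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical event, trimmed to the fields this pattern uses.
sample_event = {
    "source": "aws.codecommit",
    "detail-type": "CodeCommit Repository State Change",
    "account": "111122223333",
    "region": "us-east-1",
    "detail": {
        "event": "referenceUpdated",
        "repositoryName": "my-repo",
        "referenceType": "branch",
        "referenceName": "main",
    },
}

print(matches(RULE, sample_event))                                         # True
print(matches(narrow_to_repositories(RULE, ["my-repo"]), sample_event))     # True
print(matches(narrow_to_repositories(RULE, ["other-repo"]), sample_event))  # False
```

To send different repositories to different S3 buckets, one option is a separate rule per repository, each with its own narrowed pattern and target.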

Sample input transformer for the CloudWatch Events rule target 

Input path:

{
  "referenceType": "$.detail.referenceType",
  "region": "$.region",
  "repositoryName": "$.detail.repositoryName",
  "account": "$.account",
  "referenceName": "$.detail.referenceName"
}

Input template (the values are left blank here; set each one to the matching input path key in angle brackets, for example <referenceName> for REFERENCE_NAME):

{
  "environmentVariablesOverride": [
    { "name": "REFERENCE_NAME", "value": "" },
    { "name": "REFERENCE_TYPE", "value": "" },
    { "name": "REPOSITORY_NAME", "value": "" },
    { "name": "REPO_REGION", "value": "" },
    { "name": "ACCOUNT_ID", "value": "" }
  ]
}
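To illustrate how the input path and input template work together, the following Python sketch simulates the transformation locally. It is a simplified model of what the service does, not the service's implementation. It assumes that each template value is filled in with the angle-bracket placeholder of the matching input path key, which is an assumption because the template above leaves the values blank. The sample event is hypothetical.

```python
import json
import re

# Input path from the Additional information section.
INPUT_PATHS = {
    "referenceType": "$.detail.referenceType",
    "region": "$.region",
    "repositoryName": "$.detail.repositoryName",
    "account": "$.account",
    "referenceName": "$.detail.referenceName",
}

# Assumed mapping from environment variable name to input path key.
VARIABLE_TO_KEY = {
    "REFERENCE_NAME": "referenceName",
    "REFERENCE_TYPE": "referenceType",
    "REPOSITORY_NAME": "repositoryName",
    "REPO_REGION": "region",
    "ACCOUNT_ID": "account",
}

# Template with <key> placeholders in place of the blank values.
INPUT_TEMPLATE = json.dumps(
    {
        "environmentVariablesOverride": [
            {"name": name, "value": f"<{key}>"}
            for name, key in VARIABLE_TO_KEY.items()
        ]
    }
)


def resolve(path, event):
    """Resolve a simple dotted path such as "$.detail.repositoryName"."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = event
    for part in path[2:].split("."):
        value = value[part]
    return value


def render(template, input_paths, event):
    """Replace each <key> placeholder with the value extracted from the event."""
    values = {key: resolve(path, event) for key, path in input_paths.items()}
    return re.sub(r"<(\w+)>", lambda m: str(values[m.group(1)]), template)


# Hypothetical event, trimmed to the fields the input path reads.
sample_event = {
    "account": "111122223333",
    "region": "us-east-1",
    "detail": {
        "repositoryName": "my-repo",
        "referenceType": "branch",
        "referenceName": "main",
    },
}

rendered = json.loads(render(INPUT_TEMPLATE, INPUT_PATHS, sample_event))
env = {item["name"]: item["value"] for item in rendered["environmentVariablesOverride"]}
print(env)
```

The resulting name and value pairs correspond to the environment variables that the buildspec reads, such as $REPOSITORY_NAME and $REPO_REGION.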