Connecting Amazon Q Business to GitHub (Cloud) using AWS CloudFormation - Amazon Q Business

Connecting Amazon Q Business to GitHub (Cloud) using AWS CloudFormation

You use the AWS::QBusiness::DataSource resource to connect a data source to your Amazon Q application.

Use the configuration property to provide a JSON or YAML schema with the necessary configuration details specific to your data source connector.

To learn more about AWS CloudFormation, see What is AWS CloudFormation? in the AWS CloudFormation User Guide.

GitHub (Cloud) configuration properties

The following provides information about important configuration properties required in the schema.

Configuration Description Type Required

connectionConfiguration

Configuration information for the endpoint for the data source.

object

This property has a sub-property called repositoryEndpointMetadata.

Yes

repositoryEndpointMetadata

The endpoint information for the data source.

object

This property has the following sub-properties.

  • hostUrl

  • type

  • organizationName

Yes

hostUrl

The GitHub (Cloud) host URL. For example, if you use GitHub (Cloud) Enterprise Cloud: https://api.github.com.

string

Yes

type

The hosting method for your GitHub instance.

string

The only allowed value is SAAS.

Yes

organizationName

You can find your organization name when you log in to GitHub (Cloud) desktop and go to Your organizations under your profile picture dropdown.

string

Yes

repositoryConfigurations

Configuration information for the content of the data source. For example, configuring specific types of content and field mappings.

array

This propery has the following sub-properties:

  • ghRepository

  • ghCommit

  • ghIssueDocument

  • ghIssueComment

  • ghIssueAttachment

  • ghPRDocument

  • ghPRComment

  • ghPRAttachment

Yes
  • ghRepository

  • ghCommit

  • ghIssueDocument

  • ghIssueComment

  • ghIssueAttachment

  • ghPRDocument

  • ghPRComment

  • ghPRAttachment

A list of objects that map the attributes or field names of your GitHub (Cloud) pages and assets to Amazon Q index field names.

object

These properties have the following sub-properties.

  • indexFieldName

  • indexFieldType

  • dataSourceFieldName

  • dateFieldFormat

No

indexFieldName

The field name of your GitHub (Cloud) pages and assets.

string

Yes

indexFieldType

The field type of your GitHub (Cloud) pages and assets.

string

The allowed values are STRING, STRING_LIST, and DATE.

Yes

dataSourceFieldName

The data source field name of your GitHub (Cloud) pages and assets.

string

Yes

dateFieldFormat

The date format of your GitHub (Cloud) pages and assets.

string

Specify the date format in the form yyyy-MM-dd'T'HH:mm:ss'Z'

No

additionalProperties

Additional configuration options for your content in your data source.

object

This property has the following sub-properties.

  • isCrawlAcl

  • crawlRepository

  • crawlRepositoryDocuments

  • crawlIssue

  • crawlIssueComment

  • crawlIssueCommentAttachment

  • crawlPullRequest

  • crawlPullRequestComment

  • crawlPullRequestCommentAttachment

  • fieldForUserId

  • maxFileSizeInMegaBytes

  • repositoryFilter

  • inclusionFolderNamePatterns

  • inclusionFileTypePatterns

  • inclusionFileNamePatterns

  • exclusionFolderNamePatterns

  • exclusionFileTypePatterns

  • exclusionFileNamePatterns

Yes

isCrawlAcl

Specify true to crawl access control information from documents.

boolean

No

maxFileSizeInMegaBytes

Specify the maximum single file size limit in MBs that Amazon Q will crawl. Amazon Q will crawl only the files within the size limit you define. The default file size is 50MB. The maximum file size should be greater than 0MB and less than or equal to 50MB.

string

The allowed values are numbers between greater than 0 and less than or equal to 50.

No

fieldForUserId

Specify field to use for UserId for ACL crawling.

string

No
repositoryFilter A list of names of the specific repositories and branch names you want to index.

object

This property has the following sub-properties: repositoryName and branchNameList.

No

repositoryName

The list of repository names that you want to index.

string

No

branchNameList

The list of branch names that you want to index.

array (string)

No

crawlRepository

Specify true to crawl repositories.

boolean

No

crawlRepositoryDocuments

Specify true to crawl repository documents.

boolean

No

crawlIssue

Specify true to crawl issues.

boolean

No

crawlIssueComment

Specify true to crawl issue comments.

boolean

No

crawlIssueCommentAttachment

Specify true to crawl issue comment attachments.

boolean

No

crawlPullRequest

Specify true to crawl pull requests.

boolean

No

crawlPullRequestComment

Specify true to crawl pull request comments.

boolean

No

crawlPullRequestCommentAttachment

Specify true to crawl pull request comment attachments.

boolean

No
  • inclusionFolderNamePatterns

  • inclusionFileTypePatterns

  • inclusionFileNamePatterns

A list of regular expression patterns to include specific content in your GitHub (Cloud) data source. Content that matches the patterns are included in the index. Content that doesn't match the patterns are excluded from the index. If any content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence, and the content isn't included in the index.

array (string)

No
  • exclusionFolderNamePatterns

  • exclusionFileTypePatterns

  • exclusionFileNamePatterns

A list of regular expression patterns to exclude specific content in your GitHub (Cloud) data source. Content that matches the patterns are included in the index. Content that doesn't match the patterns are excluded from the index. If any content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence, and the content isn't included in the index.

array (string)

No

type

The type of data source. Specify GITHUB as your data source type.

string

Yes

enableIdentityCrawler

Specify true to use the Amazon Q identity crawler to sync identity/principal information on users and groups with access to specific documents.

boolean

Yes

syncMode

Specify whether Amazon Q should update your index by syncing all documents or only new, modified, and deleted documents.

string

You can choose between the following options:

  • Use FORCED_FULL_CRAWL to freshly re-crawl all content and replace existing content each time your data source syncs with your index.

  • Use FULL_CRAWL to incrementally crawl only new, modified, and deleted content each time your data source syncs with your index.

  • Use CHANGE_LOG to incrementally crawl only new and modified content each time your data source syncs with your index.

Yes

secretArn

The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key-value pairs required to connect to your GitHub (Cloud).

string

The secret must contain a JSON structure with the following keys:

{ "personalToken": "token" }
No

version

The version of this template that's currently supported.

string

No

GitHub (Cloud) JSON schema for using the configuration property with AWS CloudFormation

The following is the GitHub (Cloud) JSON schema and examples for the configuration property for AWS CloudFormation.

GitHub (Cloud) JSON schema for using the configuration property with AWS CloudFormation

The following is the GitHub (Cloud) JSON schema for the configuration property for AWS CloudFormation

{ "type": "object", "properties": { "type": { "type": "string", "pattern": "GITHUB" }, "syncMode": { "type": "string", "enum": ["FULL_CRAWL", "FORCED_FULL_CRAWL", "CHANGE_LOG"] }, "secretArn": { "type": "string", "minLength": 20, "maxLength": 2048 }, "enableIdentityCrawler": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "connectionConfiguration": { "type": "object", "properties": { "repositoryEndpointMetadata": { "type": "object", "properties": { "type": { "type": "string" }, "hostUrl": { "type": "string", "pattern": "https://.*" }, "organizationName": { "type": "string" } }, "required": ["type", "hostUrl", "organizationName"] } }, "required": ["repositoryEndpointMetadata"] }, "repositoryConfigurations": { "type": "object", "properties": { "ghRepository": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghCommit": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghIssueDocument": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghIssueComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghIssueAttachment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghPRDocument": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghPRComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] }, "ghPRAttachment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": ["fieldMappings"] } } }, "additionalProperties": { "type": "object", "properties": { "isCrawlAcl": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "maxFileSizeInMegaBytes": { "type": "string" }, "fieldForUserId": { "type": "string" }, "crawlRepository": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlRepositoryDocuments": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlIssue": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlIssueComment": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlIssueCommentAttachment": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlPullRequest": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlPullRequestComment": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "crawlPullRequestCommentAttachment": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ] }, "repositoryFilter": { "type": "array", "items": [ { "type": "object", "properties": { "repositoryName": { "type": "string" }, "branchNameList": { "type": "array", "items": { "type": "string" } } } } ] }, "inclusionFolderNamePatterns": { "type": "array", "items": { "type": "string" } }, "inclusionFileTypePatterns": { "type": "array", "items": { "type": "string" } }, "inclusionFileNamePatterns": { "type": "array", "items": { "type": "string" } }, "exclusionFolderNamePatterns": { "type": "array", "items": { "type": "string" } }, "exclusionFileTypePatterns": { "type": "array", "items": { "type": "string" } }, "exclusionFileNamePatterns": { "type": "array", "items": { "type": "string" } }, "enableDeletionProtection": { "anyOf": [ { "type": "boolean" }, { "type": "string", "enum": ["true", "false"] } ], "default": false }, "deletionProtectionThreshold": { "type": "string", "default": "15" } }, "required": [] }, "version": { "type": "string", "anyOf": [ { "pattern": "1.0.0" } ] } }, "required": [ "syncMode", "enableIdentityCrawler", "connectionConfiguration", "repositoryConfigurations", "additionalProperties" ] }

GitHub (Cloud) JSON schema example for using the configuration property with AWS CloudFormation

The following is the GitHub (Cloud) JSON schema example for the configuration property for AWS CloudFormation

{ "AWSTemplateFormatVersion": "2010-09-09", "Description": "CloudFormation GITHUB Data Source Template", "Resources": { "DataSourceGitHub": { "Type": "AWS::QBusiness::DataSource", "Properties": { "ApplicationId": "app12345-1234-1234-1234-123456789012", "IndexId": "indx1234-1234-1234-1234-123456789012", "DisplayName": "MyGitHubDataSource", "RoleArn": "arn:aws:iam::123456789012:role/qbusiness-data-source-role", "Configuration": { "type": "GITHUB", "syncMode": "FULL_CRAWL", "secretArn": "arn:aws:secretsmanager:us-west-2:123456789012:secret:my-github-secret", "enableIdentityCrawler": "true", "sslCertificatePath": { "bucket": "my-github-bucket", "key": "certificates/my-cert.pem" }, "connectionConfiguration": { "repositoryEndpointMetadata": { "type": "GitHub", "hostUrl": "https://api.github.com", "organizationName": "my-org" } }, "repositoryConfigurations": { "ghRepository": { "fieldMappings": [ { "indexFieldName": "repo_name", "indexFieldType": "STRING", "dataSourceFieldName": "name", "dateFieldFormat": "yyyy-MM-dd'T'HH:mm:ss'Z'" } ] }, "ghCommit": { "fieldMappings": [ { "indexFieldName": "commit_id", "indexFieldType": "STRING", "dataSourceFieldName": "id", "dateFieldFormat": "yyyy-MM-dd'T'HH:mm:ss'Z'" } ] } }, "additionalProperties": { "isCrawlAcl": "true", "maxFileSizeInMegaBytes": "50", "crawlRepository": "true", "crawlIssue": "true", "repositoryFilter": [ { "repositoryName": "my-repo", "branchNameList": ["main", "develop"] } ], "inclusionFileTypePatterns": ["*.md", "*.txt"], "exclusionFileNamePatterns": ["*draft*"], "enableDeletionProtection": "false", "deletionProtectionThreshold": "15" } } } } } }

GitHub (Cloud) YAML schema for using the configuration property with AWS CloudFormation

The following is the GitHub (Cloud) YAML schema and examples for the configuration property for AWS CloudFormation:

GitHub (Cloud) YAML schema for using the configuration property with AWS CloudFormation

The following is the GitHub (Cloud) YAML schema for the configuration property for AWS CloudFormation.

type: object properties: type: type: string pattern: GITHUB syncMode: type: string enum: - FULL_CRAWL - FORCED_FULL_CRAWL - CHANGE_LOG secretArn: type: string minLength: 20 maxLength: 2048 enableIdentityCrawler: anyOf: - type: boolean - type: string enum: - true - false connectionConfiguration: type: object properties: repositoryEndpointMetadata: type: object properties: type: type: string hostUrl: type: string pattern: "https://.*" organizationName: type: string required: - type - hostUrl - organizationName required: - repositoryEndpointMetadata repositoryConfigurations: type: object properties: ghRepository: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghCommit: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghIssueDocument: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghIssueComment: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghIssueAttachment: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghPRDocument: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghPRComment: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings ghPRAttachment: type: object properties: fieldMappings: type: array items: type: object properties: indexFieldName: type: string indexFieldType: type: string enum: - STRING - STRING_LIST - DATE dataSourceFieldName: type: string dateFieldFormat: type: string pattern: "yyyy-MM-dd'T'HH:mm:ss'Z'" required: - indexFieldName - indexFieldType - dataSourceFieldName required: - fieldMappings additionalProperties: type: object properties: isCrawlAcl: anyOf: - type: boolean - type: string enum: - true - false maxFileSizeInMegaBytes: type: string fieldForUserId: type: string crawlRepository: anyOf: - type: boolean - type: string enum: - true - false crawlRepositoryDocuments: anyOf: - type: boolean - type: string enum: - true - false crawlIssue: anyOf: - type: boolean - type: string enum: - true - false crawlIssueComment: anyOf: - type: boolean - type: string enum: - true - false crawlIssueCommentAttachment: anyOf: - type: boolean - type: string enum: - true - false crawlPullRequest: anyOf: - type: boolean - type: string enum: - true - false crawlPullRequestComment: anyOf: - type: boolean - type: string enum: - true - false crawlPullRequestCommentAttachment: anyOf: - type: boolean - type: string enum: - true - false repositoryFilter: type: array items: type: object properties: repositoryName: type: string branchNameList: type: array items: type: string inclusionFolderNamePatterns: type: array items: type: string inclusionFileTypePatterns: type: array items: type: string inclusionFileNamePatterns: type: array items: type: string exclusionFolderNamePatterns: type: array items: type: string exclusionFileTypePatterns: type: array items: type: string exclusionFileNamePatterns: type: array items: type: string enableDeletionProtection: anyOf: - type: boolean - type: string enum: - true - false default: false deletionProtectionThreshold: type: string default: "15" required: [] version: type: string anyOf: - pattern: 1.0.0 required: - syncMode - enableIdentityCrawler - connectionConfiguration - repositoryConfigurations - additionalProperties

GitHub (Cloud) YAML schema example for using the configuration property with AWS CloudFormation

The following is the GitHub (Cloud) YAML example for the Configuration property for AWS CloudFormation:

AWSTemplateFormatVersion: "2010-09-09" Description: CloudFormation GITHUB Data Source Template Resources: DataSourceGitHub: Type: AWS::QBusiness::DataSource Properties: ApplicationId: app12345-1234-1234-1234-123456789012 IndexId: indx1234-1234-1234-1234-123456789012 DisplayName: MyGitHubDataSource RoleArn: arn:aws:iam::123456789012:role/qbusiness-data-source-role Configuration: type: GITHUB syncMode: FULL_CRAWL secretArn: arn:aws:secretsmanager:us-west-2:123456789012:secret:my-github-secret enableIdentityCrawler: "true" sslCertificatePath: bucket: my-github-bucket key: certificates/my-cert.pem connectionConfiguration: repositoryEndpointMetadata: type: GitHub hostUrl: https://api.github.com organizationName: my-org repositoryConfigurations: ghRepository: fieldMappings: - indexFieldName: repo_name indexFieldType: STRING dataSourceFieldName: name dateFieldFormat: yyyy-MM-dd'T'HH:mm:ss'Z' ghCommit: fieldMappings: - indexFieldName: commit_id indexFieldType: STRING dataSourceFieldName: id dateFieldFormat: yyyy-MM-dd'T'HH:mm:ss'Z' additionalProperties: isCrawlAcl: "true" maxFileSizeInMegaBytes: "50" crawlRepository: "true" crawlIssue: "true" repositoryFilter: - repositoryName: my-repo branchNameList: - main - develop inclusionFileTypePatterns: - "*.md" - "*.txt" exclusionFileNamePatterns: - "*draft*" enableDeletionProtection: "false" deletionProtectionThreshold: "15"