

# Getting started
<a name="getting-started"></a>

This section shows you how to create a data source and add your documents to an Amazon Kendra index. Instructions are provided for the AWS console, the AWS CLI, a Python program using the AWS SDK for Python (Boto3), and a Java program using the AWS SDK for Java.

**Topics**
+ [Prerequisites](gs-prerequisites.md)
+ [Getting started with the Amazon Kendra console](gs-console.md)
+ [Getting started (AWS CLI)](gs-cli.md)
+ [Getting started (AWS SDK for Python (Boto3))](gs-python.md)
+ [Getting started (AWS SDK for Java)](gs-java.md)
+ [Getting started with an Amazon S3 data source (console)](getting-started-s3.md)
+ [Getting started with a MySQL database data source (console)](getting-started-mysql.md)
+ [Getting started with an AWS IAM Identity Center identity source (console)](getting-started-aws-sso.md)

# Prerequisites
<a name="gs-prerequisites"></a>

The following steps are prerequisites for the getting started exercises. The steps show you how to set up your account, create an IAM role that gives Amazon Kendra permission to make calls on your behalf, and index documents from an Amazon S3 bucket. An S3 bucket is used as an example, but you can use other data sources that Amazon Kendra supports. See [Data sources](https://docs.aws.amazon.com/kendra/latest/dg/hiw-data-source.html).

## Sign up for an AWS account
<a name="sign-up-for-aws"></a>

If you do not have an AWS account, complete the following steps to create one.

**To sign up for an AWS account**

1. Open [https://portal.aws.amazon.com/billing/signup](https://portal.aws.amazon.com/billing/signup).

1. Follow the online instructions.

   Part of the sign-up procedure involves receiving a phone call or text message and entering a verification code on the phone keypad.

   When you sign up for an AWS account, an *AWS account root user* is created. The root user has access to all AWS services and resources in the account. As a security best practice, assign administrative access to a user, and use only the root user to perform [tasks that require root user access](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_root-user.html#root-user-tasks).

AWS sends you a confirmation email after the sign-up process is complete. At any time, you can view your current account activity and manage your account by going to [https://aws.amazon.com/](https://aws.amazon.com/) and choosing **My Account**.

## Create a user with administrative access
<a name="create-an-admin"></a>

After you sign up for an AWS account, secure your AWS account root user, enable AWS IAM Identity Center, and create an administrative user so that you don't use the root user for everyday tasks.

**Secure your AWS account root user**

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/) as the account owner by choosing **Root user** and entering your AWS account email address. On the next page, enter your password.

   For help signing in as the root user, see [Signing in as the root user](https://docs.aws.amazon.com/signin/latest/userguide/console-sign-in-tutorials.html#introduction-to-root-user-sign-in-tutorial) in the *AWS Sign-In User Guide*.

1. Turn on multi-factor authentication (MFA) for your root user.

   For instructions, see [Enable a virtual MFA device for your AWS account root user (console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/enable-virt-mfa-for-root.html) in the *IAM User Guide*.

**Create a user with administrative access**

1. Enable IAM Identity Center.

   For instructions, see [Enabling AWS IAM Identity Center](https://docs.aws.amazon.com//singlesignon/latest/userguide/get-set-up-for-idc.html) in the *AWS IAM Identity Center User Guide*.

1. In IAM Identity Center, grant administrative access to a user.

   For a tutorial about using the IAM Identity Center directory as your identity source, see [Configure user access with the default IAM Identity Center directory](https://docs.aws.amazon.com//singlesignon/latest/userguide/quick-start-default-idc.html) in the *AWS IAM Identity Center User Guide*.

**Sign in as the user with administrative access**
+ To sign in with your IAM Identity Center user, use the sign-in URL that was sent to your email address when you created the IAM Identity Center user.

  For help signing in using an IAM Identity Center user, see [Signing in to the AWS access portal](https://docs.aws.amazon.com/signin/latest/userguide/iam-id-center-sign-in-tutorial.html) in the *AWS Sign-In User Guide*.

**Assign access to additional users**

1. In IAM Identity Center, create a permission set that follows the best practice of applying least-privilege permissions.

   For instructions, see [Create a permission set](https://docs.aws.amazon.com//singlesignon/latest/userguide/get-started-create-a-permission-set.html) in the *AWS IAM Identity Center User Guide*.

1. Assign users to a group, and then assign single sign-on access to the group.

   For instructions, see [Add groups](https://docs.aws.amazon.com//singlesignon/latest/userguide/addgroups.html) in the *AWS IAM Identity Center User Guide*.
+ If you are using an S3 bucket containing documents to test Amazon Kendra, create the S3 bucket in the same Region where you are using Amazon Kendra. For instructions, see [Creating and Configuring an S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-configure-bucket.html) in the *Amazon Simple Storage Service User Guide*.

  Upload your documents to your S3 bucket. For instructions, see [Uploading, Downloading, and Managing Objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-download-objects.html) in the *Amazon Simple Storage Service User Guide*.

  If you are using another data source, you must have an active site and credentials to connect to the data source.

If you are using the console to get started, start with [Getting started with the Amazon Kendra console](gs-console.md).

## Amazon Kendra resources: AWS CLI, SDK, console
<a name="gs-prereq-cli-sdk"></a>

Using Amazon Kendra through the AWS CLI, an AWS SDK, or the console requires certain permissions.

You must have permissions that allow Amazon Kendra to create and manage resources on your behalf. Depending on your use case, these permissions include access to the Amazon Kendra API itself, to AWS KMS keys if you want to encrypt your data through a custom CMK, and to the Identity Center directory if you want to integrate with AWS IAM Identity Center or [create a Search Experience](https://docs.aws.amazon.com/kendra/latest/dg/deploying-search-experience-no-code.html). For a full list of permissions for different use cases, see [IAM roles](https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html).

First, you must attach the following permissions to your IAM user.

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1644430853544",
      "Action": [
        "kms:CreateGrant",
        "kms:DescribeKey"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "Stmt1644430878150",
      "Action": "kendra:*",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "Stmt1644430973706",
      "Action": [
        "sso:AssociateProfile",
        "sso:CreateManagedApplicationInstance",
        "sso:DeleteManagedApplicationInstance",
        "sso:DisassociateProfile",
        "sso:GetManagedApplicationInstance",
        "sso:GetProfile",
        "sso:ListDirectoryAssociations",
        "sso:ListProfileAssociations",
        "sso:ListProfiles"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "Stmt1644430999558",
      "Action": [
        "sso-directory:DescribeGroup",
        "sso-directory:DescribeGroups",
        "sso-directory:DescribeUser",
        "sso-directory:DescribeUsers"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "Stmt1644431025960",
      "Action": [
        "identitystore:DescribeGroup",
        "identitystore:DescribeUser",
        "identitystore:ListGroups",
        "identitystore:ListUsers"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
```


Second, if you use the CLI or SDK, you must also create an IAM role and policy to access Amazon CloudWatch Logs. If you are using the console, you don't need to create them separately, because you create them as part of the console procedure.

**To create an IAM role and policy for the AWS CLI and SDK that allows Amazon Kendra to access your Amazon CloudWatch Logs.**

1. Sign in to the AWS Management Console and open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. From the left menu, choose **Policies** and then choose **Create policy**.

1. Choose **JSON** and then replace the default policy with the following:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": [
                   "cloudwatch:PutMetricData"
               ],
               "Resource": "*",
               "Condition": {
                   "StringEquals": {
                       "cloudwatch:namespace": "AWS/Kendra"
                   }
               }
           },
           {
               "Effect": "Allow",
               "Action": [
                   "logs:DescribeLogGroups"
               ],
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "logs:CreateLogGroup"
               ],
               "Resource": [
                   "arn:aws:logs:us-east-1:123456789012:log-group:/aws/kendra/*"
               ]
           },
           {
               "Effect": "Allow",
               "Action": [
                   "logs:DescribeLogStreams",
                   "logs:CreateLogStream",
                   "logs:PutLogEvents"
               ],
               "Resource": [
                   "arn:aws:logs:us-east-1:123456789012:log-group:/aws/kendra/*:log-stream:*"
               ]
           }
       ]
   }
   ```


1. Choose **Review policy**.

1. Name the policy "KendraPolicyForGettingStartedIndex" and then choose **Create policy**.

1. From the left menu, choose **Roles** and then choose **Create role**.

1. Choose **Another AWS account** and then type your account ID in **Account ID**. Choose **Next: Permissions**.

1. Choose the policy that you created above and then choose **Next: Tags**.

1. Don't add any tags. Choose **Next: Review**.

1. Name the role "KendraRoleForGettingStartedIndex" and then choose **Create role**.

1. Find the role that you just created. Choose the role name to open the summary. Choose **Trust relationships** and then choose **Edit trust relationship**.

1. Replace the existing trust relationship with the following:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
         {
           "Effect": "Allow",
           "Principal": {
             "Service": "kendra.amazonaws.com"
           },
           "Action": "sts:AssumeRole"
         }
       ]
   }
   ```


1. Choose **Update trust policy**.
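
The permissions policy in this procedure hard-codes the Region `us-east-1` and the account ID `123456789012`. As a minimal sketch, the following Python builds the same permissions policy and trust policy for a Region and account ID that you supply, so that you can paste the output into the IAM console. The function names and the sample Region and account ID are hypothetical and are not part of any AWS SDK.

```python
import json


def build_index_role_policy(region, account_id):
    """Return the CloudWatch permissions policy from this procedure for one Region and account."""
    log_group_arn = f"arn:aws:logs:{region}:{account_id}:log-group:/aws/kendra/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cloudwatch:PutMetricData"],
                "Resource": "*",
                "Condition": {"StringEquals": {"cloudwatch:namespace": "AWS/Kendra"}},
            },
            {"Effect": "Allow", "Action": ["logs:DescribeLogGroups"], "Resource": "*"},
            {"Effect": "Allow", "Action": ["logs:CreateLogGroup"], "Resource": [log_group_arn]},
            {
                "Effect": "Allow",
                "Action": [
                    "logs:DescribeLogStreams",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": [f"{log_group_arn}:log-stream:*"],
            },
        ],
    }


def build_kendra_trust_policy():
    """Return the trust policy that lets the Amazon Kendra service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "kendra.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Example values only; replace them with your own Region and account ID.
print(json.dumps(build_index_role_policy("us-west-2", "111122223333"), indent=4))
print(json.dumps(build_kendra_trust_policy(), indent=4))
```

Generating the document this way keeps the two log group ARNs consistent with each other, which is easy to get wrong when editing the JSON by hand.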

Third, if you use an Amazon S3 bucket to store your documents or to test Amazon Kendra, you must also create an IAM role and policy to access your bucket. If you are using another data source, see [IAM roles for data sources](https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html#iam-roles-ds).

**To create an IAM role and policy that allows Amazon Kendra to access and index your Amazon S3 bucket.**

1. Sign in to the AWS Management Console and open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. From the left menu, choose **Policies** and then choose **Create policy**.

1. Choose **JSON** and then replace the default policy with the following:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Action": [
                   "s3:GetObject"
               ],
               "Resource": [
                   "arn:aws:s3:::bucket name/*"
               ],
               "Effect": "Allow"
           },
           {
               "Action": [
                   "s3:ListBucket"
               ],
               "Resource": [
                   "arn:aws:s3:::bucket name"
               ],
               "Effect": "Allow"
           },
           {
               "Effect": "Allow",
               "Action": [
                   "kendra:BatchPutDocument",
                   "kendra:BatchDeleteDocument"
               ],
               "Resource": "arn:aws:kendra:us-east-1:123456789012:index/*"
           }
       ]
   }
   ```


1. Choose **Review policy**.

1. Name the policy "KendraPolicyForGettingStartedDataSource" and then choose **Create policy**.

1. From the left menu, choose **Roles** and then choose **Create role**.

1. Choose **Another AWS account** and then type your account ID in **Account ID**. Choose **Next: Permissions**.

1. Choose the policy that you created above and then choose **Next: Tags**.

1. Don't add any tags. Choose **Next: Review**.

1. Name the role "KendraRoleForGettingStartedDataSource" and then choose **Create role**.

1. Find the role that you just created. Choose the role name to open the summary. Choose **Trust relationships** and then choose **Edit trust relationship**.

1. Replace the existing trust relationship with the following:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
         {
           "Effect": "Allow",
           "Principal": {
             "Service": "kendra.amazonaws.com"
           },
           "Action": "sts:AssumeRole"
         }
       ]
   }
   ```


1. Choose **Update trust policy**.
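
The data source policy in this procedure contains three placeholders: the bucket name, the Region, and the account ID. The following Python is a rough sketch of filling them in programmatically. It also rejects a bucket name that still looks like an unreplaced placeholder (for example, one that contains a space). The function name is hypothetical, and the name check is a simplified version of the general S3 bucket naming rules, not a complete validator.

```python
import json
import re

# Simplified check: 3-63 characters; lowercase letters, digits, dots, and
# hyphens; starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def build_s3_data_source_policy(bucket_name, region, account_id):
    """Return the S3 data source policy from this procedure with placeholders filled in."""
    if not _BUCKET_NAME.match(bucket_name):
        raise ValueError(f"Not a valid S3 bucket name: {bucket_name!r}")
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Read the documents stored in the bucket.
            {"Action": ["s3:GetObject"], "Resource": [f"{bucket_arn}/*"], "Effect": "Allow"},
            # List the bucket contents.
            {"Action": ["s3:ListBucket"], "Resource": [bucket_arn], "Effect": "Allow"},
            # Add documents to, and delete documents from, indexes in this account.
            {
                "Effect": "Allow",
                "Action": ["kendra:BatchPutDocument", "kendra:BatchDeleteDocument"],
                "Resource": f"arn:aws:kendra:{region}:{account_id}:index/*",
            },
        ],
    }


# Example values only; replace them with your own bucket, Region, and account ID.
policy = build_s3_data_source_policy("amzn-s3-demo-bucket", "us-west-2", "111122223333")
print(json.dumps(policy, indent=4))
```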

Depending on how you want to use the Amazon Kendra API, do one of the following.
+ [Getting started (AWS CLI)](gs-cli.md)
+ [Getting started (AWS SDK for Java)](gs-java.md)
+ [Getting started (AWS SDK for Python (Boto3))](gs-python.md)

# Getting started with the Amazon Kendra console
<a name="gs-console"></a>

The following procedures show how to create and test an Amazon Kendra index by using the AWS console. In the procedures, you create an index and add a data source to it. Finally, you test your index by making a search request.

**Step 1: To create an index (console)**

1. Sign in to the AWS Management Console and open the Amazon Kendra console at [https://console.aws.amazon.com/kendra/](https://console.aws.amazon.com/kendra/).

1. Select **Create index** in the **Indexes** section.

1. In the **Specify index details** page, give your index a name and a description.

1. In **IAM role**, choose **Create a new role** and then give the role a name. The IAM role will have the prefix "AmazonKendra-".

1. Leave all of the other fields at their defaults. Choose **Next**.

1. In the **Configure user access control** page, choose **Next**.

1. In the **Provisioning details** page, choose **Developer edition**.

1. Choose **Create** to create your index.

1. Wait for your index to be created. Amazon Kendra provisions the hardware for your index. This operation can take some time.<a name="gs-data-source"></a>

**Step 2: To add a data source to an index (console)**

1. Review the available [data sources](https://docs.aws.amazon.com/kendra/latest/dg/data-source.html) and decide which one to connect to Amazon Kendra to index your documents.

1. In the navigation pane, select **Data sources** and then select **Add data source** for your chosen data source.

1. Follow the steps to configure the data source.<a name="gs-search"></a>

**Step 3: To search an index (console)**

1. In the navigation pane, choose the option to search your index.

1. Enter a search term that's appropriate for your index. The **top results** and **top document** results are shown.

# Getting started (AWS CLI)
<a name="gs-cli"></a>

The following procedure shows how to create an Amazon Kendra index using the AWS CLI. The procedure creates an index and a data source, and then runs a query on the index.

**To create an Amazon Kendra index (CLI)**

1. Complete the [Prerequisites](gs-prerequisites.md).

1. Enter the following command to create an index.

   ```
   aws kendra create-index \
    --name cli-getting-started-index \
    --description "Index for CLI getting started guide." \
    --role-arn arn:aws:iam::account id:role/KendraRoleForGettingStartedIndex
   ```

1. Wait for Amazon Kendra to create the index. Check the progress using the following command. When the status field is `ACTIVE`, go on to the next step.

   ```
   aws kendra describe-index \
    --id index id
   ```

1. At the command prompt, enter the following command to create a data source.

   ```
   aws kendra create-data-source \
    --index-id index id \
    --name data source name \
    --role-arn arn:aws:iam::account id:role/KendraRoleForGettingStartedDataSource \
    --type S3 \
    --configuration '{"S3Configuration":{"BucketName":"S3 bucket name"}}'
   ```

   If you connect to your data source using a template schema, configure the template schema.

   ```
   aws kendra create-data-source \
    --index-id index id \
    --name data source name \
    --role-arn arn:aws:iam::account id:role/KendraRoleForGettingStartedDataSource \
    --type TEMPLATE \
    --configuration '{"TemplateConfiguration":{"Template":{JSON schema}}}'
   ```

1. It will take Amazon Kendra a while to create the data source. Enter the following command to check the progress. When the status is `ACTIVE`, go on to the next step.

   ```
   aws kendra describe-data-source \
    --id data source ID \
    --index-id index ID
   ```

1. Enter the following command to synchronize the data source.

   ```
   aws kendra start-data-source-sync-job \
    --id data source ID \
    --index-id index ID
   ```

1. Amazon Kendra will index your data source. The amount of time that it takes depends on the number of documents. You can check the status of the sync job using the following command. When the status of the job is `SUCCEEDED`, go on to the next step.

   ```
   aws kendra list-data-source-sync-jobs \
    --id data source ID \
    --index-id index ID
   ```

1. Enter the following command to make a query.

   ```
   aws kendra query \
    --index-id index ID \
    --query-text "search term"
   ```

   The results of the search are displayed in JSON format.
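
The JSON that `aws kendra query` prints can be long. As a rough sketch, the following Python reads that output and reduces it to one line per result. It relies on the `ResultItems`, `Type`, `DocumentTitle.Text`, and `DocumentExcerpt.Text` fields of the Query response. The function name and the sample payload are made up for illustration.

```python
import json


def summarize_query_results(response, excerpt_chars=80):
    """Return one summary line per item in a Query response dictionary."""
    lines = []
    for item in response.get("ResultItems", []):
        title = item.get("DocumentTitle", {}).get("Text", "(no title)")
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
        # Collapse line breaks and repeated spaces, then truncate.
        excerpt = " ".join(excerpt.split())[:excerpt_chars]
        lines.append(f"[{item.get('Type', 'UNKNOWN')}] {title}: {excerpt}")
    return lines


# Made-up sample payload that contains only the fields this sketch reads.
sample_output = """
{
  "ResultItems": [
    {
      "Type": "DOCUMENT",
      "DocumentTitle": {"Text": "Setup guide"},
      "DocumentExcerpt": {"Text": "Install the agent,\\nthen restart the service."}
    }
  ]
}
"""

for line in summarize_query_results(json.loads(sample_output)):
    print(line)
```

To use this with real output, redirect the CLI command to a file and pass the parsed contents of that file to `summarize_query_results` instead of the sample payload.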

# Getting started (AWS SDK for Python (Boto3))
<a name="gs-python"></a>

The following program is an example of using Amazon Kendra in a Python program. The program performs the following actions:

1. Creates a new index using the [CreateIndex](https://docs.aws.amazon.com/kendra/latest/APIReference/API_CreateIndex.html) operation.

1. Waits for index creation to complete. It uses the [DescribeIndex](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DescribeIndex.html) operation to monitor the status of the index.

1. Once the index is active, it creates a data source using the [CreateDataSource](https://docs.aws.amazon.com/kendra/latest/APIReference/API_CreateDataSource.html) operation.

1. Waits for data source creation to complete. It uses the [DescribeDataSource](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DescribeDataSource.html) operation to monitor the status of the data source.

1. When the data source is active, it synchronizes the index with the contents of the data source using the [StartDataSourceSyncJob](https://docs.aws.amazon.com/kendra/latest/APIReference/API_StartDataSourceSyncJob.html) operation.

```
import boto3
from botocore.exceptions import ClientError
import pprint
import time

kendra = boto3.client("kendra")

print("Create an index.")

# Provide a name for the index
index_name = "python-getting-started-index"
# Provide an optional description for the index
description = "Getting started index"
# Provide the IAM role ARN required for indexes
index_role_arn = "arn:aws:iam::${accountId}:role/KendraRoleForGettingStartedIndex"

try:
    index_response = kendra.create_index(
        Description = description,
        Name = index_name,
        RoleArn = index_role_arn
    )

    pprint.pprint(index_response)

    index_id = index_response["Id"]

    print("Wait for Amazon Kendra to create the index.")

    while True:
        # Get the details of the index, such as the status
        index_description = kendra.describe_index(
            Id = index_id
        )
        # When the status is no longer CREATING, stop waiting
        status = index_description["Status"]
        print(" Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)

    print("Create an S3 data source.")
    
    # Provide a name for the data source
    data_source_name = "python-getting-started-data-source"
    # Provide an optional description for the data source
    data_source_description = "Getting started data source."
    # Provide the IAM role ARN required for data sources
    data_source_role_arn = "arn:aws:iam::${accountId}:role/KendraRoleForGettingStartedDataSource"
    # Provide the data source connection information 
    S3_bucket_name = "S3-bucket-name"
    data_source_type = "S3"
    # Configure the data source
    configuration = {"S3Configuration":
        {
            "BucketName": S3_bucket_name
        }
    }
    
    """
    If you connect to your data source using a template schema, 
    configure the template schema
    configuration = {"TemplateConfiguration":
        {
            "Template": {JSON schema}
        }
    }
    """
    
    data_source_response = kendra.create_data_source(
        Name = data_source_name,
        Description = data_source_description,
        RoleArn = data_source_role_arn,
        Type = data_source_type,
        Configuration = configuration,
        IndexId = index_id
    )

    pprint.pprint(data_source_response)

    data_source_id = data_source_response["Id"]

    print("Wait for Amazon Kendra to create the data source.")

    while True:
        # Get the details of the data source, such as the status
        data_source_description = kendra.describe_data_source(
            Id = data_source_id,
            IndexId = index_id
        )
        # When the status is no longer CREATING, stop waiting
        status = data_source_description["Status"]
        print(" Creating data source. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)

    print("Synchronize the data source.")

    sync_response = kendra.start_data_source_sync_job(
        Id = data_source_id,
        IndexId = index_id
    )

    pprint.pprint(sync_response)

    print("Wait for the data source to sync with the index.")

    while True:

        jobs = kendra.list_data_source_sync_jobs(
            Id = data_source_id,
            IndexId = index_id
        )

        # For this example, there should be one job
        status = jobs["History"][0]["Status"]

        print(" Syncing data source. Status: "+status)
        if status != "SYNCING":
            break
        time.sleep(60)

except ClientError as e:
    print("%s" % e)

print("Program ends.")
```
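
The three polling loops in the program above share the same shape and never give up. One possible refactoring, shown here as a sketch rather than as part of the official example, is a small helper that polls any status function and raises an error after a timeout. The name `wait_while_status` and its parameters are hypothetical. The helper takes a callable, so it runs here against fake statuses without calling AWS.

```python
import time


def wait_while_status(fetch_status, busy_statuses, poll_seconds=60,
                      timeout_seconds=3600, sleep=time.sleep):
    """Poll fetch_status() until it returns a status outside busy_statuses.

    Returns the final status, or raises TimeoutError if the deadline
    passes while the status is still one of busy_statuses.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status not in busy_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# With a real client, the index wait in the program above could become:
#   wait_while_status(
#       lambda: kendra.describe_index(Id=index_id)["Status"], {"CREATING"})

# Demonstration with fake statuses and a no-op sleep, so it finishes immediately.
fake_statuses = iter(["CREATING", "CREATING", "ACTIVE"])
final_status = wait_while_status(
    lambda: next(fake_statuses), {"CREATING"}, sleep=lambda seconds: None
)
print(final_status)  # ACTIVE
```

Returning the final status lets the caller decide what to do when the resource ends up in a state other than the expected one, for example `FAILED`.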

# Getting started (AWS SDK for Java)
<a name="gs-java"></a>

The following program is an example of using Amazon Kendra in a Java program. The program performs the following actions:

1. Creates a new index using the [CreateIndex](https://docs.aws.amazon.com/kendra/latest/APIReference/API_CreateIndex.html) operation.

1. Waits for index creation to complete. It uses the [DescribeIndex](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DescribeIndex.html) operation to monitor the status of the index.

1. Once the index is active, it creates a data source using the [CreateDataSource](https://docs.aws.amazon.com/kendra/latest/APIReference/API_CreateDataSource.html) operation.

1. Waits for data source creation to complete. It uses the [DescribeDataSource](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DescribeDataSource.html) operation to monitor the status of the data source.

1. When the data source is active, it synchronizes the index with the contents of the data source using the [StartDataSourceSyncJob](https://docs.aws.amazon.com/kendra/latest/APIReference/API_StartDataSourceSyncJob.html) operation.

```
package com.amazonaws.kendra;

import java.util.concurrent.TimeUnit;
import software.amazon.awssdk.services.kendra.KendraClient;
import software.amazon.awssdk.services.kendra.model.CreateDataSourceRequest;
import software.amazon.awssdk.services.kendra.model.CreateDataSourceResponse;
import software.amazon.awssdk.services.kendra.model.CreateIndexRequest;
import software.amazon.awssdk.services.kendra.model.CreateIndexResponse;
import software.amazon.awssdk.services.kendra.model.DataSourceConfiguration;
import software.amazon.awssdk.services.kendra.model.DataSourceStatus;
import software.amazon.awssdk.services.kendra.model.DataSourceSyncJob;
import software.amazon.awssdk.services.kendra.model.DataSourceSyncJobStatus;
import software.amazon.awssdk.services.kendra.model.DataSourceType;
import software.amazon.awssdk.services.kendra.model.DescribeDataSourceRequest;
import software.amazon.awssdk.services.kendra.model.DescribeDataSourceResponse;
import software.amazon.awssdk.services.kendra.model.DescribeIndexRequest;
import software.amazon.awssdk.services.kendra.model.DescribeIndexResponse;
import software.amazon.awssdk.services.kendra.model.IndexStatus;
import software.amazon.awssdk.services.kendra.model.ListDataSourceSyncJobsRequest;
import software.amazon.awssdk.services.kendra.model.ListDataSourceSyncJobsResponse;
import software.amazon.awssdk.services.kendra.model.S3DataSourceConfiguration;
import software.amazon.awssdk.services.kendra.model.StartDataSourceSyncJobRequest;
import software.amazon.awssdk.services.kendra.model.StartDataSourceSyncJobResponse;


public class CreateIndexAndDataSourceExample {

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Create an index");

        String indexDescription = "Getting started index for Kendra";
        String indexName = "java-getting-started-index";
        String indexRoleArn = "arn:aws:iam::<your AWS account ID>:role/<name of an IAM role>";

        System.out.println(String.format("Creating an index named %s", indexName));
        KendraClient kendra = KendraClient.builder().build();

        CreateIndexRequest createIndexRequest = CreateIndexRequest
            .builder()
            .description(indexDescription)
            .name(indexName)
            .roleArn(indexRoleArn)
            .build();
        CreateIndexResponse createIndexResponse = kendra.createIndex(createIndexRequest);
        System.out.println(String.format("Index response %s", createIndexResponse));

        String indexId = createIndexResponse.id();

        System.out.println(String.format("Waiting until the index with index ID %s is created", indexId));
        while (true) {
            DescribeIndexRequest describeIndexRequest = DescribeIndexRequest.builder().id(indexId).build();
            DescribeIndexResponse describeIndexResponse = kendra.describeIndex(describeIndexRequest);
            IndexStatus status = describeIndexResponse.status();
            if (status != IndexStatus.CREATING) {
                break;
            }

            TimeUnit.SECONDS.sleep(60);
        }

        System.out.println("Creating an S3 data source");
        String dataSourceName = "java-getting-started-data-source";
        String dataSourceDescription = "Getting started data source";
        String s3BucketName = "amzn-s3-demo-bucket";
        String dataSourceRoleArn = "arn:aws:iam::<your AWS account ID>:role/<name of an IAM role>";

        CreateDataSourceRequest createDataSourceRequest = CreateDataSourceRequest
            .builder()
            .indexId(indexId)
            .name(dataSourceName)
            .description(dataSourceDescription)
            .roleArn(dataSourceRoleArn)
            .type(DataSourceType.S3)
            .configuration(
                DataSourceConfiguration
                    .builder()
                    .s3Configuration(
                        S3DataSourceConfiguration
                            .builder()
                            .bucketName(s3BucketName)
                            .build()
                    ).build()
            ).build();

        CreateDataSourceResponse createDataSourceResponse = kendra.createDataSource(createDataSourceRequest);
        System.out.println(String.format("Response of creating data source: %s", createDataSourceResponse));

        String dataSourceId = createDataSourceResponse.id();
        System.out.println(String.format("Waiting for Kendra to create the data source %s", dataSourceId));
        DescribeDataSourceRequest describeDataSourceRequest = DescribeDataSourceRequest
            .builder()
            .indexId(indexId)
            .id(dataSourceId)
            .build();

        while (true) {
            DescribeDataSourceResponse describeDataSourceResponse = kendra.describeDataSource(describeDataSourceRequest);

            DataSourceStatus status = describeDataSourceResponse.status();
            System.out.println(String.format("Creating data source. Status: %s", status));
            if (status != DataSourceStatus.CREATING) {
                break;
            }

            TimeUnit.SECONDS.sleep(60);
        }

        System.out.println(String.format("Synchronize the data source %s", dataSourceId));
        StartDataSourceSyncJobRequest startDataSourceSyncJobRequest = StartDataSourceSyncJobRequest
            .builder()
            .indexId(indexId)
            .id(dataSourceId)
            .build();
        StartDataSourceSyncJobResponse startDataSourceSyncJobResponse = kendra.startDataSourceSyncJob(startDataSourceSyncJobRequest);
        System.out.println(String.format("Waiting for the data source to sync with the index %s for execution ID %s", indexId, startDataSourceSyncJobResponse.executionId()));

        // For this particular list, there should be just one job
        ListDataSourceSyncJobsRequest listDataSourceSyncJobsRequest = ListDataSourceSyncJobsRequest
            .builder()
            .indexId(indexId)
            .id(dataSourceId)
            .build();

        while (true) {
            ListDataSourceSyncJobsResponse listDataSourceSyncJobsResponse = kendra.listDataSourceSyncJobs(listDataSourceSyncJobsRequest);
            DataSourceSyncJob job = listDataSourceSyncJobsResponse.history().get(0);
            System.out.println(String.format("Syncing data source. Status: %s", job.status()));

            if (job.status() != DataSourceSyncJobStatus.SYNCING) {
                break;
            }

            TimeUnit.SECONDS.sleep(60);

        }

        System.out.println("Index setup is complete");
    }
}
```

# Getting started with an Amazon S3 data source (console)
<a name="getting-started-s3"></a>

You can use the Amazon Kendra console to get started using an Amazon S3 bucket as a data store. When you use the console, you specify all of the connection information you need to index the contents of the bucket. For more information, see [Amazon S3](data-source-s3.md).

Use the following procedure to create a basic S3 bucket data source using the default configuration. The procedure assumes that you created an index following the steps in step 1 of [Getting started with the Amazon Kendra console](gs-console.md).

**To create an S3 bucket data source using the Amazon Kendra console**

1. Sign in to the AWS Management Console and open the Amazon Kendra console at [https://console.aws.amazon.com/kendra/home](https://console.aws.amazon.com/kendra/home).

1. From the list of indexes, choose the index that you want to add the data source to.

1. Choose **Add data sources**.

1. From the list of data source connectors, choose **Amazon S3**.

1. On the **Define attributes** page, give your data source a name and optionally a description. Leave the **Tags** field blank. Choose **Next** to continue.

1. In the **Enter the data source location** field, enter the name of the S3 bucket that contains your documents. You can enter the name directly, or you can browse for the name by choosing **Browse**. The bucket must be in the same Region as the index.

1. In **IAM role**, choose **Create a new role**, and then enter a role name. For more information, see [IAM roles for Amazon S3 data sources](https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html#iam-roles-ds-s3).

1. In the **Set sync run schedule** section, choose **Run on demand**.

1. Choose **Next** to continue.

1. On the **Review and create** page, review the details of your S3 data source. If you want to make changes, choose the **Edit** button next to the item that you want to change. When you are satisfied with your choices, choose **Create** to create your S3 data source.

After you choose **Create**, Amazon Kendra starts creating the data source. It can take several minutes for the data source to be created. When it is finished, the status of the data source changes from **Creating** to **Active**.

After creating the data source, you need to sync the Amazon Kendra index with the data source. Choose **Sync now** to start the sync process. It can take several minutes to several hours to synchronize the data source, depending on the number and size of the documents.
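The console steps above correspond roughly to the `CreateDataSource` and `StartDataSourceSyncJob` API operations. The following is a minimal sketch using the AWS SDK for Python (Boto3), not a definitive implementation. The helper names `build_s3_data_source_request` and `create_and_sync` are hypothetical, and the role ARN is assumed to already exist (the console creates the role for you, the API does not). Omitting `Schedule` from the request is what makes the data source run on demand.

```python
import time


def build_s3_data_source_request(index_id, name, bucket_name, role_arn):
    """Return keyword arguments for the Kendra CreateDataSource operation.

    No Schedule is set, so the data source syncs only on demand.
    """
    return {
        "IndexId": index_id,
        "Name": name,
        "Type": "S3",
        "RoleArn": role_arn,
        "Configuration": {"S3Configuration": {"BucketName": bucket_name}},
    }


def create_and_sync(request, poll_seconds=60):
    """Create the data source, wait until it leaves CREATING, then start a sync.

    Requires boto3 and AWS credentials. boto3 is imported here so that the
    request builder above can be used without the SDK installed.
    """
    import boto3

    kendra = boto3.client("kendra")
    index_id = request["IndexId"]
    data_source_id = kendra.create_data_source(**request)["Id"]

    # The data source must finish creating before a sync job can start.
    while True:
        status = kendra.describe_data_source(Id=data_source_id, IndexId=index_id)["Status"]
        if status != "CREATING":
            break
        time.sleep(poll_seconds)

    kendra.start_data_source_sync_job(Id=data_source_id, IndexId=index_id)
    return data_source_id
```

To try it, build the request with your own index ID, bucket name, and role ARN, and pass the result to `create_and_sync`. The bucket must be in the same Region as the index, as in the console procedure.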

# Getting started with a MySQL database data source (console)
<a name="getting-started-mysql"></a>

You can use the Amazon Kendra console to get started using a MySQL database as a data source. When you use the console, you specify the connection information you need to index the contents of a MySQL database. For more information, see [Using a database data source](https://docs.aws.amazon.com/kendra/latest/dg/data-source-database.html).

First create a MySQL database, and then create a data source for the database.

Use the following procedure to create a basic MySQL database. The procedure assumes that you have already created an index following step 1 of [Getting started with the Amazon Kendra console](gs-console.md).

**To create a MySQL database**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. From the navigation pane, choose **Subnet groups** and then choose **Create DB Subnet Group**.

1. Name the group and choose your Virtual Private Cloud (VPC). For more information on configuring a VPC, see [Configuring Amazon Kendra to use a VPC](https://docs.aws.amazon.com/kendra/latest/dg/vpc-configuration.html).

1. Add your VPC's private subnets. Your private subnets are the ones that are not connected to your NAT. Choose **Create**.

1. From the navigation pane, choose **Databases** and then choose **Create database**.

1. Use the following parameters to create the database. Leave all of the other parameters at their defaults.
   + **Engine options**—MySQL
   + **Templates**—Free tier
   + **Credential Settings**—Enter and confirm a password
   + Under **Connectivity**, choose **Additional connectivity configuration**. Make the following choices.
     + **Subnet group**—Choose the subnet group that you created in step 4.
     + **VPC security group**—Choose the group that contains both inbound and outbound rules that you created in your VPC. For example, **DataSourceSecurityGroup**. For more information on configuring a VPC, see [Configuring Amazon Kendra to use a VPC](https://docs.aws.amazon.com/kendra/latest/dg/vpc-configuration.html).
   + Under **Additional configuration**, set the **Initial database name** to **content**.

1. Choose **Create database**.

1. From the list of databases, choose your new database. Make a note of the database endpoint.

1. After you create your database, you must create a table to hold your documents. Creating a table is outside the scope of these instructions. When you create your table, note the following:
   + Database name—**content**
   + Table name—**documents**
   + Columns—**ID**, **Title**, **Body**, and **LastUpdate**. You can include additional columns if you want.

Now that you have created your MySQL database, you can create a data source for the database.

**To create a MySQL data source**

1. Sign in to the AWS Management Console and open the Amazon Kendra console at [https://console.aws.amazon.com/kendra/home](https://console.aws.amazon.com/kendra/home).

1. From the navigation pane, choose **Indexes** and then choose your index.

1. Choose **Add data sources** and then choose **Amazon RDS**.

1. Type a name and description for the data source and then choose **Next**.

1. Choose **MySQL**.

1. Under **Connection access**, enter the following information:
   + **Endpoint**—The endpoint of the database that you created earlier.
   + **Port**—The port number for the database. For MySQL, the default is 3306.
   + **Type of authentication**—Choose **New**.
   + **New secret container name**—A name for the Secrets Manager container for the database credentials.
   + **Username**—The name of a user with administrative access to the database.
   + **Password**—The password for the user, and then choose **Save authentication**.
   + **Database name**—**content**.
   + **Table name**—**documents**.
   + **IAM role**—Choose **Create a new role**, and then type a name for the role.

1. In **Column configuration**, enter the following:
   + **Document ID column name**—**ID**
   + **Document title column name**—**Title**
   + **Document data column name**—**Body**

1. In **Column change detection**, enter the following:
   + **Change detecting columns**—**LastUpdate**

1. In **Configure VPC & security group**, provide the following:
   + In **Virtual Private Cloud (VPC)**, choose your VPC.
   + In **Subnets**, choose the private subnets that you created in your VPC.
   + In **VPC security groups**, choose the security group that contains both inbound and outbound rules that you created in your VPC for MySQL databases. For example, **DataSourceSecurityGroup**.

1. In **Set sync run schedule**, choose **Run on demand** and then choose **Next**.

1. In **Data source field mapping**, choose **Next**.

1. Review the configuration of your data source to make sure that it is correct. When you're satisfied that everything is correct, choose **Create**.
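The values entered in the console procedure above map onto the `DatabaseConfiguration` structure of the `CreateDataSource` API. The following sketch shows that mapping with Boto3-style keyword arguments. The helper name `build_mysql_data_source_request` is hypothetical, and the sketch assumes that you already have a Secrets Manager secret ARN holding the database credentials, an IAM role ARN, and the IDs of your private subnets and security group (the console creates or looks up these for you).

```python
def build_mysql_data_source_request(index_id, name, role_arn, endpoint, secret_arn,
                                    subnet_ids, security_group_ids, port=3306):
    """Return keyword arguments for CreateDataSource for the example MySQL database.

    The database, table, and column names match the ones used in the
    console procedure (content, documents, ID, Title, Body, LastUpdate).
    """
    return {
        "IndexId": index_id,
        "Name": name,
        "Type": "DATABASE",
        "RoleArn": role_arn,
        "Configuration": {
            "DatabaseConfiguration": {
                "DatabaseEngineType": "RDS_MYSQL",
                "ConnectionConfiguration": {
                    "DatabaseHost": endpoint,
                    "DatabasePort": port,  # 3306 is the MySQL default
                    "DatabaseName": "content",
                    "TableName": "documents",
                    "SecretArn": secret_arn,
                },
                "VpcConfiguration": {
                    "SubnetIds": list(subnet_ids),
                    "SecurityGroupIds": list(security_group_ids),
                },
                "ColumnConfiguration": {
                    "DocumentIdColumnName": "ID",
                    "DocumentTitleColumnName": "Title",
                    "DocumentDataColumnName": "Body",
                    "ChangeDetectingColumns": ["LastUpdate"],
                },
            }
        },
    }
```

With Boto3 installed and credentials configured, the result can be passed to `boto3.client("kendra").create_data_source(**request)`. Because no `Schedule` is included, the data source runs on demand.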

# Getting started with an AWS IAM Identity Center identity source (console)
<a name="getting-started-aws-sso"></a>

An AWS IAM Identity Center identity source contains information on your users and groups. This is useful for setting up user context filtering, where Amazon Kendra filters search results for different users based on the user or their group's access to documents.

To create an IAM Identity Center identity source, you must activate IAM Identity Center and create an organization in AWS Organizations. When you activate IAM Identity Center and create an organization for the first time, the identity source defaults to the Identity Center directory. You can change your identity source to Active Directory (Amazon managed or self-managed) or an external identity provider. To do so, follow the guidance in [Changing your IAM Identity Center identity source](https://docs.aws.amazon.com/kendra/latest/dg/changing-aws-sso-source.html). You can have only one identity source per organization.

In order for your users and groups to be assigned different levels of access to documents, you need to include your users and groups in your access control list when you ingest documents into your index. This allows your users and groups to search for documents in Amazon Kendra in accordance with their level of access. When you issue a query, the user ID needs to be an exact match of the user name in IAM Identity Center.

You must also grant the required permissions to use IAM Identity Center with Amazon Kendra. For more information, see [IAM roles for IAM Identity Center](https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html#iam-roles-aws-sso).

**To set up an IAM Identity Center identity source**

1. Open the [IAM Identity Center console](https://console.aws.amazon.com/singlesignon).

1. Choose **Enable IAM Identity Center**, and then choose **Create AWS organization**.

   An Identity Center directory is created by default, and an email is sent to you to verify the email address associated with the organization.

1. To add a group to your AWS organization, in the navigation pane, choose **Groups**.

1. On the **Groups** page, choose **Create group** and enter a group name and description in the dialog box. Choose **Create**.

1. To add a user to your AWS organization, in the navigation pane, choose **Users**.

1. On the **Users** page, choose **Add user**. Under **User details**, specify all required fields. For **Password**, choose **Send an email to the user**. Choose **Next**.

1. To add a user to a group, choose **Groups** and select a group.

1. On the **Details** page, under **Group members**, choose **Add user**.

1. On the **Add users to group** page, select the user you want to add as a member of the group. You can select multiple users to add to a group.

1. To sync your list of users and groups with IAM Identity Center, change your identity source to Active Directory or External identity provider.

   Identity Center directory is the default identity source. If you do not have your own list of users and groups managed by a provider, you must add your users and groups manually using this source. To change your identity source, follow the guidance in [Changing your IAM Identity Center identity source](https://docs.aws.amazon.com/kendra/latest/dg/changing-aws-sso-source.html).

**Note**  
If you use Active Directory or an external identity provider as your identity source, you must map the email addresses of your users to IAM Identity Center user names when you specify the System for Cross-domain Identity Management (SCIM) protocol. For more information, see the [IAM Identity Center guide on SCIM for enabling IAM Identity Center](https://docs.aws.amazon.com/singlesignon/latest/userguide/scim-profile-saml.html).

After you set up your IAM Identity Center identity source, you can activate it in the console when you create or edit your index. Go to **User access control** in your index settings and edit your settings to allow fetching user-group information from IAM Identity Center.

You can also activate IAM Identity Center using the [UserGroupResolutionConfiguration](https://docs.aws.amazon.com/kendra/latest/APIReference/API_UserGroupResolutionConfiguration.html) object. Set `UserGroupResolutionMode` to `AWS_SSO` and create an IAM role that gives permission to call `sso:ListDirectoryAssociations`, `sso-directory:SearchUsers`, `sso-directory:ListGroupsForUser`, and `sso-directory:DescribeGroups`.
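As an illustration of the permissions listed above, the following sketch builds an IAM policy document that allows the four actions, along with the `UserGroupResolutionConfiguration` value. The helper name `build_identity_center_policy` is hypothetical, and `"Resource": "*"` is an assumption made for brevity. See the IAM roles page linked earlier for the policy that Amazon Kendra actually requires.

```python
import json

# The four actions named in the text above.
IDENTITY_CENTER_ACTIONS = [
    "sso:ListDirectoryAssociations",
    "sso-directory:SearchUsers",
    "sso-directory:ListGroupsForUser",
    "sso-directory:DescribeGroups",
]

# Value to pass as UserGroupResolutionConfiguration when updating an index.
USER_GROUP_RESOLUTION_CONFIGURATION = {"UserGroupResolutionMode": "AWS_SSO"}


def build_identity_center_policy():
    """Return an IAM policy document allowing the Identity Center actions.

    Resource "*" is an assumption for this sketch, not a recommendation.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(IDENTITY_CENTER_ACTIONS),
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into an IAM policy editor.
    print(json.dumps(build_identity_center_policy(), indent=2))
```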

**Warning**  
Amazon Kendra currently does not support using `UserGroupResolutionConfiguration` with an AWS organization member account for your IAM Identity Center identity source. You must create your index in the management account for the organization in order to use `UserGroupResolutionConfiguration`.

The following is an overview of how to set up a data source with `UserGroupResolutionConfiguration` and user access control to filter search results on user context. This assumes you have already created an index and an IAM role for indexes. You create an index and provide the IAM role using the [CreateIndex](https://docs.aws.amazon.com/kendra/latest/APIReference/API_CreateIndex.html) API.

**Setting up a data source with `UserGroupResolutionConfiguration` and user context filtering**

1. Create an [IAM role](https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html#iam-roles-aws-sso) that gives permission to access your IAM Identity Center identity source.

1. Configure [UserGroupResolutionConfiguration](https://docs.aws.amazon.com/kendra/latest/APIReference/API_UserGroupResolutionConfiguration.html) by setting the mode to `AWS_SSO` and call [UpdateIndex](https://docs.aws.amazon.com/kendra/latest/APIReference/API_UpdateIndex.html) to update your index to use IAM Identity Center.

1. If you want to use token-based user access control to filter search results on user context, set [UserContextPolicy](https://docs.aws.amazon.com/kendra/latest/APIReference/API_UpdateIndex.html#Kendra-UpdateIndex-request-UserContextPolicy) to `USER_TOKEN` when you call `UpdateIndex`. Otherwise, Amazon Kendra crawls the access control list for each of your documents for most data source connectors. You can also filter search results on user context in the [Query](https://docs.aws.amazon.com/kendra/latest/APIReference/API_Query.html) API by providing user and group information in `UserContext`. You can also map users to their groups using [PutPrincipalMapping](https://docs.aws.amazon.com/kendra/latest/APIReference/API_PutPrincipalMapping.html) so that you only need to provide the user ID when you issue the query.

1. Create an [IAM role](https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html#iam-roles-ds) that gives permission to access your data source.

1. [Configure](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DataSourceConfiguration.html) your data source. You must provide the required connection information to connect to your data source.

1. Create a data source using the [CreateDataSource](https://docs.aws.amazon.com/kendra/latest/APIReference/API_CreateDataSource.html) API. Provide the `DataSourceConfiguration` object (which includes `TemplateConfiguration`), the ID of your index, the IAM role for your data source, the data source type, and a name for your data source. You can also update your data source.
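The index-related steps above can be sketched as Boto3-style keyword arguments for `UpdateIndex`, `PutPrincipalMapping`, and `Query`. This is a minimal sketch under stated assumptions: the three helper names are hypothetical, the IDs are placeholders that you supply, and any further settings your index might need for token-based access control are not shown.

```python
def build_update_index_request(index_id, use_token_access_control=False):
    """Return keyword arguments for UpdateIndex that enable IAM Identity Center."""
    request = {
        "Id": index_id,
        "UserGroupResolutionConfiguration": {"UserGroupResolutionMode": "AWS_SSO"},
    }
    if use_token_access_control:
        # Token-based user access control for filtering on user context.
        request["UserContextPolicy"] = "USER_TOKEN"
    return request


def build_principal_mapping_request(index_id, group_id, user_ids):
    """Return keyword arguments for PutPrincipalMapping that map users to a group."""
    return {
        "IndexId": index_id,
        "GroupId": group_id,
        "GroupMembers": {"MemberUsers": [{"UserId": user_id} for user_id in user_ids]},
    }


def build_query_request(index_id, query_text, user_id, groups=None):
    """Return keyword arguments for Query that filter results on user context.

    Groups can be omitted when users are already mapped to their groups.
    """
    user_context = {"UserId": user_id}
    if groups:
        user_context["Groups"] = list(groups)
    return {"IndexId": index_id, "QueryText": query_text, "UserContext": user_context}
```

With a client created by `boto3.client("kendra")`, these dictionaries can be passed as `kendra.update_index(**request)`, `kendra.put_principal_mapping(**request)`, and `kendra.query(**request)`. The user ID in the query must exactly match the user name in IAM Identity Center.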

# Changing your IAM Identity Center identity source
<a name="changing-aws-sso-source"></a>

**Warning**  
Changing your identity source in IAM Identity Center **Settings** might affect the preservation of user and group information. To do this safely, review [Considerations for changing your identity source](https://docs.aws.amazon.com/singlesignon/latest/userguide/manage-your-identity-source-considerations.html). When you change your identity source, a new identity source ID is generated. Check that you are using the correct ID before you set the mode to `AWS_SSO` in [UserGroupResolutionConfiguration](https://docs.aws.amazon.com/kendra/latest/APIReference/API_UserGroupResolutionConfiguration.html).

**To change your IAM Identity Center identity source**

1. Open the [IAM Identity Center console](https://console.aws.amazon.com/singlesignon).

1. Choose **Settings**.

1. On the **Settings** page, under **Identity source**, choose **Change**.

1. On the **Change identity source** page, select your preferred identity source, and then choose **Next**.