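The following program is an example of using Amazon Kendra in a Python program. The program does the following:
- Creates a new index using the CreateIndex operation.
- Waits for index creation to complete. It uses the DescribeIndex operation to monitor the status of the index.
- Once the index is active, creates a data source using the CreateDataSource operation.
- Waits for data source creation to complete. It uses the DescribeDataSource operation to monitor the status of the data source.
- When the data source is active, synchronizes the index with the contents of the data source using the StartDataSourceSyncJob operation.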
import pprint
import time

import boto3
from botocore.exceptions import ClientError

kendra = boto3.client("kendra")

print("Create an index.")

# Provide a name for the index
index_name = "python-getting-started-index"
# Provide an optional description for the index
description = "Getting started index"
# Provide the IAM role ARN required for indexes
# (replace ${accountId} with your AWS account ID)
index_role_arn = "arn:aws:iam::${accountId}:role/KendraRoleForGettingStartedIndex"

try:
    index_response = kendra.create_index(
        Description=description,
        Name=index_name,
        RoleArn=index_role_arn
    )

    pprint.pprint(index_response)

    index_id = index_response["Id"]

    print("Wait for Amazon Kendra to create the index.")

    while True:
        # Get the details of the index, such as the status
        index_description = kendra.describe_index(
            Id=index_id
        )
        # Stop polling once the status is no longer CREATING
        status = index_description["Status"]
        print(" Creating index. Status: " + status)
        if status != "CREATING":
            break
        time.sleep(60)

    print("Create an S3 data source.")

    # Provide a name for the data source
    data_source_name = "python-getting-started-data-source"
    # Provide an optional description for the data source
    data_source_description = "Getting started data source."
    # Provide the IAM role ARN required for data sources
    # (replace ${accountId} with your AWS account ID)
    data_source_role_arn = "arn:aws:iam::${accountId}:role/KendraRoleForGettingStartedDataSource"
    # Provide the data source connection information
    S3_bucket_name = "S3-bucket-name"
    data_source_type = "S3"
    # Configure the data source
    configuration = {
        "S3Configuration": {
            "BucketName": S3_bucket_name
        }
    }

    # If you connect to your data source using a template schema,
    # configure the template schema instead:
    #
    # configuration = {
    #     "TemplateConfiguration": {
    #         "Template": {JSON schema}
    #     }
    # }

    data_source_response = kendra.create_data_source(
        Name=data_source_name,
        Description=data_source_description,
        RoleArn=data_source_role_arn,
        Type=data_source_type,
        Configuration=configuration,
        IndexId=index_id
    )

    pprint.pprint(data_source_response)

    data_source_id = data_source_response["Id"]

    print("Wait for Amazon Kendra to create the data source.")

    while True:
        # Get the details of the data source, such as the status
        data_source_details = kendra.describe_data_source(
            Id=data_source_id,
            IndexId=index_id
        )
        # Stop polling once the status is no longer CREATING
        status = data_source_details["Status"]
        print(" Creating data source. Status: " + status)
        if status != "CREATING":
            break
        time.sleep(60)

    print("Synchronize the data source.")

    sync_response = kendra.start_data_source_sync_job(
        Id=data_source_id,
        IndexId=index_id
    )

    pprint.pprint(sync_response)

    print("Wait for the data source to sync with the index.")

    while True:
        jobs = kendra.list_data_source_sync_jobs(
            Id=data_source_id,
            IndexId=index_id
        )
        # For this example, there should be one job
        status = jobs["History"][0]["Status"]
        print(" Syncing data source. Status: " + status)
        if status != "SYNCING":
            break
        time.sleep(60)

except ClientError as e:
    # Report any error returned by the Amazon Kendra API
    print(e)

print("Program ends.")
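The wait loops in the program all follow the same pattern: call a describe operation, read the status, and sleep while the status is still the pending value. The following is a minimal sketch of how that pattern could be factored into one helper with an upper bound on the number of checks. The name `wait_while_status` and its parameters are hypothetical and are not part of boto3 or Amazon Kendra; the helper takes any callable that returns a status string.

```python
import time
from typing import Callable


def wait_while_status(
    get_status: Callable[[], str],
    pending: str = "CREATING",
    delay_seconds: float = 60.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it stops returning `pending`.

    Hypothetical helper for illustration. Returns the first status that
    differs from `pending`, or raises TimeoutError after max_attempts checks.
    """
    for attempt in range(max_attempts):
        status = get_status()
        if status != pending:
            return status
        # Sleep only between checks, not after the final one
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"Status was still {pending} after {max_attempts} checks")


# Possible usage with the names from the program above (assumes `kendra`
# and `index_id` already exist, so it is left commented out here):
#
# status = wait_while_status(
#     lambda: kendra.describe_index(Id=index_id)["Status"]
# )
# if status != "ACTIVE":
#     raise RuntimeError("Index creation ended with status " + status)
```

Returning the final status lets the caller decide what to do when the resource leaves the pending state without becoming active, instead of continuing to the next step regardless.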