

# Creating collections
<a name="serverless-create"></a>

You can use the console or the AWS CLI to create a serverless collection. These steps cover how to create a *search* or *time series* collection. To create a *vector search* collection, see [Working with vector search collections](serverless-vector-search.md). 

**Topics**
+ [Create a collection (console)](serverless-create-console.md)
+ [Create a collection (CLI)](serverless-create-cli.md)

# Create a collection (console)
<a name="serverless-create-console"></a>

Use the procedures in this section to create a collection by using the AWS Management Console. These steps cover how to create a *search* or *time series* collection. To create a *vector search* collection, see [Working with vector search collections](serverless-vector-search.md). 

**Topics**
+ [Configure collection settings](#serverless-create-console-step-2)
+ [Configure additional search fields](#serverless-create-console-step-3)

## Configure collection settings
<a name="serverless-create-console-step-2"></a>

Use the following procedure to configure information about your collection.

**To configure collection settings using the console**

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home/](https://console.aws.amazon.com/aos/home/).

1. Expand **Serverless** in the left navigation pane and choose **Collections**. 

1. Choose **Create collection**.

1. Provide a name and description for the collection. The name must meet the following criteria:
   + Is unique to your account and AWS Region
   + Contains only lowercase letters a-z, the numbers 0–9, and the hyphen (-)
   + Contains between 3 and 32 characters

1. Choose a collection type:
   + **Time series** – Log analytics segment that focuses on analyzing large volumes of semi-structured, machine-generated data. At least 24 hours of data is stored on hot indexes, and the rest remains in warm storage.
   + **Search** – Full-text search that powers applications in your internal networks and internet-facing applications. All search data is stored in hot storage to ensure fast query response times.
**Note**  
Choose this option if you are enabling automatic semantic enrichment, as described in [Configure automatic semantic enrichment](#serverless-create-console-step-3-semantic-enrichment-fields).
   + **Vector search** – Semantic search on vector embeddings that simplifies vector data management. Powers machine learning (ML) augmented search experiences and generative AI applications such as chatbots, personal assistants, and fraud detection.

   For more information, see [Choosing a collection type](serverless-overview.md#serverless-usecase).

1. For **Deployment type**, choose the redundancy setting for your collection. By default, each collection has redundancy, which means that the indexing and search OpenSearch Compute Units (OCUs) each have their own standby replicas in a different Availability Zone. For development and testing purposes, you can choose to disable redundancy, which reduces the number of OCUs in your collection to two. For more information, see [How it works](serverless-overview.md#serverless-process).

1. For **Security**, choose **Standard create**.

1. For **Encryption**, choose an AWS KMS key to encrypt your data with. OpenSearch Serverless notifies you if the collection name that you entered matches a pattern defined in an encryption policy. You can choose to keep this match or override it with unique encryption settings. For more information, see [Encryption in Amazon OpenSearch Serverless](serverless-encryption.md).

1. For **Network access settings**, configure network access for the collection.
   + For **Access type**, select public or private. 

     If you choose private, specify which VPC endpoints and AWS services can access the collection.
     + **VPC endpoints for access** – Specify one or more VPC endpoints to allow access through. To create a VPC endpoint, see [Data plane access through AWS PrivateLink](serverless-vpc.md).
     + **AWS service private access** – Select one or more supported services to allow access to.
   + For **Resource type**, select whether users can access the collection through its *OpenSearch* endpoint (to make API calls through cURL, Postman, and so on), through the *OpenSearch Dashboards* endpoint (to work with visualizations and make API calls through the console), or both.
**Note**  
AWS service private access applies only to the OpenSearch endpoint, not to the OpenSearch Dashboards endpoint.

   OpenSearch Serverless notifies you if the collection name that you entered matches a pattern defined in a network policy. You can choose to keep this match or override it with custom network settings. For more information, see [Network access for Amazon OpenSearch Serverless](serverless-network.md).

1. (Optional) Add one or more tags to the collection. For more information, see [Tagging Amazon OpenSearch Serverless collections](tag-collection.md).

1. Choose **Next**.

## Configure additional search fields
<a name="serverless-create-console-step-3"></a>

The options you see on page two of the create collection workflow depend on the type of collection you are creating. This section describes how to configure additional search fields for each collection type. This section also describes how to configure automatic semantic enrichment. Skip any section that doesn't apply to your collection type.

**Topics**
+ [Configure automatic semantic enrichment](#serverless-create-console-step-3-semantic-enrichment-fields)
+ [Configure time series search fields](#serverless-create-console-step-3-time-series-fields)
+ [Configure lexical search fields](#serverless-create-console-step-3-lexical-fields)
+ [Configure vector search fields](#serverless-create-console-step-3-vector-search-fields)

### Configure automatic semantic enrichment
<a name="serverless-create-console-step-3-semantic-enrichment-fields"></a>

When you create or edit a collection, you can configure automatic semantic enrichment, which simplifies semantic search implementation and capabilities in Amazon OpenSearch Service. Semantic search returns query results that incorporate not just keyword matching, but the intent and contextual meaning of the user's search. For more information, see [Automatic semantic enrichment for Serverless](serverless-semantic-enrichment.md).

**To configure automatic semantic enrichment**

1. In the **Index details** section, for **Index name**, specify a name.

1. In the **Automatic semantic enrichment fields** section, choose **Add semantic search field**.

1. In the **Input field name for semantic enrichment** field, enter the name of a field that you want to enrich.

1. **Data type** is **Text**. You can't change this.

1. For **Language**, choose either **English** or **Multilingual**.

1. Choose **Add field**.

1. After you finish configuring optional fields for your collection, choose **Next**. Review your changes and choose **Submit** to create the collection.

### Configure time series search fields
<a name="serverless-create-console-step-3-time-series-fields"></a>

The options in the **Time series search fields** section pertain to time series data and data streams. For more information about these subjects, see [Managing time-series data in Amazon OpenSearch Service with data streams](data-streams.md).

**To configure time series search fields**

1. In the **Time series search fields** section, choose **Add time series field**.

1. For **Field name**, enter a name.

1. For **Data type**, choose a type from the list.

1. Choose **Add field**.

1. After you finish configuring optional fields for your collection, choose **Next**. Review your changes and choose **Submit** to create the collection.

### Configure lexical search fields
<a name="serverless-create-console-step-3-lexical-fields"></a>

Lexical search seeks an exact match between a search query and indexed terms or keywords.

**To configure lexical search fields**

1. In the **Lexical search fields** section, choose **Add search field**.

1. For **Field name**, enter a name.

1. For **Data type**, choose a type from the list.

1. Choose **Add field**.

1. After you finish configuring optional fields for your collection, choose **Next**. Review your changes and choose **Submit** to create the collection.

### Configure vector search fields
<a name="serverless-create-console-step-3-vector-search-fields"></a>

**To configure vector search fields**

1. In the **Vector fields** section, choose **Add vector field**.

1. For **Field name**, enter a name.

1. For **Engine**, choose a type from the list.

1. Enter the number of dimensions.

1. For **Distance Metric**, choose a type from the list.

1. After you finish configuring optional fields for your collection, choose **Next**.

1. Review your changes and choose **Submit** to create the collection.

# Create a collection (CLI)
<a name="serverless-create-cli"></a>

Use the procedures in this section to create an OpenSearch Serverless collection using the AWS CLI. 

**Topics**
+ [Before you begin](#serverless-create-cli-before-you-begin)
+ [Creating a collection](#serverless-create-cli-creating)
+ [Creating a collection with an automatic semantic enrichment index](#serverless-create-cli-automatic-semantic-enrichment)

## Before you begin
<a name="serverless-create-cli-before-you-begin"></a>

Before you create a collection using the AWS CLI, use the following procedure to create required policies for the collection.

**Note**  
In each of the following procedures, when you specify a name for a collection, the name must meet the following criteria:  
+ Is unique to your account and AWS Region
+ Contains only lowercase letters a-z, the numbers 0–9, and the hyphen (-)
+ Contains between 3 and 32 characters
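
The character and length rules in this note can be checked locally before you call the service. The following is a minimal sketch, not part of the AWS CLI: the function name `is_valid_collection_name` is hypothetical, and it checks only the rules stated above. Uniqueness within your account and Region can only be confirmed by the service itself.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the name contains only
# lowercase letters, digits, and hyphens, and is 3 to 32 characters long.
# It does not check uniqueness; only the service can do that.
is_valid_collection_name() {
  # LC_ALL=C keeps the a-z range limited to ASCII lowercase letters.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9-]{3,32}$'
}

# Example usage with one acceptable name and two names that break the rules.
for candidate in logs-application Logs_App ab; do
  if is_valid_collection_name "$candidate"; then
    echo "$candidate: meets the character and length rules"
  else
    echo "$candidate: rejected"
  fi
done
```

A check like this is a convenience for catching typos early. The service remains the authority on whether a name is accepted.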

**To create required policies for a collection**

1. Open the AWS CLI and run the following command to create an [encryption policy](serverless-encryption.md) with a resource pattern that matches the intended name of the collection. 

   ```
   aws opensearchserverless create-security-policy \
     --name policy name \
     --type encryption --policy "{\"Rules\":[{\"ResourceType\":\"collection\",\"Resource\":[\"collection\/collection name\"]}],\"AWSOwnedKey\":true}"
   ```

   For example, if you plan to name your collection *logs-application*, you might create an encryption policy like this:

   ```
   aws opensearchserverless create-security-policy \
     --name logs-policy \
     --type encryption --policy "{\"Rules\":[{\"ResourceType\":\"collection\",\"Resource\":[\"collection\/logs-application\"]}],\"AWSOwnedKey\":true}"
   ```

   If you plan to use the policy for additional collections, you can make the rule broader, such as `collection/logs*` or `collection/*`.

1. Run the following command to configure network settings for the collection using a [network policy](serverless-network.md). You can create network policies after you create a collection, but we recommend doing it beforehand.

   ```
   aws opensearchserverless create-security-policy \
     --name policy name \
     --type network --policy "[{\"Description\":\"description\",\"Rules\":[{\"ResourceType\":\"dashboard\",\"Resource\":[\"collection\/collection name\"]},{\"ResourceType\":\"collection\",\"Resource\":[\"collection\/collection name\"]}],\"AllowFromPublic\":true}]"
   ```

   Using the previous *logs-application* example, you might create the following network policy:

   ```
   aws opensearchserverless create-security-policy \
     --name logs-policy \
     --type network --policy "[{\"Description\":\"Public access for logs collection\",\"Rules\":[{\"ResourceType\":\"dashboard\",\"Resource\":[\"collection\/logs-application\"]},{\"ResourceType\":\"collection\",\"Resource\":[\"collection\/logs-application\"]}],\"AllowFromPublic\":true}]"
   ```
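
The backslash-escaped JSON in the preceding commands is easy to mistype. One possible alternative, sketched here under the assumption that you are using a POSIX shell, is to generate the policy documents with `printf` and pass the result to `--policy`. The function names `encryption_policy_json` and `network_policy_json` are hypothetical. The JSON they print mirrors the *logs-application* examples above.

```shell
#!/bin/sh
# Hypothetical helpers that print policy documents for a collection name.
# Neither helper escapes its arguments, so avoid double quotes and
# backslashes in the values that you pass in.

encryption_policy_json() {
  # $1: collection name or pattern, for example logs-application or logs*
  printf '{"Rules":[{"ResourceType":"collection","Resource":["collection/%s"]}],"AWSOwnedKey":true}' "$1"
}

network_policy_json() {
  # $1: collection name or pattern, $2: free-text description
  printf '[{"Description":"%s","Rules":[{"ResourceType":"dashboard","Resource":["collection/%s"]},{"ResourceType":"collection","Resource":["collection/%s"]}],"AllowFromPublic":true}]' "$2" "$1" "$1"
}

# Print the generated documents so that you can inspect them before use.
encryption_policy_json "logs-application"; echo
network_policy_json "logs-application" "Public access for logs collection"; echo

# To apply a generated document, pass it to the command shown above:
#   aws opensearchserverless create-security-policy \
#     --name logs-policy --type encryption \
#     --policy "$(encryption_policy_json logs-application)"
```

Because single quotes surround the `printf` format string, the JSON needs no backslash escaping, and the collection name appears in only one place per call.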

## Creating a collection
<a name="serverless-create-cli-creating"></a>

The following procedure uses the [CreateCollection](https://docs.aws.amazon.com/opensearch-service/latest/ServerlessAPIReference/API_CreateCollection.html) API action to create a collection of type `SEARCH` or `TIMESERIES`. If you don't specify a collection type in the request, it defaults to `TIMESERIES`. For more information about these types, see [Choosing a collection type](serverless-overview.md#serverless-usecase). To create a *vector search* collection, see [Working with vector search collections](serverless-vector-search.md). 

If your collection is encrypted with an AWS owned key, the `kmsKeyArn` is `auto` rather than an ARN.

**Important**  
After you create a collection, you won't be able to access it unless it matches a data access policy. For more information, see [Data access control for Amazon OpenSearch Serverless](serverless-data-access.md).

**To create a collection**

1. Verify that you created the required policies described in [Before you begin](#serverless-create-cli-before-you-begin).

1. Run the following command. For `--type`, specify either `SEARCH` or `TIMESERIES`.

   ```
   aws opensearchserverless create-collection --name "collection name" --type collection type --description "description"
   ```
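
As the Important note above says, a new collection isn't accessible until a data access policy matches it. The following sketch shows one way to generate such a policy document. The function name `data_access_policy_json`, the policy name `logs-access`, the account ID, and the role name are all placeholders for illustration, and the broad `aoss:*` permission is an assumption chosen for brevity. See the linked data access control topic for the permissions that fit your use case.

```shell
#!/bin/sh
# Hypothetical helper that prints a data access policy document.
# $1: collection name, $2: ARN of the IAM principal to grant access to.
# The aoss:* permissions are deliberately broad for illustration only.
data_access_policy_json() {
  printf '[{"Rules":[{"ResourceType":"collection","Resource":["collection/%s"],"Permission":["aoss:*"]},{"ResourceType":"index","Resource":["index/%s/*"],"Permission":["aoss:*"]}],"Principal":["%s"]}]' "$1" "$1" "$2"
}

# Print the document for inspection. The ARN below is a placeholder.
data_access_policy_json "logs-application" "arn:aws:iam::123456789012:role/example-role"; echo

# To apply the policy, pass the document to create-access-policy:
#   aws opensearchserverless create-access-policy \
#     --name logs-access --type data \
#     --policy "$(data_access_policy_json logs-application arn:aws:iam::123456789012:role/example-role)"
```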

## Creating a collection with an automatic semantic enrichment index
<a name="serverless-create-cli-automatic-semantic-enrichment"></a>

Use the following procedure to create a new OpenSearch Serverless collection with an index that is configured for [automatic semantic enrichment](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-semantic-enrichment.html). The procedure uses the OpenSearch Serverless [CreateIndex](https://docs.aws.amazon.com/opensearch-service/latest/ServerlessAPIReference/API_CreateIndex.html) API action.

**To create a new collection with an index configured for automatic semantic enrichment**

Run the following command to create the collection and an index.

```
aws opensearchserverless create-index \
--region Region ID \
--id collection name --index-name index name \
--index-schema \
'mapping in json'
```

Here's an example.

```
aws opensearchserverless create-index \
--region us-east-1 \
--id conversation_history --index-name conversation_history_index \
--index-schema \
'{
    "mappings": {
        "properties": {
            "age": {
                "type": "integer"
            },
            "name": {
                "type": "keyword"
            },
            "user_description": {
                "type": "text"
            },
            "conversation_history": {
                "type": "text",
                "semantic_enrichment": {
                    "status": "ENABLED",
                    "language_option": "MULTI-LINGUAL",
                    "embedding_field": "conversation_history_user_defined"
                }
            },
            "book_title": {
                "type": "text",
                "semantic_enrichment": {
                    "status": "ENABLED",
                    "language_option": "ENGLISH"
                }
            },
            "abstract": {
                "type": "text",
                "semantic_enrichment": {
                    "status": "ENABLED"
                }
            }
        }
    }
}'
```

In this example, the three enriched fields are configured as follows:
+ `conversation_history` – The `MULTI-LINGUAL` language option specifies the sparse tokenizer for processing multilingual text. Because `embedding_field` is provided, the semantic embedding field is set to `conversation_history_user_defined` rather than the original field name followed by `_embedding`.
+ `book_title` – No `embedding_field` is provided, so the semantic embedding field is set to `book_title_embedding`.
+ `abstract` – No `language_option` is provided, so it is set to English. No `embedding_field` is provided, so the semantic embedding field is set to `abstract_embedding`.
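
For longer mappings, it can be easier to keep the schema in a file than to quote it inline. The following sketch assumes that `--index-schema` accepts the AWS CLI's general `file://` prefix for loading a parameter value from a file. Verify this against your CLI version. The file location and the wrapper function `create_index_from_file` are hypothetical, and the schema is a shortened version of the example above.

```shell
#!/bin/sh
# Write a schema to a temporary file. The file location is arbitrary.
# The file must contain plain JSON, which does not allow comments.
schema_file="${TMPDIR:-/tmp}/conversation_history_schema.json"
cat > "$schema_file" <<'EOF'
{
    "mappings": {
        "properties": {
            "conversation_history": {
                "type": "text",
                "semantic_enrichment": {
                    "status": "ENABLED",
                    "language_option": "MULTI-LINGUAL"
                }
            }
        }
    }
}
EOF

# Hypothetical wrapper. It is defined here but not invoked.
# $1: value for --id, $2: index name, $3: path to the schema file
create_index_from_file() {
  aws opensearchserverless create-index \
    --id "$1" --index-name "$2" \
    --index-schema "file://$3"
}

echo "Schema written to $schema_file"
```

To create the index, you would call the wrapper with the same values used in the example above, such as `create_index_from_file conversation_history conversation_history_index "$schema_file"`.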