Using data lifecycle policies with Amazon OpenSearch Serverless

A data lifecycle policy for an Amazon OpenSearch Serverless time series collection determines the lifespan of the data in that collection. OpenSearch Serverless retains the data for the period of time that you configure.

You can configure a separate data lifecycle policy for each index of each time series collection in your AWS account. OpenSearch Serverless retains documents in indexes for, at minimum, the retention period you configure in the policy. It then automatically deletes them on a best-effort basis, typically within 48 hours or 10% of the retention period, whichever is longer.
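
As a worked example of the deletion window described above, the following Python sketch computes the longer of 48 hours and 10% of the retention period. The function name is hypothetical, and because deletion is best-effort, treat the result as a typical bound rather than a guarantee.

```python
from datetime import timedelta


def typical_deletion_lag(retention: timedelta) -> timedelta:
    """Longer of 48 hours or 10% of the retention period (hypothetical helper)."""
    return max(timedelta(hours=48), retention / 10)


# With 15 days of retention, 10% is 1.5 days, so the 48-hour floor applies.
short_lag = typical_deletion_lag(timedelta(days=15))

# With 365 days of retention, 10% is 36.5 days, which exceeds 48 hours.
long_lag = typical_deletion_lag(timedelta(days=365))

print(short_lag, long_lag, sep=" | ")
```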

Only time series collections support data lifecycle policies. They are not supported by search or vector search collections.

Data lifecycle policies

In a data lifecycle policy, you specify a series of rules. These rules define the retention period for data in an index or group of indexes. Each rule consists of a resource type (index), a retention period, and a list of resources (indexes) that the retention period applies to.

You define the retention period with one of the following formats:

  • "MinIndexRetention": "24h" – OpenSearch Serverless retains index data for the specified period in hours or days. You can set this period to be from 24h to 3650d.

  • "NoMinIndexRetention": true – OpenSearch Serverless retains index data indefinitely.
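
The two retention formats above can be checked locally before a policy is submitted. The following is a minimal Python sketch, not the service's own validation: the function name and regular expression are hypothetical, and it assumes that any whole number of hours or days between 24h and 3650d is acceptable.

```python
import re
from datetime import timedelta

# Assumption: the value is a whole number followed by "h" (hours) or "d" (days).
_RETENTION_PATTERN = re.compile(r"(\d+)([hd])")
MIN_RETENTION = timedelta(hours=24)
MAX_RETENTION = timedelta(days=3650)


def parse_retention(value: str) -> timedelta:
    """Convert a MinIndexRetention string such as "24h" or "15d" to a timedelta."""
    match = _RETENTION_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"expected a value like '24h' or '15d', got {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    period = timedelta(hours=amount) if unit == "h" else timedelta(days=amount)
    if not MIN_RETENTION <= period <= MAX_RETENTION:
        raise ValueError(f"{value!r} is outside the range 24h to 3650d")
    return period


print(parse_retention("15d"))
```

A value such as "12h" or "3651d" raises ValueError in this sketch, which mirrors the documented bounds.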

In the following sample policy, the first rule specifies a retention period of 15 days for all indexes within the collection marketing. The second rule specifies that all indexes in the finance collection whose names begin with log have no retention period and are retained indefinitely.

{
  "lifeCyclePolicyDetail": {
    "type": "retention",
    "name": "my-policy",
    "policyVersion": "MTY4ODI0NTM2OTk1N18x",
    "policy": {
      "Rules": [
        {
          "ResourceType": "index",
          "Resource": [
            "index/marketing/*"
          ],
          "MinIndexRetention": "15d"
        },
        {
          "ResourceType": "index",
          "Resource": [
            "index/finance/log*"
          ],
          "NoMinIndexRetention": true
        }
      ]
    },
    "createdDate": 1688245369957,
    "lastModifiedDate": 1688245369957
  }
}

In the following sample policy rule, OpenSearch Serverless indefinitely retains the data in all indexes for all collections within the account.

{
  "Rules": [
    {
      "ResourceType": "index",
      "Resource": [
        "index/*/*"
      ],
      "NoMinIndexRetention": true
    }
  ]
}

Permissions required

Lifecycle policies for OpenSearch Serverless use the following AWS Identity and Access Management (IAM) permissions. You can specify IAM conditions to restrict users to data lifecycle policies associated with specific collections and indexes.

  • aoss:CreateLifecyclePolicy – Create a data lifecycle policy.

  • aoss:ListLifecyclePolicies – List all data lifecycle policies in the current account.

  • aoss:BatchGetLifecyclePolicy – View a data lifecycle policy associated with an account or policy name.

  • aoss:BatchGetEffectiveLifecyclePolicy – View a data lifecycle policy for a given resource (index is the only supported resource).

  • aoss:UpdateLifecyclePolicy – Modify a given data lifecycle policy, and change its retention setting or resource.

  • aoss:DeleteLifecyclePolicy – Delete a data lifecycle policy.

The following identity-based access policy allows a user to view all data lifecycle policies, and update policies with the resource pattern collection/application-logs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "aoss:UpdateLifecyclePolicy"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aoss:collection": "application-logs"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": [
        "aoss:ListLifecyclePolicies",
        "aoss:BatchGetLifecyclePolicy"
      ],
      "Resource": "*"
    }
  ]
}

Policy precedence

There can be situations where data lifecycle policy rules overlap, within or across policies. When this happens, a rule with a more specific resource name or pattern for an index overrides a rule with a more general resource name or pattern for any indexes that are common to both rules.

For example, in the following policy, two rules apply to the index index/sales/logstash. In this situation, the second rule takes precedence because index/sales/log* is the longest match for index/sales/logstash. Therefore, OpenSearch Serverless sets no retention period for the index.

{
  "Rules": [
    {
      "ResourceType": "index",
      "Resource": [
        "index/sales/*"
      ],
      "MinIndexRetention": "15d"
    },
    {
      "ResourceType": "index",
      "Resource": [
        "index/sales/log*"
      ],
      "NoMinIndexRetention": true
    }
  ]
}
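
The precedence rule can be illustrated with a short Python sketch. This is not the service's matching algorithm: it assumes that "longest match" means the matching pattern with the most characters, and it uses fnmatchcase from the standard library, whose wildcard handling (for example, * also matching /, and ? or [ ] being special) may differ from OpenSearch Serverless. The function name effective_rule is hypothetical.

```python
from fnmatch import fnmatchcase


def effective_rule(index_resource, rules):
    """Return the matching rule with the longest pattern, or None if none match."""
    best_rule, best_length = None, -1
    for rule in rules:
        for pattern in rule["Resource"]:
            # Assumption: a longer matching pattern is the more specific one.
            if fnmatchcase(index_resource, pattern) and len(pattern) > best_length:
                best_rule, best_length = rule, len(pattern)
    return best_rule


rules = [
    {"ResourceType": "index", "Resource": ["index/sales/*"], "MinIndexRetention": "15d"},
    {"ResourceType": "index", "Resource": ["index/sales/log*"], "NoMinIndexRetention": True},
]

winner = effective_rule("index/sales/logstash", rules)
print(winner["Resource"])  # ['index/sales/log*']
```

An index such as index/sales/orders matches only the first pattern, so under the same assumptions it would keep the 15-day retention period.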

Policy syntax

Provide one or more rules. These rules define data lifecycle settings for your OpenSearch Serverless indexes.

Each rule contains the following elements. You can provide either MinIndexRetention or NoMinIndexRetention in each rule, but not both.

  • ResourceType – The type of resource that the rule applies to. The only supported option for data lifecycle policies is index.

  • Resource – A list of resource names and/or patterns. Patterns consist of a prefix and a wildcard (*), which allow the rule to apply to multiple resources. For example, index/<collection-name|pattern>/<index-name|pattern>.

  • MinIndexRetention – The minimum period, in days (d) or hours (h), to retain documents in the index. The lower bound is 24h and the upper bound is 3650d.

  • NoMinIndexRetention – If true, OpenSearch Serverless retains documents indefinitely.

The following policy contains three example rules:

{
  "Rules": [
    {
      "ResourceType": "index",
      "Resource": [
        "index/autoparts-inventory/*"
      ],
      "MinIndexRetention": "20d"
    },
    {
      "ResourceType": "index",
      "Resource": [
        "index/auto*/gear"
      ],
      "MinIndexRetention": "24h"
    },
    {
      "ResourceType": "index",
      "Resource": [
        "index/autoparts-inventory/tires"
      ],
      "NoMinIndexRetention": true
    }
  ]
}
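
The structural constraints described in this section can be checked locally before a policy is submitted. The following Python sketch is illustrative only: rule_problems is a hypothetical helper, and it assumes that Resource must be a non-empty list of index/<collection>/<index> strings and that NoMinIndexRetention counts as set only when it is true. The service's own validation may be stricter or looser.

```python
def rule_problems(rule: dict) -> list:
    """Return human-readable problems found in one rule (empty list if none)."""
    problems = []
    if rule.get("ResourceType") != "index":
        problems.append("ResourceType must be 'index'")

    resources = rule.get("Resource")
    if not isinstance(resources, list) or not resources:
        problems.append("Resource must be a non-empty list")
    else:
        for resource in resources:
            parts = resource.split("/") if isinstance(resource, str) else []
            # Assumption: every resource looks like index/<collection>/<index>.
            if len(parts) != 3 or parts[0] != "index":
                problems.append(f"unexpected resource format: {resource!r}")

    has_min = "MinIndexRetention" in rule
    has_no_min = rule.get("NoMinIndexRetention") is True
    if has_min == has_no_min:
        problems.append("provide exactly one of MinIndexRetention or NoMinIndexRetention")
    return problems


policy = {
    "Rules": [
        {"ResourceType": "index", "Resource": ["index/autoparts-inventory/*"], "MinIndexRetention": "20d"},
        {"ResourceType": "index", "Resource": ["index/auto*/gear"], "MinIndexRetention": "24h"},
        {"ResourceType": "index", "Resource": ["index/autoparts-inventory/tires"], "NoMinIndexRetention": True},
    ]
}

for rule in policy["Rules"]:
    print(rule["Resource"], rule_problems(rule))
```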

Creating data lifecycle policies (AWS CLI)

To create a data lifecycle policy using the OpenSearch Serverless API operations, use the CreateLifecyclePolicy command. This command accepts both inline policies and .json files. Inline policies must be encoded as a JSON escaped string.

The following request creates a data lifecycle policy:

aws opensearchserverless create-lifecycle-policy \
  --name my-policy \
  --type retention \
  --policy "{\"Rules\":[{\"ResourceType\":\"index\",\"Resource\":[\"index/autoparts-inventory/*\"],\"MinIndexRetention\": \"81d\"},{\"ResourceType\":\"index\",\"Resource\":[\"index/sales/orders*\"],\"NoMinIndexRetention\":true}]}"

To provide the policy in a JSON file, use the format --policy file://my-policy.json
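
Escaping an inline policy by hand is error-prone. The following Python sketch is one possible way to build the escaped string: it serializes a policy with json.dumps and uses shlex.join to quote it for a POSIX shell. It only prints the command and does not call AWS. The policy contents and name are the sample values from this section.

```python
import json
import shlex

policy = {
    "Rules": [
        {
            "ResourceType": "index",
            "Resource": ["index/autoparts-inventory/*"],
            "MinIndexRetention": "81d",
        },
        {
            "ResourceType": "index",
            "Resource": ["index/sales/orders*"],
            "NoMinIndexRetention": True,
        },
    ]
}

# Compact JSON keeps the inline argument short; True is serialized as true.
policy_json = json.dumps(policy, separators=(",", ":"))

command = [
    "aws", "opensearchserverless", "create-lifecycle-policy",
    "--name", "my-policy",
    "--type", "retention",
    "--policy", policy_json,
]

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

Writing policy_json to a file and passing it with file:// avoids shell quoting entirely.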

Viewing data lifecycle policies

Before you create a collection, you might want to preview the existing data lifecycle policies in your account to see which one has a resource pattern that matches your collection's name. The following ListLifecyclePolicies request lists all data lifecycle policies in your account:

aws opensearchserverless list-lifecycle-policies --type retention

The request returns summary information about all configured data lifecycle policies in the lifecyclePolicySummaries element of the response. To view the pattern rules defined in one specific policy, note the name and type of that policy and use these properties in a BatchGetLifecyclePolicy request. The ListLifecyclePolicies response resembles the following:

{
  "lifecyclePolicySummaries": [
    {
      "type": "retention",
      "name": "my-policy",
      "policyVersion": "MTY2MzY5MTY1MDA3Ml8x",
      "createdDate": 1663691650072,
      "lastModifiedDate": 1663691650072
    }
  ]
}

To limit the results to policies that contain specific collections or indexes, you can include resource filters:

aws opensearchserverless list-lifecycle-policies --type retention --resources "index/autoparts-inventory/*"

To view detailed information about a specific policy, use the BatchGetLifecyclePolicy command.

Updating data lifecycle policies

When you modify a data lifecycle policy, all associated collections are impacted. To update a data lifecycle policy in the OpenSearch Serverless console, expand Data lifecycle policies, select the policy to modify, and choose Edit. Make your changes and choose Save.

To update a data lifecycle policy using the OpenSearch Serverless API, use the UpdateLifecyclePolicy command. You must include a policy version in the request. You can retrieve the policy version by using the ListLifecyclePolicies or BatchGetLifecyclePolicy commands. Including the most recent policy version ensures that you don't inadvertently override a change made by someone else.
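
The role of the policy version can be illustrated with a toy in-memory model of optimistic locking. This Python sketch is not the OpenSearch Serverless API: the PolicyStore class, its integer versions, and the VersionConflict error are all hypothetical stand-ins (the real service uses opaque version strings), and rejecting a stale version is assumed behavior.

```python
class VersionConflict(Exception):
    """Raised when an update is attempted with a stale version."""


class PolicyStore:
    """Toy stand-in for a service that versions each policy."""

    def __init__(self):
        self._policies = {}  # name -> (version, policy document)

    def create(self, name, policy):
        self._policies[name] = (1, policy)
        return 1

    def get(self, name):
        return self._policies[name]

    def update(self, name, policy, expected_version):
        current_version, _ = self._policies[name]
        if expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, current is {current_version}"
            )
        self._policies[name] = (current_version + 1, policy)
        return current_version + 1


store = PolicyStore()
version = store.create("my-policy", {"MinIndexRetention": "15d"})

# Two editors read the same version; only the first update succeeds.
store.update("my-policy", {"MinIndexRetention": "30d"}, expected_version=version)
try:
    store.update("my-policy", {"MinIndexRetention": "7d"}, expected_version=version)
    outcome = "overwritten"
except VersionConflict:
    outcome = "rejected"

print(outcome, store.get("my-policy"))
```

In this model the second editor must re-read the policy and its version before retrying, which is the behavior the version requirement is meant to encourage.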

The following request updates a data lifecycle policy with a new policy JSON document:

aws opensearchserverless update-lifecycle-policy \
  --name my-policy \
  --type retention \
  --policy-version MTY2MzY5MTY1MDA3Ml8x \
  --policy file://my-new-policy.json

There might be a few minutes of lag time between when you update the policy and when the new retention periods are enforced.

Deleting data lifecycle policies

When you delete a data lifecycle policy, it no longer applies to any matching indexes. To delete a policy in the OpenSearch Serverless console, select the policy and choose Delete.

You can also use the DeleteLifecyclePolicy command:

aws opensearchserverless delete-lifecycle-policy --name my-policy --type retention