Using PIT with Amazon OpenSearch Serverless - Amazon OpenSearch Service

Using PIT with Amazon OpenSearch Serverless

You can now use the PIT (Point in Time) plugin to run different queries against a dataset that is fixed in time. Typically, if you run a query on an index multiple times, the same query may return a different result due to data being continually indexed, updated, and deleted. If you need to run a query against the same data, you can preserve that data's state by creating a PIT. To learn more about Point in Time, see Point in Time.

Amazon OpenSearch Serverless supports the following PIT APIs:

  • POST /<target_indexes>/_search/point_in_time

  • GET /_search/point_in_time/_all

  • DELETE /_search/point_in_time/_all

  • DELETE /_search/point_in_time

Access to all of these APIs is controlled by the aoss:ReadDocument data access permission. For more information on supported APIs, see Supported operations in plugins in Amazon OpenSearch Serverless.

OpenSearch offers several pagination methods that can be coupled with Point in Time to perform deep pagination of search results. These include the from and size parameters of a search request and the search_after parameter. To learn more about pagination, see Paginate results.

Point in Time (PIT) with search_after is the recommended pagination method in Amazon OpenSearch Serverless, especially for deep pagination. It avoids the limitations of the other methods because it operates on a dataset that is frozen in time, is not bound to a query, and supports consistent forward pagination.

You can also split a PIT search into multiple slices, by using the slice.id and slice.max parameters, if you want to jump from a page to a non-consecutive page. To learn more about search slicing, see Search slicing.

Slicing performs better when you specify a field parameter for the slice. The field must be a numeric doc value type such as short, integer, or long. This field is used to create slice buckets, so it should have at least as many unique values as there are slices, and those values should be uniformly distributed.
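As a rough illustration of the slicing parameters described above, the following Python sketch builds one search request body per slice of the same PIT. The helper name build_slice_bodies and the field name customer_number are hypothetical, and the sketch only constructs the request bodies; sending them to a collection endpoint is left out.

```python
from typing import Any, Dict, List, Optional


def build_slice_bodies(
    pit_id: str,
    max_slices: int,
    field: Optional[str] = None,
    keep_alive: str = "100m",
) -> List[Dict[str, Any]]:
    """Return one search body per slice, all bound to the same PIT."""
    if max_slices < 2:
        # Slicing is only meaningful with two or more slices.
        raise ValueError("max_slices must be at least 2")

    bodies = []
    for slice_id in range(max_slices):
        slice_spec: Dict[str, Any] = {"id": slice_id, "max": max_slices}
        if field is not None:
            # Assumed to be a numeric doc value field with evenly spread values.
            slice_spec["field"] = field
        bodies.append(
            {
                "slice": slice_spec,
                "pit": {"id": pit_id, "keep_alive": keep_alive},
                "query": {"match_all": {}},
            }
        )
    return bodies


# "customer_number" is a made-up field name used only for illustration.
slice_bodies = build_slice_bodies("<pit_id>", 3, field="customer_number")
```

Each body can then be sent as a separate search request, so the slices can be read independently of one another.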

Create a PIT

Create a PIT using the following example:

POST /my-index-1/_search/point_in_time?keep_alive=100m

{
  "pit_id": "o123QQEEeEeeeE5eEEeeEEEeEEEeEeEEEE45eee3E3EeaEEeeEE1EEEeEeeEEeEeeEEeEEEmEEE2EEE6eEe1eEEeEeeAAAAAAAAAAAIWcDVrM3ZIX0pRNS1XejE5YXRPRFhzUQEWc05ZTjdyS3dTcjJoVWlsYTVRYS0ydwAA",
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "creation_time": 1658146050064
}

Selecting parameter for pagination with PIT

For effective pagination, you must select a unique sort parameter. Without one, documents may be skipped during pagination. Select a tiebreaker field, or a set of tiebreaker fields, whose values vary enough that no two documents have the same sort ordering. OpenSearch discourages using _id as a tiebreaker because it requires fielddata, which is very cost-intensive. If no tiebreaker is available, we recommend that you index documents with an additional random integer field, which can later be used as a tiebreaker during search.
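To see why a tiebreaker matters, the following Python sketch simulates search_after pagination over a small in-memory list. It is only a model of the behavior, under the assumption that each page returns the documents whose sort key is strictly greater than the last key of the previous page; the documents, the ts field, and the tiebreak field are made up for illustration.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple

# Hypothetical documents; "tiebreak" plays the role of the random integer field.
DOCS: List[Dict[str, Any]] = [
    {"name": "a", "ts": 1, "tiebreak": 17},
    {"name": "b", "ts": 2, "tiebreak": 42},
    {"name": "c", "ts": 2, "tiebreak": 99},
    {"name": "d", "ts": 3, "tiebreak": 5},
]


def paginate(docs: List[Dict[str, Any]], sort_fields: Sequence[str], page_size: int) -> List[str]:
    """Simulate search_after paging and return the names seen, in order."""

    def key(doc: Dict[str, Any]) -> Tuple[Any, ...]:
        return tuple(doc[f] for f in sort_fields)

    ordered = sorted(docs, key=key)
    seen: List[str] = []
    after: Optional[Tuple[Any, ...]] = None
    while True:
        # Keep only documents that sort strictly after the previous page's last key.
        page = [d for d in ordered if after is None or key(d) > after][:page_size]
        if not page:
            break
        seen.extend(d["name"] for d in page)
        after = key(page[-1])
    return seen


without_tiebreaker = paginate(DOCS, ["ts"], page_size=2)
with_tiebreaker = paginate(DOCS, ["ts", "tiebreak"], page_size=2)
print(without_tiebreaker)  # ['a', 'b', 'd'] : 'c' shares ts=2 with 'b' and is skipped
print(with_tiebreaker)     # ['a', 'b', 'c', 'd']
```

In the first run the page boundary falls between two documents with the same timestamp, so the second of them is never returned. Adding the tiebreaker field to the sort makes every sort key unique and removes the gap.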

Pagination with PIT

Search using the PIT context id and the search_after parameter to retrieve the next page of results as shown in the following example:

GET /_search
{
  "size": 10000,
  "query": {
    "match": {
      "user.id": "elkbee"
    }
  },
  "pit": {
    "id": "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
    "keep_alive": "100m"
  },
  "sort": [
    { "@timestamp": { "order": "asc" } }
  ],
  "search_after": [
    "2021-05-20T05:30:04.832Z"
  ]
}
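The request above fetches a single page. A full pagination loop repeats it, each time passing the sort values of the last hit as search_after. The following Python sketch shows one way to structure that loop. It assumes a caller-supplied search function (hypothetical here) that sends a request body to GET /_search and returns the parsed JSON response, with hits under hits.hits and each hit carrying its sort values.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: takes a search body, returns the parsed response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_pit_hits(
    search: SearchFn,
    pit_id: str,
    query: Dict[str, Any],
    sort: List[Dict[str, Any]],
    page_size: int = 1000,
    keep_alive: str = "100m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit for the query by paging through a PIT with search_after."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "query": query,
            # Sending keep_alive on every page also extends the PIT lifetime.
            "pit": {"id": pit_id, "keep_alive": keep_alive},
            "sort": sort,
        }
        if search_after is not None:
            body["search_after"] = search_after

        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # An empty page means the result set is exhausted.
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        search_after = hits[-1]["sort"]
```

Passing the transport in as a function keeps the loop independent of any particular HTTP client or request-signing setup, and makes it easy to exercise with a stub.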

Extend PIT using search request

To extend the lifetime of a PIT, provide a keep_alive parameter in the "pit" object of a search request, as in the following example:

GET /_search
{
  "size": 10000,
  "query": {
    "match": {
      "user.id": "elkbee"
    }
  },
  "pit": {
    "id": "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
    "keep_alive": "100m"
  },
  "sort": [
    { "@timestamp": { "order": "asc" } }
  ],
  "search_after": [
    "2021-05-20T05:30:04.832Z"
  ]
}

List all PITs

List all PITs using the following example:

GET /_search/point_in_time/_all

{
  "pits": [
    {
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFnNOWU43ckt3U3IyaFVpbGE1UWEtMncAFjFyeXBsRGJmVFM2RTB6eVg1aVVqQncAAAAAAAAAAAEWcDVrM3ZIX0pRNS1XejE5YXRPRFhzUQEWc05ZTjdyS3dTcjJoVWlsYTVRYS0ydwAA",
      "creation_time": 1658146048666,
      "keep_alive": 6000000
    },
    {
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFnNOWU43ckt3U3IyaFVpbGE1UWEtMncAFjFyeXBsRGJmVFM2RTB6eVg1aVVqQncAAAAAAAAAAAIWcDVrM3ZIX0pRNS1XejE5YXRPRFhzUQEWc05ZTjdyS3dTcjJoVWlsYTVRYS0ydwAA",
      "creation_time": 1658146050064,
      "keep_alive": 6000000
    }
  ]
}

Delete a PIT

Delete one or more PITs using the following example:

DELETE /_search/point_in_time
{
  "pit_id": [
    "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAEWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA",
    "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAIWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA"
  ]
}

{
  "pits": [
    {
      "successful": true,
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAEWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA"
    },
    {
      "successful": false,
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAIWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA"
    }
  ]
}
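Because the delete response reports success per PIT, a caller may want to check which deletions did not go through. The following Python sketch, with a hypothetical helper named failed_pit_ids, picks those IDs out of a parsed response shaped like the one above.

```python
from typing import Any, Dict, List


def failed_pit_ids(delete_response: Dict[str, Any]) -> List[str]:
    """Return the IDs of PITs that the delete call reported as not deleted."""
    return [
        entry["pit_id"]
        for entry in delete_response.get("pits", [])
        if not entry.get("successful", False)
    ]


# Shortened, made-up IDs standing in for the real ones shown above.
response = {
    "pits": [
        {"successful": True, "pit_id": "pit-1"},
        {"successful": False, "pit_id": "pit-2"},
    ]
}
remaining = failed_pit_ids(response)
print(remaining)  # ['pit-2']
```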