Using PIT with Amazon OpenSearch Serverless
You can now use the PIT (Point in Time) plugin to run different queries against a
dataset that is fixed in time. Typically, if you run a query on an index multiple times, the
same query may return a different result due to data being continually indexed, updated, and
deleted. If you need to run a query against the same data, you can preserve that data's
state by creating a PIT. To learn more about Point in Time, see Point in Time.
Amazon OpenSearch Serverless supports the following PIT APIs:
- POST /<target_indexes>/_search/point_in_time
- GET /_search/point_in_time/_all
- DELETE /_search/point_in_time/_all
- DELETE /_search/point_in_time
All of these APIs are managed via aoss:ReadDocument data access control. For more information on supported APIs, see Supported operations in plugins in Amazon OpenSearch Serverless.
OpenSearch offers several pagination methods that can be coupled with Point in Time to perform deep pagination of search results. These include the from and size parameters of your search and the search_after parameter. To learn more about pagination, see Paginate results.
Point in Time (PIT) with search_after is the recommended pagination method in Amazon OpenSearch Serverless, especially for deep pagination. It bypasses the limitations of the other methods because it operates on a dataset that is frozen in time, it is not bound to a query, and it supports consistent pagination going forward.
You can also split a PIT search into multiple slices by using the slice.id and slice.max parameters if you want to jump from a page to a non-consecutive page. To learn more about search slicing, see Search slicing.
Slicing performs better when you specify a field parameter for the slice. The field must be a numeric doc value type such as short, integer, or long. This field is used to create slice buckets, so it should have at least as many unique values as there are slices, and those values should be uniformly distributed.
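As an illustration of the slicing parameters described above, the following Python sketch builds one search body per slice. It only constructs the JSON payloads and does not send them. The function name build_sliced_bodies and the field name bucket are hypothetical, and the check that slice.max is greater than 1 is an assumption about how the service validates the request.

```python
from typing import Any, Dict, List, Optional


def build_sliced_bodies(
    pit_id: str,
    max_slices: int,
    query: Dict[str, Any],
    slice_field: Optional[str] = None,
    keep_alive: str = "100m",
) -> List[Dict[str, Any]]:
    """Return one search body per slice of the same PIT (hypothetical helper)."""
    # Assumption: the service rejects slice.max values below 2.
    if max_slices < 2:
        raise ValueError("max_slices must be at least 2")

    bodies = []
    for slice_id in range(max_slices):
        slice_spec: Dict[str, Any] = {"id": slice_id, "max": max_slices}
        if slice_field is not None:
            # Should be a numeric doc value field with evenly spread values.
            slice_spec["field"] = slice_field
        bodies.append(
            {
                "query": query,
                "pit": {"id": pit_id, "keep_alive": keep_alive},
                "slice": slice_spec,
            }
        )
    return bodies
```

Each returned body can be sent as a separate search request, so the slices can be processed independently or in parallel.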
Create a PIT
Create a PIT using the following example:
POST /my-index-1/_search/point_in_time?keep_alive=100m

Sample response:

{
  "pit_id": "o123QQEEeEeeeE5eEEeeEEEeEEEeEeEEEE45eee3E3EeaEEeeEE1EEEeEeeEEeEeeEEeEEEmEEE2EEE6eEe1eEEeEeeAAAAAAAAAAAIWcDVrM3ZIX0pRNS1XejE5YXRPRFhzUQEWc05ZTjdyS3dTcjJoVWlsYTVRYS0ydwAA",
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "creation_time": 1658146050064
}
Selecting a sort parameter for pagination with PIT
For effective pagination, you must select a unique sort parameter. Without a unique sort parameter, documents may be skipped during pagination. You must select a tiebreaker field, or a set of tiebreaker fields, that is distinct enough to ensure that no two documents have the same sort ordering.
OpenSearch discourages the use of _id for tiebreakers because it requires fielddata, which is very cost-intensive. If no tiebreaker is available, it is recommended that you index documents with an additional random integer field, which can later be used as a tiebreaker during search.
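The tiebreaker recommendation above can be sketched in Python as follows. The field name tiebreaker and both helper names are hypothetical. The 31-bit range is an assumption chosen so that the value fits an integer field. Random values can still collide occasionally, so this reduces the chance of identical sort keys rather than ruling it out.

```python
import random
from typing import Any, Dict, List


def with_tiebreaker(doc: Dict[str, Any], field: str = "tiebreaker") -> Dict[str, Any]:
    """Return a copy of doc with a random integer field added before indexing."""
    enriched = dict(doc)
    # Assumption: a non-negative 31-bit value, which fits an integer mapping.
    enriched[field] = random.getrandbits(31)
    return enriched


def sort_with_tiebreaker(
    primary_field: str, tiebreaker_field: str = "tiebreaker", order: str = "asc"
) -> List[Dict[str, Any]]:
    """Build a sort clause that orders by the primary field, then the tiebreaker."""
    return [
        {primary_field: {"order": order}},
        {tiebreaker_field: {"order": order}},
    ]
```

With two sort keys, each hit carries two sort values, and both must be passed to search_after when requesting the next page.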
Pagination with PIT
Search using the PIT context ID and the search_after parameter to retrieve the next page of results, as shown in the following example:
GET /_search
{
  "size": 10000,
  "query": {
    "match": { "user.id": "elkbee" }
  },
  "pit": {
    "id": "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
    "keep_alive": "100m"
  },
  "sort": [
    { "@timestamp": { "order": "asc" } }
  ],
  "search_after": [
    "2021-05-20T05:30:04.832Z"
  ]
}
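The request above retrieves a single page. A minimal Python sketch of the full loop follows, assuming the response has the usual hits.hits array and that each hit carries a sort array. The search argument is a hypothetical callable that sends the body to GET /_search with whatever signed HTTP client you use and returns the parsed JSON response. It is not part of any library.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: takes a search body, returns the parsed JSON response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iterate_hits(
    search: SearchFn,
    pit_id: str,
    query: Dict[str, Any],
    sort: List[Dict[str, Any]],
    page_size: int = 1000,
    keep_alive: str = "100m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit for the query by paging through a PIT with search_after."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "query": query,
            "pit": {"id": pit_id, "keep_alive": keep_alive},
            "sort": sort,
        }
        if search_after is not None:
            body["search_after"] = search_after

        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # An empty page means the result set is exhausted.
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        search_after = hits[-1]["sort"]
```

Passing the transport in as an argument keeps the paging logic independent of authentication and HTTP details, and makes it easy to exercise with a stub.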
Extend PIT using search request
To extend a point in time search, provide a keep_alive parameter in the "pit" object of the search request, as shown in the following example:
GET /_search
{
  "size": 10000,
  "query": {
    "match": { "user.id": "elkbee" }
  },
  "pit": {
    "id": "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
    "keep_alive": "100m"
  },
  "sort": [
    { "@timestamp": { "order": "asc" } }
  ],
  "search_after": [
    "2021-05-20T05:30:04.832Z"
  ]
}
List all PITs
To list all PITs, see the following example:
GET /_search/point_in_time/_all

Sample response:

{
  "pits": [
    {
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFnNOWU43ckt3U3IyaFVpbGE1UWEtMncAFjFyeXBsRGJmVFM2RTB6eVg1aVVqQncAAAAAAAAAAAEWcDVrM3ZIX0pRNS1XejE5YXRPRFhzUQEWc05ZTjdyS3dTcjJoVWlsYTVRYS0ydwAA",
      "creation_time": 1658146048666,
      "keep_alive": 6000000
    },
    {
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFnNOWU43ckt3U3IyaFVpbGE1UWEtMncAFjFyeXBsRGJmVFM2RTB6eVg1aVVqQncAAAAAAAAAAAIWcDVrM3ZIX0pRNS1XejE5YXRPRFhzUQEWc05ZTjdyS3dTcjJoVWlsYTVRYS0ydwAA",
      "creation_time": 1658146050064,
      "keep_alive": 6000000
    }
  ]
}
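The creation_time and keep_alive values in the list response are raw numbers. The following Python sketch converts them into readable values, assuming that creation_time is milliseconds since the Unix epoch and that keep_alive is a duration in milliseconds (6000000 ms would then match the 100m used at creation). The helper name describe_pits is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def describe_pits(list_response: Dict[str, Any]) -> List[Tuple[str, datetime, timedelta]]:
    """Return (pit_id, creation time in UTC, keep-alive duration) for each PIT."""
    described = []
    for pit in list_response.get("pits", []):
        # Assumption: both fields are expressed in milliseconds.
        created = EPOCH + timedelta(milliseconds=pit["creation_time"])
        keep_alive = timedelta(milliseconds=pit["keep_alive"])
        described.append((pit["pit_id"], created, keep_alive))
    return described
```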
Delete a PIT
To delete a PIT, see the following example:
DELETE /_search/point_in_time
{
  "pit_id": [
    "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAEWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA",
    "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAIWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA"
  ]
}

Sample response:

{
  "pits": [
    {
      "successful": true,
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAEWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA"
    },
    {
      "successful": false,
      "pit_id": "o463QQEPbXktaW5kZXgtMDAwMDAxFkhGN09fMVlPUkVPLXh6MUExZ1hpaEEAFjBGbmVEZHdGU1EtaFhhUFc4ZkR5cWcAAAAAAAAAAAIWaXBPNVJtZEhTZDZXTWFFR05waXdWZwEWSEY3T18xWU9SRU8teHoxQTFnWGloQQAA"
    }
  ]
}
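Because the delete response reports success per PIT, a caller may want to collect the IDs that were not deleted. The following Python sketch does that for a parsed response shaped like the one above. The helper name failed_pit_ids is hypothetical, and treating a missing successful flag as a failure is an assumption made for safety.

```python
from typing import Any, Dict, List


def failed_pit_ids(delete_response: Dict[str, Any]) -> List[str]:
    """Return the IDs of PITs that the delete response did not mark as successful."""
    return [
        pit["pit_id"]
        for pit in delete_response.get("pits", [])
        # Assumption: an absent "successful" flag is treated as a failure.
        if not pit.get("successful", False)
    ]
```

The returned IDs can be logged or passed to another delete request.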