

# Vector search
<a name="vector-search"></a>

Vector search extends the functionality of MemoryDB and can be used in conjunction with existing MemoryDB functionality. Applications that do not use vector search are unaffected by its presence. Vector search is available in all AWS Regions where MemoryDB is available.

Vector search for MemoryDB simplifies your application architecture while delivering high-speed vector search. It is ideal for use cases where peak performance and scale are the most important selection criteria. You can use your existing MemoryDB data, or a Valkey or Redis OSS API, to build machine learning and generative AI use cases. These include retrieval-augmented generation, anomaly detection, document retrieval, and real-time recommendations.

As of June 26, 2024, AWS MemoryDB delivers the fastest vector search performance at the highest recall rates among popular vector databases on AWS.

**Topics**
+ [Vector search overview](vector-search-overview.md)
+ [Use cases](vector-search-examples.md)
+ [Vector search features and limits](vector-search-limits.md)
+ [Create a cluster enabled for vector search](vector-search-cluster.md)
+ [Vector search commands](vector-search-commands.md)

# Vector search overview
<a name="vector-search-overview"></a>

Vector search is built on the creation, maintenance, and use of indexes. Each vector search operation targets a single index and is confined to that index; operations on one index are unaffected by operations on any other index. Except for the operations that create and destroy indexes, any number of operations may be issued against any index at any time, meaning that at the cluster level, multiple operations against multiple indexes may be in progress simultaneously.

Individual indexes are named objects that exist in a unique namespace, separate from the other Valkey and Redis OSS namespaces (keys, functions, and so on). Each index is conceptually similar to a conventional database table in that it's structured in two dimensions: columns and rows. Each row in the table corresponds to a key. Each column in the index corresponds to a member or portion of that key. Within this document the terms key, row, and record are used interchangeably. Similarly, the terms column, field, path, and member are essentially identical and are also used interchangeably.

There are no special commands to add, delete, or modify indexed data. Rather, the existing **HASH** or **JSON** commands that modify a key that is in an index also automatically update the index.
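
For example (the key, field, and value names here are hypothetical), an ordinary `HSET` on a key covered by an index both writes the key and updates the index; no separate indexing command is needed:

```
HSET product:1001 description "wireless earbuds" price 79.99 category "electronics"
```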

**Topics**
+ [Indexes and the Valkey and Redis OSS keyspace](#vector-search-indexes-keyspaces)
+ [Index field types](#vector-search-index-field-types)
+ [Vector index algorithms](#vector-search-index-algorithms)
+ [Vector search query expression](#vector-search-query-expression)
+ [INFO command](#vector-search-ft.info)
+ [Vector search security](#vector-search-security)

## Indexes and the Valkey and Redis OSS keyspace
<a name="vector-search-indexes-keyspaces"></a>

Indexes are constructed and maintained over a subset of the Valkey and Redis OSS keyspace. Multiple indexes may choose disjoint or overlapping subsets of the keyspace without limitation. The keyspace for each index is defined by a list of key prefixes that are provided when the index is created. The list of prefixes is optional and if omitted, the entire keyspace will be part of that index. Indexes are also typed in that they only cover keys that have a matching type. Currently, only JSON and HASH indexes are supported. A HASH index only indexes HASH keys covered by its prefix list and similarly a JSON index only indexes JSON keys that are covered by its prefix list. Keys within an index’s keyspace prefix list that do not have the designated type are ignored and do not affect search operations.
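
As an illustrative sketch (the index, prefix, and field names are hypothetical), the following creates a HASH index limited to keys beginning with `product:`; HASH keys outside that prefix, and non-HASH keys within it, are not indexed:

```
FT.CREATE product-idx ON HASH PREFIX 1 product: SCHEMA price NUMERIC category TAG
```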

When a HASH or JSON command modifies a key that is within the keyspace of an index, that index is updated. This process involves extracting the declared fields for each index and updating the index with the new value. The update process is done in a background thread, meaning that the indexes are only eventually consistent with their keyspace contents. Thus an insert or update of a key will not be visible in search results for a short period of time. During periods of heavy system load or heavy mutation of data, the visibility delay can become longer.

The creation of an index is a multi-step process. The first step is to execute the [FT.CREATE](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-commands-ft.create.html) command, which defines the index. Successful execution of a create automatically initiates the second step: backfilling. The backfill process runs in a background thread and scans the keyspace for keys that are within the new index's prefix list. Each key that is found is added to the index. Eventually the entire keyspace is scanned, completing the index creation process. Note that while the backfill process is running, mutations of indexed keys are permitted without restriction; the index backfill process will not complete until all keys are properly indexed. Query operations attempted while an index is undergoing backfill are not allowed and are terminated with an error. The completion of the backfilling process can be determined from the output of the `FT.INFO` command for that index (`backfill_status`).
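
A sketch of the two steps (the index and field names are hypothetical); `FT.INFO` can be polled after the `FT.CREATE` until its backfill-status output indicates completion:

```
FT.CREATE doc-idx ON HASH PREFIX 1 doc: SCHEMA embedding VECTOR HNSW 6 TYPE FLOAT32 DIM 1536 DISTANCE_METRIC COSINE
FT.INFO doc-idx
```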

## Index field types
<a name="vector-search-index-field-types"></a>

Each field (column) of an index has a specific type, declared when the index is created, and a location within a key. For HASH keys the location is the field name within the HASH. For JSON keys the location is a JSON path description. When a key is modified, the data associated with the declared fields is extracted, converted to the declared type, and stored in the index. If the data is missing or cannot be successfully converted to the declared type, then that field is omitted from the index. There are four types of fields:
+ **Number fields** contain a single number. For JSON fields, the numeric rules of JSON numbers must be followed. For HASH, the field is expected to contain the ASCII text of a number written in the standard format for fixed or floating point numbers. Regardless of the representation within the key, this field is converted to a 64-bit floating point number for storage within the index. Number fields can be used with the range search operator. Because the underlying numbers are stored in floating point, with its precision limitations, the usual rules about numeric comparisons for floating-point numbers apply.
+ **Tag fields** contain zero or more tag values coded as a single UTF-8 string. The string is parsed into tag values using a separator character (default is a comma but can be overridden) with leading and trailing white space removed. Any number of tag values can be contained in a single tag field. Tag fields can be used to filter queries for tag value equivalence with either case-sensitive or case-insensitive comparison.
+ **Text fields** contain a blob of bytes which need not be UTF-8 compliant. Text fields can be used to decorate query results with application-meaningful values. For example a URL or the contents of a document, etc.
+ **Vector fields** contain a vector of numbers also known as an embedding. Vector fields support K-nearest neighbor searching (KNN) of fixed sized vectors using a specified algorithm and distance metric. For HASH indexes, the field should contain the entire vector encoded in binary format (*little-endian IEEE 754*). For JSON keys the path should reference an array of the correct size filled with numbers. Note that when a JSON array is used as a vector field, the internal representation of the array within the JSON key is converted into the format required by the selected algorithm, reducing memory consumption and precision. Subsequent read operations using the JSON commands will yield the reduced precision value.
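
For a HASH vector field declared as `TYPE FLOAT32`, the binary value can be produced with any little-endian IEEE 754 packer. A minimal Python sketch (the function name is illustrative; the key and field names would come from your application):

```python
import struct

def encode_vector(values):
    # Pack each component as a little-endian ("<") 32-bit IEEE 754 float ("f").
    # This is the binary layout expected for a FLOAT32 HASH vector field.
    return struct.pack(f"<{len(values)}f", *values)

# A 3-dimensional vector occupies 3 * 4 = 12 bytes.
blob = encode_vector([0.12, 0.34, 0.56])
assert len(blob) == 12
```

The resulting bytes are what you would store in the HASH field (for example with `HSET`) or pass as a query parameter.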

## Vector index algorithms
<a name="vector-search-index-algorithms"></a>

Two vector index algorithms are provided:
+ **Flat** – The Flat algorithm is a brute force linear processing of each vector in the index, yielding exact answers within the bounds of the precision of the distance computations. Because of the linear processing of the index, run times for this algorithm can be very high for large indexes.
+ **HNSW (Hierarchical Navigable Small Worlds)** – The HNSW algorithm is an alternative that provides an approximation of the correct answer in exchange for substantially lower execution times. The algorithm is controlled by three parameters `M`, `EF_CONSTRUCTION` and `EF_RUNTIME`. The first two parameters are specified at index creation time and cannot be changed. The `EF_RUNTIME` parameter has a default value that is specified at index creation, but can be overridden on any individual query operation afterward. These three parameters interact to balance memory and CPU consumption during ingestion and query operations as well as control the quality of the approximation of an exact KNN search (known as recall ratio).
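
These parameters appear in the vector field declaration of `FT.CREATE`. A hypothetical sketch (the index and field names, and the parameter values, are illustrative only):

```
FT.CREATE vec-idx ON HASH PREFIX 1 doc: SCHEMA embedding VECTOR HNSW 12 TYPE FLOAT32 DIM 1536 DISTANCE_METRIC COSINE M 16 EF_CONSTRUCTION 200 EF_RUNTIME 100
```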

Both vector search algorithms (Flat and HNSW) support an optional `INITIAL_CAP` parameter. When specified, this parameter pre-allocates memory for the indexes, resulting in reduced memory management overhead and increased vector ingestion rates.

Vector search algorithms like HNSW may not efficiently handle deleting or overwriting of previously inserted vectors. Use of these operations can result in excess index memory consumption and/or degraded recall quality. Reindexing is one method for restoring optimal memory usage and/or recall. 

## Vector search query expression
<a name="vector-search-query-expression"></a>

The [FT.SEARCH](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-commands-ft.search.html) and [FT.AGGREGATE](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-commands-ft.aggregate.html) commands require a query expression. This expression is a single string parameter which is composed of one or more operators. Each operator uses one field in the index to identify a subset of the keys in the index. Multiple operators may be combined using boolean combiners as well as parentheses to further enhance or restrict the collected set of keys (or resultset).

### Wildcard
<a name="vector-search-query-expression-wildcard"></a>

The wildcard operator, the asterisk ('*'), matches all keys in the index. 

### Numeric range
<a name="vector-search-query-expression-numeric-range"></a>

The numeric range operator has the following syntax:

```
<range-search> ::= '@' <numeric-field-name> ':' '[' <bound> <bound> ']'
<bound>  ::= <number> | '(' <number>
<number> ::= <integer> | <fixed-point> | <floating-point> | 'Inf' | '-Inf' | '+Inf'
```

The `<numeric-field-name>` must be a declared field of type `NUMERIC`. By default a bound is inclusive, but a leading open parenthesis ('(') can be used to make a bound exclusive. A range search can be converted into a single relational comparison (<, <=, >, >=) by using `Inf`, `+Inf`, or `-Inf` as one of the bounds. Regardless of the numeric format specified (integer, fixed point, floating point, infinity), the number is converted to a 64-bit floating-point value to perform comparisons, reducing precision accordingly.

**Examples**  

```
@numeric-field:[0 10]                      // 0   <= <value> <= 10
@numeric-field:[(0 10]                     // 0   <  <value> <= 10
@numeric-field:[0 (10]                     // 0   <= <value> <  10
@numeric-field:[(0 (10]                    // 0   <  <value> <  10
@numeric-field:[1.5 (Inf]                  // 1.5 <= value
```

### Tag compare
<a name="vector-search-query-expression-tag-compare"></a>

The tag compare operator has the following syntax:

```
<tag-search> ::= '@' <tag-field-name> ':' '{' <tag> [ '|' <tag> ]* '}'
```

If any of the tags in the operator match any of the tags in the tag field of the record, then the record is included in the resultset. The field designated by the `<tag-field-name>` must be a field of the index declared with type `TAG`. Examples of a tag compare are:

```
@tag-field:{ atag }
@tag-field: { tag1 | tag2 }
```

### Boolean combinations
<a name="vector-search-query-expression-boolean-combinations"></a>

The result sets of a numeric or tag operator can be combined using boolean logic: and/or. Parentheses can be used to group operators and/or change the evaluation order. The syntax of boolean logic operators is:

```
<expression> ::= <phrase> | <phrase> '|' <expression> | '(' <expression> ')'
<phrase> ::= <term> | <term> <phrase>
<term> ::= <range-search> | <tag-search> | '*'
```

Multiple terms combined into a phrase are "and"-ed. Multiple phrases combined with the pipe ('|') are "or"-ed.
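
Examples of boolean combinations using the operators above (the field names are hypothetical):

```
@price:[10 100] @category:{ electronics }            // price in range AND tag matches
@price:[10 100] | @category:{ clearance }            // price in range OR tag matches
(@price:[10 100] | @price:[500 +Inf]) @category:{ electronics }
```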

### Vector search
<a name="vector-search-query-expression-vector-search"></a>

Vector indexes support two different searching methods: nearest neighbor and range. A nearest neighbor search locates a number, K, of the vectors in the index that are closest to the provided (reference) vector — this is colloquially called KNN for ‘K’ nearest neighbors. The syntax for a KNN search is:

```
<vector-knn-search> ::= <expression> '=>[KNN' <k> '@' <vector-field-name> '$' <parameter-name> <modifiers> ']'
<modifiers> ::= [ 'EF_RUNTIME' <integer> ] [ 'AS' <distance-field-name>]
```

A vector KNN search is only applied to the vectors that satisfy the `<expression>` which can be any combination of the operators defined above: wildcard, range search, tag search and/or boolean combinations thereof.
+ `<k>` is an integer specifying the number of nearest neighbor vectors to be returned.
+ `<vector-field-name>` must specify a declared field of type `VECTOR`.
+ `<parameter-name>` specifies one of the entries in the `PARAM` table of the `FT.SEARCH` or `FT.AGGREGATE` command. This parameter is the reference vector value for distance computations. The value of the vector is encoded into the `PARAM` value in *little-endian IEEE 754* binary format (the same encoding as for a HASH vector field).
+ For vector indexes of type HNSW, the optional `EF_RUNTIME` clause can be used to override the default value of the `EF_RUNTIME` parameter that was established when the index was created.
+ The optional `<distance-field-name>` provides a field name for the resultset to contain the computed distance between the reference vector and the located key.
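
An illustrative KNN query (the index, field, and parameter names are hypothetical; the binary vector value is elided):

```
FT.SEARCH doc-idx "*=>[KNN 10 @embedding $query_vec EF_RUNTIME 250 AS score]" PARAMS 2 query_vec "<binary vector bytes>"
```

The leading `*` applies the KNN search to every vector in the index; replacing it with a tag or numeric expression pre-filters the candidate set.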

A range search locates all vectors within a specified distance (radius) from a reference vector. The syntax for a range search is:

```
<vector-range-search> ::= '@' <vector-field-name> ':' '[' 'VECTOR_RANGE' ( <radius> | '$' <radius-parameter> ) '$' <reference-vector-parameter> ']' [ '=>' '{' <modifiers> '}' ]
<modifiers> ::= <modifier> | <modifiers> ',' <modifier>
<modifier> ::= [ '$yield_distance_as' ':' <distance-field-name> ] [ '$epsilon' ':' <epsilon-value> ]
```

Where:
+ `<vector-field-name>` is the name of the vector field to be searched.
+ `<radius>` or `$<radius-parameter>` is the numerical distance limit for the search.
+ `$<reference-vector-parameter>` is the name of the parameter that contains the reference vector. The value of the vector is encoded into the `PARAM` value in *little-endian IEEE 754* binary format (the same encoding as for a HASH vector field).
+ The optional `<distance-field-name>` provides a field name for the resultset to contain the computed distance between the reference vector and each key.
+ The optional `<epsilon-value>` controls the boundary of the search operation; vectors within the distance `<radius> * (1.0 + <epsilon-value>)` are traversed looking for candidate results. The default is 0.01.

## INFO command
<a name="vector-search-ft.info"></a>

Vector search augments the Valkey and Redis OSS [INFO](https://valkey.io/commands/info/) command with several additional sections of statistics and counters. A request to retrieve the section `SEARCH` will retrieve all of the following sections:

### `search_memory` section
<a name="vector-search-ft.info-search-memory"></a>


| Name | Description | 
| --- | --- | 
| search\_used\_memory\_bytes | Number of bytes of memory consumed by all search data structures | 
| search\_used\_memory\_human | Human-readable version of the above | 

### `search_index_stats` section
<a name="vector-search-ft.info-search_index_stats"></a>


| Name | Description | 
| --- | --- | 
| search\_number\_of\_indexes | Number of created indexes | 
| search\_num\_fulltext\_indexes | Number of non-vector fields in all indexes | 
| search\_num\_vector\_indexes | Number of vector fields in all indexes | 
| search\_num\_hash\_indexes | Number of indexes on HASH type keys | 
| search\_num\_json\_indexes | Number of indexes on JSON type keys | 
| search\_total\_indexed\_keys | Total number of keys in all indexes | 
| search\_total\_indexed\_vectors | Total number of vectors in all indexes | 
| search\_total\_indexed\_hash\_keys | Total number of keys of type HASH in all indexes | 
| search\_total\_indexed\_json\_keys | Total number of keys of type JSON in all indexes | 
| search\_total\_index\_size | Bytes used by all indexes | 
| search\_total\_fulltext\_index\_size | Bytes used by non-vector index structures | 
| search\_total\_vector\_index\_size | Bytes used by vector index structures | 
| search\_max\_index\_lag\_ms | Ingestion delay during the last ingestion batch update | 

### `search_ingestion` section
<a name="vector-search-ft.info-search_ingestion"></a>


| Name | Description | 
| --- | --- | 
| search\_background\_indexing\_status | Status of ingestion. NO\_ACTIVITY means idle. Other values indicate that there are keys in the process of being ingested. | 
| search\_ingestion\_paused | Except while restarting, this should always be "no". | 

### `search_backfill` section
<a name="vector-search-ft.info-search_backfill"></a>

**Note**  
Some of the fields documented in this section are only visible when a backfill is currently in progress.


| Name | Description | 
| --- | --- | 
| search\_num\_active\_backfills | Number of current backfill activities | 
| search\_backfills\_paused | Except when out of memory, this should always be "no". | 
| search\_current\_backfill\_progress\_percentage | Percent completion (0-100) of the current backfill | 

### `search_query` section
<a name="vector-search-ft.info-search_query"></a>


| Name | Description | 
| --- | --- | 
| search\_num\_active\_queries | Number of FT.SEARCH and FT.AGGREGATE commands currently in progress | 

## Vector search security
<a name="vector-search-security"></a>

[ACL (Access Control Lists)](https://valkey.io/topics/acl/) security mechanisms for both command and data access are extended to control the search facility. ACL control of individual search commands is fully supported. A new ACL category, `@search`, is provided and many of the existing categories (`@fast`, `@read`, `@write`, etc.) are updated to include the new commands. Search commands do not modify key data, meaning that the existing ACL machinery for write access is preserved. The access rules for HASH and JSON operations are not modified by the presence of an index; normal key-level access control is still applied to those commands.

Search commands with an index also have their access controlled through ACL. Access checks are performed at the whole-index level, not at the per-key level. This means that access to an index is granted to a user only if that user has permission to access all possible keys within the keyspace prefix list of that index. In other words, the actual contents of an index don’t control the access. Rather, it is the theoretical contents of an index as defined by the prefix list which is used for the security check. It can be easy to create a situation where a user has read and/or write access to a key but is unable to access an index containing that key. Note that only read access to the keyspace is required to create or use an index – the presence or absence of write access is not considered. 

For more information on using ACLs with MemoryDB see [Authenticating users with Access Control Lists (ACLs)](https://docs.aws.amazon.com/memorydb/latest/devguide/clusters.acls.html).

# Use cases
<a name="vector-search-examples"></a>

The following are common use cases for vector search.

## Retrieval Augmented Generation (RAG)
<a name="vector-search-examples-retrieval-augmented-generation"></a>

Retrieval Augmented Generation (RAG) leverages vector search to retrieve relevant passages from a large corpus of data to augment a large language model (LLM). Specifically, an encoder embeds the input context and search query into vectors, then uses approximate nearest neighbor search to find semantically similar passages. These retrieved passages are concatenated with the original context to provide additional relevant information to the LLM to return a more accurate response to the user.

![\[Graphic of Retrieval Augmented Generation flow\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/RAG.png)


## Durable Semantic Cache
<a name="vector-search-examples-durable-semantic-cache"></a>

Semantic caching is a process to reduce computational costs by storing previous results from a foundation model (FM). By reusing results from prior inferences instead of recomputing them, semantic caching reduces the amount of computation required during FM inference. MemoryDB enables durable semantic caching, which avoids data loss of your past inferences. This allows your generative AI applications to respond within single-digit milliseconds with answers from prior semantically similar questions, while reducing cost by avoiding unnecessary LLM inferences.

![\[Workflow diagram showing Foundation Model process.\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/FM.png)

+ **Semantic search hit** – If a customer’s query is semantically similar, based on a defined similarity score, to a prior question, the FM buffer memory (MemoryDB) returns the answer to the prior question in step 4 and does not call the FM in step 3. This avoids the foundation model (FM) latency and costs, providing a faster experience for the customer.
+ **Semantic search miss** – If a customer’s query is not semantically similar, based on a defined similarity score, to a prior query, the FM is called to deliver a response to the customer in step 3a. The response generated by the FM is then stored as a vector in MemoryDB for future queries (step 3b) to minimize FM costs on semantically similar questions. In this flow, step 4 is not invoked because there was no semantically similar question for the original query.

## Fraud detection
<a name="vector-search-examples-fraud-detection"></a>

Fraud detection, a form of anomaly detection, represents valid transactions as vectors and compares the vector representations of net-new transactions against them. Fraud is detected when these net-new transactions have a low similarity to the vectors representing the valid transactional data. This allows fraud to be detected by modeling normal behavior, rather than trying to predict every possible instance of fraud. MemoryDB allows organizations to do this in periods of high throughput, with minimal false positives and single-digit millisecond latency.

![\[Workflow diagram showing Fraud Detection process.\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/fraud-detection.png)


## Other use cases
<a name="vector-search-engines"></a>
+ **Recommendation engines** can find similar products or content for users by representing items as vectors. The vectors are created by analyzing attributes and patterns. Based on user patterns and attributes, new unseen items can be recommended by finding the vectors most similar to items the user has already rated positively.
+ **Document search engines** represent text documents as dense vectors of numbers, capturing semantic meaning. At search time, the engine converts a search query to a vector and finds documents with the most similar vectors to the query using approximate nearest neighbor search. This vector similarity approach allows matching documents based on meaning rather than just matching keywords.

# Vector search features and limits
<a name="vector-search-limits"></a>

## Vector search availability
<a name="vector-search-availability"></a>

Vector search-enabled MemoryDB configuration is supported on R6g, R7g, and T4g node types and is available in all AWS Regions where MemoryDB is available. 

Existing clusters cannot be modified to enable search. However, search-enabled clusters can be created from snapshots of clusters with search disabled.

## Parametric restrictions
<a name="parameter-restrictions"></a>

The following table shows limits for various vector search items:


| Item | Maximum value | 
| --- | --- | 
| Number of dimensions in a vector | 32768 | 
| Number of indexes that can be created | 10 | 
| Number of fields in an index | 50 | 
| FT.SEARCH and FT.AGGREGATE TIMEOUT clause (milliseconds) | 10000 | 
| Number of pipeline stages in FT.AGGREGATE command | 32 | 
| Number of fields in FT.AGGREGATE LOAD clause | 1024 | 
| Number of fields in FT.AGGREGATE GROUPBY clause | 16 | 
| Number of fields in FT.AGGREGATE SORTBY clause | 16 | 
| Number of parameters in FT.AGGREGATE PARAM clause | 32 | 
| HNSW M parameter | 512 | 
| HNSW EF\_CONSTRUCTION parameter | 4096 | 
| HNSW EF\_RUNTIME parameter | 4096 | 

## Scaling limits
<a name="scaling-restrictions"></a>

Vector search for MemoryDB is currently limited to a single shard and horizontal scaling is not supported. Vector search supports vertical and replica scaling.

## Operational restrictions
<a name="operational-restrictions"></a>

**Index Persistence and Backfilling**

The vector search feature persists both the definition of indexes and the content of the indexes. This means that during any operational request or event that causes a node to start or restart, the index definition and content are restored from the latest snapshot, and any pending transactions are read from the multi-AZ transaction log. No user action is required to initiate this. The rebuild is performed as a backfill operation as soon as data is restored. This is functionally equivalent to the system automatically executing an [FT.CREATE](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-commands-ft.create.html) command for each defined index. Note that the node becomes available for application operations as soon as the data is restored, but likely before index backfill has completed, meaning that backfills will again become visible to applications; for example, search commands using backfilling indexes may be rejected. For more information on backfilling, see [Indexes and the Valkey and Redis OSS keyspace](vector-search-overview.md#vector-search-indexes-keyspaces).

The completion of index backfill is not synchronized between a primary and a replica. This lack of synchronization can unexpectedly become visible to applications and thus it is recommended that applications verify backfill completion on primaries and all replicas before initiating search operations.

## Snapshot import/export and Live Migration
<a name="snapshot-restrictions"></a>

The presence of search indexes in an RDB file limits the transportability of that data. The format of the vector indexes defined by the MemoryDB vector search functionality is only understood by another vector search-enabled MemoryDB cluster. Also, RDB files from the preview clusters can be imported by the GA version of MemoryDB clusters, which will rebuild the index content on loading the RDB file.

However, RDB files that do not contain indexes are not restricted in this fashion. Thus data within a preview cluster can be exported to non-preview clusters by deleting the indexes prior to the export.

## Memory consumption
<a name="memory-consumption"></a>

Memory consumption is based on the number of vectors, the number of dimensions, the M-value, and the amount of non-vector data, such as metadata associated with the vectors or other data stored within the instance.

The total memory required is a combination of the space needed for the actual vector data and the space required for the vector indexes. The space required for vector data is calculated by measuring the actual capacity required for storing vectors within HASH or JSON data structures, plus the overhead of rounding up to the nearest memory slabs for optimal memory allocation. Each of the vector indexes uses references to the vector data stored in these data structures, and uses efficient memory optimizations to remove any duplicate copies of the vector data in the index.

The number of vectors depends on how you decide to represent your data as vectors. For instance, you can choose to split a single document into several chunks, where each chunk is represented by a vector. Alternatively, you could represent the whole document as a single vector.

The number of dimensions of your vectors is determined by the embedding model you choose. For instance, if you choose the [Amazon Titan](https://aws.amazon.com/bedrock/titan/) embedding model, the number of dimensions would be 1536.

The M parameter represents the number of bi-directional links created for every new element during index construction. MemoryDB defaults this value to 16; however, you can override it. A higher M parameter works better for high dimensionality and/or high recall requirements, while a lower M parameter works better for low dimensionality and/or low recall requirements. A higher M value also increases memory consumption as the index grows.

Within the console experience, MemoryDB offers an easy way to choose the right instance type based on the characteristics of your vector workload after checking **Enable vector search** under the cluster settings.

![\[Vector search cluster settings in the AWS console.\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/vector-search-cluster-settings-console.png)




**Sample workload**

A customer wants to build a semantic search engine on top of their internal financial documents. They currently hold 1M financial documents that are chunked into 10 vectors per document using the Titan embedding model with 1536 dimensions, and have no non-vector data. The customer decides to use the default M parameter of 16.
+ Vectors: 1M documents * 10 chunks = 10M vectors
+ Dimensions: 1536
+ Non-vector data (GB): 0 GB
+ M parameter: 16
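
A rough back-of-envelope sketch of this kind of estimate (the per-link byte width is an assumption, not a documented value; actual MemoryDB consumption, 104.9 GB for this sample workload, includes additional per-key and allocator overhead):

```python
def estimate_vector_memory_gb(num_vectors, dims, m=16,
                              bytes_per_dim=4, bytes_per_link=8):
    """Rough lower-bound estimate for HNSW vector data plus graph links."""
    # Raw vector storage: one 32-bit float (FLOAT32) per dimension.
    data_bytes = num_vectors * dims * bytes_per_dim
    # Approximate HNSW graph overhead: ~M bi-directional links per vector
    # at the base layer (the 8-byte link width is an assumption).
    graph_bytes = num_vectors * m * 2 * bytes_per_link
    return (data_bytes + graph_bytes) / 1e9

# Sample workload: 1M documents x 10 chunks = 10M vectors of 1536 dimensions.
print(round(estimate_vector_memory_gb(10_000_000, 1536), 1))
```

Because this counts only vector data and base-layer links, it understates the true footprint; use the console's vector calculator for sizing decisions.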

With this data, the customer can click the Use vector calculator button within the console to get a recommended instance type based on their parameters:

![\[The vector calculator recommended node type, based on the input to the calculator.\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/vector-calc1.png)


![\[The vector calculator with values entered.\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/vector-calc2.png)


In this example, the vector calculator will look for the smallest [MemoryDB r7g node type](https://aws.amazon.com/memorydb/pricing/) that can hold the memory required to store the vectors based on the parameters provided. Note that this is an approximation, and you should test the instance type to make sure it fits your requirements.



Based on the above calculation method and the parameters in the sample workload, this vector data would require 104.9 GB to store the data and a single index. In this case, the `db.r7g.4xlarge` instance type would be recommended as it has 105.81 GB of usable memory. The next smallest node type would be too small to hold the vector workload.

Because each vector index stores references to the vector data rather than additional copies of it, indexes consume relatively little extra space. This is useful when creating multiple indexes, and also in situations where portions of the vector data have been deleted and reconstructing the HNSW graph would help create optimal node connections for high-quality vector search results.

## Out of Memory during backfill
<a name="out-of-memory-backfill"></a>

Similar to Valkey and Redis OSS write operations, an index backfill is subject to out-of-memory limitations. If engine memory fills up while a backfill is in progress, all backfills are paused. If memory becomes available, the backfill process resumes. It is also possible to delete an index while its backfill is paused due to out of memory.

## Transactions
<a name="transactions"></a>

The commands `FT.CREATE`, `FT.DROPINDEX`, `FT.ALIASADD`, `FT.ALIASDEL`, and `FT.ALIASUPDATE` cannot be executed in a transactional context, i.e., within a MULTI/EXEC block or within a Lua or FUNCTION script. 

# Create a cluster enabled for vector search
<a name="vector-search-cluster"></a>

You can create a cluster that is enabled for vector search by using the AWS Management Console or the AWS Command Line Interface. Each approach has specific considerations for enabling vector search.

## Using the AWS Management Console
<a name="vector-search-console"></a>

To create a cluster enabled for vector search within the console, you need to enable vector search under the **Cluster** settings. Vector search is available for MemoryDB version 7.1 in a single shard configuration.

![\[Viewing the cluster settings with the "Enable vector search" option checked provides information about specific version and configuration support.\]](http://docs.aws.amazon.com/memorydb/latest/devguide/images/vs-2.png)


For more information on using vector search with the AWS Management Console, see [Creating a cluster (Console)](getting-started.md#clusters.createclusters.viewdetails.cluster).

## Using the AWS Command Line Interface
<a name="vector-search-cli"></a>

To create a vector search enabled MemoryDB cluster, you can use the MemoryDB [create-cluster](https://docs.aws.amazon.com/cli/latest/reference/memorydb/create-cluster.html) command by passing an immutable parameter group `default.memorydb-redis7.search` to enable the vector search capabilities.

```
aws memorydb create-cluster \
  --cluster-name <value> \
  --node-type <value> \
  --engine redis \
  --engine-version 7.1 \
  --num-shards 1 \
  --acl-name <value> \
  --parameter-group-name default.memorydb-redis7.search
```

Optionally, you can also create a new parameter group to enable vector search as shown in the following example. For more information about parameter groups, see [Parameter groups](parametergroups.management.md).

```
aws memorydb create-parameter-group \
  --parameter-group-name my-search-parameter-group \
  --family memorydb_redis7
```

Next, update the parameter `search-enabled` to `yes` in the newly created parameter group.

```
aws memorydb update-parameter-group \
  --parameter-group-name my-search-parameter-group \
  --parameter-name-values "ParameterName=search-enabled,ParameterValue=yes"
```

You can now use this custom parameter group instead of the default parameter group to enable vector search on your MemoryDB clusters.

# Vector search commands
<a name="vector-search-commands"></a>

The following is a list of supported commands for vector search. 

**Topics**
+ [

# FT.CREATE
](vector-search-commands-ft.create.md)
+ [

# FT.SEARCH
](vector-search-commands-ft.search.md)
+ [

# FT.AGGREGATE
](vector-search-commands-ft.aggregate.md)
+ [

# FT.DROPINDEX
](vector-search-commands-ft.dropindex.md)
+ [

# FT.INFO
](vector-search-commands-ft.info.md)
+ [

# FT.\_LIST
](vector-search-commands-ft.list.md)
+ [

# FT.ALIASADD
](vector-search-commands-ft.aliasadd.md)
+ [

# FT.ALIASDEL
](vector-search-commands-ft.aliasdel.md)
+ [

# FT.ALIASUPDATE
](vector-search-commands-ft.aliasupdate.md)
+ [

# FT.\_ALIASLIST
](vector-search-commands-ft.aliaslist.md)
+ [

# FT.PROFILE
](vector-search-commands-ft.profile.md)
+ [

# FT.EXPLAIN
](vector-search-commands-ft.explain.md)
+ [

# FT.EXPLAINCLI
](vector-search-commands-ft.explain-cli.md)

# FT.CREATE
<a name="vector-search-commands-ft.create"></a>

 Creates an index and initiates a backfill of that index. For details on index construction, see [Vector search overview](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-overview.html).

**Syntax**

```
FT.CREATE <index-name>
ON HASH | JSON
[PREFIX <count> <prefix1> [<prefix2>...]]
SCHEMA 
(<field-identifier> [AS <alias>]
    NUMERIC
  | TAG [SEPARATOR <sep>] [CASESENSITIVE]
  | TEXT
  | VECTOR [HNSW|FLAT] <attr_count> [<attribute_name> <attribute_value>]
)+
```

**Schema**
+ Field identifier:
  + For hash keys, the field identifier is a field name.
  + For JSON keys, the field identifier is a JSON path.

  For more information, see [Index field types](vector-search-overview.md#vector-search-index-field-types).
+ Field types:
  + TAG: For more information, see [Tags](https://redis.io/docs/interact/search-and-query/advanced-concepts/tags/).
  + NUMERIC: Field contains a number.
  + TEXT: Field contains any blob of data.
  + VECTOR: Vector field that supports vector search.
    + Algorithm – Can be HNSW (Hierarchical Navigable Small World) or FLAT (brute force). 
    + `attr_count` – Number of attributes that will be passed as algorithm configuration; this count includes both names and values. 
    + `{attribute_name} {attribute_value}` – Algorithm-specific key/value pairs that define the index configuration. 

      For the FLAT algorithm, the attributes are:

      Required:
      + DIM – Number of dimensions in the vector.
      + DISTANCE_METRIC – Can be one of [L2 | IP | COSINE].
      + TYPE – Vector type. The only supported type is `FLOAT32`.

      Optional:
      + INITIAL_CAP – Initial vector capacity in the index, affecting the memory allocation size of the index.

      For the HNSW algorithm, the attributes are:

      Required:
      + TYPE – Vector type. The only supported type is `FLOAT32`.
      + DIM – Vector dimension, specified as a positive integer. Maximum: 32768.
      + DISTANCE_METRIC – Can be one of [L2 | IP | COSINE].

      Optional:
      + INITIAL_CAP – Initial vector capacity in the index, affecting the memory allocation size of the index. Defaults to 1024.
      + M – Maximum number of outgoing edges allowed for each node in the graph in each layer. On layer zero, the maximum number of outgoing edges is 2M. Default is 16. Maximum is 512.
      + EF_CONSTRUCTION – Controls the number of vectors examined during index construction. Higher values improve the recall ratio at the expense of longer index creation times. Default is 200. Maximum is 4096.
      + EF_RUNTIME – Controls the number of vectors examined during query operations. Higher values can yield improved recall at the expense of longer query times. This value can be overridden on a per-query basis. Default is 10. Maximum is 4096.
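
Because `attr_count` covers both attribute names and values, it is easy to miscount when assembling the command by hand. The following sketch (a hypothetical helper, not part of any client library) builds an HNSW `FT.CREATE` command as a token list and computes `attr_count` automatically:

```python
def build_hnsw_create(index, prefix, field, alias, dim,
                      metric="L2", m=16, ef_construction=200):
    """Assemble FT.CREATE tokens for a hash index with one HNSW vector field."""
    attrs = ["TYPE", "FLOAT32", "DIM", str(dim),
             "DISTANCE_METRIC", metric,
             "M", str(m),
             "EF_CONSTRUCTION", str(ef_construction)]
    return (["FT.CREATE", index, "ON", "HASH",
             "PREFIX", "1", prefix,
             "SCHEMA", field, "AS", alias,
             "VECTOR", "HNSW", str(len(attrs))]  # attr_count = names + values
            + attrs)

cmd = build_hnsw_create("hash_idx1", "hash:", "vec", "VEC", dim=2)
```

A client library could then send the token list through its generic command interface.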

**Return**

Returns a simple string OK message or an error reply.

**Examples**

**Note**  
The following example uses arguments native to [valkey-cli](https://valkey.io/topics/cli/), such as de-quoting and de-escaping of data, before sending it to Valkey or Redis OSS. To use other programming-language clients (Python, Ruby, C#, and so on), follow those environments' handling rules for strings and binary data. For more information on supported clients, see [Tools to Build on AWS](https://aws.amazon.com/developer/tools/).

**Example 1: Create some indexes**  
Create an index for vectors of size 2:  

```
FT.CREATE hash_idx1 ON HASH PREFIX 1 hash: SCHEMA vec AS VEC VECTOR HNSW 6 DIM 2 TYPE FLOAT32 DISTANCE_METRIC L2
OK
```
Create a 6-dimensional JSON index using the HNSW algorithm:  

```
FT.CREATE json_idx1 ON JSON PREFIX 1 json: SCHEMA $.vec AS VEC VECTOR HNSW 6 DIM 6 TYPE FLOAT32 DISTANCE_METRIC L2
OK
```

**Example 2: Populate some data**  
The following commands are formatted so they can be executed as arguments to the redis-cli terminal program. Developers using programming-language clients (such as Python, Ruby, C#, and so on) will need to follow their environment's handling rules for strings and binary data.  
Create some hash and JSON data:  

```
HSET hash:0 vec "\x00\x00\x00\x00\x00\x00\x00\x00"
HSET hash:1 vec "\x00\x00\x00\x00\x00\x00\x80\xbf"
JSON.SET json:0 . '{"vec":[1,2,3,4,5,6]}'
JSON.SET json:1 . '{"vec":[10,20,30,40,50,60]}'
JSON.SET json:2 . '{"vec":[1.1,1.2,1.3,1.4,1.5,1.6]}'
```
Note the following:  
+ The keys of the hash and JSON data have the prefixes of their index definitions.
+ The vectors are at the appropriate paths of the index definitions.
+ The hash vectors are entered as hex data, while the JSON vectors are entered as numbers.
+ The vectors are the appropriate lengths: the two-dimensional hash vector entries have two floats' worth of hex data, and the six-dimensional JSON vector entries have six numbers.
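
The hex blobs in the `HSET` commands are just packed little-endian FLOAT32 values. In Python, for example, they can be produced with the standard `struct` module (a sketch; the resulting bytes would be sent as the hash field value by whatever client you use):

```python
import struct

# Pack two-dimensional FLOAT32 vectors into the binary form the index expects.
vec0 = struct.pack("<2f", 0.0, 0.0)    # value stored at hash:0
vec1 = struct.pack("<2f", 0.0, -1.0)   # value stored at hash:1

assert vec0 == b"\x00\x00\x00\x00\x00\x00\x00\x00"
assert vec1 == b"\x00\x00\x00\x00\x00\x00\x80\xbf"
```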

**Example 3: Delete and re-create an index**  

```
FT.DROPINDEX json_idx1
OK

FT.CREATE json_idx1 ON JSON PREFIX 1 json: SCHEMA $.vec AS VEC VECTOR FLAT 6 DIM 6 TYPE FLOAT32 DISTANCE_METRIC L2
OK
```
Note the new JSON index uses the `FLAT` algorithm instead of the `HNSW` algorithm. Also note that it will re-index the existing JSON data:  

```
FT.SEARCH json_idx1 "*=>[KNN 100 @VEC $query_vec]" PARAMS 2 query_vec "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" DIALECT 2
1) (integer) 3
2) "json:2"
3) 1) "__VEC_score"
   2) "11.11"
   3) "$"
   4) "[{\"vec\":[1.1, 1.2, 1.3, 1.4, 1.5, 1.6]}]"
4) "json:0"
5) 1) "__VEC_score"
   2) "91"
   3) "$"
   4) "[{\"vec\":[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}]"
6) "json:1"
7) 1) "__VEC_score"
   2) "9100"
   3) "$"
   4) "[{\"vec\":[10.0, 20.0, 30.0, 40.0, 50.0, 60.0]}]"
```

# FT.SEARCH
<a name="vector-search-commands-ft.search"></a>

Uses the provided query expression to locate keys within an index. Once located, the count and/or content of indexed fields within those keys can be returned. For more information, see [Vector search query expression](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-overview.html#vector-search-query-expression).

To create data for use in these examples, see the [FT.CREATE](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-commands-ft.create.html) command.

**Syntax**

```
FT.SEARCH <index-name> <query>
[RETURN <token_count> (<field-identifier> [AS <alias>])+]
[TIMEOUT timeout] 
[PARAMS <count> <name> <value> [<name> <value>]]
[LIMIT <offset> <count>]
[COUNT]
```
+ RETURN: This clause identifies which fields of a key are returned. The optional AS clause on each field overrides the name of the field in the result. Only fields that have been declared for this index can be specified.
+ LIMIT <offset> <count>: This clause provides pagination capability in that only the keys that satisfy the offset and count values are returned. If this clause is omitted, it defaults to "LIMIT 0 10", i.e., at most 10 keys will be returned. 
+ PARAMS: The count is two times the number of key/value pairs. Param key/value pairs can be referenced from within the query expression. For more information, see [Vector search query expression](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-overview.html#vector-search-query-expression).
+ COUNT: This clause suppresses returning the contents of keys; only the number of keys is returned. This is an alias for "LIMIT 0 0".
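
The LIMIT semantics can be modeled as a slice over the ranked result list. This illustrative sketch (not the engine's implementation) shows how the clause selects a page of results:

```python
def apply_limit(ranked_keys, offset=0, count=10):
    """Model of LIMIT <offset> <count>: keep at most `count` keys from `offset`."""
    return ranked_keys[offset:offset + count]

keys = [f"doc:{i}" for i in range(25)]
page1 = apply_limit(keys)                      # default LIMIT 0 10
page2 = apply_limit(keys, offset=10, count=10) # next page
counted = apply_limit(keys, 0, 0)              # COUNT is an alias for LIMIT 0 0
```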

**Return**

Returns an array or error reply.
+ If the operation completes successfully, it returns an array. The first element is the total number of keys matching the query. The remaining elements are pairs of key name and field list. Field list is another array comprising pairs of field name and values. 
+ If the index is in the process of being backfilled, the command immediately returns an error reply.
+ If the timeout is reached, the command returns an error reply.
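
The flat reply shape can be awkward to consume directly. The following sketch (a hypothetical helper, assuming the reply arrives as nested Python lists the way most clients surface it) converts it into a total plus a dictionary of keys and fields:

```python
def parse_search_reply(reply):
    """Split an FT.SEARCH reply into (total_matches, {key: {field: value}})."""
    total, items = reply[0], reply[1:]
    results = {}
    # After the total, elements alternate: key name, then its field/value list.
    for key, fields in zip(items[0::2], items[1::2]):
        results[key] = dict(zip(fields[0::2], fields[1::2]))
    return total, results

sample = [2,
          "hash:0", ["__VEC_score", "0", "vec", b"\x00" * 8],
          "hash:1", ["__VEC_score", "1", "vec", b"\x00\x00\x00\x00\x00\x00\x80\xbf"]]
total, rows = parse_search_reply(sample)
```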

**Example: Do some searches**

**Note**  
The following example uses arguments native to [valkey-cli](https://valkey.io/topics/cli/), such as de-quoting and de-escaping of data, before sending it to Valkey or Redis OSS. To use other programming-language clients (Python, Ruby, C#, and so on), follow those environments' handling rules for strings and binary data. For more information on supported clients, see [Tools to Build on AWS](https://aws.amazon.com/developer/tools/).

**A hash search**

```
FT.SEARCH hash_idx1 "*=>[KNN 2 @VEC $query_vec]" PARAMS 2 query_vec "\x00\x00\x00\x00\x00\x00\x00\x00" DIALECT 2
1) (integer) 2
2) "hash:0"
3) 1) "__VEC_score"
   2) "0"
   3) "vec"
   4) "\x00\x00\x00\x00\x00\x00\x00\x00"
4) "hash:1"
5) 1) "__VEC_score"
   2) "1"
   3) "vec"
   4) "\x00\x00\x00\x00\x00\x00\x80\xbf"
```

This produces two results, sorted by their score, which is the distance from the query vector (entered as hex).
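
The scores in these examples are consistent with the squared Euclidean (L2) distance between each stored vector and the query vector. This sketch reproduces them (an observation from the sample output, assuming the L2 metric reports squared distance):

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

query = [0.0, 0.0]
assert squared_l2(query, [0.0, 0.0]) == 0.0    # hash:0 score
assert squared_l2(query, [0.0, -1.0]) == 1.0   # hash:1 score
```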

**JSON searches**

```
FT.SEARCH json_idx1 "*=>[KNN 2 @VEC $query_vec]" PARAMS 2 query_vec "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" DIALECT 2
1) (integer) 2
2) "json:2"
3) 1) "__VEC_score"
   2) "11.11"
   3) "$"
   4) "[{\"vec\":[1.1, 1.2, 1.3, 1.4, 1.5, 1.6]}]"
4) "json:0"
5) 1) "__VEC_score"
   2) "91"
   3) "$"
   4) "[{\"vec\":[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}]"
```

This produces the two closest results, sorted by their score. Note that the JSON vector values are returned as floats, while the query vector is still binary vector data. Also note that because the `KNN` parameter is 2, only two results are returned. A larger value will return more results:

```
FT.SEARCH json_idx1 "*=>[KNN 100 @VEC $query_vec]" PARAMS 2 query_vec "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" DIALECT 2
1) (integer) 3
2) "json:2"
3) 1) "__VEC_score"
   2) "11.11"
   3) "$"
   4) "[{\"vec\":[1.1, 1.2, 1.3, 1.4, 1.5, 1.6]}]"
4) "json:0"
5) 1) "__VEC_score"
   2) "91"
   3) "$"
   4) "[{\"vec\":[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}]"
6) "json:1"
7) 1) "__VEC_score"
   2) "9100"
   3) "$"
   4) "[{\"vec\":[10.0, 20.0, 30.0, 40.0, 50.0, 60.0]}]"
```

# FT.AGGREGATE
<a name="vector-search-commands-ft.aggregate"></a>

A superset of the FT.SEARCH command, it allows substantial additional processing of the keys selected by the query expression.

**Syntax**

```
FT.AGGREGATE index query
  [LOAD * | [count field [field ...]]]
  [TIMEOUT timeout]
  [PARAMS count name value [name value ...]]
  [FILTER expression]
  [LIMIT offset num]  
  [GROUPBY count property [property ...] [REDUCE function count arg [arg ...] [AS name] [REDUCE function count arg [arg ...] [AS name] ...]] ...]] 
  [SORTBY count [ property ASC | DESC [property ASC | DESC ...]] [MAX num]] 
  [APPLY expression AS name]
```
+ FILTER, LIMIT, GROUPBY, SORTBY and APPLY clauses can be repeated multiple times in any order and be freely intermixed. They are applied in the order specified with the output of one clause feeding the input of the next clause.
+ In the above syntax, a “property” is either a field declared in the [FT.CREATE](https://docs.aws.amazon.com/memorydb/latest/devguide/vector-search-commands-ft.create.html) command for this index OR the output of a previous APPLY clause or REDUCE function.
+ The LOAD clause is restricted to loading fields that have been declared in the index. "LOAD *" will load all fields declared in the index. 
+ The following reducer functions are supported: COUNT, COUNT_DISTINCTISH, SUM, MIN, MAX, AVG, STDDEV, QUANTILE, TOLIST, FIRST_VALUE, and RANDOM_SAMPLE. For more information, see [Aggregations](https://redis.io/docs/interact/search-and-query/search/aggregations/).
+ LIMIT <offset> <count>: Retains records starting at <offset> and continuing for up to <count>; all other records are discarded.
+ PARAMS: The count is two times the number of key/value pairs. Param key/value pairs can be referenced from within the query expression.
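
To make the pipeline behavior concrete, here is an illustrative Python model of a GROUPBY stage with COUNT and AVG reducers (a sketch of the semantics, not the engine's implementation; the field names are hypothetical):

```python
from collections import defaultdict

def group_reduce(rows, group_prop, value_field):
    """Model of GROUPBY 1 <group_prop> with COUNT and AVG reducers on <value_field>."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_prop]].append(float(row[value_field]))
    return {g: {"COUNT": len(vals), "AVG": sum(vals) / len(vals)}
            for g, vals in groups.items()}

rows = [{"sector": "tech", "score": "1"},
        {"sector": "tech", "score": "3"},
        {"sector": "energy", "score": "2"}]
summary = group_reduce(rows, "sector", "score")
```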

**Return**

Returns an array or error reply.
+ If the operation completes successfully, it returns an array. The first element is an integer with no particular meaning (should be ignored). The remaining elements are the results output by the last stage. Each element is an array of field name and value pairs.
+ If the index is in the process of being backfilled, the command immediately returns an error reply.
+ If the timeout is reached, the command returns an error reply.

# FT.DROPINDEX
<a name="vector-search-commands-ft.dropindex"></a>

Drop an index. The index definition and associated content are deleted. Keys are unaffected.

**Syntax**

```
FT.DROPINDEX <index-name>
```

**Return**

Returns a simple string OK message or an error reply.

# FT.INFO
<a name="vector-search-commands-ft.info"></a>

**Syntax**

```
FT.INFO <index-name>
```

Output from the FT.INFO command is an array of key/value pairs, as described in the following table:


| Key | Value type | Description | 
| --- | --- | --- | 
| index_name | string | Name of the index | 
| creation_timestamp | integer | Unix-style timestamp of creation time | 
| key_type | string | HASH or JSON | 
| key_prefixes | array of strings | Key prefixes for this index | 
| fields | array of field information | Fields of this index | 
| space_usage | integer | Memory bytes used by this index | 
| fulltext_space_usage | integer | Memory bytes used by non-vector fields | 
| vector_space_usage | integer | Memory bytes used by vector fields | 
| num_docs | integer | Number of keys currently contained in the index | 
| num_indexed_vectors | integer | Number of vectors currently contained in the index | 
| current_lag | integer | Recent delay of ingestion (milliseconds) | 
| backfill_status | string | One of: Completed, InProgress, Paused, or Failed | 

The following table describes the information for each field:


| Key | Value type | Description | 
| --- | --- | --- | 
| identifier | string | Name of the field | 
| field_name | string | Hash member name or JSON path | 
| type | string | One of: Numeric, Tag, Text, or Vector | 
| option | string | ignore | 

If the field is of type Vector, additional information will be present depending on the algorithm. 

For the HNSW algorithm:


| Key | Value type | Description | 
| --- | --- | --- | 
| algorithm | string | HNSW | 
| data_type | string | FLOAT32 | 
| distance_metric | string | One of: L2, IP, or Cosine | 
| initial_capacity | integer | Initial size of the vector field index | 
| current_capacity | integer | Current size of the vector field index | 
| maximum_edges | integer | M parameter at creation | 
| ef_construction | integer | EF_CONSTRUCTION parameter at creation | 
| ef_runtime | integer | EF_RUNTIME parameter at creation | 

For the FLAT algorithm:


| Key | Value type | Description | 
| --- | --- | --- | 
| algorithm | string | FLAT | 
| data_type | string | FLOAT32 | 
| distance_metric | string | One of: L2, IP, or Cosine | 
| initial_capacity | integer | Initial size of the vector field index | 
| current_capacity | integer | Current size of the vector field index | 

# FT.\_LIST
<a name="vector-search-commands-ft.list"></a>

List all indexes.

**Syntax**

```
FT._LIST 
```

**Return**

Returns an array of index names.

# FT.ALIASADD
<a name="vector-search-commands-ft.aliasadd"></a>

Add an alias for an index. The new alias name can be used anywhere that an index name is required.

**Syntax**

```
FT.ALIASADD <alias> <index-name> 
```

**Return**

Returns a simple string OK message or an error reply.

# FT.ALIASDEL
<a name="vector-search-commands-ft.aliasdel"></a>

Delete an existing alias for an index.

**Syntax**

```
FT.ALIASDEL <alias>
```

**Return**

Returns a simple string OK message or an error reply.

# FT.ALIASUPDATE
<a name="vector-search-commands-ft.aliasupdate"></a>

Update an existing alias to point to a different physical index. This command only affects future references to the alias. Currently in-progress operations (FT.SEARCH, FT.AGGREGATE) are unaffected by this command.

**Syntax**

```
FT.ALIASUPDATE <alias> <index>
```

**Return**

Returns a simple string OK message or an error reply.

# FT.\_ALIASLIST
<a name="vector-search-commands-ft.aliaslist"></a>

List the index aliases.

**Syntax**

```
FT._ALIASLIST
```

**Return**

Returns an array with one element for each current alias. Each element is an alias/index pair.

# FT.PROFILE
<a name="vector-search-commands-ft.profile"></a>

Run a query and return profile information about that query.

**Syntax**

```
FT.PROFILE <index>
SEARCH | AGGREGATE
[LIMITED]
QUERY <query ...>
```

**Return**

A two-element array. The first element is the result of the `FT.SEARCH` or `FT.AGGREGATE` command that was profiled. The second element is an array of performance and profiling information.

# FT.EXPLAIN
<a name="vector-search-commands-ft.explain"></a>

Parse a query and return information about how that query was parsed.

**Syntax**

```
FT.EXPLAIN <index> <query>
```

**Return**

A string containing the parsed results.

# FT.EXPLAINCLI
<a name="vector-search-commands-ft.explain-cli"></a>

Same as the FT.EXPLAIN command, except that the results are displayed in a format more useful with redis-cli.

**Syntax**

```
FT.EXPLAINCLI <index> <query>
```

**Return**

A string containing the parsed results.