

# Controlling How Search Results are Returned in Amazon CloudSearch
<a name="controlling-search-results"></a>

You can specify parameters in your search request to control how the search results are sorted, return results in XML rather than JSON, and paginate through the result set. You can define expressions that calculate a custom value that can be used to specify search constraints or sort results. 

**Topics**
+ [Sorting Results in Amazon CloudSearch](sorting-results.md)
+ [Using Relative Field Weighting to Customize Relevance Ranking in Amazon CloudSearch](weighting-fields.md)
+ [Configuring Expressions in Amazon CloudSearch](configuring-expressions.md)
+ [Getting Results as XML in Amazon CloudSearch](getting-xml-results.md)
+ [Paginating Results in Amazon CloudSearch](paginating-results.md)

# Sorting Results in Amazon CloudSearch
<a name="sorting-results"></a>

By default, search results are sorted according to their relevance to the search request. A document's relevance score (`_score`) is based on how often the search terms appear in the document compared to how common the term is across all documents in the domain. Relevance scores are positive values that can vary widely depending on your data and queries. The scores for each clause in your query are additive, so queries with more clauses will naturally have higher scores than queries with just one or two. If you know what your typical queries will look like, you can do some test queries to get an idea of the range of scores you’re likely to see. 

To change how search results are sorted, you can:
+ Use a `text` or `literal` field to sort results alphabetically. Note that Amazon CloudSearch sorts by Unicode codepoint, so numbers come before letters and uppercase letters come before lowercase letters. Numbers are sorted as strings, not by value; for example, 10 will come before 2. 
+ Use an `int` or `double` field to sort results numerically. 
+ Use a `date` field to sort results by date. 
+ Use a custom expression to sort results.
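
The string ordering described above can be previewed locally. This is a minimal sketch, assuming that Python's built-in string comparison (which also compares by Unicode code point) is a fair stand-in for how `text` and `literal` values are ordered; the sample values are invented for illustration.

```python
# Python compares str values code point by code point, which mirrors the
# ordering described above: digits < uppercase letters < lowercase letters.
values = ["banana", "Apple", "2", "10", "apple", "Zebra"]

sorted_as_strings = sorted(values)
print(sorted_as_strings)

# Numeric strings sort as text, so "10" lands before "2".
# Converting to int (as an `int` field would store them) restores numeric order.
numbers = sorted(int(v) for v in values if v.isdigit())
print(numbers)
```

If numeric order matters, storing the value in an `int` or `double` field avoids the string-ordering surprise.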

To use a field to sort the search results, you must configure the field to be `SortEnabled`. Only single-value fields can be `SortEnabled`—you cannot use the array-type fields for sorting. For more information about configuring fields, see [Configuring Index Fields](configuring-index-fields.md).

To use an expression for sorting, you construct a numeric expression using numeric fields (`int`, `double`, or `date`), other expressions, a document's relevance score, and numeric operators and functions. You can define expressions in your domain configuration, or within a search request. For more information about configuring expressions, see [Configuring Expressions](configuring-expressions.md).

**Tip**  
To sort results randomly, you can use a simple `_rand` expression:  

```
/2013-01-01/search?expr.r=_rand&q=test&return=r%2Cplot%2Ctitle&sort=r+desc
```
This expression is stable, which lets you page back and forth without losing the initial, randomized sort. If you want to use a different randomized sort, you can add `a-z` and `0-9` characters after the `_rand` value, such as:  

```
/2013-01-01/search?expr.r=_rand1a2b3c&q=test&return=r%2Cplot%2Ctitle&sort=r+desc
```

You use the `sort` parameter to specify the field or expression you want to use to sort the results. You must explicitly specify the sort direction along with the name of the field or expression. For example, `sort=year asc` or `sort=year desc`.

When you use a field for sorting, documents without a value in that field are listed last. If you specify a comma-separated list of fields or expressions, the first field or expression is used as the primary sort criterion, the second is used as the secondary sort criterion, and so on. 

 If you do not specify the `sort` parameter, the search results are ranked using the documents' default relevance scores with the highest-scoring documents listed first. This is equivalent to specifying `sort=_score desc`. 
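
A multi-level `sort` value can be assembled and URL-encoded as follows. This is a sketch only: `build_sort` is a hypothetical helper, and the `year` and `title` fields are assumed to exist and be sort enabled in your domain.

```python
from urllib.parse import urlencode


def build_sort(criteria):
    """Join (name, direction) pairs into a comma-separated sort value."""
    parts = []
    for name, direction in criteria:
        # The sort direction must always be given explicitly.
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc', got {direction!r}")
        parts.append(f"{name} {direction}")
    return ",".join(parts)


# Primary criterion first, secondary criterion second.
sort_value = build_sort([("year", "desc"), ("title", "asc")])
query_string = urlencode({"q": "star", "sort": sort_value})
print(query_string)
```

`urlencode` turns spaces into `+` and commas into `%2C`, the same encoding used in the `_rand` examples above.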

You can use the `q.options` parameter to specify field weights to apply when calculating a document's relevance `_score`. For more information, see [Using Relative Field Weighting to Customize Text Relevance](weighting-fields.md).

# Using Relative Field Weighting to Customize Relevance Ranking in Amazon CloudSearch
<a name="weighting-fields"></a>

You can assign weights to selected fields so you can boost the relevance `_score` of documents with matches in key fields such as a `title` field, and minimize the impact of matches in less important fields. By default all fields have a weight of 1.

Field weights are set with the `q.options` `fields` option. You specify fields as an array of strings. To set the weight for a field, you append a caret (`^`) and a positive numeric value to the field name. You cannot set a field weight to zero or use mathematical functions or expressions to define a field weight. 

For example, if you want matches within the `title` field to score higher than matches within the `plot` field, you could set the weight of the `title` field to 2 and the weight of the `plot` field to 0.5:

```
q.options={fields:['title^2','plot^0.5']}
```

In addition to controlling field weights, the `fields` option defines the set of fields that are searched by default if you use the simple query parser or don't specify a field in part of a compound expression when using the structured query parser. For more information, see [Search Request Parameters](search-api.md#search-request-parameters) in the Search API Reference.
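
The `fields` option can also be generated from a mapping of field names to weights. The sketch below assumes that `q.options` accepts strict JSON (double-quoted keys and strings); `build_field_options` is a hypothetical helper, not part of any SDK.

```python
import json
from urllib.parse import urlencode


def build_field_options(weights):
    """Build a q.options value such as {"fields": ["title^2", "plot^0.5"]}."""
    fields = []
    for name, weight in weights.items():
        # Weights must be positive; zero is not allowed.
        if weight <= 0:
            raise ValueError(f"weight for {name!r} must be positive, got {weight}")
        fields.append(f"{name}^{weight:g}")
    return json.dumps({"fields": fields})


options = build_field_options({"title": 2, "plot": 0.5})
print(urlencode({"q": "star", "q.options": options}))
```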

To reference the weighted relevance score in the definition of an expression, you use `_score`. You can use the weighted `_score` value in conjunction with numeric fields, other expressions, and the standard numeric operators and functions. For more information, see [Configuring Expressions](configuring-expressions.md).

# Configuring Expressions in Amazon CloudSearch
<a name="configuring-expressions"></a>

You can define numeric expressions and use them to sort search results. Expressions can also be returned in search results. You can add expressions to the domain configuration or define expressions within search requests. 

**Topics**
+ [Writing Expressions for Amazon CloudSearch](#writing-expressions)
+ [Defining Amazon CloudSearch Expressions in Search Requests](defining-expressions-in-requests.md)
+ [Configuring Reusable Expressions for a Search Domain in Amazon CloudSearch](configuring-reusable-expressions.md)
+ [Comparing Expressions in Amazon CloudSearch](comparing-expressions.md)

## Writing Expressions for Amazon CloudSearch
<a name="writing-expressions"></a>

Amazon CloudSearch expressions can contain:
+ Single-value, sort-enabled numeric fields (`int`, `double`, `date`). (You must name a specific field; wildcards are not supported.)
+ Other expressions
+ The `_score` variable, which references a document's relevance score
+ The `_time` variable, which references the current epoch time
+ The `_rand` variable, which returns a randomly generated value
+ Integer, floating point, hex, and octal literals
+ Arithmetic operators: `+ - * / %`
+ Bitwise operators: `| & ^ ~ << >> >>>`
+ Boolean operators (including the ternary operator): `&& || ! ?:`
+ Comparison operators: `< <= == >= >`
+ Mathematical functions: `abs ceil exp floor ln log10 logn max min pow sqrt`
+ Trigonometric functions: `acos acosh asin asinh atan atan2 atanh cos cosh sin sinh tanh tan`
+ The `haversin` distance function

[JavaScript order of precedence rules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence#Table) apply for operators. You can override operator precedence by using parentheses.

Short-circuit evaluation is used when evaluating logical expressions: if the value of the expression can be determined after evaluating the first argument, the second argument is not evaluated. For example, in the expression `a || b`, `b` is only evaluated if `a` is not true.

Expressions always return an integer value from 0 to the maximum 64-bit signed integer value (2^63 - 1). Intermediate results are calculated as double-precision floating point values and the return value is rounded to the nearest integer. If the expression is invalid or evaluates to a negative value, it returns 0. If the expression evaluates to a value greater than the maximum, it returns the maximum value. 
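
The rounding and clamping rules above can be modeled locally when you are reasoning about what an expression will return. This is an illustrative sketch, not the service's implementation: `clamp_expression_result` is a hypothetical name, an invalid result is modeled as NaN, and how exact `.5` ties are rounded is an assumption (Python's `round` rounds ties to the nearest even integer).

```python
import math

# Largest value an expression can return: the maximum 64-bit signed integer.
MAX_VALUE = 2**63 - 1


def clamp_expression_result(value: float) -> int:
    """Map a double-precision intermediate result to the returned integer."""
    # Invalid (modeled here as NaN) or negative results come back as 0.
    if math.isnan(value) or value < 0:
        return 0
    # Anything at or above the maximum is capped at the maximum.
    if value >= MAX_VALUE:
        return MAX_VALUE
    # Otherwise round to the nearest integer (tie handling is an assumption).
    return round(value)


for sample in (-3.2, 2.7, 1e30, float("nan")):
    print(sample, "->", clamp_expression_result(sample))
```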

Expression names must begin with a letter and be at least 3 and no more than 64 characters long. The following characters are allowed: a-z (lower-case letters), 0-9, and _ (underscore). The name *score* is reserved and cannot be used as an expression name.
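
The naming rules can be checked before a request is sent. The regular expression below is one possible encoding of those rules (lower-case letters, digits, and underscores, starting with a letter, 3 to 64 characters); `is_valid_expression_name` is a hypothetical helper.

```python
import re

# First character a-z, then 2 to 63 more characters from a-z, 0-9, or underscore.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]{2,63}")
_RESERVED_NAMES = {"score"}


def is_valid_expression_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rules."""
    return bool(_NAME_PATTERN.fullmatch(name)) and name not in _RESERVED_NAMES


for candidate in ("rank1", "popular_hits", "r1", "Rank1", "score"):
    print(candidate, is_valid_expression_name(candidate))
```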

For example, if you define an `int` field named *popularity* for your domain, you could use that field in conjunction with the default relevance `_score` to construct a custom expression. 

```
(0.3*popularity)+(0.7*_score)
```

Note that this simple example assumes that the popularity ranking and the relevance `_score` values are in about the same range. To tune your expressions for ranking results, you need to do some testing to determine how to weight the components of your expressions to get the results you want. 

### Using Date Fields in Amazon CloudSearch Expressions
<a name="using-dates-in-expressions"></a>

The value from a `date` field is stored as an epoch time with millisecond resolution. This means you can use the mathematical and comparison operators to construct expressions using dates stored in your documents and the current epoch time (`_time`). For example, using the following expression to sort search results from the movies domain pushes movies with recent release dates toward the top of the list. 

```
_score/(_time - release_date)
```
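
To see how this expression behaves, the same arithmetic can be reproduced offline with epoch milliseconds. This is a sketch under stated assumptions: `epoch_ms` and `recency_rank` are hypothetical helpers, the scores and release dates are invented, and a fixed "now" stands in for `_time` so the result is repeatable.

```python
from datetime import datetime, timezone


def epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch time in milliseconds."""
    return int(dt.timestamp() * 1000)


def recency_rank(score: float, release_date: datetime, now: datetime) -> float:
    """Raw value of _score/(_time - release_date) before any rounding."""
    return score / (epoch_ms(now) - epoch_ms(release_date))


now = datetime(2024, 1, 1, tzinfo=timezone.utc)  # stand-in for _time
older = recency_rank(10.0, datetime(1984, 10, 26, tzinfo=timezone.utc), now)
newer = recency_rank(10.0, datetime(2019, 11, 1, tzinfo=timezone.utc), now)

# With equal relevance scores, the more recent release gets the larger value.
print(newer > older)
```

The raw ratios here are far smaller than 1. Because expression results are rounded to integers, an expression like this may need a scaling factor in practice; that is an inference from the rounding rule described earlier, not something this guide states.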

# Defining Amazon CloudSearch Expressions in Search Requests
<a name="defining-expressions-in-requests"></a>

You can define and use expressions directly within a search request so that you can iterate quickly while you fine-tune expressions that you use to sort results. By defining an expression within a search request, you can also incorporate contextual information into the expression, such as the user's geographic location. You can override an expression defined in the domain configuration by defining an expression with the same name within a search request.

When you define an expression within a search request, it is not stored as part of your domain configuration. If you want to use the expression in other requests, you must define the expression in each request or add the expression to your domain configuration. Defining an expression in every request rather than adding it to the domain configuration increases the request overhead, which can result in slower response times and potentially increase the cost of running your domain. For information about adding expressions to the domain configuration, see [Configuring Expressions](configuring-expressions.md). 

You can define and use multiple expressions in a search request. The definition of an expression can reference other expressions defined within the request, as well as expressions configured as part of the domain configuration. 

There are no restrictions on how you can use expressions that you define in a search request. You can use the expression to sort the search results, define other expressions, or return computed information in the search results. 

**To define an expression in a search request**

1. Use the `expr.NAME` parameter, where NAME is the name of the expression you are defining. For example: 

   ```
   expr.rank1=log10(clicks)*_score
   ```

1. To use the expression to sort the results, specify the name of the expression with the `sort` parameter:

   ```
   search?q=terminator&expr.rank1=log10(clicks)*_score&sort=rank1 desc
   ```

1. To include the computed value in the search results, add the expression to the list of `return` fields: 

   ```
   search?q=terminator&expr.rank1=log10(clicks)*_score&sort=rank1 desc&return=rank1
   ```

 For example, the following request creates two expressions that are used to sort the results and returns one of them in the search results:

```
search?q=terminator&expr.rank1=sin( _score)&expr.rank2=cos( _score)&sort=rank1 desc,rank2 desc&return=title,_score,rank2
```
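
Requests like these are easier to get right when the query string is generated rather than typed by hand. The following sketch uses only the Python standard library; `build_search_query` is a hypothetical helper, and the `clicks` field is assumed to exist in your domain.

```python
from urllib.parse import parse_qs, urlencode


def build_search_query(q, expressions, sort=None, return_fields=None):
    """Build a query string with one expr.NAME parameter per expression."""
    params = {"q": q}
    for name, definition in expressions.items():
        params[f"expr.{name}"] = definition
    if sort:
        params["sort"] = sort
    if return_fields:
        params["return"] = ",".join(return_fields)
    # urlencode percent-encodes parentheses, '*', spaces, and commas.
    return urlencode(params)


query = build_search_query(
    "terminator",
    {"rank1": "log10(clicks)*_score"},
    sort="rank1 desc",
    return_fields=["title", "rank1"],
)
print("search?" + query)

# Round-trip check: decoding yields the original expression text.
print(parse_qs(query)["expr.rank1"][0])
```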

# Configuring Reusable Expressions for a Search Domain in Amazon CloudSearch
<a name="configuring-reusable-expressions"></a>

When you define an expression in a domain's configuration, you can reference the expression in any search request. Adding an expression to the domain configuration reduces the overhead of specifying it in every request, which helps minimize response times and costs. 

When you add an expression to your domain configuration, it takes some time for the change to be processed and the new expression to become active. To quickly test changes to an expression, you can define and use the expression directly in a search request, as described in [Defining Expressions in Search Requests](defining-expressions-in-requests.md). After you have finished testing and tuning an expression, you should add it to your domain configuration. 

**Topics**
+ [Amazon CloudSearch console](#configuring-expressions-console)
+ [aws cloudsearch define-expression](#configuring-expressions-clt)
+ [DefineExpression](#configuring-expressions-sdk)

## Configuring Expressions Using the Amazon CloudSearch Console
<a name="configuring-expressions-console"></a>

**To configure an expression**

1. Open the Amazon CloudSearch console at [https://console.aws.amazon.com/cloudsearch/home](https://console.aws.amazon.com/cloudsearch/home).

1. From the left navigation pane, choose **Domains**.

1. Choose the name of the domain to open its configuration.

1. Go to the **Advanced search options** tab.

1. In the **Expressions** pane, choose **Add expression**.

1. Enter a name for the new expression.

1. For **Value**, enter the numerical expression that you want to evaluate at search time. You can select **Insert** to add special values and mathematical and trigonometric functions.

1. Choose **Save**.

## Configuring Amazon CloudSearch Expressions Using the AWS CLI
<a name="configuring-expressions-clt"></a>

You use the `aws cloudsearch define-expression` command to define computed expressions for a domain.

**To configure an expression**
+ Run the `aws cloudsearch define-expression` command to define a new expression. You specify a name for the expression with the `--name` option, and the numeric expression that you want to evaluate with the `--expression` option. For example, the following request creates an expression called `popularhits` that takes into account a document's `popularity` and relevance `_score`.

  ```
  aws cloudsearch define-expression --domain-name movies --name popularhits --expression '((0.3*popularity)/10.0)+(0.7* _score)'
  
  {
      "Expression": {
          "Status": {
              "PendingDeletion": false, 
              "State": "Processing", 
              "CreationDate": "2014-05-01T01:15:18Z", 
              "UpdateVersion": 52, 
              "UpdateDate": "2014-05-01T01:15:18Z"
          }, 
          "Options": {
              "ExpressionName": "popularhits", 
              "ExpressionValue": "((0.3*popularity)/10.0)+(0.7* _score)"
          }
      }
  }
  ```

## Configuring Expressions Using the Amazon CloudSearch Configuration API
<a name="configuring-expressions-sdk"></a>

The AWS SDKs (except the Android and iOS SDKs) support all of the Amazon CloudSearch actions defined in the Amazon CloudSearch Configuration API, including `DefineExpression`. For more information about installing and using the AWS SDKs, see [AWS Software Development Kits](http://aws.amazon.com/code).
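
As one illustration, the AWS SDK for Python (boto3) exposes `DefineExpression` as `define_expression` on the `cloudsearch` client. The sketch below assumes boto3 is installed and that credentials and a region are already configured; `expression_request` and `define_expression` are hypothetical wrapper names, and the `movies` domain and `popularhits` expression are reused from the CLI example.

```python
def expression_request(domain_name, name, value):
    """Build the keyword arguments for a DefineExpression call."""
    return {
        "DomainName": domain_name,
        "Expression": {
            "ExpressionName": name,
            "ExpressionValue": value,
        },
    }


def define_expression(domain_name, name, value):
    """Send the request; requires boto3 plus AWS credentials and a region."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("cloudsearch")
    return client.define_expression(**expression_request(domain_name, name, value))


request = expression_request(
    "movies", "popularhits", "((0.3*popularity)/10.0)+(0.7* _score)"
)
print(request)
```

Calling `define_expression("movies", "popularhits", ...)` would submit the change to the domain, so run it only against a domain you intend to modify.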

# Comparing Expressions in Amazon CloudSearch
<a name="comparing-expressions"></a>

 You can use the Amazon CloudSearch console to compare expressions and see how changes to the expression and field weights affect how Amazon CloudSearch sorts search results.

**To compare expressions**

1. Open the Amazon CloudSearch console at [https://console.aws.amazon.com/cloudsearch/home](https://console.aws.amazon.com/cloudsearch/home).

1. In the left navigation pane, choose **Domains**.

1. Choose the name of the domain to open its configuration.

1. Choose **Actions**, **Compare expressions**.

1. In the **Search** box, enter the terms you want to search for. Amazon CloudSearch ranks the search results using the specified expressions and weights. It refreshes the results whenever you make changes to the expressions or weights.

1. In each expression editor, specify the rank expressions to compare. You can add new expressions or select an existing expression from the **Saved expressions** menu. Amazon CloudSearch evaluates new expressions when you submit a search request.

1. Specify the field weights to use for each expression. You can also edit the field weights directly in the expression. Field weights must be in the range 0.0 to 10.0, inclusive. By default, the weight for all fields is set to 1.0. You can set individual field weights to control how much matches in particular text or literal fields affect a document's relevance `_score`. You can also change the default weight.
**Note**  
Adjusting field weights only affects result ranking if the expression references the `_score` value. You can modify the expression to change how the weighted relevance `_score` contributes to a document's overall ranking. For more information, see [Using Relative Field Weighting to Customize Text Relevance](weighting-fields.md).

1. Choose **Run**.

1. The search results for the two expressions are shown side-by-side. (If the expression is empty, the results are sorted according to the default relevance `_score`.) Four icons highlight the differences:  
![\[Green upward-pointing arrow icon indicating an increase or positive trend.\]](http://docs.aws.amazon.com/cloudsearch/latest/developerguide/images/cloudsearch-console-green-up-arrow.png) Green up arrow  
 The document is ranked higher in the search results using the second expression.   
![\[Red downward-pointing arrow icon indicating a download or direction.\]](http://docs.aws.amazon.com/cloudsearch/latest/developerguide/images/cloudsearch-console-red-down-arrow.png) Red down arrow  
 The document is ranked lower in the search results using the second expression.   
![\[Yellow plus sign icon typically used to indicate an add or create action.\]](http://docs.aws.amazon.com/cloudsearch/latest/developerguide/images/cloudsearch-console-yellow-plus.png) Yellow plus  
 The document is included in the search results using the second expression, but was omitted from the search results using the first expression.  
![\[Red circular sign with a white horizontal bar, indicating prohibition or restriction.\]](http://docs.aws.amazon.com/cloudsearch/latest/developerguide/images/cloudsearch-console-red-minus.png) Red minus  
 The document was omitted from the search results using the second expression, but was included in the search results using the first expression. 

**Note**  
You can save expressions to your domain configuration directly from the **Compare expressions** pane. To save either expression, choose **Save expression**.

# Getting Results as XML in Amazon CloudSearch
<a name="getting-xml-results"></a>

By default, Amazon CloudSearch search responses are formatted in JSON. To get results as XML, specify the query parameter `format=xml` in your search request:

```
search?q=star wars&return=_no_fields&format=xml
```

Search responses formatted in XML contain exactly the same information as a JSON response: 

```
<results>
    <status rid="3abhhs8oEAqMHnk=" time-ms="2"/>
    <hits found="9" start="0">
        <hit id="tt0076759"/>
        <hit id="tt0086190"/>
        <hit id="tt0121766"/>
        <hit id="tt2488496"/>
        <hit id="tt1408101"/>
        <hit id="tt0489049"/>
        <hit id="tt0120915"/>
        <hit id="tt0080684"/>
        <hit id="tt0121765"/>
    </hits>
</results>
```

For detailed information about the JSON and XML response formats for search requests, see [Search Response](search-api.md#search-response).
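
An XML response shaped like the one above can be read with Python's standard `xml.etree.ElementTree` module. This sketch assumes the response has no XML namespace, as in the sample; `parse_xml_hits` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SAMPLE_RESPONSE = """\
<results>
    <status rid="3abhhs8oEAqMHnk=" time-ms="2"/>
    <hits found="9" start="0">
        <hit id="tt0076759"/>
        <hit id="tt0086190"/>
        <hit id="tt0121766"/>
        <hit id="tt2488496"/>
        <hit id="tt1408101"/>
        <hit id="tt0489049"/>
        <hit id="tt0120915"/>
        <hit id="tt0080684"/>
        <hit id="tt0121765"/>
    </hits>
</results>
"""


def parse_xml_hits(xml_text):
    """Return (found, start, list of document IDs) from an XML search response."""
    root = ET.fromstring(xml_text)
    hits = root.find("hits")
    found = int(hits.get("found"))
    start = int(hits.get("start"))
    ids = [hit.get("id") for hit in hits.findall("hit")]
    return found, start, ids


found, start, ids = parse_xml_hits(SAMPLE_RESPONSE)
print(found, start, ids[0])
```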

# Paginating Results in Amazon CloudSearch
<a name="paginating-results"></a>

By default, Amazon CloudSearch returns the top ten hits according to the specified sort order. To control the number of hits returned in a result set, you use the `size` parameter. 

To get the next set of hits beginning from a particular offset, you can use the `start` parameter. Note that the result set is zero-based: the first result is at index 0. You can get the first 10,000 hits using the `size` and `start` parameters. To page through more than 10,000 hits, use the `cursor` parameter. For more information, see [Deep Paging Beyond 10,000 Hits](#deep-paging).

For example, `search?q=wolverine` returns the first 10 hits that contain *wolverine*, starting at index 0. The following example sets the `start` parameter to 10 to get the next set of ten hits.

```
search?q=wolverine&start=10
```

If you want to retrieve 25 hits at a time, set the `size` parameter to 25. To get the first set of hits, you don't have to set the `start` parameter. 

```
search?q=wolverine&size=25
```

For subsequent requests, use the `start` parameter to retrieve the set of hits you want. For example, to get the third batch of 25 hits, specify the following:

```
search?q=wolverine&size=25&start=50
```
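
The offset arithmetic generalizes to `start = page_index * size`, with pages counted from zero. The helper below is a hypothetical sketch; treating the 10,000-hit limit as `start + size <= 10000` is an assumption about exactly where the boundary falls.

```python
MAX_OFFSET_HITS = 10_000  # limit for paging with start and size


def page_params(page_index, size=10):
    """Return the size and start parameters for a zero-based page index."""
    if page_index < 0 or size < 1:
        raise ValueError("page_index must be >= 0 and size must be >= 1")
    start = page_index * size
    # Assumed boundary: the whole page must fall within the first 10,000 hits.
    if start + size > MAX_OFFSET_HITS:
        raise ValueError("beyond 10,000 hits; use the cursor parameter instead")
    return {"size": size, "start": start}


# Third batch of 25 hits: page index 2.
print(page_params(2, size=25))
```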

## Deep Paging Beyond 10,000 Hits in Amazon CloudSearch
<a name="deep-paging"></a>

Using `size` and `start` to page through results works well if you only need to access the first few pages of results. However, if you need to page through thousands of hits, using a cursor is more efficient. To page through more than 10,000 hits, you must use a `cursor`. (You can only access the first 10,000 hits using the `start` and `size` parameters.)

To page through results using a cursor, you specify `cursor=initial` in your initial search request and include the `size` parameter to specify how many hits you want to get. Amazon CloudSearch returns a cursor value in the response that you use to get the next set of hits. Cursors return sequential sets of hits; however, you can use them to simulate random access of a deep page if you need to. Keep in mind that cursors are intended to be used to page through a result set within a reasonable amount of time of the initial request. Using a stale cursor can return stale results if updates have been posted to the index in the interim. 

**Important**  
When you use a cursor to page through a result set that is sorted by document score (`_score`), you can get inconsistent results if the index is updated between requests. This can also occur if your domain's replication count is greater than one, because updates are applied in an eventually consistent manner across the instances in the domain. If this is an issue, avoid sorting the results by score. You can either use the `sort` option to sort by a particular field, or use `fq` instead of `q` to specify your search criteria. (Document scores are not calculated for filter queries.)

For example, the following request sets the `cursor` value to `initial` and the `size` parameter to `100` to get the first set of hits.

```
search?q=-star&cursor=initial&size=100
```

The cursor for the next set of hits is included in the response.

```
{
    "status": {
        "rid": "z67+3L0oHgo6swY=",
        "time-ms": 7
    },
    "hits": {
        "found": 1649,
        "start": 0,
        "cursor": "Vb-HSS4YQW9JSVFKeFpvQ2wwZERBek16SXpOems9Aw",
        "hit": [
            {
                "id": "tt0397892"
            },
            .
            .
            .
            {
                "id": "tt0332379"
            }
        ]
    }
}
```

In the next request, the `cursor` parameter specifies the returned cursor value.

```
search?q=-star&cursor=Vb-HSS4YQW9JSVFKeFpvQ2wwZERBek16SXpOems9Aw&size=100
```
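
The request-and-response cycle above amounts to a loop that feeds each returned cursor into the next request. The sketch below shows that loop with the HTTP call abstracted away: `fetch_page` is a hypothetical callable that sends one search request and returns the parsed JSON response, and `make_fake_fetch` is an in-memory stand-in so the example runs without a domain. Stopping when a page comes back with no hits is an assumption about how the end of the result set is signaled.

```python
def iter_hits(fetch_page, size=100):
    """Yield every hit by following cursors, starting from cursor=initial."""
    cursor = "initial"
    while True:
        response = fetch_page(cursor=cursor, size=size)
        hits = response["hits"]["hit"]
        if not hits:
            return  # assumed end-of-results signal: an empty page
        yield from hits
        # The cursor for the next set of hits is included in each response.
        cursor = response["hits"]["cursor"]


def make_fake_fetch(total):
    """Build a stand-in for a real search call; its cursor is just an offset."""
    def fetch_page(cursor, size):
        offset = 0 if cursor == "initial" else int(cursor)
        ids = [{"id": f"doc{i}"} for i in range(offset, min(offset + size, total))]
        return {
            "hits": {
                "found": total,
                "start": offset,
                "cursor": str(offset + len(ids)),
                "hit": ids,
            }
        }
    return fetch_page


all_hits = list(iter_hits(make_fake_fetch(250), size=100))
print(len(all_hits))
```

Real cursor values are opaque strings like the one shown above; only the fake treats them as offsets.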