

# Using Select to Create Amazon SimpleDB Queries
<a name="UsingSelect"></a>

This section describes `Select`, a function that takes query expressions similar to the standard SQL `SELECT` statement.

Use the following format for the `Select` function. 

```
select output_list
from domain_name
[where expression]
[sort_instructions]
[limit limit]
```

The *output\$1list* can be any of the following:
+ \$1 (all attributes)
+ itemName() (the item name only)
+ count(\$1)
+ An explicit list of attributes (attribute1,..., attributeN)


|  Name  |  Description  | 
| --- | --- | 
|  domain\$1name  |   The domain to search.   | 
|  expression  |   The match expression. This rest of this section provides examples of how to form select expressions.   | 
|  sort\$1instructions  |  Sorts the results on a single attribute, in ascending or descending order. For information on sorting results, see [Sort](SortingDataSelect.md).  | 
|  limit  |   The *limit* is the maximum number of results per page to return (default: 100, max. 2500).  The total size of the response cannot exceed 1 MB. Amazon SimpleDB automatically adjusts the number of items returned per page to enforce this limit. For example, even if you ask to retrieve 2500 items, but each individual item is 10 KB in size, the system returns 100 items and an appropriate next token so you can get the next page of results.   | 

The *expression* can be any of the following:
+  <select expression> intersection <select expression> 
+  NOT <select expression> 
+  (<select expression>) 
+  <select expression> or <select expression> 
+  <select expression> and <select expression> 
+  <simple comparison> 

**Note**  
For information on how to use quotes with Amazon SimpleDB, see [Select Quoting Rules](QuotingRulesSelect.md).

# Comparison Operators
<a name="UsingSelectOperators"></a>

Comparison operators are applied to a single attribute and are lexicographical in nature. When designing an application, you should carefully think through storing data in its appropriate string representation. For more information, see [Working with Numerical Data](NumericalData.md).

The following table shows all Amazon SimpleDB comparison operators.


| Comparison Operator | Description | Select Example | 
| --- | --- | --- | 
| = | Attribute value or itemName() equals the specified constant. |  `select * from mydomain where city = 'Seattle'` `select * from mydomain where city = 'Seattle' or city = 'Portland'`  | 
| \$1= | Attribute value or itemName() does not equal to the specified constant. |  `select * from mydomain where name != 'John'` `select * from mydomain where name != 'John' and name != 'Humberto'`  | 
| > | Attribute value or itemName() is greater than the specified constant. | select \$1 from mydomain where weight > '0034' | 
| >= | Attribute value or itemName() is greater than or equal to the specified constant. | select \$1 from mydomain where weight >= '065' | 
| < | Attribute value or itemName() is less than the specified constant. | select \$1 from mydomain where weight < '0034' | 
| <= | Attribute value or itemName() is less than or equal to the specified constant. | select \$1 from mydomain where year <= '2000' | 
| like | Attribute value or itemName() contains the specified constant.  The `like` operator can be used to evaluate the start of a string ('*string*%'), the end of a string ('%*string*'), or any part of a string ('%*string*%').    Using the `like` operator to evaluate the end of a string or any part of a string is an expensive operation. Make sure to combine it with other predicates to reduce number of items it has to evaluate.  To search for strings that contain the percent sign (%), you must escape it. For example, to search for a string that ends in '3%', you enter `select * from mydomain where name like '%3\%'`. To search for a string that begins with '3%', you enter `select * from mydomain where name like '3\%%'`. To search for a string that contains '3%', you enter `select * from mydomain where name like '%3\%%'`.    |  select \$1 from mydomain where author like 'Henry%' select \$1 from mydomain where keyword = 'Book' and author like '%Miller'  | 
|  not like  | Attribute value or itemName() does not contain the specified constant.  The `not like` operator can be used to evaluate the start of a string ('*string*%'), the end of a string ('%*string*'), or any part of a string ('%*string*%').    Using the `not like` operator to evaluate the end of a string or any part of a string is an expensive operation. Make sure to combine it with other predicates to reduce number of items it has to evaluate.    |  select \$1 from mydomain where author not like 'Henry%' select \$1 from mydomain where keyword = 'Book' and author not like '%Miller'  | 
| between | Attribute value or itemName() falls within a range, including the start and end value. | select \$1 from mydomain where year between '1998' and '2000' | 
| in | Attribute value or itemName() is equal to one of the specified constants. When used with items, in acts as a batch get. |  select \$1 from mydomain where year in('1998','2000','2003') select \$1 from mydomain where itemName() in('0385333498','0802131786','B000T9886K')  | 
| is null | Attribute does not exist. If an item has the attribute with an empty string, it is not returned.   Due to performance issues, this operator is not recommended when most items have the specified attribute.    | select \$1 from mydomain where year is null | 
| is not null | Attribute value or itemName() contains any value. | select \$1 from mydomain where year is not null | 
| every() | For multi-valued attributes, every attribute value must satisfy the constraint.   Due to the cost of running more complex queries, this operator is only recommended for multi-valued attributes.    | `select * from mydomain where every(keyword) = 'Book'`  `select * from mydomain where every(keyword) like '***%'`  | 

# Sample Query Data Set
<a name="UsingSelectSampleDataset"></a>

The following table contains the data set used throughout this section.


| Item Name | Title | Author | Year | Pages | Keyword | Rating | 
| --- | --- | --- | --- | --- | --- | --- | 
| 0385333498 | The Sirens of Titan | Kurt Vonnegut | 1959 | 00336 |  Book Paperback  |  \$1\$1\$1\$1\$1 5 stars Excellent  | 
| 0802131786 | Tropic of Cancer | Henry Miller | 1934 | 00318 |  Book  |  \$1\$1\$1\$1  | 
| 1579124585 | The Right Stuff | Tom Wolfe | 1979 | 00304 |  Book Hardcover American  |  \$1\$1\$1\$1 4 stars  | 
| B000T9886K | In Between | Paul Van Dyk | 2007 |  |  CD Trance  |  4 stars  | 
| B00005JPLW | 300 | Zack Snyder | 2007 |  |  DVD Action Frank Miller  |  \$1\$1\$1 3 stars Not bad  | 
| B000SF3NGK | Heaven's Gonna Burn Your Eyes | Thievery Corporation | 2002 |  |    | \$1\$1\$1\$1\$1 | 

# Simple Queries
<a name="SimpleQueriesSelect"></a>

This section shows simple queries and their results.

**Note**  
To view the source data for the queries, see [Sample Query Data Set](UsingSelectSampleDataset.md). 

The following table shows some simple queries, how they are interpreted, and the results they return from the sample dataset.


| Select Expression | Description | Result | 
| --- | --- | --- | 
| select \$1 from mydomain where Title = 'The Right Stuff' | Retrieves all items where the attribute "Title" equals "The Right Stuff". | 1579124585 | 
| select \$1 from mydomain where Year > '1985' |  Retrieves all items where "Year" is greater than "1985". Although this looks like a numerical comparison, it is lexicographical. Because the calendar won't change to five digits for nearly 8,000 years, "Year" is not zero padded.  | B000T9886K, B00005JPLW, B000SF3NGK | 
| select \$1 from mydomain where Rating like '\$1\$1\$1\$1%' |  Retrieves all items that have at least a 4 star (\$1\$1\$1\$1) rating. The prefix comparison is case-sensitive and exact and does not match attributes that only have the "4 star" value, such as item B000T9886K.   | 0385333498, 1579124585, 0802131786, B000SF3NGK  | 
| select \$1 from mydomain where Pages < '00320' |  Retrieves all items that have less than 320 pages. This attribute is zero padded in the data set and the select expression, which allows for proper lexicographical comparison between the strings. Items without this attribute are not considered.  | 1579124585, 0802131786,  | 

# Range Queries
<a name="RangeQueriesSelect"></a>

Amazon SimpleDB enables you to execute more than one comparison against attribute values within the same predicate. This is most commonly used to specify a range of values.

This section shows range queries and their results. 

**Note**  
To view the source data for the queries, see [Sample Query Data Set](UsingSelectSampleDataset.md). 

The following table shows some range queries, how they are interpreted, and the results they return from the sample dataset.


| Select Expression | Description | Result | 
| --- | --- | --- | 
| select \$1 from mydomain where Year > '1975' and Year < '2008' | Retrieves all items that have a "Year" value between "1975" and "2008", excluding "1975" and "2008". | 1579124585, B000T9886K, B00005JPLW, B000SF3NGK | 
| select \$1 from mydomain where Year between '1975' and '2008' | Retrieves all items that have a "Year" value between "1975" and "2008", including "1975" and "2008". | 1579124585, B000T9886K, B00005JPLW, B000SF3NGK | 
| select \$1 from mydomain where Rating = '\$1\$1\$1' or Rating = '\$1\$1\$1\$1\$1' |  Retrieves all items that have 3 (\$1\$1\$1) or 5 (\$1\$1\$1\$1\$1) star rating  This is a discontiguous range query that consists of two distinct values selected from the range of all possible values for the attribute.  | 0385333498, B00005JPLW, B000SF3NGK | 
| select \$1 from mydomain where (Year > '1950' and Year < '1960') or Year like '193%' or Year = '2007' | Retrieves all items where the "Year" attribute is either between "1950" and "1960", excluding "1950" and "1960", or falls in the nineteen-thirties, or equals "2007".  | 0385333498, 0802131786, B000T9886K, B00005JPLW  | 

# Queries on Attributes with Multiple Values
<a name="RangeValueQueriesSelect"></a>

One of the unique features of Amazon SimpleDB is that it allows you to associate multiple values with a single attribute. Internet-related attributes such as *tag* or *keyword* often contain multiple values, which are easy to support through the Amazon SimpleDB data model and query language.

**Important**  
Each attribute is considered individually against the comparison conditions defined in the predicate. Item names are selected if *any* of the values match the predicate condition. To change this behavior, use the `every()` operator to return results where *every* attribute matches the query expression.

This section shows queries on attributes with multiple values and their results. 

**Note**  
To view the source data for the queries, see [Sample Query Data Set](UsingSelectSampleDataset.md). 

The following table shows some queries on attributes with multiple values, how they are interpreted, and the results they return from the sample dataset.


| Select Expression | Description | Result | 
| --- | --- | --- | 
| select \$1 from mydomain where Rating = '4 stars' or Rating = '\$1\$1\$1\$1' |  Retrieves all items with a 4 star (\$1\$1\$1\$1) rating. The data set has this rating stored as both "4 stars" and "\$1\$1\$1\$1." Amazon SimpleDB returns items that have either or both.  | 1579124585, 0802131786, B000T9886K | 
| select \$1 from mydomain where Keyword = 'Book' and Keyword = 'Hardcover' |  Retrieve all items that have the Keyword attribute as both "Book" and "Hardcover." Based on the data set, you might be surprised that the result did not return the "1579124585" item. As described earlier, each value is evaluated individually against the predicate expression. Since neither of the values satisfies *both* comparisons defined in the predicate, the item name is not selected. To get the desired results, you can use the `select * from mydomain where Keyword = 'Book' intersection Keyword = 'Hardcover'` expression. For more information, see [Multiple Attribute Queries](MultipleAttributeQueriesSelect.md).  | <none> | 
| select \$1 from mydomain where every(keyword) in ('Book', 'Paperback') |  Retrieves all items where the only keyword is Book or Paperback. If the item contains any other keyword entries, it is not returned.  | 0385333498, 0802131786 | 

# Multiple Attribute Queries
<a name="MultipleAttributeQueriesSelect"></a>

The previous examples show how to create expressions for single predicates. The Amazon SimpleDB query language also supports constructing expressions across multiple predicates using the `intersection` operator.

Multiple attribute queries work by producing a set of item names from each predicate and applying the `intersection` operator. The intersection operator only returns item names that appear in both result sets.

This section shows multiple attribute queries and their results. 

**Note**  
To view the source data for the queries, see [Sample Query Data Set](UsingSelectSampleDataset.md). 

The following table shows some multiple attribute queries, how they are interpreted, and the results they return from the sample dataset.


| Select Expression | Description | Result | 
| --- | --- | --- | 
| select \$1 from mydomain where Rating = '\$1\$1\$1\$1' |  Retrieves all items that have a "\$1\$1\$1\$1" Rating.  | 0802131786, 1579124585 | 
| select \$1 from mydomain where every(Rating) = '\$1\$1\$1\$1' |  Retrieves all items that only have a "\$1\$1\$1\$1" Rating. Items are not returned that have a multi-valued Rating attribute that contains any value other than '\$1\$1\$1\$1.'  | 0802131786 | 
| select \$1 from mydomain where Keyword = 'Book' intersection Keyword = 'Hardcover' |  Retrieves all items that have a "Book" Keyword and a "Hardcover" Keyword. The first predicate produces 0385333498, 0802131786, and 1579124585. The second produces 1579124585. The intersection operator returns results that appear in both queries.  | 1579124585 | 

# Sort
<a name="SortingDataSelect"></a>

Amazon SimpleDB supports sorting data on a single attribute or the item names, in ascending (default) or descending order. This section describes how to sort the result set returned from `Select`.

**Note**  
 All sort operations are performed in lexicographical order.   
 The sort attribute must be present in at least one of the predicates of the expression.   
Because returned results must contain the attribute on which you are sorting, do not use `is null` on the sort attribute. For example, `select * from mydomain where author is null and title is not null order by title` will succeed. However, `select * from mydomain where author is null order by title ` will fail because `title` is not constrained by the `not null` predicate.   
To view the source data for the queries, see [Sample Query Data Set](UsingSelectSampleDataset.md).

The following table shows sort queries, how they are interpreted, and the results they return from the sample dataset.


| Select Expression | Description | Result | 
| --- | --- | --- | 
| select \$1 from mydomain where Year < '1980' order by Year asc |  Retrieves all items released before 1980 and lists them in ascending order.  | 0802131786, 0385333498, 1579124585 | 
| select \$1 from mydomain where Year < '1980' order by Year |  Same as the previous entry, with "asc" (ascending) omitted.  | 0802131786, 0385333498, 1579124585 | 
| select \$1 from mydomain where Year = '2007' intersection Author is not null order by Author desc |  Retrieves all items released in 2007 and sorts them by author name in descending order.  | B00005JPLW, B000T9886K | 
| select \$1 from mydomain order by Year asc |  Invalid because Year is not constrained by a predicate in the where clause.  | InvalidSortExpression error. See [API Error Codes](APIError.md). | 
| select \$1 from mydomain where Year < '1980' order by Year limit 2 |  Retrieves two items that were released before 1980 and lists them in ascending order.  | 0802131786, 0385333498 | 
| select itemName() from mydomain where itemName() like 'B000%' order by itemName() |  Retrieves all itemNames() that start with B000 and lists them in ascending order.  | B00005JPLW, B000SF3NGK, B000T9886K | 

# Count
<a name="CountingDataSelect"></a>

If you want to count the number of items in a result set instead of returning the items, use `count(*)`. Instead of returning a list of items, Amazon SimpleDB returns a single item called `Domain` with a `Count` attribute.

**Note**  
 If the count request takes more than five seconds, Amazon SimpleDB returns the number of items that it could count and a next token to return additional results. The client is responsible for accumulating the partial counts.   
 If Amazon SimpleDB returns a 408 Request Timeout, please resubmit the request.   
 The default result limit of 100 and maximum result limit of 2500 do not apply to count(\$1). However, you can restrict the maximum number of counted results using the limit clause. 

 The next token returned by `count(*)` and `select` are interchangeable as long as the `where` and `order by` clauses match. For example, if you want to return the 200 items after the first 10,000 (similar to an offset), you can perform a count with a limit clause of 10,000 and use the next token to return the next 200 items with `select`. 

The following table shows `count(*)` queries and the results they return from the sample dataset.

**Note**  
To view the source data for the queries, see [Sample Query Data Set](UsingSelectSampleDataset.md). 


| Select Expression | Description | Result | 
| --- | --- | --- | 
| select count(\$1) from mydomain where Title = 'The Right Stuff' | Counts all items where the attribute "Title" equals "The Right Stuff." | 1 | 
| select count(\$1) from mydomain where Year > '1985' |  Counts all items where "Year" is greater than "1985."  | 3 | 
| select count(\$1) from mydomain limit 500 | Counts all items in the domain, with a limit of 500. | 6 | 
| select count(\$1) from mydomain limit 4 | Counts all items in the domain, with a limit of 4. | 4 | 

# Select Quoting Rules
<a name="QuotingRulesSelect"></a>

Attribute values must be quoted with a single or double quote. If a quote appears within the attribute value, it must be escaped with the same quote symbol. These following two expressions are equivalent:

```
select * from mydomain where attr1 = 'He said, "That''s the ticket!"'
select * from mydomain where attr1 = "He said, ""That's the ticket!"""
```

Attribute and domain names may appear without quotes if they contain only letters, numbers, underscores (\$1), or dollar symbols (\$1) and do not start with a number. You must quote all other attribute and domain names with the backtick (`).

```
select * from mydomain where `timestamp-1` > '1194393600'
```

You must escape the backtick when it appears in the attribute or domain name by replacing it with two backticks. For example, we can retrieve any items that have the attribute abc`123 set to the value 1 with this select expression:

```
select * from mydomain where `abc``123` = '1'
```

The following is the list of reserved keywords that are valid identifiers that must be backtick quoted if used as an attribute or domain name in the Select syntax.
+ or
+ and
+ not
+ from
+ where
+ select
+ like
+ null
+ is
+ order
+ by
+ asc
+ desc
+ in
+ between
+ intersection
+ limit
+ every