

# SQL reference for Amazon S3 Select
<a name="s3-select-sql-reference"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

This reference contains a description of the structured query language (SQL) elements that are supported by Amazon S3 Select.

**Topics**
+ [

# SELECT command
](s3-select-sql-reference-select.md)
+ [

# Data types
](s3-select-sql-reference-data-types.md)
+ [

# Operators
](s3-select-sql-reference-operators.md)
+ [

# Reserved keywords
](s3-select-sql-reference-keyword-list.md)
+ [

# SQL functions
](s3-select-sql-reference-sql-functions.md)

# SELECT command
<a name="s3-select-sql-reference-select"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports only the `SELECT` SQL command. The following ANSI standard clauses are supported for `SELECT`: 


+ `SELECT` list
+ `FROM` clause 
+ `WHERE` clause
+ `LIMIT` clause

**Note**  
Amazon S3 Select queries currently do not support subqueries or joins.

## SELECT list
<a name="s3-select-sql-reference-select-list"></a>

The `SELECT` list names the columns, functions, and expressions that you want the query to return. The list represents the output of the query. 

```
SELECT *
SELECT projection1 AS column_alias_1, projection2 AS column_alias_2
```

The first form of `SELECT` with the `*` (asterisk) returns every row that passed the `WHERE` clause, as-is. The second form of `SELECT` creates a row with user-defined output scalar expressions **`projection1`** and **`projection2`** for each column.

## FROM clause
<a name="s3-select-sql-reference-from"></a>

Amazon S3 Select supports the following forms of the `FROM` clause:

```
FROM table_name
FROM table_name alias
FROM table_name AS alias
```

In each form of the `FROM` clause, `table_name` is the `S3Object` that's being queried. Users coming from traditional relational databases can think of this as a database schema that contains multiple views over a table.

Following standard SQL, the `FROM` clause creates rows that are filtered in the `WHERE` clause and projected in the `SELECT` list. 

For JSON objects that are stored in Amazon S3 Select, you can also use the following forms of the `FROM` clause:

```
FROM S3Object[*].path
FROM S3Object[*].path alias
FROM S3Object[*].path AS alias
```

Using this form of the `FROM` clause, you can select from arrays or objects within a JSON object. You can specify `path` by using one of the following forms:
+ By name (in an object): `.name` or `['name']`
+ By index (in an array): `[index]`
+ By wildcard character (in an object): `.*`
+ By wildcard character (in an array): `[*]`

**Note**  
This form of the `FROM` clause works only with JSON objects.
Wildcard characters always emit at least one record. If no record matches, then Amazon S3 Select emits the value `MISSING`. During output serialization (after the query finishes running), Amazon S3 Select replaces `MISSING` values with empty records.
Aggregate functions (`AVG`, `COUNT`, `MAX`, `MIN`, and `SUM`) skip `MISSING` values.
If you don't provide an alias when using a wildcard character, you can refer to the row by using the last element in the path. For example, you could select all prices from a list of books by using the query `SELECT price FROM S3Object[*].books[*].price`. If the path ends in a wildcard character instead of a name, then you can use the value `_1` to refer to the row. For example, instead of `SELECT price FROM S3Object[*].books[*].price`, you could use the query `SELECT _1.price FROM S3Object[*].books[*]`.
Amazon S3 Select always treats a JSON document as an array of root-level values. Thus, even if the JSON object that you are querying has only one root element, the `FROM` clause must begin with `S3Object[*]`. However, for compatibility reasons, Amazon S3 Select allows you to omit the wildcard character if you don't include a path. Thus, the complete clause `FROM S3Object` is equivalent to `FROM S3Object[*] as S3Object`. If you include a path, you must also use the wildcard character. So, `FROM S3Object` and `FROM S3Object[*].path` are both valid clauses, but `FROM S3Object.path` is not.

**Example**  
**Examples:**  
*Example \$11*  
This example shows results when using the following dataset and query:  

```
{ "Rules": [ {"id": "1"}, {"expr": "y > x"}, {"id": "2", "expr": "z = DEBUG"} ]}
{ "created": "June 27", "modified": "July 6" }
```

```
SELECT id FROM S3Object[*].Rules[*].id
```

```
{"id":"1"}
{}
{"id":"2"}
{}
```
Amazon S3 Select produces each result for the following reasons:  
+ `{"id":"id-1"}` – `S3Object[0].Rules[0].id` produced a match.
+ `{}` – `S3Object[0].Rules[1].id` did not match a record, so Amazon S3 Select emitted `MISSING`, which was then changed to an empty record during output serialization and returned.
+ `{"id":"id-2"}` – `S3Object[0].Rules[2].id` produced a match.
+ `{}` – `S3Object[1]` did not match on `Rules`, so Amazon S3 Select emitted `MISSING`, which was then changed to an empty record during output serialization and returned.
If you don't want Amazon S3 Select to return empty records when it doesn't find a match, you can test for the value `MISSING`. The following query returns the same results as the previous query, but with the empty values omitted:  

```
SELECT id FROM S3Object[*].Rules[*].id WHERE id IS NOT MISSING
```

```
{"id":"1"}
{"id":"2"}
```
*Example \$12*  
This example shows results when using the following dataset and queries:  

```
{ "created": "936864000", "dir_name": "important_docs", "files": [ { "name": "." }, { "name": ".." }, { "name": ".aws" }, { "name": "downloads" } ], "owner": "Amazon S3" }
{ "created": "936864000", "dir_name": "other_docs", "files": [ { "name": "." }, { "name": ".." }, { "name": "my stuff" }, { "name": "backup" } ], "owner": "User" }
```

```
SELECT d.dir_name, d.files FROM S3Object[*] d
```

```
{"dir_name":"important_docs","files":[{"name":"."},{"name":".."},{"name":".aws"},{"name":"downloads"}]}
{"dir_name":"other_docs","files":[{"name":"."},{"name":".."},{"name":"my stuff"},{"name":"backup"}]}
```

```
SELECT _1.dir_name, _1.owner FROM S3Object[*]
```

```
{"dir_name":"important_docs","owner":"Amazon S3"}
{"dir_name":"other_docs","owner":"User"}
```

## WHERE clause
<a name="s3-select-sql-reference-where"></a>

The `WHERE` clause follows this syntax: 

```
WHERE condition
```

The `WHERE` clause filters rows based on the `condition`. A condition is an expression that has a Boolean result. Only rows for which the condition evaluates to `TRUE` are returned in the result.

## LIMIT clause
<a name="s3-select-sql-reference-limit"></a>

The `LIMIT` clause follows this syntax: 

```
LIMIT number
```

The `LIMIT` clause limits the number of records that you want the query to return based on `number`.

## Attribute access
<a name="s3-select-sql-reference-attribute-access"></a>

The `SELECT` and `WHERE` clauses can refer to record data by using one of the methods in the following sections, depending on whether the file that is being queried is in CSV or JSON format.

### CSV
<a name="s3-select-sql-reference-attribute-access-csv"></a>
+ **Column Numbers** – You can refer to the *Nth* column of a row with the column name `_N`, where *`N`* is the column position. The position count starts at 1. For example, the first column is named `_1` and the second column is named `_2`.

  You can refer to a column as `_N` or `alias._N`. For example, `_2` and `myAlias._2` are both valid ways to refer to a column in the `SELECT` list and `WHERE` clause.
+ **Column Headers** – For objects in CSV format that have a header row, the headers are available to the `SELECT` list and `WHERE` clause. In particular, as in traditional SQL, within `SELECT` and `WHERE` clause expressions, you can refer to the columns by `alias.column_name` or `column_name`.

### JSON
<a name="s3-select-sql-reference-attribute-access-json"></a>
+ **Document** – You can access JSON document fields as `alias.name`. You can also access nested fields, for example, `alias.name1.name2.name3`.
+ **List** – You can access elements in a JSON list by using zero-based indexes with the `[]` operator. For example, you can access the second element of a list as `alias[1]`. You can combine accessing list elements with fields, for example, `alias.name1.name2[1].name3`.
+ **Examples:** Consider this JSON object as a sample dataset:

  ```
  {"name": "Susan Smith",
  "org": "engineering",
  "projects":
      [
       {"project_name":"project1", "completed":false},
       {"project_name":"project2", "completed":true}
      ]
  }
  ```

  *Example \$11*

  The following query returns these results:

  ```
  Select s.name from S3Object s
  ```

  ```
  {"name":"Susan Smith"}
  ```

  *Example \$12*

  The following query returns these results:

  ```
  Select s.projects[0].project_name from S3Object s
  ```

  ```
  {"project_name":"project1"}
  ```

## Case sensitivity of header and attribute names
<a name="s3-select-sql-reference-case-sensitivity"></a>

With Amazon S3 Select, you can use double quotation marks to indicate that column headers (for CSV objects) and attributes (for JSON objects) are case sensitive. Without double quotation marks, object headers and attributes are case insensitive. An error is thrown in cases of ambiguity.

The following examples are either 1) Amazon S3 objects in CSV format with the specified column headers, and with `FileHeaderInfo` set to `"Use"` for the query request; or 2) Amazon S3 objects in JSON format with the specified attributes.

*Example \$11:* The object being queried has the header or attribute `NAME`.
+ The following expression successfully returns values from the object. Because there are no quotation marks, the query is case insensitive.

  ```
  SELECT s.name from S3Object s
  ```
+ The following expression results in a 400 error `MissingHeaderName`. Because there are quotation marks, the query is case sensitive. 

  ```
  SELECT s."name" from S3Object s
  ```

*Example \$12:* The Amazon S3 object being queried has one header or attribute with `NAME` and another header or attribute with `name`.
+ The following expression results in a 400 error `AmbiguousFieldName`. Because there are no quotation marks, the query is case insensitive, but there are two matches, so the error is thrown.

  ```
  SELECT s.name from S3Object s
  ```
+ The following expression successfully returns values from the object. Because there are quotation marks, the query is case sensitive, so there is no ambiguity.

  ```
  SELECT s."NAME" from S3Object s
  ```

## Using reserved keywords as user-defined terms
<a name="s3-select-sql-reference-using-keywords"></a>

Amazon S3 Select has a set of reserved keywords that are needed to run the SQL expressions used to query object content. Reserved keywords include function names, data types, operators, and so on. In some cases, user-defined terms, such as the column headers (for CSV files) or attributes (for JSON objects), might clash with a reserved keyword. When this happens, you must use double quotation marks to indicate that you are intentionally using a user-defined term that clashes with a reserved keyword. Otherwise a 400 parse error will result.

For the full list of reserved keywords, see [Reserved keywords](s3-select-sql-reference-keyword-list.md).

The following example is either 1) an Amazon S3 object in CSV format with the specified column headers, with `FileHeaderInfo` set to `"Use"` for the query request, or 2) an Amazon S3 object in JSON format with the specified attributes.

*Example:* The object being queried has a header or attribute named `CAST`, which is a reserved keyword.
+ The following expression successfully returns values from the object. Because quotation marks are used in the query, S3 Select uses the user-defined header or attribute.

  ```
  SELECT s."CAST" from S3Object s
  ```
+ The following expression results in a 400 parse error. Because no quotation marks are used in the query, `CAST` clashes with a reserved keyword.

  ```
  SELECT s.CAST from S3Object s
  ```

## Scalar expressions
<a name="s3-select-sql-reference-scalar"></a>

Within the `WHERE` clause and the `SELECT` list, you can have SQL *scalar expressions*, which are expressions that return scalar values. They have the following form:
+ ***`literal`*** 

  An SQL literal. 
+ ***`column_reference`*** 

  A reference to a column in the form `column_name` or `alias.column_name`. 
+ **`unary_op`** **`expression`** 

  In this case, ****`unary_op`**** is an SQL unary operator.
+ **`expression`** **`binary_op`** ***`expression`*** 

   In this case, ****`binary_op`**** is an SQL binary operator. 
+ **`func_name`** 

   In this case, **`func_name`** is the name of the scalar function to invoke. 
+ ***`expression`*** `[ NOT ] BETWEEN` ****`expression`**** `AND` ****`expression`****
+ ***`expression`*** `LIKE` ****`expression`**** [ `ESCAPE` ***`expression`*** ]

# Data types
<a name="s3-select-sql-reference-data-types"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports several primitive data types.

## Data type conversions
<a name="s3-select-sql-reference-data-conversion"></a>

The general rule is to follow the `CAST` function if it's defined. If `CAST` is not defined, then all input data is treated as a string. In that case, you must cast your input data into the relevant data types when necessary.

For more information about the `CAST` function, see [CAST](s3-select-sql-reference-conversion.md#s3-select-sql-reference-cast).

## Supported data types
<a name="s3-select-sql-reference-supported-data-types"></a>

Amazon S3 Select supports the following set of primitive data types.


|  Name  |  Description  |  Examples  | 
| --- | --- | --- | 
| `bool` | A Boolean value, either `TRUE` or `FALSE`. | `FALSE` | 
| `int`, `integer` | An 8-byte signed integer in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.  | `100000` | 
| `string` | A UTF8-encoded variable-length string. The default limit is 1 character. The maximum character limit is 2,147,483,647.  | `'xyz'` | 
| `float` | An 8-byte floating point number.  | `CAST(0.456 AS FLOAT)` | 
| `decimal`, `numeric` |  A base-10 number, with a maximum precision of 38 (that is, the maximum number of significant digits), and with a scale within the range of -231 to 231-1 (that is, the base-10 exponent).  Amazon S3 Select ignores scale and precision when you provide both at the same time.   | `123.456 ` | 
| `timestamp` |  Timestamps represent a specific moment in time, always include a local offset, and are capable of arbitrary precision. In the text format, timestamps follow the [W3C note on date and time formats](https://www.w3.org/TR/NOTE-datetime), but they must end with the literal `T` if the timestamps are not at least whole-day precision. Fractional seconds are allowed, with at least one digit of precision, and an unlimited maximum. Local-time offsets can be represented as either hour:minute offsets from UTC, or as the literal `Z` to denote a local time of UTC. Local-time offsets are required on timestamps with time and are not allowed on date values.  | `CAST('2007-04-05T14:30Z' AS TIMESTAMP)` | 

### Supported Parquet types
<a name="s3-select-sql-reference-supported-data-types-parquet"></a>

Amazon S3 Select supports the following Parquet types.
+ `DATE`
+ `DECIMAL`
+ `ENUM`
+ `INT(8)`
+ `INT(16)`
+ `INT(32)`
+ `INT(64)`
+ `LIST`
**Note**  
For `LIST` Parquet type output, Amazon S3 Select supports only JSON format. However, if the query limits the data to simple values, the `LIST` Parquet type can also be queried in CSV format.
+ `STRING`
+ `TIMESTAMP` supported precision (`MILLIS`/`MICROS`/`NANOS`)
**Note**  
Timestamps saved as an `INT(96)` are unsupported.   
Because of the range of the `INT(64)` type, timestamps that are using the `NANOS` unit can represent only values between `1677-09-21 00:12:43` and `2262-04-11 23:47:16`. Values outside of this range cannot be represented with the `NANOS` unit.

### Mapping of Parquet types to supported data types in Amazon S3 Select
<a name="s3-select-sql-reference-supported-data-types-parquet-mapping"></a>


| Parquet types | Supported data types | 
| --- | --- | 
| `DATE` |  `timestamp`  | 
|  `DECIMAL`  |  `decimal`, `numeric`  | 
|  `ENUM`  |  `string`  | 
|  `INT(8)`  |  `int`, `integer`  | 
|  `INT(16)`  | `int`, `integer` | 
| `INT(32)` | `int`, `integer` | 
|  `INT(64)`  |  `decimal`, `numeric`  | 
|  `LIST`  |  Each Parquet type in list is mapped to the corresponding data type.  | 
|  `STRING`  |  `string`  | 
|  `TIMESTAMP`  |  `timestamp`  | 

# Operators
<a name="s3-select-sql-reference-operators"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following operators.

## Logical operators
<a name="s3-select-sql-reference-loical-ops"></a>
+ `AND`
+ `NOT`
+ `OR`

## Comparison operators
<a name="s3-select-sql-reference-compare-ops"></a>
+ `<` 
+ `>` 
+ `<=`
+ `>=`
+ `=`
+ `<>`
+ `!=`
+ `BETWEEN`
+ `IN` – For example: `IN ('a', 'b', 'c')`

  

## Pattern-matching operators
<a name="s3-select-sql-reference-pattern"></a>
+ `LIKE`
+ `_` (Matches any character)
+ `%` (Matches any sequence of characters)

## Unitary operators
<a name="s3-select-sql-reference-unitary-ops"></a>
+ `IS NULL`
+ `IS NOT NULL`

## Math operators
<a name="s3-select-sql-referencemath-ops"></a>

Addition, subtraction, multiplication, division, and modulo are supported, as follows:
+ \$1
+ -
+ \$1
+ /
+ %

## Operator precedence
<a name="s3-select-sql-reference-op-Precedence"></a>

The following table shows the operators' precedence in decreasing order.


|  Operator or element  |  Associativity |  Required  | 
| --- | --- | --- | 
| `-`  | right  | unary minus  | 
| `*`, `/`, `%`  | left  | multiplication, division, modulo  | 
| `+`, `-`  | left  | addition, subtraction  | 
| `IN` |  | set membership  | 
| `BETWEEN` |  | range containment  | 
| `LIKE` |  | string pattern matching  | 
| `<``>` |  | less than, greater than  | 
| `=` | right  | equality, assignment | 
| `NOT` | right | logical negation  | 
| `AND` | left | logical conjunction  | 
| `OR` | left | logical disjunction  | 

# Reserved keywords
<a name="s3-select-sql-reference-keyword-list"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

The following is the list of reserved keywords for Amazon S3 Select. These keywords include function names, data types, operators, and so on, that are needed to run the SQL expressions that are used to query object content.

```
absolute
action
add
all
allocate
alter
and
any
are
as
asc
assertion
at
authorization
avg
bag
begin
between
bit
bit_length
blob
bool
boolean
both
by
cascade
cascaded
case
cast
catalog
char
char_length
character
character_length
check
clob
close
coalesce
collate
collation
column
commit
connect
connection
constraint
constraints
continue
convert
corresponding
count
create
cross
current
current_date
current_time
current_timestamp
current_user
cursor
date
day
deallocate
dec
decimal
declare
default
deferrable
deferred
delete
desc
describe
descriptor
diagnostics
disconnect
distinct
domain
double
drop
else
end
end-exec
escape
except
exception
exec
execute
exists
external
extract
false
fetch
first
float
for
foreign
found
from
full
get
global
go
goto
grant
group
having
hour
identity
immediate
in
indicator
initially
inner
input
insensitive
insert
int
integer
intersect
interval
into
is
isolation
join
key
language
last
leading
left
level
like
limit
list
local
lower
match
max
min
minute
missing
module
month
names
national
natural
nchar
next
no
not
null
nullif
numeric
octet_length
of
on
only
open
option
or
order
outer
output
overlaps
pad
partial
pivot
position
precision
prepare
preserve
primary
prior
privileges
procedure
public
read
real
references
relative
restrict
revoke
right
rollback
rows
schema
scroll
second
section
select
session
session_user
set
sexp
size
smallint
some
space
sql
sqlcode
sqlerror
sqlstate
string
struct
substring
sum
symbol
system_user
table
temporary
then
time
timestamp
timezone_hour
timezone_minute
to
trailing
transaction
translate
translation
trim
true
tuple
union
unique
unknown
unpivot
update
upper
usage
user
using
value
values
varchar
varying
view
when
whenever
where
with
work
write
year
zone
```

# SQL functions
<a name="s3-select-sql-reference-sql-functions"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following SQL functions.

**Topics**
+ [

# Aggregate functions
](s3-select-sql-reference-aggregate.md)
+ [

# Conditional functions
](s3-select-sql-reference-conditional.md)
+ [

# Conversion functions
](s3-select-sql-reference-conversion.md)
+ [

# Date functions
](s3-select-sql-reference-date.md)
+ [

# String functions
](s3-select-sql-reference-string.md)

# Aggregate functions
<a name="s3-select-sql-reference-aggregate"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following aggregate functions.


| Function | Argument type | Return type | 
| --- | --- | --- | 
| `AVG(expression)` | `INT`, `FLOAT`, `DECIMAL` | `DECIMAL` for an `INT` argument, `FLOAT` for a floating-point argument; otherwise the same as the argument data type. | 
| `COUNT` |  `-`  | `INT` | 
| `MAX(expression)` | `INT`, `DECIMAL` | Same as the argument type. | 
| `MIN(expression)` | `INT`, `DECIMAL` | Same as the argument type. | 
| `SUM(expression)` | `INT`, `FLOAT`, `DOUBLE`, `DECIMAL` | `INT` for an `INT` argument, `FLOAT` for a floating-point argument; otherwise, the same as the argument data type. | 

## SUM example
<a name="s3-select-sql-reference-aggregate-case-examples"></a>

To aggregate the total object sizes of a folder in an [S3 Inventory report](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html), use a `SUM` expression.

The following S3 Inventory report is a CSV file that's compressed with GZIP. There are three columns.
+ The first column is the name of the S3 bucket (*`DOC-EXAMPLE-BUCKET`*) that the S3 Inventory report is for.
+ The second column is the object key name that uniquely identifies the object in the bucket.

  The `example-folder/` value in the first row is for the folder `example-folder`. In Amazon S3, when you create a folder in your bucket, S3 creates a 0-byte object with a key that's set to the folder name that you provided.

  The `example-folder/object1` value in the second row is for the object `object1` in the folder `example-folder`.

  The `example-folder/object2` value in the third row is for the object `object2` in the folder `example-folder`.

  For more information about S3 folders, see [Organizing objects in the Amazon S3 console by using folders](using-folders.md).
+ The third column is the object size in bytes.

```
"DOC-EXAMPLE-BUCKET","example-folder/","0"
"DOC-EXAMPLE-BUCKET","example-folder/object1","2011267"
"DOC-EXAMPLE-BUCKET","example-folder/object2","1570024"
```

To use a `SUM` expression to calculate the total size of the folder `example-folder`, run the SQL query with Amazon S3 Select.

```
SELECT SUM(CAST(_3 as INT)) FROM s3object s WHERE _2 LIKE 'example-folder/%' AND _2 != 'example-folder/';
```

Query Result: 

```
3581291
```

# Conditional functions
<a name="s3-select-sql-reference-conditional"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following conditional functions.

**Topics**
+ [

## CASE
](#s3-select-sql-reference-case)
+ [

## COALESCE
](#s3-select-sql-reference-coalesce)
+ [

## NULLIF
](#s3-select-sql-reference-nullif)

## CASE
<a name="s3-select-sql-reference-case"></a>

The `CASE` expression is a conditional expression, similar to `if/then/else` statements found in other languages. `CASE` is used to specify a result when there are multiple conditions. There are two types of `CASE` expressions: simple and searched.

In simple `CASE` expressions, an expression is compared with a value. When a match is found, the specified action in the `THEN` clause is applied. If no match is found, the action in the `ELSE` clause is applied.

In searched `CASE` expressions, each `CASE` is evaluated based on a Boolean expression, and the `CASE` statement returns the first matching `CASE`. If no matching `CASE` is found among the `WHEN` clauses, the action in the `ELSE` clause is returned.

### Syntax
<a name="s3-select-sql-reference-case-syntax"></a>

**Note**  
Currently, Amazon S3 Select doesn't support `ORDER BY` or queries that contain new lines. Make sure that you use queries with no line breaks.

The following is a simple `CASE` statement that's used to match conditions:

```
CASE expression WHEN value THEN result [WHEN...] [ELSE result] END					
```

The following is a searched `CASE` statement that's used to evaluate each condition:

```
CASE WHEN boolean condition THEN result [WHEN ...] [ELSE result] END					
```

### Examples
<a name="s3-select-sql-reference-case-examples"></a>

**Note**  
If you use the Amazon S3 console to run the following examples and your CSV file contains a header row, choose **Exclude the first line of CSV data**. 

**Example 1:** Use a simple `CASE` expression to replace `New York City` with `Big Apple` in a query. Replace all other city names with `other`.

```
SELECT venuecity, CASE venuecity WHEN 'New York City' THEN 'Big Apple' ELSE 'other' END FROM S3Object;
```

Query result: 

```
venuecity        |   case
-----------------+-----------
Los Angeles      | other
New York City    | Big Apple
San Francisco    | other
Baltimore        | other
...
```

**Example 2:** Use a searched `CASE` expression to assign group numbers based on the `pricepaid` value for individual ticket sales:

```
SELECT pricepaid, CASE WHEN CAST(pricepaid as FLOAT) < 10000 THEN 'group 1' WHEN CAST(pricepaid as FLOAT) > 10000 THEN 'group 2' ELSE 'group 3' END FROM S3Object;					
```

Query result: 

```
pricepaid |  case
-----------+---------
12624.00 | group 2
10000.00 | group 3
10000.00 | group 3
9996.00 | group 1
9988.00 | group 1
...
```

## COALESCE
<a name="s3-select-sql-reference-coalesce"></a>

`COALESCE` evaluates the arguments in order and returns the first non-unknown value, that is, the first non-null or non-missing value. This function does not propagate null and missing values.

### Syntax
<a name="s3-select-sql-reference-coalesce-syntax"></a>

```
COALESCE ( expression, expression, ... )
```

### Parameters
<a name="s3-select-sql-reference-coalesce-parameters"></a>

 *`expression`*   
The target expression that the function operates on.

### Examples
<a name="s3-select-sql-reference-coalesce-examples"></a>

```
COALESCE(1)                -- 1
COALESCE(null)             -- null
COALESCE(null, null)       -- null
COALESCE(missing)          -- null
COALESCE(missing, missing) -- null
COALESCE(1, null)          -- 1
COALESCE(null, null, 1)    -- 1
COALESCE(null, 'string')   -- 'string'
COALESCE(missing, 1)       -- 1
```

## NULLIF
<a name="s3-select-sql-reference-nullif"></a>

Given two expressions, `NULLIF` returns `NULL` if the two expressions evaluate to the same value; otherwise, `NULLIF` returns the result of evaluating the first expression.

### Syntax
<a name="s3-select-sql-reference-nullif-syntax"></a>

```
NULLIF ( expression1, expression2 )
```

### Parameters
<a name="s3-select-sql-reference-nullif-parameters"></a>

 `expression1, expression2`   
The target expressions that the function operates on.

### Examples
<a name="s3-select-sql-reference-nullif-examples"></a>

```
NULLIF(1, 1)             -- null
NULLIF(1, 2)             -- 1
NULLIF(1.0, 1)           -- null
NULLIF(1, '1')           -- 1
NULLIF([1], [1])         -- null
NULLIF(1, NULL)          -- 1
NULLIF(NULL, 1)          -- null
NULLIF(null, null)       -- null
NULLIF(missing, null)    -- null
NULLIF(missing, missing) -- null
```

# Conversion functions
<a name="s3-select-sql-reference-conversion"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following conversion function.

**Topics**
+ [

## CAST
](#s3-select-sql-reference-cast)

## CAST
<a name="s3-select-sql-reference-cast"></a>

The `CAST` function converts an entity, such as an expression that evaluates to a single value, from one type to another. 

### Syntax
<a name="s3-select-sql-reference-cast-syntax"></a>

```
CAST ( expression AS data_type )
```

### Parameters
<a name="s3-select-sql-reference-cast-parameters"></a>

 *`expression`*   
A combination of one or more values, operators, and SQL functions that evaluate to a value.

 *`data_type`*   
The target data type, such as `INT`, to cast the expression to. For a list of supported data types, see [Data types](s3-select-sql-reference-data-types.md).

### Examples
<a name="s3-select-sql-reference-cast-examples"></a>

```
CAST('2007-04-05T14:30Z' AS TIMESTAMP)
CAST(0.456 AS FLOAT)
```

# Date functions
<a name="s3-select-sql-reference-date"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following date functions.

**Topics**
+ [

## DATE\$1ADD
](#s3-select-sql-reference-date-add)
+ [

## DATE\$1DIFF
](#s3-select-sql-reference-date-diff)
+ [

## EXTRACT
](#s3-select-sql-reference-extract)
+ [

## TO\$1STRING
](#s3-select-sql-reference-to-string)
+ [

## TO\$1TIMESTAMP
](#s3-select-sql-reference-to-timestamp)
+ [

## UTCNOW
](#s3-select-sql-reference-utcnow)

## DATE\$1ADD
<a name="s3-select-sql-reference-date-add"></a>

Given a date part, a quantity, and a timestamp, `DATE_ADD` returns an updated timestamp by altering the date part by the quantity.

### Syntax
<a name="s3-select-sql-reference-date-add-syntax"></a>

```
DATE_ADD( date_part, quantity, timestamp )
```

### Parameters
<a name="s3-select-sql-reference-date-add-parameters"></a>

*`date_part`*   
Specifies which part of the date to modify. This can be one of the following:  
+ year
+ month
+ day
+ hour
+ minute
+ second

 *`quantity`*   
The value to apply to the updated timestamp. Positive values for `quantity` add to the timestamp's date\$1part, and negative values subtract.

 *`timestamp`*   
The target timestamp that the function operates on.

### Examples
<a name="s3-select-sql-reference-date-add-examples"></a>

```
DATE_ADD(year, 5, `2010-01-01T`)                -- 2015-01-01 (equivalent to 2015-01-01T)
DATE_ADD(month, 1, `2010T`)                     -- 2010-02T (result will add precision as necessary)
DATE_ADD(month, 13, `2010T`)                    -- 2011-02T
DATE_ADD(day, -1, `2017-01-10T`)                -- 2017-01-09 (equivalent to 2017-01-09T)
DATE_ADD(hour, 1, `2017T`)                      -- 2017-01-01T01:00-00:00
DATE_ADD(hour, 1, `2017-01-02T03:04Z`)          -- 2017-01-02T04:04Z
DATE_ADD(minute, 1, `2017-01-02T03:04:05.006Z`) -- 2017-01-02T03:05:05.006Z
DATE_ADD(second, 1, `2017-01-02T03:04:05.006Z`) -- 2017-01-02T03:04:06.006Z
```

## DATE\$1DIFF
<a name="s3-select-sql-reference-date-diff"></a>

Given a date part and two valid timestamps, `DATE_DIFF` returns the difference in date parts. The return value is a negative integer when the `date_part` value of `timestamp1` is greater than the `date_part` value of `timestamp2`. The return value is a positive integer when the `date_part` value of `timestamp1` is less than the `date_part` value of `timestamp2`.

### Syntax
<a name="s3-select-sql-reference-date-diff-syntax"></a>

```
DATE_DIFF( date_part, timestamp1, timestamp2 )
```

### Parameters
<a name="s3-select-sql-reference-date-diff-parameters"></a>

 **`date_part`**   
Specifies which part of the timestamps to compare. For the definition of `date_part`, see [DATE\$1ADD](#s3-select-sql-reference-date-add).

 **`timestamp1`**   
The first timestamp to compare.

 **`timestamp2`**   
The second timestamp to compare.

### Examples
<a name="s3-select-sql-reference-date-diff-examples"></a>

```
DATE_DIFF(year, `2010-01-01T`, `2011-01-01T`)            -- 1
DATE_DIFF(year, `2010T`, `2010-05T`)                     -- 4 (2010T is equivalent to 2010-01-01T00:00:00.000Z)
DATE_DIFF(month, `2010T`, `2011T`)                       -- 12
DATE_DIFF(month, `2011T`, `2010T`)                       -- -12
DATE_DIFF(day, `2010-01-01T23:00`, `2010-01-02T01:00`) -- 0 (need to be at least 24h apart to be 1 day apart)
```

## EXTRACT
<a name="s3-select-sql-reference-extract"></a>

Given a date part and a timestamp, `EXTRACT` returns the timestamp's date part value.

### Syntax
<a name="s3-select-sql-reference-extract-syntax"></a>

```
EXTRACT( date_part FROM timestamp )
```

### Parameters
<a name="s3-select-sql-reference-extract-parameters"></a>

 **`date_part`**   
Specifies which part of the timestamps to extract. This can be one of the following:  
+ `YEAR`
+ `MONTH`
+ `DAY`
+ `HOUR`
+ `MINUTE`
+ `SECOND`
+ `TIMEZONE_HOUR`
+ `TIMEZONE_MINUTE`

 **`timestamp`**   
The target timestamp that the function operates on.

### Examples
<a name="s3-select-sql-reference-extract-examples"></a>

```
EXTRACT(YEAR FROM `2010-01-01T`)                           -- 2010
EXTRACT(MONTH FROM `2010T`)                                -- 1 (equivalent to 2010-01-01T00:00:00.000Z)
EXTRACT(MONTH FROM `2010-10T`)                             -- 10
EXTRACT(HOUR FROM `2017-01-02T03:04:05+07:08`)             -- 3
EXTRACT(MINUTE FROM `2017-01-02T03:04:05+07:08`)           -- 4
EXTRACT(TIMEZONE_HOUR FROM `2017-01-02T03:04:05+07:08`)    -- 7
EXTRACT(TIMEZONE_MINUTE FROM `2017-01-02T03:04:05+07:08`)  -- 8
```

## TO\$1STRING
<a name="s3-select-sql-reference-to-string"></a>

Given a timestamp and a format pattern, `TO_STRING` returns a string representation of the timestamp in the given format.

### Syntax
<a name="s3-select-sql-reference-size-syntax"></a>

```
TO_STRING ( timestamp time_format_pattern )
```

### Parameters
<a name="s3-select-sql-reference-size-parameters"></a>

 *`timestamp`*   
The target timestamp that the function operates on.

 *`time_format_pattern`*   
A string that has the following special character interpretations:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-select-sql-reference-date.html)

### Examples
<a name="s3-select-sql-reference-size-examples"></a>

```
TO_STRING(`1969-07-20T20:18Z`,  'MMMM d, y')                    -- "July 20, 1969"
TO_STRING(`1969-07-20T20:18Z`, 'MMM d, yyyy')                   -- "Jul 20, 1969"
TO_STRING(`1969-07-20T20:18Z`, 'M-d-yy')                        -- "7-20-69"
TO_STRING(`1969-07-20T20:18Z`, 'MM-d-y')                        -- "07-20-1969"
TO_STRING(`1969-07-20T20:18Z`, 'MMMM d, y h:m a')               -- "July 20, 1969 8:18 PM"
TO_STRING(`1969-07-20T20:18Z`, 'y-MM-dd''T''H:m:ssX')           -- "1969-07-20T20:18:00Z"
TO_STRING(`1969-07-20T20:18+08:00Z`, 'y-MM-dd''T''H:m:ssX')     -- "1969-07-20T20:18:00Z"
TO_STRING(`1969-07-20T20:18+08:00`, 'y-MM-dd''T''H:m:ssXXXX')   -- "1969-07-20T20:18:00+0800"
TO_STRING(`1969-07-20T20:18+08:00`, 'y-MM-dd''T''H:m:ssXXXXX')  -- "1969-07-20T20:18:00+08:00"
```

## TO\$1TIMESTAMP
<a name="s3-select-sql-reference-to-timestamp"></a>

Given a string, `TO_TIMESTAMP` converts it to a timestamp. `TO_TIMESTAMP` is the inverse operation of `TO_STRING`.

### Syntax
<a name="s3-select-sql-reference-to-timestamp-syntax"></a>

```
TO_TIMESTAMP ( string )
```

### Parameters
<a name="s3-select-sql-reference-to-timestamp-parameters"></a>

 *`string`*   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-to-timestamp-examples"></a>

```
TO_TIMESTAMP('2007T')                         -- `2007T`
TO_TIMESTAMP('2007-02-23T12:14:33.079-08:00') -- `2007-02-23T12:14:33.079-08:00`
```

## UTCNOW
<a name="s3-select-sql-reference-utcnow"></a>

`UTCNOW` returns the current time in UTC as a timestamp.

### Syntax
<a name="s3-select-sql-reference-utcnow-syntax"></a>

```
UTCNOW()
```

### Parameters
<a name="s3-select-sql-reference-utcnow-parameters"></a>

`UTCNOW` takes no parameters.

### Examples
<a name="s3-select-sql-reference-utcnow-examples"></a>

```
UTCNOW() -- 2017-10-13T16:02:11.123Z
```

# String functions
<a name="s3-select-sql-reference-string"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following string functions.

**Topics**
+ [

## CHAR\$1LENGTH, CHARACTER\$1LENGTH
](#s3-select-sql-reference-char-length)
+ [

## LOWER
](#s3-select-sql-reference-lower)
+ [

## SUBSTRING
](#s3-select-sql-reference-substring)
+ [

## TRIM
](#s3-select-sql-reference-trim)
+ [

## UPPER
](#s3-select-sql-reference-upper)

## CHAR\$1LENGTH, CHARACTER\$1LENGTH
<a name="s3-select-sql-reference-char-length"></a>

`CHAR_LENGTH` (or `CHARACTER_LENGTH`) counts the number of characters in the specified string.

**Note**  
`CHAR_LENGTH` and `CHARACTER_LENGTH` are synonyms.

### Syntax
<a name="s3-select-sql-reference-char-length-syntax"></a>

```
CHAR_LENGTH ( string )
```

### Parameters
<a name="s3-select-sql-reference-char-length-parameters"></a>

 *`string`*   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-char-length-examples"></a>

```
CHAR_LENGTH('')          -- 0
CHAR_LENGTH('abcdefg')   -- 7
```

## LOWER
<a name="s3-select-sql-reference-lower"></a>

Given a string, `LOWER` converts all uppercase characters to lowercase characters. Any non-uppercased characters remain unchanged.

### Syntax
<a name="s3-select-sql-reference-lower-syntax"></a>

```
LOWER ( string )
```

### Parameters
<a name="s3-select-sql-reference-lower-parameters"></a>

 **`string`**   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-lower-examples"></a>

```
LOWER('AbCdEfG!@#$') -- 'abcdefg!@#$'
```

## SUBSTRING
<a name="s3-select-sql-reference-substring"></a>

Given a string, a start index, and optionally a length, `SUBSTRING` returns the substring from the start index up to the end of the string, or up to the length provided.

**Note**  
The first character of the input string has an index position of 1.  
 If `start` is < 1, with no length specified, then the index position is set to 1. 
 If `start` is < 1, with a length specified, then the index position is set to `start + length -1`. 
 If `start + length -1` < 0, then an empty string is returned. 
 If `start + length -1` > = 0, then the substring starting at index position 1 with the length `start + length - 1` is returned. 

### Syntax
<a name="s3-select-sql-reference-substring-syntax"></a>

```
SUBSTRING( string FROM start [ FOR length ] )
```

### Parameters
<a name="s3-select-sql-reference-substring-parameters"></a>

 **`string`**   
The target string that the function operates on.

 **`start`**   
The start position of the string.

 **`length`**   
The length of the substring to return. If not present, proceed to the end of the string.

### Examples
<a name="s3-select-sql-reference-substring-examples"></a>

```
SUBSTRING("123456789", 0)      -- "123456789"
SUBSTRING("123456789", 1)      -- "123456789"
SUBSTRING("123456789", 2)      -- "23456789"
SUBSTRING("123456789", -4)     -- "123456789"
SUBSTRING("123456789", 0, 999) -- "123456789" 
SUBSTRING("123456789", 1, 5)   -- "12345"
```

## TRIM
<a name="s3-select-sql-reference-trim"></a>

Trims leading or trailing characters from a string. The default character to remove is a space (`' '`).

### Syntax
<a name="s3-select-sql-reference-trim-syntax"></a>

```
TRIM ( [[LEADING | TRAILING | BOTH remove_chars] FROM] string )
```

### Parameters
<a name="s3-select-sql-reference-trim-parameters"></a>

 **`string`**   
The target string that the function operates on.

 `LEADING` \$1 `TRAILING` \$1 `BOTH`   
This parameter indicates whether to trim leading or trailing characters, or both leading and trailing characters.

 **`remove_chars`**   
The set of characters to remove. `remove_chars` can be a string with a length > 1. This function returns the string with any character from `remove_chars` found at the beginning or end of the string that was removed.

### Examples
<a name="s3-select-sql-reference-trim-examples"></a>

```
TRIM('       foobar         ')               -- 'foobar'
TRIM('      \tfoobar\t         ')            -- '\tfoobar\t'
TRIM(LEADING FROM '       foobar         ')  -- 'foobar         '
TRIM(TRAILING FROM '       foobar         ') -- '       foobar'
TRIM(BOTH FROM '       foobar         ')     -- 'foobar'
TRIM(BOTH '12' FROM '1112211foobar22211122') -- 'foobar'
```

## UPPER
<a name="s3-select-sql-reference-upper"></a>

Given a string, `UPPER` converts all lowercase characters to uppercase characters. Any non-lowercased characters remain unchanged.

### Syntax
<a name="s3-select-sql-reference-upper-syntax"></a>

```
UPPER ( string )
```

### Parameters
<a name="s3-select-sql-reference-upper-parameters"></a>

 **`string`**   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-upper-examples"></a>

```
UPPER('AbCdEfG!@#$') -- 'ABCDEFG!@#$'
```