

# Querying data in place with Amazon S3 Select
<a name="selecting-content-from-objects"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

With Amazon S3 Select, you can use structured query language (SQL) statements to filter the contents of an Amazon S3 object and retrieve only the subset of data that you need. By using Amazon S3 Select to filter this data, you can reduce the amount of data that Amazon S3 transfers, which reduces the cost and latency to retrieve this data.

Amazon S3 Select queries one object at a time. It works on objects stored in CSV, JSON, or Apache Parquet format, on objects compressed with GZIP or BZIP2 (for CSV and JSON objects only), and on server-side encrypted objects. You can specify the format of the results as either CSV or JSON, and you can determine how the records in the result are delimited.

You pass SQL expressions to Amazon S3 in the request. Amazon S3 Select supports a subset of SQL. For more information about the SQL elements that are supported by Amazon S3 Select, see [SQL reference for Amazon S3 Select](s3-select-sql-reference.md).

You can perform SQL queries by using the Amazon S3 console, the AWS Command Line Interface (AWS CLI), the `SelectObjectContent` REST API operation, or the AWS SDKs. 

**Note**  
The Amazon S3 console limits the amount of data returned to 40 MB. To retrieve more data, use the AWS CLI or the API.

## Requirements and limits
<a name="selecting-content-from-objects-requirements-and-limits"></a>

The following are requirements for using Amazon S3 Select:
+ You must have `s3:GetObject` permission for the object you are querying.
+ If the object you are querying is encrypted with server-side encryption with customer-provided keys (SSE-C), you must use `https`, and you must provide the encryption key in the request.

The following limits apply when using Amazon S3 Select:
+ S3 Select can query only one object per request.
+ S3 Select supports querying files up to 5 TB in size.
+ The maximum length of a SQL expression is 256 KB.
+ The maximum length of a record in the input or result is 1 MB.
+ Amazon S3 Select can only emit nested data by using the JSON output format.
+ You cannot query an object stored in the S3 Glacier Flexible Retrieval, S3 Glacier Deep Archive, or Reduced Redundancy Storage (RRS) storage classes. You also cannot query an object stored in the S3 Intelligent-Tiering Archive Access tier or the S3 Intelligent-Tiering Deep Archive Access tier. For more information about storage classes, see [Understanding and managing Amazon S3 storage classes](storage-class-intro.md).

Additional limitations apply when using Amazon S3 Select with a Parquet object:
+ Amazon S3 Select supports only columnar compression using GZIP or Snappy. Amazon S3 Select doesn't support whole-object compression for a Parquet object.
+ Amazon S3 Select doesn't support Parquet output. You must specify the output format as CSV or JSON.
+ The maximum uncompressed row group size is 512 MB.
+ You must use the data types that are specified in the object's schema.
+ Selecting on a repeated field returns only the last value.

## Constructing a request
<a name="selecting-content-from-objects-contructing-request"></a>

When you construct a request, you provide details of the object that is being queried by using an `InputSerialization` object. You provide details of how the results are to be returned by using an `OutputSerialization` object. You also include the SQL expression that Amazon S3 uses to filter the request.

For more information about constructing an Amazon S3 Select request, see [https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectSELECTContent.html](https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectSELECTContent.html) in the *Amazon Simple Storage Service API Reference*. You can also see one of the SDK code examples in the following sections.
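
As a sketch of how these pieces fit together, the following Python snippet assembles `InputSerialization`, `OutputSerialization`, and the SQL expression for a query that reads CSV and returns JSON lines. The dictionary keys follow the boto3 `select_object_content` parameter names; the bucket and key names are hypothetical.

```python
def build_select_params(bucket, key, expression):
    """Assemble SelectObjectContent parameters for a CSV object
    with a header row, returning results as JSON lines."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Describes the object that is being queried.
        "InputSerialization": {
            "CSV": {"FileHeaderInfo": "USE"},
            "CompressionType": "NONE",
        },
        # Describes how the results are returned.
        "OutputSerialization": {
            "JSON": {"RecordDelimiter": "\n"},
        },
    }

params = build_select_params(
    "amzn-s3-demo-bucket",          # hypothetical bucket
    "data/records.csv",             # hypothetical key
    "SELECT s.name FROM S3Object s",
)
# With boto3, these parameters would be passed as:
#   s3 = boto3.client("s3")
#   response = s3.select_object_content(**params)
```

Setting `FileHeaderInfo` to `USE` makes the CSV header row available as column names in the `SELECT` list and `WHERE` clause.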

### Requests using scan ranges
<a name="selecting-content-from-objects-using-byte-range"></a>

With Amazon S3 Select, you can scan a subset of an object by specifying a range of bytes to query. This capability lets you parallelize scanning the whole object by splitting the work into separate Amazon S3 Select requests for a series of non-overlapping scan ranges.

 Scan ranges don't need to be aligned with record boundaries. An Amazon S3 Select scan range request runs across the byte range that you specify. A record that starts within the specified scan range but extends beyond that scan range will be processed by the query. For example, the following shows an Amazon S3 object that contains a series of records in a line-delimited CSV format:

```
A,B
C,D
D,E
E,F
G,H
I,J
```

Suppose that you use the Amazon S3 Select `ScanRange` parameter with *Start* at byte 1 and *End* at byte 4. The scan range starts at "`,`" and scans until the end of the record that starts at `C`. The request returns the result `C,D` because that record starts within the scan range.

 Amazon S3 Select scan range requests support Parquet, CSV (without quoted delimiters), or JSON objects (in `LINES` mode only). CSV and JSON objects must be uncompressed. For line-based CSV and JSON objects, when a scan range is specified as part of the Amazon S3 Select request, all records that start within the scan range are processed. For Parquet objects, all of the row groups that start within the scan range requested are processed. 

Amazon S3 Select scan range requests are available to use with the AWS CLI, Amazon S3 API, and AWS SDKs. You can use the `ScanRange` parameter in the Amazon S3 Select request for this feature. For more information, see [https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html](https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html) in the *Amazon Simple Storage Service API Reference*.
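
The parallelization described above can be sketched as follows. This illustrative Python helper (not part of any AWS SDK) splits an object into non-overlapping `ScanRange` values, one per request; consult the `SelectObjectContent` API reference for the exact `Start`/`End` semantics.

```python
def scan_ranges(object_size, chunk_size):
    """Split an object of object_size bytes into non-overlapping
    ScanRange dicts of at most chunk_size bytes each. Ranges don't
    need to align with record boundaries: a record that starts inside
    a range but extends beyond it is still processed by that range's
    query, so each record is handled by exactly one request."""
    ranges = []
    start = 0
    while start < object_size:
        end = min(start + chunk_size, object_size)
        ranges.append({"Start": start, "End": end})
        start = end
    return ranges

# Each range would be passed as ScanRange in its own request, e.g.:
#   s3.select_object_content(..., ScanRange=ranges[i])
ranges = scan_ranges(1_000_000, 250_000)
```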

## Errors
<a name="selecting-content-from-objects-errors"></a>

Amazon S3 Select returns an error code and associated error message when an issue is encountered while attempting to run a query. For a list of error codes and descriptions, see the [List of SELECT Object Content Error Codes](https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html#SelectObjectContentErrorCodeList) section of the *Error Responses* page in the *Amazon Simple Storage Service API Reference*.

For more information about Amazon S3 Select, see the following topics.

**Topics**
+ [Requirements and limits](#selecting-content-from-objects-requirements-and-limits)
+ [Constructing a request](#selecting-content-from-objects-contructing-request)
+ [Errors](#selecting-content-from-objects-errors)
+ [Examples of using Amazon S3 Select on an object](using-select.md)
+ [SQL reference for Amazon S3 Select](s3-select-sql-reference.md)

# Examples of using Amazon S3 Select on an object
<a name="using-select"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

You can use S3 Select to select content from one object by using the Amazon S3 console, the REST API, and the AWS SDKs. 

For more information about supported SQL functions for S3 Select, see [SQL functions](s3-select-sql-reference-sql-functions.md).

## Using the S3 console
<a name="s3-select-objects-console"></a>

**To select content from an object in the Amazon S3 console**

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. In the left navigation pane, choose **Buckets**.

1. Choose the bucket that contains the object that you want to select content from, and then choose the name of the object.

1. Choose **Object actions**, and choose **Query with S3 Select**.

1. Configure **Input settings**, based on the format of your input data.

1. Configure **Output settings**, based on the format of the output that you want to receive.

1. To extract records from the chosen object, under **SQL query**, enter the SELECT SQL commands. For more information on how to write SQL commands, see [SQL reference for Amazon S3 Select](s3-select-sql-reference.md).

1. After you enter SQL queries, choose **Run SQL query**. Then, under **Query results**, you can see the results of your SQL queries.

## Using the REST API
<a name="SelectObjectContentUsingRestApi"></a>

You can use the AWS SDKs to select content from an object. However, if your application requires it, you can send REST requests directly. For more information about the request and response format, see [https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectSELECTContent.html](https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectSELECTContent.html).

## Using the AWS SDKs
<a name="SelectObjectContentUsingSDK"></a>

You can use Amazon S3 Select to select some of the content of an object by using the `selectObjectContent` method. If this method is successful, it returns the results of the SQL expression.

------
#### [ Java ]

To use Amazon S3 Select with the AWS SDK for Java, you can return the value of the first column for each record that is stored in an object that contains data stored in CSV format. You can also request `Progress` and `Stats` messages to be returned. You must provide a valid bucket name and an object that contains data in CSV format.

For examples of how to use Amazon S3 Select with the AWS SDK for Java, see [Select content from an object](https://docs.aws.amazon.com/AmazonS3/latest/API/s3_example_s3_SelectObjectContent_section.html) in the *Amazon S3 API Reference*.

------
#### [ JavaScript ]

For a JavaScript example that uses the AWS SDK for JavaScript with the S3 `SelectObjectContent` API operation to select records from JSON and CSV files that are stored in Amazon S3, see the blog post [ Introducing support for Amazon S3 Select in the AWS SDK for JavaScript](https://aws.amazon.com/blogs/developer/introducing-support-for-amazon-s3-select-in-the-aws-sdk-for-javascript/). 

------
#### [ Python ]

For a Python example of using SQL queries to search through data that was loaded to Amazon S3 as a comma-separated value (CSV) file by using S3 Select, see the blog post [Querying data without servers or databases using Amazon S3 Select](https://aws.amazon.com/blogs/storage/querying-data-without-servers-or-databases-using-amazon-s3-select/). 
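
Whichever SDK you use, the response's `Payload` is an event stream in which `Records` events carry the result bytes and an `End` event signals that the query completed. The following minimal Python sketch drains such a stream; the sample events are fabricated for illustration and mirror the shape of a boto3 `select_object_content` response.

```python
def collect_records(event_stream):
    """Concatenate the payload bytes from the Records events of a
    SelectObjectContent response stream and decode them as UTF-8."""
    chunks = []
    for event in event_stream:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
        elif "End" in event:
            break  # the query finished successfully
    return b"".join(chunks).decode("utf-8")

# Fabricated events standing in for response["Payload"]:
fake_stream = [
    {"Records": {"Payload": b'{"name":"Susan Smith"}\n'}},
    {"Stats": {"Details": {"BytesScanned": 100}}},
    {"End": {}},
]
print(collect_records(fake_stream))  # prints the result line
```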

------

# SQL reference for Amazon S3 Select
<a name="s3-select-sql-reference"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

This reference contains a description of the structured query language (SQL) elements that are supported by Amazon S3 Select.

**Topics**
+ [SELECT command](s3-select-sql-reference-select.md)
+ [Data types](s3-select-sql-reference-data-types.md)
+ [Operators](s3-select-sql-reference-operators.md)
+ [Reserved keywords](s3-select-sql-reference-keyword-list.md)
+ [SQL functions](s3-select-sql-reference-sql-functions.md)

# SELECT command
<a name="s3-select-sql-reference-select"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports only the `SELECT` SQL command. The following ANSI standard clauses are supported for `SELECT`: 


+ `SELECT` list
+ `FROM` clause 
+ `WHERE` clause
+ `LIMIT` clause

**Note**  
Amazon S3 Select queries currently do not support subqueries or joins.

## SELECT list
<a name="s3-select-sql-reference-select-list"></a>

The `SELECT` list names the columns, functions, and expressions that you want the query to return. The list represents the output of the query. 

```
SELECT *
SELECT projection1 AS column_alias_1, projection2 AS column_alias_2
```

The first form of `SELECT` with the `*` (asterisk) returns every row that passed the `WHERE` clause, as-is. The second form of `SELECT` creates a row with user-defined output scalar expressions **`projection1`** and **`projection2`** for each column.

## FROM clause
<a name="s3-select-sql-reference-from"></a>

Amazon S3 Select supports the following forms of the `FROM` clause:

```
FROM table_name
FROM table_name alias
FROM table_name AS alias
```

In each form of the `FROM` clause, `table_name` is the `S3Object` that's being queried. Users coming from traditional relational databases can think of this as a database schema that contains multiple views over a table.

Following standard SQL, the `FROM` clause creates rows that are filtered in the `WHERE` clause and projected in the `SELECT` list. 

For JSON objects that are stored in Amazon S3 Select, you can also use the following forms of the `FROM` clause:

```
FROM S3Object[*].path
FROM S3Object[*].path alias
FROM S3Object[*].path AS alias
```

Using this form of the `FROM` clause, you can select from arrays or objects within a JSON object. You can specify `path` by using one of the following forms:
+ By name (in an object): `.name` or `['name']`
+ By index (in an array): `[index]`
+ By wildcard character (in an object): `.*`
+ By wildcard character (in an array): `[*]`

**Note**  
This form of the `FROM` clause works only with JSON objects.
Wildcard characters always emit at least one record. If no record matches, then Amazon S3 Select emits the value `MISSING`. During output serialization (after the query finishes running), Amazon S3 Select replaces `MISSING` values with empty records.
Aggregate functions (`AVG`, `COUNT`, `MAX`, `MIN`, and `SUM`) skip `MISSING` values.
If you don't provide an alias when using a wildcard character, you can refer to the row by using the last element in the path. For example, you could select all prices from a list of books by using the query `SELECT price FROM S3Object[*].books[*].price`. If the path ends in a wildcard character instead of a name, then you can use the value `_1` to refer to the row. For example, instead of `SELECT price FROM S3Object[*].books[*].price`, you could use the query `SELECT _1.price FROM S3Object[*].books[*]`.
Amazon S3 Select always treats a JSON document as an array of root-level values. Thus, even if the JSON object that you are querying has only one root element, the `FROM` clause must begin with `S3Object[*]`. However, for compatibility reasons, Amazon S3 Select allows you to omit the wildcard character if you don't include a path. Thus, the complete clause `FROM S3Object` is equivalent to `FROM S3Object[*] as S3Object`. If you include a path, you must also use the wildcard character. So, `FROM S3Object` and `FROM S3Object[*].path` are both valid clauses, but `FROM S3Object.path` is not.

**Example**  
*Example 1*  
This example shows results when using the following dataset and query:  

```
{ "Rules": [ {"id": "1"}, {"expr": "y > x"}, {"id": "2", "expr": "z = DEBUG"} ]}
{ "created": "June 27", "modified": "July 6" }
```

```
SELECT id FROM S3Object[*].Rules[*].id
```

```
{"id":"1"}
{}
{"id":"2"}
{}
```
Amazon S3 Select produces each result for the following reasons:  
+ `{"id":"1"}` – `S3Object[0].Rules[0].id` produced a match.
+ `{}` – `S3Object[0].Rules[1].id` did not match a record, so Amazon S3 Select emitted `MISSING`, which was then changed to an empty record during output serialization and returned.
+ `{"id":"2"}` – `S3Object[0].Rules[2].id` produced a match.
+ `{}` – `S3Object[1]` did not match on `Rules`, so Amazon S3 Select emitted `MISSING`, which was then changed to an empty record during output serialization and returned.
If you don't want Amazon S3 Select to return empty records when it doesn't find a match, you can test for the value `MISSING`. The following query returns the same results as the previous query, but with the empty values omitted:  

```
SELECT id FROM S3Object[*].Rules[*].id WHERE id IS NOT MISSING
```

```
{"id":"1"}
{"id":"2"}
```
*Example 2*  
This example shows results when using the following dataset and queries:  

```
{ "created": "936864000", "dir_name": "important_docs", "files": [ { "name": "." }, { "name": ".." }, { "name": ".aws" }, { "name": "downloads" } ], "owner": "Amazon S3" }
{ "created": "936864000", "dir_name": "other_docs", "files": [ { "name": "." }, { "name": ".." }, { "name": "my stuff" }, { "name": "backup" } ], "owner": "User" }
```

```
SELECT d.dir_name, d.files FROM S3Object[*] d
```

```
{"dir_name":"important_docs","files":[{"name":"."},{"name":".."},{"name":".aws"},{"name":"downloads"}]}
{"dir_name":"other_docs","files":[{"name":"."},{"name":".."},{"name":"my stuff"},{"name":"backup"}]}
```

```
SELECT _1.dir_name, _1.owner FROM S3Object[*]
```

```
{"dir_name":"important_docs","owner":"Amazon S3"}
{"dir_name":"other_docs","owner":"User"}
```

## WHERE clause
<a name="s3-select-sql-reference-where"></a>

The `WHERE` clause follows this syntax: 

```
WHERE condition
```

The `WHERE` clause filters rows based on the `condition`. A condition is an expression that has a Boolean result. Only rows for which the condition evaluates to `TRUE` are returned in the result.

## LIMIT clause
<a name="s3-select-sql-reference-limit"></a>

The `LIMIT` clause follows this syntax: 

```
LIMIT number
```

The `LIMIT` clause limits the number of records that you want the query to return based on `number`.

## Attribute access
<a name="s3-select-sql-reference-attribute-access"></a>

The `SELECT` and `WHERE` clauses can refer to record data by using one of the methods in the following sections, depending on whether the file that is being queried is in CSV or JSON format.

### CSV
<a name="s3-select-sql-reference-attribute-access-csv"></a>
+ **Column Numbers** – You can refer to the *Nth* column of a row with the column name `_N`, where *`N`* is the column position. The position count starts at 1. For example, the first column is named `_1` and the second column is named `_2`.

  You can refer to a column as `_N` or `alias._N`. For example, `_2` and `myAlias._2` are both valid ways to refer to a column in the `SELECT` list and `WHERE` clause.
+ **Column Headers** – For objects in CSV format that have a header row, the headers are available to the `SELECT` list and `WHERE` clause. In particular, as in traditional SQL, within `SELECT` and `WHERE` clause expressions, you can refer to the columns by `alias.column_name` or `column_name`.

### JSON
<a name="s3-select-sql-reference-attribute-access-json"></a>
+ **Document** – You can access JSON document fields as `alias.name`. You can also access nested fields, for example, `alias.name1.name2.name3`.
+ **List** – You can access elements in a JSON list by using zero-based indexes with the `[]` operator. For example, you can access the second element of a list as `alias[1]`. You can combine accessing list elements with fields, for example, `alias.name1.name2[1].name3`.
+ **Examples:** Consider this JSON object as a sample dataset:

  ```
  {"name": "Susan Smith",
  "org": "engineering",
  "projects":
      [
       {"project_name":"project1", "completed":false},
       {"project_name":"project2", "completed":true}
      ]
  }
  ```

  *Example 1*

  The following query returns these results:

  ```
  Select s.name from S3Object s
  ```

  ```
  {"name":"Susan Smith"}
  ```

  *Example 2*

  The following query returns these results:

  ```
  Select s.projects[0].project_name from S3Object s
  ```

  ```
  {"project_name":"project1"}
  ```

## Case sensitivity of header and attribute names
<a name="s3-select-sql-reference-case-sensitivity"></a>

With Amazon S3 Select, you can use double quotation marks to indicate that column headers (for CSV objects) and attributes (for JSON objects) are case sensitive. Without double quotation marks, object headers and attributes are case insensitive. An error is thrown in cases of ambiguity.

The following examples are either 1) Amazon S3 objects in CSV format with the specified column headers, and with `FileHeaderInfo` set to `"Use"` for the query request; or 2) Amazon S3 objects in JSON format with the specified attributes.

*Example 1:* The object being queried has the header or attribute `NAME`.
+ The following expression successfully returns values from the object. Because there are no quotation marks, the query is case insensitive.

  ```
  SELECT s.name from S3Object s
  ```
+ The following expression results in a 400 error `MissingHeaderName`. Because there are quotation marks, the query is case sensitive. 

  ```
  SELECT s."name" from S3Object s
  ```

*Example 2:* The Amazon S3 object being queried has one header or attribute with `NAME` and another header or attribute with `name`.
+ The following expression results in a 400 error `AmbiguousFieldName`. Because there are no quotation marks, the query is case insensitive, but there are two matches, so the error is thrown.

  ```
  SELECT s.name from S3Object s
  ```
+ The following expression successfully returns values from the object. Because there are quotation marks, the query is case sensitive, so there is no ambiguity.

  ```
  SELECT s."NAME" from S3Object s
  ```

## Using reserved keywords as user-defined terms
<a name="s3-select-sql-reference-using-keywords"></a>

Amazon S3 Select has a set of reserved keywords that are needed to run the SQL expressions used to query object content. Reserved keywords include function names, data types, operators, and so on. In some cases, user-defined terms, such as the column headers (for CSV files) or attributes (for JSON objects), might clash with a reserved keyword. When this happens, you must use double quotation marks to indicate that you are intentionally using a user-defined term that clashes with a reserved keyword. Otherwise a 400 parse error will result.

For the full list of reserved keywords, see [Reserved keywords](s3-select-sql-reference-keyword-list.md).
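
As an illustration of the quoting rule (this helper is not part of any AWS SDK), a query builder might wrap user-defined column headers or attributes in double quotation marks whenever they collide with a reserved keyword. The keyword subset below is a small sample drawn from the full list.

```python
# Small sample of Amazon S3 Select reserved keywords, for illustration.
RESERVED_SUBSET = {"cast", "date", "select", "timestamp", "first"}

def safe_identifier(name):
    """Quote name if it clashes with a reserved keyword.
    The match is case-insensitive, mirroring how unquoted
    identifiers are treated."""
    return f'"{name}"' if name.lower() in RESERVED_SUBSET else name

# Produces the quoted form that parses successfully:
expr = f"SELECT s.{safe_identifier('CAST')} from S3Object s"
print(expr)  # SELECT s."CAST" from S3Object s
```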

The following example is either 1) an Amazon S3 object in CSV format with the specified column headers, with `FileHeaderInfo` set to `"Use"` for the query request, or 2) an Amazon S3 object in JSON format with the specified attributes.

*Example:* The object being queried has a header or attribute named `CAST`, which is a reserved keyword.
+ The following expression successfully returns values from the object. Because quotation marks are used in the query, S3 Select uses the user-defined header or attribute.

  ```
  SELECT s."CAST" from S3Object s
  ```
+ The following expression results in a 400 parse error. Because no quotation marks are used in the query, `CAST` clashes with a reserved keyword.

  ```
  SELECT s.CAST from S3Object s
  ```

## Scalar expressions
<a name="s3-select-sql-reference-scalar"></a>

Within the `WHERE` clause and the `SELECT` list, you can have SQL *scalar expressions*, which are expressions that return scalar values. They have the following form:
+ ***`literal`***

  An SQL literal.
+ ***`column_reference`***

  A reference to a column in the form `column_name` or `alias.column_name`.
+ ***`unary_op expression`***

  In this case, ***`unary_op`*** is an SQL unary operator.
+ ***`expression binary_op expression`***

  In this case, ***`binary_op`*** is an SQL binary operator.
+ ***`func_name`***

  In this case, ***`func_name`*** is the name of the scalar function to invoke.
+ ***`expression`*** `[ NOT ] BETWEEN` ***`expression`*** `AND` ***`expression`***
+ ***`expression`*** `LIKE` ***`expression`*** [ `ESCAPE` ***`expression`*** ]

# Data types
<a name="s3-select-sql-reference-data-types"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports several primitive data types.

## Data type conversions
<a name="s3-select-sql-reference-data-conversion"></a>

The general rule is to follow the `CAST` function if it's defined. If `CAST` is not defined, then all input data is treated as a string. In that case, you must cast your input data into the relevant data types when necessary.

For more information about the `CAST` function, see [CAST](s3-select-sql-reference-conversion.md#s3-select-sql-reference-cast).

## Supported data types
<a name="s3-select-sql-reference-supported-data-types"></a>

Amazon S3 Select supports the following set of primitive data types.


|  Name  |  Description  |  Examples  | 
| --- | --- | --- | 
| `bool` | A Boolean value, either `TRUE` or `FALSE`. | `FALSE` | 
| `int`, `integer` | An 8-byte signed integer in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.  | `100000` | 
| `string` | A UTF8-encoded variable-length string. The default limit is 1 character. The maximum character limit is 2,147,483,647.  | `'xyz'` | 
| `float` | An 8-byte floating point number.  | `CAST(0.456 AS FLOAT)` | 
| `decimal`, `numeric` |  A base-10 number, with a maximum precision of 38 (that is, the maximum number of significant digits), and with a scale within the range of -2³¹ to 2³¹-1 (that is, the base-10 exponent).  Amazon S3 Select ignores scale and precision when you provide both at the same time.   | `123.456` | 
| `timestamp` |  Timestamps represent a specific moment in time, always include a local offset, and are capable of arbitrary precision. In the text format, timestamps follow the [W3C note on date and time formats](https://www.w3.org/TR/NOTE-datetime), but they must end with the literal `T` if the timestamps are not at least whole-day precision. Fractional seconds are allowed, with at least one digit of precision, and an unlimited maximum. Local-time offsets can be represented as either hour:minute offsets from UTC, or as the literal `Z` to denote a local time of UTC. Local-time offsets are required on timestamps with time and are not allowed on date values.  | `CAST('2007-04-05T14:30Z' AS TIMESTAMP)` | 
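
For instance, the literal in the example above, `'2007-04-05T14:30Z'`, follows that format: a date, the literal `T`, a time, and the `Z` offset denoting UTC. The shape can be checked offline with Python's standard library (illustrative only; Amazon S3 Select performs its own parsing):

```python
from datetime import datetime

# Parse the example timestamp literal. The %z directive accepts the
# literal 'Z' (UTC) as well as hour:minute offsets such as '+07:00'.
ts = datetime.strptime("2007-04-05T14:30Z", "%Y-%m-%dT%H:%M%z")
print(ts.isoformat())  # 2007-04-05T14:30:00+00:00
```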

### Supported Parquet types
<a name="s3-select-sql-reference-supported-data-types-parquet"></a>

Amazon S3 Select supports the following Parquet types.
+ `DATE`
+ `DECIMAL`
+ `ENUM`
+ `INT(8)`
+ `INT(16)`
+ `INT(32)`
+ `INT(64)`
+ `LIST`
**Note**  
For `LIST` Parquet type output, Amazon S3 Select supports only JSON format. However, if the query limits the data to simple values, the `LIST` Parquet type can also be queried in CSV format.
+ `STRING`
+ `TIMESTAMP` supported precision (`MILLIS`/`MICROS`/`NANOS`)
**Note**  
Timestamps saved as an `INT(96)` are unsupported.   
Because of the range of the `INT(64)` type, timestamps that are using the `NANOS` unit can represent only values between `1677-09-21 00:12:43` and `2262-04-11 23:47:16`. Values outside of this range cannot be represented with the `NANOS` unit.

### Mapping of Parquet types to supported data types in Amazon S3 Select
<a name="s3-select-sql-reference-supported-data-types-parquet-mapping"></a>


| Parquet types | Supported data types | 
| --- | --- | 
| `DATE` |  `timestamp`  | 
|  `DECIMAL`  |  `decimal`, `numeric`  | 
|  `ENUM`  |  `string`  | 
|  `INT(8)`  |  `int`, `integer`  | 
|  `INT(16)`  | `int`, `integer` | 
| `INT(32)` | `int`, `integer` | 
|  `INT(64)`  |  `decimal`, `numeric`  | 
|  `LIST`  |  Each Parquet type in list is mapped to the corresponding data type.  | 
|  `STRING`  |  `string`  | 
|  `TIMESTAMP`  |  `timestamp`  | 

# Operators
<a name="s3-select-sql-reference-operators"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

Amazon S3 Select supports the following operators.

## Logical operators
<a name="s3-select-sql-reference-loical-ops"></a>
+ `AND`
+ `NOT`
+ `OR`

## Comparison operators
<a name="s3-select-sql-reference-compare-ops"></a>
+ `<` 
+ `>` 
+ `<=`
+ `>=`
+ `=`
+ `<>`
+ `!=`
+ `BETWEEN`
+ `IN` – For example: `IN ('a', 'b', 'c')`

  

## Pattern-matching operators
<a name="s3-select-sql-reference-pattern"></a>
+ `LIKE`
+ `_` (Matches any character)
+ `%` (Matches any sequence of characters)

## Unary operators
<a name="s3-select-sql-reference-unitary-ops"></a>
+ `IS NULL`
+ `IS NOT NULL`

## Math operators
<a name="s3-select-sql-referencemath-ops"></a>

Addition, subtraction, multiplication, division, and modulo are supported, as follows:
+ `+`
+ `-`
+ `*`
+ `/`
+ `%`

## Operator precedence
<a name="s3-select-sql-reference-op-Precedence"></a>

The following table shows the operators' precedence in decreasing order.


|  Operator or element  |  Associativity |  Description  | 
| --- | --- | --- | 
| `-`  | right  | unary minus  | 
| `*`, `/`, `%`  | left  | multiplication, division, modulo  | 
| `+`, `-`  | left  | addition, subtraction  | 
| `IN` |  | set membership  | 
| `BETWEEN` |  | range containment  | 
| `LIKE` |  | string pattern matching  | 
| `<`, `>` |  | less than, greater than  | 
| `=` | right  | equality, assignment | 
| `NOT` | right | logical negation  | 
| `AND` | left | logical conjunction  | 
| `OR` | left | logical disjunction  | 

# Reserved keywords
<a name="s3-select-sql-reference-keyword-list"></a>

**Important**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. [Learn more](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/) 

The following is the list of reserved keywords for Amazon S3 Select. These keywords include function names, data types, operators, and so on, that are needed to run the SQL expressions that are used to query object content.

```
absolute
action
add
all
allocate
alter
and
any
are
as
asc
assertion
at
authorization
avg
bag
begin
between
bit
bit_length
blob
bool
boolean
both
by
cascade
cascaded
case
cast
catalog
char
char_length
character
character_length
check
clob
close
coalesce
collate
collation
column
commit
connect
connection
constraint
constraints
continue
convert
corresponding
count
create
cross
current
current_date
current_time
current_timestamp
current_user
cursor
date
day
deallocate
dec
decimal
declare
default
deferrable
deferred
delete
desc
describe
descriptor
diagnostics
disconnect
distinct
domain
double
drop
else
end
end-exec
escape
except
exception
exec
execute
exists
external
extract
false
fetch
first
float
for
foreign
found
from
full
get
global
go
goto
grant
group
having
hour
identity
immediate
in
indicator
initially
inner
input
insensitive
insert
int
integer
intersect
interval
into
is
isolation
join
key
language
last
leading
left
level
like
limit
list
local
lower
match
max
min
minute
missing
module
month
names
national
natural
nchar
next
no
not
null
nullif
numeric
octet_length
of
on
only
open
option
or
order
outer
output
overlaps
pad
partial
pivot
position
precision
prepare
preserve
primary
prior
privileges
procedure
public
read
real
references
relative
restrict
revoke
right
rollback
rows
schema
scroll
second
section
select
session
session_user
set
sexp
size
smallint
some
space
sql
sqlcode
sqlerror
sqlstate
string
struct
substring
sum
symbol
system_user
table
temporary
then
time
timestamp
timezone_hour
timezone_minute
to
trailing
transaction
translate
translation
trim
true
tuple
union
unique
unknown
unpivot
update
upper
usage
user
using
value
values
varchar
varying
view
when
whenever
where
with
work
write
year
zone
```

# SQL functions
<a name="s3-select-sql-reference-sql-functions"></a>

Amazon S3 Select supports the following SQL functions.

**Topics**
+ [Aggregate functions](s3-select-sql-reference-aggregate.md)
+ [Conditional functions](s3-select-sql-reference-conditional.md)
+ [Conversion functions](s3-select-sql-reference-conversion.md)
+ [Date functions](s3-select-sql-reference-date.md)
+ [String functions](s3-select-sql-reference-string.md)

# Aggregate functions
<a name="s3-select-sql-reference-aggregate"></a>

Amazon S3 Select supports the following aggregate functions.


| Function | Argument type | Return type | 
| --- | --- | --- | 
| `AVG(expression)` | `INT`, `FLOAT`, `DECIMAL` | `DECIMAL` for an `INT` argument, `FLOAT` for a floating-point argument; otherwise the same as the argument data type. | 
| `COUNT` |  `-`  | `INT` | 
| `MAX(expression)` | `INT`, `DECIMAL` | Same as the argument type. | 
| `MIN(expression)` | `INT`, `DECIMAL` | Same as the argument type. | 
| `SUM(expression)` | `INT`, `FLOAT`, `DOUBLE`, `DECIMAL` | `INT` for an `INT` argument, `FLOAT` for a floating-point argument; otherwise, the same as the argument data type. | 

## SUM example
<a name="s3-select-sql-reference-aggregate-case-examples"></a>

To aggregate the total object sizes of a folder in an [S3 Inventory report](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html), use a `SUM` expression.

The following S3 Inventory report is a CSV file that's compressed with GZIP. There are three columns.
+ The first column is the name of the S3 bucket (*`DOC-EXAMPLE-BUCKET`*) that the S3 Inventory report is for.
+ The second column is the object key name that uniquely identifies the object in the bucket.

  The `example-folder/` value in the first row is for the folder `example-folder`. In Amazon S3, when you create a folder in your bucket, S3 creates a 0-byte object with a key that's set to the folder name that you provided.

  The `example-folder/object1` value in the second row is for the object `object1` in the folder `example-folder`.

  The `example-folder/object2` value in the third row is for the object `object2` in the folder `example-folder`.

  For more information about S3 folders, see [Organizing objects in the Amazon S3 console by using folders](using-folders.md).
+ The third column is the object size in bytes.

```
"DOC-EXAMPLE-BUCKET","example-folder/","0"
"DOC-EXAMPLE-BUCKET","example-folder/object1","2011267"
"DOC-EXAMPLE-BUCKET","example-folder/object2","1570024"
```

To use a `SUM` expression to calculate the total size of the folder `example-folder`, run the SQL query with Amazon S3 Select.

```
SELECT SUM(CAST(_3 as INT)) FROM s3object s WHERE _2 LIKE 'example-folder/%' AND _2 != 'example-folder/';
```

Query Result: 

```
3581291
```
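
To see why the query returns this value, the same filter-and-sum can be sketched in Python over the three inventory rows shown above (a local re-computation for illustration, not how S3 Select runs server-side):

```python
import csv
import io

# The three S3 Inventory rows from the example: bucket, key, size in bytes.
report = (
    '"DOC-EXAMPLE-BUCKET","example-folder/","0"\n'
    '"DOC-EXAMPLE-BUCKET","example-folder/object1","2011267"\n'
    '"DOC-EXAMPLE-BUCKET","example-folder/object2","1570024"\n'
)

# Keep keys under example-folder/ but drop the 0-byte folder marker,
# mirroring the LIKE and != predicates in the SQL query.
total = sum(
    int(size)
    for bucket, key, size in csv.reader(io.StringIO(report))
    if key.startswith("example-folder/") and key != "example-folder/"
)

print(total)  # 3581291
```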

# Conditional functions
<a name="s3-select-sql-reference-conditional"></a>

Amazon S3 Select supports the following conditional functions.

**Topics**
+ [CASE](#s3-select-sql-reference-case)
+ [COALESCE](#s3-select-sql-reference-coalesce)
+ [NULLIF](#s3-select-sql-reference-nullif)

## CASE
<a name="s3-select-sql-reference-case"></a>

The `CASE` expression is a conditional expression, similar to `if/then/else` statements found in other languages. `CASE` is used to specify a result when there are multiple conditions. There are two types of `CASE` expressions: simple and searched.

In simple `CASE` expressions, an expression is compared with a value. When a match is found, the specified action in the `THEN` clause is applied. If no match is found, the action in the `ELSE` clause is applied.

In searched `CASE` expressions, each `CASE` is evaluated based on a Boolean expression, and the `CASE` statement returns the first matching `CASE`. If no matching `CASE` is found among the `WHEN` clauses, the action in the `ELSE` clause is returned.

### Syntax
<a name="s3-select-sql-reference-case-syntax"></a>

**Note**  
Currently, Amazon S3 Select doesn't support `ORDER BY` or queries that contain new lines. Make sure that you use queries with no line breaks.

The following is a simple `CASE` statement that's used to match conditions:

```
CASE expression WHEN value THEN result [WHEN...] [ELSE result] END
```

The following is a searched `CASE` statement that's used to evaluate each condition:

```
CASE WHEN boolean_condition THEN result [WHEN ...] [ELSE result] END
```

### Examples
<a name="s3-select-sql-reference-case-examples"></a>

**Note**  
If you use the Amazon S3 console to run the following examples and your CSV file contains a header row, choose **Exclude the first line of CSV data**. 

**Example 1:** Use a simple `CASE` expression to replace `New York City` with `Big Apple` in a query. Replace all other city names with `other`.

```
SELECT venuecity, CASE venuecity WHEN 'New York City' THEN 'Big Apple' ELSE 'other' END FROM S3Object;
```

Query result: 

```
venuecity        |   case
-----------------+-----------
Los Angeles      | other
New York City    | Big Apple
San Francisco    | other
Baltimore        | other
...
```

**Example 2:** Use a searched `CASE` expression to assign group numbers based on the `pricepaid` value for individual ticket sales:

```
SELECT pricepaid, CASE WHEN CAST(pricepaid as FLOAT) < 10000 THEN 'group 1' WHEN CAST(pricepaid as FLOAT) > 10000 THEN 'group 2' ELSE 'group 3' END FROM S3Object;
```

Query result: 

```
 pricepaid |  case
-----------+---------
  12624.00 | group 2
  10000.00 | group 3
  10000.00 | group 3
   9996.00 | group 1
   9988.00 | group 1
...
```
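
The searched `CASE` evaluates its `WHEN` branches in order and the first match wins, with `ELSE` as the fallback. A minimal Python sketch of the grouping logic from Example 2 (the helper name `price_group` is illustrative, not part of S3 Select):

```python
def price_group(pricepaid: str) -> str:
    """Mirror of the searched CASE: CAST to float, then the first
    matching WHEN branch wins, falling through to ELSE."""
    price = float(pricepaid)
    if price < 10000:
        return "group 1"
    if price > 10000:
        return "group 2"
    return "group 3"

print(price_group("12624.00"))  # group 2
print(price_group("10000.00"))  # group 3
```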

## COALESCE
<a name="s3-select-sql-reference-coalesce"></a>

`COALESCE` evaluates the arguments in order and returns the first non-unknown value, that is, the first non-null or non-missing value. This function does not propagate null and missing values.

### Syntax
<a name="s3-select-sql-reference-coalesce-syntax"></a>

```
COALESCE ( expression, expression, ... )
```

### Parameters
<a name="s3-select-sql-reference-coalesce-parameters"></a>

 *`expression`*   
The target expression that the function operates on.

### Examples
<a name="s3-select-sql-reference-coalesce-examples"></a>

```
COALESCE(1)                -- 1
COALESCE(null)             -- null
COALESCE(null, null)       -- null
COALESCE(missing)          -- null
COALESCE(missing, missing) -- null
COALESCE(1, null)          -- 1
COALESCE(null, null, 1)    -- 1
COALESCE(null, 'string')   -- 'string'
COALESCE(missing, 1)       -- 1
```
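
The examples hinge on `COALESCE` treating both `null` and `missing` as "unknown" and skipping them. A rough Python sketch, assuming `None` stands in for `null` and a sentinel object stands in for PartiQL's `missing`:

```python
MISSING = object()  # illustrative stand-in for the MISSING value

def coalesce(*exprs):
    """Return the first argument that is neither null (None) nor
    MISSING; otherwise return null, matching the examples above."""
    for value in exprs:
        if value is not None and value is not MISSING:
            return value
    return None

print(coalesce(None, None, 1))   # 1
print(coalesce(MISSING, 1))      # 1
print(coalesce(None, "string"))  # string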

## NULLIF
<a name="s3-select-sql-reference-nullif"></a>

Given two expressions, `NULLIF` returns `NULL` if the two expressions evaluate to the same value; otherwise, `NULLIF` returns the result of evaluating the first expression.

### Syntax
<a name="s3-select-sql-reference-nullif-syntax"></a>

```
NULLIF ( expression1, expression2 )
```

### Parameters
<a name="s3-select-sql-reference-nullif-parameters"></a>

 `expression1, expression2`   
The target expressions that the function operates on.

### Examples
<a name="s3-select-sql-reference-nullif-examples"></a>

```
NULLIF(1, 1)             -- null
NULLIF(1, 2)             -- 1
NULLIF(1.0, 1)           -- null
NULLIF(1, '1')           -- 1
NULLIF([1], [1])         -- null
NULLIF(1, NULL)          -- 1
NULLIF(NULL, 1)          -- null
NULLIF(null, null)       -- null
NULLIF(missing, null)    -- null
NULLIF(missing, missing) -- null
```
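
Note from the examples that comparison is by value, so `1.0` and `1` are equal while `1` and `'1'` are not. A one-line Python sketch of the semantics (with `None` standing in for `null`):

```python
def nullif(e1, e2):
    """Null (None here) when both expressions compare equal,
    otherwise the first expression."""
    return None if e1 == e2 else e1

print(nullif(1, 2))    # 1
print(nullif(1.0, 1))  # None
print(nullif(1, "1"))  # 1
```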

# Conversion functions
<a name="s3-select-sql-reference-conversion"></a>

Amazon S3 Select supports the following conversion function.

**Topics**
+ [CAST](#s3-select-sql-reference-cast)

## CAST
<a name="s3-select-sql-reference-cast"></a>

The `CAST` function converts an entity, such as an expression that evaluates to a single value, from one type to another. 

### Syntax
<a name="s3-select-sql-reference-cast-syntax"></a>

```
CAST ( expression AS data_type )
```

### Parameters
<a name="s3-select-sql-reference-cast-parameters"></a>

 *`expression`*   
A combination of one or more values, operators, and SQL functions that evaluate to a value.

 *`data_type`*   
The target data type, such as `INT`, to cast the expression to. For a list of supported data types, see [Data types](s3-select-sql-reference-data-types.md).

### Examples
<a name="s3-select-sql-reference-cast-examples"></a>

```
CAST('2007-04-05T14:30Z' AS TIMESTAMP)
CAST(0.456 AS FLOAT)
```
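
The two casts above can be approximated locally in Python (an illustration of what the conversions produce, not the S3 Select implementation): `float()` mirrors the `FLOAT` cast, and `datetime.fromisoformat` handles the timestamp once the trailing `Z` is rewritten as an explicit offset (required on Python versions before 3.11).

```python
from datetime import datetime

as_float = float("0.456")  # CAST(0.456 AS FLOAT)

# CAST('2007-04-05T14:30Z' AS TIMESTAMP)
as_ts = datetime.fromisoformat("2007-04-05T14:30Z".replace("Z", "+00:00"))

print(as_float)          # 0.456
print(as_ts.year)        # 2007
```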

# Date functions
<a name="s3-select-sql-reference-date"></a>

Amazon S3 Select supports the following date functions.

**Topics**
+ [DATE\_ADD](#s3-select-sql-reference-date-add)
+ [DATE\_DIFF](#s3-select-sql-reference-date-diff)
+ [EXTRACT](#s3-select-sql-reference-extract)
+ [TO\_STRING](#s3-select-sql-reference-to-string)
+ [TO\_TIMESTAMP](#s3-select-sql-reference-to-timestamp)
+ [UTCNOW](#s3-select-sql-reference-utcnow)

## DATE\_ADD
<a name="s3-select-sql-reference-date-add"></a>

Given a date part, a quantity, and a timestamp, `DATE_ADD` returns an updated timestamp by altering the date part by the quantity.

### Syntax
<a name="s3-select-sql-reference-date-add-syntax"></a>

```
DATE_ADD( date_part, quantity, timestamp )
```

### Parameters
<a name="s3-select-sql-reference-date-add-parameters"></a>

*`date_part`*   
Specifies which part of the date to modify. This can be one of the following:  
+ year
+ month
+ day
+ hour
+ minute
+ second

 *`quantity`*   
The value to apply to the updated timestamp. Positive values for `quantity` add to the timestamp's `date_part`, and negative values subtract.

 *`timestamp`*   
The target timestamp that the function operates on.

### Examples
<a name="s3-select-sql-reference-date-add-examples"></a>

```
DATE_ADD(year, 5, `2010-01-01T`)                -- 2015-01-01 (equivalent to 2015-01-01T)
DATE_ADD(month, 1, `2010T`)                     -- 2010-02T (result will add precision as necessary)
DATE_ADD(month, 13, `2010T`)                    -- 2011-02T
DATE_ADD(day, -1, `2017-01-10T`)                -- 2017-01-09 (equivalent to 2017-01-09T)
DATE_ADD(hour, 1, `2017T`)                      -- 2017-01-01T01:00-00:00
DATE_ADD(hour, 1, `2017-01-02T03:04Z`)          -- 2017-01-02T04:04Z
DATE_ADD(minute, 1, `2017-01-02T03:04:05.006Z`) -- 2017-01-02T03:05:05.006Z
DATE_ADD(second, 1, `2017-01-02T03:04:05.006Z`) -- 2017-01-02T03:04:06.006Z
```
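
For the full-precision cases, the arithmetic can be sketched in Python (a rough local equivalent using plain datetimes; Ion-style reduced-precision timestamps such as `` `2010T` `` are out of scope, and the `date_add` helper name is illustrative):

```python
from datetime import datetime, timedelta

def date_add(date_part, quantity, ts):
    """Rough DATE_ADD equivalent: year/month adjust calendar fields,
    smaller parts add a fixed-length timedelta."""
    if date_part == "year":
        return ts.replace(year=ts.year + quantity)
    if date_part == "month":
        months = ts.year * 12 + (ts.month - 1) + quantity
        return ts.replace(year=months // 12, month=months % 12 + 1)
    delta = {"day": timedelta(days=quantity),
             "hour": timedelta(hours=quantity),
             "minute": timedelta(minutes=quantity),
             "second": timedelta(seconds=quantity)}[date_part]
    return ts + delta

print(date_add("month", 13, datetime(2010, 1, 1)))  # 2011-02-01 00:00:00
print(date_add("day", -1, datetime(2017, 1, 10)))   # 2017-01-09 00:00:00
```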

## DATE\_DIFF
<a name="s3-select-sql-reference-date-diff"></a>

Given a date part and two valid timestamps, `DATE_DIFF` returns the difference in date parts. The return value is a negative integer when the `date_part` value of `timestamp1` is greater than the `date_part` value of `timestamp2`. The return value is a positive integer when the `date_part` value of `timestamp1` is less than the `date_part` value of `timestamp2`.

### Syntax
<a name="s3-select-sql-reference-date-diff-syntax"></a>

```
DATE_DIFF( date_part, timestamp1, timestamp2 )
```

### Parameters
<a name="s3-select-sql-reference-date-diff-parameters"></a>

 **`date_part`**   
Specifies which part of the timestamps to compare. For the definition of `date_part`, see [DATE\_ADD](#s3-select-sql-reference-date-add).

 **`timestamp1`**   
The first timestamp to compare.

 **`timestamp2`**   
The second timestamp to compare.

### Examples
<a name="s3-select-sql-reference-date-diff-examples"></a>

```
DATE_DIFF(year, `2010-01-01T`, `2011-01-01T`)            -- 1
DATE_DIFF(month, `2010T`, `2010-05T`)                    -- 4 (2010T is equivalent to 2010-01-01T00:00:00.000Z)
DATE_DIFF(month, `2010T`, `2011T`)                       -- 12
DATE_DIFF(month, `2011T`, `2010T`)                       -- -12
DATE_DIFF(day, `2010-01-01T23:00`, `2010-01-02T01:00`) -- 0 (need to be at least 24h apart to be 1 day apart)
```
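
These examples can be approximated in Python (a rough sketch for plain datetimes; the `date_diff` helper is illustrative): year and month compare calendar fields, while smaller parts count whole elapsed units, which is why two timestamps 2 hours apart differ by 0 days.

```python
from datetime import datetime

def date_diff(date_part, t1, t2):
    """Rough DATE_DIFF equivalent: signed difference in date parts."""
    if date_part == "year":
        return t2.year - t1.year
    if date_part == "month":
        return (t2.year - t1.year) * 12 + (t2.month - t1.month)
    per_unit = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}[date_part]
    return int((t2 - t1).total_seconds() / per_unit)

print(date_diff("month", datetime(2010, 1, 1), datetime(2011, 1, 1)))       # 12
print(date_diff("day", datetime(2010, 1, 1, 23), datetime(2010, 1, 2, 1)))  # 0
```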

## EXTRACT
<a name="s3-select-sql-reference-extract"></a>

Given a date part and a timestamp, `EXTRACT` returns the timestamp's date part value.

### Syntax
<a name="s3-select-sql-reference-extract-syntax"></a>

```
EXTRACT( date_part FROM timestamp )
```

### Parameters
<a name="s3-select-sql-reference-extract-parameters"></a>

 **`date_part`**   
Specifies which part of the timestamps to extract. This can be one of the following:  
+ `YEAR`
+ `MONTH`
+ `DAY`
+ `HOUR`
+ `MINUTE`
+ `SECOND`
+ `TIMEZONE_HOUR`
+ `TIMEZONE_MINUTE`

 **`timestamp`**   
The target timestamp that the function operates on.

### Examples
<a name="s3-select-sql-reference-extract-examples"></a>

```
EXTRACT(YEAR FROM `2010-01-01T`)                           -- 2010
EXTRACT(MONTH FROM `2010T`)                                -- 1 (equivalent to 2010-01-01T00:00:00.000Z)
EXTRACT(MONTH FROM `2010-10T`)                             -- 10
EXTRACT(HOUR FROM `2017-01-02T03:04:05+07:08`)             -- 3
EXTRACT(MINUTE FROM `2017-01-02T03:04:05+07:08`)           -- 4
EXTRACT(TIMEZONE_HOUR FROM `2017-01-02T03:04:05+07:08`)    -- 7
EXTRACT(TIMEZONE_MINUTE FROM `2017-01-02T03:04:05+07:08`)  -- 8
```
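
The timezone examples can be checked locally with Python's `datetime` (an illustration for a positive UTC offset only; negative offsets need extra care because `timedelta.seconds` is always non-negative):

```python
from datetime import datetime

ts = datetime.fromisoformat("2017-01-02T03:04:05+07:08")
offset = ts.utcoffset()  # timedelta of +7 hours 8 minutes

print(ts.hour)                      # 3  (HOUR)
print(ts.minute)                    # 4  (MINUTE)
print(offset.seconds // 3600)       # 7  (TIMEZONE_HOUR)
print(offset.seconds % 3600 // 60)  # 8  (TIMEZONE_MINUTE)
```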

## TO\_STRING
<a name="s3-select-sql-reference-to-string"></a>

Given a timestamp and a format pattern, `TO_STRING` returns a string representation of the timestamp in the given format.

### Syntax
<a name="s3-select-sql-reference-size-syntax"></a>

```
TO_STRING ( timestamp, time_format_pattern )
```

### Parameters
<a name="s3-select-sql-reference-size-parameters"></a>

 *`timestamp`*   
The target timestamp that the function operates on.

 *`time_format_pattern`*   
A string that has the following special character interpretations:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-select-sql-reference-date.html)

### Examples
<a name="s3-select-sql-reference-size-examples"></a>

```
TO_STRING(`1969-07-20T20:18Z`,  'MMMM d, y')                    -- "July 20, 1969"
TO_STRING(`1969-07-20T20:18Z`, 'MMM d, yyyy')                   -- "Jul 20, 1969"
TO_STRING(`1969-07-20T20:18Z`, 'M-d-yy')                        -- "7-20-69"
TO_STRING(`1969-07-20T20:18Z`, 'MM-d-y')                        -- "07-20-1969"
TO_STRING(`1969-07-20T20:18Z`, 'MMMM d, y h:m a')               -- "July 20, 1969 8:18 PM"
TO_STRING(`1969-07-20T20:18Z`, 'y-MM-dd''T''H:m:ssX')           -- "1969-07-20T20:18:00Z"
TO_STRING(`1969-07-20T20:18+08:00Z`, 'y-MM-dd''T''H:m:ssX')     -- "1969-07-20T20:18:00Z"
TO_STRING(`1969-07-20T20:18+08:00`, 'y-MM-dd''T''H:m:ssXXXX')   -- "1969-07-20T20:18:00+0800"
TO_STRING(`1969-07-20T20:18+08:00`, 'y-MM-dd''T''H:m:ssXXXXX')  -- "1969-07-20T20:18:00+08:00"
```
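
The pattern symbols are Java-style (`MMMM` full month name, `d` unpadded day, `y` year), not `strftime` codes. The first example can be reproduced in Python with explicit field formatting (an illustration of what the pattern means, not the S3 Select formatter):

```python
from datetime import datetime, timezone

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

ts = datetime(1969, 7, 20, 20, 18, tzinfo=timezone.utc)

# 'MMMM d, y' = full month name, unpadded day of month, year
rendered = f"{MONTHS[ts.month - 1]} {ts.day}, {ts.year}"
print(rendered)  # July 20, 1969
```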

## TO\_TIMESTAMP
<a name="s3-select-sql-reference-to-timestamp"></a>

Given a string, `TO_TIMESTAMP` converts it to a timestamp. `TO_TIMESTAMP` is the inverse operation of `TO_STRING`.

### Syntax
<a name="s3-select-sql-reference-to-timestamp-syntax"></a>

```
TO_TIMESTAMP ( string )
```

### Parameters
<a name="s3-select-sql-reference-to-timestamp-parameters"></a>

 *`string`*   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-to-timestamp-examples"></a>

```
TO_TIMESTAMP('2007T')                         -- `2007T`
TO_TIMESTAMP('2007-02-23T12:14:33.079-08:00') -- `2007-02-23T12:14:33.079-08:00`
```

## UTCNOW
<a name="s3-select-sql-reference-utcnow"></a>

`UTCNOW` returns the current time in UTC as a timestamp.

### Syntax
<a name="s3-select-sql-reference-utcnow-syntax"></a>

```
UTCNOW()
```

### Parameters
<a name="s3-select-sql-reference-utcnow-parameters"></a>

`UTCNOW` takes no parameters.

### Examples
<a name="s3-select-sql-reference-utcnow-examples"></a>

```
UTCNOW() -- 2017-10-13T16:02:11.123Z
```

# String functions
<a name="s3-select-sql-reference-string"></a>

Amazon S3 Select supports the following string functions.

**Topics**
+ [CHAR\_LENGTH, CHARACTER\_LENGTH](#s3-select-sql-reference-char-length)
+ [LOWER](#s3-select-sql-reference-lower)
+ [SUBSTRING](#s3-select-sql-reference-substring)
+ [TRIM](#s3-select-sql-reference-trim)
+ [UPPER](#s3-select-sql-reference-upper)

## CHAR\_LENGTH, CHARACTER\_LENGTH
<a name="s3-select-sql-reference-char-length"></a>

`CHAR_LENGTH` (or `CHARACTER_LENGTH`) counts the number of characters in the specified string.

**Note**  
`CHAR_LENGTH` and `CHARACTER_LENGTH` are synonyms.

### Syntax
<a name="s3-select-sql-reference-char-length-syntax"></a>

```
CHAR_LENGTH ( string )
```

### Parameters
<a name="s3-select-sql-reference-char-length-parameters"></a>

 *`string`*   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-char-length-examples"></a>

```
CHAR_LENGTH('')          -- 0
CHAR_LENGTH('abcdefg')   -- 7
```

## LOWER
<a name="s3-select-sql-reference-lower"></a>

Given a string, `LOWER` converts all uppercase characters to lowercase characters. Any non-uppercased characters remain unchanged.

### Syntax
<a name="s3-select-sql-reference-lower-syntax"></a>

```
LOWER ( string )
```

### Parameters
<a name="s3-select-sql-reference-lower-parameters"></a>

 **`string`**   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-lower-examples"></a>

```
LOWER('AbCdEfG!@#$') -- 'abcdefg!@#$'
```

## SUBSTRING
<a name="s3-select-sql-reference-substring"></a>

Given a string, a start index, and optionally a length, `SUBSTRING` returns the substring from the start index up to the end of the string, or up to the length provided.

**Note**  
The first character of the input string has an index position of 1.  
If `start` < 1 and no length is specified, the index position is set to 1.  
If `start` < 1 and a length is specified, the index position is set to `start + length - 1`.  
If `start + length - 1` < 0, an empty string is returned.  
If `start + length - 1` >= 0, the substring starting at index position 1 with length `start + length - 1` is returned.

### Syntax
<a name="s3-select-sql-reference-substring-syntax"></a>

```
SUBSTRING( string FROM start [ FOR length ] )
```

### Parameters
<a name="s3-select-sql-reference-substring-parameters"></a>

 **`string`**   
The target string that the function operates on.

 **`start`**   
The start position of the string.

 **`length`**   
The length of the substring to return. If not present, proceed to the end of the string.

### Examples
<a name="s3-select-sql-reference-substring-examples"></a>

```
SUBSTRING("123456789", 0)      -- "123456789"
SUBSTRING("123456789", 1)      -- "123456789"
SUBSTRING("123456789", 2)      -- "23456789"
SUBSTRING("123456789", -4)     -- "123456789"
SUBSTRING("123456789", 0, 999) -- "123456789" 
SUBSTRING("123456789", 1, 5)   -- "12345"
```
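
The 1-based index rules from the note above can be written out as a short Python sketch (the `substring` helper is illustrative, mapping the 1-based positions onto Python's 0-based slices):

```python
def substring(s, start, length=None):
    """Sketch of SUBSTRING's 1-based index rules."""
    if length is None:
        # start < 1 is clamped to index position 1
        return s[max(start, 1) - 1:]
    end = start + length - 1  # last 1-based index position
    if end < 0:
        return ""
    begin = max(start, 1)
    # 1-based inclusive [begin, end] maps to the slice [begin-1, end)
    return s[begin - 1:end]

print(substring("123456789", -4))    # 123456789
print(substring("123456789", 2))     # 23456789
print(substring("123456789", 1, 5))  # 12345
```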

## TRIM
<a name="s3-select-sql-reference-trim"></a>

Trims leading or trailing characters from a string. The default character to remove is a space (`' '`).

### Syntax
<a name="s3-select-sql-reference-trim-syntax"></a>

```
TRIM ( [[LEADING | TRAILING | BOTH remove_chars] FROM] string )
```

### Parameters
<a name="s3-select-sql-reference-trim-parameters"></a>

 **`string`**   
The target string that the function operates on.

 `LEADING` \| `TRAILING` \| `BOTH`   
This parameter indicates whether to trim leading or trailing characters, or both leading and trailing characters.

 **`remove_chars`**   
The set of characters to remove. `remove_chars` can be a string longer than one character. The function removes any character from `remove_chars` that appears at the beginning or end of the string.

### Examples
<a name="s3-select-sql-reference-trim-examples"></a>

```
TRIM('       foobar         ')               -- 'foobar'
TRIM('      \tfoobar\t         ')            -- '\tfoobar\t'
TRIM(LEADING FROM '       foobar         ')  -- 'foobar         '
TRIM(TRAILING FROM '       foobar         ') -- '       foobar'
TRIM(BOTH FROM '       foobar         ')     -- 'foobar'
TRIM(BOTH '12' FROM '1112211foobar22211122') -- 'foobar'
```
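
Python's `str.strip` family has the same set-of-characters semantics, so the examples can be reproduced directly (shown for comparison only):

```python
pad = '       foobar         '

print(repr(pad.strip()))   # TRIM(...)               -> 'foobar'
print(repr(pad.lstrip()))  # TRIM(LEADING FROM ...)  -> 'foobar         '
print(repr(pad.rstrip()))  # TRIM(TRAILING FROM ...) -> '       foobar'

# TRIM(BOTH '12' FROM ...): strip any of the characters '1' and '2'
print(repr('1112211foobar22211122'.strip('12')))  # 'foobar'
```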

## UPPER
<a name="s3-select-sql-reference-upper"></a>

Given a string, `UPPER` converts all lowercase characters to uppercase characters. Any non-lowercased characters remain unchanged.

### Syntax
<a name="s3-select-sql-reference-upper-syntax"></a>

```
UPPER ( string )
```

### Parameters
<a name="s3-select-sql-reference-upper-parameters"></a>

 **`string`**   
The target string that the function operates on.

### Examples
<a name="s3-select-sql-reference-upper-examples"></a>

```
UPPER('AbCdEfG!@#$') -- 'ABCDEFG!@#$'
```