

# ORDER BY clause
<a name="sql-reference-order-by-clause"></a>

A streaming query can use ORDER BY if its leading expression is time-based and monotonic. For example, a streaming query whose leading expression is based on the ROWTIME column can use ORDER BY to do the following operations:
+ Sort the results of a streaming GROUP BY.
+ Sort a batch of rows arriving within a fixed time window of a stream.
+ Perform streaming ORDER BY on windowed joins.

The "time-based and monotonic" requirement on the leading expression means that the query

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM DISTINCT ticker FROM trades ORDER BY ticker
```

will fail, but the query

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM DISTINCT rowtime, ticker FROM trades ORDER BY ROWTIME, ticker
```

will succeed.

**Note**  
The preceding examples use the `DISTINCT` clause to remove duplicate instances of the same ticker symbol from the result set, so that the results will be monotonic.

Streaming ORDER BY sorts rows using SQL-2008 compliant syntax for the ORDER BY clause. It can be combined with a UNION ALL statement, and can sort on expressions, such as:

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM x, y FROM t1
UNION ALL
SELECT STREAM a, b FROM t2 ORDER BY ROWTIME, MOD(x, 5)
```

The ORDER BY clause can specify ascending or descending sort order, and can refer to items by column ordinal, that is, by their position in the select list.

**Note**  
The `UNION ALL` statement in the preceding query collects records from two separate streams for ordering.
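
As a minimal sketch of descending order combined with a select-list ordinal, the following query sorts each one-minute batch of trades by the second select item, highest first. The `price` column and the `t` alias are assumptions made for illustration, and the destination stream is assumed to have matching columns.

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM ticker, price
FROM trades AS t
-- The leading expression is time-based and monotonic.
-- Ordinal 2 refers to price, the second item in the select list (hypothetical column).
ORDER BY FLOOR(t.ROWTIME TO MINUTE), 2 DESC
```
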
<a name="TOC32"></a>
**Streaming ORDER BY SQL Declarations**  
The streaming ORDER BY clause includes the following functional attributes:
+ Gathers rows until the monotonic expression in the streaming ORDER BY clause changes.
+ Does not require a streaming GROUP BY clause in the same statement.
+ Can use any column with a basic SQL data type of TIMESTAMP, DATE, DECIMAL, INTEGER, FLOAT, CHAR, or VARCHAR.
+ Does not require that columns or expressions in the ORDER BY clause be present in the SELECT list of the statement.
+ Applies all the standard SQL validation rules for the ORDER BY clause.

The following query is an example of streaming ORDER BY:

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM state, city, SUM(amount)
FROM orders
GROUP BY FLOOR(ROWTIME TO HOUR), state, city
ORDER BY FLOOR(ROWTIME TO HOUR), state, SUM(amount)
```
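
Two of the attributes listed above, no GROUP BY requirement and sorting on a column that is absent from the select list, can be sketched as follows. This is an illustration under assumptions rather than a tested query: `customer_id` is a hypothetical column of `orders`, and the `o` alias is introduced only to qualify ROWTIME.

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM state, city, amount
FROM orders AS o
-- No GROUP BY: rows are gathered per minute and sorted within each minute.
-- customer_id (hypothetical) is used for sorting but is not selected.
ORDER BY FLOOR(o.ROWTIME TO MINUTE), o.customer_id
```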

# T-sorting Stream Input
<a name="sqlrf_T-sorting_Stream_Input"></a>

Amazon Kinesis Data Analytics real-time analytics rely on the fact that arriving data is ordered by ROWTIME. However, data arriving from multiple sources is sometimes not time-synchronized.

Amazon Kinesis Data Analytics can sort data from individual data sources that have been independently inserted into the application's native stream. In some cases, however, such data may have already been combined from multiple sources (such as for efficient consumption at an earlier stage in processing). At other times, high-volume data sources could make direct insertion impossible.

In addition, an unreliable data source could block progress by forcing the Amazon Kinesis Data Analytics application to wait indefinitely, unable to proceed until all connected data sources deliver. In this case, data from this source could be unsynchronized.

You can use the ORDER BY clause to resolve these issues. Amazon Kinesis Data Analytics uses a sliding time-based window of incoming rows to reorder those rows by ROWTIME.

**Syntax**  
You specify the time-based parameter for sorting and the time-based window in which the streaming rows are to be time-sorted, using the following syntax:

```
  ORDER BY <timestamp_expr> WITHIN
      <interval_literal>
```

**Restrictions**  
The T-sort has the following restrictions:
+ The data type of the ORDER BY expression must be TIMESTAMP.
+ The partially ordered expression `<timestamp_expr>` must be present in the select list of the query with the alias ROWTIME.
+ The leading expression of the ORDER BY clause must not contain the ROWTIME function and must not use the DESC keyword.
+ The ROWTIME column must be fully qualified. For example:
  + `ORDER BY FLOOR(ROWTIME TO MINUTE), ...` fails.
  + `ORDER BY FLOOR(s.ROWTIME TO MINUTE), ...` works.

If any of these requirements are not met, the statement will fail with errors.

Additional notes:
+ You cannot use incoming rowtime bounds. The system ignores them.
+ If `<timestamp_expr>` evaluates to NULL, the corresponding row is discarded.
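
A T-sort that follows the syntax and restrictions above might look like the sketch below. It assumes a source stream named `"SOURCE_SQL_STREAM_001"` with a TIMESTAMP column `event_time`, plus `ticker` and `price` columns. All of these names and the 10-second window are hypothetical choices for illustration.

```
CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
-- event_time (hypothetical TIMESTAMP column) appears in the select list
-- with the alias ROWTIME, as the restrictions require.
SELECT STREAM event_time AS ROWTIME, ticker, price
FROM "SOURCE_SQL_STREAM_001"
-- Reorder rows by event_time within a sliding 10-second window.
-- The leading expression does not reference ROWTIME and does not use DESC.
ORDER BY event_time WITHIN INTERVAL '10' SECOND
```

The window length is a trade-off: a longer interval tolerates rows that arrive further out of order, but delays output accordingly.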