

# Window functions
<a name="Window_functions"></a>

By using window functions, you can create analytic business queries more efficiently. Window functions operate on a partition or "window" of a result set, and return a value for every row in that window. In contrast, non-windowed functions perform their calculations with respect to every row in the result set. Unlike group functions that aggregate result rows, window functions retain all rows in the table expression. 

 The values returned are calculated by using values from the sets of rows in that window. For each row in the table, the window defines a set of rows that is used to compute additional attributes. A window is defined using a window specification (the OVER clause), and is based on three main concepts: 
+  *Window partitioning,* which forms groups of rows (PARTITION clause) 
+  *Window ordering*, which defines an order or sequence of rows within each partition (ORDER BY clause) 
+  *Window frames*, which are defined relative to each row to further restrict the set of rows (ROWS specification) 

Window functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the window functions are processed. Therefore, window functions can appear only in the select list or ORDER BY clause. You can use multiple window functions within a single query with different frame clauses. You can also use window functions in other scalar expressions, such as CASE. 

## Window function syntax summary
<a name="Window_function_synopsis"></a>

Window functions follow a standard syntax, which is as follows.

```
function (expression) OVER (
[ PARTITION BY expr_list ]
[ ORDER BY order_list [ frame_clause ] ] )
```

 Here, *function* is one of the functions described in this section.

The *expr\$1list* is as follows.

```
expression | column_name [, expr_list ]
```

 The *order\$1list* is as follows. 

```
expression | column_name [ ASC | DESC ] 
[ NULLS FIRST | NULLS LAST ]
[, order_list ]
```

 The *frame\$1clause* is as follows. 

```
ROWS
{ UNBOUNDED PRECEDING | unsigned_value PRECEDING | CURRENT ROW } |

{ BETWEEN
{ UNBOUNDED PRECEDING | unsigned_value { PRECEDING | FOLLOWING } | CURRENT ROW}
AND
{ UNBOUNDED FOLLOWING | unsigned_value { PRECEDING | FOLLOWING } | CURRENT ROW }}
```

### Arguments
<a name="Window_function_synopsis-arguments"></a>

 *function*   
The window function. For details, see the individual function descriptions. 

OVER   
The clause that defines the window specification. The OVER clause is mandatory for window functions, and differentiates window functions from other SQL functions. 

PARTITION BY *expr\$1list*   
(Optional) The PARTITION BY clause subdivides the result set into partitions, much like the GROUP BY clause. If a partition clause is present, the function is calculated for the rows in each partition. If no partition clause is specified, a single partition contains the entire table, and the function is computed for that complete table.   
The ranking functions DENSE\$1RANK, NTILE, RANK, and ROW\$1NUMBER require a global comparison of all the rows in the result set. When a PARTITION BY clause is used, the query optimizer can run each aggregation in parallel by spreading the workload across multiple slices according to the partitions. If the PARTITION BY clause is not present, the aggregation step must be run serially on a single slice, which can have a significant negative impact on performance, especially for large clusters.  
AWS Clean Rooms doesn't support string literals in PARTITION BY clauses.

ORDER BY *order\$1list*   
(Optional) The window function is applied to the rows within each partition sorted according to the order specification in ORDER BY. This ORDER BY clause is distinct from and completely unrelated to ORDER BY clauses in the *frame\$1clause*. The ORDER BY clause can be used without the PARTITION BY clause.   
For ranking functions, the ORDER BY clause identifies the measures for the ranking values. For aggregation functions, the partitioned rows must be ordered before the aggregate function is computed for each frame. For more about window function types, see [Window functions](#Window_functions).  
Column identifiers or expressions that evaluate to column identifiers are required in the order list. Neither constants nor constant expressions can be used as substitutes for column names.   
NULLS values are treated as their own group, sorted and ranked according to the NULLS FIRST or NULLS LAST option. By default, NULL values are sorted and ranked last in ASC ordering, and sorted and ranked first in DESC ordering.  
AWS Clean Rooms doesn't support string literals in ORDER BY clauses.  
 If the ORDER BY clause is omitted, the order of the rows is nondeterministic.   
In any parallel system such as AWS Clean Rooms, when an ORDER BY clause doesn't produce a unique and total ordering of the data, the order of the rows is nondeterministic. That is, if the ORDER BY expression produces duplicate values (a partial ordering), the return order of those rows might vary from one run of AWS Clean Rooms to the next. In turn, window functions might return unexpected or inconsistent results. For more information, see [Unique ordering of data for window functions](#Examples_order_by_WF). 

 *column\$1name*   
Name of a column to be partitioned by or ordered by. 

ASC \$1 DESC   
Option that defines the sort order for the expression, as follows:   
+ ASC: ascending (for example, low to high for numeric values and 'A' to 'Z' for character strings). If no option is specified, data is sorted in ascending order by default.
+ DESC: descending (high to low for numeric values; 'Z' to 'A' for strings). 

NULLS FIRST \$1 NULLS LAST  
Option that specifies whether NULLS should be ordered first, before non-null values, or last, after non-null values. By default, NULLS are sorted and ranked last in ASC ordering, and sorted and ranked first in DESC ordering.

 *frame\$1clause*   
For aggregate functions, the frame clause further refines the set of rows in a function's window when using ORDER BY. It enables you to include or exclude sets of rows within the ordered result. The frame clause consists of the ROWS keyword and associated specifiers.  
The frame clause doesn't apply to ranking functions. Also, the frame clause isn't required when no ORDER BY clause is used in the OVER clause for an aggregate function. If an ORDER BY clause is used for an aggregate function, an explicit frame clause is required.  
When no ORDER BY clause is specified, the implied frame is unbounded, equivalent to ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. 

ROWS  
This clause defines the window frame by specifying a physical offset from the current row.  
This clause specifies the rows in the current window or partition that the value in the current row is to be combined with. It uses arguments that specify row position, which can be before or after the current row. The reference point for all window frames is the current row. Each row becomes the current row in turn as the window frame slides forward in the partition.  
The frame can be a simple set of rows up to and including the current row.  

```
{UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW}
```
Or it can be a set of rows between two boundaries.  

```
BETWEEN
{ UNBOUNDED PRECEDING | offset { PRECEDING | FOLLOWING } | CURRENT ROW }
AND
{ UNBOUNDED FOLLOWING | offset { PRECEDING | FOLLOWING } | CURRENT ROW }
```
UNBOUNDED PRECEDING indicates that the window starts at the first row of the partition; *offset* PRECEDING indicates that the window starts a number of rows equivalent to the value of offset before the current row. UNBOUNDED PRECEDING is the default.  
CURRENT ROW indicates the window begins or ends at the current row.   
UNBOUNDED FOLLOWING indicates that the window ends at the last row of the partition; *offset* FOLLOWING indicates that the window ends a number of rows equivalent to the value of offset after the current row.  
*offset* identifies a physical number of rows before or after the current row. In this case, *offset* must be a constant that evaluates to a positive numeric value. For example, 5 FOLLOWING ends the frame five rows after the current row.  
Where BETWEEN is not specified, the frame is implicitly bounded by the current row. For example, `ROWS 5 PRECEDING` is equal to `ROWS BETWEEN 5 PRECEDING AND CURRENT ROW`. Also, `ROWS UNBOUNDED FOLLOWING` is equal to `ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING`.  
You can't specify a frame in which the starting boundary is greater than the ending boundary. For example, you can't specify any of the following frames.   

```
between 5 following and 5 preceding
between current row and 2 preceding
between 3 following and current row
```

## Unique ordering of data for window functions
<a name="Examples_order_by_WF"></a>

If an ORDER BY clause for a window function doesn't produce a unique and total ordering of the data, the order of the rows is nondeterministic. If the ORDER BY expression produces duplicate values (a partial ordering), the return order of those rows can vary in multiple runs. In this case, window functions can also return unexpected or inconsistent results. 

For example, the following query returns different results over multiple runs. These different results occur because `order by dateid` doesn't produce a unique ordering of the data for the SUM window function. 

```
select dateid, pricepaid,
sum(pricepaid) over(order by dateid rows unbounded preceding) as sumpaid
from sales
group by dateid, pricepaid;

dateid | pricepaid |   sumpaid
--------+-----------+-------------
1827 |   1730.00 |     1730.00
1827 |    708.00 |     2438.00
1827 |    234.00 |     2672.00
...

select dateid, pricepaid,
sum(pricepaid) over(order by dateid rows unbounded preceding) as sumpaid
from sales
group by dateid, pricepaid;

dateid | pricepaid |   sumpaid
--------+-----------+-------------
1827 |    234.00 |      234.00
1827 |    472.00 |      706.00
1827 |    347.00 |     1053.00
...
```

 In this case, adding a second ORDER BY column to the window function can solve the problem. 

```
select dateid, pricepaid,
sum(pricepaid) over(order by dateid, pricepaid rows unbounded preceding) as sumpaid
from sales
group by dateid, pricepaid;

dateid | pricepaid | sumpaid
--------+-----------+---------
1827 |    234.00 |  234.00
1827 |    337.00 |  571.00
1827 |    347.00 |  918.00
...
```

## Supported functions
<a name="Window_function_supported"></a>

AWS Clean Rooms Spark SQL supports two types of window functions: aggregate and ranking.

Following are the supported aggregate functions: 
+ [CUME\$1DIST window function](WF_CUME_DIST.md)
+ [DENSE\$1RANK window function](WF_DENSE_RANK.md)
+ [FIRST window function](WF_FIRST.md)
+ [FIRST\$1VALUE window function](WF_first_value.md)
+ [LAG window function](WF_LAG.md) 
+ [LAST window function](WF-LAST.md)
+ [LAST\$1VALUE window function](WF_last_value.md)
+ [LEAD window function](WF_LEAD.md) 

Following are the supported ranking functions: 
+ [DENSE\$1RANK window function](WF_DENSE_RANK.md) 
+ [PERCENT\$1RANK window function](WF_PERCENT_RANK.md)
+ [RANK window function](WF_RANK.md) 
+ [ROW\$1NUMBER window function](WF_ROW_NUMBER.md)

## Sample table for window function examples
<a name="Window_function_example"></a>

You can find specific window function examples with each function description. Some of the examples use a table named WINSALES, which contains 11 rows, as shown in the following table.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/sql-reference/Window_functions.html)

# CUME\$1DIST window function
<a name="WF_CUME_DIST"></a>

Calculates the cumulative distribution of a value within a window or partition. Assuming ascending ordering, the cumulative distribution is determined using this formula:

`count of rows with values <= x / count of rows in the window or partition`

where *x* equals the value in the current row of the column specified in the ORDER BY clause. The following dataset illustrates use of this formula:

```
Row#	Value	  Calculation    CUME_DIST
1        2500	   (1)/(5)	   0.2
2        2600	   (2)/(5)	   0.4
3        2800	   (3)/(5)	   0.6
4        2900	   (4)/(5)	   0.8
5        3100	   (5)/(5)	   1.0
```

The return value range is >0 to 1, inclusive.

## Syntax
<a name="WF_CUME_DIST-synopsis"></a>

```
CUME_DIST ()
OVER ( 
[ PARTITION BY partition_expression ] 
[ ORDER BY order_list ]
)
```

## Arguments
<a name="WF_CUME_DIST-arguments"></a>

OVER  
A clause that specifies the window partitioning. The OVER clause cannot contain a window frame specification.

PARTITION BY *partition\$1expression*   
Optional. An expression that sets the range of records for each group in the OVER clause.

ORDER BY *order\$1list*   
The expression on which to calculate cumulative distribution. The expression must have either a numeric data type or be implicitly convertible to one. If ORDER BY is omitted, the return value is 1 for all rows.   
If ORDER BY doesn't produce a unique ordering, the order of the rows is nondeterministic. For more information, see [Unique ordering of data for window functions](Window_functions.md#Examples_order_by_WF). 

## Return type
<a name="WF_CUME_DIST-returns"></a>

FLOAT8

## Examples
<a name="WF_CUME_DIST-examples"></a>

The following example calculates the cumulative distribution of the quantity for each seller:

```
select sellerid, qty, cume_dist() 
over (partition by sellerid order by qty) 
from winsales;

sellerid   qty	   cume_dist
--------------------------------------------------
1         10.00	   0.33
1         10.64	   0.67
1         30.37	   1
3         10.04	   0.25
3         15.15	   0.5
3         20.75	   0.75
3         30.55	   1
2         20.09	   0.5
2         20.12	   1
4         10.12	   0.5
4         40.23	   1
```

For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example).

# DENSE\$1RANK window function
<a name="WF_DENSE_RANK"></a>

The DENSE\$1RANK window function determines the rank of a value in a group of values, based on the ORDER BY expression in the OVER clause. If the optional PARTITION BY clause is present, the rankings are reset for each group of rows. Rows with equal values for the ranking criteria receive the same rank. The DENSE\$1RANK function differs from RANK in one respect: If two or more rows tie, there is no gap in the sequence of ranked values. For example, if two rows are ranked 1, the next rank is 2. 

You can have ranking functions with different PARTITION BY and ORDER BY clauses in the same query. 

## Syntax
<a name="WF_DENSE_RANK-synopsis"></a>

```
DENSE_RANK () OVER
(
[ PARTITION BY expr_list ]
[ ORDER BY order_list ]
)
```

## Arguments
<a name="WF_DENSE_RANK-arguments"></a>

( )   
The function takes no arguments, but the empty parentheses are required. 

OVER   
The window clauses for the DENSE\$1RANK function. 

PARTITION BY *expr\$1list*   
Optional. One or more expressions that define the window. 

ORDER BY *order\$1list*   
Optional. The expression on which the ranking values are based. If no PARTITION BY is specified, ORDER BY uses the entire table. If ORDER BY is omitted, the return value is 1 for all rows.   
If ORDER BY doesn't produce a unique ordering, the order of the rows is nondeterministic. For more information, see [Unique ordering of data for window functions](Window_functions.md#Examples_order_by_WF). 

## Return type
<a name="Supported_data_types_wf_dense_rank"></a>

INTEGER

## Examples
<a name="WF_DENSE_RANK-examples"></a>

The following example orders the table by the quantity sold (in descending order), and assign both a dense rank and a regular rank to each row. The results are sorted after the window function results are applied. 

```
select salesid, qty,
dense_rank() over(order by qty desc) as d_rnk,
rank() over(order by qty desc) as rnk
from winsales
order by 2,1;

salesid | qty | d_rnk | rnk
---------+-----+-------+-----
10001 |  10 |     5 |   8
10006 |  10 |     5 |   8
30001 |  10 |     5 |   8
40005 |  10 |     5 |   8
30003 |  15 |     4 |   7
20001 |  20 |     3 |   4
20002 |  20 |     3 |   4
30004 |  20 |     3 |   4
10005 |  30 |     2 |   2
30007 |  30 |     2 |   2
40001 |  40 |     1 |   1
(11 rows)
```

Note the difference in rankings assigned to the same set of rows when the DENSE\$1RANK and RANK functions are used side by side in the same query. For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example).

The following example partitions the table by SELLERID and orders each partition by the quantity (in descending order) and assign a dense rank to each row. The results are sorted after the window function results are applied. 

```
select salesid, sellerid, qty,
dense_rank() over(partition by sellerid order by qty desc) as d_rnk
from winsales
order by 2,3,1;

salesid | sellerid | qty | d_rnk
---------+----------+-----+-------
10001 |        1 |  10 |     2
10006 |        1 |  10 |     2
10005 |        1 |  30 |     1
20001 |        2 |  20 |     1
20002 |        2 |  20 |     1
30001 |        3 |  10 |     4
30003 |        3 |  15 |     3
30004 |        3 |  20 |     2
30007 |        3 |  30 |     1
40005 |        4 |  10 |     2
40001 |        4 |  40 |     1
(11 rows)
```

 For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example).

# FIRST window function
<a name="WF_FIRST"></a>

 Given an ordered set of rows, FIRST returns the value of the specified expression with respect to the first row in the window frame.

For information about selecting the last row in the frame, see [LAST window function](WF-LAST.md).

## Syntax
<a name="WF_FIRST-synopsis"></a>

```
FIRST( expression )[ IGNORE NULLS | RESPECT NULLS ]
OVER (
[ PARTITION BY expr_list ]
[ ORDER BY order_list frame_clause ]
)
```

## Arguments
<a name="WF_FIRST-arguments"></a>

 *expression*   
 The target column or expression that the function operates on. 

IGNORE NULLS   
When this option is used with FIRST, the function returns the first value in the frame that is not NULL (or NULL if all values are NULL).

RESPECT NULLS   
 Indicates that AWS Clean Rooms should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS. 

OVER   
Introduces the window clauses for the function. 

PARTITION BY *expr\$1list*   
Defines the window for the function in terms of one or more expressions. 

ORDER BY *order\$1list*   
Sorts the rows within each partition. If no PARTITION BY clause is specified, ORDER BY sorts the entire table. If you specify an ORDER BY clause, you must also specify a *frame\$1clause*.   
The results of the FIRST function depends on the ordering of the data. The results are nondeterministic in the following cases:   
+ When no ORDER BY clause is specified and a partition contains two different values for an expression 
+ When the expression evaluates to different values that correspond to the same value in the ORDER BY list. 

 *frame\$1clause*   
If an ORDER BY clause is used for an aggregate function, an explicit frame clause is required. The frame clause refines the set of rows in a function's window, including or excluding sets of rows in the ordered result. The frame clause consists of the ROWS keyword and associated specifiers. See [Window function syntax summary](Window_functions.md#Window_function_synopsis). 

## Return type
<a name="Supported_data_types_wf_first"></a>

These functions support expressions that use primitive AWS Clean Rooms data types. The return type is the same as the data type of the *expression*.

## Examples
<a name="WF_FIRST-examples"></a>

The following example returns the seating capacity for each venue in the VENUE table, with the results ordered by capacity (high to low). The FIRST function is used to select the name of the venue that corresponds to the first row in the frame: in this case, the row with the highest number of seats. The results are partitioned by state, so when the VENUESTATE value changes, a new first value is selected. The window frame is unbounded so the same first value is selected for each row in each partition. 

For California, `Qualcomm Stadium` has the highest number of seats (`70561`), so this name is the first value for all of the rows in the `CA` partition. 

```
select venuestate, venueseats, venuename,
first(venuename)
over(partition by venuestate
order by venueseats desc
rows between unbounded preceding and unbounded following)
from (select * from venue where venueseats >0)
order by venuestate;

venuestate | venueseats |           venuename            |         first
-----------+------------+--------------------------------+------------------------------
CA         |      70561 | Qualcomm Stadium               | Qualcomm Stadium
CA         |      69843 | Monster Park                   | Qualcomm Stadium
CA         |      63026 | McAfee Coliseum                | Qualcomm Stadium
CA         |      56000 | Dodger Stadium                 | Qualcomm Stadium
CA         |      45050 | Angel Stadium of Anaheim       | Qualcomm Stadium
CA         |      42445 | PETCO Park                     | Qualcomm Stadium
CA         |      41503 | AT&T Park                      | Qualcomm Stadium
CA         |      22000 | Shoreline Amphitheatre         | Qualcomm Stadium
CO         |      76125 | INVESCO Field                  | INVESCO Field
CO         |      50445 | Coors Field                    | INVESCO Field
DC         |      41888 | Nationals Park                 | Nationals Park
FL         |      74916 | Dolphin Stadium                | Dolphin Stadium
FL         |      73800 | Jacksonville Municipal Stadium | Dolphin Stadium
FL         |      65647 | Raymond James Stadium          | Dolphin Stadium
FL         |      36048 | Tropicana Field                | Dolphin Stadium
...
```

# FIRST\$1VALUE window function
<a name="WF_first_value"></a>

 Given an ordered set of rows, FIRST\$1VALUE returns the value of the specified expression with respect to the first row in the window frame.

For information about selecting the last row in the frame, see [LAST\$1VALUE window function](WF_last_value.md) .

## Syntax
<a name="WF_first_value-synopsis"></a>

```
FIRST_VALUE( expression )[ IGNORE NULLS | RESPECT NULLS ]
OVER (
[ PARTITION BY expr_list ]
[ ORDER BY order_list frame_clause ]
)
```

## Arguments
<a name="WF_first_value-arguments"></a>

 *expression*   
 The target column or expression that the function operates on. 

IGNORE NULLS   
When this option is used with FIRST\$1VALUE, the function returns the first value in the frame that is not NULL (or NULL if all values are NULL).

RESPECT NULLS   
 Indicates that AWS Clean Rooms should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS. 

OVER   
Introduces the window clauses for the function. 

PARTITION BY *expr\$1list*   
Defines the window for the function in terms of one or more expressions. 

ORDER BY *order\$1list*   
Sorts the rows within each partition. If no PARTITION BY clause is specified, ORDER BY sorts the entire table. If you specify an ORDER BY clause, you must also specify a *frame\$1clause*.   
The results of the FIRST\$1VALUE function depends on the ordering of the data. The results are nondeterministic in the following cases:   
+ When no ORDER BY clause is specified and a partition contains two different values for an expression 
+ When the expression evaluates to different values that correspond to the same value in the ORDER BY list. 

 *frame\$1clause*   
If an ORDER BY clause is used for an aggregate function, an explicit frame clause is required. The frame clause refines the set of rows in a function's window, including or excluding sets of rows in the ordered result. The frame clause consists of the ROWS keyword and associated specifiers. See [Window function syntax summary](Window_functions.md#Window_function_synopsis). 

## Return type
<a name="Supported_data_types_wf_first_value"></a>

These functions support expressions that use primitive AWS Clean Rooms data types. The return type is the same as the data type of the *expression*.

## Examples
<a name="WF_first_value-examples"></a>

The following example returns the seating capacity for each venue in the VENUE table, with the results ordered by capacity (high to low). The FIRST\$1VALUE function is used to select the name of the venue that corresponds to the first row in the frame: in this case, the row with the highest number of seats. The results are partitioned by state, so when the VENUESTATE value changes, a new first value is selected. The window frame is unbounded so the same first value is selected for each row in each partition. 

For California, `Qualcomm Stadium` has the highest number of seats (`70561`), so this name is the first value for all of the rows in the `CA` partition. 

```
select venuestate, venueseats, venuename,
first_value(venuename)
over(partition by venuestate
order by venueseats desc
rows between unbounded preceding and unbounded following)
from (select * from venue where venueseats >0)
order by venuestate;

venuestate | venueseats |           venuename            |         first_value
-----------+------------+--------------------------------+------------------------------
CA         |      70561 | Qualcomm Stadium               | Qualcomm Stadium
CA         |      69843 | Monster Park                   | Qualcomm Stadium
CA         |      63026 | McAfee Coliseum                | Qualcomm Stadium
CA         |      56000 | Dodger Stadium                 | Qualcomm Stadium
CA         |      45050 | Angel Stadium of Anaheim       | Qualcomm Stadium
CA         |      42445 | PETCO Park                     | Qualcomm Stadium
CA         |      41503 | AT&T Park                      | Qualcomm Stadium
CA         |      22000 | Shoreline Amphitheatre         | Qualcomm Stadium
CO         |      76125 | INVESCO Field                  | INVESCO Field
CO         |      50445 | Coors Field                    | INVESCO Field
DC         |      41888 | Nationals Park                 | Nationals Park
FL         |      74916 | Dolphin Stadium                | Dolphin Stadium
FL         |      73800 | Jacksonville Municipal Stadium | Dolphin Stadium
FL         |      65647 | Raymond James Stadium          | Dolphin Stadium
FL         |      36048 | Tropicana Field                | Dolphin Stadium
...
```

# LAG window function
<a name="WF_LAG"></a>

 The LAG window function returns the values for a row at a given offset above (before) the current row in the partition. 

## Syntax
<a name="WF_LAG-synopsis"></a>

```
LAG (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )
```

## Arguments
<a name="WF_LAG-arguments"></a>

 *value\$1expr*   
 The target column or expression that the function operates on. 

 *offset*   
 An optional parameter that specifies the number of rows before the current row to return values for. The offset can be a constant integer or an expression that evaluates to an integer. If you do not specify an offset, AWS Clean Rooms uses `1` as the default value. An offset of `0` indicates the current row. 

IGNORE NULLS   
An optional specification that indicates that AWS Clean Rooms should skip null values in the determination of which row to use. Null values are included if IGNORE NULLS is not listed.   
You can use an NVL or COALESCE expression to replace the null values with another value. 

RESPECT NULLS   
 Indicates that AWS Clean Rooms should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS. 

OVER   
Specifies the window partitioning and ordering. The OVER clause cannot contain a window frame specification. 

PARTITION BY *window\$1partition*   
An optional argument that sets the range of records for each group in the OVER clause. 

ORDER BY *window\$1ordering*   
Sorts the rows within each partition. 

The LAG window function supports expressions that use any of the AWS Clean Rooms data types. The return type is the same as the type of the *value\$1expr*. 

## Examples
<a name="WF_LAG-examples"></a>

 The following example shows the quantity of tickets sold to the buyer with a buyer ID of 3 and the time that buyer 3 bought the tickets. To compare each sale with the previous sale for buyer 3, the query returns the previous quantity sold for each sale. Since there is no purchase before 1/16/2008, the first previous quantity sold value is null: 

```
select buyerid, saletime, qtysold,
lag(qtysold,1) over (order by buyerid, saletime) as prev_qtysold
from sales where buyerid = 3 order by buyerid, saletime;

buyerid |      saletime       | qtysold | prev_qtysold
---------+---------------------+---------+--------------
3 | 2008-01-16 01:06:09 |       1 |
3 | 2008-01-28 02:10:01 |       1 |            1
3 | 2008-03-12 10:39:53 |       1 |            1
3 | 2008-03-13 02:56:07 |       1 |            1
3 | 2008-03-29 08:21:39 |       2 |            1
3 | 2008-04-27 02:39:01 |       1 |            2
3 | 2008-08-16 07:04:37 |       2 |            1
3 | 2008-08-22 11:45:26 |       2 |            2
3 | 2008-09-12 09:11:25 |       1 |            2
3 | 2008-10-01 06:22:37 |       1 |            1
3 | 2008-10-20 01:55:51 |       2 |            1
3 | 2008-10-28 01:30:40 |       1 |            2
(12 rows)
```

# LAST window function
<a name="WF-LAST"></a>

 Given an ordered set of rows, The LAST function returns the value of the expression with respect to the last row in the frame. 

For information about selecting the first row in the frame, see [FIRST window function](WF_FIRST.md). 

## Syntax
<a name="WF-LAST-syntax"></a>

```
LAST( expression )[ IGNORE NULLS | RESPECT NULLS ]
OVER (
[ PARTITION BY expr_list ]
[ ORDER BY order_list frame_clause ]
)
```

## Arguments
<a name="WF-LAST-arguments"></a>

 *expression*   
 The target column or expression that the function operates on. 

IGNORE NULLS   
The function returns the last value in the frame that is not NULL (or NULL if all values are NULL). 

RESPECT NULLS   
Indicates that AWS Clean Rooms should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS. 

OVER   
Introduces the window clauses for the function. 

PARTITION BY *expr\$1list*   
Defines the window for the function in terms of one or more expressions. 

ORDER BY *order\$1list*   
Sorts the rows within each partition. If no PARTITION BY clause is specified, ORDER BY sorts the entire table. If you specify an ORDER BY clause, you must also specify a *frame\$1clause*.   
The results depend on the ordering of the data. The results are nondeterministic in the following cases:   
+ When no ORDER BY clause is specified and a partition contains two different values for an expression 
+ When the expression evaluates to different values that correspond to the same value in the ORDER BY list. 

 *frame\$1clause*   
If an ORDER BY clause is used for an aggregate function, an explicit frame clause is required. The frame clause refines the set of rows in a function's window, including or excluding sets of rows in the ordered result. The frame clause consists of the ROWS keyword and associated specifiers. See [Window function syntax summary](Window_functions.md#Window_function_synopsis). 

## Return type
<a name="WF-LAST-return-type"></a>

These functions support expressions that use primitive AWS Clean Rooms data types. The return type is the same as the data type of the *expression*.

## Examples
<a name="WF-LAST-examples"></a>

The following example returns the seating capacity for each venue in the VENUE table, with the results ordered by capacity (high to low). The LAST function is used to select the name of the venue that corresponds to the last row in the frame: in this case, the row with the least number of seats. The results are partitioned by state, so when the VENUESTATE value changes, a new last value is selected. The window frame is unbounded so the same last value is selected for each row in each partition. 

For California, `Shoreline Amphitheatre` is returned for every row in the partition because it has the lowest number of seats (`22000`). 

```
select venuestate, venueseats, venuename,
last(venuename)
over(partition by venuestate
order by venueseats desc
rows between unbounded preceding and unbounded following)
from (select * from venue where venueseats >0)
order by venuestate;

venuestate | venueseats |           venuename            |          last
-----------+------------+--------------------------------+------------------------------
CA         |      70561 | Qualcomm Stadium               | Shoreline Amphitheatre
CA         |      69843 | Monster Park                   | Shoreline Amphitheatre
CA         |      63026 | McAfee Coliseum                | Shoreline Amphitheatre
CA         |      56000 | Dodger Stadium                 | Shoreline Amphitheatre
CA         |      45050 | Angel Stadium of Anaheim       | Shoreline Amphitheatre
CA         |      42445 | PETCO Park                     | Shoreline Amphitheatre
CA         |      41503 | AT&T Park                      | Shoreline Amphitheatre
CA         |      22000 | Shoreline Amphitheatre         | Shoreline Amphitheatre
CO         |      76125 | INVESCO Field                  | Coors Field
CO         |      50445 | Coors Field                    | Coors Field
DC         |      41888 | Nationals Park                 | Nationals Park
FL         |      74916 | Dolphin Stadium                | Tropicana Field
FL         |      73800 | Jacksonville Municipal Stadium | Tropicana Field
FL         |      65647 | Raymond James Stadium          | Tropicana Field
FL         |      36048 | Tropicana Field                | Tropicana Field
...
```

# LAST\$1VALUE window function
<a name="WF_last_value"></a>

 Given an ordered set of rows, The LAST\$1VALUE function returns the value of the expression with respect to the last row in the frame. 

For information about selecting the first row in the frame, see [FIRST\$1VALUE window function](WF_first_value.md) .

## Syntax
<a name="WF_last_value-synopsis"></a>

```
LAST_VALUE( expression )[ IGNORE NULLS | RESPECT NULLS ]
OVER (
[ PARTITION BY expr_list ]
[ ORDER BY order_list frame_clause ]
)
```

## Arguments
<a name="WF_last_value-arguments"></a>

 *expression*   
 The target column or expression that the function operates on. 

IGNORE NULLS   
The function returns the last value in the frame that is not NULL (or NULL if all values are NULL). 

RESPECT NULLS   
Indicates that AWS Clean Rooms should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS. 

OVER   
Introduces the window clauses for the function. 

PARTITION BY *expr\$1list*   
Defines the window for the function in terms of one or more expressions. 

ORDER BY *order\$1list*   
Sorts the rows within each partition. If no PARTITION BY clause is specified, ORDER BY sorts the entire table. If you specify an ORDER BY clause, you must also specify a *frame\$1clause*.   
The results depend on the ordering of the data. The results are nondeterministic in the following cases:   
+ When no ORDER BY clause is specified and a partition contains two different values for an expression 
+ When the expression evaluates to different values that correspond to the same value in the ORDER BY list. 

 *frame\$1clause*   
If an ORDER BY clause is used for an aggregate function, an explicit frame clause is required. The frame clause refines the set of rows in a function's window, including or excluding sets of rows in the ordered result. The frame clause consists of the ROWS keyword and associated specifiers. See [Window function syntax summary](Window_functions.md#Window_function_synopsis). 

## Return type
<a name="Supported_data_types_wf_last_value"></a>

These functions support expressions that use primitive AWS Clean Rooms data types. The return type is the same as the data type of the *expression*.

## Examples
<a name="WF_last_value-examples"></a>

The following example returns the seating capacity for each venue in the VENUE table, with the results ordered by capacity (high to low). The LAST\$1VALUE function is used to select the name of the venue that corresponds to the last row in the frame: in this case, the row with the least number of seats. The results are partitioned by state, so when the VENUESTATE value changes, a new last value is selected. The window frame is unbounded so the same last value is selected for each row in each partition. 

For California, `Shoreline Amphitheatre` is returned for every row in the partition because it has the lowest number of seats (`22000`). 

```
select venuestate, venueseats, venuename,
last_value(venuename)
over(partition by venuestate
order by venueseats desc
rows between unbounded preceding and unbounded following)
from (select * from venue where venueseats >0)
order by venuestate;

venuestate | venueseats |           venuename            |          last_value
-----------+------------+--------------------------------+------------------------------
CA         |      70561 | Qualcomm Stadium               | Shoreline Amphitheatre
CA         |      69843 | Monster Park                   | Shoreline Amphitheatre
CA         |      63026 | McAfee Coliseum                | Shoreline Amphitheatre
CA         |      56000 | Dodger Stadium                 | Shoreline Amphitheatre
CA         |      45050 | Angel Stadium of Anaheim       | Shoreline Amphitheatre
CA         |      42445 | PETCO Park                     | Shoreline Amphitheatre
CA         |      41503 | AT&T Park                      | Shoreline Amphitheatre
CA         |      22000 | Shoreline Amphitheatre         | Shoreline Amphitheatre
CO         |      76125 | INVESCO Field                  | Coors Field
CO         |      50445 | Coors Field                    | Coors Field
DC         |      41888 | Nationals Park                 | Nationals Park
FL         |      74916 | Dolphin Stadium                | Tropicana Field
FL         |      73800 | Jacksonville Municipal Stadium | Tropicana Field
FL         |      65647 | Raymond James Stadium          | Tropicana Field
FL         |      36048 | Tropicana Field                | Tropicana Field
...
```

# LEAD window function
<a name="WF_LEAD"></a>

 The LEAD window function returns the values for a row at a given offset below (after) the current row in the partition. 

## Syntax
<a name="WF_LEAD-synopsis"></a>

```
LEAD (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )
```

## Arguments
<a name="WF_LEAD-arguments"></a>

 *value\$1expr*   
The target column or expression that the function operates on. 

 *offset*   
 An optional parameter that specifies the number of rows below the current row to return values for. The offset can be a constant integer or an expression that evaluates to an integer. If you do not specify an offset, AWS Clean Rooms uses `1` as the default value. An offset of `0` indicates the current row. 

IGNORE NULLS   
An optional specification that indicates that AWS Clean Rooms should skip null values in the determination of which row to use. Null values are included if IGNORE NULLS is not listed.   
You can use an NVL or COALESCE expression to replace the null values with another value. 

RESPECT NULLS   
 Indicates that AWS Clean Rooms should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS. 

OVER   
Specifies the window partitioning and ordering. The OVER clause cannot contain a window frame specification. 

PARTITION BY *window\$1partition*   
An optional argument that sets the range of records for each group in the OVER clause. 

ORDER BY *window\$1ordering*   
Sorts the rows within each partition. 

The LEAD window function supports expressions that use any of the AWS Clean Rooms data types. The return type is the same as the type of the *value\$1expr*. 

## Examples
<a name="WF_LEAD-examples"></a>

 The following example provides the commission for events in the SALES table for which tickets were sold on January 1, 2008 and January 2, 2008 and the commission paid for ticket sales for the subsequent sale. 

```
select eventid, commission, saletime,
lead(commission, 1) over (order by saletime) as next_comm
from sales where saletime between '2008-01-01 00:00:00' and '2008-01-02 12:59:59'
order by saletime;

eventid | commission |      saletime       | next_comm
---------+------------+---------------------+-----------
6213 |      52.05 | 2008-01-01 01:00:19 |    106.20
7003 |     106.20 | 2008-01-01 02:30:52 |    103.20
8762 |     103.20 | 2008-01-01 03:50:02 |     70.80
1150 |      70.80 | 2008-01-01 06:06:57 |     50.55
1749 |      50.55 | 2008-01-01 07:05:02 |    125.40
8649 |     125.40 | 2008-01-01 07:26:20 |     35.10
2903 |      35.10 | 2008-01-01 09:41:06 |    259.50
6605 |     259.50 | 2008-01-01 12:50:55 |    628.80
6870 |     628.80 | 2008-01-01 12:59:34 |     74.10
6977 |      74.10 | 2008-01-02 01:11:16 |     13.50
4650 |      13.50 | 2008-01-02 01:40:59 |     26.55
4515 |      26.55 | 2008-01-02 01:52:35 |     22.80
5465 |      22.80 | 2008-01-02 02:28:01 |     45.60
5465 |      45.60 | 2008-01-02 02:28:02 |     53.10
7003 |      53.10 | 2008-01-02 02:31:12 |     70.35
4124 |      70.35 | 2008-01-02 03:12:50 |     36.15
1673 |      36.15 | 2008-01-02 03:15:00 |   1300.80
...
(39 rows)
```

# PERCENT\$1RANK window function
<a name="WF_PERCENT_RANK"></a>

Calculates the percent rank of a given row. The percent rank is determined using this formula:

`(x - 1) / (the number of rows in the window or partition - 1)`

where *x* is the rank of the current row. The following dataset illustrates use of this formula:

```
Row#	Value	Rank	Calculation	PERCENT_RANK
1	15	1	(1-1)/(7-1)	0.0000
2	20	2	(2-1)/(7-1)	0.1666
3	20	2	(2-1)/(7-1)	0.1666
4	20	2	(2-1)/(7-1)	0.1666
5	30	5	(5-1)/(7-1)	0.6666
6	30	5	(5-1)/(7-1)	0.6666
7	40	7	(7-1)/(7-1)	1.0000
```

The return value range is 0 to 1, inclusive. The first row in any set has a PERCENT\$1RANK of 0.

## Syntax
<a name="WF_PERCENT_RANK-synopsis"></a>

```
PERCENT_RANK ()
OVER ( 
[ PARTITION BY partition_expression ] 
[ ORDER BY order_list ]
)
```

## Arguments
<a name="WF_PERCENT_RANK-arguments"></a>

( )   
The function takes no arguments, but the empty parentheses are required. 

OVER  
A clause that specifies the window partitioning. The OVER clause cannot contain a window frame specification.

PARTITION BY *partition\$1expression*   
Optional. An expression that sets the range of records for each group in the OVER clause.

ORDER BY *order\$1list*   
Optional. The expression on which to calculate percent rank. The expression must have either a numeric data type or be implicitly convertible to one. If ORDER BY is omitted, the return value is 0 for all rows.  
If ORDER BY does not produce a unique ordering, the order of the rows is nondeterministic. For more information, see [Unique ordering of data for window functions](Window_functions.md#Examples_order_by_WF). 

## Return type
<a name="WF_PERCENT_RANK-return-type"></a>

FLOAT8

## Examples
<a name="WF_PERCENT_RANK-examples"></a>

The following example calculates the percent rank of the sales quantities for each seller:

```
select sellerid, qty, percent_rank() 
over (partition by sellerid order by qty) 
from winsales;

sellerid	qty		percent_rank
----------------------------------------
1		10.00		0.0
1		10.64		0.5
1		30.37		1.0
3		10.04		0.0
3		15.15		0.33
3		20.75		0.67
3		30.55		1.0
2		20.09		0.0
2		20.12		1.0
4		10.12		0.0
4		40.23		1.0
```

For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example).

# RANK window function
<a name="WF_RANK"></a>

 The RANK window function determines the rank of a value in a group of values, based on the ORDER BY expression in the OVER clause. If the optional PARTITION BY clause is present, the rankings are reset for each group of rows. Rows with equal values for the ranking criteria receive the same rank. AWS Clean Rooms adds the number of tied rows to the tied rank to calculate the next rank and thus the ranks might not be consecutive numbers. For example, if two rows are ranked 1, the next rank is 3. 

 RANK differs from the [DENSE\$1RANK window function](WF_DENSE_RANK.md) in one respect: For DENSE\$1RANK, if two or more rows tie, there is no gap in the sequence of ranked values. For example, if two rows are ranked 1, the next rank is 2.

You can have ranking functions with different PARTITION BY and ORDER BY clauses in the same query. 

## Syntax
<a name="WF_RANK-synopsis"></a>

```
RANK () OVER
(
[ PARTITION BY expr_list ]
[ ORDER BY order_list ]
)
```

## Arguments
<a name="WF_RANK-arguments"></a>

( )   
The function takes no arguments, but the empty parentheses are required. 

OVER   
The window clauses for the RANK function.

PARTITION BY *expr\$1list*   
Optional. One or more expressions that define the window. 

ORDER BY *order\$1list*   
Optional. Defines the columns on which the ranking values are based. If no PARTITION BY is specified, ORDER BY uses the entire table. If ORDER BY is omitted, the return value is 1 for all rows.   
If ORDER BY does not produce a unique ordering, the order of the rows is nondeterministic. For more information, see [Unique ordering of data for window functions](Window_functions.md#Examples_order_by_WF). 

## Return type
<a name="Supported_data_types_wf_rank"></a>

INTEGER

## Examples
<a name="WF_RANK-examples"></a>

The following example orders the table by the quantity sold (default ascending), and assign a rank to each row. A rank value of 1 is the highest ranked value. The results are sorted after the window function results are applied: 

```
select salesid, qty,
rank() over (order by qty) as rnk
from winsales
order by 2,1;

salesid | qty | rnk
--------+-----+-----
10001 |  10 |  1
10006 |  10 |  1
30001 |  10 |  1
40005 |  10 |  1
30003 |  15 |  5
20001 |  20 |  6
20002 |  20 |  6
30004 |  20 |  6
10005 |  30 |  9
30007 |  30 |  9
40001 |  40 |  11
(11 rows)
```

Note that the outer ORDER BY clause in this example includes columns 2 and 1 to make sure that AWS Clean Rooms returns consistently sorted results each time this query is run. For example, rows with sales IDs 10001 and 10006 have identical QTY and RNK values. Ordering the final result set by column 1 ensures that row 10001 always falls before 10006. For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example).

In the following example, the ordering is reversed for the window function (`order by qty desc`). Now the highest rank value applies to the largest QTY value. 

```
select salesid, qty,
rank() over (order by qty desc) as rank
from winsales
order by 2,1;

 salesid | qty | rank
---------+-----+-----
   10001 |  10 |   8
   10006 |  10 |   8
   30001 |  10 |   8
   40005 |  10 |   8
   30003 |  15 |   7
   20001 |  20 |   4
   20002 |  20 |   4
   30004 |  20 |   4
   10005 |  30 |   2
   30007 |  30 |   2
   40001 |  40 |   1
(11 rows)
```

For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example). 

The following example partitions the table by SELLERID and order each partition by the quantity (in descending order) and assign a rank to each row. The results are sorted after the window function results are applied. 

```
select salesid, sellerid, qty, rank() over
(partition by sellerid
order by qty desc) as rank
from winsales
order by 2,3,1;

salesid | sellerid | qty | rank
--------+----------+-----+-----
  10001 |        1 |  10 |  2
  10006 |        1 |  10 |  2
  10005 |        1 |  30 |  1
  20001 |        2 |  20 |  1
  20002 |        2 |  20 |  1
  30001 |        3 |  10 |  4
  30003 |        3 |  15 |  3
  30004 |        3 |  20 |  2
  30007 |        3 |  30 |  1
  40005 |        4 |  10 |  2
  40001 |        4 |  40 |  1
(11 rows)
```

# ROW\$1NUMBER window function
<a name="WF_ROW_NUMBER"></a>

Determines the ordinal number of the current row within a group of rows, counting from 1, based on the ORDER BY expression in the OVER clause. If the optional PARTITION BY clause is present, the ordinal numbers are reset for each group of rows. Rows with equal values for the ORDER BY expressions receive the different row numbers nondeterministically. 

## Syntax
<a name="WF_ROW_NUMBER-synopsis"></a>

```
ROW_NUMBER () OVER
(
[ PARTITION BY expr_list ]
[ ORDER BY order_list ]
)
```

## Arguments
<a name="WF_ROW_NUMBER-arguments"></a>

( )   
The function takes no arguments, but the empty parentheses are required. 

OVER   
The window clauses for the ROW\$1NUMBER function. 

PARTITION BY *expr\$1list*   
Optional. One or more expressions that define the ROW\$1NUMBER function. 

ORDER BY *order\$1list*   
Optional. The expression that defines the columns on which the row numbers are based. If no PARTITION BY is specified, ORDER BY uses the entire table.   
If ORDER BY does not produce a unique ordering or is omitted, the order of the rows is nondeterministic. For more information, see [Unique ordering of data for window functions](Window_functions.md#Examples_order_by_WF). 

## Return type
<a name="Supported_data_types_WF_ROW_NUMBER"></a>

BIGINT

## Examples
<a name="WF_ROW_NUMBER-examples"></a>

The following example partitions the table by SELLERID and orders each partition by QTY (in ascending order), then assigns a row number to each row. The results are sorted after the window function results are applied. 

```
select salesid, sellerid, qty, 
row_number() over 
(partition by sellerid
 order by qty asc) as row
from winsales
order by 2,4;

 salesid | sellerid | qty | row
---------+----------+-----+-----
   10006 |        1 |  10 |   1
   10001 |        1 |  10 |   2
   10005 |        1 |  30 |   3
   20001 |        2 |  20 |   1
   20002 |        2 |  20 |   2
   30001 |        3 |  10 |   1
   30003 |        3 |  15 |   2
   30004 |        3 |  20 |   3
   30007 |        3 |  30 |   4
   40005 |        4 |  10 |   1
   40001 |        4 |  40 |   2
(11 rows)
```

 For a description of the WINSALES table, see [Sample table for window function examples](Window_functions.md#Window_function_example). 