mirror of
https://github.com/FirebirdSQL/firebird.git
synced 2025-01-22 20:43:02 +01:00
Documentation for CORE-3647 - Window Function: frame (rows / range) clause.
This commit is contained in:
parent
1d435303fe
commit
f80d135842
@ -2,21 +2,41 @@
|
||||
|
||||
By the SQL specification, window functions (also know as analytical functions) are a kind of aggregation, but which does not "filter" the result set of a query. The aggregated data is mixed with the query result set.
|
||||
|
||||
That sort of functions are used with the `OVER` clause. Window functions may appear only in the select list or the order by clause of a query.
|
||||
That sort of functions are used with the `OVER` clause. Window functions may appear only in the select list or the `ORDER BY` clause of a query.
|
||||
|
||||
Additional to the `OVER` clause, Firebird window functions may be partitioned and ordered.
|
||||
Additional to the `OVER` clause, Firebird window functions may use partitions, order and frames (v4).
|
||||
|
||||
Syntax:
|
||||
|
||||
```
|
||||
<window function> ::= <window function name>([<expr> [, <expr> ...]]) OVER (
|
||||
[PARTITION BY <expr> [, <expr> ...]]
|
||||
[ORDER BY <expr> [<direction>] [<nulls placement>] [, <expr> [<direction>] [<nulls placement>] ...]
|
||||
)
|
||||
<window function> ::=
|
||||
<window function name>([<expr> [, <expr> ...]])
|
||||
OVER ([<window partition>] [<window order>] [<window frame>])
|
||||
|
||||
<direction> ::= {ASC | DESC}
|
||||
<window partition> ::=
|
||||
PARTITION BY <expr> [, <expr> ...]
|
||||
|
||||
<nulls placement> ::= NULLS {FIRST | LAST}
|
||||
<window order> ::=
|
||||
ORDER BY <expr> [<direction>] [<nulls placement>] [, <expr> [<direction>] [<nulls placement>]] ...
|
||||
|
||||
<window frame> ::=
|
||||
{RANGE | ROWS} <window frame extent>
|
||||
|
||||
<window frame extent> ::=
|
||||
{<window frame start> | <window frame between>}
|
||||
|
||||
<window frame start> ::=
|
||||
{UNBOUNDED PRECEDING | <expr> PRECEDING | CURRENT ROW}
|
||||
|
||||
<window frame between> ::=
|
||||
BETWEEN {UNBOUNDED PRECEDING | <expr> PRECEDING | <expr> FOLLOWING | CURRENT ROW} AND
|
||||
{UNBOUNDED FOLLOWING | <expr> PRECEDING | <expr> FOLLOWING | CURRENT ROW}
|
||||
|
||||
<direction> ::=
|
||||
{ASC | DESC}
|
||||
|
||||
<nulls placement> ::=
|
||||
NULLS {FIRST | LAST}
|
||||
```
|
||||
|
||||
## 1. Aggregate functions used as window functions
|
||||
@ -63,10 +83,6 @@ Here, `sum(salary) over ()` is computed with the sum of all SALARY from the quer
|
||||
|
||||
Like aggregate functions, that may operate alone or in relation to a group, window functions may also operate on a group, which is called "partition". Its syntax is:
|
||||
|
||||
```
|
||||
<window function name>(...) OVER (PARTITION BY <expr> [, <expr> ...])
|
||||
```
|
||||
|
||||
When aggregation is done over a group, it could produce more than one row. So the result set generated by a partition is joined with the main query using the same expression list of the partition.
|
||||
|
||||
Continuing the employee example, instead of get the percentage of the employee salary over all employees, we would like to get the percentage based only on the employees in the same department:
|
||||
@ -93,7 +109,7 @@ Results:
|
||||
|
||||
## 3. Ordering
|
||||
|
||||
The `ORDER BY` sub-clause can be used with or without partitions, and used with the standard aggregate functions, make them return the partial aggregations as the records are being processed. Example:
|
||||
The `ORDER BY` sub-clause can be used with or without partitions, and used with the standard aggregate functions and without frames, make them return the partial aggregations as the records are being processed. Example:
|
||||
|
||||
```sql
|
||||
select
|
||||
@ -116,6 +132,8 @@ The result set produced will be:
|
||||
|
||||
Then running_salary returns the partial/accumulated (or running) aggregation (of the `SUM` function). It may appear strange that 37.00 is repeated for the ids 1 and 5, but that is how it should work. The `ORDER BY` keys are grouped together and the aggregation is computed once (but summing the two 10.00). To avoid this, you can add the ID field to the end of the `ORDER BY` clause.
|
||||
|
||||
The fact of running aggregations and grouping results by the order key happens because when an explicit frame is not used, it computes the window functions as a `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` frame. More on frames later in this document.
|
||||
|
||||
It's possible to use multiple windows with different orders, and `ORDER BY` parts like `ASC` / `DESC` and `NULLS FIRST` / `NULLS LAST`.
|
||||
|
||||
With a partition, `ORDER BY` works the same way, but at each partition boundary the aggregation is reset.
|
||||
@ -128,7 +146,7 @@ Beyond aggregate functions, there is also exclusive window functions, currently
|
||||
|
||||
Both set of functions can be used with/without partition/ordering, but the usage does not make much sense without ordering.
|
||||
|
||||
## 4.1 Ranking functions
|
||||
## 4.1. Ranking functions
|
||||
|
||||
Syntax:
|
||||
|
||||
@ -179,7 +197,7 @@ The difference between `DENSE_RANK` and `RANK` is that there is a gap related to
|
||||
|
||||
`NTILE` distributes the rows into a specified number of groups. `NTILE` argument is restricted to integral positive literal, variable (`:var`) and DSQL parameter (`?`).
|
||||
|
||||
## 4.2 Navigational functions
|
||||
## 4.2. Navigational functions
|
||||
|
||||
Syntax:
|
||||
|
||||
@ -194,7 +212,7 @@ Syntax:
|
||||
|
||||
The navigational functions gets the simple (non-aggregated) value of an expression from another row (inside the same partition) of the query.
|
||||
|
||||
It's important to note that `FIRST_VALUE`, `LAST_VALUE` and `NTH_VALUE` also operates on a window frame, and Firebird is currently always framing from the first to the current (and not the last) row of the partition. This is likely to get strange results for `NTH_VALUE` and specially `LAST_VALUE`.
|
||||
It's important to note that `FIRST_VALUE`, `LAST_VALUE` and `NTH_VALUE` also operates on a window frame, and the default frame is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. This is likely to get strange results for `NTH_VALUE` and specially `LAST_VALUE`. More on frames later in this document.
|
||||
|
||||
```sql
|
||||
select
|
||||
@ -225,6 +243,24 @@ And the result set:
|
||||
|
||||
`LAG` and `LEAD` get the value within a distance respect to the current row and the offset (which defaults to 1) passed. In the case the offset points to outside of the partition, the default parameter (which defaults to NULL) is returned. `LAG` looks for a preceding row, and `LEAD` for a following row.
|
||||
|
||||
## 5. Frames (v4)
|
||||
|
||||
It's possible to specify the frame that some window functions work. The frame is divided in three piecies: unit, start bound and end bound.
|
||||
|
||||
The unit `RANGE` or `ROWS` defines how the bounds `<expr> PRECEDING`, `<expr> FOLLOWING` and `CURRENT ROW` works.
|
||||
|
||||
With `RANGE`, the `ORDER BY` should specify only one expression, and that expression should be of a numeric, date, time or timestamp type. For `<expr> PRECEDING` and `<expr> FOLLOWING` bounds, `<expr>` is respectively subtracted or added to the order expression, and for `CURRENT ROW` only the order expression is used. Then, all rows (inside the partition) between the bounds are considered part of the resulting window frame.
|
||||
|
||||
With `ROWS`, order expressions is not limited by number or types. In this case, `<expr> PRECEDING`, `<expr> FOLLOWING` and `CURRENT ROW` relates to the row position under the partition, and not to the order keys values.
|
||||
|
||||
`UNBOUNDED PRECEDING` and `UNBOUNDED FOLLOWING` work identically with `RANGE` and `ROWS`. `UNBOUNDED PRECEDING` looks for the first row and `UNBOUNDED FOLLOWING` the last one, always inside the partition.
|
||||
|
||||
The frame syntax with `<window frame start>` specifies the start frame, with the end frame being `CURRENT ROW`.
|
||||
|
||||
Some window functions discard frames. `ROW_NUMBER`, `LAG` and `LEAD` always work as `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. And `DENSE_RANK`, `RANK`, `PERCENT_RANK` and `CUME_DIST` always work as `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.
|
||||
|
||||
`FIRST_VALUE`, `LAST_VALUE` and `NTH_VALUE` respect frames, but the `RANGE` unit works identically as `ROWS`.
|
||||
|
||||
|
||||
Author:
|
||||
Adriano dos Santos Fernandes <adrianosf at gmail.com>
|
||||
|
Loading…
Reference in New Issue
Block a user