Window Aggregate and Ranking Functions
Window aggregate functions enable you to compute aggregate values on a particular partition or window of the result set. Unlike grouped aggregate functions that group the results and return a single value, window aggregates return a value for each row in the specified window. For example, using the grouped aggregate SUM function to add up the total number of tickets sold in a year returns one value: the total number of tickets sold for the year. By using the SUM window aggregate instead, the results could be partitioned by month so that the query returns 12 values: the sum of the number of tickets sold in each month of the year.
This topic describes the window aggregate and ranking functions in AnzoGraph DB:
- AVG: Calculates the average value of each group of values.
- COUNT: Counts the number of values in each group of values.
- MAX: Calculates the maximum value of each group of values.
- MIN: Calculates the minimum value of each group of values.
- NTILE: Divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.
- PERCENTILE: Divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.
- PRODUCT: Calculates the product of each group of values.
- QUARTILE: Divides the rows in the partition into four ranked groups and returns the group that each value belongs to.
- ROW_NUMBER: Assigns unique numbers to each row in the partition.
- SUM: Calculates the sum of each group of values.
Typographical Conventions
The following list describes the conventions used to document function syntax:
CAPS
: Although SPARQL is case-insensitive, SPARQL keywords in this section are written in uppercase for readability.[ argument ]
: Brackets indicate an optional argument or keyword.|
: Means OR. Indicates that you can use one or more of the specified options.
AVG
This function calculates the average value of each group of values.
Syntax
(AVG(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] # frame clause [ [ ROWS ] frame_start | [ ROWS ] BETWEEN frame_start AND frame_end ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
frame clause | The reference point for all window frames is the current row. The optional frame clause further defines the frame by specifying the rows in a partition to combine with the current row. There are two types of window frames:
The frame clause can be one of the following options: [ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end When a frame clause is not included, the window frame is unbounded: |
frame_start | The starting point of the frame. This argument can be one of the following values:UNBOUNDED PRECEDING positive_int PRECEDING CURRENT ROW positive_int FOLLOWING |
frame_end | The end of the frame. This argument can be one of the following values:positive_int PRECEDING CURRENT ROW positive_int FOLLOWING UNBOUNDED FOLLOWING |
COUNT
This function counts the number of values in each group of values.
Syntax
(COUNT(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] # frame clause [ [ ROWS ] frame_start | [ ROWS ] BETWEEN frame_start AND frame_end ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
frame clause | The reference point for all window frames is the current row. The optional frame clause further defines the frame by specifying the rows in a partition to combine with the current row. There are two types of window frames:
The frame clause can be one of the following options: [ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end When a frame clause is not included, the window frame is unbounded: |
frame_start | The starting point of the frame. This argument can be one of the following values:UNBOUNDED PRECEDING positive_int PRECEDING CURRENT ROW positive_int FOLLOWING |
frame_end | The end of the frame. This argument can be one of the following values:positive_int PRECEDING CURRENT ROW positive_int FOLLOWING UNBOUNDED FOLLOWING |
MAX
This function calculates the maximum value of each group of values.
Syntax
(MAX(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] # frame clause [ [ ROWS ] frame_start | [ ROWS ] BETWEEN frame_start AND frame_end ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
frame clause | The reference point for all window frames is the current row. The optional frame clause further defines the frame by specifying the rows in a partition to combine with the current row. There are two types of window frames:
The frame clause can be one of the following options: [ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end When a frame clause is not included, the window frame is unbounded: |
frame_start | The starting point of the frame. This argument can be one of the following values:UNBOUNDED PRECEDING positive_int PRECEDING CURRENT ROW positive_int FOLLOWING |
frame_end | The end of the frame. This argument can be one of the following values:positive_int PRECEDING CURRENT ROW positive_int FOLLOWING UNBOUNDED FOLLOWING |
MIN
This function calculates the minimum value of each group of values.
Syntax
(MIN(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] # frame clause [ [ ROWS ] frame_start | [ ROWS ] BETWEEN frame_start AND frame_end ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
frame clause | The reference point for all window frames is the current row. The optional frame clause further defines the frame by specifying the rows in a partition to combine with the current row. There are two types of window frames:
The frame clause can be one of the following options: [ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end When a frame clause is not included, the window frame is unbounded: |
frame_start | The starting point of the frame. This argument can be one of the following values:UNBOUNDED PRECEDING positive_int PRECEDING CURRENT ROW positive_int FOLLOWING |
frame_end | The end of the frame. This argument can be one of the following values:positive_int PRECEDING CURRENT ROW positive_int FOLLOWING UNBOUNDED FOLLOWING |
NTILE
This function divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.
Syntax
(NTILE(number_of_groups) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] ) AS ?variable )
Argument | Description |
---|---|
number_of_groups | Required argument that defines the number of ranking groups. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
PERCENTILE
This function divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.
Syntax
(PERCENTILE(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
PRODUCT
This function calculates the product of each group of values.
Syntax
(PRODUCT(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] # frame clause [ [ ROWS ] frame_start | [ ROWS ] BETWEEN frame_start AND frame_end ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
frame clause | The reference point for all window frames is the current row. The optional frame clause further defines the frame by specifying the rows in a partition to combine with the current row. There are two types of window frames:
The frame clause can be one of the following options: [ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end When a frame clause is not included, the window frame is unbounded: |
frame_start | The starting point of the frame. This argument can be one of the following values:UNBOUNDED PRECEDING positive_int PRECEDING CURRENT ROW positive_int FOLLOWING |
frame_end | The end of the frame. This argument can be one of the following values:positive_int PRECEDING CURRENT ROW positive_int FOLLOWING UNBOUNDED FOLLOWING |
QUARTILE
This function divides the rows in the partition into four ranked groups and returns the group that each value belongs to.
Syntax
(QUARTILE(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
ROW_NUMBER
This function assigns unique numbers to each row in the partition.
Syntax
(ROW_NUMBER() OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] ) AS ?variable )
Argument | Description |
---|---|
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
SUM
This function calculates the sum of each group of values.
Syntax
(SUM(value) OVER ( [ PARTITION BY partition_value ] [ ORDER BY order_value ] # frame clause [ [ ROWS ] frame_start | [ ROWS ] BETWEEN frame_start AND frame_end ] ) AS ?variable )
Argument | Description |
---|---|
value | Required argument that defines the group of values to operate on. |
partition_value | The optional PARTITION BY clause forms the groups of rows, dividing the result set into the partitions defined by the given partition_value . If you do not include PARTITION BY, the partition becomes the entire result set. When PARTITION BY is included, the function is applied to the group of rows in each partition. |
order_value | The optional ORDER BY clause defines the order or sequence of rows within each partition. |
frame clause | The reference point for all window frames is the current row. The optional frame clause further defines the frame by specifying the rows in a partition to combine with the current row. There are two types of window frames:
The frame clause can be one of the following options: [ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end When a frame clause is not included, the window frame is unbounded: |
frame_start | The starting point of the frame. This argument can be one of the following values:UNBOUNDED PRECEDING positive_int PRECEDING CURRENT ROW positive_int FOLLOWING |
frame_end | The end of the frame. This argument can be one of the following values:positive_int PRECEDING CURRENT ROW positive_int FOLLOWING UNBOUNDED FOLLOWING |
Examples
This example queries the sample Tickit data set to return the percentage of a salesperson's total sales that came from the "Gypsy" event:
PREFIX tickit: <http://anzograph.com/tickit/> SELECT ?event_name ?fname ?lname ((?dollars * 100.0/(SUM(?dollars) OVER(PARTITION BY ?event))) as ?percent_of_sales) FROM <http://anzograph.com/tickit> WHERE { ?sale tickit:eventid ?event . ?event tickit:eventname ?event_name . ?sale tickit:sellerid ?salesperson . ?sale tickit:pricepaid ?dollars . ?salesperson tickit:firstname ?fname . ?salesperson tickit:lastname ?lname . FILTER(?event_name = "Gypsy"). } ORDER BY ?event_name desc(?percent_of_sales)
event_name | fname | lname | percent_of_sales -----------+-------------+-------------+------------------ Gypsy | Zoe | Sosa | 100 Gypsy | Xaviera | Jacobson | 50.9415 Gypsy | Brianna | Mcfarland | 50.5076 Gypsy | Alexa | Baird | 45.7926 Gypsy | Roanna | Wood | 42.0408 Gypsy | Colette | Clay | 36.9388 Gypsy | Amela | Holman | 35.7277 Gypsy | Aubrey | Terrell | 32.2457 Gypsy | Bruno | Griffin | 31.8139 Gypsy | Damian | Berger | 31.2459 Gypsy | Zelenia | Woods | 31.1616 Gypsy | Imogene | Mclean | 31.0005 ... 857 rows
This example queries the sample Tickit data set to return a running total of the number of tickets sold for the event "Mamma Mia!":
PREFIX tickit: <http://anzograph.com/tickit/> SELECT ?event ?month (SUM(?qty) OVER (PARTITION BY ?month ORDER BY ?event ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ?tickets) FROM <http://anzograph.com/tickit> WHERE { ?s tickit:qtysold ?qty . ?s tickit:eventid ?eventid . ?eventid tickit:eventname ?event . ?s tickit:dateid ?date . ?date tickit:month ?month . filter(?event="Mamma Mia!") } ORDER BY ?tickets LIMIT 100
event | month | tickets -----------+-------+--------- Mamma Mia! | FEB | 1 Mamma Mia! | AUG | 1 Mamma Mia! | MAR | 1 Mamma Mia! | MAY | 1 Mamma Mia! | JUN | 2 Mamma Mia! | SEP | 2 Mamma Mia! | AUG | 2 Mamma Mia! | DEC | 2 Mamma Mia! | NOV | 2 Mamma Mia! | OCT | 2 Mamma Mia! | JAN | 2 Mamma Mia! | MAR | 2 Mamma Mia! | NOV | 3 Mamma Mia! | JAN | 3 Mamma Mia! | JUN | 4 Mamma Mia! | JUL | 4 Mamma Mia! | SEP | 4 ... 100 rows