Window Aggregate and Ranking Functions
Window aggregates operate on a particular partition or window of the result set. Unlike grouped aggregate functions that group the result set and return a single row, window aggregates retain the resulting rows and return a value for each row. For example, using the grouped aggregate SUM function to add the total number of tickets sold in a year returns one value: the total number of tickets sold for the year. By using WINDOW_SUM, the results could be partitioned by month so that the query returns 12 values: the sum of the number of tickets sold in each month of the year. This topic describes the window aggregate functions in Anzo.
Typographical Conventions
This documentation uses the following conventions in function syntax:
CAPS
: Although SPARQL is case-insensitive, function names and other keywords are written in uppercase for readability.[ argument ]
: Brackets are used to indicate optional arguments. Arguments without brackets are required.
Functions
- WINDOW_AVG: Calculates the average value of each group of values.
- WINDOW_COUNT: Counts the number of values in each group of values.
- WINDOW_MAX: Calculates the maximum value of each group of values.
- WINDOW_MIN: Calculates the minimum value of each group of values.
- WINDOW_NTILE: Divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.
- WINDOW_PERCENTILE: Divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.
- WINDOW_PERCENTILE_CONT: Calculates a percentile based on the continuous distribution of the specified groups of values.
- WINDOW_PERCENTILE_DISC: Calculates a percentile based on the discrete distribution of the specified groups of values.
- WINDOW_PRODUCT: Calculates the product of each group of values.
- WINDOW_QUARTILE: Divides the rows in the partition into four ranked groups and returns the group that each value belongs to.
- WINDOW_SUM: Calculates the sum of each group of values.
WINDOW_AVG
This function calculates the average value of each group of values.
Syntax
WINDOW_AVG(value [, partition_over ] [, order_by ] [, order ] [, start_frame_type ] [, start_frame_value ] [, end_frame_type ] [, end_frame_value ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
order_by | variable | Optional argument that defines the order or sequence of rows within each partition. |
order | boolean | Optional argument that controls whether the order is ascending or descending. When true , the order is ascending. When false , the order is descending. |
start_frame_type | string | When order_by is specified, the optional start_frame_type , start_frame_value , end_frame_type , and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.The
|
start_frame_value | int | Optional argument that specifies the starting row based on the start_frame_type value. |
end_frame_type | string | This argument defines the ending row of the partition and can be one of the following values:
|
end_frame_value | int | Optional argument that specifies the ending row based on the end_frame_type value. |
Returns
Data Type | Description |
---|---|
int | The average values. |
WINDOW_COUNT
This function counts the number of values in each group of values.
Syntax
WINDOW_COUNT(value [, partition_over ] [, order_by ] [, order ] [, start_frame_type ] [, start_frame_value ] [, end_frame_type ] [, end_frame_value ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
order_by | variable | Optional argument that defines the order or sequence of rows within each partition. |
order | boolean | Optional argument that controls whether the order is ascending or descending. When true , the order is ascending. When false , the order is descending. |
start_frame_type | string | When order_by is specified, the optional start_frame_type , start_frame_value , end_frame_type , and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.The
|
start_frame_value | int | Optional argument that specifies the starting row based on the start_frame_type value. |
end_frame_type | string | This argument defines the ending row of the partition and can be one of the following values:
|
end_frame_value | int | Optional argument that specifies the ending row based on the end_frame_type value. |
Returns
Data Type | Description |
---|---|
int | The counts of values. |
WINDOW_MAX
This function calculates the maximum value of each group of values.
Syntax
WINDOW_MAX(value [, partition_over ] [, order_by ] [, order ] [, start_frame ] [, start_frame_type ] [, start_frame_value ] [, end_frame_type ] [, end_frame_value ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
order_by | variable | Optional argument that defines the order or sequence of rows within each partition. |
order | boolean | Optional argument that controls whether the order is ascending or descending. When true , the order is ascending. When false , the order is descending. |
start_frame_type | string | When order_by is specified, the optional start_frame_type , start_frame_value , end_frame_type , and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.The
|
start_frame_value | int | Optional argument that specifies the starting row based on the start_frame_type value. |
end_frame_type | string | This argument defines the ending row of the partition and can be one of the following values:
|
end_frame_value | int | Optional argument that specifies the ending row based on the end_frame_type value. |
Returns
Data Type | Description |
---|---|
int | The maximum values. |
WINDOW_MIN
This function calculates the minimum value of each group of values.
Syntax
WINDOW_MIN(value [, partition_over ] [, order_by ] [, order ] [, start_frame_type ] [, start_frame_value ] [, end_frame_type ] [, end_frame_value ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
order_by | variable | Optional argument that defines the order or sequence of rows within each partition. |
order | boolean | Optional argument that controls whether the order is ascending or descending. When true , the order is ascending. When false , the order is descending. |
start_frame_type | string | When order_by is specified, the optional start_frame_type , start_frame_value , end_frame_type , and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.The
|
start_frame_value | int | Optional argument that specifies the starting row based on the start_frame_type value. |
end_frame_type | string | This argument defines the ending row of the partition and can be one of the following values:
|
end_frame_value | int | Optional argument that specifies the ending row based on the end_frame_type value. |
Returns
Data Type | Description |
---|---|
int | The minimum values. |
WINDOW_NTILE
This function divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.
Syntax
WINDOW_NTILE(ntile, value, order_by [, partition_over ])
Argument | Data Type | Description |
---|---|---|
ntile | int | Required argument that specifies the number of ranking groups. |
value | numeric | Required argument that defines the groups of values to operate on. |
order_by | variable | Required argument that defines the order or sequence of rows within each partition. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
Returns
Data Type | Description |
---|---|
int | The group that the values belong to. |
WINDOW_PERCENTILE
This function divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.
Syntax
WINDOW_PERCENTILE(value, order_by [, partition_over ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
order_by | variable | Required argument that defines the order or sequence of rows within each partition. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
Returns
Data Type | Description |
---|---|
int | The group that the values belong to. |
WINDOW_PERCENTILE_CONT
This function calculates a percentile based on the continuous distribution of the specified groups of values. The returned value is interpolated and may not be equal to any of the values in the group.
Syntax
WINDOW_PERCENTILE_CONT(percentile, value, order_by [, partition_over ])
Argument | Data Type | Description |
---|---|---|
percentile | int | Required argument that specifies the percentile for the calculation. |
value | numeric | Required argument that defines the groups of values to operate on. |
order_by | variable | Required argument that defines the order or sequence of rows within each partition. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
Returns
Data Type | Description |
---|---|
int | The interpolated percentiles. |
WINDOW_PERCENTILE_DISC
This function calculates a percentile based on the discrete distribution of the specified groups of values.
Syntax
WINDOW_PERCENTILE_DISC(percentile, value, order_by [, partition_over ])
Argument | Data Type | Description |
---|---|---|
percentile | int | Required argument that specifies the percentile for the calculation. |
value | numeric | Required argument that defines the groups of values to operate on. |
order_by | variable | Required argument that defines the order or sequence of rows within each partition. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
Returns
Data Type | Description |
---|---|
int | The percentiles based on the discrete distribution of the groups. |
WINDOW_PRODUCT
This function calculates the product of each group of values.
Syntax
WINDOW_PRODUCT(value [, partition_over ] [, order_by ] [, order ] [, start_frame_type ] [, start_frame_value ] [, end_frame_type ] [, end_frame_value ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
order_by | variable | Optional argument that defines the order or sequence of rows within each partition. |
order | boolean | Optional argument that controls whether the order is ascending or descending. When true , the order is ascending. When false , the order is descending. |
start_frame_type | string | When order_by is specified, the optional start_frame_type , start_frame_value , end_frame_type , and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.The
|
start_frame_value | int | Optional argument that specifies the starting row based on the start_frame_type value. |
end_frame_type | string | This argument defines the ending row of the partition and can be one of the following values:
|
end_frame_value | int | Optional argument that specifies the ending row based on the end_frame_type value. |
Returns
Data Type | Description |
---|---|
int | The products of the groups. |
WINDOW_QUARTILE
This function divides the rows in the partition into four ranked groups and returns the group that each value belongs to.
Syntax
WINDOW_QUARTILE(value, order_by [, partition_over ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
order_by | variable | Required argument that defines the order or sequence of rows within each partition. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
Returns
Data Type | Description |
---|---|
int | The group that the values belong to. |
WINDOW_SUM
This function calculates the sum of each group of values.
Syntax
WINDOW_SUM(value [, partition_over ] [, order_by ] [, order ] [, start_frame_type ] [, start_frame_value ] [, end_frame_type ] [, end_frame_value ])
Argument | Data Type | Description |
---|---|---|
value | numeric | Required argument that defines the groups of values to operate on. |
partition_over | variable | Optional argument that partitions the results into groups of rows. If you do not include partition_over , the partition becomes the entire set identified by value . |
order_by | variable | Optional argument that defines the order or sequence of rows within each partition. |
order | boolean | Optional argument that controls whether the order is ascending or descending. When true , the order is ascending. When false , the order is descending. |
start_frame_type | string | When order_by is specified, the optional start_frame_type , start_frame_value , end_frame_type , and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.The
|
start_frame_value | int | Optional argument that specifies the starting row based on the start_frame_type value. |
end_frame_type | string | This argument defines the ending row of the partition and can be one of the following values:
|
end_frame_value | int | Optional argument that specifies the ending row based on the end_frame_type value. |
Returns
Data Type | Description |
---|---|
int | The sum of each group. |
Example
The example below first creates data by running the following INSERT DATA query in a graphmart Query Step:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX ex: <http://example.com/csi#> INSERT DATA { GRAPH ${targetGraph} { ex:sale1 ex:date "2014-05-23T10:20:13"^^xsd:dateTime ; ex:volume 15 . ex:sale2 ex:date "2014-06-23T10:20:13"^^xsd:dateTime ; ex:volume 3 . ex:sale3 ex:date "2014-06-23T10:22:13"^^xsd:dateTime ; ex:volume 35 . ex:sale4 ex:date "2014-07-23T10:20:13"^^xsd:dateTime ; ex:volume 66 . ex:sale5 ex:date "2014-09-23T10:20:13"^^xsd:dateTime ; ex:volume 19 . ex:sale6 ex:date "2014-11-23T10:20:13"^^xsd:dateTime ; ex:volume 33 . ex:sale7 ex:date "2014-12-23T10:20:13"^^xsd:dateTime ; ex:volume 12 . } }
The following query against the new data uses the WINDOW_SUM function to return the total volume of sales for each month:
PREFIX ex: <http://example.com/csi#> SELECT DISTINCT ?Month ?Total_Volume WHERE { { SELECT ?Month (WINDOW_SUM(?o, ?Month) AS ?Total_Volume) WHERE { ?s ex:volume ?o . ?s ex:date ?date . BIND (MONTH(?date) AS ?Month) } } }
The query returns the following results:
Month | Total_Volume ------+------------- 5 | 15 6 | 38 7 | 66 9 | 19 11 | 33 12 | 12