Window Aggregate and Ranking Functions

Window aggregates operate on a particular partition or window of the result set. Unlike grouped aggregate functions that group the result set and return a single row, window aggregates retain the resulting rows and return a value for each row. For example, using the grouped aggregate SUM function to add the total number of tickets sold in a year returns one value: the total number of tickets sold for the year. By using WINDOW_SUM, the results could be partitioned by month so that the query returns 12 values: the sum of the number of tickets sold in each month of the year. This topic describes the window aggregate functions in Anzo.

Typographical Conventions

This documentation uses the following conventions in function syntax:

  • CAPS: Although SPARQL is case-insensitive, function names and other keywords are written in uppercase for readability.
  • [ argument ]: Brackets are used to indicate optional arguments. Arguments without brackets are required.

Functions

  • WINDOW_AVG: Calculates the average value of each group of values.
  • WINDOW_COUNT: Counts the number of values in each group of values.
  • WINDOW_MAX: Calculates the maximum value of each group of values.
  • WINDOW_MIN: Calculates the minimum value of each group of values.
  • WINDOW_NTILE: Divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.
  • WINDOW_PERCENTILE: Divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.
  • WINDOW_PERCENTILE_CONT: Calculates a percentile based on the continuous distribution of the specified groups of values.
  • WINDOW_PERCENTILE_DISC: Calculates a percentile based on the discrete distribution of the specified groups of values.
  • WINDOW_PRODUCT: Calculates the product of each group of values.
  • WINDOW_QUARTILE: Divides the rows in the partition into four ranked groups and returns the group that each value belongs to.
  • WINDOW_SUM: Calculates the sum of each group of values.

WINDOW_AVG

This function calculates the average value of each group of values.

Syntax

WINDOW_AVG(value [, partition_over ] [, order_by ] [, order ]
          [, start_frame_type ] [, start_frame_value ]
          [, end_frame_type ] [, end_frame_value ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.
order_by variable Optional argument that defines the order or sequence of rows within each partition.
order boolean Optional argument that controls whether the order is ascending or descending. When true, the order is ascending. When false, the order is descending.
start_frame_type string When order_by is specified, the optional start_frame_type, start_frame_value, end_frame_type, and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.

The start_frame_type argument defines the starting row of the partition and can be one of the following values:

  • CURRENT ROW
  • UNBOUNDED PRECEDING
  • PRECEDING start_frame_value
  • FOLLOWING start_frame_value
start_frame_value int Optional argument that specifies the starting row based on the start_frame_type value.
end_frame_type string This argument defines the ending row of the partition and can be one of the following values:
  • CURRENT ROW
  • UNBOUNDED FOLLOWING
  • PRECEDING end_frame_value
  • FOLLOWING end_frame_value
end_frame_value int Optional argument that specifies the ending row based on the end_frame_type value.

Returns

Data Type Description
int The average values.

WINDOW_COUNT

This function counts the number of values in each group of values.

Syntax

WINDOW_COUNT(value [, partition_over ] [, order_by ] [, order ]
            [, start_frame_type ] [, start_frame_value ]
            [, end_frame_type ] [, end_frame_value ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.
order_by variable Optional argument that defines the order or sequence of rows within each partition.
order boolean Optional argument that controls whether the order is ascending or descending. When true, the order is ascending. When false, the order is descending.
start_frame_type string When order_by is specified, the optional start_frame_type, start_frame_value, end_frame_type, and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.

The start_frame_type argument defines the starting row of the partition and can be one of the following values:

  • CURRENT ROW
  • UNBOUNDED PRECEDING
  • PRECEDING start_frame_value
  • FOLLOWING start_frame_value
start_frame_value int Optional argument that specifies the starting row based on the start_frame_type value.
end_frame_type string This argument defines the ending row of the partition and can be one of the following values:
  • CURRENT ROW
  • UNBOUNDED FOLLOWING
  • PRECEDING end_frame_value
  • FOLLOWING end_frame_value
end_frame_value int Optional argument that specifies the ending row based on the end_frame_type value.

Returns

Data Type Description
int The counts of values.

WINDOW_MAX

This function calculates the maximum value of each group of values.

Syntax

WINDOW_MAX(value [, partition_over ] [, order_by ] [, order ]
          [, start_frame ] [, start_frame_type ] [, start_frame_value ]
          [, end_frame_type ] [, end_frame_value ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.
order_by variable Optional argument that defines the order or sequence of rows within each partition.
order boolean Optional argument that controls whether the order is ascending or descending. When true, the order is ascending. When false, the order is descending.
start_frame_type string When order_by is specified, the optional start_frame_type, start_frame_value, end_frame_type, and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.

The start_frame_type argument defines the starting row of the partition and can be one of the following values:

  • CURRENT ROW
  • UNBOUNDED PRECEDING
  • PRECEDING start_frame_value
  • FOLLOWING start_frame_value
start_frame_value int Optional argument that specifies the starting row based on the start_frame_type value.
end_frame_type string This argument defines the ending row of the partition and can be one of the following values:
  • CURRENT ROW
  • UNBOUNDED FOLLOWING
  • PRECEDING end_frame_value
  • FOLLOWING end_frame_value
end_frame_value int Optional argument that specifies the ending row based on the end_frame_type value.

Returns

Data Type Description
int The maximum values.

WINDOW_MIN

This function calculates the minimum value of each group of values.

Syntax

WINDOW_MIN(value [, partition_over ] [, order_by ] [, order ]
          [, start_frame_type ] [, start_frame_value ]
          [, end_frame_type ] [, end_frame_value ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.
order_by variable Optional argument that defines the order or sequence of rows within each partition.
order boolean Optional argument that controls whether the order is ascending or descending. When true, the order is ascending. When false, the order is descending.
start_frame_type string When order_by is specified, the optional start_frame_type, start_frame_value, end_frame_type, and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.

The start_frame_type argument defines the starting row of the partition and can be one of the following values:

  • CURRENT ROW
  • UNBOUNDED PRECEDING
  • PRECEDING start_frame_value
  • FOLLOWING start_frame_value
start_frame_value int Optional argument that specifies the starting row based on the start_frame_type value.
end_frame_type string This argument defines the ending row of the partition and can be one of the following values:
  • CURRENT ROW
  • UNBOUNDED FOLLOWING
  • PRECEDING end_frame_value
  • FOLLOWING end_frame_value
end_frame_value int Optional argument that specifies the ending row based on the end_frame_type value.

Returns

Data Type Description
int The minimum values.

WINDOW_NTILE

This function divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.

Syntax

WINDOW_NTILE(ntile, value, order_by [, partition_over ])
Argument Data Type Description
ntile int Required argument that specifies the number of ranking groups.
value numeric Required argument that defines the groups of values to operate on.
order_by variable Required argument that defines the order or sequence of rows within each partition.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.

Returns

Data Type Description
int The group that the values belong to.

WINDOW_PERCENTILE

This function divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.

Syntax

WINDOW_PERCENTILE(value, order_by [, partition_over ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
order_by variable Required argument that defines the order or sequence of rows within each partition.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.

Returns

Data Type Description
int The group that the values belong to.

WINDOW_PERCENTILE_CONT

This function calculates a percentile based on the continuous distribution of the specified groups of values. The returned value is interpolated and may not be equal to any of the values in the group.

Syntax

WINDOW_PERCENTILE_CONT(percentile, value, order_by [, partition_over ])
Argument Data Type Description
percentile int Required argument that specifies the percentile for the calculation.
value numeric Required argument that defines the groups of values to operate on.
order_by variable Required argument that defines the order or sequence of rows within each partition.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.

Returns

Data Type Description
int The interpolated percentiles.

WINDOW_PERCENTILE_DISC

This function calculates a percentile based on the discrete distribution of the specified groups of values.

Syntax

WINDOW_PERCENTILE_DISC(percentile, value, order_by [, partition_over ])
Argument Data Type Description
percentile int Required argument that specifies the percentile for the calculation.
value numeric Required argument that defines the groups of values to operate on.
order_by variable Required argument that defines the order or sequence of rows within each partition.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.

Returns

Data Type Description
int The percentiles based on the discrete distribution of the groups.

WINDOW_PRODUCT

This function calculates the product of each group of values.

Syntax

WINDOW_PRODUCT(value [, partition_over ] [, order_by ] [, order ]
              [, start_frame_type ] [, start_frame_value ]
              [, end_frame_type ] [, end_frame_value ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.
order_by variable Optional argument that defines the order or sequence of rows within each partition.
order boolean Optional argument that controls whether the order is ascending or descending. When true, the order is ascending. When false, the order is descending.
start_frame_type string When order_by is specified, the optional start_frame_type, start_frame_value, end_frame_type, and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.

The start_frame_type argument defines the starting row of the partition and can be one of the following values:

  • CURRENT ROW
  • UNBOUNDED PRECEDING
  • PRECEDING start_frame_value
  • FOLLOWING start_frame_value
start_frame_value int Optional argument that specifies the starting row based on the start_frame_type value.
end_frame_type string This argument defines the ending row of the partition and can be one of the following values:
  • CURRENT ROW
  • UNBOUNDED FOLLOWING
  • PRECEDING end_frame_value
  • FOLLOWING end_frame_value
end_frame_value int Optional argument that specifies the ending row based on the end_frame_type value.

Returns

Data Type Description
int The products of the groups.

WINDOW_QUARTILE

This function divides the rows in the partition into four ranked groups and returns the group that each value belongs to.

Syntax

WINDOW_QUARTILE(value, order_by [, partition_over ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
order_by variable Required argument that defines the order or sequence of rows within each partition.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.

Returns

Data Type Description
int The group that the values belong to.

WINDOW_SUM

This function calculates the sum of each group of values.

Syntax

WINDOW_SUM(value [, partition_over ] [, order_by ] [, order ]
          [, start_frame_type ] [, start_frame_value ]
          [, end_frame_type ] [, end_frame_value ])
Argument Data Type Description
value numeric Required argument that defines the groups of values to operate on.
partition_over variable Optional argument that partitions the results into groups of rows. If you do not include partition_over, the partition becomes the entire set identified by value.
order_by variable Optional argument that defines the order or sequence of rows within each partition.
order boolean Optional argument that controls whether the order is ascending or descending. When true, the order is ascending. When false, the order is descending.
start_frame_type string When order_by is specified, the optional start_frame_type, start_frame_value, end_frame_type, and end_frame_value arguments can be included to refine the set of rows to include in the partitions or groups.

The start_frame_type argument defines the starting row of the partition and can be one of the following values:

  • CURRENT ROW
  • UNBOUNDED PRECEDING
  • PRECEDING start_frame_value
  • FOLLOWING start_frame_value
start_frame_value int Optional argument that specifies the starting row based on the start_frame_type value.
end_frame_type string This argument defines the ending row of the partition and can be one of the following values:
  • CURRENT ROW
  • UNBOUNDED FOLLOWING
  • PRECEDING end_frame_value
  • FOLLOWING end_frame_value
end_frame_value int Optional argument that specifies the ending row based on the end_frame_type value.

Returns

Data Type Description
int The sum of each group.

Example

The example below first creates data by running the following INSERT DATA query in a graphmart Query Step:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://example.com/csi#>
INSERT DATA {
  GRAPH ${targetGraph} {
    ex:sale1 ex:date "2014-05-23T10:20:13"^^xsd:dateTime ; ex:volume 15 .
    ex:sale2 ex:date "2014-06-23T10:20:13"^^xsd:dateTime ; ex:volume 3 .
    ex:sale3 ex:date "2014-06-23T10:22:13"^^xsd:dateTime ; ex:volume 35 .
    ex:sale4 ex:date "2014-07-23T10:20:13"^^xsd:dateTime ; ex:volume 66 .
    ex:sale5 ex:date "2014-09-23T10:20:13"^^xsd:dateTime ; ex:volume 19 .
    ex:sale6 ex:date "2014-11-23T10:20:13"^^xsd:dateTime ; ex:volume 33 .
    ex:sale7 ex:date "2014-12-23T10:20:13"^^xsd:dateTime ; ex:volume 12 .
  }
}

The following query against the new data uses the WINDOW_SUM function to return the total volume of sales for each month:

PREFIX ex: <http://example.com/csi#>
SELECT DISTINCT ?Month	?Total_Volume
WHERE {
  {
    SELECT
      ?Month (WINDOW_SUM(?o, ?Month) AS ?Total_Volume)
    WHERE { 
      ?s ex:volume ?o .
      ?s ex:date ?date .
      BIND (MONTH(?date) AS ?Month)
    }
  }
}

The query returns the following results:

Month | Total_Volume
------+-------------
5     |     15
6     |     38
7     |     66
9     |     19
11    |     33
12    |     12