Window Aggregates
AnzoGraph includes support for window aggregates. These analytic functions enable you to compute aggregate values on a particular partition or window of the result set. Unlike grouped aggregate functions that group the results and return a single value, window aggregates return a value for each row in the specified window. For example, using the grouped aggregate SUM function to add up the total number of tickets sold in a year returns one value: the total number of tickets sold for the year. By using the SUM window aggregate instead, the results could be partitioned by month so that the query returns 12 values: the sum of the number of tickets sold in each month of the year.
AnzoGraph provides the following built-in window aggregate and ranking functions:
- AVG: Calculates the average of the input expression values.
- COUNT: Counts the number of results defined by the expression.
- MAX: Returns the maximum of the input expression values.
- MIN: Returns the minimum of the input expression values.
- NTILE: Divides the rows in the partition into the specified number of ranked groups and returns the group that each value belongs to.
- PERCENTILE: Like using NTILE(100), this function divides the rows in the partition into 100 ranked groups and returns the group that each value belongs to.
- PRODUCT: Multiplies the input expression values and returns the product.
- QUARTILE: Like using NTILE(4), this function divides the rows in the partition into 4 ranked groups and returns the group that each value belongs to.
- ROW_NUMBER: Assigns unique numbers to each row in the partition.
- SUM: Returns the sum of the input expression values.
Window Aggregate Syntax
Use the following syntax when incorporating window aggregates in queries. Each keyword or option is described below the syntax.
(FUNCTION (expression1) OVER ( [ PARTITION BY expression2 ] | ORDER BY expression3 [ frame_clause ] ) AS ?variable )
FUNCTION (expression1)
The window aggregate name and expression or expressions that produce the values to operate on.
OVER()
The required OVER() keyword differentiates window aggregates from other aggregate functions.
PARTITION BY expression2
This optional clause forms the groups of rows, dividing the result set into partitions. If you do not include a PARTITION BY clause, the partition becomes the entire result set identified by the WHERE clause. When PARTITION BY is included, the system calculates the specified function for the rows in each partition.
ORDER BY expression3
This required clause defines the order or sequence of rows within each partition.
frame_clause
The optional frame clause defines the window frame, which specifies the rows in a partition to combine with the current row. There are two types of window frames:
- A fixed frame with two moving endpoints, where each row becomes the current row as the window frame slides forward in the partition. This type of frame is ideal for computing aggregations over moving time frames.
- A resizing frame with one anchored endpoint, where one row is a fixed endpoint and the frame resizes up (preceding) or down (following). This type of frame is ideal for computing running totals.
The reference point for all window frames is the current row. The frame clause can be one of the following options:
[ ROWS ] frame_start [ ROWS ] BETWEEN frame_start AND frame_end
where frame_start is one of the following options:
UNBOUNDED PRECEDING positive_value PRECEDING CURRENT ROW positive_value FOLLOWING
and frame_end is one of the following options:
positive_value PRECEDING CURRENT ROW positive_value FOLLOWING UNBOUNDED FOLLOWING
When a frame clause is not specified, the system treats the window frame as unbounded: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
Window Aggregate Examples
The following example queries the sample Tickit data set to return the number of events per month:
SELECT ?month (COUNT(?event) OVER (PARTITION BY ?month) AS ?events) FROM <tickit> WHERE { { SELECT ?event (MONTH(?date) AS ?month) WHERE { ?sale <dateid> ?dateid . ?dateid <caldate> ?date . ?sale <eventid> ?eventid . ?eventid <eventname> ?event . } } } ORDER BY ?month
month | events ------+-------- 1 | 26117 1 | 26117 ... 2 | 30911 2 | 30911 ... 3 | 33932 3 | 33932 ... 4 | 32196 4 | 32196 ... 8 | 34315 8 | 34315 ... 12 | 11648 12 | 11648 ... 364953 rows
This example queries the sample Tickit data set to return the percentage of a salesperson's total sales that came from the "Gypsy" event:
SELECT ?event_name ?fname ?lname ((?dollars * 100.0/(SUM(?dollars) OVER(PARTITION BY ?event))) as ?percent_of_sales) FROM <tickit> WHERE { ?sale <eventid> ?event . ?event <eventname> ?event_name . ?sale <sellerid> ?salesperson . ?sale <pricepaid> ?dollars . ?salesperson <firstname> ?fname . ?salesperson <lastname> ?lname . FILTER(?event_name = "Gypsy"). } ORDER BY ?event_name desc(?percent_of_sales)
event_name | fname | lname | percent_of_sales -----------+---------+-----------+------------------ Gypsy | Zoe | Sosa | 100.000000 Gypsy | Xaviera | Jacobson | 50.941458 Gypsy | Brianna | Mcfarland | 50.507614 Gypsy | Alexa | Baird | 45.792564 Gypsy | Roanna | Wood | 42.040816 Gypsy | Colette | Clay | 36.938776 Gypsy | Amela | Holman | 35.727660 Gypsy | Aubrey | Terrell | 32.245681 Gypsy | Bruno | Griffin | 31.813858 Gypsy | Damian | Berger | 31.245923 Gypsy | Zelenia | Woods | 31.161644 Gypsy | Imogene | Mclean | 31.000483 Gypsy | Helen | Haney | 30.295666 Gypsy | Silas | Wynn | 28.730306 Gypsy | Caryn | Hester | 28.714012 Gypsy | Camden | Horne | 27.253097 Gypsy | Adara | Burris | 26.809349 Gypsy | Raya | Fleming | 26.790955 ... 857 rows
This example queries the sample Tickit data set to return a running total of the number of tickets sold for the event "Mamma Mia!":
SELECT ?event ?month (SUM(?qty) OVER (PARTITION BY ?month ORDER BY ?event ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ?tickets) FROM <tickit> WHERE { ?s <qtysold> ?qty . ?s <eventid> ?eventid . ?eventid <eventname> ?event . ?s <dateid> ?date . ?date <month> ?month . filter(?event="Mamma Mia!") } ORDER BY ?tickets LIMIT 500
event | month | tickets -----------+-------+--------- Mamma Mia! | MAY | 1 Mamma Mia! | JAN | 1 Mamma Mia! | SEP | 1 Mamma Mia! | APR | 2 Mamma Mia! | JAN | 2 Mamma Mia! | AUG | 2 Mamma Mia! | JUN | 2 Mamma Mia! | MAY | 2 Mamma Mia! | NOV | 2 Mamma Mia! | SEP | 3 Mamma Mia! | NOV | 3 Mamma Mia! | FEB | 4 Mamma Mia! | APR | 4 ... Mamma Mia! | JAN | 85 Mamma Mia! | JUL | 85 Mamma Mia! | MAR | 85 Mamma Mia! | JUN | 85 Mamma Mia! | OCT | 85 Mamma Mia! | NOV | 85 Mamma Mia! | FEB | 86 Mamma Mia! | APR | 86 ... 500 rows