Aggregate Functions
This topic describes the aggregate functions in Anzo.
Typographical Conventions
This documentation uses the following conventions in function syntax:
CAPS
: Although SPARQL is case-insensitive, function names and other keywords are written in uppercase for readability.
[ argument ]
: Brackets are used to indicate optional arguments. Arguments without brackets are required.
Functions
- AVERAGEIF: Calculates the average of the range of values that meet the specified criterion.
- AVERAGEIFS: Calculates the averages of the ranges of values that meet the specified criteria.
- AVG: Calculates the average (arithmetic mean) value for a group of numbers.
- CHOOSE_BY_MAX: Returns the value from a group that corresponds to the maximum value from another group.
- CHOOSE_BY_MIN: Returns the value from a group that corresponds to the minimum value from another group.
- COUNT: Counts the number of values that exist for a group.
- COUNT_DISTINCT: Counts the number of unique values that exist for a group.
- COUNTIF: Counts the number of values that meet the specified criterion.
- COUNTIFS: Counts the number of values that meet the specified criteria.
- GROUP_CONCAT: Concatenates a group of strings into a single string.
- GROUPCONCAT: Concatenates a group of strings into a single string. This function is a customizable version of GROUP_CONCAT.
- MAX: Returns the maximum value from each group of values.
- MEDIAN: Returns the median number out of a group of numbers.
- MIN: Returns the minimum value from each group of values.
- MODE: Returns the mode (the value that occurs most frequently) from a group of values.
- MODEPERCENT: Calculates the percentage of values in a group that belong to the mode.
- PERCENTILE_CONT: Calculates a percentile based on the continuous distribution of the specified group of values.
- PERCENTILE_DISC: Calculates a percentile based on the discrete distribution of the specified group of values.
- PRODUCT: Calculates the product of a group of numbers.
- SAMPLE: Returns an arbitrary value from the specified group of values.
- STDEV: Measures the standard deviation in a group of numbers.
- STDEVP: Calculates the product of the standard deviation for a group of numbers.
- SUM: Calculates the sum of the numbers within a group.
- SUMIF: Calculates the sum of the range of values that meet the specified criterion.
- SUMIFS: Calculates the sums of the ranges of values that meet the specified criteria.
- SUMPRODUCT: Multiplies the numbers in a group and adds the results.
- SUMSQ: Calculates the square root of each number in a group and adds the results.
- VAR: Calculates the unbiased (sample) variance of a group of numbers.
- VARP: Calculates the biased (population) variance of a group of numbers.
- WEIGHTEDAVERAGE: Calculates the weighted average of a group of values.
AVERAGEIF
This function calculates the average of the range of values that meet the specified criterion.
You can use AVERAGEIFS to specify multiple value ranges and conditions.
Syntax
AVERAGEIF(values_to_test, criterion [, range_of_values ])
Argument |
Data Type |
Description |
values_to_test |
RDF term |
The literal, URI, or blank node value that defines the values to test against the criteria . |
criterion |
RDF term |
The literal, URI, or blank node value that defines the condition to test values against. |
range_of_values |
numeric |
An optional number that defines the range of values to average. When omitted, values_to_test is used. |
Returns
Data Type |
Description |
number |
The average value from the range of values that meet the criterion. |
AVERAGEIFS
This function calculates the averages of the ranges of values that meet the specified criteria. Unlike AVERAGEIF, this function enables you to specify multiple ranges and multiple conditions.
Syntax
AVERAGEIFS(values_to_average, value_range1, criteria1,
value_range2, criteria2
[, value_rangeN, criteriaN ])
Argument |
Data Type |
Description |
values_to_average |
numeric |
The numeric value that defines the overall range of values to evaluate. |
value_range1–N |
RDF term |
The literal, URI, or blank node value that defines the range of values to test against the corresponding criteria . |
criteria1–N |
RDF term |
The literal, URI, or blank node value that defines the condition to test the corresponding value_range against. |
Returns
Data Type |
Description |
number |
The average values from the ranges of values that meet the criteria. |
AVG
This function calculates the average (arithmetic mean) value for a group of numbers.
Syntax
AVG(number)
Argument |
Data Type |
Description |
number |
numeric |
The numeric value for which to calculate the average. |
Returns
Data Type |
Description |
number |
The arithmetic mean of the input values. |
CHOOSE_BY_MAX
This function calculates the maximum value for one group and returns the value from another group that corresponds to the maximum from the first group.
Syntax
CHOOSE_BY_MAX(test, value)
Argument |
Data Type |
Description |
test |
RDF term |
The group of literal, URI, or blank node values from which to find the maximum value. |
value |
RDF term |
The group of literal, URI, or blank node values from which to return the value that corresponds to the maximum value of test . |
Example
In a fictional ticket sales data set, the following statement returns the ID of the buyer who paid the most:
CHOOSE_BY_MAX(?totalPaid, ?buyerID)
Returns
Data Type |
Description |
RDF term |
The term from the value group that corresponds to the maximum value from the test group. |
CHOOSE_BY_MIN
This function calculates the minimum value for one group and returns the value from another group that corresponds to the minimum from the first group.
Syntax
CHOOSE_BY_MIN(test, value)
Argument |
Data Type |
Description |
test |
RDF term |
The group of literal, URI, or blank node values from which to find the minimum value. |
value |
RDF term |
The group of literal, URI, or blank node values from which to return the value that corresponds to the minimum value of test . |
Example
In a fictional ticket sales data set, the following statement returns the ID of the seller who sold the least number of tickets:
CHOOSE_BY_MIN(?totalTickets, ?sellerID)
Returns
Data Type |
Description |
RDF term |
The term from the value group that corresponds to the minimum value from the test group. |
COUNT
This function counts the number of values that exist for a group.
Syntax
COUNT(value)
Argument |
Data Type |
Description |
value |
RDF term |
The group of literal, URI, or blank node values to count. |
Returns
Data Type |
Description |
long |
The number of values in the group. |
COUNT_DISTINCT
This function counts the number of unique values that exist for a group.
Syntax
COUNT_DISTINCT(value)
Argument |
Data Type |
Description |
value |
RDF term |
The group of literal, URI, or blank node values for which to count the number of distinct values. |
Returns
Data Type |
Description |
long |
The number of unique values in the group. |
COUNTIF
This function counts the number of values that meet the specified criterion.
You can use COUNTIFS to specify multiple conditions.
Syntax
COUNTIF(values_to_test, criterion)
Argument |
Data Type |
Description |
values_to_test |
RDF term |
The literal, URI, or blank node value that defines the values to test against the criterion . |
criterion |
RDF term |
The literal, URI, or blank node value that defines the condition to test values against. |
Returns
Data Type |
Description |
long |
The number of values that meet the criterion. |
COUNTIFS
This function counts the number of values that meet the specified criteria. Unlike COUNTIF, this function enables you to specify multiple conditions.
Syntax
COUNTIFS(values_to_count, criteria1 [, criteria2 ] [, criteriaN ])
Argument |
Data Type |
Description |
values_to_count |
RDF term |
The literal, URI, or blank node value to compare against the criteria. |
criteria1–N |
RDF term |
A literal, URI, or blank node value that defines a condition to test the values_to_count against. |
Returns
Data Type |
Description |
long |
The number of values that meet the specified conditions. |
GROUP_CONCAT
This function concatenates a group of strings into a single string. It is a simplified version of GROUPCONCAT as it takes only one argument.
Syntax
GROUP_CONCAT(text)
Argument |
Data Type |
Description |
text |
string |
The string property whose values to concatenate into a single string. |
Returns
Data Type |
Description |
string |
The concatenated string. |
GROUPCONCAT
This function concatenates a group of strings into a single string. Unlike GROUP_CONCAT, this function allows for customization of the separator to use as well as the configuration of limits and options like prefixes and suffixes.
Syntax
GROUPCONCAT(group1, [ group2, ..., groupN, ] group_value_separator, separator, serialize,
row_limit, value_limit, delimit_blanks [, prefix ] [, suffix ] [, max_length ])
Argument |
Data Type |
Description |
group1–N |
string |
The group(s) of strings to concatenate. |
group_value_separator |
string |
The separator string to use between the groups of strings if you specified more than one group . |
separator |
string |
The separator string to use between the values in a concatenated group of strings. |
serialize |
boolean |
A boolean value that indicates whether returned values should be serialized with the value's data type. |
row_limit |
int |
An integer that puts a maximum limit on the number of rows to retrieve for a group. |
value_limit |
int |
An integer that puts a maximum limit on the number of values to retrieve from a group of rows. |
delimit_blanks |
boolean |
A boolean value that indicates whether to delimit blanks with the separator value. |
prefix |
string |
Optional string to add as a prefix to the resulting string. |
suffix |
string |
Optional string to add as a suffix to the resulting string. |
max_length |
int |
Optional integer that puts a maximum limit on the number of characters the resulting string can have. |
Returns
Data Type |
Description |
string |
The concatenated string. |
MAX
This function returns the maximum value from each group of values.
Syntax
MAX(value1 [, value2 ] [, valueN ])
Argument |
Data Type |
Description |
value1–N |
RDF term |
The group(s) of literal, URI, or blank node values for which to return the maximum value. |
Returns
Data Type |
Description |
RDF term |
The maximum value from each group. |
MEDIAN
This function returns the median value from a group of numbers. The median is the number in the group where half of the numbers are greater than the number and half are less than the number.
Syntax
MEDIAN(number)
Argument |
Data Type |
Description |
number |
numeric |
The group of numeric values for which to calculate the median. |
Returns
Data Type |
Description |
number |
The median for the group. |
MIN
This function returns the minimum value from each group of values.
Syntax
MIN(value1 [, value2 ] [, valueN ])
Argument |
Data Type |
Description |
value1–N |
RDF term |
The group(s) of literal, URI, or blank node values for which to return the minimum value. |
Returns
Data Type |
Description |
RDF term |
The minimum value from each group. |
MODE
This function returns the mode from a group of values. The mode is the value that occurs most frequently in the group.
Syntax
MODE(value)
Argument |
Data Type |
Description |
value |
RDF term |
The group of literal, URI, or blank node values for which to return the mode. |
Returns
Data Type |
Description |
RDF term |
The mode from the group. |
MODEPERCENT
This function calculates the percentage of values in a group that belong to the mode.
Syntax
MODEPERCENT(value)
Argument |
Data Type |
Description |
value |
RDF term |
The group of literal, URI, or blank node values for which to calculate the modepercent. |
Returns
Data Type |
Description |
double |
The percentage of values that belong to the mode. |
PERCENTILE_CONT
This function calculates a percentile based on the continuous distribution of the specified group of values. The returned value is interpolated and may not be equal to any of the values in the group.
Syntax
PERCENTILE_CONT(percentile, value)
Argument |
Data Type |
Description |
percentile |
float |
A float value that specifies the percentile to compute. |
value |
RDF term |
The group of literal, URI, or blank node values for which to calculate the percentile. |
Returns
Data Type |
Description |
RDF term |
The interpolated percentile. |
PERCENTILE_DISC
This function calculates a percentile based on the discrete distribution of the specified group of values.
Syntax
PERCENTILE_DISC(percentile, value)
Argument |
Data Type |
Description |
percentile |
float |
A float value that specifies the percentile to compute. |
value |
RDF term |
The group of literal, URI, or blank node values for which to calculate the percentile. |
Returns
Data Type |
Description |
RDF term |
The percentile based on the discrete distribution of the group. |
PRODUCT
This function calculates the product of a group of numbers.
Syntax
PRODUCT(number)
Argument |
Data Type |
Description |
number |
numeric |
The group of numbers to multiply. |
Returns
Data Type |
Description |
number |
The product of the group. |
SAMPLE
This function returns an arbitrary value from the specified group of values.
Syntax
SAMPLE(value)
Argument |
Data Type |
Description |
value |
RDF term |
The group of literal, URI, or blank node values from which to choose a sample value. |
Returns
Data Type |
Description |
RDF term |
The arbitrary value from the group. |
STDEV
This function measures the standard deviation (amount of dispersion) of a group of numbers.
Syntax
STDEV(value)
Argument |
Data Type |
Description |
value |
numeric |
The numeric value that defines the set of numbers for which to measure the standard deviation. |
Returns
Data Type |
Description |
number |
The standard deviation of the group. |
STDEVP
This function calculates the product of the standard deviation for a group of numbers.
Syntax
STDEVP(value)
Argument |
Data Type |
Description |
value |
numeric |
The numeric value that defines the set of numbers for which to measure the standard deviation and compute the product. |
Returns
Data Type |
Description |
number |
The product of the standard deviation for the group. |
SUM
This function calculates the sum of the numbers within a group.
Syntax
SUM(number)
Argument |
Data Type |
Description |
number |
numeric |
The group of numbers to sum. |
Returns
Data Type |
Description |
number |
The sum of the values in the group. |
SUMIF
This function calculates the sum of the range of values that meet the specified criterion.
You can use SUMIFS to specify multiple value ranges and conditions.
Syntax
SUMIF(values_to_test, criterion [, range_of_values ])
Argument |
Data Type |
Description |
values_to_test |
RDF term |
The literal, URI, or blank node value that defines the values to test against the criterion . |
criterion |
RDF term |
The literal, URI, or blank node value that defines the condition to test values against. |
range_of_values |
numeric |
An optional number that defines the range of values to sum. When omitted, values_to_test is used. |
Returns
Data Type |
Description |
number |
The sum of the range of values. |
SUMIFS
This function calculates the sums of the ranges of values that meet the specified criteria. Unlike SUMIF, this function enables you to specify multiple ranges and multiple conditions.
Syntax
SUMIFS(values_to_sum, value_range1, criteria1,
value_range2, criteria2
[, value_rangeN, criteriaN ])
Argument |
Data Type |
Description |
values_to_sum |
numeric |
The numeric value that defines the overall range of values to evaluate. |
value_range1–N |
RDF term |
The literal, URI, or blank node value that defines the range of values to test against the corresponding criteria . |
criteria1–N |
RDF term |
The literal, URI, or blank node value that defines the condition to test the corresponding value_range against. |
Returns
Data Type |
Description |
number |
The sums of the ranges of values. |
SUMPRODUCT
This function multiplies the numbers in a group and adds the results.
Syntax
SUMPRODUCT(number)
Argument |
Data Type |
Description |
number |
numeric |
The group of numbers to multiply and then sum the results. |
Returns
Data Type |
Description |
number |
The sum of the product of the numbers in the group. |
SUMSQ
This function calculates the square root of each number in a group and adds the results.
Syntax
SUMSQ(number)
Argument |
Data Type |
Description |
number |
numeric |
The group of numbers for which to calculate the square root and then sum the results. |
Returns
Data Type |
Description |
number |
The sum of the square root of the numbers in the group. |
VAR
This function calculates the unbiased (sample) variance for a group of numbers.
Syntax
VAR(value)
Argument |
Data Type |
Description |
value |
numeric |
The numeric value that defines the set of numbers for which to measure the variance. |
Returns
Data Type |
Description |
number |
The unbiased variance for the group. |
VARP
This function calculates the biased (population) variance for a group of numbers.
Syntax
VARP(value)
Argument |
Data Type |
Description |
value |
number |
The value that defines the set of numbers for which to measure the population variance. |
Returns
Data Type |
Description |
decimal |
The biased variance for the group. |
WEIGHTEDAVERAGE
This function calculates the weighted average of a group of values.
Syntax
WEIGHTEDAVERAGE(value, weight)
Argument |
Data Type |
Description |
value |
decimal |
The decimal value that defines the group of values for which to calculate the weighted average. |
weight |
decimal |
The decimal value that defines the weight to use in the calculation. |
Returns
Data Type |
Description |
decimal |
The weighted average for the group. |