Aggregate Functions

This topic describes the aggregate functions in Anzo.

Typographical Conventions

This documentation uses the following conventions in function syntax:

CAPS: Although SPARQL is case-insensitive, function names and other keywords are written in uppercase for readability.
[ argument ]: Brackets are used to indicate optional arguments. Arguments without brackets are required.

Functions

AVERAGEIF: Calculates the average of the range of values that meet the specified criterion.
AVERAGEIFS: Calculates the averages of the ranges of values that meet the specified criteria.
AVG: Calculates the average (arithmetic mean) value for a group of numbers.
CHOOSE_BY_MAX: Returns the value from a group that corresponds to the maximum value from another group.
CHOOSE_BY_MIN: Returns the value from a group that corresponds to the minimum value from another group.
COUNT: Counts the number of values that exist for a group.
COUNT_DISTINCT: Counts the number of unique values that exist for a group.
COUNTIF: Counts the number of values that meet the specified criterion.
COUNTIFS: Counts the number of values that meet the specified criteria.
GROUP_CONCAT: Concatenates a group of strings into a single string.
GROUPCONCAT: Concatenates a group of strings into a single string. This function is a customizable version of GROUP_CONCAT.
MAX: Returns the maximum value from each group of values.
MEDIAN: Returns the median number out of a group of numbers.
MIN: Returns the minimum value from each group of values.
MODE: Returns the mode (the value that occurs most frequently) from a group of values.
MODEPERCENT: Calculates the percentage of values in a group that belong to the mode.
PERCENTILE_CONT: Calculates a percentile based on the continuous distribution of the specified group of values.
PERCENTILE_DISC: Calculates a percentile based on the discrete distribution of the specified group of values.
PRODUCT: Calculates the product of a group of numbers.
SAMPLE: Returns an arbitrary value from the specified group of values.
STDEV: Measures the standard deviation in a group of numbers.
STDEVP: Calculates the product of the standard deviation for a group of numbers.
SUM: Calculates the sum of the numbers within a group.
SUMIF: Calculates the sum of the range of values that meet the specified criterion.
SUMIFS: Calculates the sums of the ranges of values that meet the specified criteria.
SUMPRODUCT: Multiplies the numbers in a group and adds the results.
SUMSQ: Calculates the square root of each number in a group and adds the results.
VAR: Calculates the unbiased (sample) variance of a group of numbers.
VARP: Calculates the biased (population) variance of a group of numbers.
WEIGHTEDAVERAGE: Calculates the weighted average of a group of values.

AVERAGEIF

This function calculates the average of the range of values that meet the specified criterion.

You can use AVERAGEIFS to specify multiple value ranges and conditions.

Syntax

AVERAGEIF(values_to_test, criterion [, range_of_values ])

Argument	Data Type	Description
values_to_test	RDF term	The literal, URI, or blank node value that defines the values to test against the `criteria`.
criterion	RDF term	The literal, URI, or blank node value that defines the condition to test values against.
range_of_values	numeric	An optional number that defines the range of values to average. When omitted, `values_to_test` is used.

Returns

Data Type	Description
number	The average value from the range of values that meet the criterion.

AVERAGEIFS

This function calculates the averages of the ranges of values that meet the specified criteria. Unlike AVERAGEIF, this function enables you to specify multiple ranges and multiple conditions.

Syntax

AVERAGEIFS(values_to_average, value_range1, criteria1,
           value_range2, criteria2 
           [, value_rangeN, criteriaN ])

Argument	Data Type	Description
values_to_average	numeric	The numeric value that defines the overall range of values to evaluate.
value_range1–N	RDF term	The literal, URI, or blank node value that defines the range of values to test against the corresponding `criteria`.
criteria1–N	RDF term	The literal, URI, or blank node value that defines the condition to test the corresponding `value_range` against.

Returns

Data Type	Description
number	The average values from the ranges of values that meet the criteria.

AVG

This function calculates the average (arithmetic mean) value for a group of numbers.

Syntax

AVG(number)

Argument	Data Type	Description
number	numeric	The numeric value for which to calculate the average.

Returns

Data Type	Description
number	The arithmetic mean of the input values.

CHOOSE_BY_MAX

This function calculates the maximum value for one group and returns the value from another group that corresponds to the maximum from the first group.

Syntax

CHOOSE_BY_MAX(test, value)

Argument	Data Type	Description
test	RDF term	The group of literal, URI, or blank node values from which to find the maximum value.
value	RDF term	The group of literal, URI, or blank node values from which to return the value that corresponds to the maximum value of `test`.

Example

In a fictional ticket sales data set, the following statement returns the ID of the buyer who paid the most:

CHOOSE_BY_MAX(?totalPaid, ?buyerID)

Returns

Data Type	Description
RDF term	The term from the `value` group that corresponds to the maximum value from the `test` group.

CHOOSE_BY_MIN

This function calculates the minimum value for one group and returns the value from another group that corresponds to the minimum from the first group.

Syntax

CHOOSE_BY_MIN(test, value)

Argument	Data Type	Description
test	RDF term	The group of literal, URI, or blank node values from which to find the minimum value.
value	RDF term	The group of literal, URI, or blank node values from which to return the value that corresponds to the minimum value of `test`.

Example

In a fictional ticket sales data set, the following statement returns the ID of the seller who sold the least number of tickets:

CHOOSE_BY_MIN(?totalTickets, ?sellerID)

Returns

Data Type	Description
RDF term	The term from the `value` group that corresponds to the minimum value from the `test` group.

COUNT

This function counts the number of values that exist for a group.

Syntax

COUNT(value)

Argument	Data Type	Description
value	RDF term	The group of literal, URI, or blank node values to count.

Returns

Data Type	Description
long	The number of values in the group.

COUNT_DISTINCT

This function counts the number of unique values that exist for a group.

Syntax

COUNT_DISTINCT(value)

Argument	Data Type	Description
value	RDF term	The group of literal, URI, or blank node values for which to count the number of distinct values.

Returns

Data Type	Description
long	The number of unique values in the group.

COUNTIF

This function counts the number of values that meet the specified criterion.

You can use COUNTIFS to specify multiple conditions.

Syntax

COUNTIF(values_to_test, criterion)

Argument	Data Type	Description
values_to_test	RDF term	The literal, URI, or blank node value that defines the values to test against the `criterion`.
criterion	RDF term	The literal, URI, or blank node value that defines the condition to test values against.

Returns

Data Type	Description
long	The number of values that meet the criterion.

COUNTIFS

This function counts the number of values that meet the specified criteria. Unlike COUNTIF, this function enables you to specify multiple conditions.

Syntax

COUNTIFS(values_to_count, criteria1 [, criteria2 ] [, criteriaN ])

Argument	Data Type	Description
values_to_count	RDF term	The literal, URI, or blank node value to compare against the criteria.
criteria1–N	RDF term	A literal, URI, or blank node value that defines a condition to test the `values_to_count` against.

Returns

Data Type	Description
long	The number of values that meet the specified conditions.

GROUP_CONCAT

This function concatenates a group of strings into a single string. It is a simplified version of GROUPCONCAT as it takes only one argument.

Syntax

GROUP_CONCAT(text)

Argument	Data Type	Description
text	string	The string property whose values to concatenate into a single string.

Returns

Data Type	Description
string	The concatenated string.

GROUPCONCAT

This function concatenates a group of strings into a single string. Unlike GROUP_CONCAT, this function allows for customization of the separator to use as well as the configuration of limits and options like prefixes and suffixes.

Syntax

GROUPCONCAT(group1, [ group2, ..., groupN, ] group_value_separator, separator, serialize,
            row_limit, value_limit, delimit_blanks [, prefix ] [, suffix ] [, max_length ])

Argument	Data Type	Description
group1–N	string	The group(s) of strings to concatenate.
group_value_separator	string	The separator string to use between the groups of strings if you specified more than one `group`.
separator	string	The separator string to use between the values in a concatenated group of strings.
serialize	boolean	A boolean value that indicates whether returned values should be serialized with the value's data type.
row_limit	int	An integer that puts a maximum limit on the number of rows to retrieve for a group.
value_limit	int	An integer that puts a maximum limit on the number of values to retrieve from a group of rows.
delimit_blanks	boolean	A boolean value that indicates whether to delimit blanks with the `separator` value.
prefix	string	Optional string to add as a prefix to the resulting string.
suffix	string	Optional string to add as a suffix to the resulting string.
max_length	int	Optional integer that puts a maximum limit on the number of characters the resulting string can have.

Returns

Data Type	Description
string	The concatenated string.

MAX

This function returns the maximum value from each group of values.

Syntax

MAX(value1 [, value2 ] [, valueN ])

Argument	Data Type	Description
value1–N	RDF term	The group(s) of literal, URI, or blank node values for which to return the maximum value.

Returns

Data Type	Description
RDF term	The maximum value from each group.

MEDIAN

This function returns the median value from a group of numbers. The median is the number in the group where half of the numbers are greater than the number and half are less than the number.

Syntax

MEDIAN(number)

Argument	Data Type	Description
number	numeric	The group of numeric values for which to calculate the median.

Returns

Data Type	Description
number	The median for the group.

MIN

This function returns the minimum value from each group of values.

Syntax

MIN(value1 [, value2 ] [, valueN ])

Argument	Data Type	Description
value1–N	RDF term	The group(s) of literal, URI, or blank node values for which to return the minimum value.

Returns

Data Type	Description
RDF term	The minimum value from each group.

MODE

This function returns the mode from a group of values. The mode is the value that occurs most frequently in the group.

Syntax

MODE(value)

Argument	Data Type	Description
value	RDF term	The group of literal, URI, or blank node values for which to return the mode.

Returns

Data Type	Description
RDF term	The mode from the group.

MODEPERCENT

This function calculates the percentage of values in a group that belong to the mode.

Syntax

MODEPERCENT(value)

Argument	Data Type	Description
value	RDF term	The group of literal, URI, or blank node values for which to calculate the modepercent.

Returns

Data Type	Description
double	The percentage of values that belong to the mode.

PERCENTILE_CONT

This function calculates a percentile based on the continuous distribution of the specified group of values. The returned value is interpolated and may not be equal to any of the values in the group.

Syntax

PERCENTILE_CONT(percentile, value)

Argument	Data Type	Description
percentile	float	A float value that specifies the percentile to compute.
value	RDF term	The group of literal, URI, or blank node values for which to calculate the percentile.

Returns

Data Type	Description
RDF term	The interpolated percentile.

PERCENTILE_DISC

This function calculates a percentile based on the discrete distribution of the specified group of values.

Syntax

PERCENTILE_DISC(percentile, value)

Argument	Data Type	Description
percentile	float	A float value that specifies the percentile to compute.
value	RDF term	The group of literal, URI, or blank node values for which to calculate the percentile.

Returns

Data Type	Description
RDF term	The percentile based on the discrete distribution of the group.

PRODUCT

This function calculates the product of a group of numbers.

Syntax

PRODUCT(number)

Argument	Data Type	Description
number	numeric	The group of numbers to multiply.

Returns

Data Type	Description
number	The product of the group.

SAMPLE

This function returns an arbitrary value from the specified group of values.

Syntax

SAMPLE(value)

Argument	Data Type	Description
value	RDF term	The group of literal, URI, or blank node values from which to choose a sample value.

Returns

Data Type	Description
RDF term	The arbitrary value from the group.

STDEV

This function measures the standard deviation (amount of dispersion) of a group of numbers.

Syntax

STDEV(value)

Argument	Data Type	Description
value	numeric	The numeric value that defines the set of numbers for which to measure the standard deviation.

Returns

Data Type	Description
number	The standard deviation of the group.

STDEVP

This function calculates the product of the standard deviation for a group of numbers.

Syntax

STDEVP(value)

Argument	Data Type	Description
value	numeric	The numeric value that defines the set of numbers for which to measure the standard deviation and compute the product.

Returns

Data Type	Description
number	The product of the standard deviation for the group.

SUM

This function calculates the sum of the numbers within a group.

Syntax

SUM(number)

Argument	Data Type	Description
number	numeric	The group of numbers to sum.

Returns

Data Type	Description
number	The sum of the values in the group.

SUMIF

This function calculates the sum of the range of values that meet the specified criterion.

You can use SUMIFS to specify multiple value ranges and conditions.

Syntax

SUMIF(values_to_test, criterion [, range_of_values ])

Argument	Data Type	Description
values_to_test	RDF term	The literal, URI, or blank node value that defines the values to test against the `criterion`.
criterion	RDF term	The literal, URI, or blank node value that defines the condition to test values against.
range_of_values	numeric	An optional number that defines the range of values to sum. When omitted, `values_to_test` is used.

Returns

Data Type	Description
number	The sum of the range of values.

SUMIFS

This function calculates the sums of the ranges of values that meet the specified criteria. Unlike SUMIF, this function enables you to specify multiple ranges and multiple conditions.

Syntax

SUMIFS(values_to_sum, value_range1, criteria1,
       value_range2, criteria2 
       [, value_rangeN, criteriaN ])

Argument	Data Type	Description
values_to_sum	numeric	The numeric value that defines the overall range of values to evaluate.
value_range1–N	RDF term	The literal, URI, or blank node value that defines the range of values to test against the corresponding `criteria`.
criteria1–N	RDF term	The literal, URI, or blank node value that defines the condition to test the corresponding `value_range` against.

Returns

Data Type	Description
number	The sums of the ranges of values.

SUMPRODUCT

This function multiplies the numbers in a group and adds the results.

Syntax

SUMPRODUCT(number)

Argument	Data Type	Description
number	numeric	The group of numbers to multiply and then sum the results.

Returns

Data Type	Description
number	The sum of the product of the numbers in the group.

SUMSQ

This function calculates the square root of each number in a group and adds the results.

Syntax

SUMSQ(number)

Argument	Data Type	Description
number	numeric	The group of numbers for which to calculate the square root and then sum the results.

Returns

Data Type	Description
number	The sum of the square root of the numbers in the group.

VAR

This function calculates the unbiased (sample) variance for a group of numbers.

Syntax

VAR(value)

Argument	Data Type	Description
value	numeric	The numeric value that defines the set of numbers for which to measure the variance.

Returns

Data Type	Description
number	The unbiased variance for the group.

VARP

This function calculates the biased (population) variance for a group of numbers.

Syntax

VARP(value)

Argument	Data Type	Description
value	number	The value that defines the set of numbers for which to measure the population variance.

Returns

Data Type	Description
decimal	The biased variance for the group.

WEIGHTEDAVERAGE

This function calculates the weighted average of a group of values.

Syntax

WEIGHTEDAVERAGE(value, weight)

Argument	Data Type	Description
value	decimal	The decimal value that defines the group of values for which to calculate the weighted average.
weight	decimal	The decimal value that defines the weight to use in the calculation.

Returns

Data Type	Description
decimal	The weighted average for the group.