Aggregate Functions

This topic describes the aggregate functions in Anzo.

Typographical Conventions

This documentation uses the following conventions in function syntax:

  • CAPS: Although SPARQL is case-insensitive, function names and other keywords are written in uppercase for readability.
  • [ argument ]: Brackets are used to indicate optional arguments. Arguments without brackets are required.

Functions

  • AVERAGEIF: Calculates the average of the range of values that meet the specified criterion.
  • AVERAGEIFS: Calculates the averages of the ranges of values that meet the specified criteria.
  • AVG: Calculates the average (arithmetic mean) value for a group of numbers.
  • CHOOSE_BY_MAX: Returns the value from a group that corresponds to the maximum value from another group.
  • CHOOSE_BY_MIN: Returns the value from a group that corresponds to the minimum value from another group.
  • COUNT: Counts the number of values that exist for a group.
  • COUNT_DISTINCT: Counts the number of unique values that exist for a group.
  • COUNTIF: Counts the number of values that meet the specified criterion.
  • COUNTIFS: Counts the number of values that meet the specified criteria.
  • GROUP_CONCAT: Concatenates a group of strings into a single string.
  • GROUPCONCAT: Concatenates a group of strings into a single string. This function is a customizable version of GROUP_CONCAT.
  • MAX: Returns the maximum value from each group of values.
  • MEDIAN: Returns the median number out of a group of numbers.
  • MIN: Returns the minimum value from each group of values.
  • MODE: Returns the mode (the value that occurs most frequently) from a group of values.
  • MODEPERCENT: Calculates the percentage of values in a group that belong to the mode.
  • PERCENTILE_CONT: Calculates a percentile based on the continuous distribution of the specified group of values.
  • PERCENTILE_DISC: Calculates a percentile based on the discrete distribution of the specified group of values.
  • PRODUCT: Calculates the product of a group of numbers.
  • SAMPLE: Returns an arbitrary value from the specified group of values.
  • STDEV: Measures the standard deviation in a group of numbers.
  • STDEVP: Calculates the product of the standard deviation for a group of numbers.
  • SUM: Calculates the sum of the numbers within a group.
  • SUMIF: Calculates the sum of the range of values that meet the specified criterion.
  • SUMIFS: Calculates the sums of the ranges of values that meet the specified criteria.
  • SUMPRODUCT: Multiplies the numbers in a group and adds the results.
  • SUMSQ: Calculates the square root of each number in a group and adds the results.
  • VAR: Calculates the unbiased (sample) variance of a group of numbers.
  • VARP: Calculates the biased (population) variance of a group of numbers.
  • WEIGHTEDAVERAGE: Calculates the weighted average of a group of values.

AVERAGEIF

This function calculates the average of the range of values that meet the specified criterion.

You can use AVERAGEIFS to specify multiple value ranges and conditions.

Syntax

AVERAGEIF(values_to_test, criterion [, range_of_values ])
Argument Data Type Description
values_to_test RDF term The literal, URI, or blank node value that defines the values to test against the criteria.
criterion RDF term The literal, URI, or blank node value that defines the condition to test values against.
range_of_values numeric An optional number that defines the range of values to average. When omitted, values_to_test is used.

Returns

Data Type Description
number The average value from the range of values that meet the criterion.

AVERAGEIFS

This function calculates the averages of the ranges of values that meet the specified criteria. Unlike AVERAGEIF, this function enables you to specify multiple ranges and multiple conditions.

Syntax

AVERAGEIFS(values_to_average, value_range1, criteria1,
           value_range2, criteria2 
           [, value_rangeN, criteriaN ])
Argument Data Type Description
values_to_average numeric The numeric value that defines the overall range of values to evaluate.
value_range1–N RDF term The literal, URI, or blank node value that defines the range of values to test against the corresponding criteria.
criteria1–N RDF term The literal, URI, or blank node value that defines the condition to test the corresponding value_range against.

Returns

Data Type Description
number The average values from the ranges of values that meet the criteria.

AVG

This function calculates the average (arithmetic mean) value for a group of numbers.

Syntax

AVG(number)
Argument Data Type Description
number numeric The numeric value for which to calculate the average.

Returns

Data Type Description
number The arithmetic mean of the input values.

CHOOSE_BY_MAX

This function calculates the maximum value for one group and returns the value from another group that corresponds to the maximum from the first group.

Syntax

CHOOSE_BY_MAX(test, value)
Argument Data Type Description
test RDF term The group of literal, URI, or blank node values from which to find the maximum value.
value RDF term The group of literal, URI, or blank node values from which to return the value that corresponds to the maximum value of test.

Example

In a fictional ticket sales data set, the following statement returns the ID of the buyer who paid the most:

CHOOSE_BY_MAX(?totalPaid, ?buyerID)

Returns

Data Type Description
RDF term The term from the value group that corresponds to the maximum value from the test group.

CHOOSE_BY_MIN

This function calculates the minimum value for one group and returns the value from another group that corresponds to the minimum from the first group.

Syntax

CHOOSE_BY_MIN(test, value)
Argument Data Type Description
test RDF term The group of literal, URI, or blank node values from which to find the minimum value.
value RDF term The group of literal, URI, or blank node values from which to return the value that corresponds to the minimum value of test.

Example

In a fictional ticket sales data set, the following statement returns the ID of the seller who sold the least number of tickets:

CHOOSE_BY_MIN(?totalTickets, ?sellerID)

Returns

Data Type Description
RDF term The term from the value group that corresponds to the minimum value from the test group.

COUNT

This function counts the number of values that exist for a group.

Syntax

COUNT(value)
Argument Data Type Description
value RDF term The group of literal, URI, or blank node values to count.

Returns

Data Type Description
long The number of values in the group.

COUNT_DISTINCT

This function counts the number of unique values that exist for a group.

Syntax

COUNT_DISTINCT(value)
Argument Data Type Description
value RDF term The group of literal, URI, or blank node values for which to count the number of distinct values.

Returns

Data Type Description
long The number of unique values in the group.

COUNTIF

This function counts the number of values that meet the specified criterion.

You can use COUNTIFS to specify multiple conditions.

Syntax

COUNTIF(values_to_test, criterion)
Argument Data Type Description
values_to_test RDF term The literal, URI, or blank node value that defines the values to test against the criterion.
criterion RDF term The literal, URI, or blank node value that defines the condition to test values against.

Returns

Data Type Description
long The number of values that meet the criterion.

COUNTIFS

This function counts the number of values that meet the specified criteria. Unlike COUNTIF, this function enables you to specify multiple conditions.

Syntax

COUNTIFS(values_to_count, criteria1 [, criteria2 ] [, criteriaN ])
Argument Data Type Description
values_to_count RDF term The literal, URI, or blank node value to compare against the criteria.
criteria1–N RDF term A literal, URI, or blank node value that defines a condition to test the values_to_count against.

Returns

Data Type Description
long The number of values that meet the specified conditions.

GROUP_CONCAT

This function concatenates a group of strings into a single string. It is a simplified version of GROUPCONCAT as it takes only one argument.

Syntax

GROUP_CONCAT(text)
Argument Data Type Description
text string The string property whose values to concatenate into a single string.

Returns

Data Type Description
string The concatenated string.

GROUPCONCAT

This function concatenates a group of strings into a single string. Unlike GROUP_CONCAT, this function allows for customization of the separator to use as well as the configuration of limits and options like prefixes and suffixes.

Syntax

GROUPCONCAT(group1, [ group2, ..., groupN, ] group_value_separator, separator, serialize,
            row_limit, value_limit, delimit_blanks [, prefix ] [, suffix ] [, max_length ])
Argument Data Type Description
group1–N string The group(s) of strings to concatenate.
group_value_separator string The separator string to use between the groups of strings if you specified more than one group.
separator string The separator string to use between the values in a concatenated group of strings.
serialize boolean A boolean value that indicates whether returned values should be serialized with the value's data type.
row_limit int An integer that puts a maximum limit on the number of rows to retrieve for a group.
value_limit int An integer that puts a maximum limit on the number of values to retrieve from a group of rows.
delimit_blanks boolean A boolean value that indicates whether to delimit blanks with the separator value.
prefix string Optional string to add as a prefix to the resulting string.
suffix string Optional string to add as a suffix to the resulting string.
max_length int Optional integer that puts a maximum limit on the number of characters the resulting string can have.

Returns

Data Type Description
string The concatenated string.

MAX

This function returns the maximum value from each group of values.

Syntax

MAX(value1 [, value2 ] [, valueN ])
Argument Data Type Description
value1–N RDF term The group(s) of literal, URI, or blank node values for which to return the maximum value.

Returns

Data Type Description
RDF term The maximum value from each group.

MEDIAN

This function returns the median value from a group of numbers. The median is the number in the group where half of the numbers are greater than the number and half are less than the number.

Syntax

MEDIAN(number)
Argument Data Type Description
number numeric The group of numeric values for which to calculate the median.

Returns

Data Type Description
number The median for the group.

MIN

This function returns the minimum value from each group of values.

Syntax

MIN(value1 [, value2 ] [, valueN ])
Argument Data Type Description
value1–N RDF term The group(s) of literal, URI, or blank node values for which to return the minimum value.

Returns

Data Type Description
RDF term The minimum value from each group.

MODE

This function returns the mode from a group of values. The mode is the value that occurs most frequently in the group.

Syntax

MODE(value)
Argument Data Type Description
value RDF term The group of literal, URI, or blank node values for which to return the mode.

Returns

Data Type Description
RDF term The mode from the group.

MODEPERCENT

This function calculates the percentage of values in a group that belong to the mode.

Syntax

MODEPERCENT(value)
Argument Data Type Description
value RDF term The group of literal, URI, or blank node values for which to calculate the modepercent.

Returns

Data Type Description
double The percentage of values that belong to the mode.

PERCENTILE_CONT

This function calculates a percentile based on the continuous distribution of the specified group of values. The returned value is interpolated and may not be equal to any of the values in the group.

Syntax

PERCENTILE_CONT(percentile, value)
Argument Data Type Description
percentile float A float value that specifies the percentile to compute.
value RDF term The group of literal, URI, or blank node values for which to calculate the percentile.

Returns

Data Type Description
RDF term The interpolated percentile.

PERCENTILE_DISC

This function calculates a percentile based on the discrete distribution of the specified group of values.

Syntax

PERCENTILE_DISC(percentile, value)
Argument Data Type Description
percentile float A float value that specifies the percentile to compute.
value RDF term The group of literal, URI, or blank node values for which to calculate the percentile.

Returns

Data Type Description
RDF term The percentile based on the discrete distribution of the group.

PRODUCT

This function calculates the product of a group of numbers.

Syntax

PRODUCT(number)
Argument Data Type Description
number numeric The group of numbers to multiply.

Returns

Data Type Description
number The product of the group.

SAMPLE

This function returns an arbitrary value from the specified group of values.

Syntax

SAMPLE(value)
Argument Data Type Description
value RDF term The group of literal, URI, or blank node values from which to choose a sample value.

Returns

Data Type Description
RDF term The arbitrary value from the group.

STDEV

This function measures the standard deviation (amount of dispersion) of a group of numbers.

Syntax

STDEV(value)
Argument Data Type Description
value numeric The numeric value that defines the set of numbers for which to measure the standard deviation.

Returns

Data Type Description
number The standard deviation of the group.

STDEVP

This function calculates the product of the standard deviation for a group of numbers.

Syntax

STDEVP(value)
Argument Data Type Description
value numeric The numeric value that defines the set of numbers for which to measure the standard deviation and compute the product.

Returns

Data Type Description
number The product of the standard deviation for the group.

SUM

This function calculates the sum of the numbers within a group.

Syntax

SUM(number)
Argument Data Type Description
number numeric The group of numbers to sum.

Returns

Data Type Description
number The sum of the values in the group.

SUMIF

This function calculates the sum of the range of values that meet the specified criterion.

You can use SUMIFS to specify multiple value ranges and conditions.

Syntax

SUMIF(values_to_test, criterion [, range_of_values ])
Argument Data Type Description
values_to_test RDF term The literal, URI, or blank node value that defines the values to test against the criterion.
criterion RDF term The literal, URI, or blank node value that defines the condition to test values against.
range_of_values numeric An optional number that defines the range of values to sum. When omitted, values_to_test is used.

Returns

Data Type Description
number The sum of the range of values.

SUMIFS

This function calculates the sums of the ranges of values that meet the specified criteria. Unlike SUMIF, this function enables you to specify multiple ranges and multiple conditions.

Syntax

SUMIFS(values_to_sum, value_range1, criteria1,
       value_range2, criteria2 
       [, value_rangeN, criteriaN ])
Argument Data Type Description
values_to_sum numeric The numeric value that defines the overall range of values to evaluate.
value_range1–N RDF term The literal, URI, or blank node value that defines the range of values to test against the corresponding criteria.
criteria1–N RDF term The literal, URI, or blank node value that defines the condition to test the corresponding value_range against.

Returns

Data Type Description
number The sums of the ranges of values.

SUMPRODUCT

This function multiplies the numbers in a group and adds the results.

Syntax

SUMPRODUCT(number)
Argument Data Type Description
number numeric The group of numbers to multiply and then sum the results.

Returns

Data Type Description
number The sum of the product of the numbers in the group.

SUMSQ

This function calculates the square root of each number in a group and adds the results.

Syntax

SUMSQ(number)
Argument Data Type Description
number numeric The group of numbers for which to calculate the square root and then sum the results.

Returns

Data Type Description
number The sum of the square root of the numbers in the group.

VAR

This function calculates the unbiased (sample) variance for a group of numbers.

Syntax

VAR(value)
Argument Data Type Description
value numeric The numeric value that defines the set of numbers for which to measure the variance.

Returns

Data Type Description
number The unbiased variance for the group.

VARP

This function calculates the biased (population) variance for a group of numbers.

Syntax

VARP(value)
Argument Data Type Description
value number The value that defines the set of numbers for which to measure the population variance.

Returns

Data Type Description
decimal The biased variance for the group.

WEIGHTEDAVERAGE

This function calculates the weighted average of a group of values.

Syntax

WEIGHTEDAVERAGE(value, weight)
Argument Data Type Description
value decimal The decimal value that defines the group of values for which to calculate the weighted average.
weight decimal The decimal value that defines the weight to use in the calculation.

Returns

Data Type Description
decimal The weighted average for the group.