String Functions

This topic describes the Anzo functions that operate on string data types.

Typographical Conventions

This documentation uses the following conventions in function syntax:

  • CAPS: Although SPARQL keywords are case-insensitive, function names and other keywords are written in uppercase for readability.
  • [ argument ]: Brackets are used to indicate optional arguments. Arguments without brackets are required.

Functions

  • BUSINESS_ENTITY_EXCLUDER: Removes suffixes that represent business entities.
  • CONCATENATE: Concatenates two or more strings and returns the result as a string.
  • CONCATURL: Concatenates two or more strings and returns the result as a URI.
  • CONTAINS: Evaluates whether the specified string contains the given pattern.
  • ENCODE_FOR_URI: Percent-encodes the specified string for use in a URI.
  • ESCAPEHTML: Escapes the specified string for use in HTML.
  • FIND: Returns the position—from left to right—of a string within another string.
  • FINDREVERSE: Returns the position—from right to left—of a string within another string.
  • GROUP_CONCAT: Concatenates a group of strings into a single string.
  • GROUPCONCAT: Concatenates a group of strings into a single string. This function is a customizable version of GROUP_CONCAT.
  • LANG: Returns any language tags that are included with strings.
  • LANGMATCHES: Evaluates whether a string includes a language tag that matches the specified language range.
  • LCASE: Converts the letters in a string literal to lower case.
  • LEFT: Returns the specified number of characters starting from the beginning (left side) of the string.
  • LEN: Calculates the length (number of characters) in a string.
  • LEVENSHTEIN_DIST: Calculates the Levenshtein distance or measure of similarity between two strings.
  • LOWER: Converts all letters in a string to lower case.
  • MD5: Returns the MD5 checksum of a string as a hexadecimal string.
  • MID: Returns the specified number of characters from a string, starting from a given position in the string.
  • REGEX: Evaluates whether a string matches the specified regular expression pattern.
  • REGEXP_SUBSTR: Searches a string for the specified regular expression pattern and returns the substring that matches the pattern.
  • REPLACE: Extends the REGEX function to provide the ability to find a pattern in a string and replace it with another pattern.
  • RIGHT: Returns the specified number of characters starting from the end (right side) of the string.
  • SEARCH: Uses text search semantics to evaluate whether the specified string matches the given pattern.
  • SHA1: Calculates the SHA-1 digest of a string value.
  • SHA224: Calculates the SHA-224 digest of a string value.
  • SHA256: Calculates the SHA-256 digest of a string value.
  • SHA384: Calculates the SHA-384 digest of a string value.
  • SHA512: Calculates the SHA-512 digest of a string value.
  • STRAFTER: Returns the portion of a string that comes after the specified substring.
  • STRBEFORE: Returns the portion of a string that comes before the specified substring.
  • STRDT: Constructs a literal value with the specified data type.
  • STRENDS: Evaluates whether the specified string ends with the specified substring.
  • STRLANG: Constructs a literal value with the specified language tag.
  • STRLEN: Calculates the length of a string.
  • STRSTARTS: Evaluates whether the specified string starts with the specified substring.
  • STRUUID: Returns a string that is the result of generating a Universally Unique Identifier (UUID).
  • SUBSTITUTE: Replaces existing text in a string with the specified new text.
  • SUBSTR: Returns a substring from a string value.
  • TOURI: Casts a string to a URI.
  • TRIM: Removes all spaces from a string except for any single spaces between words.
  • UCASE: Converts all letters in a string to upper case.
  • UPPER: Converts the letters in a string literal to upper case.

BUSINESS_ENTITY_EXCLUDER

This function removes from strings the suffixes that represent business entities.

Syntax

BUSINESS_ENTITY_EXCLUDER(text)
Argument Type Description
text string The string from which you want to remove business entities.

Returns

Type Description
string The string without the business entity suffix.

CONCATENATE

This function concatenates two or more strings and returns the result as a string.

Syntax

CONCATENATE(text1, text2 [, textN ])
Argument Type Description
text1–N string The strings that you want to concatenate to form a single string.

Returns

Type Description
string The concatenated string.
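
As a minimal sketch of the syntax above, the following query builds a display name from two values and a literal space. The ex: namespace and the ex:firstName and ex:lastName properties are hypothetical and stand in for properties in your own data model.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

SELECT ?person ?fullName
WHERE {
  ?person ex:firstName ?first ;
          ex:lastName  ?last .
  # text1 = first name, text2 = a literal space, text3 = last name
  BIND(CONCATENATE(?first, " ", ?last) AS ?fullName)
}
```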

CONCATURL

This function concatenates two or more strings and returns the result as a URI.

Syntax

CONCATURL(text1, text2 [, textN ])
Argument Type Description
text1–N string The strings that you want to concatenate to form a URI.

Returns

Type Description
URI The concatenated string as a URI.

CONTAINS

This function evaluates whether the specified strings contain the given pattern. Results are grouped under "true" or "false."

Syntax

CONTAINS(text, pattern)
Argument Type Description
text string The string value that you want to check against the specified pattern.
pattern string The string pattern that you want to look for in the supplied text.

Returns

Type Description
boolean True if the strings contain the pattern and false if they do not.
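
The following sketch checks two sample names for a substring. It assumes that Anzo follows the SPARQL 1.1 definition of CONTAINS, where the pattern is treated as a plain, case-sensitive substring rather than a regular expression. The sample values are invented for illustration.

```sparql
SELECT ?name ?isIncorporated
WHERE {
  VALUES ?name { "Acme Inc" "Acme Labs" }
  # true for "Acme Inc", false for "Acme Labs"
  BIND(CONTAINS(?name, "Inc") AS ?isIncorporated)
}
```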

ENCODE_FOR_URI

This function percent-encodes the reserved characters in the specified string so that the result can be used as part of a URI. The result is returned as a string, not as a URI.

Syntax

ENCODE_FOR_URI(text)
Argument Type Description
text string The string value to encode as a URI.

Returns

Type Description
string The percent-encoded string.
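
The following sketch shows the effect on a value that contains a space. The expected result is taken from the SPARQL 1.1 definition of ENCODE_FOR_URI, on the assumption that Anzo follows it.

```sparql
SELECT ?encoded
WHERE {
  # The space is percent-encoded, giving "Los%20Angeles"
  BIND(ENCODE_FOR_URI("Los Angeles") AS ?encoded)
}
```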

ESCAPEHTML

This function escapes the specified string for use in HTML.

Syntax

ESCAPEHTML(text)
Argument Type Description
text string The string value to escape for HTML.

Returns

Type Description
string The string escaped for HTML.

FIND

This function returns the position—from left to right—of a string within another string.

You can use FINDREVERSE to find the character or substring position from right to left.

Syntax

FIND(find_text, within_text, start_num)
Argument Type Description
find_text string The string to look for in the within_text.
within_text string The string to search within.
start_num int An integer that indicates the position to start from when looking for the find_text. The starting position is at the beginning of the within_text value and characters are counted from left to right.

Returns

Type Description
int The character position (from left to right) where the substring starts.

FINDREVERSE

Similar to FIND, this function returns the position—from right to left—of a string within another string.

Syntax

FINDREVERSE(find_text, within_text, start_num)
Argument Type Description
find_text string The string to look for in the within_text value.
within_text string The string to search within.
start_num int An integer that indicates the position to start from when looking for the find_text. The starting position is the end of the within_text value and characters are counted from right to left.

Returns

Type Description
int The character position (from right to left) where the substring starts.

GROUP_CONCAT

This function concatenates a group of strings into a single string. It is a simplified version of GROUPCONCAT as it takes only one argument.

Syntax

GROUP_CONCAT(text)
Argument Type Description
text string The string property whose values to concatenate into a single string.

Returns

Type Description
string The concatenated string.
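
Because GROUP_CONCAT is an aggregate, it is normally used together with GROUP BY. The following sketch collects all email values for each person into one string. The ex: namespace and the ex:email property are hypothetical.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

SELECT ?person (GROUP_CONCAT(?email) AS ?emails)
WHERE {
  ?person ex:email ?email .
}
GROUP BY ?person
```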

GROUPCONCAT

This function concatenates a group of strings into a single string. Unlike GROUP_CONCAT, this function lets you customize the separators, limit the number of rows and values that are retrieved, and add an optional prefix and suffix to the result.

Syntax

GROUPCONCAT(group1, [ group2, ..., groupN, ] group_value_separator, separator, serialize,
            row_limit, value_limit, delimit_blanks [, prefix ] [, suffix ] [, max_length ])
Argument Type Description
group1–N string The group(s) of strings to concatenate.
group_value_separator string The separator string to use between the groups of strings if you specified more than one group.
separator string The separator string to use between the values in a concatenated group of strings.
serialize boolean A boolean value that indicates whether returned values should be serialized with the value's data type.
row_limit int An integer that puts a maximum limit on the number of rows to retrieve for a group.
value_limit int An integer that puts a maximum limit on the number of values to retrieve from a group of rows.
delimit_blanks boolean A boolean value that indicates whether to delimit blanks with the separator value.
prefix string Optional string to add as a prefix to the resulting string.
suffix string Optional string to add as a suffix to the resulting string.
max_length int Optional integer that puts a maximum limit on the number of characters the resulting string can have.

Returns

Type Description
string The concatenated string.
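
Because the required arguments are positional, it can help to see them lined up against the syntax. The following is only a sketch of the call shape with a single group: the ex: namespace, the ex:email property, and every argument value shown are illustrative assumptions, not recommended settings.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

SELECT ?person
       (GROUPCONCAT(?email,   # group1: the values to concatenate
                    " | ",    # group_value_separator: used between groups
                    ", ",     # separator: used between values in a group
                    false,    # serialize: do not include the data type
                    100,      # row_limit (illustrative value)
                    100,      # value_limit (illustrative value)
                    false     # delimit_blanks
                   ) AS ?emails)
WHERE {
  ?person ex:email ?email .
}
GROUP BY ?person
```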

LANG

This function returns any language tags that are included in the string. The results are grouped by each language tag or by "blank" if a value does not have a language tag.

Syntax

LANG(text)
Argument Type Description
text string The string to search for language tags.

Returns

Type Description
string The found language tags.
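
The following sketch applies LANG to three sample literals. It assumes the SPARQL 1.1 behavior, in which a literal without a language tag yields an empty string (the "blank" case described above).

```sparql
SELECT ?label ?tag
WHERE {
  VALUES ?label { "chat"@fr "cat"@en "gato" }
  # ?tag is "fr", "en", and "" (empty string) respectively
  BIND(LANG(?label) AS ?tag)
}
```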

LANGMATCHES

This function tests whether a string includes a language tag that matches the specified language range.

Syntax

LANGMATCHES(text, language_range)
Argument Type Description
text string The language tag to evaluate, typically the result of calling LANG on a string.
language_range string The language tag to match in the text.

Example

LANGMATCHES(LANG(?prop),"en")

Returns

Type Description
boolean True if strings include a language tag that matches the range and false if they do not.

LCASE

This function converts the letters in a string literal to lower case.

Syntax

LCASE(text)
Argument Type Description
text string The string literal to convert to lower case.

Returns

Type Description
string The string with lower case letters.

LEFT

This function returns the specified number of characters starting from the beginning (left side) of the string.

Syntax

LEFT(text, num_chars)
Argument Type Description
text string The string from which to return the specified number of characters.
num_chars int An integer that specifies the number of characters to return, starting from the left side of the text.

Returns

Type Description
string The specified number of characters from the string.

LEN

This function calculates the length (number of characters) in a string.

Syntax

LEN(text)
Argument Type Description
text string The string for which to calculate the length.

Returns

Type Description
int The number of characters in the string.

LEVENSHTEIN_DIST

This function calculates the Levenshtein distance, a measure of similarity, between two strings. The distance is the number of single-character edits (insertions, deletions, or substitutions) required to transform the first string into the second string.

Syntax

LEVENSHTEIN_DIST(text1, text2)
Argument Type Description
text1 string The string that would be transformed into text2.
text2 string The string to measure text1 against.

Returns

Type Description
int The Levenshtein distance between the strings.
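
As a worked example, transforming "kitten" into "sitting" takes three edits under the standard definition of Levenshtein distance. The query below is a sketch that assumes the function implements that standard definition; the comments trace the edits.

```sparql
SELECT ?distance
WHERE {
  # kitten -> sitten   (substitute "k" with "s")
  # sitten -> sittin   (substitute "e" with "i")
  # sittin -> sitting  (insert "g")
  # Three edits, so the standard Levenshtein distance is 3.
  BIND(LEVENSHTEIN_DIST("kitten", "sitting") AS ?distance)
}
```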

LOWER

This function converts all letters in a string to lower case.

Syntax

LOWER(text)
Argument Type Description
text string The string to convert to lower case.

Returns

Type Description
string The string with lower case letters.

MD5

This function returns the MD5 checksum of a string as a hexadecimal string.

Syntax

MD5(text)
Argument Type Description
text string The string for which to return the MD5 checksum.

Returns

Type Description
string The hexadecimal string.
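
The following sketch computes the checksum of a short literal. The value in the comment is the well-known MD5 checksum of "abc", as also listed in the SPARQL 1.1 specification.

```sparql
SELECT ?checksum
WHERE {
  # The MD5 checksum of "abc" is "900150983cd24fb0d6963f7d28e17f72"
  BIND(MD5("abc") AS ?checksum)
}
```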

MID

This function returns the specified number of characters from a string, starting from a given position in the string.

Syntax

MID(text, start_num, num_chars)
Argument Type Description
text string The string from which to return the specified characters.
start_num int An integer that indicates the starting position in the string.
num_chars int An integer that specifies the number of characters to return, starting with the character indicated by start_num.

Returns

Type Description
string The specified number of characters from the string.

REGEX

This function tests whether a string matches the specified regular expression pattern.

Syntax

REGEX(text, pattern [, flags ])
Argument Type Description
text string The string to test against the pattern.
pattern string The regular expression pattern to look for in the text. For information about the supported regular expression syntax, see the Regular Expression Syntax section of the W3C XQuery 1.0 and XPath 2.0 Functions and Operators specification.
flags string You can include one or more optional modifier flags that further define the pattern. For information about flags, see the Flags section of the W3C Functions and Operators specification.

Returns

Type Description
boolean True if the string matches the regular expression pattern and false if it does not.
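
The following sketch combines a pattern with the optional flags argument. The sample names are invented; the "i" flag is the case-insensitive flag defined in the W3C Functions and Operators specification.

```sparql
SELECT ?name ?startsWithAl
WHERE {
  VALUES ?name { "Alice" "alan" "Bob" }
  # "^al" anchors the match at the start of the string; "i" ignores case.
  # true for "Alice" and "alan", false for "Bob"
  BIND(REGEX(?name, "^al", "i") AS ?startsWithAl)
}
```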

REGEXP_SUBSTR

This function searches a string for the specified regular expression pattern and returns the substring that matches the pattern.

Syntax

REGEXP_SUBSTR(text, pattern [, start_position ] [, nth_appearance ])
Argument Type Description
text string The string to test against the pattern.
pattern string The regular expression pattern to look for in the text. For information about the supported regular expression syntax, see the Regular Expression Syntax section of the W3C XQuery 1.0 and XPath 2.0 Functions and Operators specification.
start_position int An optional integer that specifies the number of characters from the beginning of the string to start searching for matches (the default value is 1).
nth_appearance int An optional integer that specifies which occurrence of the pattern to match (the default value is 1).

Returns

Type Description
string The substring that matches the regular expression pattern.

REPLACE

This function extends the REGEX function: it finds each match of a regular expression pattern in a string and replaces it with the specified replacement. The function returns the resulting string.

Syntax

REPLACE(text, pattern, replacement_pattern [, flags ])
Argument Type Description
text string The string to test against the pattern.
pattern string The regular expression pattern to look for in the text. For information about the supported regular expression syntax, see the Regular Expression Syntax section of the W3C XQuery 1.0 and XPath 2.0 Functions and Operators specification.
replacement_pattern string The text to substitute for each match of the pattern.
flags string You can include one or more optional modifier flags that further define the pattern. For information about flags, see the Flags section of the W3C Functions and Operators specification.

Returns

Type Description
string The string that contains the replacement pattern.
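
The following sketch shows a replacement with and without flags. Both examples and their results come from the SPARQL 1.1 definition of REPLACE, on the assumption that Anzo follows it.

```sparql
SELECT ?simple ?ignoreCase
WHERE {
  BIND(REPLACE("abcd", "b", "Z") AS ?simple)            # "aZcd"
  BIND(REPLACE("abab", "B", "Z", "i") AS ?ignoreCase)   # "aZaZ"
}
```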

RIGHT

This function returns the specified number of characters starting from the end (right side) of the string.

Syntax

RIGHT(text, num_chars)
Argument Type Description
text string The string from which to return the specified number of characters.
num_chars int An integer that specifies the number of characters to return, starting from the right side of the text.

Returns

Type Description
string The specified characters from the string.

SEARCH

This function uses text search semantics to evaluate whether the specified string matches the given pattern.

Syntax

SEARCH(text, pattern [, required ] [, wildcard ] [, escape ])
Argument Type Description
text string The string to search.
pattern string The search string to look for in the text. Anzo automatically converts the value to a regular expression pattern that uses text search semantics.
required boolean An optional boolean value that indicates whether the text must include all elements of the search pattern to qualify as a match or whether matching just part of the pattern qualifies as a match.
wildcard boolean An optional boolean value that indicates whether or not to add the wildcard character * to the end of the search pattern.
escape boolean An optional boolean value that indicates whether or not to escape all of the special characters (such as +, -, or |) in the text.

Returns

Type Description
boolean True if strings match the pattern and false if they do not.

SHA1

This function calculates the SHA-1 digest of a string.

Syntax

SHA1(text)
Argument Type Description
text string The string for which to calculate the SHA-1 digest.

Returns

Type Description
string The SHA-1 digest.

SHA224

This function calculates the SHA-224 digest of a string.

Syntax

SHA224(text)
Argument Type Description
text string The string for which to calculate the SHA-224 digest.

Returns

Type Description
string The SHA-224 digest.

SHA256

This function calculates the SHA-256 digest of a string.

Syntax

SHA256(text)
Argument Type Description
text string The string for which to calculate the SHA-256 digest.

Returns

Type Description
string The SHA-256 digest.

SHA384

This function calculates the SHA-384 digest of a string.

Syntax

SHA384(text)
Argument Type Description
text string The string for which to calculate the SHA-384 digest.

Returns

Type Description
string The SHA-384 digest.

SHA512

This function calculates the SHA-512 digest of a string.

Syntax

SHA512(text)
Argument Type Description
text string The string for which to calculate the SHA-512 digest.

Returns

Type Description
string The SHA-512 digest.
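
The SHA functions above all share the same call shape. The following sketch shows two of them on the same input and assumes that the digest is returned as a lowercase hexadecimal string, as in the SPARQL 1.1 definitions. The values in the comments are the standard SHA-1 and SHA-256 digests of "abc".

```sparql
SELECT ?sha1 ?sha256
WHERE {
  # "a9993e364706816aba3e25717850c26c9cd0d89d"
  BIND(SHA1("abc") AS ?sha1)
  # "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
  BIND(SHA256("abc") AS ?sha256)
}
```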

STRAFTER

This function returns the portion of a string that comes after the specified substring.

Syntax

STRAFTER(text, substring)
Argument Type Description
text string The string from which to return the characters that follow the substring.
substring string The string to match in the text. The function will return the part of the text that comes after this substring.

Returns

Type Description
string The part of the string that comes after the substring.
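
The following sketch extracts the part of an email address that follows the "@" character. It assumes the SPARQL 1.1 behavior, in which an empty string is returned when the substring is not found. The sample address is invented.

```sparql
SELECT ?domain ?noMatch
WHERE {
  BIND(STRAFTER("user@example.org", "@") AS ?domain)    # "example.org"
  BIND(STRAFTER("user@example.org", "/") AS ?noMatch)   # "" because "/" does not occur
}
```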

STRBEFORE

This function returns the portion of a string that comes before the specified substring.

Syntax

STRBEFORE(text, substring)
Argument Type Description
text string The string from which to return the characters that precede the substring.
substring string The string to match in the text. The function will return the part of the text that comes before this substring.

Returns

Type Description
string The part of the string that comes before the substring.
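
The following sketch extracts the part of an email address that precedes the "@" character. It assumes the SPARQL 1.1 behavior, in which an empty string is returned when the substring is not found. The sample address is invented.

```sparql
SELECT ?user ?noMatch
WHERE {
  BIND(STRBEFORE("user@example.org", "@") AS ?user)      # "user"
  BIND(STRBEFORE("user@example.org", "/") AS ?noMatch)   # "" because "/" does not occur
}
```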

STRDT

This function constructs a literal value with the specified data type.

Syntax

STRDT(text, datatype)
Argument Type Description
text string The string to add a data type specification to.
datatype URI The data type URI to add to the text. For example, xsd:integer or <http://www.w3.org/2001/XMLSchema#integer>.

Returns

Type Description
string The typed literal value.
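
The following sketch turns a plain string into an integer-typed literal. The example follows the SPARQL 1.1 definition of STRDT and uses the standard XML Schema namespace.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?typed
WHERE {
  # Produces the typed literal "123"^^xsd:integer
  BIND(STRDT("123", xsd:integer) AS ?typed)
}
```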

STRENDS

This function evaluates whether the specified string ends with the specified substring.

Syntax

STRENDS(text, substring)
Argument Type Description
text string The string to search for the substring.
substring string The string to match at the end of text. The function returns true if the text ends in the specified substring and false if it does not.

Returns

Type Description
boolean True if strings end with the specified substring and false if they do not.

STRLANG

This function constructs a literal value with the specified language tag.

Syntax

STRLANG(text, language_tag)
Argument Type Description
text string The string to add the language tag to.
language_tag string The language tag to add to the text.

Returns

Type Description
string The literal value with the language tag.
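
The following sketch attaches a language tag to a plain string, following the SPARQL 1.1 definition of STRLANG.

```sparql
SELECT ?tagged
WHERE {
  # Produces the language-tagged literal "chat"@fr
  BIND(STRLANG("chat", "fr") AS ?tagged)
}
```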

STRLEN

This function calculates the length (in characters) of a string value.

Syntax

STRLEN(text)
Argument Type Description
text string The string for which to return the length.

Returns

Type Description
long The number of characters in the string.

STRSTARTS

This function evaluates whether the specified string starts with the specified substring.

Syntax

STRSTARTS(text, substring)
Argument Type Description
text string The string to search for the substring.
substring string The string to match at the beginning of text. The function returns true if the text starts with the specified substring and false if it does not.

Returns

Type Description
boolean True if strings begin with the specified substring and false if they do not.
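
The following sketch contrasts STRSTARTS with its counterpart STRENDS on the same input, using the examples from the SPARQL 1.1 definitions of both functions.

```sparql
SELECT ?startsWithFoo ?endsWithBar ?startsWithBar
WHERE {
  BIND(STRSTARTS("foobar", "foo") AS ?startsWithFoo)   # true
  BIND(STRENDS("foobar", "bar") AS ?endsWithBar)       # true
  BIND(STRSTARTS("foobar", "bar") AS ?startsWithBar)   # false
}
```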

STRUUID

This function returns a string that is the result of generating a Universally Unique Identifier (UUID).

Syntax

STRUUID()

Returns

Type Description
string The UUID.

SUBSTITUTE

This function replaces existing text in a string with the specified new text.

Syntax

SUBSTITUTE(text, old_text, new_text [, instance_num ])
Argument Type Description
text string The string to substitute text in.
old_text string The string within the text to replace.
new_text string The string to replace the old_text with.
instance_num int An optional integer that specifies the number of old_text instances to replace.

Returns

Type Description
string The string with the new text.

SUBSTR

This function returns a substring from a string value.

Syntax

SUBSTR(text, start [, length ])
Argument Type Description
text string The string to find the substring in.
start int An integer that specifies the number of the character in the text that should be the start of the substring.
length int An optional integer that specifies the total number of characters to include in the substring. If not specified, the substring will end at the end of the text value.

Returns

Type Description
string The substring.
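
The following sketch shows the function with and without the optional length argument. It assumes the SPARQL 1.1 behavior, in which character positions are counted from 1.

```sparql
SELECT ?toEnd ?oneChar
WHERE {
  # Start at the fourth character and continue to the end of the text.
  BIND(SUBSTR("foobar", 4) AS ?toEnd)        # "bar"
  # Start at the fourth character and return one character.
  BIND(SUBSTR("foobar", 4, 1) AS ?oneChar)   # "b"
}
```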

TOURI

This function casts a string literal value to a URI.

Syntax

TOURI(text)
Argument Type Description
text string The string literal to cast to a URI.

Returns

Type Description
URI The literal value as a URI.

TRIM

This function removes all spaces from a string except for any single spaces between words.

Syntax

TRIM(text)
Argument Type Description
text string The string to trim.

Returns

Type Description
string The string with spaces removed.

UCASE

This function converts all letters in a string to upper case.

Syntax

UCASE(text)
Argument Type Description
text string The string value to convert to upper case.

Returns

Type Description
string The string with upper case characters.

UPPER

This function converts all letters in a string literal to upper case.

Syntax

UPPER(text)
Argument Type Description
text string The string literal to convert to upper case.

Returns

Type Description
string The string with upper case characters.