String Functions

This topic describes the Anzo functions that operate on string data types.

Typographical Conventions

This documentation uses the following conventions in function syntax:

CAPS: Although SPARQL keywords and function names are case-insensitive, they are written in uppercase for readability.
[ argument ]: Brackets are used to indicate optional arguments. Arguments without brackets are required.

Functions

BUSINESS_ENTITY_EXCLUDER: Removes suffixes that represent business entities.
CONCATENATE: Concatenates two or more strings and returns the result as a string.
CONCATURL: Concatenates two or more strings and returns the result as a URI.
CONTAINS: Evaluates whether the specified string contains the given pattern.
ENCODE_FOR_URI: Encodes the specified string for use in a URI.
ESCAPEHTML: Escapes the specified string for use in HTML.
FIND: Returns the position—from left to right—of a string within another string.
FINDREVERSE: Returns the position—from right to left—of a string within another string.
GROUP_CONCAT: Concatenates a group of strings into a single string.
GROUPCONCAT: Concatenates a group of strings into a single string. This function is a customizable version of GROUP_CONCAT.
LANG: Returns any language tags that are included with strings.
LANGMATCHES: Evaluates whether a string includes a language tag that matches the specified language range.
LCASE: Converts the letters in a string literal to lower case.
LEFT: Returns the specified number of characters starting from the beginning (left side) of the string.
LEN: Calculates the length (number of characters) of a string.
LEVENSHTEIN_DIST: Calculates the Levenshtein distance, a measure of the difference between two strings.
LOWER: Converts all letters in a string to lower case.
MD5: Returns the MD5 checksum of a string as a hexadecimal string.
MID: Returns the specified number of characters from a string, starting from a given position in the string.
REGEX: Evaluates whether a string matches the specified regular expression pattern.
REGEXP_SUBSTR: Searches a string for the specified regular expression pattern and returns the substring that matches the pattern.
REPLACE: Extends the REGEX function to provide the ability to find a pattern in a string and replace it with another pattern.
RIGHT: Returns the specified number of characters starting from the end (right side) of the string.
SEARCH: Uses text search semantics to evaluate whether the specified string matches the given pattern.
SHA1: Calculates the SHA-1 digest of a string value.
SHA224: Calculates the SHA-224 digest of a string value.
SHA256: Calculates the SHA-256 digest of a string value.
SHA384: Calculates the SHA-384 digest of a string value.
SHA512: Calculates the SHA-512 digest of a string value.
STRAFTER: Returns the portion of a string that comes after the specified substring.
STRBEFORE: Returns the portion of a string that comes before the specified substring.
STRDT: Constructs a literal value with the specified data type.
STRENDS: Evaluates whether the specified string ends with the specified substring.
STRLANG: Constructs a literal value with the specified language tag.
STRLEN: Calculates the length of a string.
STRSTARTS: Evaluates whether the specified string starts with the specified substring.
STRUUID: Returns a string that is the result of generating a Universally Unique Identifier (UUID).
SUBSTITUTE: Replaces existing text in a string with the specified new text.
SUBSTR: Returns a substring from a string value.
TOURI: Casts a string to a URI.
TRIM: Removes all spaces from a string except for any single spaces between words.
UCASE: Converts all letters in a string to upper case.
UPPER: Converts the letters in a string literal to upper case.

BUSINESS_ENTITY_EXCLUDER

This function removes from strings the suffixes that represent business entities.

Syntax

BUSINESS_ENTITY_EXCLUDER(text)

Argument	Type	Description
text	string	The string from which you want to remove business entities.

Returns

Type	Description
string	The string without the business entity suffix.

CONCATENATE

This function concatenates two or more strings and returns the result as a string.

Syntax

CONCATENATE(text1, text2 [, textN ])

Argument	Type	Description
text1–N	string	The strings that you want to concatenate to form a single string.

Returns

Type	Description
string	The concatenated string.

CONCATURL

This function concatenates two or more strings and returns the result as a URI.

Syntax

CONCATURL(text1, text2 [, textN ])

Argument	Type	Description
text1–N	string	The strings that you want to concatenate to form a URI.

Returns

Type	Description
URI	The concatenated string as a URI.

CONTAINS

This function evaluates whether the specified strings contain the given pattern. Results are grouped under "true" or "false."

Syntax

CONTAINS(text, pattern)

Argument	Type	Description
text	string	The string value that you want to check against the specified pattern.
pattern	string	The string pattern that you want to look for in the supplied text.

Returns

Type	Description
boolean	`True` if the strings contain the pattern and `false` if they do not.
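
Example

The following is a minimal sketch rather than a definitive usage pattern. It assumes that CONTAINS behaves like the standard SPARQL 1.1 function of the same name, and the sample names supplied with VALUES are hypothetical.

```sparql
# Hypothetical sample data supplied inline with VALUES.
SELECT ?name (CONTAINS(?name, "Corp") AS ?hasCorp)
WHERE {
  VALUES ?name { "Acme Corp" "Globex" }
}
```

Under standard SPARQL semantics the comparison is case-sensitive, so `?hasCorp` is true for "Acme Corp" and false for "Globex".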

ENCODE_FOR_URI

This function encodes the specified string for use in a URI by escaping reserved characters, and returns the encoded string.

Syntax

ENCODE_FOR_URI(text)

Argument	Type	Description
text	string	The string value to encode as a URI.

Returns

Type	Description
string	The string as a URI.
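
Example

As an illustration, and assuming the function follows the standard SPARQL 1.1 definition of ENCODE_FOR_URI, the query below encodes a hypothetical value that contains a space.

```sparql
# The input string is a hypothetical sample value.
SELECT (ENCODE_FOR_URI("Los Angeles") AS ?encoded)
WHERE { }
```

In standard SPARQL the space is percent-encoded, so `?encoded` is "Los%20Angeles".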

ESCAPEHTML

This function escapes the specified string for use in HTML.

Syntax

ESCAPEHTML(text)

Argument	Type	Description
text	string	The string value to escape for HTML.

Returns

Type	Description
string	The string escaped for HTML.

FIND

This function returns the position—from left to right—of a string within another string.

You can use FINDREVERSE to find the character or substring position from right to left.

Syntax

FIND(find_text, within_text, start_num)

Argument	Type	Description
find_text	string	The string to look for in the `within_text`.
within_text	string	The string to search within.
start_num	int	An integer that indicates the position to start from when looking for the `find_text`. The starting position is at the beginning of the `within_text` value and characters are counted from left to right.

Returns

Type	Description
int	The character position (from left to right) where the substring starts.

FINDREVERSE

Similar to FIND, this function returns the position—from right to left—of a string within another string.

Syntax

FINDREVERSE(find_text, within_text, start_num)

Argument	Type	Description
find_text	string	The string to look for in the `within_text` value.
within_text	string	The string to search within.
start_num	int	An integer that indicates the position to start from when looking for the `find_text`. The starting position is the end of the `within_text` value and characters are counted from right to left.

Returns

Type	Description
int	The character position (from right to left) where the substring starts.

GROUP_CONCAT

This function concatenates a group of strings into a single string. It is a simplified version of GROUPCONCAT as it takes only one argument.

Syntax

GROUP_CONCAT(text)

Argument	Type	Description
text	string	The string property whose values to concatenate into a single string.

Returns

Type	Description
string	The concatenated string.
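
Example

The sketch below shows the single-argument form used as an aggregate with GROUP BY. It assumes the function behaves like the standard SPARQL 1.1 GROUP_CONCAT aggregate; the department and employee names are hypothetical sample data.

```sparql
# Hypothetical (department, name) pairs supplied inline.
SELECT ?dept (GROUP_CONCAT(?name) AS ?names)
WHERE {
  VALUES (?dept ?name) {
    ("Sales" "Ann")
    ("Sales" "Bob")
    ("Ops"   "Cy")
  }
}
GROUP BY ?dept
```

Each department yields one row. In standard SPARQL the values are joined with a single space and their order is not guaranteed; Anzo's default separator may differ.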

GROUPCONCAT

This function concatenates a group of strings into a single string. Unlike GROUP_CONCAT, this function allows for customization of the separator to use as well as the configuration of limits and options like prefixes and suffixes.

Syntax

GROUPCONCAT(group1, [ group2, ..., groupN, ] group_value_separator, separator, serialize,
            row_limit, value_limit, delimit_blanks [, prefix ] [, suffix ] [, max_length ])

Argument	Type	Description
group1–N	string	The group(s) of strings to concatenate.
group_value_separator	string	The separator string to use between the groups of strings if you specified more than one `group`.
separator	string	The separator string to use between the values in a concatenated group of strings.
serialize	boolean	A boolean value that indicates whether returned values should be serialized with the value's data type.
row_limit	int	An integer that puts a maximum limit on the number of rows to retrieve for a group.
value_limit	int	An integer that puts a maximum limit on the number of values to retrieve from a group of rows.
delimit_blanks	boolean	A boolean value that indicates whether to delimit blanks with the `separator` value.
prefix	string	Optional string to add as a prefix to the resulting string.
suffix	string	Optional string to add as a suffix to the resulting string.
max_length	int	Optional integer that puts a maximum limit on the number of characters the resulting string can have.

Returns

Type	Description
string	The concatenated string.

LANG

This function returns any language tags that are included in the string. The results are grouped by each language tag or by "blank" if a value does not have a language tag.

Syntax

LANG(text)

Argument	Type	Description
text	string	The string to search for language tags.

Returns

Type	Description
string	The found language tags.
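
Example

A short sketch, assuming standard SPARQL 1.1 behavior for LANG. The three labels are hypothetical; the last one has no language tag.

```sparql
# Two language-tagged literals and one plain literal.
SELECT ?label (LANG(?label) AS ?tag)
WHERE {
  VALUES ?label { "chat"@fr "cat"@en "Katze" }
}
```

Under those semantics `?tag` is "fr", "en", and the empty string, respectively.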

LANGMATCHES

This function tests whether a string includes a language tag that matches the specified language range.

Syntax

LANGMATCHES(text, language_range)

Argument	Type	Description
text	string	The string to evaluate.
language_range	string	The language tag to match in the `text`.

Example

LANGMATCHES(LANG(?prop),"en")

Returns

Type	Description
boolean	`True` if strings include a language tag that matches the range and `false` if they do not.

LCASE

This function converts the letters in a string literal to lower case.

Syntax

LCASE(text)

Argument	Type	Description
text	string	The string literal to convert to lower case.

Returns

Type	Description
string	The string with lower case letters.
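
Example

A minimal sketch that normalizes case before comparing values, assuming LCASE follows the standard SPARQL 1.1 definition. The sample names are hypothetical.

```sparql
# Keep only the names that equal "acme" once lowercased.
SELECT ?name
WHERE {
  VALUES ?name { "ACME" "Acme" "Globex" }
  FILTER(LCASE(?name) = "acme")
}
```

Both "ACME" and "Acme" pass the filter because each lowercases to "acme".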

LEFT

This function returns the specified number of characters starting from the beginning (left side) of the string.

Syntax

LEFT(text, num_chars)

Argument	Type	Description
text	string	The string from which to return the specified number of characters.
num_chars	int	An integer that specifies the number of characters to return, starting from the left side of the `text`.

Returns

Type	Description
string	The specified number of characters from the string.

LEN

This function calculates the length (number of characters) of a string.

Syntax

LEN(text)

Argument	Type	Description
text	string	The string for which to calculate the length.

Returns

Type	Description
int	The number of characters in the string.

LEVENSHTEIN_DIST

This function calculates the Levenshtein distance, a measure of the difference between two strings. The distance is the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform the first string into the second string.

Syntax

LEVENSHTEIN_DIST(text1, text2)

Argument	Type	Description
text1	string	The string that would be transformed into `text2`.
text2	string	The string to measure `text1` against.

Returns

Type	Description
int	The Levenshtein distance between the strings.
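
Example

The sketch below calls the function as written in the Syntax line. Invoking it by its bare name, without a prefix, is an assumption, and the two words are hypothetical inputs.

```sparql
# "kitten" -> "sitten" -> "sittin" -> "sitting" takes three edits.
SELECT (LEVENSHTEIN_DIST("kitten", "sitting") AS ?distance)
WHERE { }
```

The Levenshtein distance between these two words is 3: two substitutions and one insertion.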

LOWER

This function converts all letters in a string to lower case.

Syntax

LOWER(text)

Argument	Type	Description
text	string	The string to convert to lower case.

Returns

Type	Description
string	The string with lower case letters.

MD5

This function returns the MD5 checksum of a string as a hexadecimal string.

Syntax

MD5(text)

Argument	Type	Description
text	string	The string for which to return the MD5 checksum.

Returns

Type	Description
string	The hexadecimal string.
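
Example

As a quick check, and assuming the function matches the standard SPARQL 1.1 MD5 function, the query below hashes the short test string "abc".

```sparql
# "abc" is a conventional test input for hash functions.
SELECT (MD5("abc") AS ?checksum)
WHERE { }
```

The MD5 checksum of "abc" is "900150983cd24fb0d6963f7d28e17f72".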

MID

This function returns the specified number of characters from a string, starting from a given position in the string.

Syntax

MID(text, start_num, num_chars)

Argument	Type	Description
text	string	The string from which to return the specified characters.
start_num	int	An integer that indicates the starting position in the string.
num_chars	int	An integer that specifies the number of characters to return, starting with the character indicated by `start_num`.

Returns

Type	Description
string	The specified number of characters from the string.

REGEX

This function tests whether a string matches the specified regular expression pattern.

Syntax

REGEX(text, pattern [, flags ])

Argument	Type	Description
text	string	The string to test against the `pattern`.
pattern	string	The regular expression pattern to look for in the `text`. For information about the supported regular expression syntax, see the Regular Expression Syntax section of the W3C XQuery 1.0 and XPath 2.0 Functions and Operators specification.
flags	string	You can include one or more optional modifier flags that further define the pattern. For information about flags, see the Flags section of the W3C Functions and Operators specification.

Returns

Type	Description
boolean	`True` if the string matches the regular expression pattern and `false` if it does not.
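
Example

A minimal sketch of a case-insensitive match using the `i` flag, assuming standard SPARQL 1.1 REGEX semantics. The sample names are hypothetical.

```sparql
# Keep names that start with "acme", ignoring case.
SELECT ?name
WHERE {
  VALUES ?name { "Acme Corp" "ACME Ltd" "Globex" }
  FILTER(REGEX(?name, "^acme", "i"))
}
```

"Acme Corp" and "ACME Ltd" match; "Globex" does not.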

REGEXP_SUBSTR

This function searches a string for the specified regular expression pattern and returns the substring that matches the pattern.

Syntax

REGEXP_SUBSTR(text, pattern [, start_position ] [, nth_appearance ])

Argument	Type	Description
text	string	The string to test against the `pattern`.
pattern	string	The regular expression pattern to look for in the `text`. For information about the supported regular expression syntax, see the Regular Expression Syntax section of the W3C XQuery 1.0 and XPath 2.0 Functions and Operators specification.
start_position	int	An optional integer that specifies the number of characters from the beginning of the string to start searching for matches (the default value is 1).
nth_appearance	int	An optional integer that specifies which occurrence of the pattern to match (the default value is 1).

Returns

Type	Description
string	The substring that matches the regular expression pattern.

REPLACE

This function extends the REGEX function to provide the ability to find a pattern in a string and replace it with another pattern. The function returns the replaced string.

Syntax

REPLACE(text, pattern, replacement_pattern [, flags ])

Argument	Type	Description
text	string	The string to test against the `pattern`.
pattern	string	The regular expression pattern to look for in the `text`. For information about the supported regular expression syntax, see the Regular Expression Syntax section of the W3C XQuery 1.0 and XPath 2.0 Functions and Operators specification.
replacement_pattern	string	The pattern to replace the `pattern` with.
flags	string	You can include one or more optional modifier flags that further define the pattern. For information about flags, see the Flags section of the W3C Functions and Operators specification.

Returns

Type	Description
string	The string that contains the replacement pattern.
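
Example

The following sketch shows a literal replacement and a case-insensitive replacement. It assumes that REPLACE follows the standard SPARQL 1.1 definition; the input strings are hypothetical.

```sparql
# First column: replace every hyphen with a slash.
# Second column: replace "b" regardless of case by using the "i" flag.
SELECT (REPLACE("2024-01-15", "-", "/") AS ?slashes)
       (REPLACE("abab", "B", "Z", "i") AS ?caseInsensitive)
WHERE { }
```

Under standard semantics all matches are replaced, so the results are "2024/01/15" and "aZaZ".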

RIGHT

This function returns the specified number of characters starting from the end (right side) of the string.

Syntax

RIGHT(text, num_chars)

Argument	Type	Description
text	string	The string from which to return the specified number of characters.
num_chars	int	An integer that specifies the number of characters to return, starting from the right side of the `text`.

Returns

Type	Description
string	The specified characters from the string.

SEARCH

This function uses text search semantics to evaluate whether the specified string matches the given pattern.

Syntax

SEARCH(text, pattern [, required ] [, wildcard ] [, escape ])

Argument	Type	Description
text	string	The string to search.
pattern	string	The search string to look for in the `text`. Anzo automatically converts the value to a regular expression pattern that uses text search semantics.
required	boolean	An optional boolean value that indicates whether the `text` must include all elements of the search pattern to qualify as a match or whether matching just part of the pattern qualifies as a match.
wildcard	boolean	An optional boolean value that indicates whether or not to add the wildcard character `*` to the end of the search pattern.
escape	boolean	An optional boolean value that indicates whether or not to escape all of the special characters (such as `+`, `-`, or `|`) in the `text`.

Returns

Type	Description
boolean	`True` if strings match the pattern and `false` if they do not.

SHA1

This function calculates the SHA-1 digest of a string.

Syntax

SHA1(text)

Argument	Type	Description
text	string	The string for which to calculate the SHA-1 digest.

Returns

Type	Description
string	The SHA-1 digest.

SHA224

This function calculates the SHA-224 digest of a string.

Syntax

SHA224(text)

Argument	Type	Description
text	string	The string for which to calculate the SHA-224 digest.

Returns

Type	Description
string	The SHA-224 digest.

SHA256

This function calculates the SHA-256 digest of a string.

Syntax

SHA256(text)

Argument	Type	Description
text	string	The string for which to calculate the SHA-256 digest.

Returns

Type	Description
string	The SHA-256 digest.

SHA384

This function calculates the SHA-384 digest of a string.

Syntax

SHA384(text)

Argument	Type	Description
text	string	The string for which to calculate the SHA-384 digest.

Returns

Type	Description
string	The SHA-384 digest.

SHA512

This function calculates the SHA-512 digest of a string.

Syntax

SHA512(text)

Argument	Type	Description
text	string	The string for which to calculate the SHA-512 digest.

Returns

Type	Description
string	The SHA-512 digest.
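
Example

The SHA functions share the same single-argument form. As a sketch, assuming they match the standard SPARQL 1.1 hash functions, the query below computes two digests of the test string "abc".

```sparql
# Digests are returned as lowercase hexadecimal strings in standard SPARQL.
SELECT (SHA1("abc") AS ?sha1)
       (SHA256("abc") AS ?sha256)
WHERE { }
```

The SHA-1 digest of "abc" is "a9993e364706816aba3e25717850c26c9cd0d89d", and the SHA-256 digest is "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad".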

STRAFTER

This function returns the portion of a string that comes after the specified substring.

Syntax

STRAFTER(text, substring)

Argument	Type	Description
text	string	The string from which to return the characters that follow the `substring`.
substring	string	The string to match in the `text`. The function will return the part of the text that comes after this substring.

Returns

Type	Description
string	The part of the string that comes after the substring.

STRBEFORE

This function returns the portion of a string that comes before the specified substring.

Syntax

STRBEFORE(text, substring)

Argument	Type	Description
text	string	The string from which to return the characters that precede the `substring`.
substring	string	The string to match in the `text`. The function will return the part of the text that comes before this substring.

Returns

Type	Description
string	The part of the string that comes before the substring.
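
Example

STRBEFORE and STRAFTER are often used together to split a value at a delimiter. The sketch below assumes standard SPARQL 1.1 semantics for both functions; the email address is a hypothetical sample value.

```sparql
# Split a hypothetical email address at the "@" character.
SELECT (STRBEFORE(?email, "@") AS ?user)
       (STRAFTER(?email, "@") AS ?domain)
WHERE {
  VALUES ?email { "jane@example.com" }
}
```

Here `?user` is "jane" and `?domain` is "example.com". In standard SPARQL, both functions return an empty string when the substring is not found.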

STRDT

This function constructs a literal value with the specified data type.

Syntax

STRDT(text, datatype)

Argument	Type	Description
text	string	The string to add a data type specification to.
datatype	URI	The data type URI to add to the `text`. For example, `xsd:integer` or `<http://www.w3.org/2001/XMLSchema#integer>`.

Returns

Type	Description
string	The typed literal value.
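
Example

A minimal sketch, assuming STRDT follows the standard SPARQL 1.1 definition. The `xsd` prefix declaration and the input value are supplied for illustration.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Attach the xsd:integer data type to a plain string.
SELECT (STRDT("123", xsd:integer) AS ?typed)
WHERE { }
```

The result is the typed literal `"123"^^xsd:integer`.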

STRENDS

This function evaluates whether the specified string ends with the specified substring.

Syntax

STRENDS(text, substring)

Argument	Type	Description
text	string	The string to search for the `substring`.
substring	string	The string to match at the end of `text`. The function returns true if the text ends in the specified substring and false if it does not.

Returns

Type	Description
boolean	`True` if strings end with the specified substring and `false` if they do not.

STRLANG

This function constructs a literal value with the specified language tag.

Syntax

STRLANG(text, language_tag)

Argument	Type	Description
text	string	The string to add the language tag to.
language_tag	string	The language tag to add to the `text`.

Returns

Type	Description
string	The literal value with the language tag.
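
Example

As an illustration, and assuming standard SPARQL 1.1 behavior, the query below adds a French language tag to a hypothetical plain string.

```sparql
# Produce a language-tagged literal from a plain string.
SELECT (STRLANG("chat", "fr") AS ?tagged)
WHERE { }
```

The result is the literal `"chat"@fr`.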

STRLEN

This function calculates the length (in characters) of a string value.

Syntax

STRLEN(text)

Argument	Type	Description
text	string	The string for which to return the length.

Returns

Type	Description
long	The number of characters in the string.

STRSTARTS

This function evaluates whether the specified string starts with the specified substring.

Syntax

STRSTARTS(text, substring)

Argument	Type	Description
text	string	The string to search for the `substring`.
substring	string	The string to match at the beginning of `text`. The function returns true if the text starts with the specified substring and false if it does not.

Returns

Type	Description
boolean	`True` if strings begin with the specified substring and `false` if they do not.
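
Example

The sketch below contrasts STRSTARTS with STRENDS on the same values, assuming both follow the standard SPARQL 1.1 definitions. The file names are hypothetical.

```sparql
# Test each hypothetical file name for a prefix and for a suffix.
SELECT ?file
       (STRSTARTS(?file, "report") AS ?isReport)
       (STRENDS(?file, ".csv") AS ?isCsv)
WHERE {
  VALUES ?file { "report_2024.csv" "notes.txt" }
}
```

For "report_2024.csv" both columns are true; for "notes.txt" both are false.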

STRUUID

This function returns a string that is the result of generating a Universally Unique Identifier (UUID).

Syntax

STRUUID()

Returns

Type	Description
string	The UUID.
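
Example

A minimal sketch that assigns a generated identifier to each row, assuming STRUUID behaves like the standard SPARQL 1.1 function. The sample names are hypothetical.

```sparql
# Bind a newly generated UUID string to every solution.
SELECT ?name ?id
WHERE {
  VALUES ?name { "Acme Corp" "Globex" }
  BIND(STRUUID() AS ?id)
}
```

In standard SPARQL each call produces a different value, so the two rows receive different identifiers.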

SUBSTITUTE

This function replaces existing text in a string with the specified new text.

Syntax

SUBSTITUTE(text, old_text, new_text [, instance_num ])

Argument	Type	Description
text	string	The string to substitute text in.
old_text	string	The string within the `text` to replace.
new_text	string	The string to replace the `old_text` with.
instance_num	int	An optional integer that specifies the number of `old_text` instances to replace.

Returns

Type	Description
string	The string with the new text.

SUBSTR

This function returns a substring from a string value.

Syntax

SUBSTR(text, start [, length ])

Argument	Type	Description
text	string	The string to find the substring in.
start	int	An integer that specifies the number of the character in the `text` that should be the start of the substring.
length	int	An optional integer that specifies the total number of characters to include in the substring. If not specified, the substring will end at the end of the `text` value.

Returns

Type	Description
string	The substring.
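
Example

The following sketch shows the function with and without the optional `length` argument. It assumes standard SPARQL 1.1 semantics, in which the first character of a string is position 1; the input string is hypothetical.

```sparql
# Start at the fourth character; the second call limits the result to one character.
SELECT (SUBSTR("foobar", 4) AS ?toEnd)
       (SUBSTR("foobar", 4, 1) AS ?oneChar)
WHERE { }
```

Under those semantics `?toEnd` is "bar" and `?oneChar` is "b".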

TOURI

This function casts a string literal value to a URI.

Syntax

TOURI(text)

Argument	Type	Description
text	string	The string literal to cast to a URI.

Returns

Type	Description
URI	The literal value as a URI.

TRIM

This function removes all spaces from a string except for any single spaces between words.

Syntax

TRIM(text)

Argument	Type	Description
text	string	The string to trim.

Returns

Type	Description
string	The string with spaces removed.

UCASE

This function converts all letters in a string to upper case.

Syntax

UCASE(text)

Argument	Type	Description
text	string	The string value to convert to upper case.

Returns

Type	Description
string	The string with upper case characters.

UPPER

This function converts all letters in a string literal to upper case.

Syntax

UPPER(text)

Argument	Type	Description
text	string	The string literal to convert to upper case.

Returns

Type	Description
string	The string with upper case characters.