Annotator Settings Reference

When you edit an existing annotator, additional options become available for refining the annotation criteria or customizing the generated model or instance data. This topic describes the advanced settings that are available when editing each type of annotator.

External Service Annotator

The table below defines the settings that are displayed when an External Service Annotator is edited.

Setting Description
Title Required field that specifies the unique name for the annotator.
Description Optional field that provides a description of the annotator.
HTTP Request Config Required field that specifies the HTTP source object that contains the URL and method to use when sending data for annotations.
Document ID Response Path Required field that specifies where to find the document ID in the response.
Entity Name Path Required field that specifies the annotation object name path.
Entity Class Path Required field that specifies the base class URI for an annotation.
Result Path Root The path to the object that contains the annotation results.
Store NLP Service Response Controls whether the service's response is stored in the binary store.
Result Field Path The external NLP-specific result configuration for returned entities.
Socket Timeout Specifies the socket timeout (in milliseconds) to use for requests against the source.
Entity Snippet Path The snippet path for entities returned in the service response.
Entity End Offset Path The end text offset location in the document for entities returned in the service response.
Entity Begin Offset Path The start text offset location in the document for entities returned in the service response.
Entity Span Path The text offset location in the document for entities returned in the service response.
Entity Text Path The text path for entities returned in the service response.
Entity ID Path The ID path for entities returned in the service response.
Document ID Request Field The Document ID parameter for the external service.
Class Name Property Specifies an annotation property whose value you want to map to the name of the class. For example, if a Category property has the value Disease and you want the name of the class to be "Disease," add Category to this field. When Class Name Property is not defined, the class name is auto-generated.
Unintended Property Names A list of any property names to filter out. Type a name in the field and then click Add to add the value.
Unintended Classes A list of any classes to filter out. Type a class in the field and then click Add to add the value.
Unintended Instances A list of any entities or instances of the class to filter out. Type an instance in the field and then click Add to add the value.
Is Combine Annotation Instances Controls whether to combine multiple instances of an extraction into one annotation.
Create Additional General Annotation Type Controls whether the annotator creates a general shared annotation type in addition to the specific annotation types that are created.
Explicit Property Datatypes/Objecttypes A list of keys that map property names to a particular object property type.
Output the Detections Controls whether to include specific detections as an annotation property.
Unintended Property Values A list of any property values to filter out. Type a value in the field and then click Add to add the value.
Entity URI Property Specifies an annotation property whose value you want to map to the URIs for instances of the class. For example, if a Disease_ID property has the value http://example.com/Asthma and you want to use http://example.com/Asthma as the base URI for instances of the class, add Disease_ID to this field. When Entity URI Property is not defined, the URI is auto-generated based on the name.
Entity Name Property Specifies an annotation property whose value you want to map to the names for instances of the class. For example, if a Preferred_Label property includes disease names and you want to use those label values as the names for instances of the Disease class, add Preferred_Label to this field. When Entity Name Property is not defined, the name is auto-generated.
Domain Object Base Class URI If creating a general annotation type (Create Additional General Annotation Type is enabled), this setting specifies the class to use as the base type for the annotator's domain objects.
Class URI Property Specifies an annotation property whose value you want to map to the class URI in the model. For example, if a Category_ID property has the value http://example.com/Disease and you want to use http://example.com/Disease as the base class URI, add Category_ID to this field. When Class URI Property is not defined, the URI is auto-generated based on the name.
Is Error Fatal Controls whether to fail the pipeline if this annotator fails to create annotations.
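
The path settings in the table above all point into the external service's response. As a rough illustration of how they relate to each other, the sketch below walks a made-up response with dot-separated paths. The response shape, the field names, the `resolve_path` and `extract_entities` helpers, and the dot-separated path syntax are all assumptions for this example; the product may use a different path syntax.

```python
from typing import Any, Dict, List


def resolve_path(obj: Any, path: str) -> Any:
    """Follow a dot-separated path through nested dicts and lists."""
    current = obj
    for key in path.split("."):
        if isinstance(current, list):
            current = current[int(key)]  # numeric segment indexes into a list
        else:
            current = current[key]
    return current


# Hypothetical service response; every field name here is invented.
response = {
    "docId": "doc-001",
    "results": {
        "entities": [
            {"id": "e1", "text": "asthma", "begin": 27, "end": 33},
            {"id": "e2", "text": "ibuprofen", "begin": 52, "end": 61},
        ]
    },
}

# Hypothetical values for some of the settings described in the table.
settings = {
    "Document ID Response Path": "docId",
    "Result Path Root": "results.entities",
    "Entity ID Path": "id",
    "Entity Text Path": "text",
    "Entity Begin Offset Path": "begin",
    "Entity End Offset Path": "end",
}


def extract_entities(resp: Dict[str, Any], cfg: Dict[str, str]) -> List[Dict[str, Any]]:
    """Read the document ID once, then resolve each entity path relative to a result item."""
    doc_id = resolve_path(resp, cfg["Document ID Response Path"])
    items = resolve_path(resp, cfg["Result Path Root"])
    return [
        {
            "document": doc_id,
            "id": resolve_path(item, cfg["Entity ID Path"]),
            "text": resolve_path(item, cfg["Entity Text Path"]),
            "begin": resolve_path(item, cfg["Entity Begin Offset Path"]),
            "end": resolve_path(item, cfg["Entity End Offset Path"]),
        }
        for item in items
    ]


entities = extract_entities(response, settings)
```

In this sketch the entity paths are resolved relative to each item found under Result Path Root, while the document ID path is resolved from the top of the response. Whether the service behaves the same way should be confirmed against an actual response.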

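The class and entity mapping settings and the Unintended lists can be pictured as a small mapping-and-filtering step applied to each annotation's properties. The sketch below reuses the property names from the examples in the table (Category, Category_ID, Preferred_Label, Disease_ID); the sample annotations, the `map_annotation` helper, and the exact-match filtering rule are assumptions for illustration only.

```python
from typing import Dict, List, Optional

# Hypothetical annotation properties, one dict per annotation.
annotations: List[Dict[str, str]] = [
    {
        "Category": "Disease",
        "Category_ID": "http://example.com/Disease",
        "Preferred_Label": "Asthma",
        "Disease_ID": "http://example.com/Asthma",
    },
    {
        "Category": "Disease",
        "Category_ID": "http://example.com/Disease",
        "Preferred_Label": "Diabetes",
        "Disease_ID": "http://example.com/Diabetes",
    },
    {"Category": "Drug", "Preferred_Label": "Ibuprofen"},
]

# Hypothetical setting values.
config = {
    "Class Name Property": "Category",
    "Class URI Property": "Category_ID",
    "Entity Name Property": "Preferred_Label",
    "Entity URI Property": "Disease_ID",
    "Unintended Classes": ["Drug"],
    "Unintended Instances": ["Diabetes"],
}


def map_annotation(props: Dict[str, str], cfg: dict) -> Optional[Dict[str, Optional[str]]]:
    """Map properties to class and instance fields, or return None if filtered out."""
    class_name = props.get(cfg["Class Name Property"])
    entity_name = props.get(cfg["Entity Name Property"])
    # Assumption: filters compare names by exact equality.
    if class_name in cfg["Unintended Classes"]:
        return None
    if entity_name in cfg["Unintended Instances"]:
        return None
    return {
        "class_name": class_name,
        "class_uri": props.get(cfg["Class URI Property"]),
        "entity_name": entity_name,
        "entity_uri": props.get(cfg["Entity URI Property"]),
    }


mapped = [m for m in (map_annotation(a, config) for a in annotations) if m is not None]
```

A value of None in the mapped result stands for the cases where, according to the table, the name or URI would be auto-generated. The generation scheme is not described here, so the sketch does not attempt to reproduce it.
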
Keyword and Phrase Annotator

The table below defines the settings that are displayed when a Keyword and Phrase Annotator is edited.

Setting Description
Only Consider Text Controls whether to find phrases in a simplified, text-only form of the document. Enabling this setting can help with documents such as rich HTML files, but it is less suitable for documents that contain multibyte characters.
Require Nonstandard Word Boundaries Indicates whether the specified phrase can be matched with or without surrounding character breaks (for example, in Chinese) or with word boundaries that differ from the regex standard (for example, in Tagalog).
Title Required field that specifies the unique name for the annotator.
Description Optional field that provides a description of the annotator.
Phrase Required field that specifies the terms or phrases to annotate. Type a word or phrase in the field and then click Add to add the phrase. You can add any number of phrases.
Unintended Property Names A list of any property names to filter out. Type a name in the field and then click Add to add the value.
Create Additional General Annotation Type Controls whether the annotator creates a general shared annotation type in addition to the specific annotation types that are created.
Entity URI Property Specifies an annotation property whose value you want to map to the URIs for instances of the class. For example, if a Disease_ID property has the value http://example.com/Asthma and you want to use http://example.com/Asthma as the base URI for instances of the class, add Disease_ID to this field. When Entity URI Property is not defined, the URI is auto-generated based on the name.
Explicit Property Datatypes/Objecttypes A list of keys that map property names to a particular object property type.
Entity Name Property Specifies an annotation property whose value you want to map to the names for instances of the class. For example, if a Preferred_Label property includes disease names and you want to use those label values as the names for instances of the Disease class, add Preferred_Label to this field. When Entity Name Property is not defined, the name is auto-generated.
Domain Object Base Class URI If creating a general annotation type (Create Additional General Annotation Type is enabled), this setting specifies the class to use as the base type for the annotator's domain objects.
Unintended Classes A list of any classes to filter out. Type a class in the field and then click Add to add the value.
Class Name Property Specifies an annotation property whose value you want to map to the name of the class. For example, if a Category property has the value Disease and you want the name of the class to be "Disease," add Category to this field. When Class Name Property is not defined, the class name is auto-generated.
Unintended Instances A list of any entities or instances of the class to filter out. Type an instance in the field and then click Add to add the value.
Is Combine Annotation Instances Controls whether to combine multiple instances of an extraction into one annotation.
Class URI Property Specifies an annotation property whose value you want to map to the class URI in the model. For example, if a Category_ID property has the value http://example.com/Disease and you want to use http://example.com/Disease as the base class URI, add Category_ID to this field. When Class URI Property is not defined, the URI is auto-generated based on the name.
Output the Detections Controls whether to include specific detections as an annotation property.
Unintended Property Values A list of any property values to filter out. Type a value in the field and then click Add to add the value.
Is Error Fatal Controls whether to fail the pipeline if this annotator fails to create annotations.
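
To see why word boundaries matter for phrase matching, the sketch below compares a search that requires standard regex word boundaries with one that does not. It uses Python's `re` module purely as an illustration; the `find_phrase` helper and the sample text are invented, and the example does not claim to show how the annotator implements Require Nonstandard Word Boundaries.

```python
import re
from typing import List, Tuple


def find_phrase(phrase: str, text: str, require_boundaries: bool) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every occurrence of phrase in text."""
    pattern = re.escape(phrase)  # treat the phrase literally
    if require_boundaries:
        pattern = r"\b" + pattern + r"\b"  # standard regex word boundaries
    return [m.span() for m in re.finditer(pattern, text)]


english = "The cat sat; concatenate."
with_boundaries = find_phrase("cat", english, True)
without_boundaries = find_phrase("cat", english, False)

# Chinese text has no spaces between words, so standard boundaries
# find nothing even though the phrase is present.
chinese = "我在北京工作"
chinese_with = find_phrase("北京", chinese, True)
chinese_without = find_phrase("北京", chinese, False)
```

With boundaries required, only the standalone word "cat" is found in the English sentence; without them, the "cat" inside "concatenate" is found as well. For the Chinese sentence the result flips: the boundary-based search returns nothing, which is the kind of case that calls for nonstandard boundary handling.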

Knowledgebase Annotator

The table below defines the settings that are displayed when a Knowledgebase Annotator is edited.

Setting Description
Title Required field that specifies the unique name for the annotator.
Description Optional field that provides a description of the annotator.
Backing Graphmart Optional field that specifies the graphmart or graphmarts to annotate.
Backing Layer Optional field that specifies the data layer or layers to annotate.

The Backing Layer and Backing Graphmart fields are treated independently. Layers that you select do not have to be part of the graphmart that you specify in Backing Graphmart, and specifying a layer does not require you to select a Backing Graphmart. However, any layers or graphmarts that you select must contain classes and properties from the Backing Ontology, or the data will not be annotated.

Backing Ontology Required field that specifies the model for the backing data layers and/or graphmart.
Term Class Required field that specifies the class of data for the annotations.
Term Label Property Required field that lists the primary name or label property of the resources.
Term Identifying Properties Required field that specifies the properties that contain names, aliases, or other identifiers to use for identifying the resources.
Backing Dataset Optional field that specifies the dataset or datasets to annotate.
Case Sensitive Controls whether matches must be case-sensitive.
Invalidating Properties A list of any properties for which you do not want to find matching resources.
Discard Matches Of Common Words Controls whether to discard matches of the most common words.
Discard Matches of Substrings A list of the substrings for which you want matches to be discarded. Type a string in the field and then click Add to add the value.
Text Search Query Pattern Precedence When text search query properties are specified, this setting controls whether resource names or aliases are included as matches. When enabled, resource names and aliases will not be matched.
Lucene Pattern Properties A list of properties that contain Lucene query syntax for document categorization.
Approximate Label Properties A list of properties that contain phrases that may be matched approximately (that is, fault-tolerantly), so that slightly different spellings or misspellings still count as matches.
Simplified Regex Pattern Properties A list of properties that contain simplified regular expressions.
Regex Pattern Properties A list of properties that contain regular expressions.
Strip Characters for Match Characters to strip out before determining if there is a match.
Clear Caches Controls whether to clear any existing caches when the pipeline is run.
Rows Per Query The maximum number of rows to query at a time when paging through the knowledgebase.
Minimum Hit Length The minimum span length that can count as a match.
Domain Object Base Class URI If creating a general annotation type (Create Additional General Annotation Type is enabled), this setting specifies the class to use as the base type for the annotator's domain objects.
Class URI Property Specifies an annotation property whose value you want to map to the class URI in the model. For example, if a Category_ID property has the value http://example.com/Disease and you want to use http://example.com/Disease as the base class URI, add Category_ID to this field. When Class URI Property is not defined, the URI is auto-generated based on the name.
Entity URI Property Specifies an annotation property whose value you want to map to the URIs for instances of the class. For example, if a Disease_ID property has the value http://example.com/Asthma and you want to use http://example.com/Asthma as the base URI for instances of the class, add Disease_ID to this field. When Entity URI Property is not defined, the URI is auto-generated based on the name.
Is Combine Annotation Instances Controls whether to combine multiple instances of an extraction into one annotation.
Unintended Instances A list of any entities or instances of the class to filter out. Type an instance in the field and then click Add to add the value.
Explicit Property Datatypes/Objecttypes A list of keys that map property names to a particular object property type.
Unintended Classes A list of any classes to filter out. Type a class in the field and then click Add to add the value.
Unintended Property Names A list of any property names to filter out. Type a name in the field and then click Add to add the value.
Create Additional General Annotation Type Controls whether the annotator creates a general shared annotation type in addition to the specific annotation types that are created.
Output the Detections Controls whether to include specific detections as an annotation property.
Unintended Property Values A list of any property values to filter out. Type a value in the field and then click Add to add the value.
Entity Name Property Specifies an annotation property whose value you want to map to the names for instances of the class. For example, if a Preferred_Label property includes disease names and you want to use those label values as the names for instances of the Disease class, add Preferred_Label to this field. When Entity Name Property is not defined, the name is auto-generated.
Class Name Property Specifies an annotation property whose value you want to map to the name of the class. For example, if a Category property has the value Disease and you want the name of the class to be "Disease," add Category to this field. When Class Name Property is not defined, the class name is auto-generated.
Is Error Fatal Controls whether to fail the pipeline if this annotator fails to create annotations.
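
The Case Sensitive, Strip Characters for Match, and Minimum Hit Length settings can be thought of as steps applied before a span of text is compared with a knowledgebase label. The following sketch shows one way those steps could combine. The `normalize` and `is_match` helpers are hypothetical, and measuring the minimum length on the span before stripping is an assumption rather than documented behavior.

```python
def normalize(text: str, case_sensitive: bool, strip_chars: str) -> str:
    """Drop the characters to strip, then lowercase unless matching is case-sensitive."""
    kept = "".join(ch for ch in text if ch not in strip_chars)
    return kept if case_sensitive else kept.lower()


def is_match(
    span_text: str,
    label: str,
    *,
    case_sensitive: bool = False,
    strip_chars: str = "",
    min_hit_length: int = 1,
) -> bool:
    """Decide whether a span of document text matches a knowledgebase label."""
    # Assumption: the length check uses the span as it appears in the document.
    if len(span_text) < min_hit_length:
        return False
    return normalize(span_text, case_sensitive, strip_chars) == normalize(
        label, case_sensitive, strip_chars
    )


# "IL-6" matches the label "il6" once hyphens are stripped and case is ignored.
hyphen_match = is_match("IL-6", "il6", strip_chars="-")
# The same pair fails when matching is case-sensitive.
case_mismatch = is_match("IL-6", "il6", strip_chars="-", case_sensitive=True)
# A two-character span is rejected when the minimum hit length is three.
too_short = is_match("ER", "er", min_hit_length=3)
```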

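Approximate Label Properties describes fault-tolerant matching that accepts slightly different spellings. The matching algorithm the annotator uses is not specified here, so the sketch below substitutes Python's standard `difflib` similarity ratio as a stand-in. The `approximate_match` helper, the sample labels, and the 0.8 cutoff are all assumptions.

```python
import difflib
from typing import List, Optional


def approximate_match(candidate: str, labels: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the label most similar to candidate, or None if none reaches the cutoff."""
    by_lower = {label.lower(): label for label in labels}  # compare case-insensitively
    close = difflib.get_close_matches(candidate.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[close[0]] if close else None


labels = ["Asthma", "Diabetes"]
misspelled = approximate_match("asthna", labels)   # one wrong letter
unrelated = approximate_match("kidney", labels)    # nothing similar enough
```

Raising the cutoff makes matching stricter, and lowering it admits more distant spellings at the cost of more false matches.
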
Regex Annotator

The table below defines the settings that are displayed when a Regex Annotator is edited.

Setting Description
Title Required field that specifies the unique name for the annotator.
Description Optional field that provides a description of the annotator.
Regular Expression Rule Required field that lists the regular expression rules for this annotator.
Case-Insensitive Enables or disables case-insensitive matching. By default, case-insensitive matching assumes that only characters in the US-ASCII character set are being matched. Unicode-aware case-insensitive matching can be enabled by enabling Unicode Case Folding in conjunction with this option.
Multiline Mode Enables or disables multiline mode. When multiline mode is enabled, the expressions ^ and $ match immediately after or before, respectively, a line terminator or the end of the input sequence. When multiline mode is disabled, these expressions only match at the beginning and end of the entire input sequence.
Allow Comments Controls whether whitespace and comments are allowed in a pattern. When enabled, whitespace and embedded comments starting with # are ignored until the end of a line.
Canonical Equivalence Controls whether canonical equivalence is taken into account when finding matches. When enabled, characters are considered a match if and only if their full canonical decompositions match. For example, the expression a\u030A will match the string \u00E5.
Enable Dotall Controls whether dotall mode is used. When enabled, the expression . matches any character, including a line terminator. When disabled, . does not match line terminators.
Literal Parsing Controls whether literal parsing is employed. When enabled, the input string that specifies the pattern is treated as a sequence of literal characters; metacharacters and escape sequences have no special meaning.
Unicode Case Folding Controls whether case-insensitive matching is done in a manner that is consistent with the Unicode Standard. By default, Case-Insensitive matching assumes that only characters in the US-ASCII set are being matched.
Unix Lines Enables or disables Unix line mode. When enabled, only the \n line terminator is recognized in the behavior of ., ^, and $.
Is Combine Annotation Instances Controls whether to combine multiple instances of an extraction into one annotation.
Class URI Property Specifies an annotation property whose value you want to map to the class URI in the model. For example, if a Category_ID property has the value http://example.com/Disease and you want to use http://example.com/Disease as the base class URI, add Category_ID to this field. When Class URI Property is not defined, the URI is auto-generated based on the name.
Entity URI Property Specifies an annotation property whose value you want to map to the URIs for instances of the class. For example, if a Disease_ID property has the value http://example.com/Asthma and you want to use http://example.com/Asthma as the base URI for instances of the class, add Disease_ID to this field. When Entity URI Property is not defined, the URI is auto-generated based on the name.
Class Name Property Specifies an annotation property whose value you want to map to the name of the class. For example, if a Category property has the value Disease and you want the name of the class to be "Disease," add Category to this field. When Class Name Property is not defined, the class name is auto-generated.
Unintended Classes A list of any classes to filter out. Type a class in the field and then click Add to add the value.
Entity Name Property Specifies an annotation property whose value you want to map to the names for instances of the class. For example, if a Preferred_Label property includes disease names and you want to use those label values as the names for instances of the Disease class, add Preferred_Label to this field. When Entity Name Property is not defined, the name is auto-generated.
Unintended Instances A list of any entities or instances of the class to filter out. Type an instance in the field and then click Add to add the value.
Unintended Property Values A list of any property values to filter out. Type a value in the field and then click Add to add the value.
Create Additional General Annotation Type Controls whether the annotator creates a general shared annotation type in addition to the specific annotation types that are created.
Output the Detections Controls whether to include specific detections as an annotation property.
Explicit Property Datatypes/Objecttypes A list of keys that map property names to a particular object property type.
Unintended Property Names A list of any property names to filter out. Type a name in the field and then click Add to add the value.
Domain Object Base Class URI If creating a general annotation type (Create Additional General Annotation Type is enabled), this setting specifies the class to use as the base type for the annotator's domain objects.
Is Error Fatal Controls whether to fail the pipeline if this annotator fails to create annotations.
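
Several of the matching options in the table above are easier to understand by seeing them change the result of a search. The sketch below uses Python's `re` module as a stand-in, because the regex engine behind the annotator is not identified here; Python flag names differ from the setting names, and details may differ from the product. Python's `re` has no canonical-equivalence flag, so that idea is shown with `unicodedata` normalization instead. Unix Lines is omitted because `re` has no equivalent switch.

```python
import re
import unicodedata

text = "first line\nsecond line"

# Multiline Mode: ^ matches at the start of every line, not only the input.
multiline_off = re.findall(r"^\w+", text)
multiline_on = re.findall(r"^\w+", text, re.MULTILINE)

# Enable Dotall: . also matches the line terminator.
dotall_off = re.search(r"first.*second", text)
dotall_on = re.search(r"first.*second", text, re.DOTALL)

# Allow Comments: whitespace and # comments inside the pattern are ignored.
commented = re.compile(
    r"""
    \d{3}   # three digits
    -       # a literal hyphen
    \d{4}   # four digits
    """,
    re.VERBOSE,
)
commented_hit = commented.search("call 555-0199")

# Case-Insensitive with and without Unicode awareness.
# re.ASCII limits case-insensitive matching to ASCII letters.
ascii_only = re.fullmatch("\u00e9", "\u00c9", re.IGNORECASE | re.ASCII)
unicode_aware = re.fullmatch("\u00e9", "\u00c9", re.IGNORECASE)

# Literal Parsing: escaping the pattern removes the special meaning of ".".
as_regex = re.search("a.b", "axb")
as_literal = re.search(re.escape("a.b"), "axb")

# Canonical Equivalence: a followed by U+030A is the decomposition of U+00E5.
decomposed = unicodedata.normalize("NFD", "\u00e5")
equivalent = decomposed == "a\u030a"
```

Each pair of variables isolates one option: the first value shows the default behavior and the second shows the behavior with the option turned on, so the effect of a single setting can be checked without the others interfering.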