Export Step
This type of step exports the contents of a graphmart to a file-based linked data set (FLDS) on disk.
JSON Request
The following template shows a JSON body that could be used in an Export Step PUT or PATCH request. It lists all of the step's required and optional body parameters but excludes the read-only parameters. The default values for each parameter are shown. The table in Schema Details, below the request, describes the complete schema, including the read-only parameters.
{
  "doNotCreateEditionsOnExport" : true,
  "generateMetrics" : true,
  "exportBinaryStoreContents" : true,
  "overwriteFlds" : true,
  "alwaysMoveBinaryStore" : true,
  "gmLinkedDataset" : "string",
  "edition" : "string",
  "elasticsearchBulkSize" : 0,
  "elasticsearchBulkActions" : 0,
  "elasticsearchBulkMaxThreadsPerFlds" : 0,
  "elasticsearchBulkConcurrentRequests" : 0,
  "keepEsIndexOnline" : true,
  "elasticsearchBulkMaxFldsThreads" : 0,
  "maxComponentsInEdition" : 0,
  "elasticsearchIndexSettings" : "string",
  "exportElasticSearchContents" : true,
  "title" : "string",
  "incrementalData" : [ "string" ],
  "type" : "ExportStep",
  "enabled" : true,
  "contextProvider" : [ "string" ],
  "description" : "string",
  "ontology" : [ "string" ],
  "source" : [ "string" ],
  "ignoreLoadErrors" : true,
  "disableLoadCounts" : true,
  "preGenerateStatistics" : true,
  "tags" : [ {
    "description" : "string",
    "title" : "string"
  } ],
  "tagTitle" : [ "string" ]
}
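As a minimal sketch of how a smaller request body might be assembled, the Python below builds a body that contains only the four parameters marked Required in Schema Details (gmLinkedDataset, title, type, and source) plus any optional parameters passed in. The helper name build_export_step_body and the example.org catalog entry URI are hypothetical and used only for illustration. No HTTP call is made because this section does not give the endpoint path.

```python
import json

# Special source URI taken from the "source" entry in Schema Details.
ALL_PREVIOUS = "http://cambridgesemantics.com/ontologies/Graphmarts#AllPrevious"


def build_export_step_body(title, gm_linked_dataset, source, **optional):
    """Return a dict with the required Export Step parameters plus any optional ones.

    Hypothetical helper: it only assembles the body and does not contact Anzo.
    """
    body = {
        "type": "ExportStep",
        "title": title,
        "gmLinkedDataset": gm_linked_dataset,
        "source": list(source),
    }
    # Optional parameters use the same names as the template, e.g. overwriteFlds.
    body.update(optional)
    return body


body = build_export_step_body(
    title="Export to FLDS",
    gm_linked_dataset="http://example.org/catalogEntry/placeholder",  # hypothetical URI
    source=[ALL_PREVIOUS],
    overwriteFlds=True,
)

# Serialize the body for use in a PUT or PATCH request.
payload = json.dumps(body, indent=2)
print(payload)
```

Any parameter left out of the body is simply not sent; how Anzo treats omitted optional parameters is not stated here, so check the result after the request.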
Schema Details
The table below describes the Export Step schema.
You can also see the Export Step schema by expanding Schemas at the bottom of the Anzo REST API document and viewing ExportStep.
Parameter | Type | Required/Optional | Description |
uri (read-only) | "uri" | Auto-generated | The URI of the step. |
creator (read-only) | "uri" | Auto-generated | The creator of the step. |
created (read-only) | "dateTime" | Auto-generated | The timestamp when the step was created. |
modifier (read-only) | "uri" | Auto-generated | The user who modified the step. |
alltypes (read-only) | Array of strings | Optional | A list of the types related to the step, such as ExportStep, Step, LayerChild, etc. |
contextAttribute (read-only) | Array of strings | Optional | A list of any context attributes that are used. |
doNotCreateEditionsOnExport | boolean | Optional | Controls whether a new edition is created for the dataset each time the step is run. |
generateMetrics | boolean | Optional | Controls whether a data profile is generated before the data is exported. If you load the exported files in the future, the data profile is also loaded. |
exportBinaryStoreContents | boolean | Optional | Applies to exports of unstructured graphmarts and controls whether the binary store is exported along with the data. |
overwriteFlds | boolean | Optional | Controls whether the existing FLDS is replaced with the exported files whenever the step is run or whether the exported files are added to the existing FLDS. When overwriteFlds is true, Anzo archives the existing files in a new timestamped export subdirectory under the FLDS directory each time the step runs. If you add the exported dataset to a graphmart, only the latest version of the data will be loaded. When overwriteFlds is false, Anzo adds all of the exported datasets to a cumulative export directory under the FLDS directory. The dataset will contain the original files as well as all cumulative working editions. If you add this dataset to a graphmart, all of the data from all of the subdirectories will be loaded. |
alwaysMoveBinaryStore | boolean | Optional | Also applies to exports of unstructured graphmarts and controls whether the binary store is moved or copied during the export. Because the binary store can be large and have a nested structure, copying it can take a very long time, whereas moving it is almost instantaneous. Leaving this option set to true can therefore reduce the time it takes to complete the export. |
gmLinkedDataset | "uri" | Required | The URI of the Linked Dataset Catalog entry that represents the target FLDS for the export. To get the catalog entry, you can retrieve data about the dataset and use the catalogEntry value. |
edition | "uri" | Optional | The URI of the edition that will be created on export. |
elasticsearchBulkSize | long | Optional | The maximum batch size in MB. |
elasticsearchBulkActions | int | Optional | The maximum number of documents to include in each batch. |
elasticsearchBulkMaxThreadsPerFlds | int | Optional | The maximum number of threads to use for indexing per FLDS. |
elasticsearchBulkConcurrentRequests | int | Optional | The maximum number of bulk requests that can run concurrently. |
keepEsIndexOnline | boolean | Optional | Controls whether the index remains stored in Elasticsearch or is removed from Elasticsearch once it is exported. |
elasticsearchBulkMaxFldsThreads | int | Optional | The maximum number of FLDSes to index concurrently. |
maxComponentsInEdition | int | Optional | Controls the maximum number of components to retain in an edition. The default value is 0, which means unlimited. If you specify a number in this field and the limit is reached, Anzo ages off the oldest components as new ones are created. |
elasticsearchIndexSettings | "string" | Optional | A JSON-formatted list of any Elasticsearch-specific index settings to apply. |
exportElasticSearchContents | boolean | Optional | Indicates whether to export any Elasticsearch contents that are included in the graphmart. |
title | "string" | Required | The name of the step. |
incrementalData | "string" | Optional | Incremental load data associated with the step. |
type | "string" | Required | The type of step: "ExportStep". |
enabled | boolean | Optional | Controls whether the step is enabled or disabled. |
contextProvider | [ "uri", "..." ] | Optional | A list of any referenced context providers (the data source URI). You can retrieve data for the parent layer to get a list of providers for that layer. |
description | "string" | Optional | A brief description of the step. |
ontology | [ "uri", "..." ] | Optional | A list of any models to associate with this step. |
source | [ "uri", "..." ] | Required | The source data for the step. Options are any combination of the following values:
- "http://cambridgesemantics.com/ontologies/Graphmarts#Self": The source is the data that is in this step's layer.
- "http://cambridgesemantics.com/ontologies/Graphmarts#AllPrevious": The source is the data from all of the successful layers that precede this step's layer. Failed layers are ignored.
- "http://cambridgesemantics.com/ontologies/Graphmarts#Previous": The source is the data that is in the one layer that precedes this step's layer.
- "layer_uri": The source is a specific layer in the graphmart.
|
ignoreLoadErrors | boolean | Optional | Controls whether to ignore errors and proceed with the load or fail the step if there is an error. |
disableLoadCounts | boolean | Optional | Controls whether Anzo periodically queries AnzoGraph to count the total number of statements that are processed. Disabling the load count decreases the number of queries that run during activation. |
preGenerateStatistics | boolean | Optional | Controls whether AnzoGraph generates statistics on the data before the step is run. |
tags | Array of objects | Optional | Any tags on the step. |
tagTitle | [ "string" ] | Optional | A virtual property that is available for all objects. It lists the tags associated with the step or can be used to add a tag to the step without including a description. |
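One possible use of the read-only and Required markers in the schema: if a step description retrieved from Anzo is reused as the body of a PUT or PATCH request, the six read-only parameters can be dropped first, and the result can be checked for the four required parameters. The sketch below assumes that workflow. The function names and every example.org URI are hypothetical, and the sample dictionary is invented for illustration rather than copied from a real response.

```python
# Parameter names taken from the schema table above.
READ_ONLY = {"uri", "creator", "created", "modifier", "alltypes", "contextAttribute"}
REQUIRED = ("gmLinkedDataset", "title", "type", "source")


def prepare_update(step):
    """Return a copy of a step description without its read-only parameters."""
    return {key: value for key, value in step.items() if key not in READ_ONLY}


def missing_required(body):
    """Return the required parameter names that are absent or empty in body."""
    return [name for name in REQUIRED if not body.get(name)]


# Hypothetical step data shaped like the schema; all URIs are placeholders.
retrieved = {
    "uri": "http://example.org/step/placeholder",
    "creator": "http://example.org/user/placeholder",
    "created": "2024-01-01T00:00:00Z",
    "type": "ExportStep",
    "title": "Export to FLDS",
    "source": ["http://cambridgesemantics.com/ontologies/Graphmarts#Self"],
    "enabled": True,
}

update = prepare_update(retrieved)
print(sorted(update))            # ['enabled', 'source', 'title', 'type']
print(missing_required(update))  # ['gmLinkedDataset']
```

Treating an empty value as missing is a choice made for this sketch; the schema only states which parameters are required, not how empty values are handled.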