Loading Data
The topics in this section provide instructions for loading data into a graphmart's data layers. For conceptual information about graphmarts, layers, and steps, see Graphmart Concepts.
Loading of structured and semi-structured sources, such as databases, HTTP REST endpoints, and CSV, JSON, XML, Parquet, or SAS files, is automated by Graph Lakehouse's Graph Data Interface (GDI). The GDI also supports loading or virtualizing data through manually written SPARQL queries.
Unstructured data sources, such as documents, PDFs, text snippets, web pages, emails, and content from knowledge bases, are loaded using configurable Graph Studio Distributed Unstructured pipelines. See Unstructured Data for details.
Source Layers: Adding and Managing Source Data
If you select the Source Layer option when Creating a New Layer, you can add source data to the layer, that is, load data into the graphmart from a variety of external sources (SQL databases, CSV files, JSON, Parquet, and so on).
In an empty layer, start adding source data by clicking the button From a Connection or From My Computer.
If the layer already contains data, click the Add Source Data button at the top right corner of the frame and select the desired option (From a Connection or From My Computer).
Adding Source Data From a Connection
If you choose to add source data from a connection, the Add Source Data screen opens. The left pane lists Shared Connections (Data Files, Server Shared Filesystem, and other storage). Previously saved connections are shown at the top left.
To create a new connection, click the New Connection button. The Select a connection type screen opens.
Search for the desired type or use the filters "Databases", "File Systems", and "Other Types" at the top of the screen.
For example, to define a connection to a local or remote file system, select Remote Files from the table of connection types. Click Next. In the Connection Details screen, specify Basic Information (Title, Description, Base Directory). Click Test Connection. If the connection succeeds, click Save. The newly added connection appears in the Connections list at the top left and can now be browsed. Note: You can select and import not just individual files from folders, but a folder in its entirety. Place a checkmark next to the files or folders you wish to add.
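Before pointing a Remote Files connection at a Base Directory, it can help to check what a whole-folder import would pick up. The following is a minimal Python sketch, run outside the application, that walks a directory and lists the files whose extensions match the file types named in this section. The function name list_candidate_files and the extension set are assumptions made for illustration. They are not part of the product and do not reflect how the application itself selects files.

```python
from pathlib import Path

# Assumption: these suffixes mirror the file types named in this section
# (CSV, JSON, XML, Parquet, SAS). The application's own list may differ.
SUPPORTED_SUFFIXES = {".csv", ".json", ".xml", ".parquet", ".sas7bdat"}


def list_candidate_files(base_dir: str) -> list[str]:
    """Return paths, relative to base_dir, of files with a supported suffix."""
    base = Path(base_dir)
    if not base.is_dir():
        raise NotADirectoryError(f"Base directory not found: {base_dir}")
    # rglob("*") descends into subfolders, similar to importing a folder
    # in its entirety rather than picking individual files.
    return sorted(
        path.relative_to(base).as_posix()
        for path in base.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    # Hypothetical usage: list candidate files under the current directory.
    for name in list_candidate_files("."):
        print(name)
```

Files with other extensions are skipped by this sketch, so anything missing from its output is worth checking by hand before you place a checkmark next to the folder.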
After selecting the desired source data, click Add at the bottom right of the Add Source Data screen. The selected data sources are displayed in the Source tab of the Data Layer view.
If you added a folder from a file system, click on the folder name in the Source tab.
This lists the folder contents (the source files along with their metadata) and shows the Treatments pane on the left. Click an individual file in the list. Field information is shown by default. To preview the data in the selected file, click Data in the Perspective switch at the top right.
If you make any changes, click Save to keep them.
Click Refresh in the graphmart screen to ingest the data and generate the model.
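The field listing and data preview described above can be approximated locally for a CSV file before it is added as source data. The following is a minimal Python sketch using only the standard library. The function name preview_csv and its max_rows parameter are hypothetical, and the sketch assumes a UTF-8 file with a header row. It is not how the application builds its Fields or Data perspectives.

```python
import csv
from itertools import islice


def preview_csv(path: str, max_rows: int = 5) -> tuple[list[str], list[dict[str, str]]]:
    """Return the header fields and the first max_rows records of a CSV file."""
    # Assumption: the file is UTF-8 encoded and its first line is a header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Read only the first few records so large files stay cheap to inspect.
        rows = list(islice(reader, max_rows))
        # fieldnames is None for an empty file, so fall back to an empty list.
        fields = list(reader.fieldnames or [])
    return fields, rows


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python preview.py path/to/file.csv
    if len(sys.argv) > 1:
        fields, rows = preview_csv(sys.argv[1])
        print("Fields:", fields)
        for row in rows:
            print(row)
```

Every value comes back as a string in this sketch. Any data types shown for a field in the application are determined by the application, not by this preview.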
Load Dataset into Layer
[under construction].
Data Transformation Layers
[under construction].