Importing Data from CSV Files
This topic provides instructions for creating a CSV data source and importing data from the files.
- In the Anzo console, expand the Onboard menu and click Structured Data. Anzo displays the Data Sources screen, which lists any existing data sources.
- Click the Create button and select CSV Data Source. Anzo opens the Create CSV Datasource screen.
- Type a name for the data source in the Datasource name field, and type an optional description in the Description field. Then click Save. Anzo saves the source and displays the Files tab.
- On the Files tab, select the location to import the file or files from. If the files exist on the local Anzo file system or another location with a file connection (as described in Connecting to a File Store), click Select From File System. To select files on your computer, click Select From Computer.
- In the file selector dialog box, navigate to the directory for the CSV files and select the file or files to import.
For example, when selecting files from the file system, Anzo opens the Select import files dialog box. On the left side of the screen, select the file system or storage location for the CSV files. On the right side of the screen, navigate to the directory that contains the CSV files to import. The screen displays the list of files in the directory.
- Select each file that you want to import. If you have multiple files with the same schema (the files contain the same columns listed in the same order), you can select the Insert Wildcard option. Then type a string that uses asterisks as wildcard characters to match the files with similar names. Files that match the specified string are imported as one file. After typing a string, click Apply to include that string in the Selected list.
Example: Consider a directory that contains several CSV files. In this example, part.csv and partsupp.csv have the same schema and can be imported as one file. With the Insert Wildcard option selected, specifying part*.csv identifies the two files.
- When you finish selecting files, click OK to close the dialog box. The Files screen lists the selected CSV files as Pending.
- If necessary, change any of the default CSV file options. To change the options for a single file, click the pencil icon next to the file. To change the options for multiple files, select the checkbox next to each of the files, and then click the Edit button at the top of the table. Anzo displays the Edit CSV File screen.
Change the options as needed and then click Save & Import to import the CSV file or files.
- If you do not need to change CSV file options, click the Import Pending Files button to import all of the pending files. Anzo imports the data and updates the Status and Size columns on the screen.
To view the schema that Anzo created, you can click the Overview tab and then click the Data Source Metadata link. Anzo opens the tables screen for the schema, where you can view the table details.
The source data can now be onboarded to Anzo. For instructions on onboarding the data by letting Anzo automatically generate the mapping, model, and ETL pipeline, see Auto-Ingesting Imported Data. For information about creating mappings, models, and pipelines manually, see Working with Mappings, Modeling Data, and Working with Pipelines.
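Before you use the Insert Wildcard option, it can help to confirm outside of Anzo which files a pattern such as part*.csv would match and whether those files really share a schema. The following is a minimal Python sketch of that check, not part of Anzo. It assumes that the wildcard behaves like ordinary shell-style matching (Python's fnmatch), which may differ from Anzo's own matching rules. The file contents, column names, and helper function names are hypothetical and used only for illustration.

```python
import csv
import fnmatch
import io


def match_wildcard(filenames, pattern):
    """Return the file names that match a shell-style pattern such as part*.csv."""
    return sorted(name for name in filenames if fnmatch.fnmatchcase(name, pattern))


def read_header(text):
    """Return the first row (the column names) of a CSV document given as text."""
    return next(csv.reader(io.StringIO(text)), [])


def same_schema(headers):
    """Return True when every header lists the same columns in the same order."""
    headers = list(headers)
    return all(header == headers[0] for header in headers[1:])


# Hypothetical directory contents: file name -> CSV text.
# In practice you would read the first line of each file on disk instead.
files = {
    "customer.csv": "c_custkey,c_name\n1,Alice\n",
    "part.csv": "partkey,name,size\n1,bolt,4\n",
    "partsupp.csv": "partkey,name,size\n2,nut,6\n",
}

matched = match_wildcard(files, "part*.csv")
headers = [read_header(files[name]) for name in matched]

print("Matched files:", matched)
print("Same schema:", same_schema(headers))
```

If the schema check reports a mismatch, select the files individually rather than combining them with a wildcard, because files that match the string are imported as one file.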