Import-Export API
Any TCStore client can import datasets into, and export datasets from, a Terracotta cluster by using the import-export libraries distributed with the kit. The following code sample shows the import statements for the key import-export classes:
import com.terracottatech.store.importexport.api.parquet.ParquetDatasetExport; // (1)
import com.terracottatech.store.importexport.api.tson.TSONDatasetExport; // (2)
import com.terracottatech.store.importexport.api.tson.TSONDatasetImport; // (3)
1
Exporting to Parquet requires imports for ParquetDatasetExport (shown), ParquetExportOptions and ParquetExportStats.
2
Exporting to TSON requires imports for TSONDatasetExport (shown), TSONExportOptions and TSONExportStats.
3
Importing from TSON requires imports for TSONDatasetImport (shown), TSONImportOptions and TSONImportStats.
The following example illustrates how to export a dataset to Parquet file format using the import-export API:
try (DatasetManager dsManager =
         DatasetManager.clustered(URI.create(connectionURI)).build()) { // (1)
    ParquetExportOptions options = new ParquetExportOptions(); // (2)
    options.setDatasetName("DS1"); // (3)
    options.setDatasetType(LONG); // (4)
    options.setOutputFolder(Paths.get(outputFolderFullPath)); // (5)
    ParquetDatasetExport exporter =
        new ParquetDatasetExport(dsManager, options); // (6)
    ParquetExportStats stats = exporter.exportDataset(); // (7)
    System.out.println(stats.toString()); // (8)
}
1
Create a DatasetManager connected to a server in the cluster by supplying a connection URI (for example, terracotta://<hostname>:<hostport>).
2
Create an ExportOptions instance corresponding to the desired file format (ParquetExportOptions in this example).
3
Specify the name of the dataset from which to export records (DS1 in this example).
4
Specify the Type of the dataset identified in 3 above.
5
Specify the full Path of an existing folder where the generated output file will be created and into which records will be written.
6
Create a DatasetExport instance corresponding to the desired format (ParquetDatasetExport in this example) supplying the DatasetManager and ExportOptions instances.
7
Perform the export by calling exportDataset().
8
Inspect the results of the completed export operation, which are contained in the returned ExportStats instance (ParquetExportStats in this example):
Export Result: Success
Output Files (1):
C:\temp\dataset1_2022-08-11-09-24-49-777.parquet
1,000 records processed.
1,000 complete records written to parquet file
0 partial records written to parquet file
0 entire records NOT written to parquet file
0 records failed writing to parquet file
0 empty records excluded writing to parquet file
0 string values were truncated
0 large-size byte arrays were omitted
Note:
In the above example, the system automatically created the export file with the name dataset1_2022-08-11-09-24-49-777.parquet. For Parquet export, the system always constructs the filename. When exporting in TSON format, however, the client must supply the name of the generated file, as illustrated in the next example.
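Because the Parquet exporter chooses the filename itself, a client that needs the generated file afterwards has to discover it. The following is a minimal sketch, using only the JDK, of one possible approach: it checks that the output folder exists before the export, and afterwards picks the most recently modified .parquet file in that folder. The class and method names (ParquetOutputLocator, requireExistingFolder, newestParquetFile) are hypothetical helpers and are not part of the import-export API. Selecting by modification time is an assumption that only holds if nothing else writes Parquet files into the same folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of the Terracotta import-export API.
class ParquetOutputLocator {

    /** Fails fast if the folder is missing, since the export expects an existing folder. */
    static Path requireExistingFolder(Path folder) {
        if (!Files.isDirectory(folder)) {
            throw new IllegalArgumentException("Output folder does not exist: " + folder);
        }
        return folder;
    }

    /** Returns the most recently modified *.parquet file in the folder, if there is one. */
    static Optional<Path> newestParquetFile(Path folder) {
        try (Stream<Path> files = Files.list(folder)) {
            return files
                .filter(p -> p.getFileName().toString().endsWith(".parquet"))
                .max(Comparator.comparing(ParquetOutputLocator::lastModified));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wraps the checked exception so the method can be used as a comparator key.
    private static FileTime lastModified(Path file) {
        try {
            return Files.getLastModifiedTime(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A client might call requireExistingFolder on the path before passing it to setOutputFolder, and call newestParquetFile on the same path once exportDataset() has returned. The "Output Files" section of the printed statistics remains the more direct record of what was written.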
The following example illustrates how to export a dataset to TSON file format using the import-export API. The example also illustrates how to configure cell filtering:
try (DatasetManager dsManager =
         DatasetManager.clustered(URI.create(connectionURI)).build()) { // (1)
    TSONExportOptions options = new TSONExportOptions(); // (2)
    options.setDatasetName("DS1"); // (3)
    options.setDatasetType(LONG); // (4)
    options.setOutputFileName(outputFilenameFullPath); // (5)
    options.setFilterCell("myFilterCell", LONG); // (6)
    options.setFilterLowValue(0); // (7)
    options.setFilterHighValue(50); // (8)
    TSONDatasetExport exporter = new TSONDatasetExport(dsManager, options); // (9)
    TSONExportStats stats = exporter.exportDataset(); // (10)
    System.out.println(stats.toString()); // (11)
}
1
Create a DatasetManager connected to a server in the cluster by supplying a connection URI (for example, terracotta://<hostname>:<hostport>).
2
Create an ExportOptions instance corresponding to the desired file format (TSONExportOptions in this example).
3
Specify the name of the dataset from which to export records (DS1 in this example).
4
Specify the Type of the dataset identified in 3 above.
5
Specify the full path of the file that the system will create and into which records will be written. The file's parent directory must already exist.
6
Specify the name and Type of a cell that is present in the dataset identified in 3 above. That cell is used to filter the exported records.
7
Specify the low (inclusive) bound of the filter range. A record is included in the export file only if it contains the filter cell and the cell's value is greater than or equal to this bound.
8
Specify the high (exclusive) bound of the filter range. A record is included in the export file only if it contains the filter cell and the cell's value is less than this bound.
9
Create a DatasetExport instance corresponding to the desired format (TSONDatasetExport in this example) supplying the DatasetManager and ExportOptions instances.
10
Perform the export by calling exportDataset().
11
Inspect the results of the completed export operation, which are contained in the returned ExportStats instance (TSONExportStats in this example):
Export Result: Success
1,000 records processed.
0 string values were truncated
0 large-size byte arrays were omitted
0 empty records (with no cells) were omitted
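The range semantics described in callouts 7 and 8 can be easy to misread, so the following sketch restates them as a plain Java predicate. It is an illustration of the documented rule (low bound inclusive, high bound exclusive), not the exporter's actual implementation. The class FilterRangeSketch and its method are hypothetical, and treating a record that lacks the filter cell as excluded is this sketch's reading of the callout text.

```java
import java.util.OptionalLong;

// Hypothetical illustration of the documented filter rule; not Terracotta code.
class FilterRangeSketch {

    /**
     * Returns true when the record has the filter cell and low <= value < high.
     * An empty OptionalLong models a record that does not contain the filter cell.
     */
    static boolean isIncluded(OptionalLong filterCellValue, long low, long high) {
        if (!filterCellValue.isPresent()) {
            return false; // assumption: records without the filter cell are not exported
        }
        long value = filterCellValue.getAsLong();
        return value >= low && value < high;
    }
}
```

With the bounds used in the example (0 and 50), a cell value of 0 or 49 satisfies the predicate, while a value of 50 does not.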
The following example illustrates how to import dataset records contained within a TSON-formatted file using the import-export API:
try (DatasetManager dsManager =
         DatasetManager.clustered(URI.create(connectionURI)).build()) { // (1)
    TSONImportOptions options = new TSONImportOptions(); // (2)
    options.setDatasetName("DS2"); // (3)
    options.setDatasetType(LONG); // (4)
    options.setInputFileName(inputFilenameFullPath); // (5)
    options.setCompressed(false); // (6)
    options.setClearDatasetBeforeImport(true); // (7)
    TSONDatasetImport importer =
        new TSONDatasetImport(dsManager, options); // (8)
    TSONImportStats stats = importer.importDataset(); // (9)
    System.out.println(stats.toString()); // (10)
}
1
Create a DatasetManager connected to a server in the cluster by supplying a connection URI (for example, terracotta://<hostname>:<hostport>).
2
Create an ImportOptions instance corresponding to the desired file format (TSONImportOptions in this example).
3
Specify the name of an existing dataset into which records will be added (DS2 in this example).
4
Specify the Type of the dataset identified in 3 above.
5
Specify the full path of the file that you want to import.
6
Specify whether the input file identified in 5 above has been compressed (both ZIP and GZIP formats are supported).
7
Specify whether all records present in the target dataset identified in 3 above should be deleted before the new records are added from the import file.
8
Create a DatasetImport instance corresponding to the desired format (TSONDatasetImport in this example) supplying the DatasetManager and ImportOptions instances.
9
Perform the import by calling importDataset().
10
Inspect the results of the completed import operation, which are contained in the returned ImportStats instance (TSONImportStats in this example):
Import Result: Success
1,000 records processed.
0 empty records (with no cells) were omitted
0 records failed to be added to the Dataset.
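The compressed flag in callout 6 has to agree with the actual input file. When the client does not know in advance whether a file was compressed, one possible approach is to inspect the file's leading bytes, as sketched below with the JDK only. GZIP streams begin with the bytes 0x1F 0x8B and ZIP archives begin with the ASCII characters "PK". The class CompressionSniffer and its methods are hypothetical and are not part of the import-export API. The check is a heuristic: an uncompressed file whose content happens to start with "PK" would be misclassified.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of the Terracotta import-export API.
class CompressionSniffer {

    /** Returns true if the bytes start with the GZIP (1F 8B) or ZIP ("PK") signature. */
    static boolean hasCompressionMagic(byte[] header) {
        if (header.length < 2) {
            return false;
        }
        int b0 = header[0] & 0xFF;
        int b1 = header[1] & 0xFF;
        boolean gzip = b0 == 0x1F && b1 == 0x8B;
        boolean zip = b0 == 'P' && b1 == 'K';
        return gzip || zip;
    }

    /** Reads up to two bytes from the file and applies the signature check. */
    static boolean looksCompressed(Path file) throws IOException {
        byte[] header = new byte[2];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                read += n;
            }
        }
        return hasCompressionMagic(Arrays.copyOf(header, read));
    }
}
```

The result of looksCompressed could then be passed to setCompressed in place of the hard-coded false used in the example.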