pub.parquet:write
WmParquet. Writes a document list (an array of IData objects) to a Parquet file.
Input Parameters
fileName
String Name of the Parquet file to which the records will be written.
schema
String Optional. A Parquet schema against which the documents will be matched before the records are written to the file. If a schema is not provided, an IS document type must be provided using the docTypeName parameter.
If any of the provided documents do not conform exactly to the given schema, an exception is thrown.
docTypeName
String Optional. The fully qualified name of an IS document type. A document type can be provided instead of the schema parameter. The document type is converted to a Parquet schema internally and used for converting and writing the Parquet file.
The tables Mapping of Integration Server data types to Parquet basic types and Mapping of Integration Server data types to Parquet Logical types list how Integration Server data types map to Parquet schema types.
Note:
Either a schema or a docTypeName must be provided to validate the data before it is written to the Parquet file.
records
Document List Array of IData objects to be written to the Parquet file.
options
Document Optional. Options, such as the compression method, that can be passed to this service.
compressionCodec
String Optional. The following compression methods are supported:
* gzip
* snappy
* uncompressed
Note:
If a compressionCodec is not provided, the data is not compressed.
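The input rules above can be sketched as a caller-side check. This is only an illustration: plain Java maps stand in for the Integration Server pipeline, and the class ParquetWriteInputs, its build method, the sample schema text, and the field names are hypothetical. The schema string assumes that the service accepts Parquet's usual message-type notation. The exact-match codec check is an assumption about how strictly the value is interpreted.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: plain Maps stand in for the IS pipeline (IData). */
final class ParquetWriteInputs {

    // Codecs listed for the compressionCodec option.
    static final Set<String> CODECS = Set.of("gzip", "snappy", "uncompressed");

    // Illustrative schema text; the message and field names are made up.
    static final String SAMPLE_SCHEMA =
          "message customer {\n"
        + "  required binary name (UTF8);\n"
        + "  optional int32 age;\n"
        + "}";

    /** Assembles the inputs of pub.parquet:write and applies the documented rules. */
    static Map<String, Object> build(String fileName, String schema, String docTypeName,
                                     List<Map<String, Object>> records,
                                     String compressionCodec) {
        boolean hasSchema = schema != null && !schema.isEmpty();
        boolean hasDocType = docTypeName != null && !docTypeName.isEmpty();
        if (!hasSchema && !hasDocType) {
            throw new IllegalArgumentException("Provide either schema or docTypeName");
        }
        Map<String, Object> inputs = new LinkedHashMap<>();
        inputs.put("fileName", fileName);
        if (hasSchema) inputs.put("schema", schema);
        if (hasDocType) inputs.put("docTypeName", docTypeName);
        inputs.put("records", records);

        // options is optional; when it is omitted, the data is not compressed.
        if (compressionCodec != null) {
            if (!CODECS.contains(compressionCodec)) {
                throw new IllegalArgumentException(
                    "Unsupported compressionCodec: " + compressionCodec);
            }
            Map<String, Object> options = new LinkedHashMap<>();
            options.put("compressionCodec", compressionCodec);
            inputs.put("options", options);
        }
        return inputs;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("age", 36);
        System.out.println(
            build("customers.parquet", SAMPLE_SCHEMA, null, List.of(row), "snappy"));
    }
}
```

In a real Java service the same keys would be placed in an IData object before pub.parquet:write is invoked; the map form is used here only so that the sketch runs without the Integration Server libraries.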
Output Parameters
None.
Usage Notes
The following tables list how Integration Server data types map to Parquet types.
Mapping of Integration Server data types to Parquet basic types for write operations
Integration Server Type | Parquet Basic Type
String | STRING
String List | Repeated STRING
String Table | BINARY
Document | Group
Document List | Repeated Group
Document Reference | Group (flatten)
Document Reference List | Repeated Group (flatten)
java.lang.Boolean | BOOLEAN
java.lang.Integer | INT32
java.lang.Long | INT64
java.lang.Float | FLOAT
java.lang.Double | DOUBLE
java.util.Date | BINARY
java.lang.Byte | BINARY
java.lang.Short | BINARY
byte[] | BINARY
Object (unidentified) | BINARY
Object List (unidentified) | Repeated BINARY
Note:
Arrays of objects, such as java.lang.Boolean and java.lang.Long, are converted to arrays of the corresponding basic types.
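The basic-type mapping above can be expressed as a lookup, which may help when checking a schema against the Java types in a record. The sketch below is an illustration only: the class ParquetBasicTypes and its method are hypothetical, document and document reference types are left out because they need the Integration Server libraries, and the handling of wrapper arrays as "Repeated" types is one reading of the note above.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical lookup that mirrors the scalar and array rows of the table. */
final class ParquetBasicTypes {

    private static final Map<Class<?>, String> SCALARS = new LinkedHashMap<>();
    static {
        SCALARS.put(String.class, "STRING");
        SCALARS.put(Boolean.class, "BOOLEAN");
        SCALARS.put(Integer.class, "INT32");
        SCALARS.put(Long.class, "INT64");
        SCALARS.put(Float.class, "FLOAT");
        SCALARS.put(Double.class, "DOUBLE");
        SCALARS.put(Date.class, "BINARY");
        SCALARS.put(Byte.class, "BINARY");
        SCALARS.put(Short.class, "BINARY");
        SCALARS.put(byte[].class, "BINARY");
    }

    /** Returns the Parquet basic type name that the table lists for a Java type. */
    static String basicTypeFor(Class<?> type) {
        String scalar = SCALARS.get(type);
        if (scalar != null) {
            return scalar;
        }
        if (type.isArray()) {
            Class<?> component = type.getComponentType();
            // One-dimensional object arrays: String List, Object List, and
            // (assumed from the note) arrays of wrappers such as Long[].
            if (!component.isArray() && !component.isPrimitive()) {
                return "Repeated " + SCALARS.getOrDefault(component, "BINARY");
            }
        }
        // String Table (String[][]) and any unidentified object.
        return "BINARY";
    }

    public static void main(String[] args) {
        Class<?>[] samples = {
            String.class, String[].class, String[][].class, Long[].class, Object[].class
        };
        for (Class<?> sample : samples) {
            System.out.println(sample.getSimpleName() + " -> " + basicTypeFor(sample));
        }
    }
}
```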
Mapping of Integration Server data types to Parquet Logical types for write operations
Integration Server Type | Parquet Logical Type | Parquet Basic Type
String | ENUM | Binary(ENUM)
String | JSON | Binary(JSON)
Note:
The pub.parquet:write service silently overwrites the file if it already exists; no exception is thrown. Currently, there is no option to append to an existing file.
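Because an existing file is replaced without warning, a caller that must not lose data may want to check for the file first. The following is a minimal sketch of such a guard using the standard java.nio.file API. The class OverwriteGuard and the method requireAbsent are hypothetical and are not part of the WmParquet package.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical pre-check to run before invoking pub.parquet:write. */
final class OverwriteGuard {

    /**
     * Throws if the target file already exists. This is a check-then-act
     * test, so another process can still create the file afterwards.
     */
    static void requireAbsent(Path target) throws FileAlreadyExistsException {
        if (Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString());
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a Parquet file from an earlier run.
        Path existing = Files.createTempFile("orders", ".parquet");
        try {
            requireAbsent(existing);
        } catch (FileAlreadyExistsException e) {
            System.out.println("Refusing to overwrite " + e.getFile());
        } finally {
            Files.deleteIfExists(existing);
        }
    }
}
```

A variant of the same idea is to move the existing file to a backup name instead of refusing to write.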
Important:
Null values are not written to a Parquet file.
Note:
Ensure that field names in the schema do not contain characters that the Parquet specification does not support, such as '{', '}', '(', ',', ')', ';', '=', or ' ' (space).
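A field-name check along these lines can be run before the schema or document type is passed to the service. This sketch tests only the characters named in the note above, which may not be the complete set. The class FieldNameCheck and its method are hypothetical.

```java
/** Hypothetical validator for schema field names. */
final class FieldNameCheck {

    // Characters named in the note above; the full set may be larger.
    private static final String UNSUPPORTED = "{}(),;= ";

    /** Returns true if the name is non-empty and contains none of the listed characters. */
    static boolean isSupported(String fieldName) {
        if (fieldName == null || fieldName.isEmpty()) {
            return false;
        }
        for (int i = 0; i < fieldName.length(); i++) {
            if (UNSUPPORTED.indexOf(fieldName.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = { "order_id", "order id", "total(usd)", "a=b" };
        for (String name : names) {
            System.out.println("'" + name + "' supported: " + isSupported(name));
        }
    }
}
```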