Configuration of multiple output files

If very large data volumes are extracted and written to a single output file as system events, the file may become difficult to handle. In such cases, you can configure the data source so that the extracted data is distributed across any number of XML output files. To do this, specify the maximum number of system events per output file and the output file name in the dataextraction XML element.

Example (for a CSV data source, same procedure for JDBC and SAP types)

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE datasource SYSTEM "datasource.dtd">
<datasource name="BILLING" type="CSV">
  <dataextraction>
    <outputfilename>..\custom\<clientname>\data\BILLING_data_$EXTRACTIONDATE$_$EXTRACTIONTIME$.zip</outputfilename>
    <numberofeventsperxmlfile>100000</numberofeventsperxmlfile>
  </dataextraction>
  ...
  <systemconfig>...</systemconfig>
  <eventspec>...</eventspec>
  ...
</datasource>

For the BILLING data source, the numberofeventsperxmlfile XML element specifies that each XML output file is to contain 100000 system events, with the last output file generated containing all remaining events.
The outputfilename XML element specifies the path and name of the output files. These settings generate output files (here in ZIP format) with names in the form BILLING_data_$EXTRACTIONDATE$_$EXTRACTIONTIME$.zip in the specified output directory. The $EXTRACTIONDATE$ name variable contains the extraction date, while $EXTRACTIONTIME$ contains the extraction time. The generated output files are numbered by appending _<x> to the end of the name, where x is a consecutive number. The first output file generated is not numbered.

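If 369000 system events were extracted on 23 June 2007 at 11:36:45 using the example configuration shown, four output files would be created: three containing 100000 system events each and a fourth containing the remaining 69000. The first file would be named BILLING_data_20070623_113645.zip; each subsequent file carries the consecutive number _<x> at the end of its name.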
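The arithmetic behind this example can be reproduced with a short sketch. It is only an illustration of the rules described in this section, not the product's implementation. The function name plan_output_files and the parameter first_suffix are hypothetical, and because the text does not state which number the second file receives, the sketch assumes that numbering starts at 1.

```python
import os
from datetime import datetime


def plan_output_files(total_events, events_per_file, pattern, extracted_at, first_suffix=1):
    """Return a list of (file name, event count) pairs for one extraction run.

    first_suffix is an assumption: the documentation only says that the
    first file is not numbered and that later files get a consecutive _<x>.
    """
    if events_per_file <= 0:
        raise ValueError("events_per_file must be positive")

    # Expand the name variables using the documented formats
    # (yyyyMMdd for the date, HHmmss for the time).
    name = (pattern
            .replace("$EXTRACTIONDATE$", extracted_at.strftime("%Y%m%d"))
            .replace("$EXTRACTIONTIME$", extracted_at.strftime("%H%M%S")))
    stem, extension = os.path.splitext(name)

    plan = []
    remaining = total_events
    while remaining > 0:
        count = min(events_per_file, remaining)
        if not plan:
            file_name = name  # the first file is not numbered
        else:
            file_name = f"{stem}_{first_suffix + len(plan) - 1}{extension}"
        plan.append((file_name, count))
        remaining -= count
    return plan


plan = plan_output_files(
    total_events=369000,
    events_per_file=100000,
    pattern="BILLING_data_$EXTRACTIONDATE$_$EXTRACTIONTIME$.zip",
    extracted_at=datetime(2007, 6, 23, 11, 36, 45),
)
for file_name, count in plan:
    print(file_name, count)
```

With the values from the example, the sketch yields four entries with 100000, 100000, 100000 and 69000 system events; only the suffix numbers depend on the first_suffix assumption.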

The table below lists all configuration options:

| XML element | Description |
| --- | --- |
| numberofeventsperxmlfile | Fixed number of system events written to an output file. The last output file created contains the remaining number of events. If this element is missing, all system events are written to one output file. |
| outputfilename | Path and naming pattern of output files. XML and ZIP formats are supported. A ZIP output file contains an XML output file with the same name. The output files for an extraction are consecutively numbered by _<x> at the end of the name, although the first file created is not numbered. |

The following variables are permitted in the output file name (outputfilename) and can be used in any combination:

| Variable (data source type) | Description |
| --- | --- |
| $EXTRACTIONDATE$ (CSV, SAP, JDBC) | Extraction date (format: yyyyMMdd) |
| $EXTRACTIONTIME$ (CSV, SAP, JDBC) | Extraction time (format: HHmmss) |
| $BEGINDATE$ (SAP, JDBC) | Start date of extract period (format: yyyyMMdd) |
| $BEGINTIME$ (SAP, JDBC) | Start time of extract period (format: HHmmss) |
| $ENDDATE$ (SAP, JDBC) | End date of extract period (format: yyyyMMdd) |
| $ENDTIME$ (SAP, JDBC) | End time of extract period (format: HHmmss) |
| $VALUECONSTRAINT$ (SAP, see chapter Condition operators and JDBC, see chapter Condition operators) | Output of operators and integer comparison values used to restrict the volume of data extracted. Output format: <Operator>_Value1 or <Operator>_Value1_<Operator>_Value2. The name of the output file could be: outfile_gt_230_le_300_<x>.xml, where x is the consecutive number. Operator representation: greater than: gt; greater than or equal to: ge; less than: lt; less than or equal to: le |
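To make the variable formats concrete, the following sketch expands name variables in a file name pattern. It is an illustration under stated assumptions, not the product's code: the function names format_value_constraint and expand_file_name are hypothetical, the operator symbols used as dictionary keys are an assumed input notation, and the begin and end dates are made-up sample values. Only the gt/ge/lt/le codes and the date and time formats come from the table.

```python
from datetime import datetime

# Operator codes as listed in the table; the symbolic keys are an assumption.
OPERATOR_CODES = {">": "gt", ">=": "ge", "<": "lt", "<=": "le"}


def format_value_constraint(constraints):
    """Build the $VALUECONSTRAINT$ text from one or two (operator, value) pairs.

    Follows the documented format <Operator>_Value1 or
    <Operator>_Value1_<Operator>_Value2.
    """
    if not 1 <= len(constraints) <= 2:
        raise ValueError("expected one or two constraints")
    return "_".join(f"{OPERATOR_CODES[op]}_{value}" for op, value in constraints)


def expand_file_name(pattern, values):
    """Replace each $NAME$ variable in the pattern with its text value."""
    for variable, text in values.items():
        pattern = pattern.replace(f"${variable}$", text)
    return pattern


# Hypothetical sample values for an extract period.
begin = datetime(2007, 6, 1, 0, 0, 0)
end = datetime(2007, 6, 23, 11, 36, 45)

values = {
    "BEGINDATE": begin.strftime("%Y%m%d"),  # yyyyMMdd
    "BEGINTIME": begin.strftime("%H%M%S"),  # HHmmss
    "ENDDATE": end.strftime("%Y%m%d"),
    "ENDTIME": end.strftime("%H%M%S"),
    "VALUECONSTRAINT": format_value_constraint([(">", 230), ("<=", 300)]),
}

print(expand_file_name("outfile_$VALUECONSTRAINT$.xml", values))
# prints: outfile_gt_230_le_300.xml
print(expand_file_name("BILLING_$BEGINDATE$_$ENDDATE$.xml", values))
# prints: BILLING_20070601_20070623.xml
```

The consecutive number _<x> that is appended when an extraction produces several files is left out here to keep the sketch focused on the variables.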

You can conveniently configure the distribution of the extracted data sets across multiple output files in PPM Customizing Toolkit: in the Data source management of the client, choose Additional settings... in the Data extraction area.