How the CSV data is processed during import is specified in an XML configuration file. The name of this file is passed to the runcsv2ppm command line program as an argument.
In the configuration file, you can, for example, specify the separator or rename column headings.
The format of the XML file is specified by a DTD (csvextractor.dtd in the example configuration below). Its elements and attributes are as follows:
XML element or attribute | Description | Example |
---|---|---|
csvextractor | Configuration of CSV import process | See below |
fieldseparator | Separator | , |
mask | Masking character (values imported are unmasked) | " |
hasheaderline | CSV files include header | |
renamefield | Rename a data column | See below |
name | Name of a data column generated from CSV files | MATERIAL |
newname | New name of data column | Item |
skipstartlines | Start lines to be ignored | |
skipendlines | End lines to be ignored | |
numberoflines | Number of start lines to be ignored. Counting begins from the first line. | 10 |
pattern | Character pattern for locating a data line from which lines are to be extracted (skipstartlines: by default, the line found is also extracted) or up to which lines are to be extracted (skipendlines: the line found and all following lines are not extracted) | For example, the expression |
ignorethisline | Ignore (yes) or do not ignore (no) the data line containing the desired pattern in the extraction. Default value: no | yes |
mincolumn | Specifies a minimum number of columns to define the start and/or end of the data range to be extracted. skipstartlines: Lines are extracted from (inclusive) the first line found with at least the specified number of columns. skipendlines: Lines are extracted up to (exclusive) the first line found with less than the specified number of columns. | 4 |
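
The effect of the fieldseparator and mask settings can be illustrated with a short Python sketch. It is not how runcsv2ppm is implemented; it only assumes that masking follows the usual CSV quoting convention, in which a masked value may contain the separator and a doubled masking character stands for one literal character. The variable names are illustrative.

```python
import csv
import io

# One data record, written the way it would appear in a CSV file
# that uses ';' as separator and '"' as masking character.
raw_line = '4711;10;"Harry, ""A"" Williams";Mobile 6600;3\n'

# delimiter plays the role of <fieldseparator>, quotechar the role of <mask>.
reader = csv.reader(io.StringIO(raw_line), delimiter=";", quotechar='"')
fields = next(reader)

# The masking characters are removed and the doubled quotes collapse to one.
print(fields[2])  # Harry, "A" Williams
```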
Example configuration (csvconfig.xml)
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE csvextractor SYSTEM 'csvextractor.dtd'>
<csvextractor>
  <fieldseparator>;</fieldseparator>
  <mask>"</mask>
  <hasheaderline/>
  <renamefield name="FIELD_3" newname="Order item recorded by"/>
  <renamefield name="MATERIAL" newname="Item designation"/>
  <skipstartlines>
    <pattern ignorethisline="yes">###*</pattern>
  </skipstartlines>
  <skipendlines>
    <pattern>5???;*;*;*;*</pattern>
  </skipendlines>
</csvextractor>
CSV file to be imported (example.csv)
The file consists of a comment line, a header, and five data records:
### Order data 03/25/2006 ###
ORDER NUMBER;POSITION;;MATERIAL;QUANTITY
4711;10;"Harry, ""A"" Williams";Mobile 6600;3
4811;23;"Ben, ""B"" Snyder";Mobile 6601;2
4911;15;"George, ""C"" Nyland";Mobile 6602;1
5011;6;"George, ""C"" Nyland";Mobile 5405;2
5211;23;"George, ""C"" Nyland";Mobile 5410;1
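The third column of the header above has no heading. The following Python sketch shows one way the final column names could be derived from this header and the renamefield entries of the example configuration. It assumes that a column without a heading receives the generated name FIELD_<n>, with n being its 1-based position; the source only confirms FIELD_3 for the third column. The function name column_names is hypothetical.

```python
def column_names(header_fields, renames):
    """Return the final column names for one CSV header line.

    Assumption: a column with an empty heading gets the generated name
    FIELD_<n>, where n is its 1-based position.
    """
    names = []
    for position, heading in enumerate(header_fields, start=1):
        name = heading if heading else f"FIELD_{position}"
        # Apply a renamefield entry if one exists for this column.
        names.append(renames.get(name, name))
    return names


header = "ORDER NUMBER;POSITION;;MATERIAL;QUANTITY".split(";")
# Mirrors the two <renamefield> entries of the example configuration.
renames = {
    "FIELD_3": "Order item recorded by",
    "MATERIAL": "Item designation",
}
names = column_names(header, renames)
print(names)
```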
Output file in PPM system event format (event.xml)
The CSV system event generator uses the example CSV configuration to generate the following XML output file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE eventlist SYSTEM "event.dtd">
<eventlist>
  <event>
    <attribute type="ORDER NUMBER">4711</attribute>
    <attribute type="POSITION">10</attribute>
    <attribute type="Order item recorded by">Harry, "A" Williams</attribute>
    <attribute type="Item designation">Mobile 6600</attribute>
    <attribute type="QUANTITY">3</attribute>
  </event>
  <event>
    <attribute type="ORDER NUMBER">4811</attribute>
    <attribute type="POSITION">23</attribute>
    <attribute type="Order item recorded by">Ben, "B" Snyder</attribute>
    <attribute type="Item designation">Mobile 6601</attribute>
    <attribute type="QUANTITY">2</attribute>
  </event>
  <event>
    <attribute type="ORDER NUMBER">4911</attribute>
    <attribute type="POSITION">15</attribute>
    <attribute type="Order item recorded by">George, "C" Nyland</attribute>
    <attribute type="Item designation">Mobile 6602</attribute>
    <attribute type="QUANTITY">1</attribute>
  </event>
</eventlist>
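The mapping from data records to events can be sketched in Python as follows. This is only an illustration of the output structure shown above (one event per record, one attribute element per column, with the column name as type); it is not the generator's actual code, and build_eventlist is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_eventlist(names, records):
    """Build an <eventlist> tree with one <event> per data record."""
    root = ET.Element("eventlist")
    for record in records:
        event = ET.SubElement(root, "event")
        # Each column becomes one <attribute>, typed by the column name.
        for name, value in zip(names, record):
            attribute = ET.SubElement(event, "attribute", {"type": name})
            attribute.text = value
    return root


names = ["ORDER NUMBER", "POSITION", "Order item recorded by",
         "Item designation", "QUANTITY"]
records = [
    ["4711", "10", 'Harry, "A" Williams', "Mobile 6600", "3"],
    ["4811", "23", 'Ben, "B" Snyder', "Mobile 6601", "2"],
]
root = build_eventlist(names, records)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```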
The following command line call produces this output file. In accordance with the specifications in the CSV configuration file, it generates one system event in the XML output file from each extracted data record in the CSV data:
runcsv2ppm -i example.csv -csvconfig csvconfig.xml -outfile event -nozip
During import, the name FIELD_3 assigned to the third data column, which has no heading in the CSV file, is replaced with Order item recorded by in all system events, and the MATERIAL data column is renamed Item designation. The masked values of the Order item recorded by attribute type are unmasked. The comment line matches the skipstartlines pattern ###* and, because ignorethisline is set to yes, is not extracted. The line with order number 5011 is the first line that matches the skipendlines pattern 5???;*;*;*;*, so this line and all following lines (order number 5211) are not extracted.
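The line selection described above can be reproduced with a small Python sketch. It assumes that the patterns behave like shell-style wildcards (* for any run of characters, ? for exactly one character), which is consistent with the example but not stated in the source. The function select_lines is hypothetical, and its behavior when no line matches (keep everything) is likewise an assumption.

```python
from fnmatch import fnmatchcase


def select_lines(lines, start_pattern, ignore_start_line, end_pattern):
    """Return the lines between the start match and the end match.

    Assumptions: patterns are shell-style wildcards; if a pattern
    matches no line, nothing is cut off on that side.
    """
    start = 0
    for i, line in enumerate(lines):
        if fnmatchcase(line, start_pattern):
            # ignorethisline="yes" drops the matching line itself.
            start = i + 1 if ignore_start_line else i
            break
    end = len(lines)
    for i in range(start, len(lines)):
        if fnmatchcase(lines[i], end_pattern):
            # The matching line and everything after it are excluded.
            end = i
            break
    return lines[start:end]


lines = [
    "### Order data 03/25/2006 ###",
    "ORDER NUMBER;POSITION;;MATERIAL;QUANTITY",
    '4711;10;"Harry, ""A"" Williams";Mobile 6600;3',
    '4811;23;"Ben, ""B"" Snyder";Mobile 6601;2',
    '4911;15;"George, ""C"" Nyland";Mobile 6602;1',
    '5011;6;"George, ""C"" Nyland";Mobile 5405;2',
    '5211;23;"George, ""C"" Nyland";Mobile 5410;1',
]
selected = select_lines(lines, "###*", True, "5???;*;*;*;*")
for line in selected:
    print(line)
```

With the example file, the selection keeps the header line and the records for order numbers 4711, 4811, and 4911.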