To enable easy data extraction for Data analytics, you can extract the entire contents of a source system table into a file in event format, which can then be imported into a Data analytics analysis realm.
You can also restrict the data to be extracted by defining conditions for data extraction.
The table to be extracted is not configured using the table configuration but in the data source file itself. For this, the file datasource.dtd contains the following entries:
XML element/attribute | Description |
---|---|
analysistype | XML attribute of the datasource element. Must have the value DATA_ANALYTICS for analysis realm tables (default value: PROCESS). |
realmtable | Comprehensive XML element for configuring the Data analytics data source. |
tablename | XML attribute of the realmtable element: name of the target table in the analysis realm configuration. |
sourcetable | Comprehensive XML element for configuring the Data analytics data source table. Must include at least one sourcefield element. |
tablename | XML attribute of the sourcetable element: name of the source system table to be extracted. |
sourcefield | Contains the name of the field of the source system table. |
The analysistype attribute of the XML element datasource must have the value DATA_ANALYTICS if the <realmtable> element specifies an analysis realm table (default value: PROCESS).
The tablename attribute of the <realmtable> element indicates the name of the target table in the analysis realm configuration. The table name does not affect the extraction itself; it is evaluated only by the PPM import.
Further information on data import for Data analytics is available in the PPM Data Analytics user guide.
The XML element <realmtable> contains the element <sourcetable>, which specifies the table to be extracted. The columns to be extracted from this table must be specified in <sourcefield> elements. Although <sourcetable> is optional in general, exactly one source table and at least one source column must be specified for the JDBC or SAP Extractor. Otherwise, an error message is output when the data source file is parsed.
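The rule for the JDBC or SAP Extractor can be illustrated with a small check. This is only a sketch in Python and not the extractor's actual parser: the function name `check_realmtable` and the wording of the messages are hypothetical, and the check covers nothing beyond the two conditions stated above.

```python
import xml.etree.ElementTree as ET


def check_realmtable(xml_text):
    """Return a list of problems found in a <realmtable> fragment (empty if none).

    Hypothetical helper: it only mirrors the two conditions described in the
    text (exactly one <sourcetable>, at least one <sourcefield>).
    """
    realmtable = ET.fromstring(xml_text)
    problems = []

    sourcetables = realmtable.findall("sourcetable")
    if len(sourcetables) != 1:
        problems.append(
            "expected exactly one <sourcetable>, found %d" % len(sourcetables)
        )

    for table in sourcetables:
        # Count only <sourcefield> elements that actually name a column.
        fields = [
            field.text.strip()
            for field in table.findall("sourcefield")
            if field.text and field.text.strip()
        ]
        if not fields:
            problems.append("<sourcetable> contains no <sourcefield>")

    return problems
```

Calling `check_realmtable` on a fragment with one `<sourcetable>` and at least one `<sourcefield>` returns an empty list; any other fragment returns one message per violated condition.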
A data source definition can contain at most one table. It is not possible to restrict the number of rows. All rows are extracted, including rows that have identical values in the columns to be extracted. For example, if the columns First name and Last name are to be extracted and the table contains ten entries Peter and Schmidt, ten events with identical attribute values will be generated.
The following example explains the configuration:
<realmtable tablename="COMPANY_EMPLOYEE">
<sourcetable tablename="EMPLOYEE">
<sourcefield>EMPLOYEE_ID</sourcefield>
<sourcefield>NAME</sourcefield>
</sourcetable>
...
</realmtable>
<dataextraction>
<outputfilename>..\custom\testclient\data\employee.xml</outputfilename>
</dataextraction>
...
<systemconfig>..\custom\testclient\SourceSystemConfig.xml</systemconfig>
In contrast to the behavior of the conventional JDBC or SAP Extractor, attributes no longer receive the table name as a prefix in the event output file. For example, if the table EMPLOYEE was extracted, the extractor usually generated events with attribute types of the form <table name>-<column name>:
<event>
<attribute type="EMPLOYEE-EMPLOYEE_ID">4711</attribute>
<attribute type="EMPLOYEE-NAME">Schmidt</attribute>
</event>
Extracting a table using the <realmtable> element, however, creates events whose attribute types carry no table name prefix:
<event>
<attribute type="EMPLOYEE_ID">4711</attribute>
<attribute type="NAME">Schmidt</attribute>
</event>
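The naming difference between the two modes can be summarized in a few lines. The following Python sketch is purely illustrative: the function `attribute_type` and its `realm_table` flag are hypothetical names and not part of the extractor.

```python
def attribute_type(table, column, realm_table):
    """Build the attribute type name as described in the text above.

    realm_table=False: conventional extraction, <table name>-<column name>.
    realm_table=True:  extraction via <realmtable>, column name only.
    """
    return column if realm_table else "%s-%s" % (table, column)


columns = ("EMPLOYEE_ID", "NAME")
conventional = [attribute_type("EMPLOYEE", c, False) for c in columns]
analytics = [attribute_type("EMPLOYEE", c, True) for c in columns]
print(conventional)  # ['EMPLOYEE-EMPLOYEE_ID', 'EMPLOYEE-NAME']
print(analytics)     # ['EMPLOYEE_ID', 'NAME']
```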
Null values in the DATA ANALYTICS mode
If the value of a column is null at the time an analysis realm table is extracted, this value is not written to the event. If a row with EMPLOYEE_ID = 4712 exists without a last name, the extractor creates the following events:
<event>
<attribute type="EMPLOYEE_ID">4711</attribute>
<attribute type="NAME">Schmidt</attribute>
</event>
<event>
<attribute type="EMPLOYEE_ID">4712</attribute>
</event>
However, if all column values to be extracted are null, the event is written with empty attributes:
<event>
<attribute type="EMPLOYEE_ID">4711</attribute>
<attribute type="NAME">Schmidt</attribute>
</event>
<event>
<attribute type="EMPLOYEE_ID"></attribute>
<attribute type="NAME"></attribute>
</event>
If multiple such rows exist, they are all transferred, in contrast to the common extraction procedure (analysistype=PROCESS). The event file therefore contains as many <event> elements as there are rows in the source system table.
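The null handling described in this section can be modeled as follows. This is a minimal Python sketch of the documented behavior, not the extractor's implementation; the function names `row_to_event` and `rows_to_events` and the representation of a row as a dictionary are assumptions made for illustration.

```python
from xml.sax.saxutils import escape


def row_to_event(row):
    """Render one source row (dict: column -> value or None) as an <event> string."""
    if all(value is None for value in row.values()):
        # Every extracted column is null: the event is kept, with empty attributes.
        attrs = [(column, "") for column in row]
    else:
        # Otherwise, null columns are left out of the event.
        attrs = [
            (column, str(value))
            for column, value in row.items()
            if value is not None
        ]
    lines = ["<event>"]
    for column, text in attrs:
        lines.append('<attribute type="%s">%s</attribute>' % (column, escape(text)))
    lines.append("</event>")
    return "\n".join(lines)


def rows_to_events(rows):
    # One event per row: identical rows and all-null rows are not merged.
    return [row_to_event(row) for row in rows]
```

For the rows 4711/Schmidt, 4712/null and two rows in which both columns are null, `rows_to_events` returns four events, the last two of which are identical and contain only empty attributes.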