Configuring Country Information for Cleansing Input
The country information is important input to any data quality cleansing software for address validation because it helps determine the country-specific rules to apply. Typically, for each country, there is a limited set of accepted codes. For example, the United States may be specified as USA, US, United States, or United States of America, while a data quality tool might always expect it as US. This conversion can either be made at the data quality tool level, thereby impacting performance, as well as requiring detailed knowledge of the software, or can be handled through OneData configuration.
By default, there is a column for country data in both the Consolidation Staging and Consolidation Gold tables. If the country code comes from the source systems in a right format that does not require transformation for the data quality tool, no further configuration changes are required. But if the country format from source systems varies, then you must configure a transformation so that the country code can be passed to the data quality cleansing software.
1. Provide the country translation table name, its primary key column, and the ISO code column in the onedata.properties file as listed in the sample given below for the following properties:
onedata.dataquality.countrytranslation.tablename = GEO_COUNTRY onedata.dataquality.countrytranslation.primarykeycolumn = CNTRY_ID onedata.dataquality.countrytranslation.isocountrycodecolumn = ISO_CHAR_2_CODE Note:
The GEO_COUNTRY table stores the list of countries in different formats. Countries that share the same ISO code are passed to the data quality software. For example, the GEO_COUNTRY table might have the structure, CNTRY_ID, CNTRY_NM, and ISO_CHAR_2_CODE. The records for United States and USA in CNTRY_NM have different IDs in CNTRY_ID (primary key values), but the same ISO_CHAR2_CODE US. The ISO code is the value that is passed to the data quality software, like Trillium.
2. Set a foreign key constraint in the Country column of Consolidation Staging and Consolidation Gold tables to the country translation table.
3. Set Additional Qualifier as Consolidation Country Indicator for the country column of the Consolidation Staging and Gold table.
Note:
The country translation table can be created with the structure suggested above or as needed. Update the corresponding OneData properties.
4. Set a foreign key constraint in the Country column of Consolidation Staging and Consolidation Gold tables to the country translation table.
5. Set Additional Qualifier as Consolidation Country Indicator for the country column of the Consolidation Staging and Gold table.
Note:
The country translation table can be created with the structure suggested above or as needed. Update the corresponding OneData properties.
6. Set a foreign key constraint in the Country column of Consolidation Staging and Consolidation Gold tables to the country translation table.
7. Set Additional Qualifier as Consolidation Country Indicator for the country column of the Consolidation Staging and Gold table.
Note:
The country translation table can be created with the structure suggested above or as needed. Update the corresponding OneData properties values.