Code page conversion and Unicode support make use of functionality provided by International Components for Unicode for Software AG (ICS). If you want to enable Natural for Unicode and code page support, you have to install the components provided with ICS: the ICS module SAGICU or an alternative ICS module and ICU data libraries.
Notes:
CFICU and CP are set to OFF.
SYSCP utility. See Invoking and Terminating SYSCP in the Utilities documentation of the Natural for Mainframes documentation.
If you want to enable Natural for Unicode and code page support, you need to link and load an ICU data library during the installation of Natural as described in Installing International Components for Unicode for Software AG for z/OS (see ICS 311).
The ICS module SAGICU is intended to be used independently from localization data. It contains no statically linked code pages or locales. A dataset containing the entirety of the ICU localization data, split into individual data items, is part of the ICS 311 delivery. Its name can be specified by the CFICU STEPLIB parameter or statically in the JCL as a Natural steplib.
Another feature of this module is collation services. Collation services are used to compare Unicode strings and take into account that alphabetical order varies from language to language. Accommodating the world's languages, writing systems and their different orderings is a considerable challenge, and the ICU collation service addresses it by comparing strings in a locale-sensitive fashion. For example, in the German locale, the character "Ä" is sorted between "A" and "B"; in the Swedish locale, it is sorted after "Z". In Lithuanian, the character "y" is sorted between "i" and "k". The ICU implementation of collation services is compliant with the Unicode Collation Algorithm and conforms to ISO 14651. The algorithms have been designed and reviewed by experts in multilingual collation.
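The effect of locale-sensitive ordering can be illustrated with a small Python sketch. It does not use ICU or Natural; the two key functions are hand-written toy approximations of German and Swedish ordering, and the word list is made up for the illustration.

```python
import unicodedata

# Toy Swedish alphabet: 'å', 'ä' and 'ö' are separate letters after 'z'.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def german_key(word):
    """Toy German key: compare base letters first, so 'Ä' sorts with 'A'."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word acts as a tie-breaker between otherwise equal keys.
    return (base.casefold(), word)


def swedish_key(word):
    """Toy Swedish key: position of each letter in the Swedish alphabet."""
    composed = unicodedata.normalize("NFC", word).casefold()
    # Characters outside the toy alphabet sort last, by code point.
    return [
        SWEDISH_ALPHABET.index(ch) if ch in SWEDISH_ALPHABET
        else len(SWEDISH_ALPHABET) + ord(ch)
        for ch in composed
    ]


words = ["Zebra", "Äpple", "Apfel", "Bär"]
german_order = sorted(words, key=german_key)
swedish_order = sorted(words, key=swedish_key)
print(german_order)   # 'Äpple' is placed among the A words
print(swedish_order)  # 'Äpple' is placed after 'Zebra'
```

A real collator handles many more rules (multiple comparison levels, contractions, script ordering); the sketch only shows why the same strings can sort differently per locale.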
Statically-linked collation data (set of code pages and locale IDs) is not supported with ICS 311.
ICS 311 uses all of the ICU localization data.
The ICS module SAGICU provides the following code pages and locales:
Code Pages | Locales |
---|---|
IBM037 | de_DE |
If your Natural system runs on z/OS with an IBM processor with architecture level 9 or higher, you can replace the ICS module SAGICU by SAGICUA9.
SAGICUA9 is built to use advanced machine instructions introduced with IBM's ESA/390 and z/Architecture. You can use the system command TECH (see the System Commands documentation) to find out the architecture level supported on your current machine.
SAGICUA9 improves the execution performance, especially for Natural statements that use Unicode variables or code-page encoding instructions (for example, MOVE ENCODED). For more information on architecture levels, refer to the related documentation from IBM (z/Architecture, Principles of Operation).
Warning: An operation exception error (abend code S0C1) can occur if the ICS module SAGICUA9 is used, but the underlying machine architecture level is lower than 9.
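The selection rule described above can be written down as a short Python sketch. The function name `choose_ics_module` is hypothetical and only restates the rule; the architecture level itself would be taken from the output of the TECH system command.

```python
def choose_ics_module(architecture_level: int) -> str:
    """Return the ICS module name suitable for the given architecture level.

    Hypothetical helper: SAGICUA9 requires architecture level 9 or higher;
    on lower levels it can fail with abend S0C1, so SAGICU is used instead.
    """
    return "SAGICUA9" if architecture_level >= 9 else "SAGICU"


print(choose_ics_module(8))
print(choose_ics_module(9))
```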
If you want to enable Natural for Unicode and code page support, you need to link and load an ICU data library during the installation of Natural as described in Installing International Components for Unicode for Software AG for z/OS.
ICU data libraries are supplied with the following ICS data modules where nn denotes the current version of the module as announced in the current Natural Release Notes for Mainframes.
Data Module | Description |
---|---|
ICSDTnnE | Contains the most popular code pages and locales. The code pages are already declared in NATCONFG. |
ICSDTnnJ | Same as ICSDTnnE, but enhanced by Japanese code pages. ICSDTnnJ is already linked to the ICS module SAGICU (or an alternative ICS module). It contains the above mentioned code pages and locales. |
ICSDTnnX | Contains all possible converters and locales offered by the currently supported ICU version. It supports about 230 different code pages (predominantly EBCDIC code pages) and 238 locales. Therefore, the module size is huge. |
The ICU data items supported by Natural include converters and collators. For example: a converter is used when a MOVE ENCODED statement executes, and a collator when strings are compared in an IF statement.
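What a converter does can be illustrated outside Natural with Python's built-in `cp037` codec, which implements the IBM037 EBCDIC code page. This is only an analogy for the conversion step, not the ICU converter that Natural calls.

```python
text = "Hello"

# Convert from Unicode to the EBCDIC code page IBM037 ...
ebcdic = text.encode("cp037")
# ... and back from IBM037 to Unicode.
round_trip = ebcdic.decode("cp037")

print(ebcdic.hex(" "))
print(round_trip)
```

The same characters have different byte values in EBCDIC and in ASCII-based encodings, which is why a code page table is needed for every conversion.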
An ICU data item is either statically linked to an ICU data library or it is dynamically loaded on request during the Natural session.
ICU data items are supplied as loadable modules on the ICS data set supplied for installation of Natural, and must be accessible through the Natural steplib chain.
When a data item is used for the first time, ICS attempts to open it from the linked or loaded ICU data library. If no data item is associated with a library, ICS attempts to dynamically load the data item from the ICS data set.
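The lookup order described above can be sketched in Python as follows. The dictionaries standing in for the ICU data library and the ICS data set, and the function name `open_data_item`, are assumptions made for the illustration; they are not part of ICS.

```python
def open_data_item(name, data_library, ics_data_set):
    """Resolve a data item: linked or loaded library first, then ICS data set.

    Both containers are plain dicts in this sketch (hypothetical stand-ins).
    Returns the item together with the place it was found.
    """
    if name in data_library:
        return data_library[name], "data library"
    if name in ics_data_set:
        # Fallback: dynamic load of the single data item module.
        return ics_data_set[name], "ICS data set"
    raise LookupError(f"data item {name!r} is not available")


library = {"I58C0001": "converter from library"}
data_set = {"I58C0001": "converter from data set",
            "I58C0074": "converter for ibm-1148"}

found_in_library = open_data_item("I58C0001", library, data_set)
found_in_data_set = open_data_item("I58C0074", library, data_set)
print(found_in_library)
print(found_in_data_set)
```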
The name of a data item module in the ICS data set is restricted to eight characters. As indicated in the table below, it consists of the following:
A prefix (I),
A two-digit ICU version (xx),
A logical group identifier (C, B, S, L, M or D), and
A four-digit sequence number (nnnn).
Module Name | Contents |
---|---|
IxxCnnnn | Charset mapping tables (converter modules) |
IxxBnnnn | Break iterators |
IxxSnnnn | Collators (collation services) |
IxxLnnnn | Localization (formatting, display names and other localized data) |
IxxMnnnn | Miscellaneous data (rule-based number formats and transliterators) |
IxxDnnnn | Base data |
Example:
I58C0074 is the name of a converter for ICU Version 58.2 and code page ibm-1148_P100-1997.
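The naming scheme can be checked mechanically. The following Python sketch splits a module name into its parts; the function and type names are hypothetical, and the group descriptions are shortened from the table above.

```python
import re
from typing import NamedTuple

# Group identifiers and their contents, shortened from the table above.
GROUPS = {
    "C": "Charset mapping tables",
    "B": "Break iterators",
    "S": "Collators",
    "L": "Localization",
    "M": "Miscellaneous data",
    "D": "Base data",
}

# Prefix I, two-digit ICU version, group identifier, four-digit sequence.
_NAME_PATTERN = re.compile(r"I([0-9]{2})([CBSLMD])([0-9]{4})")


class DataItemName(NamedTuple):
    icu_version: str
    group: str
    sequence: int


def parse_data_item_name(name: str) -> DataItemName:
    """Split an eight-character data item module name into its parts."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid data item module name: {name!r}")
    version, group, sequence = match.groups()
    return DataItemName(version, GROUPS[group], int(sequence))


parsed = parse_data_item_name("I58C0074")
print(parsed)
```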
However, in a MOVE ENCODED statement, Natural expects the long name of the code page that corresponds to the data item module. Any valid alias name of the code page can be used. The name of the code page is automatically mapped to the eight-character short name when the data item module is loaded.
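The mapping from a code page name or alias to the short module name can be pictured as a table lookup. In the Python sketch below, only the pair ibm-1148_P100-1997 and I58C0074 comes from the example above; the alias entries, the normalization rule and the function names are assumptions for illustration and do not reproduce the actual ICS mapping.

```python
# Long name taken from the example above; the aliases are illustrative only.
ALIAS_TO_MODULE = {
    "ibm-1148_P100-1997": "I58C0074",
    "ibm-1148": "I58C0074",
    "cp1148": "I58C0074",
}


def normalize(name: str) -> str:
    """Assumed normalization: ignore case, hyphens, underscores and blanks."""
    for separator in "-_ ":
        name = name.replace(separator, "")
    return name.casefold()


_LOOKUP = {normalize(alias): module for alias, module in ALIAS_TO_MODULE.items()}


def short_module_name(code_page: str) -> str:
    """Map a code page name or alias to its eight-character module name."""
    try:
        return _LOOKUP[normalize(code_page)]
    except KeyError:
        raise LookupError(f"unknown code page name: {code_page!r}") from None


print(short_module_name("IBM-1148"))
print(short_module_name("ibm-1148_P100-1997"))
```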
For further information, see the appropriate ICU web site.
Using dynamically loaded single data item modules allows for extensive flexibility. Data is loaded on demand, and all code pages are supported. A dataset containing all of the ICU localization data, split into single data items, is part of the ICS 311 delivery.
A single data item module is loaded when first accessed (for example, by a MOVE ENCODED statement) and is then immediately available for future use without the need to reload. Only code pages that have already been used are kept in memory; in contrast to previous ICS versions, neither statically linked data nor a separate data library is held in memory.
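The load-on-first-access behavior amounts to a simple cache. The Python sketch below uses a hypothetical `DataItemCache` class and a stand-in loader function to show that a repeated request for the same data item does not trigger a second load.

```python
class DataItemCache:
    """Keeps data items in memory after their first use (illustrative only)."""

    def __init__(self, loader):
        self._loader = loader   # callable that loads one data item by name
        self._loaded = {}       # data items already in memory
        self.load_count = 0     # number of actual load operations

    def get(self, name):
        if name not in self._loaded:
            # First access: load the module and remember it.
            self._loaded[name] = self._loader(name)
            self.load_count += 1
        return self._loaded[name]


def fake_loader(name):
    # Stand-in for loading a module from the ICS data set.
    return f"contents of {name}"


cache = DataItemCache(fake_loader)
first = cache.get("I58C0074")
second = cache.get("I58C0074")
print(cache.load_count)
```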
If a Natural session is enabled for code page or Unicode support, you should ensure that Natural's Adabas user session also uses the appropriate user encoding for accessing Adabas data.
Because Adabas uses Entire Conversion Services (ECS) for conversion, the ECS name must be specified in the related NTCPAGE entry in module NATCONFG. To ensure that Natural's Adabas user session uses the correct code page, specify the ACODE and/or WCODE option in the OPRB parameter for the databases used.
For more information on Adabas Unicode and code page support conversion, see the Adabas documentation for mainframes.
Natural uses various tables for character translation and character property definition. The contents of the tables can be modified via profile parameters (TAB, UTAB1, UTAB2 and SCTAB) during the start of a Natural session.
If Natural is running with code page support (that is: the CP profile parameter is set to a value other than OFF), the tables cannot be modified by the user. In this case, the following Natural startup message will be issued to notify the user that the above mentioned session parameters are not considered:
Character translation parameter table-name ignored due to CFICU=ON.
Natural adjusts the tables automatically, according to the code page used for the Natural session (value of the system variable *CODEPAGE). See also Translation Tables in the Operations documentation.
Natural supports multi-byte code pages (MBCS) such as IBM-939, which is a Japanese code page based on EBCDIC and DBCS. Multi-byte code pages can be selected using the CP parameter, by setting CP to AUTO (if supported) or to the name of a code page. If Natural is running with a multi-byte code page, it uses internal I/O buffers which are based on Unicode. This means that all data written into the internal I/O buffers by an I/O statement is converted to Unicode. Due to the requirements of Unicode and multi-byte code pages, the I/O buffers are larger than with traditional I/O, because Unicode characters need twice as much space as EBCDIC characters and enhanced attributes are needed to describe a field.
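The doubling in size can be reproduced with a short Python sketch. It assumes that the Unicode buffers use a two-byte encoding (UTF-16 is used here) and takes Python's `cp037` codec as a representative single-byte EBCDIC code page; the field content is made up, and the extra space for enhanced attributes is not modeled.

```python
field = "NATURAL"

# Single-byte EBCDIC representation: one byte per character.
ebcdic_bytes = field.encode("cp037")
# Assumed two-byte Unicode representation (UTF-16, big-endian, no BOM).
unicode_bytes = field.encode("utf-16-be")

print(len(ebcdic_bytes), len(unicode_bytes))
```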
In the case of single-byte code pages (SBCS) such as IBM-1140, the traditional EBCDIC-based I/O is still used to preserve resources.