Code page conversion and Unicode support use functionality provided by International Components for Unicode for Software AG (ICS). If you want to enable Natural for Unicode and code page support, you have to install the components provided with ICS: the ICS module SAGICU or an alternative ICS module (z/VSE and z/OS only), and the ICU data libraries.
Notes:
CFICU and CP are set to OFF.
SYSCP utility. See Invoking and Terminating SYSCP in the Utilities documentation of the Natural for Mainframes documentation.
If you want to enable Natural for Unicode and code page support, you need to link and load the ICS module SAGICU or an alternative ICS module (z/OS and z/VSE only) during the installation of Natural as described in Installing International Components for Unicode for Software AG for z/OS, z/VSE and BS2000/OSD.
The ICS module SAGICU is intended for use in most European countries and in North and South American countries. It contains a reduced set of code pages and locale IDs for English, German, French and Spanish language areas. Because the set of supported languages is reduced, the module is relatively small.
The ICS module SAGICU contains the following:
Another feature of this module is collation services. Collation services are used to compare Unicode strings, taking into account that alphabetical order varies from language to language. Accommodating the world's languages and writing systems and their different orderings is a big challenge, but the ICU collation service provides the means to compare strings in a locale-sensitive fashion. For example, in the German locale, the character "Ä" is sorted between "A" and "B"; in the Swedish locale, it is sorted after "Z". In Lithuanian, the character "y" is sorted between "i" and "k". The ICU implementation of collation services complies with the Unicode Collation Algorithm and conforms to ISO 14651. The algorithms have been designed and reviewed by experts in multilingual collation.
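The locale dependence described above can be illustrated with a toy example. The following Python sketch does not use ICU or the Unicode Collation Algorithm; the two hand-written alphabets are simplified assumptions that model only where "Ä" sorts in German and in Swedish.

```python
# Toy illustration only: real collation (ICU/UCA) uses multi-level weights.
# The alphabets below are simplified assumptions, not ICU data.
import string

BASE = string.ascii_uppercase  # "ABC...Z"

# German: "Ä" sorts directly after "A", that is, before "B".
GERMAN_ORDER = BASE.replace("A", "AÄ")
# Swedish: "Ä" is a separate letter that sorts after "Z".
SWEDISH_ORDER = BASE + "Ä"


def sort_key(order):
    """Return a key function that ranks each character by its position in `order`."""
    rank = {ch: i for i, ch in enumerate(order)}
    return lambda word: [rank[ch] for ch in word.upper()]


words = ["ZEBRA", "ÄRA", "BETA", "ALFA"]
german = sorted(words, key=sort_key(GERMAN_ORDER))
swedish = sorted(words, key=sort_key(SWEDISH_ORDER))

print(german)   # ['ALFA', 'ÄRA', 'BETA', 'ZEBRA']
print(swedish)  # ['ALFA', 'BETA', 'ZEBRA', 'ÄRA']
```

The same list of words yields a different order depending on which alphabet is applied, which is the effect a locale-specific collator produces for real data.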
The ICS module SAGICU provides the following code pages and locales:
Code Pages | Locales |
---|---|
IBM037 | de_DE |
This section does not apply to BS2000/OSD.
On z/OS and z/VSE platforms, you can use the ICS module SAGICUA9 instead of the ICS module SAGICU. In addition to what SAGICU provides, this module supports IBM architecture level 9.
Architecture levels employ instructions available with IBM hardware facilities that significantly improve performance. As for Natural, you can expect better performance for Natural statements that use Unicode variables or code-page encoding instructions (for example, MOVE ENCODED). For more information on architecture levels, refer to the related documentation from IBM.
The architecture level you can use depends on the IBM hardware facility installed at your site. Use the highest level that your hardware supports.
The following architecture levels can be specified with the ICS module SAGICUA9, the Natural PARSE XML statement and/or the Natural Optimizer Compiler (NOC):
Level Value | Supported By | IBM Hardware Facility Required |
---|---|---|
0 | All | Specifies that no architecture level is used. This is the default setting for compatibility with all mainframe platforms supported by Natural. |
1 to 4 | All | These values are not evaluated and are treated as ARCH=0. |
5 to 6 | NOC only | |
7 | NOC only | |
8 | NOC only | |
9 | PARSE XML and ICS only | |
10 | NOC only | |
11 | NOC only | |

Warning: An operation exception error (abend code S0C1) can occur if code generated with an architecture level greater than 0 is executed on a machine where the corresponding hardware facility is not installed.
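The table above can be read as a lookup from level value to the components that evaluate it. The following Python sketch encodes only the rows shown in the table. The function names are hypothetical, and rejecting values outside 0 to 11 is an assumption, since the table does not say how such values are handled.

```python
def supported_by(level: int) -> str:
    """Return the 'Supported By' entry of the table for an architecture level."""
    if not 0 <= level <= 11:
        # Assumption: the table does not cover values outside 0..11.
        raise ValueError(f"architecture level {level} is not listed")
    if level <= 4:
        # 0 means no architecture level; 1 to 4 are treated as ARCH=0.
        return "All"
    if level == 9:
        return "PARSE XML and ICS only"
    return "NOC only"


def effective_level(level: int) -> int:
    """Levels 1 to 4 are not evaluated and behave like level 0."""
    return 0 if 0 <= level <= 4 else level


print(supported_by(9))     # PARSE XML and ICS only
print(effective_level(3))  # 0
```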
If you want to enable Natural for Unicode and code page support, you need to link and load an ICU data library during the installation of Natural as described in Installing International Components for Unicode for Software AG for z/OS, z/VSE and BS2000/OSD.
The different ICU data libraries available are contained in the following data modules delivered with International Components for Unicode for Software AG (ICS) Version 2.1.1:
Data Module | Description |
---|---|
ICSDT54E | Contains the most popular code pages and locales. The code pages are already declared in NATCONFG. |
ICSDT54J | Same as ICSDT54E, but extended with Japanese code pages. ICSDT54J is already linked to the ICS module SAGICU (or an alternative ICS module on z/VSE or z/OS). It contains the above-mentioned code pages and locales. |
ICSDT54X | Contains all possible converters and locales offered by the currently supported ICU version. It supports about 230 different code pages (predominantly EBCDIC code pages) and 238 locales. As a result, the module is very large. |
It is possible to create your own ICU data library that exactly matches your requirements (see Customizing the ICU Data Library).
The ICU data items supported by Natural include converters and collators. For example, a converter is used when a MOVE ENCODED statement executes, and a collator is used when strings are compared in an IF statement.
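To make the role of a converter concrete, the following Python sketch performs code page conversions with Python's built-in codecs. This is an analogy only: it uses Python's `cp037` and `cp1140` codecs rather than ICS or ICU data items, and it is not Natural syntax.

```python
# Analogy only: Python's built-in EBCDIC codecs stand in for ICU converters.
text = "A"

ebcdic = text.encode("cp037")        # Unicode -> EBCDIC code page 037
round_trip = ebcdic.decode("cp037")  # EBCDIC -> Unicode

# "A" is X'C1' in EBCDIC, whereas it is X'41' in ASCII.
is_ebcdic_a = ebcdic == b"\xc1"

# A converter can only map characters that exist in the target code page:
# code page 1140 contains the euro sign, code page 037 does not.
euro_1140 = "€".encode("cp1140")
try:
    "€".encode("cp037")
    euro_in_037 = True
except UnicodeEncodeError:
    euro_in_037 = False
```

The failed encoding in the last step shows why the choice of code page matters: the same character may be representable in one code page and not in another.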
An ICU data item is either statically linked to an ICU data library or it is dynamically loaded on request during the Natural session.
ICU data items are supplied as loadable modules on the ICS data set supplied for installation of Natural, and must be accessible through the Natural steplib chain.
When a data item is used for the first time, ICS attempts to open it from the linked or loaded ICU data library. If no data item is associated with a library, ICS attempts to dynamically load the data item from the ICS data set.
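The lookup order described above (library first, dynamic load as a fallback) can be sketched as follows. This is a simplified Python model, not ICS code: the function, the dictionaries and the module name `ICSC0001` are hypothetical and used for illustration only.

```python
def open_data_item(name, library, load_module):
    """Return (data, source) for a data item, trying the data library first.

    library:     dict of items contained in the linked or loaded data library
    load_module: callable that loads a single data item module by name
                 (hypothetical stand-in for the dynamic load)
    """
    if name in library:
        return library[name], "library"
    return load_module(name), "dynamic"


# Hypothetical contents, for illustration only.
library = {"ICSC0038": b"converter table"}
dataset = {"ICSC0001": b"another converter table"}

first = open_data_item("ICSC0038", library, dataset.__getitem__)
second = open_data_item("ICSC0001", library, dataset.__getitem__)

print(first[1])   # library
print(second[1])  # dynamic
```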
The name of a data item module is restricted to 8 characters. In the case of a converter, the name consists of a prefix (ICSC) and a 4-character sequence number, for example, ICSC0038 for code page ibm-1148_P100-1997. However, in a MOVE ENCODED statement, Natural expects the long name of the code page that corresponds to the data item module. Any valid alias name of the code page can be used. The name of the code page is automatically mapped to the 8-character short name when the data item module is loaded.
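The mapping from a code page name or alias to the short module name can be pictured as a table lookup. The following Python sketch is an illustration under assumptions: only the pair ibm-1148_P100-1997 and ICSC0038 comes from the text above, while the other aliases, the case-insensitive matching and the function name are hypothetical.

```python
# Hypothetical alias table. Only the ibm-1148_P100-1997 -> ICSC0038 pair
# is taken from the documentation; the other entries are assumed aliases.
ALIASES = {
    "ibm-1148_p100-1997": "ICSC0038",
    "ibm-1148": "ICSC0038",  # assumed alias
    "ibm1148": "ICSC0038",   # assumed alias
}


def short_name(code_page: str) -> str:
    """Map a code page name or alias to its 8-character module name."""
    key = code_page.strip().lower()  # assumption: matching ignores case
    try:
        module = ALIASES[key]
    except KeyError:
        raise LookupError(f"no data item module known for {code_page!r}") from None
    # Converter modules: prefix ICSC plus a 4-character sequence number.
    assert len(module) == 8 and module.startswith("ICSC")
    return module


print(short_name("ibm-1148_P100-1997"))  # ICSC0038
```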
The four-character name prefix of a data item in the ICS data set indicates the logical group to which the item belongs:
Name Prefix | Contents |
---|---|
ICSCnnnn | Charset mapping tables (converter modules) |
ICSBnnnn | Break iterators |
ICSSnnnn | Collators (collation services) |
ICSLnnnn | Localization (formatting, display names and other localized data) |
ICSMnnnn | Miscellaneous data (rule-based number formats and transliterators) |
ICSDnnnn | Base data |
For further information, see the ICU web site at http://apps.icu-project.org/datacustom/ICUData54.html.
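As a small illustration of the naming scheme in the table above, the following Python sketch splits a module name into its logical group and sequence number. The function name is hypothetical, and the sketch assumes that nnnn always consists of four decimal digits, as in the example ICSC0038.

```python
import re

# Name prefixes and their contents, as listed in the table above.
GROUPS = {
    "ICSC": "Charset mapping tables (converter modules)",
    "ICSB": "Break iterators",
    "ICSS": "Collators (collation services)",
    "ICSL": "Localization (formatting, display names and other localized data)",
    "ICSM": "Miscellaneous data (rule-based number formats and transliterators)",
    "ICSD": "Base data",
}

# Assumption: the sequence number nnnn is four decimal digits.
_NAME = re.compile(r"(ICS[CBSLMD])(\d{4})")


def classify(module_name: str):
    """Return (group description, sequence number) for a data item module."""
    match = _NAME.fullmatch(module_name)
    if match is None:
        raise ValueError(f"not a data item module name: {module_name!r}")
    return GROUPS[match.group(1)], int(match.group(2))


print(classify("ICSC0038"))
# ('Charset mapping tables (converter modules)', 38)
```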
An ICU data library is faster but less flexible than dynamically loaded single data item modules. A data library is loaded once during session initialization.
The default ICU data library is usually sufficient for most applications, but you may have to customize a data library to meet your particular requirements. If you do not know which data item module you require, you have to load the extended, and therefore very large, data library ICSDT54X. This is a potential waste of space and can cause performance degradation, whereas single data item modules, which are dynamically loaded as required, occupy less space but still support the required code pages.
A single data item module is loaded when first accessed (for example, through a MOVE ENCODED statement). Single data item modules are especially useful for z/VSE, which does not support the extended data library ICSDT54X.
If a Natural session is enabled for code page or Unicode support, you should ensure that Natural's Adabas user session also uses the appropriate user encoding for accessing Adabas data.
Because Adabas uses Entire Conversion Services (ECS) for conversion, the ECS name must be specified in the related NTCPAGE entry in module NATCONFG.
To ensure that Natural's Adabas user session uses the correct code page, specify the ACODE and/or WCODE option in the OPRB parameter for the databases used.
For more information on Adabas Unicode and code page support conversion, see the Adabas documentation for mainframes.
Natural uses various tables for character translation and character property definition. The contents of the tables can be modified via profile parameters (TAB, UTAB1, UTAB2 and SCTAB) during the start of a Natural session.
If Natural is running with code page support (that is, the CP profile parameter is set to a value other than OFF), the tables cannot be modified by the user. In this case, the following Natural startup message is issued to notify the user that the above-mentioned parameters are ignored:
Character translation parameter table-name ignored due to CFICU=ON.
Natural adjusts the tables automatically, according to the code page used for the Natural session (value of the system variable *CODEPAGE).
See also Translation Tables in the Operations documentation.
Natural supports multi-byte code pages (MBCS) such as IBM-939, which is a Japanese code page based on EBCDIC and DBCS. Multi-byte code pages can be selected using the CP parameter, either by setting CP to AUTO (if supported) or to the name of a code page. If Natural is running with a multi-byte code page, it uses internal I/O buffers which are based on Unicode. This means that all data written into the internal I/O buffers by an I/O statement is converted to Unicode. Due to the requirements of Unicode and multi-byte code pages, the I/O buffers are larger than with traditional I/O, because Unicode characters need twice as much space as EBCDIC characters and enhanced attributes are needed to describe a field.
In the case of single-byte code pages (SBCS) such as IBM-1140, the traditional EBCDIC-based I/O is still used to preserve resources.
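The difference in buffer space can be checked with a short calculation. The following Python sketch compares the byte length of a string in a single-byte EBCDIC code page with its length in UTF-16. That the Unicode-based buffers use UTF-16 is an assumption made for this illustration, and the additional space for field attributes is not modeled.

```python
text = "NATURAL"

# Single-byte EBCDIC code page: one byte per character.
sbcs = text.encode("cp1140")
# UTF-16 without byte order mark: two bytes per character for these letters.
utf16 = text.encode("utf-16-be")

ratio = len(utf16) / len(sbcs)
print(len(sbcs), len(utf16), ratio)  # 7 14 2.0
```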