When used in this document, the notation vr represents the 2-digit ICU version number.
This section lists the profile parameters and macros which are used in conjunction with Unicode and code page support.
Unless otherwise noted, the profile parameters and macros mentioned in this section are explained in detail in the Parameter Reference.
Parameter or Macro | Description |
---|---|
CFICU or NTCFICU macro | Enables Unicode support for various Unicode settings. |
CMPO or CPAGE keyword subparameter of NTCMPO macro | Generates code page-sensitive Natural programs. |
CP | Defines the default code page for Natural. This code page is used for the runtime and development environment unless it is overridden by a code page defined for a single object (for example, for a Natural source). Only code pages suitable for the platform can be used; for example, no ASCII code page can be defined for a mainframe platform. An initialization error message is issued if an unsuitable code page is specified. |
CPCVERR | Specifies whether a conversion error that occurs when converting from Unicode to code page, from code page to Unicode, or from one code page to another results in a Natural error. This parameter is not considered when Natural sources are converted while being loaded into the source area or cataloged. It is also not considered when a Unicode field is converted into the code page before an I/O on a terminal emulation. In this case, the substitution character defined by ICU is replaced by the place holder character defined for the code page. |
CPOBJIN | Specifies the code page in which the batch input file for data is encoded. This file is defined in the data set CMOBJIN. |
CPPRINT | Specifies the code page in which the batch output file is to be encoded. This file is defined in the data set CMPRINT. |
CPSYNIN | Specifies the code page in which the batch input file for commands is encoded. This file is defined in the data set CMSYNIN. |
NTCPAGE macro | In the NATCONFG module, this macro defines a code page and all related information, such as place holder character, locale ID and collation tables. |
OPRB or NTOPRB macro | Sets the ACODE and/or WCODE option to define the user encoding if the Adabas database in use is enabled for UES (universal encoding support). |
PRINT or CP keyword subparameter of NTPRINT macro | Defines the code page for a report. |
SRETAIN | Specifies that all existing sources have to be saved in their original encoding format. See also Customizing Your Environment. |
See also:
Natural in Batch Mode in the Operations documentation.
For valid code pages, see http://www.iana.org/assignments/character-sets.
The parameter CFICU and its subparameters are explained in detail in the Parameter Reference. Some of the subparameters have an impact on performance.
If collation services are used to compare Unicode strings, both strings are first checked to determine whether they are normalized. The check itself consumes a considerable amount of CPU time. If you are sure that the strings are already normalized, you can switch off the check (COLNORM=OFF).
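
The trade-off can be illustrated outside Natural. The following is a minimal Python sketch that uses the standard `unicodedata` module (Python 3.8 or later), not Natural or ICU code. The function name `compare` and the flag `check_normalized` are hypothetical, and plain equality stands in for a real collation comparison.

```python
import unicodedata

def compare(a: str, b: str, check_normalized: bool = True) -> bool:
    """Return True if a and b are equal, optionally normalizing first."""
    if check_normalized:
        # The check (and the normalization, if needed) costs extra work
        # on every comparison.
        if not unicodedata.is_normalized("NFC", a):
            a = unicodedata.normalize("NFC", a)
        if not unicodedata.is_normalized("NFC", b):
            b = unicodedata.normalize("NFC", b)
    return a == b

precomposed = "\u00e4"  # one code point
combined = "a\u0308"    # base letter plus combining diaeresis

print(compare(precomposed, combined))                          # True
print(compare(precomposed, combined, check_normalized=False))  # False
```

Skipping the check is only safe when both inputs are known to be normalized already; otherwise equal text can compare as different, as the second call shows.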
In Unicode, the same character can be represented either as one code point or as a combination of two or more code points. For example, the German character "ä" can be represented by "U+00E4" or by the combination of the code points "U+0061" and "U+0308". The conversion from Unicode to, for example, IBM01140 treats each code point of such a combination separately and produces an "a" followed by a substitution character, since code point "U+0308" is not represented in the target code page. With CNVNORM=ON, a normalization is performed right before the actual conversion. The normalization consumes additional CPU time and temporary storage. If you are sure that no combining characters are involved in MOVE statements (except MOVE NORMALIZED), you should set CNVNORM to OFF to increase performance. Note that not all possible combinations are represented by a single coded Unicode code point.
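
The effect of normalizing before conversion can be sketched in Python. This is an illustration under assumptions, not Natural or ICU code: the helper `to_code_page` is hypothetical, Python's `cp1140` codec stands in for IBM01140, and Python substitutes "?" rather than the code page's own substitution character.

```python
import unicodedata

def to_code_page(text: str, normalize: bool, codec: str = "cp1140") -> bytes:
    """Convert text to a code page, optionally normalizing to NFC first."""
    if normalize:
        text = unicodedata.normalize("NFC", text)
    # errors="replace" substitutes a replacement character for anything
    # the target code page cannot represent.
    return text.encode(codec, errors="replace")

combined = "a\u0308"  # "a" followed by the combining diaeresis U+0308

without_norm = to_code_page(combined, normalize=False)
with_norm = to_code_page(combined, normalize=True)

print(len(without_norm))  # 2: "a" plus a substitution character
print(len(with_norm))     # 1: the single character "ä"
```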
Conversion from Unicode to code page and vice versa is comparatively slow. The reason is that the ICU implementation is written in C++ and covers nearly all Unicode, code page and language aspects in the world. However, some code pages can be mapped to Unicode (and vice versa) via translation tables to accelerate conversion. Accelerator tables are activated with the CPOPT subparameter. If it is set to ON, Natural automatically creates two accelerator tables during session initialization by using ICU conversion functions. The first table (with a size of 512 bytes) is used for conversion from code page to Unicode and the other table (with a size of 65535 bytes) is used for conversion from Unicode to code page. During a Natural session, all conversions are then executed via the accelerator tables instead of ICU calls. Accelerator tables are only provided for the default code page (*CODEPAGE). Temporary code pages (for example, in MOVE ENCODED statements) do not use accelerator tables if the module NATCPTAB is not linked. If it is linked, up to 30 accelerator tables based on the ICU database are used to speed up conversion.
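
The idea behind accelerator tables can be sketched in Python: the slow converter is called once per table entry at start-up, and later conversions are plain lookups. This is a sketch under assumptions, not Natural's implementation. The names `build_tables`, `decode_fast` and `encode_fast` are hypothetical, Python's `cp1140` codec stands in for the ICU converter, and the table layout (and therefore the exact table sizes) is chosen for illustration only.

```python
from array import array

SUBSTITUTE = 0x3F  # assumed substitution byte for an EBCDIC code page

def build_tables(codec: str):
    """Build lookup tables once, using the slow converter for each entry."""
    to_unicode = array("H", [0xFFFD] * 256)         # code page byte -> code point
    from_unicode = bytearray([SUBSTITUTE]) * 65536  # BMP code point -> byte
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue  # keep the defaults for undefined bytes
        to_unicode[byte] = ord(ch)
        from_unicode[ord(ch)] = byte
    return to_unicode, from_unicode

def decode_fast(data: bytes, to_unicode) -> str:
    """Code page to Unicode by table lookup."""
    return "".join(chr(to_unicode[b]) for b in data)

def encode_fast(text: str, from_unicode) -> bytes:
    """Unicode to code page by table lookup; non-BMP input is substituted."""
    return bytes(
        from_unicode[ord(c)] if ord(c) < 0x10000 else SUBSTITUTE for c in text
    )

to_u, from_u = build_tables("cp1140")
sample = "Natural"
print(encode_fast(sample, from_u) == sample.encode("cp1140"))
print(decode_fast(sample.encode("cp1140"), to_u) == sample)
```

A table per code page costs memory, which is consistent with the text above limiting accelerator tables to the default code page unless NATCPTAB is linked.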
The parameters CFICU and CP can be used to adjust Natural to specific purposes:
Settings | Description |
---|---|
CFICU=OFF, CP=OFF | Compatibility mode. For running existing applications without Unicode and without code page support. Legacy translation tables are used for I/O translation. Compared with former versions, there is no significant increase in resource consumption (CPU time and buffer usage). This mode does not need the ICS module SAGICU (or an alternative ICS module) to be linked to the Natural nucleus. |
CFICU=ON, CP=OFF | For new applications that use Unicode and code page conversion (MOVE ENCODED) but not default code page support. The system variable *CODEPAGE is therefore empty. It is possible to use U format variables, but it is not possible to use, for example, MOVE A TO U, since this requires the default code page information. In that case, error NAT3411 is issued, indicating that no default code page is available. |
CFICU=ON, CP=value * | For new applications that use full Unicode as well as code page support. |
CFICU=OFF, CP=value * | This combination does not make sense, because code page support needs ICU services for conversion. Therefore, CFICU=ON is enforced in this case and a session initialization message is issued. |
* where value is any value other than OFF.
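
The four combinations can be summarized as a small decision function. The following Python sketch only restates the table; the function name `resolve_settings` and the wording of the returned notes are hypothetical and do not correspond to any Natural interface or message text.

```python
def resolve_settings(cficu: str, cp: str):
    """Return (effective CFICU, effective CP, note) for a settings pair."""
    cficu, cp = cficu.upper(), cp.upper()
    if cp != "OFF" and cficu == "OFF":
        # Code page support needs ICU services, so CFICU=ON is enforced.
        return "ON", cp, "CFICU=ON enforced; initialization message issued"
    if cficu == "OFF":
        return "OFF", "OFF", "compatibility mode; legacy translation tables"
    if cp == "OFF":
        return "ON", "OFF", "Unicode and MOVE ENCODED only; *CODEPAGE is empty"
    return "ON", cp, "full Unicode and code page support"

print(resolve_settings("OFF", "IBM01140"))
```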
The compiler option CPAGE creates objects that can be executed with a code page which is different from the code page used at creation time. This means that all alphanumeric constants of the object, which are coded in the code page used at creation time, have to be converted to the code page which is active at execution time. To make it possible for the Natural object loader to find and convert alphanumeric constants, an additional table is created by the compiler. This increases the size of the generated object, depending on the number of alphanumeric constants used. The conversion at runtime consumes additional CPU time. If the default code page (value of the system variable *CODEPAGE) is the same as the code page at creation time, or if the session has no default code page (CP=OFF), no conversion is done. Conversion errors are ignored, regardless of the setting of the parameter CPCVERR. If the compiler option CPAGE is set to OFF, no conversion is performed at runtime and the alphanumeric constants are used unchanged.
The following sample program is cataloged with code page IBM01141 (German) and is executed with default code page IBM01140 (US). The characters "Ä", "Ö" and "Ü" are defined in both code pages, but at different code points.
Example 1 - CPAGE=OFF:

    OPTIONS CPAGE=OFF
    WRITE *CODEPAGE 'ÄÖÜ'
    END

Output with code page IBM01140 (US):

    Page 1
    IBM01140 ¢\!

Example 2 - CPAGE=ON:

    OPTIONS CPAGE=ON
    WRITE *CODEPAGE 'ÄÖÜ'
    END

Output with code page IBM01140 (US):

    Page 1
    IBM01140 ÄÖÜ
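
The two outputs above can be reproduced outside Natural to show where the characters "¢\!" come from. The following Python sketch is an illustration, not Natural code. Python has no IBM01141 codec, so `cp273` (German EBCDIC) is used as a stand-in, on the assumption that it matches IBM01141 for the three characters involved; `cp1140` corresponds to IBM01140.

```python
CREATION_CP = "cp273"    # stand-in for IBM01141 (German)
EXECUTION_CP = "cp1140"  # IBM01140 (US)

# The constant as it is stored in the object at creation time.
constant = "ÄÖÜ".encode(CREATION_CP)

# CPAGE=OFF: the bytes are interpreted unchanged in the execution code page.
cpage_off = constant.decode(EXECUTION_CP)

# CPAGE=ON: the constant is first converted to the execution code page.
converted = constant.decode(CREATION_CP).encode(EXECUTION_CP)
cpage_on = converted.decode(EXECUTION_CP)

print(cpage_off)  # ¢\!
print(cpage_on)   # ÄÖÜ
```

The same byte values mean different characters in the two code pages, which is why the unconverted constant is displayed incorrectly.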
The most common standard for code page names is the IANA name. Therefore, the system variable *CODEPAGE contains the IANA name of the default code page. A code page is also qualified by its Coded Character Set ID (CCSID). Currently, Adabas uses the Entire Conversion Service definition (ADAECS). The macro NTCPAGE can be used to assign these different names to the unambiguous IANA name. NTCPAGE is part of the Natural configuration module (NATCONFG).
It does not matter whether the IANA name, the CCSID/CCSN or the alias name is entered with the CP parameter. The alias name can be a user-defined name which is used to assign a more meaningful name to the code page. In any case, *CODEPAGE contains the IANA name of the selected code page.
In addition, a place holder character can be defined for a code page. It overrides the default substitution character of that code page, which is normally a non-displayable character (for example, H'3F' in an EBCDIC code page). The place holder character can be used to prevent non-displayable characters from being sent to terminals.
Example:

    NTCPAGE IANA=IBM01140,CCSID=1140,ECS=1140,ALIAS='US',PHC=003F

The values IBM01140, 1140 or US can be entered with the CP parameter to activate the code page. *CODEPAGE contains the name IBM01140. The substitution character of the code page will be replaced by "U+003F", which is a question mark (?).
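
The name resolution described above can be pictured as a lookup from any accepted name to the IANA name. The following Python sketch is a hypothetical model of that lookup, not the NATCONFG format or a Natural API; the names `CodePageEntry`, `build_index` and `resolve` are assumptions, and the single entry mirrors the NTCPAGE example.

```python
from typing import Dict, List, NamedTuple

class CodePageEntry(NamedTuple):
    iana: str
    ccsid: str
    ecs: str
    alias: str
    phc: int  # place holder character as a Unicode code point

ENTRIES = [CodePageEntry("IBM01140", "1140", "1140", "US", 0x003F)]

def build_index(entries: List[CodePageEntry]) -> Dict[str, CodePageEntry]:
    """Map every accepted name (IANA, CCSID, alias) to its entry."""
    index = {}
    for entry in entries:
        for name in (entry.iana, entry.ccsid, entry.alias):
            index[name.upper()] = entry
    return index

INDEX = build_index(ENTRIES)

def resolve(cp_value: str) -> str:
    """Return the IANA name that *CODEPAGE would contain for a CP value."""
    return INDEX[cp_value.upper()].iana

for value in ("IBM01140", "1140", "US"):
    print(value, "->", resolve(value))

print(chr(INDEX["US"].phc))  # the place holder character
```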
The number of available code pages depends on the ICU data library used. All code pages defined in the data package currently in use can be used by Natural. An NTCPAGE entry is only necessary if an alternative alias name or place holder character is desired.
The following configuration parameter is available with Natural Development Server (NDV):
Settings | Description |
---|---|
TERMINAL_EMULATION=WEBIO | Specifies that the Natural Web I/O Interface client (which supports Unicode) is used for input and output. |
The code page information of the object is part of the object directory displayed with the LIST system command. For details, see Displaying Directory Information in the System Commands documentation.
The encoding of code page data can be specified on different levels.
The default code page can be defined with the CP parameter.
A code page can be defined for Natural sources, batch input files (CPOBJIN, CPSYNIN) and batch output files (CPPRINT).
If a code page is defined at object level, it overrides the default code page.