When used in this document, the notation vr represents the 2-digit ICU version number.
This section lists the profile parameters and macros which are used in conjunction with Unicode and code page support.
Unless otherwise noted, the profile parameters and macros mentioned in this section are explained in detail in the Parameter Reference.
Parameter or Macro | Description |
---|---|
CFICU or NTCFICU macro | Enables Unicode support and controls related Unicode settings. |
CMPO or CPAGE keyword subparameter of NTCMPO macro | Generates code page-sensitive Natural programs. |
CP | Defines the default code page for Natural. This code page is used for the runtime and development environment unless it is overridden by a code page defined for a single object (for example, for a Natural source). Only code pages that suit the platform can be used. This means, for example, that no ASCII code page can be defined for a mainframe platform. An initialization error message is issued if an unsuitable code page is specified. |
CPCVERR | Specifies whether a conversion error that occurs when converting from Unicode to a code page, from a code page to Unicode, or from one code page to another results in a Natural error. This parameter is not evaluated when Natural sources are converted while being loaded into the source area or cataloged. It is also not evaluated when a Unicode field is converted into the code page before an I/O on a terminal emulation. In this case, the substitution character defined by ICU is replaced by the place holder character defined with the NTCPAGE macro. |
CPOBJIN | Specifies the code page in which the batch input file for data is encoded. This file is defined in the data set CMOBJIN. |
CPPRINT | Specifies the code page in which the batch output file is to be encoded. This file is defined in the data set CMPRINT. |
CPSYNIN | Specifies the code page in which the batch input file for commands is encoded. This file is defined in the data set CMSYNIN. |
NTCPAGE macro | In the NATCONFG module, this macro defines a code page and all related information, such as place holder character, locale ID and collation tables. |
OPRB or NTOPRB macro | Sets the ACODE and/or WCODE option to define the user encoding if the Adabas database in use is enabled for UES (universal encoding support). |
PRINT or CP keyword subparameter of NTPRINT macro | Defines the code page for a report. |
SRETAIN | Specifies that all existing sources have to be saved in their original encoding format. See also Customizing Your Environment. |
See also:
Natural in Batch Mode in the Operations documentation.
For valid code pages, see http://www.iana.org/assignments/character-sets.
The parameter CFICU and its subparameters are explained in detail in the Parameter Reference. Some of the subparameters have an impact on performance.
If collation services are used to compare Unicode strings, both strings are first checked to determine whether they are normalized. This check consumes a lot of CPU time. If you are sure that the strings are already normalized, you can switch off the check (COLNORM=OFF).
In Unicode, it is possible to represent the same character as one
code point or as a combination of two or more code points. For example, the
German character "ä" can be represented by
"U+00E4" or by the combination of the code points
"U+0061" and "U+0308".
The conversion from Unicode to, for example, IBM01140 does not combine such sequences. It converts each code point separately and therefore produces an "a" followed by a substitution character, since code point "U+0308" is not represented in the target code page.
With CNVNORM=ON, a normalization is performed right before the actual conversion. The normalization consumes additional CPU time and temporary storage. If you are sure that no combining characters are involved in MOVE statements (except MOVE NORMALIZED), you should set CNVNORM to OFF to increase performance. Note that not every possible combination is represented by a single Unicode code point, so normalization cannot remove all combining characters.
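The effect of normalizing before conversion can be reproduced outside Natural. The following is a minimal Python sketch, not Natural or ICU code. It assumes that Python's `cp1140` codec is an acceptable stand-in for IBM01140, that NFC composition corresponds to the normalization step described above, and that Python's "?" replacement plays the role of the ICU substitution character. The function name `to_code_page` is hypothetical.

```python
import unicodedata

# "a" followed by U+0308 COMBINING DIAERESIS (decomposed form of U+00E4)
decomposed = "a\u0308"

def to_code_page(text, normalize):
    """Encode text to cp1140, optionally composing it (NFC) first.

    Hypothetical helper: 'normalize' mimics the idea behind CNVNORM=ON.
    """
    if normalize:
        text = unicodedata.normalize("NFC", text)
    # errors="replace" substitutes "?" for code points that the
    # target code page cannot represent.
    return text.encode("cp1140", errors="replace")

without_norm = to_code_page(decomposed, normalize=False)
with_norm = to_code_page(decomposed, normalize=True)

print(without_norm.decode("cp1140"))  # "a" plus a replacement character
print(with_norm.decode("cp1140"))     # single precomposed character
```

Without normalization the result is two bytes ("a" and a replacement); with normalization it is the single byte for the precomposed character.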
Conversion from Unicode to code page and vice versa through ICU is comparatively slow. The reason is that the ICU implementation is written in C++ and that it covers nearly all Unicode, code page and language aspects in the world. However, some code pages can be mapped to Unicode (and vice versa) via translation tables to accelerate conversion. Accelerator tables are activated with the CPOPT subparameter. If it is set to ON, Natural automatically creates two accelerator tables during session initialization by using ICU conversion functions. The first table (with a size of 512 bytes) is used for conversion from code page to Unicode, and the second table (with a size of 65535 bytes) is used for conversion from Unicode to code page. During a Natural session, all conversions are then executed via the accelerator tables instead of ICU calls. Accelerator tables are only provided for the default code page (*CODEPAGE).
Temporary code pages (for example, in MOVE ENCODED statements) do not use accelerator tables unless the module NATCPTAB is linked. If it is linked, up to 30 accelerator tables based on the ICU database are used to speed up conversion.
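The accelerator-table idea described above (build two lookup tables once, then convert by indexing) can be sketched as follows. This is an illustration in Python, not the Natural implementation: `cp1140` stands in for IBM01140, Python's codec stands in for the ICU conversion functions, the substitution byte `0x3F` is taken from the EBCDIC example in this document, and the reverse table is sized 0x10000 entries purely for simple indexing. All function names are hypothetical.

```python
CODE_PAGE = "cp1140"   # assumption: Python stand-in for IBM01140
SUBSTITUTION = 0x3F    # EBCDIC substitution byte used for unmappable input

def build_tables(code_page):
    """Build both lookup tables once, using the (slow) codec."""
    # Code page -> Unicode: one character per possible byte value.
    to_unicode = bytes(range(256)).decode(code_page, errors="replace")
    # Unicode -> code page: one byte per BMP code point, prefilled
    # with the substitution byte.
    from_unicode = bytearray([SUBSTITUTION]) * 0x10000
    for byte_value, char in enumerate(to_unicode):
        if char != "\ufffd":  # skip byte values the code page leaves undefined
            from_unicode[ord(char)] = byte_value
    return to_unicode, bytes(from_unicode)

def decode_fast(data, to_unicode):
    """Convert code page bytes to text by table lookup only."""
    return "".join(to_unicode[b] for b in data)

def encode_fast(text, from_unicode):
    """Convert text to code page bytes by table lookup only."""
    return bytes(
        from_unicode[ord(c)] if ord(c) < 0x10000 else SUBSTITUTION
        for c in text
    )

TO_UNICODE, FROM_UNICODE = build_tables(CODE_PAGE)
```

After the one-time table build, `encode_fast` and `decode_fast` no longer call the codec, which is the trade-off the text describes: extra storage per code page in exchange for cheaper conversions.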
The parameters CFICU and CP can be used to adjust Natural to specific purposes:
Settings | Description |
---|---|
CFICU=OFF, CP=OFF | Compatibility mode. For running existing applications without Unicode and without code page support. Legacy translation tables are used for I/O translation. Compared with former versions, there is no significant increase in resource consumption (CPU time and buffer usage). This mode does not need the ICS module SAGICU (or an alternative ICS module) to be linked to the Natural nucleus. |
CFICU=ON, CP=OFF | For new applications that are using Unicode and code page conversion (MOVE ENCODED) but not default code page support. Therefore, the system variable *CODEPAGE is empty. It is possible to use U format variables, but it is not possible to use, for example, MOVE A TO U, since this requires the default code page information. The error NAT3411 will be issued indicating that no default code page is available. |
CFICU=ON, CP=value * | For new applications that are using full Unicode as well as code page support. |
CFICU=OFF, CP=value * | This combination does not make sense, because code page support needs ICU services for conversion. Therefore, CFICU=ON is enforced in this case and a session initialization message is issued. |
* where value is any value other than OFF.
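The four combinations in the table above amount to a small decision rule. The following Python sketch only restates that table. The function name `effective_settings` and the returned description strings are hypothetical and do not correspond to any Natural API or message text.

```python
def effective_settings(cficu, cp):
    """Return (effective CFICU, effective CP, description).

    cficu is "ON" or "OFF"; cp is "OFF" or a code page name.
    Hypothetical helper that mirrors the settings table only.
    """
    if cficu == "OFF" and cp != "OFF":
        # Code page support needs ICU services, so CFICU=ON is enforced.
        return "ON", cp, "CFICU=ON enforced, initialization message issued"
    if cficu == "OFF":
        return "OFF", "OFF", "compatibility mode"
    if cp == "OFF":
        return "ON", "OFF", "Unicode without default code page"
    return "ON", cp, "full Unicode and code page support"

for combination in [("OFF", "OFF"), ("ON", "OFF"),
                    ("ON", "IBM01140"), ("OFF", "IBM01140")]:
    print(combination, "->", effective_settings(*combination))
```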
The compiler option CPAGE creates objects that can be executed with a code page different from the one used at creation time. This means that all alphanumeric constants of the object, which are coded in the code page used at creation time, have to be converted to the code page that is active at execution time. To enable the Natural object loader to find and convert alphanumeric constants, the compiler creates an additional table. This increases the size of the generated object, depending on the number of alphanumeric constants used. The conversion at runtime consumes additional CPU time. If the default code page (value of the system variable *CODEPAGE) is the same as the code page at creation time, or if the session has no default code page (CP=OFF), no conversion is done. Conversion errors are ignored, regardless of the setting of the parameter CPCVERR. If the compiler option CPAGE is set to OFF, no conversion is performed at runtime and the alphanumeric constants are used as they are.
The following sample program is cataloged with code page IBM01141 (German) and is executed with default code page IBM01140 (US). The characters "Ä", "Ö" and "Ü" are defined in both code pages, but at different code points.
Example 1 - CPAGE=OFF:
OPTIONS CPAGE=OFF
WRITE *CODEPAGE 'ÄÖÜ'
END
Output with code page IBM01140 (US):
Page 1
IBM01140 ¢\!
Example 2 - CPAGE=ON:
OPTIONS CPAGE=ON
WRITE *CODEPAGE 'ÄÖÜ'
END
Output with code page IBM01140 (US):
Page 1
IBM01140 ÄÖÜ
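The difference between the two outputs can be reproduced with plain byte conversions. This Python sketch is an illustration only. It assumes that Python's `cp273` codec (German EBCDIC without the euro sign) is an adequate stand-in for IBM01141 for these three characters, and that `cp1140` stands in for IBM01140.

```python
CATALOG_CODE_PAGE = "cp273"    # assumption: stand-in for IBM01141 (German)
RUNTIME_CODE_PAGE = "cp1140"   # assumption: stand-in for IBM01140 (US)

# Bytes of the alphanumeric constant as stored at catalog time.
constant_bytes = "ÄÖÜ".encode(CATALOG_CODE_PAGE)

# CPAGE=OFF: the bytes are used as they are and interpreted
# in the code page that is active at execution time.
as_is = constant_bytes.decode(RUNTIME_CODE_PAGE)

# CPAGE=ON: the constant is converted from the catalog-time code page
# to the execution-time code page before it is used.
converted_bytes = (constant_bytes
                   .decode(CATALOG_CODE_PAGE)
                   .encode(RUNTIME_CODE_PAGE))
converted = converted_bytes.decode(RUNTIME_CODE_PAGE)

print(as_is)
print(converted)
```

The unconverted bytes show up as different characters because the same byte values are assigned to other characters in the execution-time code page, while the converted constant keeps its meaning.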
The most common standard for code page names is the IANA name. Therefore, the system variable *CODEPAGE contains the IANA name of the default code page. A code page is qualified by its Coded Character Set ID (CCSID). Currently, Adabas uses the Entire Conversion Service definition (ADAECS). The macro NTCPAGE can be used to assign these different names to the unambiguous IANA name. NTCPAGE is part of the Natural configuration module (NATCONFG).
It does not matter whether the IANA name, the CCSID/CCSN or the alias name is entered with the CP parameter. The alias name can be a user-defined name which is used to assign a more meaningful name to the code page. In any case, *CODEPAGE contains the IANA name of the selected code page.
In addition, a place holder character can be defined for a code page. It overrides the default substitution character of that code page, which is normally a non-displayable character (for example, H'3F' in an EBCDIC code page). The place holder character can be used to prevent non-displayable characters from being sent to terminals.
Example:
NTCPAGE IANA=IBM01140,CCSID=1140,ECS=1140,ALIAS='US',PHC=003F
The values IBM01140, 1140 or US can be entered with the CP parameter to activate the code page. *CODEPAGE contains the name IBM01140. The substitution character of the code page will be replaced by "U+003F", which is a question mark (?).
The number of available code pages depends on the ICU data library in use.
All code pages defined in the currently used data package can be used by Natural. An NTCPAGE entry is only necessary if an alternative alias name or place holder character is desired.
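The name resolution described for NTCPAGE (IANA name, CCSID or alias all select the same code page, and *CODEPAGE always reports the IANA name) can be pictured as a simple lookup. The Python sketch below is hypothetical throughout: the class `CodePageEntry`, the list `ENTRIES` and the function `resolve` are illustrative names, and only the field values are taken from the NTCPAGE example above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CodePageEntry:
    """Hypothetical in-memory picture of one NTCPAGE entry."""
    iana: str
    ccsid: str
    ecs: str
    alias: str
    placeholder: str  # character given by the PHC value

# Values from: NTCPAGE IANA=IBM01140,CCSID=1140,ECS=1140,ALIAS='US',PHC=003F
ENTRIES = [
    CodePageEntry(iana="IBM01140", ccsid="1140", ecs="1140",
                  alias="US", placeholder="\u003f"),
]

def resolve(cp_value: str) -> Optional[CodePageEntry]:
    """Find the entry selected by a CP value (IANA name, CCSID or alias)."""
    for entry in ENTRIES:
        if cp_value in (entry.iana, entry.ccsid, entry.alias):
            return entry
    return None

for value in ("IBM01140", "1140", "US"):
    entry = resolve(value)
    # Whatever was entered, the reported name is the IANA name.
    print(value, "->", entry.iana, "placeholder:", entry.placeholder)
```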
The following configuration parameter is available with Natural Development Server (NDV):
Settings | Description |
---|---|
TERMINAL_EMULATION=WEBIO | Specifies that the Natural Web I/O Interface client (which supports Unicode) is used for input and output. |
The code page information of the object is part of the object directory displayed with the LIST system command. For details, see Displaying Directory Information in the System Commands documentation.
The encoding of code page data can be specified on different levels.
The default code page can be defined with the CP parameter.
A code page can be defined for Natural sources, batch input files (CPOBJIN, CPSYNIN) and batch output files (CPPRINT).
If a code page is defined at object level, it overrides the default code page.