This document covers the following topics:
Substitution Characters Used When a Character Cannot Be Converted
Receiving a Conversion Error When Cataloging a Source Which Has UTF-8 Format
Receiving Junk on Linux When Displaying U Format by a Terminal Emulation
The code page you have defined with the profile parameter CP either does not exist (see http://demo.icu-project.org/icu-bin/convexp for valid ICU code pages and http://www.iana.org/assignments/character-sets for the appropriate IANA names) or is an invalid default code page for the platform (for example, an EBCDIC code page cannot be used on a Windows or Linux platform).
The default code page is the code page that results from evaluating the profile parameter CP. If CP is not set, it is the current operating system code page.
On the platforms supported by Natural for Linux, you should always define the CP parameter, because the ICU default may be defined differently for different Linux platforms, and this definition may also change for a specific platform with newer ICU versions.
The default code page that Natural uses for conversions between a code page and Unicode (in either direction) can be determined by displaying the content of the system variable *CODEPAGE.
Whether you should save all Natural sources in UTF-8 format depends on the characters you want to use and on the platforms on which your sources are located. If you want to use Unicode constants, UTF-8 is the only format that can store all combinations of characters. However, you can define hexadecimal UH constants, which can also be stored in code page sources. The disadvantage of hexadecimal constants is that you have to know the UTF-16 encoding of every character in the constant. On mainframes, UTF-8 format for sources is not possible at all. On Linux, UTF-8 sources can only be handled via SPoD; they cannot be handled locally on Linux.
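Looking up the UTF-16 encoding of each character by hand is error-prone. The following Python sketch is not part of Natural; it only illustrates how the hexadecimal UTF-16 code units of a string could be computed outside Natural before typing them into a UH constant. The helper name utf16_hex is hypothetical, and the exact notation of a UH constant should be taken from the Natural documentation.

```python
def utf16_hex(text: str) -> str:
    """Return the UTF-16 (big-endian) code units of text as uppercase hex digits."""
    return text.encode("utf-16-be").hex().upper()


print(utf16_hex("A"))           # 0041
print(utf16_hex("\u20ac"))      # 20AC (euro sign)
print(utf16_hex("\U0001F600"))  # D83DDE00 (a surrogate pair)
```

Characters outside the Basic Multilingual Plane occupy two UTF-16 code units, which is why the last example yields eight hexadecimal digits.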
Use the MOVE ENCODED statement for conversion from UTF-8 to UTF-16: the code page "UTF-8" has to be specified for the A format variable.
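As a language-neutral illustration of what such a conversion does, the following Python sketch decodes UTF-8 bytes and re-encodes them as UTF-16. It is an analogy only, not Natural code; the function name utf8_to_utf16 is hypothetical, and the byte order (big-endian) is an assumption made for the example.

```python
def utf8_to_utf16(raw: bytes) -> bytes:
    """Decode UTF-8 bytes and re-encode them as UTF-16 (big-endian).

    Raises UnicodeDecodeError if raw is not valid UTF-8, which is the
    analogue of a conversion error.
    """
    text = raw.decode("utf-8")
    return text.encode("utf-16-be")


utf8_bytes = "Grüße".encode("utf-8")
utf16_bytes = utf8_to_utf16(utf8_bytes)
print(len(utf8_bytes), len(utf16_bytes))  # 7 10
```

The same five characters need 7 bytes in UTF-8 and 10 bytes in UTF-16, so the target field must be sized for the UTF-16 form.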
Check if you are using the correct code page. If the code page is correct, check if the selected font supports the characters you want to display.
The code page which is defined for the source is not correct. When converting the contents of the source to Unicode, a conversion error occurs. Change the encoding of the source so that the conversion to Unicode is successful.
You have entered characters in the source which cannot be converted to the code page which was used to read the source. Check if you have entered these characters by mistake or if you really want to save the characters in the source. In the first case, remove the faulty characters and save the source. In the second case, save the source in UTF-8 format or, if the characters are contained in U constants, use UH constants instead.
If you have not entered any characters which are not contained in the code page of the source, check whether the profile parameter SRETAIN has been set to OFF. In this case, the source will be saved with the default code page. If the source in question was previously saved with a different code page, a conversion error may occur.
To find out the encoding of a Natural source, in Natural Studio, invoke the Properties dialog box for the source node. The General page shows the encoding of the source. If the Encoding text box is empty, no specific encoding is stored for the source. This means that the default encoding is used when reading the source.
The list view windows of Natural Studio also show the encodings of all listed objects.
In Natural Studio, invoke the Properties dialog box for the source node. The General page shows the encoding of the source. If this is not the correct encoding, you can change it by choosing the button: a list of available code pages is shown and you can select the correct encoding for the source.
Open the source in the Natural editor with the correct code page. Save the source with Save As and, in the Save As dialog box, select UTF-8 as the encoding.
Which substitution character is used if a character cannot be converted depends on the direction of the conversion: if a code page character cannot be converted to Unicode, the Unicode substitution character "U+FFFD" is used. If a Unicode character cannot be converted to a code page, the substitution character which is defined by ICU for this code page is used.
For the conversion from Unicode to the default code page, the substitution character can be changed by setting the profile parameter SUBCHAR.
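The two directions can be illustrated with Python's built-in codecs. This is only an analogy: Python's "replace" error handler substitutes "?" when encoding, whereas ICU defines the substitution character per code page, so the encoding result below does not show what Natural would produce.

```python
# Code page -> Unicode: the byte 0xE9 is not valid ASCII, so it is
# replaced with the Unicode substitution character U+FFFD.
decoded = b"A\xe9B".decode("ascii", errors="replace")
print(decoded == "A\ufffdB")  # True

# Unicode -> code page: the euro sign does not exist in ISO 8859-1, so
# the codec inserts its own substitution character (here "?").
encoded = "A\u20acB".encode("latin-1", errors="replace")
print(encoded)  # b'A?B'
```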
You cannot use UTF-8 sources with previous Natural versions. Previous Natural versions do not recognize any code page information; they interpret a UTF-8 source as if it were encoded in the current system code page.
A Natural source with UTF-8 format cannot be cataloged because a code point cannot be converted.
All A constants in a source with UTF-8 format are converted to the default code page when storing them in the generated program. Either remove the characters which are not contained in the default code page from the A constants or use U constants instead of A constants.
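To locate the offending characters, it can help to check a constant's text against the target code page outside Natural. The following Python sketch assumes that the default code page has an equivalent Python codec (here "latin-1" is used purely as an example); the helper name unconvertible is hypothetical.

```python
from typing import List, Tuple


def unconvertible(text: str, code_page: str) -> List[Tuple[int, str]]:
    """Return (position, code point) for each character not in code_page."""
    bad = []
    for index, char in enumerate(text):
        try:
            char.encode(code_page)
        except UnicodeEncodeError:
            bad.append((index, f"U+{ord(char):04X}"))
    return bad


print(unconvertible("Preis: 5 € für Ωmega", "latin-1"))
# [(9, 'U+20AC'), (15, 'U+03A9')]
```

Each reported character would have to be removed from the A constant, or the constant would have to become a U constant.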
All characters which are not contained in the default code page will be replaced with the substitution character of the code page before the output is displayed on a terminal emulation. For an ASCII code page, the substitution character defined by the ICU conversion table is often "0x1A", which may be interpreted as a control character by Linux terminals. It is strongly recommended to use the Natural Web I/O Interface when using U format in I/O statements. If using a terminal emulation is essential, the substitution character (SUBCHAR) can be changed to a printable character (for example, "?").
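The effect of choosing a different substitution character can be sketched in Python. This is an illustration of the idea only, not of how Natural or ICU implement it; the function encode_with_subchar and its subchar argument are hypothetical.

```python
def encode_with_subchar(text: str, code_page: str, subchar: bytes = b"?") -> bytes:
    """Encode text to code_page, writing subchar for unconvertible characters."""
    out = bytearray()
    for char in text:
        try:
            out += char.encode(code_page)
        except UnicodeEncodeError:
            out += subchar
    return bytes(out)


# With 0x1A, the output contains a control character that a terminal may act on.
print(encode_with_subchar("10 \u20ac", "ascii", b"\x1a"))  # b'10 \x1a'

# With a printable substitution character, the output stays displayable.
print(encode_with_subchar("10 \u20ac", "ascii"))  # b'10 ?'
```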
You can work with a current SPoD client and an older SPoD server, but you should set the code page of the SPoD client to the code page of the server sources.
See also Prerequisites for Natural Single Point of Development at http://documentation.softwareag.com/natural/spod_prereq/prereq.htm.
You can work with a current SPoD server and an older SPoD client, but this is not recommended if you have defined encodings for sources.
See also Prerequisites for Natural Single Point of Development at http://documentation.softwareag.com/natural/spod_prereq/prereq.htm.