It is assumed that you have read the document Introduction to Internationalization and are familiar with the various internationalization approaches described there.
This document provides information to help you decide which internationalization approach is the most appropriate.
See also Configuring Broker for Internationalization under z/OS | UNIX | Windows | BS2000/OSD | z/VSE.
This table gives an overview of the internationalization approaches that can be used. The approach you choose depends on:
ACI or RPC payload
the type of codepage used by participants (client and server): single-byte or complex codepage configuration (1), for example multibyte, double-byte, EBCDIC stateful codepages, Arabic shaping etc.
Internationalization Approach | Using Locale Strings | Single-byte codepages only: ACI | Single-byte codepages only: RPC (2) | Complex codepage configuration (1): ACI | Complex codepage configuration (1): RPC (2) | Usage Hint |
---|---|---|---|---|---|---|
ICU Conversion | yes (3)(5) | yes | yes | yes | yes | ICU conversion is recommended. In the Broker attribute file, set the service-specific or topic-specific broker attribute CONVERSION:
We recommend always using SAGTRPC for RPC data streams. Conversion with Multibyte, Double-byte and other Complex Codepages will always be correct, and Conversion with Single-byte Codepages is also efficient because SAGTRPC detects single-byte codepages automatically. See Conversion Details. See also Configuring ICU Conversion under z/OS | UNIX | Windows | BS2000/OSD | z/VSE. |
Translation | no | yes | yes | no | no | Translation is not recommended for the following reasons:
Consider instead using ICU conversion, see first row in this table. |
Translation User Exit | no | yes | yes | yes | no | Translation User Exit is not recommended. If you only wish to adapt code points, it is too much effort. We recommend you use ICU conversion instead. See Translation User Exit Replacement with ICU Conversion. |
SAGTRPC User Exit | optional (4)(5) | no | yes | no | yes | Requires considerable effort for implementation. See Conversion Details. Consider using ICU conversion instead; see the first row in this table. Not available under z/VSE. |
Notes:
must follow the rules described under Locale String Mapping
must be a codepage supported by the broker
must be the codepage used in your environment, otherwise unpredictable results may occur.
set the Codepage-specific Attributes under Broker Attributes to meet your requirements, or
configure the participant (client or server). See Preparing EntireX Components for Internationalization.
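The table above can also be read as a lookup. The following Python sketch transcribes the yes/no cells of the four rows and returns the approaches the table marks as usable for a given payload type and codepage situation. The names `SUPPORT`, `usable_approaches` and `conversion_attribute` are hypothetical and exist only for this illustration; they are not part of any EntireX API.

```python
# Support matrix transcribed from the overview table.
# Key of the inner dict: (payload, complex_codepage_configuration).
SUPPORT = {
    "ICU Conversion": {
        ("ACI", False): True, ("RPC", False): True,
        ("ACI", True): True, ("RPC", True): True,
    },
    "Translation": {
        ("ACI", False): True, ("RPC", False): True,
        ("ACI", True): False, ("RPC", True): False,
    },
    "Translation User Exit": {
        ("ACI", False): True, ("RPC", False): True,
        ("ACI", True): True, ("RPC", True): False,
    },
    "SAGTRPC User Exit": {
        ("ACI", False): False, ("RPC", False): True,
        ("ACI", True): False, ("RPC", True): True,
    },
}


def usable_approaches(payload, complex_codepage):
    """Return the approaches the table marks 'yes' for this combination."""
    key = (payload.upper(), bool(complex_codepage))
    return [name for name, cells in SUPPORT.items() if cells[key]]


def conversion_attribute(payload):
    """CONVERSION value used with ICU conversion: SAGTCHA for ACI, SAGTRPC for RPC."""
    return {"ACI": "SAGTCHA", "RPC": "SAGTRPC"}[payload.upper()]


print(usable_approaches("RPC", True))
print(conversion_attribute("ACI"))
```

The lookup only reflects what is technically possible according to the table. The usage hints still apply: ICU conversion is the recommended choice in every case.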
This table gives an overview of the conversion effort if both participants (client and server) of a communication use single-byte codepages only. It is valid for ICU conversion. For RPC, SAGTRPC detects single-byte codepages automatically and converts them efficiently in one step (a single ICU call) from source to target encoding, the same way SAGTCHA does for ACI. The same applies if you have implemented your own internationalization approach with Translation User Exit.
The effort does not depend on ACI or RPC payload - there is no difference. If one participant (client or server) uses a complex codepage configuration, the information given here does not apply; see Conversion with Multibyte, Double-byte and other Complex Codepages instead.
To find out if a codepage is single-byte, see ICU Resources.
Codepage Configuration | ACI (1) | RPC (2)(3) |
---|---|---|
Single-byte codepages | Conversion is fast and efficient in one step. | Conversion is fast and efficient in one step. |
Notes:
(1) CONVERSION is set to CONVERSION=SAGTCHA.
(2) CONVERSION is set to CONVERSION=SAGTRPC.
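As a rough illustration of why single-byte conversion is cheap, the following Python sketch converts a whole payload from one single-byte codepage to another in a single decode/encode step. Python's built-in codecs stand in for ICU here, and `convert_single_byte` is a hypothetical helper, not the way the broker is implemented. The codepages `cp037` (EBCDIC) and `latin-1` are chosen only as examples.

```python
def convert_single_byte(payload: bytes, source: str, target: str) -> bytes:
    """Convert a complete payload between two single-byte codepages in one step."""
    return payload.decode(source).encode(target)


# An EBCDIC (cp037) payload converted to Latin-1 as one unit.
ebcdic_payload = "HELLO WORLD 123".encode("cp037")
latin1_payload = convert_single_byte(ebcdic_payload, "cp037", "latin-1")

# With single-byte codepages every character stays one byte long,
# so the payload length and all field offsets are preserved.
print(len(ebcdic_payload), len(latin1_payload))
```

Because the length never changes, no knowledge of the payload structure is needed, which is why ACI and RPC payloads behave the same in this case.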
This table gives an overview of the conversion effort if one participant (client or server) of a communication uses a multibyte, double-byte or other complex codepage configuration (see the table), including Arabic shaping. It applies to ICU conversion. For RPC, SAGTRPC detects complex codepage configurations automatically and converts them as described (see column RPC) from source to target encoding. If you have implemented your own internationalization approach with
Translation User Exit for ACI, consider the rules in column ACI
SAGTRPC User Exit for RPC, consider the rules in column RPC
depending on codepage type.
If two participants (client and server) of a communication use single-byte codepages only, see Conversion with Single-byte Codepages. With a complex codepage configuration, the effort depends on:
ACI or RPC payload
the type of codepage used: multibyte, double-byte or EBCDIC stateful, etc.
whether Arabic shaping is required
To find out if a codepage is multibyte, double-byte or EBCDIC stateful, see ICU Resources.
Codepage Configuration | ACI (1) | RPC (2)(3) |
---|---|---|
Multibyte or double-byte codepages | There is no additional effort compared to Conversion with Single-byte Codepages. Conversion is performed in one step, the same as with single-byte codepages. Please note the payload may change its length in bytes during conversion. | If at least one participant (client or server) uses a multibyte or double-byte codepage with RPC,
each IDL parameter (see simple-parameter-definition ) must be converted separately.
The data in IDL types A, AV, K and KV and the RPC metadata may increase or decrease in length after conversion from the sender's source codepage
to the receiver's target codepage. The following must be honored:
All other IDL data types are converted as with single-byte codepages. |
EBCDIC stateful codepages, encoded with escape technique (SI/SO bytes) | There is no additional effort compared to Conversion with Single-byte Codepages. Conversion is performed in one step, the same as with single-byte codepages. Please note the payload may change its length in bytes during conversion. Unlike with RPC, there is no special handling for SI/SO bytes. | If at least one participant (client or server) uses an EBCDIC stateful codepage with RPC,
each IDL parameter (see simple-parameter-definition ) must be converted separately.
Also, the IDL types K and KV allow you to transfer double-byte data without SO and SI escape characters.
This feature is designed for use in Asian countries. The disadvantage is that IDL fields must be converted field-by-field.
To convert the fields correctly, RPC programmers have to consider the following rules, otherwise unpredictable results may occur:
All other IDL data types are converted as with single-byte codepages. |
Hebrew CP803 (4) | There is no additional effort compared to Conversion with Single-byte Codepages.
Conversion is performed in one step, the same as with single-byte codepages. Latin lowercase characters cannot be used and
lead to conversion errors. See OPTION Values for Conversion to tune error behavior to meet your requirements.
|
If at least one participant (client or server) uses the Hebrew codepage CP803, each IDL parameter
(see simple-parameter-definition ) must be converted separately,
because CP803 does not include Latin lowercase characters (3).
Please note the following:
|
Arabic shaping (5) | There is additional effort compared to Conversion with Single-byte Codepages. The conversion itself is performed in one step, the same as with single-byte codepages. Shaping is performed on the complete ACI payload. | If Arabic shaping is required, each IDL parameter (see simple-parameter-definition) must be converted separately. Shaping is performed on IDL data types A, AV, K and KV. All other IDL data types are converted
as with single-byte codepages.
|
Notes:
(1) CONVERSION is set to CONVERSION=SAGTCHA.
(2) CONVERSION is set to CONVERSION=SAGTRPC.
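The RPC column repeatedly states that each IDL parameter must be converted separately because data may change its length. The following Python sketch illustrates the reason with a hypothetical record of two fixed-length, blank-padded fields. Python's built-in codecs stand in for ICU, and the fixed-length layout, the blank padding and the helpers `convert_field` and `convert_record` are assumptions made for this illustration; this is not how SAGTRPC is implemented.

```python
def convert_field(data: bytes, source: str, target: str, target_len: int) -> bytes:
    """Convert one fixed-length, blank-padded field and re-pad it to target_len bytes."""
    text = data.decode(source).rstrip(" ")
    converted = text.encode(target)
    if len(converted) > target_len:
        raise ValueError(
            f"field needs {len(converted)} bytes, only {target_len} available"
        )
    pad = " ".encode(target)  # assumption: blank is one byte in the target codepage
    return converted + pad * (target_len - len(converted))


def convert_record(fields, source, target):
    """Convert each (data, target_len) pair on its own, never the record as a whole."""
    return [convert_field(data, source, target, length) for data, length in fields]


name = "Grüße".encode("latin-1").ljust(8, b" ")  # 5 bytes of text plus 3 blanks
code = b"AB12"

# Field-by-field conversion keeps each field at its defined length.
record = convert_record([(name, 8), (code, 4)], "latin-1", "utf-8")

# Converting the whole buffer at once grows it, which would shift
# the offset of every field that follows the first one.
whole = (name + code).decode("latin-1").encode("utf-8")
print([len(f) for f in record], len(name + code), len(whole))
```

In this sketch the text needs 5 bytes in Latin-1 but 7 bytes in UTF-8, so the field only fits because its trailing blanks leave room. A field without such slack raises an error instead of being silently truncated, which is one possible policy among several.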
Codepages used to convert RPC data streams must meet several requirements:
Codepages used to convert RPC data streams must have the following code points (characters) defined:
Character | also known as | Rendered | Unicode Code Point |
---|---|---|---|
uppercase letters A-Z without special characters | | A-Z | 0x0041 to 0x005A |
lowercase letters a-z without special characters | | a-z | 0x0061 to 0x007A |
digits | | 0-9 | 0x0030 to 0x0039 |
SPACE | | " " | 0x0020 |
LEFT PARENTHESIS | OPENING PARENTHESIS | "(" | 0x0028 |
RIGHT PARENTHESIS | CLOSING PARENTHESIS | ")" | 0x0029 |
PLUS SIGN | | "+" | 0x002B |
HYPHEN | MINUS | "-" | 0x002D |
SOLIDUS | SLASH | "/" | 0x002F |
COLON | | ":" | 0x003A |
COMMA | | "," | 0x002C |
FULL STOP | PERIOD | "." | 0x002E |
EQUALS SIGN | | "=" | 0x003D |
All code points (characters) listed in the table above must have a unique mapping (without any fallbacks and reverse fallbacks) to/from Unicode, that is, they must be roundtrip-compatible.
If the codepage used is a multibyte or double-byte codepage, the code points (characters) listed in the table above must have a length of 1 byte within the codepage. Therefore UTF-16 encoding cannot be used, but UTF-8 encoding is possible.
Codepages that do not meet the requirements above cannot be used for RPC-based components, because those code points (characters) are used to encode, for example, the IDL library and IDL program, descriptive metadata, and IDL type fields in numeric, integer and binary form.
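The requirements above can be approximated with a small check. The following Python sketch tests whether every required character encodes to exactly one byte and decodes back to itself in a given codec. It uses Python's built-in codecs rather than ICU, so it cannot detect ICU fallback or reverse-fallback mappings and is only a rough pre-check. The names `REQUIRED` and `meets_rpc_requirements` are hypothetical.

```python
import string

# Characters from the table: A-Z, a-z, 0-9, space and ( ) + - / : , . =
REQUIRED = string.ascii_uppercase + string.ascii_lowercase + string.digits + " ()+-/:,.="


def meets_rpc_requirements(codec: str) -> bool:
    """Return True if every required character is defined, one byte long and roundtrips."""
    for ch in REQUIRED:
        try:
            encoded = ch.encode(codec)
            decoded = encoded.decode(codec)
        except UnicodeError:
            return False  # character is not defined in this codepage
        if len(encoded) != 1 or decoded != ch:
            return False
    return True


for name in ("utf-8", "cp037", "utf-16-le"):
    print(name, meets_rpc_requirements(name))
```

UTF-8 and the EBCDIC codepage `cp037` pass this check, while UTF-16 fails because each of the listed characters occupies two bytes, which matches the statement above that UTF-16 cannot be used.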