Software internationalization is the process of designing products and services so that they can be adapted easily to a variety of different local languages and cultures. Internationalization within EntireX means internationalization of messages: the incoming and outgoing messages are converted to the desired codepage of the platform in use. This document explains in detail how to configure the broker for character conversion.
See also Internationalization with EntireX.
To configure ICU conversion
In the Broker attribute file, set the service-specific attribute CONVERSION.
Examples:
ICU Conversion with SAGTCHA for ACI-based Programming:
CONVERSION=(SAGTCHA,OPTION=SUBSTITUTE)
ICU Conversion with SAGTRPC for RPC-based Components and Reliable RPC:
CONVERSION=(SAGTRPC,OPTION=STOP)
Optionally configure a CONVERSION OPTION to tune error behavior to meet your requirements; see OPTION Values for Conversion.
In the Broker-specific attributes, check that ICU conversion is possible, that is, the attribute ICU-CONVERSION is either not defined (its default is YES) or set to YES.
To configure locale string defaults (optional)
If the broker's locale string defaults do not match your requirements (see Broker's Locale String Defaults), we recommend you assign suitable locale string defaults for your country and region. See the respective attribute in Codepage-specific Attributes for how to customize the broker's locale string defaults.
To customize mapping of locale strings (optional)
If the built-in locale string mapping mechanism does not match your requirements, you can assign specific codepages to locale strings. See Broker's Built-in Locale String Mapping and locale-string for information on customizing the mapping of locale strings to codepages.
User-written ICU custom converters can be used for ACI-based Programming, RPC-based Components, and Reliable RPC.
ICU uses algorithmic conversion, non-algorithmic conversion and combinations of both. See ICU Conversion. Non-algorithmic converters defined by the UCM format are the easiest way to define user-written ICU converters. See UCM Format.
To write a (non-algorithmic) user-written ICU converter
Define the ICU converter file in UCM format using a text editor to meet your requirements.
Note:
For further explanation of the UCM file format, see ICU Resources.
Writing algorithmic and partially algorithmic converters can be complex. However, they can be installed into EntireX in the same way as the table-driven, non-algorithmic ones. A description of how to write algorithmic and partially algorithmic converters is beyond the scope of this documentation; please see the ICU documentation and other sources specified under ICU Resources.
To compile the user-written ICU converter
Compile the converter source files (extension ".ucm") into binary converter files (extension ".cnv") using the ICU tool makeconv. Example:
makeconv -v myebcdic.ucm
Note:
EntireX delivers the ICU tool makeconv in the EntireX bin directory.
This produces a binary converter file named myebcdic.cnv.
Caution:
The binary format "cnv" depends on the endianness (big-endian/little-endian) and character set family (ASCII/EBCDIC) of the computer where it is produced. For example, a binary converter file produced on a big-endian machine cannot be used on a little-endian machine (and vice versa), and a file produced on a machine with character set family EBCDIC cannot be used on a machine with character set family ASCII (and vice versa). We highly recommend compiling the converter source files on the target platform where the broker runs; otherwise unpredictable results may occur.
To install the user-written ICU converter
Define the broker attribute ICU-DATA-DIRECTORY. See Broker-specific Attributes.
Example:
ICU-DATA-DIRECTORY=".../EntireX/config/etb"
Define the subdirectory icudt<icu-version><endianness> within the ICU-DATA-DIRECTORY, where <icu-version> is the ICU version used, for example 54, and <endianness> is either "b" (big-endian) or "l" (little-endian).
Examples:
.../EntireX/config/etb/icudt54l
.../EntireX/config/etb/icudt54b
Copy the user-written ICU converter binary file (extension ".cnv") to the directory referenced by ICU-DATA-DIRECTORY and its subdirectory defined under steps 1 and 2 above. Examples:
.../EntireX/config/etb/icudt54l/myebcdic.cnv
.../EntireX/config/etb/icudt54l/myascii.cnv
If your application does not send the converter name as the locale string, customize the mapping of locale strings by assigning the user-written ICU converter (codepage) to locale strings in the Broker attribute file. See locale-string for how to customize the mapping of locale strings to codepages. Example:
DEFAULTS=CODEPAGE
/* Customer-written ICU converter */
CP1140=myebcdic
CP0819=myascii
In the Broker-specific attributes, check that ICU conversion is possible, that is, the attribute ICU-CONVERSION is not defined (default=YES) or set to YES.
In the Broker-specific attributes, check that use of ICU custom converters is possible, that is, the attribute ICU-SET-DATA-DIRECTORY is not defined (default=YES) or set to YES.
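The subdirectory naming rule in the installation steps above can be sketched in C, for example in a small deployment check. This is only an illustration under assumptions: the helper names host_is_big_endian and icu_converter_dir are hypothetical and not part of EntireX or ICU, and the broker resolves the directory on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian host. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);      /* look at the lowest-addressed byte */
    return first == 0x01;
}

/* Hypothetical helper: writes "<data_dir>/icudt<icu_version><b|l>" to out.
 * Returns 0 on success, -1 if the result does not fit into out. */
int icu_converter_dir(char *out, size_t out_size, const char *data_dir,
                      int icu_version, int big_endian)
{
    int n;

    assert(out != NULL && data_dir != NULL);
    n = snprintf(out, out_size, "%s/icudt%d%c",
                 data_dir, icu_version, big_endian ? 'b' : 'l');
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling icu_converter_dir with the result of host_is_big_endian() on the broker machine gives the directory in which a converter compiled on that same machine would be placed.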
EntireX Broker provides an interface to enable user-written translation routines in the programming language C. It contains three parameters:
The address of the TRAP control block (TRAP = Translation Routine / Area for Parameters).
The address of a temporary work area. It is aligned to fullword / long integer boundary (divisible by 4). The work area can only be used for temporary needs and is cleared after return.
A fullword (long integer) that contains the length of the work area.
Note:
Names for user-written translation routines starting with "SAG" are reserved for Software AG usage and must not be used, e.g. "SAGTCHA" and "SAGTRPC".
The C structure TR_TRAP covers the layout of the control block.
typedef struct _TR_TRAP                  /*                           I / O */
{
  unsigned long  tr_type;                /* TRAP type: TRAP_TYPE        inp */
#define TR_TYPE 2                        /* TRAP type ETB 121               */
  long           tr_ilen;                /* Input buffer length         inp */
  unsigned char *tr_ibuf;                /* Ptr to input buffer         inp */
  long           tr_olen;                /* Output buffer length        inp */
  unsigned char *tr_obuf;                /* Ptr to output buffer        inp */
  long           tr_dlen;                /* Len of data returned:       out */
                                         /* Minimum of tr_ilen              */
                                         /* and tr_olen                     */
  unsigned long  tr_shost;               /* Senders host                inp */
#define TR_LITTLE_ENDIAN 0               /* little endian                   */
#define TR_BIG_ENDIAN    1               /* big endian                      */
  unsigned long  tr_scode;               /* Senders character set       inp */
#define SEBCIBM ((1L << 5)|(1L << 1))    /* 0x22 EBCDIC (IBM)               */
#define SEBCSNI ((1L << 6)|(1L << 1))    /* 0x42 EBCDIC (SNI)               */
#define SA88591 (1L << 7)                /* 0x80 ASCII                      */
  unsigned long  tr_rhost;               /* Receivers host (see tr_shost)     inp */
  unsigned long  tr_rcode;               /* Receivers char set (see tr_scode) inp */
  unsigned long  tr_bhost;               /* BROKER host (see tr_shost)        inp */
  unsigned long  tr_bcode;               /* BROKER char set (see tr_scode)    inp */
  unsigned long  tr_senva;               /* Senders ENVIRONMENT field set:    inp */
#define OFF 0                            /* ENVIRONMENT field not set       */
#define ON  1                            /* ENVIRONMENT field set           */
  unsigned long  tr_renva;               /* Receivers ENVIRONMENT field set:  inp */
                                         /* see tr_senva                    */
#define S_ENV 32                         /* size of ENVIRONMENT field       */
  char           tr_senv[S_ENV];         /* Senders ENVIRONMENT field   inp */
  char           tr_renv[S_ENV];         /* Receivers ENVIRONMENT field inp */
} TR_TRAP;
The field tr_dlen must be supplied by the user-written translation routine. It tells the Broker the length of the translated message. In our example its value is set to the minimum of the input and output buffer lengths.
All other TRAP fields are supplied by the Broker and must not be modified by the user-written translation routine.
The incoming message is located in a buffer pointed to by tr_ibuf. The length (not to be exceeded) is supplied in tr_ilen. The character set information from the send buffer can be taken from tr_scode.
The outgoing message must be written to the buffer pointed to by tr_obuf. The length of the output buffer is given in the field tr_olen. The character set is specified in tr_rcode. If the addresses given in tr_ibuf and tr_obuf point to the same location, it is not necessary to copy the data from the input buffer to the output buffer.
The environment fields tr_senva and tr_renva are provided to handle site-dependent character set information. For the SEND and/or RECEIVE functions, you can specify data in the ENVIRONMENT field of the Broker ACI control block. This data is translated into the codepage of the platform where EntireX Broker is running (see field tr_bcode) and is made available in the tr_senv or tr_renv field of the TRAP control block. tr_senva or tr_renva is set to ON if environment data is available. Any values given in the API field ENVIRONMENT must correspond to the values handled in the translation routine.
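The buffer handling described above can be sketched as follows. This is a minimal illustration and not the delivered sample: the entry-point name mytrans, its return type and return value, and the trimmed structure TRAP_SKETCH are assumptions, and the byte mapping covers only digits and the space character, whereas a real routine would use complete conversion tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEBCIBM ((1UL << 5) | (1UL << 1))   /* 0x22 EBCDIC (IBM) */
#define SA88591 (1UL << 7)                  /* 0x80 ASCII        */

/* Trimmed stand-in for TR_TRAP with only the fields this sketch uses. */
typedef struct {
    long           tr_ilen;    /* input buffer length        (inp) */
    unsigned char *tr_ibuf;    /* input buffer               (inp) */
    long           tr_olen;    /* output buffer length       (inp) */
    unsigned char *tr_obuf;    /* output buffer              (inp) */
    long           tr_dlen;    /* length of data returned    (out) */
    unsigned long  tr_scode;   /* sender's character set     (inp) */
    unsigned long  tr_rcode;   /* receiver's character set   (inp) */
} TRAP_SKETCH;

/* Toy mapping: digits and space only; every other byte passes through. */
static unsigned char translate_byte(unsigned char c,
                                    unsigned long from, unsigned long to)
{
    if (from == SEBCIBM && to == SA88591) {
        if (c >= 0xF0 && c <= 0xF9) return (unsigned char)(c - 0xF0 + 0x30);
        if (c == 0x40) return 0x20;
    } else if (from == SA88591 && to == SEBCIBM) {
        if (c >= 0x30 && c <= 0x39) return (unsigned char)(c - 0x30 + 0xF0);
        if (c == 0x20) return 0x40;
    }
    return c;
}

/* Hypothetical entry point: control block, work area, work area length. */
int mytrans(TRAP_SKETCH *trap, void *work, long work_len)
{
    long i, n;

    (void)work;                 /* the temporary work area is not needed here */
    (void)work_len;
    assert(trap != NULL);

    /* Never read past the input or write past the output buffer. */
    n = trap->tr_ilen < trap->tr_olen ? trap->tr_ilen : trap->tr_olen;

    if (trap->tr_scode == trap->tr_rcode) {
        /* Same character set: copy only if the buffers are distinct. */
        if (trap->tr_ibuf != trap->tr_obuf)
            memmove(trap->tr_obuf, trap->tr_ibuf, (size_t)n);
    } else {
        for (i = 0; i < n; i++)
            trap->tr_obuf[i] = translate_byte(trap->tr_ibuf[i],
                                              trap->tr_scode, trap->tr_rcode);
    }

    trap->tr_dlen = n;          /* the only field the routine sets */
    return 0;
}
```

The byte-wise loop also works when tr_ibuf and tr_obuf point to the same location, because each output byte depends only on the input byte at the same position.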
To configure translation user exits
As a prerequisite, the user-written translation routine shared library/object must be accessible to the Broker worker threads.
Copy the user-written translation routine shared library/object into the EntireX lib directory.
In the Broker attribute file, set the service-specific attribute TRANSLATION to the name of the user-written translation routine.
Example:
TRANSLATION=libmytrans.s[o|l]
or
Place the user-written translation routine shared library/object in a directory of your choice. Spaces in the path name are not allowed.
In the Broker attribute file, set the service-specific attribute TRANSLATION to the full path name of the user-written translation routine. Example:
TRANSLATION=../mydir/mytrans/libmytrans.s[o|l]
EntireX Broker provides an interface to SAGTRPC user exit routines written in the programming language C. The interface contains three parameters:
The address of the UE (user exit) control block.
The address of a temporary work area. It is aligned to a fullword / long-integer boundary (divisible by 4). The work area can only be used temporarily and is cleared after return.
A fullword (long integer) that contains the length of the work area.
Note:
Names for conversion routines starting with "SAG" are reserved for Software AG usage and must not be used, e.g. "SAGTCHA" and "SAGTRPC".
The C structure UECB shows the layout of the user exit control block.
typedef struct _UECB
{
  unsigned long  eVersion;
#define USRTRPC_VERSION_1 1
  char *         pInputBuffer;
  unsigned long  uInputLen;
  char *         pOutputBuffer;
  unsigned long  uOutputLen;
  unsigned long  uReturnedLen;
  unsigned long  shost;
#define USRTRPC_LITTLE_ENDIAN 0                     /* little endian */
#define USRTRPC_BIG_ENDIAN    1                     /* big endian    */
  unsigned long  scode;
#define USRTRPC_SEBCIBM ((1L << 5)|(1L << 1))       /* 0x22 EBCDIC (IBM) */
#define USRTRPC_SEBCSNI ((1L << 6)|(1L << 1))       /* 0x42 EBCDIC (SNI) */
#define USRTRPC_SA88591 (1L << 7)                   /* 0x80 ASCII        */
  unsigned long  rhost;                             /* see shost */
  unsigned long  rcode;                             /* see scode */
  unsigned long  bhost;                             /* see shost */
  unsigned long  bcode;                             /* see scode */
  unsigned long  uCpSender;
  unsigned long  uCpReceiver;
  unsigned long  uCpBroker;
  char           eFunction;
#define USRTRPC_FCT_CONVERT   'C'
#define USRTRPC_FCT_GETLENGTH 'L'
  char           eDirection;
#define USRTRPC_DIR_SENDER_TO_BROKER   '1'
#define USRTRPC_DIR_SENDER_TO_RECEIVER '2'
#define USRTRPC_DIR_BROKER_TO_RECEIVER '3'
  char           sFormat[2];
#define ERX_USERDATA    "01"  /* UserId, Lib, Pgm, etc. from Header (truncatable) */
#define ERX_METADATA    "02"  /* Header Data (non-truncatable)                    */
#define ERX_FRMTDATA    "03"  /* Format Buffer (non-truncatable)                  */
#define ERX_SB_ELEMENT  "04"  /* String Buffer                                    */
#define ERX_VB_METADATA "05"  /* Value Buffer Array Occurrences, String Length    */
#define ERX_PREVIEW     "99"  /* Previewing FB and VB, etc...                     */
                              /* Convert data lazy. Do not care on                */
                              /* length changes and truncation.                   */
#define ERX_FRMT_A  "A "      /* Data Type A  */
#define ERX_FRMT_AV "AV"      /* Data Type AV */
#define ERX_FRMT_B  "B "      /* Data Type B  */
#define ERX_FRMT_BV "BV"      /* Data Type BV */
#define ERX_FRMT_D  "D "      /* Data Type D  */
#define ERX_FRMT_F4 "F4"      /* Data Type F4 */
#define ERX_FRMT_F8 "F8"      /* Data Type F8 */
#define ERX_FRMT_I1 "I1"      /* Data Type I1 */
#define ERX_FRMT_I2 "I2"      /* Data Type I2 */
#define ERX_FRMT_I4 "I4"      /* Data Type I4 */
#define ERX_FRMT_K  "K "      /* Data Type K  */
#define ERX_FRMT_KV "KV"      /* Data Type KV */
#define ERX_FRMT_L  "L "      /* Data Type L  */
#define ERX_FRMT_N  "N "      /* Data Type N  */
#define ERX_FRMT_P  "P "      /* Data Type P  */
#define ERX_FRMT_T  "T "      /* Data Type T  */
#define ERX_FRMT_U  "U "      /* Data Type U  */
#define ERX_FRMT_UV "UV"      /* Data Type UV */
  char           szErrorText[40];
} UECB;
The file usrtrpc.c is an example of the SAGTRPC user exit. It is delivered in the Broker user exit directory. See Directories as Used in EntireX.
The user exit provides two separate functions, Convert and GetLength. The field eFunction indicates the function to execute.
Both functions can return an error code in the range 1 to 9999 to SAGTRPC in register 15, together with an error text in the field szErrorText. A value of 0 returned in register 15 indicates success.
Error 9999 is reserved for output buffer overflow. See Convert Function.
When an error occurs, the conversion of the message will be aborted and the error text will be sent to the receiver (client or server). The error is prefixed with the error class 1011. See Message Class 1011 - User-definable SAGTRPC Conversion Exit.
Example:
The user exit returns 1 in register 15 and the message "Invalid Function" in szErrorText. The receiver gets the error message 10110001 Invalid Function.
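How the message in this example is composed can be illustrated with a small helper. The composition is performed by the Broker, not by the user exit; the function name format_exit_error is hypothetical, and the zero-padded four-digit layout of the code is inferred from the single example above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EXIT_ERROR_CLASS 1011   /* error class named in the text above   */
#define ERROR_TEXT_SIZE  40     /* size of szErrorText in the UECB       */

/* Hypothetical helper: writes "<class><4-digit code> <text>" to out.
 * Returns 0 on success, -1 on an invalid code, a text that would not
 * fit into szErrorText, or an output buffer that is too small. */
int format_exit_error(char *out, size_t out_size, int code, const char *text)
{
    int n;

    assert(out != NULL && text != NULL);
    if (code < 1 || code > 9999)
        return -1;
    if (strlen(text) >= ERROR_TEXT_SIZE)
        return -1;

    n = snprintf(out, out_size, "%04d%04d %s", EXIT_ERROR_CLASS, code, text);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With code 1 and the text "Invalid Function", the helper produces the string 10110001 Invalid Function, matching the example.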
This function has to be executed when the contents of eFunction match the definition USRTRPC_FCT_CONVERT.
uReturnedLen must be supplied by SAGTRPC's user-written conversion exit. Its value must be set to the length of the output buffer.
All other interface fields are supplied by the Broker and must not be modified by SAGTRPC's user-written conversion exit.
The incoming data is located in a buffer pointed to by pInputBuffer. uInputLen defines the length.
The outgoing converted message must be written to the buffer pointed to by pOutputBuffer. The field uOutputLen defines the maximum length available.
For variable-length data such as AV and KV, an output buffer overflow can occur if the message size increases after conversion or the receiver's receive buffer is too small. In this case error 9999 "output buffer overflow" must be returned, which causes the GetLength function to be called for the remaining fields.
The GetLength function evaluates the length of the output buffer needed after conversion. It must not perform an actual conversion. The length needed must be returned in the field uOutputLen.
The GetLength function is called for the remaining fields after the Convert function has returned error 9999 "output buffer overflow".
The purpose of this function is to evaluate the length needed by the receiver's receive buffer. This length is returned to the receiver in the ACI field RETURN-LENGTH.
The receiver can then use the Broker ACI function RECEIVE with the option LAST together with a receive buffer large enough to reread the message.
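The interplay of Convert, GetLength and error 9999 can be sketched as a dispatcher on eFunction. This is an illustration under assumptions, not the delivered usrtrpc.c: the entry-point name my_sagtrpc_exit, the constant name USRTRPC_ERR_OVERFLOW and the trimmed structure UECB_SKETCH are hypothetical, the function return value stands in for register 15, and the "conversion" is a plain copy so that the length logic stays visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define USRTRPC_FCT_CONVERT   'C'
#define USRTRPC_FCT_GETLENGTH 'L'
#define USRTRPC_ERR_OVERFLOW  9999   /* hypothetical name, reserved value */

/* Trimmed stand-in for UECB with only the fields this sketch uses. */
typedef struct {
    char          *pInputBuffer;
    unsigned long  uInputLen;
    char          *pOutputBuffer;
    unsigned long  uOutputLen;
    unsigned long  uReturnedLen;
    char           eFunction;
    char           szErrorText[40];
} UECB_SKETCH;

/* Length of the data after conversion. The placeholder conversion is
 * 1:1; a real exit would compute this from its conversion tables. */
static unsigned long converted_length(const UECB_SKETCH *ue)
{
    return ue->uInputLen;
}

/* Hypothetical entry point: control block, work area, work area length. */
int my_sagtrpc_exit(UECB_SKETCH *ue, void *work, long work_len)
{
    unsigned long needed;

    (void)work;
    (void)work_len;
    assert(ue != NULL);
    needed = converted_length(ue);

    switch (ue->eFunction) {
    case USRTRPC_FCT_CONVERT:
        if (needed > ue->uOutputLen) {
            strcpy(ue->szErrorText, "output buffer overflow");
            return USRTRPC_ERR_OVERFLOW;
        }
        /* Placeholder for the real conversion. */
        memmove(ue->pOutputBuffer, ue->pInputBuffer, (size_t)needed);
        ue->uReturnedLen = needed;   /* read here as: bytes written */
        return 0;

    case USRTRPC_FCT_GETLENGTH:
        /* No conversion; report the length in the field named in the text. */
        ue->uOutputLen = needed;
        return 0;

    default:
        strcpy(ue->szErrorText, "Invalid Function");
        return 1;
    }
}
```

Keeping the length computation in one helper used by both branches means Convert and GetLength cannot disagree about how much space a field needs.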
The character-set information used is the same as in the user-written translation routine and is taken from scode (for the sender), rcode (for the receiver) and bcode (for the Broker). The character-set information depends on the direction information given in the field eDirection. See the following table:
eDirection | From Character Set | To Character Set |
---|---|---|
USRTRPC_DIR_SENDER_TO_BROKER | scode | bcode |
USRTRPC_DIR_SENDER_TO_RECEIVER | scode | rcode |
USRTRPC_DIR_BROKER_TO_RECEIVER | bcode | rcode |
Alternatively, the codepage as derived from the locale string mapping process is provided in uCpSender (sender codepage), uCpReceiver (receiver codepage) and uCpBroker (Broker codepage), and can be used to find the correct conversion table. See the following table and also Locale String Mapping.
eDirection | From Codepage | To Codepage |
---|---|---|
USRTRPC_DIR_SENDER_TO_BROKER | uCpSender | uCpBroker |
USRTRPC_DIR_SENDER_TO_RECEIVER | uCpSender | uCpReceiver |
USRTRPC_DIR_BROKER_TO_RECEIVER | uCpBroker | uCpReceiver |
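Both direction tables reduce to the same selection, which can be sketched as one small function. The helper name select_conversion_pair is hypothetical; the three values passed in may be either the character-set codes (scode, rcode, bcode) or the codepages (uCpSender, uCpReceiver, uCpBroker).

```c
#include <assert.h>
#include <stddef.h>

#define USRTRPC_DIR_SENDER_TO_BROKER   '1'
#define USRTRPC_DIR_SENDER_TO_RECEIVER '2'
#define USRTRPC_DIR_BROKER_TO_RECEIVER '3'

/* Hypothetical helper: picks the source and target of the conversion
 * according to eDirection. Returns 0 on success, -1 for an unknown
 * direction (from and to are left unchanged in that case). */
int select_conversion_pair(char direction,
                           unsigned long sender,
                           unsigned long receiver,
                           unsigned long broker,
                           unsigned long *from,
                           unsigned long *to)
{
    assert(from != NULL && to != NULL);

    switch (direction) {
    case USRTRPC_DIR_SENDER_TO_BROKER:
        *from = sender;
        *to   = broker;
        return 0;
    case USRTRPC_DIR_SENDER_TO_RECEIVER:
        *from = sender;
        *to   = receiver;
        return 0;
    case USRTRPC_DIR_BROKER_TO_RECEIVER:
        *from = broker;
        *to   = receiver;
        return 0;
    default:
        return -1;
    }
}
```

An exit would typically call this once per invocation and then look up its conversion table by the resulting pair.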
The field sFormat provides the SAGTRPC user-written conversion exit with the information on the IDL data types to convert. Each data type can be handled independently.
sFormat | Data to be converted | Notes |
---|---|---|
FMTA | IDL data type A | 1, 3, 4 |
FMTAV | IDL data type AV | 4, 5 |
FMTB | IDL data type B | 1, 2, 7 |
FMTBV | IDL data type BV | 1, 2, 7 |
FMTD | IDL data type D | 1, 2, 7 |
FMTF4 | IDL data type F4 | 1, 2, 7 |
FMTF8 | IDL data type F8 | 1, 2, 7 |
FMTI1 | IDL data type I1 | 1, 2, 7 |
FMTI2 | IDL data type I2 | 1, 2, 7 |
FMTI4 | IDL data type I4 | 1, 2, 7 |
FMTK | IDL data type K | 1, 3, 4 |
FMTKV | IDL data type KV | 4, 5 |
FMTL | IDL data type L | 1, 2, 7 |
FMTN | IDL data type N | 1, 2, 7 |
FMTP | IDL data type P | 1, 2, 7 |
FMTT | IDL data type T | 1, 2, 8 |
FMTU | IDL data type U | 1, 2, 7 |
FMTUV | IDL data type UV | 1, 2, 7 |
FMTUSER | RPC user data such as user ID, library, program... | 1, 3, 4 |
FMTMETA | RPC metadata | 1, 2, 7 |
FMTFB | RPC format buffer | 1, 2, 7 |
FMTSB | RPC metadata variable length | 4, 5, 7 |
FMTPRE | Preview data | 4, 6, 7 |
Notes:
uReturnedLen. If the output buffer in the Convert function is too small, error 9999 must be returned to the caller.
To compile and link the SAGTRPC user exit
See the README.TXT in the Broker User Exit Directory.
The user-written SAGTRPC user exit shared library/object must be accessible to the Broker worker threads.
To configure SAGTRPC user exits
Copy the user-written SAGTRPC user exit shared library/object into the EntireX lib directory.
In the Broker attribute file, set the service-specific attribute CONVERSION to the name of your SAGTRPC user exit. Example:
CONVERSION=(libmytrans.s[o|l])
or
Place the user-written translation routine shared library/object in a directory of your choice.
In the Broker attribute file, set the service-specific attribute CONVERSION to the full path name of the SAGTRPC user exit. Example:
CONVERSION=../mydir/mytrans/libmytrans.s[o|l]
To configure locale string defaults
If the broker's locale string defaults do not match your requirements, we recommend you assign suitable locale string defaults for your country and region. See the appropriate attribute under Codepage-specific Attributes for information on customizing broker's locale string defaults, and also Locale String Mapping.
To customize mapping of locale strings
If the broker's built-in locale string mechanism does not match your requirements, you can assign specific codepages to locale strings. See Broker's Built-in Locale String Mapping and the appropriate attribute under Codepage-specific Attributes for information on customizing broker's locale string defaults.