Configuring Broker for Internationalization

Software internationalization is the process of designing products and services so that they can be adapted easily to a variety of different local languages and cultures. Internationalization within EntireX means internationalization of messages: the incoming and outgoing messages are converted to the desired codepage of the platform in use. This document explains in detail how to configure the broker for character conversion.

See also Internationalization with EntireX


Configuring ICU Conversion

To configure ICU conversion

  1. In the Broker attribute file, set the service-specific attribute CONVERSION.

  2. Optionally configure a CONVERSION OPTION to tune error behavior to meet your requirements; see OPTION Values for Conversion.

  3. In the Broker attributes, check that ICU conversion is possible, that is, the attribute ICU-CONVERSION is either

    • not defined (the default is YES), or

    • set to YES.

To configure locale string defaults (optional)

  • If the broker's locale string defaults do not match your requirements (see Broker's Locale String Defaults), we recommend that you assign suitable locale string defaults for your country and region. See the respective attribute in Codepage-specific Attributes for how to customize the broker's locale string defaults.

To customize mapping of locale strings (optional)

  • If the built-in locale string mapping mechanism does not match your requirements, you can assign specific codepages to locale strings. See Broker's Built-in Locale String Mapping and locale-string for information on customizing the mapping of locale strings to codepages.

Building and Installing ICU Custom Converters

User-written ICU custom converters can be used for ACI-based Programming, RPC-based Components, and Reliable RPC.

Writing a User-written ICU Converter

ICU uses algorithmic conversion, non-algorithmic conversion and combinations of both. See ICU Conversion. Non-algorithmic converters defined by the UCM format are the easiest way to define user-written ICU converters. See UCM Format.

To write a (non-algorithmic) user-written ICU converter

  • Define the ICU converter file in UCM format using a text editor to meet your requirements.

    Note:
    For further explanation of the UCM file format, see ICU Resources.

Writing algorithmic and partially algorithmic converters can be complex. However, they can be installed into EntireX in the same way as the table-driven, non-algorithmic ones. A description of how to write algorithmic and partially algorithmic converters is beyond the scope of this documentation; please see the ICU documentation and other sources specified under ICU Resources.

Compiling a User-written ICU Converter

To compile the user-written ICU converter

  1. Extract the ICU tool makeconv and ICU shared libraries as described under Installing the EntireX ICU Custom Converter Build Environment under z/OS UNIX.

  2. Compile the converter source files (extension ".ucm") into binary converter files (extension ".cnv") using the ICU tool makeconv. Example:

    makeconv -v myebcdic.ucm

    This produces a binary converter file named myebcdic.cnv.

    Caution:
    The binary format "cnv" depends on the endianness (big-endian or little-endian) and the character set family (ASCII or EBCDIC) of the machine on which it is produced. For example, a binary converter file produced on a big-endian machine cannot be used on a little-endian machine (and vice versa), and a file produced on a machine with character set family EBCDIC cannot be used on a machine with character set family ASCII (and vice versa). We highly recommend compiling the converter source files on the same platform on which the broker runs; otherwise unpredictable results may occur.

Installing a User-written ICU Converter

To install the user-written ICU converter

  1. Define the broker attribute ICU-DATA-DIRECTORY. See Broker-specific Attributes.

    Example:

    ICU-DATA-DIRECTORY="/home/sag/EntireX/config/etb"
  2. Define the subdirectory icudt<icu-version><endianness> within the ICU-DATA-DIRECTORY

    where <icu-version> is the ICU version used, for example 54, and
      <endianness> is "e" for EBCDIC (big-endian)

    Example:

    /home/sag/EntireX/config/etb/icudt54e

    Notes:

    1. The subdirectory and its name are prescribed by the ICU standard; they are not defined by Software AG.
    2. See the Release Notes to determine the ICU version used by the broker you are running, and form the correct name accordingly. Otherwise the user-written ICU converter will not be located.
    3. There are also other approaches supported by ICU to locate converters. These approaches are (also) ICU version dependent. However, Software AG recommends the mechanism described above. See the ICU website for more information under ICU Resources.
  3. Copy the user-written ICU converter binary file (extension "cnv") to the directory referenced by ICU-DATA-DIRECTORY and its subdirectory, as defined in steps 1 and 2 above. Examples:

    /home/sag/EntireX/config/etb/icudt54e/myebcdic.cnv
    /home/sag/EntireX/config/etb/icudt54e/myascii.cnv
  4. If your application does not send the converter name as the locale string, customize the mapping of locale strings by assigning the user-written ICU converter (codepage) to locale strings in the Broker attribute file. See locale-string for how to customize the mapping of locale strings to codepages. Example:

    DEFAULTS=CODEPAGE 
    /* Customer-written ICU converter */
    CP1140=myebcdic
    CP0819=myascii
  5. For the Broker attribute, check whether ICU conversion is possible, that is, the attribute ICU-CONVERSION is not defined (default=YES) or set to YES.

  6. For the Broker attribute, check whether use of ICU custom converters is possible, that is, the attribute ICU-SET-DATA-DIRECTORY is not defined (default=YES) or set to YES.
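
The directory layout built up in steps 1 to 3 can be sketched as follows. This is an illustration only, written in Python rather than as part of any EntireX tooling; the function name converter_path and the suffix table are assumptions made for this sketch. The suffix letters follow the ICU data naming convention ("l" for little-endian ASCII, "b" for big-endian ASCII, "e" for big-endian EBCDIC).

```python
import posixpath

# ICU data-directory suffixes (ICU naming convention, not EntireX-specific).
ICU_SUFFIX = {
    ("little", "ascii"): "l",
    ("big", "ascii"): "b",
    ("big", "ebcdic"): "e",
}

def converter_path(icu_data_directory, icu_version, converter_name,
                   endianness="big", charset_family="ebcdic"):
    """Return the path where <converter_name>.cnv is expected (hypothetical helper)."""
    suffix = ICU_SUFFIX[(endianness, charset_family)]
    subdirectory = f"icudt{icu_version}{suffix}"          # for example icudt54e
    return posixpath.join(icu_data_directory, subdirectory,
                          converter_name + ".cnv")

print(converter_path("/home/sag/EntireX/config/etb", 54, "myebcdic"))
```

Comparing the printed path with the actual location of the copied file is a quick way to spot a wrong ICU version or suffix in the subdirectory name.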

Writing Translation User Exits

Introduction

EntireX Broker provides an interface for user-written translation routines in the programming language Assembler. The interface contains three parameters:

  • The address of the TRAP control block (TRAP = Translation Routine / Area for Parameters).

  • The address of a temporary work area. It is aligned to a fullword / long-integer boundary (divisible by 4). The work area can only be used for temporary needs and is cleared after return.

  • A fullword (long integer) that contains the length of the work area.

Note:
Names for user-written translation routines starting with "SAG", for example "SAGTCHA" and "SAGTRPC", are reserved for Software AG usage and must not be used.

Structure of the TRAP Control Block

The Assembler dummy section TR$TRAP covers the layout of the TRAP control block:

TR$TRAP  DSECT ,
TR$TYPE  DS       F            TRAP type
TR$TYP2  EQU      2            TRAP type ETB 121
TR$ILEN  DS       F            Input buffer length
TR$IBUF  DS       A            Address of input buffer
TR$OLEN  DS       F            Output buffer length
TR$OBUF  DS       A            Address of output buffer
TR$DLEN  DS       F            Length of data returned:
*                              Should be set to the minimum value of TR$ILEN 
*                              and TR$OLEN.
TR$SHOST DS       F            Sender's host:
*                              x'00000000' = little endian
*                              x'00000001' = big endian
TR$SCODE DS       F            Sender's character set:
SEBCIBM  EQU      X'00000022'  EBCDIC (IBM)
SEBCSNI  EQU      X'00000042'  EBCDIC (SNI)
SA88591  EQU      X'00000080'  ASCII
TR$RHOST DS       F            Receiver's host     --> see TR$SHOST
TR$RCODE DS       F            Receiver's char set --> see TR$SCODE
TR$BHOST DS       F            BROKER host         --> see TR$SHOST
TR$BCODE DS       F            BROKER char set     --> see TR$SCODE
TR$SENVA DS       F            Sender's ENVIRONMENT field supplied:
OFF      EQU      X'00000000'  ENVIRONMENT field not set
ON       EQU      X'00000001'  ENVIRONMENT field set
*
TR$RENVA DS       F            Receiver's ENVIRONMENT field supplied:
*                              --> see TR$SENVA
TR$SENV  DS       CL32         Sender's ENVIRONMENT field
TR$RENV  DS       CL32         Receiver's ENVIRONMENT field
TR$LEN   EQU      *-TR$TRAP    Length of TRAP
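
To make the field offsets of the DSECT above easier to check, the layout can be modelled outside Assembler. The following Python sketch is an illustration only; it assumes that F and A fields each occupy 4 bytes, that values are stored big-endian, and that no padding is inserted. The names TRAP_FORMAT, TRAP_FIELDS and unpack_trap are hypothetical and not part of EntireX.

```python
import struct

# Big-endian layout: 14 fullword fields (types F and A), then two CL32 fields.
# Assumption: A-type addresses are 4 bytes and no padding is inserted.
TRAP_FORMAT = ">14I32s32s"
TRAP_FIELDS = ("TYPE", "ILEN", "IBUF", "OLEN", "OBUF", "DLEN",
               "SHOST", "SCODE", "RHOST", "RCODE", "BHOST", "BCODE",
               "SENVA", "RENVA", "SENV", "RENV")

def unpack_trap(block):
    """Map a raw TRAP control block to a dict keyed by field name (without TR$)."""
    return dict(zip(TRAP_FIELDS, struct.unpack(TRAP_FORMAT, block)))

TRAP_LENGTH = struct.calcsize(TRAP_FORMAT)   # what TR$LEN evaluates to here

# Build a sample block: sender EBCDIC (IBM), receiver ASCII, no ENVIRONMENT data.
sample = struct.pack(TRAP_FORMAT, 2, 5, 0, 16, 0, 0,
                     1, 0x22, 1, 0x80, 1, 0x22, 0, 0,
                     bytes(32), bytes(32))
trap = unpack_trap(sample)
print(TRAP_LENGTH, hex(trap["SCODE"]), hex(trap["RCODE"]))
```

Under these assumptions the control block is 120 bytes long: 14 fullwords followed by the two 32-byte ENVIRONMENT fields.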

Using the TRAP Fields

The field TR$DLEN must be supplied by the user-written translation routine. It tells the Broker the length of the translated message. In our example, its value is set to the minimum of the input and output buffer lengths.

All other TRAP fields are supplied by the Broker and must not be modified by the user-written translation routine.

The incoming message is located in a buffer pointed to by TR$IBUF. The length (not to be exceeded) is supplied in TR$ILEN. The character set information from the send buffer can be taken from TR$SCODE.

The outgoing message must be written to the buffer pointed to by TR$OBUF. The length of the output buffer is given in the field TR$OLEN. The character set is specified in TR$RCODE. If the addresses given in TR$IBUF and TR$OBUF point to the same location, it is not necessary to copy the data from the input buffer to the output buffer.

The environment fields TR$SENVA and TR$RENVA are provided to handle site-dependent character set information. For the SEND and/or RECEIVE functions, you can specify data in the ENVIRONMENT field of the Broker ACI control block. This data is translated into the codepage of the platform where EntireX Broker is running (see field TR$BCODE) and is made available in the TR$SENV or TR$RENV field of the TRAP control block. TR$SENVA or TR$RENVA is set to ON if environment data is available. Any values given in the API field ENVIRONMENT must correspond to the values handled in the translation routine.
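
The buffer handling described above can be sketched as a small model. The sketch is written in Python for illustration only and is not the delivered Assembler routine. The choice of cp037 (EBCDIC) as sender codepage and ISO 8859-1 as receiver codepage is an assumption; a real routine would select its table from TR$SCODE and TR$RCODE. The names EBCDIC_TO_ASCII and translate are hypothetical.

```python
# Build a 256-byte table that maps every EBCDIC (cp037) byte to ISO 8859-1.
# Bytes without an ISO 8859-1 equivalent, if any, are replaced by "?".
EBCDIC_TO_ASCII = bytes(
    bytes([b]).decode("cp037").encode("latin-1", errors="replace")[0]
    for b in range(256)
)

def translate(input_buffer, output_length, table=EBCDIC_TO_ASCII):
    """Return (output, dlen) the way a routine would fill TR$OBUF and TR$DLEN."""
    # TR$DLEN: minimum of input length (TR$ILEN) and output length (TR$OLEN).
    dlen = min(len(input_buffer), output_length)
    return input_buffer[:dlen].translate(table), dlen

output, dlen = translate(b"\xC8\xC5\xD3\xD3\xD6", 16)   # "HELLO" in EBCDIC
print(output, dlen)
```

A table-driven, byte-for-byte translation like this never changes the message length, which is why setting TR$DLEN to the smaller of the two buffer lengths is sufficient.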

To assemble and link the SAGTCHA user-written translation routine

  • Assemble and link your translation routine. You can give the resulting load module any name that does not begin with "SAG". Names starting with "SAG", such as "SAGTCHA", are reserved for Software AG.

Configuring Translation User Exits

To configure translation user exits

As a prerequisite, the user-written translation module must be accessible to the Broker worker threads.

  1. Copy the user-written translation module into any library of the Broker's steplib concatenation.

  2. In the Broker attribute file, set the service-specific attribute TRANSLATION to the name of the user-written translation routine. Example:

    TRANSLATION=MYTRANS

Writing SAGTRPC User Exits

Introduction

EntireX Broker provides an interface to SAGTRPC user exit routines written in the programming language Assembler. The interface contains three parameters:

  • The address of the UE (user exit) control block.

  • The address of a temporary work area. It is aligned to a fullword / long-integer boundary (divisible by 4). The work area can only be used temporarily and is cleared after return.

  • A fullword (long integer) that contains the length of the work area.

Note:
Names for conversion routines starting with "SAG", for example "SAGTCHA" and "SAGTRPC", are reserved for Software AG usage and must not be used.

Structure of the User Exit Control Block

The Assembler dummy section UE$CB shows the layout of the user exit control block.

UE$CB    DSECT , ...... User Exit Control Block                          
*                       ***********************                          
*                                                            Direction   
*                                                            ---------   
UE$VERS  DS    F        UECB version                          input      
UE$VER1  EQU   1        UECB version 1                                   
UE$IBUF  DS    A        Address of input buffer               input      
UE$ILEN  DS    F        Input buffer length                   input      
UE$OBUF  DS    A        Address of output buffer              input      
UE$OLEN  DS    F        Output buffer length                  input      
UE$DLEN  DS    F        Length of data returned               output     
*                                                                        
UE$SHOST DS    F        Senders host:                         input      
*                       x'00000000' = little endian                      
*                       x'00000001' = big endian                         
*                                                                        
UE$SCODE DS    F        Senders character set:                input      
SEBCIBM EQU X'00000022' EBCDIC (IBM)                                     
SEBCSNI EQU X'00000042' EBCDIC (SNI)                                     
SA88591 EQU X'00000080' ASCII
*                                                                    
UE$RHOST DS    F        Receivers host      --> see UE$SHOST  input  
UE$RCODE DS    F        Receivers char set  --> see UE$SCODE  input  
UE$BHOST DS    F        BROKER host         --> see UE$SHOST  input  
UE$BCODE DS    F        BROKER char set     --> see UE$SCODE  input  
*                                                                    
UE$SCP   DS    F        Sender   Codepage number                 
UE$RCP   DS    F        Receiver Codepage number                 
UE$BCP   DS    F        Broker   Codepage number                 
*                                                                    
UE$FCT   DS    CL1      Function                              input  
FCTCONV  EQU   C'C'     Function CONVERT                             
FCTGLEN  EQU   C'L'     Function GETLENGTH                           
UE$DIR   DS    CL1      Direction                             input  
DIRS2B   EQU   C'1'     Direction Sender to Broker                   
DIRS2R   EQU   C'2'     Direction Sender to Receiver                 
DIRB2R   EQU   C'3'     Direction Broker to Receiver                 
UE$FMT   DS    CL2      Format                                input  
FMTUSER  EQU   C'01'    User Data like User ID, Program, Library      
FMTMETA  EQU   C'02'    Meta Data Header                             
FMTFB    EQU   C'03'    Format Buffer  
FMTSB    EQU   C'04'    String Buffer 
FMTVBN   EQU   C'05'    Meta data value buffer 
FMTPRE   EQU   C'99'    Preview format buffer                        
FMTA     EQU   C'A '    Data Type A                                  
FMTAV    EQU   C'AV'    Data Type AV                                 
FMTB     EQU   C'B '    Data Type B                                  
FMTBV    EQU   C'BV'    Data Type BV                                 
FMTD     EQU   C'D '    Data Type D                                  
FMTF4    EQU   C'F4'    Data Type F4                                 
FMTF8    EQU   C'F8'    Data Type F8                                 
FMTI1    EQU   C'I1'    Data Type I1                                 
FMTI2    EQU   C'I2'    Data Type I2                                 
FMTI4    EQU   C'I4'    Data Type I4                                 
FMTK     EQU   C'K '    Data Type K                                  
FMTKV    EQU   C'KV'    Data Type KV                                 
FMTL     EQU   C'L '    Data Type L                                  
FMTN     EQU   C'N '    Data Type N                                  
FMTP     EQU   C'P '    Data Type P                                  
FMTT     EQU   C'T '    Data Type T                                  
FMTU     EQU   C'U '    Data Type U                                  
FMTUV    EQU   C'UV'    Data Type UV                                 
UE$ETXT  DS    CL40     Error Text output                            
UE$LEN   EQU   *-UE$CB  Length of UECB                               
         SPACE ,

The user-written conversion exit example USRTRPC is delivered in the EntireX common source library EXX101.SRCE. The related JCL to build USRTRPC is in member EXBUSRXT in the EntireX common jobs library. See Contents of Mainframe Installation Medium.

Using the User Exit Interface Fields

The user exit provides two separate functions, CONVERT and GETLENGTH. The field UE$FCT indicates the function to execute.

Errors

Both functions can return an error to SAGTRPC by setting register 15 to a value in the range 1 to 9999 and supplying an error text in the field UE$ETXT.

  • A value of 0 returned in register 15 means successful response.

  • Error 9999 is reserved for output buffer overflow. See CONVERT Function.

  • When an error occurs, the conversion of the message will be aborted and the error text will be sent to the receiver (client or server). The error is prefixed with the error class 1011. See Message Class 1011 - User-definable SAGTRPC Conversion Exit.

Example:

The user exit returns 1 in register 15 and the message "Invalid Function" in UE$ETXT. The receiver gets the error message 10110001 Invalid Function.
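
The composition of the message in this example can be sketched as follows. The Python function below is an illustration only; the name receiver_message is hypothetical. It assumes that the error class and the return code are each shown as four digits and that trailing blanks of the 40-character field UE$ETXT are dropped.

```python
ERROR_CLASS = 1011   # message class of the user-definable SAGTRPC conversion exit
OVERFLOW = 9999      # reserved return code: output buffer overflow

def receiver_message(return_code, error_text):
    """Format the message the receiver gets for a non-zero register 15 value."""
    if not 1 <= return_code <= 9999:
        raise ValueError("error codes must be in the range 1 to 9999")
    text = error_text[:40].rstrip()          # UE$ETXT is a CL40 field
    return f"{ERROR_CLASS:04d}{return_code:04d} {text}"

print(receiver_message(1, "Invalid Function"))
```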

CONVERT Function

This function has to be executed when the contents of UE$FCT match the definition FCTCONV.

UE$DLEN must be supplied by SAGTRPC's user-written conversion exit. Its value must be set to the length of the output buffer.

All other interface fields are supplied by the Broker and must not be modified by SAGTRPC's user-written conversion exit.

The incoming data is located in a buffer pointed to by UE$IBUF. UE$ILEN defines the length.

The outgoing converted message must be written to the buffer pointed to by UE$OBUF. The field UE$OLEN defines the maximum length available.

For variable-length data such as AV and KV, an output buffer overflow can occur if the message size increases after conversion or the receiver's receive buffer is too small. In this case, error 9999 "output buffer overflow" must be returned; the GETLENGTH function is then called for the remaining fields.

GETLENGTH Function

The GETLENGTH function evaluates the length the output buffer needs after conversion. An actual conversion must not be performed. The length needed must be returned in the field UE$DLEN, which is the only output field of the user exit control block apart from UE$ETXT.

The GETLENGTH function is called for remaining fields after the CONVERT function returned the error 9999 "output buffer overflow".

The purpose of this function is to evaluate the length needed by the receiver's receive buffer. This length is returned to the receiver in the ACI field RETURN-LENGTH. The receiver can then use the Broker ACI function RECEIVE with the option LAST together with a receive buffer large enough to reread the message.
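
The interplay of the two functions can be sketched with a small model. It is written in Python for illustration only and is not the Assembler exit itself. The codepage pair (ISO 8859-1 to UTF-8) is an assumption, chosen because the data can grow during conversion; the function names convert and get_length are hypothetical.

```python
OVERFLOW = 9999   # reserved return code: output buffer overflow

def convert(data, output_length):
    """CONVERT: return (rc, output, dlen); rc 9999 means the buffer is too small."""
    converted = data.decode("latin-1").encode("utf-8")
    if len(converted) > output_length:        # would exceed UE$OLEN
        return OVERFLOW, b"", 0
    return 0, converted, len(converted)       # dlen goes into UE$DLEN

def get_length(data):
    """GETLENGTH: compute the length needed without converting anything.

    Every ISO 8859-1 byte from 0x80 upwards takes two bytes in UTF-8.
    """
    return sum(2 if byte >= 0x80 else 1 for byte in data)

field = b"M\xfcller"                          # 6 bytes in ISO 8859-1
rc, _, _ = convert(field, 6)                  # too small: one byte expands
needed = get_length(field)
print(rc, needed, convert(field, needed))
```

In this model the receiver would learn the required length from the GETLENGTH result and repeat the call with a sufficiently large buffer, corresponding to the RECEIVE with option LAST described above.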

Character Set and Codepage

The character-set information used is the same as in the user-written translation routine and is taken from UE$SCODE (for the sender), UE$RCODE (for the receiver) and UE$BCODE (for the Broker). The character-set information depends on the direction information given in the field UE$DIR. See the following table:

UE$DIR                        From Character Set   To Character Set
DIRS2B (Sender to Broker)     UE$SCODE             UE$BCODE
DIRS2R (Sender to Receiver)   UE$SCODE             UE$RCODE
DIRB2R (Broker to Receiver)   UE$BCODE             UE$RCODE

Alternatively, the codepage as derived from the locale string mapping process is provided in UE$SCP (sender codepage), UE$RCP (receiver codepage) and UE$BCP (Broker codepage), and can be used to find the correct conversion table. See the following table and also Locale String Mapping.

UE$DIR                        From Codepage        To Codepage
DIRS2B (Sender to Broker)     UE$SCP               UE$BCP
DIRS2R (Sender to Receiver)   UE$SCP               UE$RCP
DIRB2R (Broker to Receiver)   UE$BCP               UE$RCP
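
The two lookup tables above can be expressed as data. The following Python sketch is an illustration only; the dictionary representation of the control block and the function name conversion_pair are assumptions. The keys "1", "2" and "3" are the character values of DIRS2B, DIRS2R and DIRB2R from the DSECT.

```python
# (from, to) field names per UE$DIR value.
CHARSET_FIELDS = {
    "1": ("UE$SCODE", "UE$BCODE"),   # DIRS2B: sender to broker
    "2": ("UE$SCODE", "UE$RCODE"),   # DIRS2R: sender to receiver
    "3": ("UE$BCODE", "UE$RCODE"),   # DIRB2R: broker to receiver
}
CODEPAGE_FIELDS = {
    "1": ("UE$SCP", "UE$BCP"),
    "2": ("UE$SCP", "UE$RCP"),
    "3": ("UE$BCP", "UE$RCP"),
}

def conversion_pair(uecb, use_codepage=True):
    """Return the (from, to) values that apply to the direction in UE$DIR."""
    table = CODEPAGE_FIELDS if use_codepage else CHARSET_FIELDS
    source_field, target_field = table[uecb["UE$DIR"]]
    return uecb[source_field], uecb[target_field]

# Hypothetical control block contents: EBCDIC sender and broker, ASCII receiver.
uecb = {"UE$DIR": "2",
        "UE$SCODE": 0x22, "UE$RCODE": 0x80, "UE$BCODE": 0x22,
        "UE$SCP": 1140, "UE$RCP": 819, "UE$BCP": 1140}
print(conversion_pair(uecb), conversion_pair(uecb, use_codepage=False))
```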

Software AG IDL Data Types to Convert

The field UE$FMT provides the SAGTRPC user-written conversion exit with the information on the IDL data types to convert. Each data type can be handled independently.

UE$FMT    Data to be converted                                Notes
FMTA      IDL data type A                                     1, 3, 4
FMTAV     IDL data type AV                                    4, 5
FMTB      IDL data type B                                     1, 2, 7
FMTBV     IDL data type BV                                    1, 2, 7
FMTD      IDL data type D                                     1, 2, 7
FMTF4     IDL data type F4                                    1, 2, 7
FMTF8     IDL data type F8                                    1, 2, 7
FMTI1     IDL data type I1                                    1, 2, 7
FMTI2     IDL data type I2                                    1, 2, 7
FMTI4     IDL data type I4                                    1, 2, 7
FMTK      IDL data type K                                     1, 3, 4
FMTKV     IDL data type KV                                    4, 5
FMTL      IDL data type L                                     1, 2, 7
FMTN      IDL data type N                                     1, 2, 7
FMTP      IDL data type P                                     1, 2, 7
FMTT      IDL data type T                                     1, 2, 7
FMTU      IDL data type U                                     1, 2, 7
FMTUV     IDL data type UV                                    1, 2, 7
FMTUSER   RPC user data such as user ID, library, program...  1, 3, 4
FMTMETA   RPC metadata                                        1, 2, 7
FMTFB     RPC format buffer                                   1, 2, 7
FMTSB     RPC metadata variable length                        4, 5, 7
FMTPRE    Preview data                                        4, 6, 7

Notes:

  1. Field length is constant.
  2. The field content length must not increase or decrease during conversion. If this happens, the user exit should produce an error.
  3. If the field content length decreases during the conversion, suitable padding characters (normally blanks) have to be used.
    If the field content length increases during conversion and exceeds the field length, the contents must be truncated or, alternatively, the conversion can be aborted and an error produced.
  4. If the contents are truncated, character boundaries are the responsibility of the user exit. Complete valid characters after conversion have to be guaranteed. This may be a complex task for codepages described under Arabic Shaping, EBCDIC Stateful Codepages or Multibyte or Double-byte Codepages. For single-byte codepages it is simple because the character boundaries are the same as the byte boundaries.
  5. The field length can decrease or increase during the conversion up to the output buffer length. The new field length must be returned in UE$DLEN. If the output buffer in the CONVERT function is too small, error 9999 must be returned to the caller.
  6. The field buffer should continue to be converted until the output buffer is full or the input buffer has been processed. If the field content length increases or truncations occur, no error should be produced. If the field content length decreases, there should be no padding. The new field length should simply be returned to the caller.
  7. Codepages used for RPC data streams must meet several requirements. See Codepage Requirements for RPC Data Stream Conversions. If these are not met, the codepage cannot be used to convert RPC data streams.
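
The handling of fixed-length fields described in note 3 can be sketched as follows. The Python function is an illustration only and covers single-byte codepages, where character boundaries and byte boundaries coincide (see note 4). The name fit_fixed_field and its parameters are hypothetical; the default padding byte is an ASCII blank, and an EBCDIC target would use X'40' instead.

```python
def fit_fixed_field(converted, field_length, pad=b" ", truncate=True):
    """Pad or truncate converted content to a constant field length (note 3)."""
    if len(converted) < field_length:
        # Content became shorter: fill up with padding characters.
        return converted + pad * (field_length - len(converted))
    if len(converted) > field_length:
        if not truncate:
            # Alternative allowed by note 3: abort and report an error.
            raise ValueError("converted content exceeds field length")
        return converted[:field_length]
    return converted

print(fit_fixed_field(b"AB", 4), fit_fixed_field(b"ABCDEF", 4))
```

For multibyte or stateful codepages, a plain byte slice like the one above could cut a character in half, so the truncation point would have to be moved back to the previous character boundary.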

To assemble and link the SAGTRPC user-written conversion exit

  • Assemble and link your conversion exit. You can give the resulting load module any name that does not begin with "SAG". Names starting with "SAG", such as "SAGTRPC", are reserved for Software AG. Refer to the JCL provided in member EXBUSRXT in data set EXX101.JOBS.

To install and configure the SAGTRPC user-written conversion exit

Configuring SAGTRPC User Exits

To configure SAGTRPC user exits

  1. The user-written conversion module must be accessible to the Broker worker threads. Therefore, copy the user-written conversion module into any library of the Broker's steplib concatenation.

  2. In the Broker attribute file, set the service-specific attribute CONVERSION to the name of the user-written SAGTRPC user exit routine. Example:

    CONVERSION=(MYTRANS)

To configure locale string defaults

  • If the broker's locale string defaults do not match your requirements, we recommend that you assign suitable locale string defaults for your country and region. See the appropriate attribute under Codepage-specific Attributes for information on customizing the broker's locale string defaults, and also Locale String Mapping.

To customize mapping of locale strings