This document describes the syntax and use of the data definitions to define the layout of files in the database. This input has to be contained in the sequential file FDUFDT that is input to the utility ADAFDU.
The data definitions are used to create the field definition table (FDT) for the file. This table is used by Adabas while executing Adabas commands to determine the logical structure and characteristics of any given field (or group) in the file.
A separate data-definition statement is required for each field or group to be defined.
The syntax used in constructing data definition entries is:
level-number, name [,standard_length, standard_format] [(,definition_option)...]
level-number and name are required. Any number of spaces may be inserted between definition entries in a line. All text after a semicolon is treated as a comment, and a line that starts with a semicolon is treated as a comment line. Any number of empty lines is allowed.
The level number is a one- or two-digit number in the range 01...07 used in conjunction with field grouping. Leading zeros are optional. Fields may be defined at levels in the range 01...07, where any field with a level number of 02 or greater is considered to be a member of the group on the next lowest level.
Groups may be defined on levels in the range 01...06 and may contain other groups. Level numbers may not be skipped when assigning the level numbers for a group.
The definition of a group enables the user to reference a series of fields (may also be only one field) by using the group name. This is a convenient and efficient method of referencing a series of consecutive fields.
01,GA        ; Group
02,A1,...    ; Elementary or Multiple Value field
02,A2,...    ; Elementary or Multiple Value field
01,GB        ; Group
02,B1,...    ; Elementary or Multiple Value field
02,GC        ; Group
03,C1,...    ; Elementary or Multiple Value field
03,C2,...    ; Elementary or Multiple Value field
Fields A1 and A2 are in group GA. Field B1 and group GC (consisting of fields C1 and C2) are in group GB. The periods (...) denote further specifications.
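The grouping rules can be illustrated with a small parser. This is only a sketch, not part of ADAFDU: the function names are hypothetical, the lengths and formats in the sample are illustrative, and the simple comma split does not handle options that contain commas themselves.

```python
def parse_definitions(text):
    """Parse data-definition lines into (level, name, rest) tuples."""
    fields = []
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()   # text after ';' is a comment
        if not line:
            continue                            # skip empty and comment lines
        parts = [p.strip() for p in line.split(",")]
        level, name = int(parts[0]), parts[1]
        if not 1 <= level <= 7:
            raise ValueError(f"level {level} out of range 01...07")
        fields.append((level, name, parts[2:]))
    return fields


def parent_groups(fields):
    """Map each name to the name of its enclosing group (None at level 01)."""
    parents, stack = {}, []          # stack[i] holds the open name at level i+1
    for level, name, _ in fields:
        if level > len(stack) + 1:
            raise ValueError(f"level number skipped before {name}")
        del stack[level - 1:]        # close entries at the same or deeper level
        parents[name] = stack[-1] if stack else None
        stack.append(name)
    return parents


# Illustrative input; lengths and formats are made up for the example.
SAMPLE = """
01,GA       ; Group
02,A1,4,A   ; Elementary field
02,A2,4,A
01,GB
02,B1,2,B
02,GC
03,C1,8,A
03,C2,8,A
"""
parents = parent_groups(parse_definitions(SAMPLE))
print(parents)
```

The stack keeps one open entry per level, so a level number that jumps by more than one is reported as skipped.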
The name to be assigned to the field or group.
The name must be two characters in length. The first character must be alphabetic and the second character alphabetic or numeric; upper case and lower case characters are allowed. No special characters are permitted. A maximum of 3214 fields can be defined in a single Adabas record.
The values E0 through E9 are reserved as edit masks and may not be used (see Calling Adabas in the Command Reference Manual for further information about edit masks).
Names must be unique within a file. Names which are English prepositions or articles such as AN, AT, BY, IF, IN, OF, ON etc. should not be used because of possible conflict with syntactical terms used by NATURAL.
Valid Names | Invalid Names |
---|---|
AA | A (not 2 characters) |
e3 | E3 (edit mask) |
S3 | F* (special character) |
wm | 3M (first character not alphabetic) |
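The naming rules above can be checked mechanically. The following is a minimal sketch with a hypothetical helper name. It covers only the syntactic rules and the reserved edit masks E0 through E9, not uniqueness within a file or the advice about NATURAL keywords.

```python
import re

_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9]")    # letter, then letter or digit
_RESERVED = {f"E{d}" for d in range(10)}         # E0..E9 are edit masks


def is_valid_field_name(name):
    """Return True if `name` satisfies the naming rules described above."""
    return bool(_NAME_RE.fullmatch(name)) and name not in _RESERVED


for candidate in ("AA", "e3", "S3", "wm", "A", "E3", "F*", "3M"):
    print(candidate, is_valid_field_name(candidate))
```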
The standard length of the field (expressed in bytes). Standard length is used to define the standard (default) length to be used by Adabas during command processing. The standard length specified is entered in the field definition table (FDT) and used when the field is read/updated, unless the user specifies a length override.
The maximum field lengths which may be specified are:
Format | Maximum Length |
---|---|
ALPHANUMERIC | With the LA or L4/LB option: 16381 bytes if no LOB file is defined, or if the field is a descriptor or a parent of a derived descriptor; otherwise 65533 bytes for an LA field and 2147483543 bytes for an L4/LB field. Without these options: 253 bytes |
BINARY | 126 bytes |
FIXED POINT | 8 bytes (1,2,4,8 bytes only) |
FLOATING | 8 bytes (4, 8 bytes only) |
PACKED DECIMAL | 15 bytes |
UNPACKED DECIMAL | 29 bytes |
UNICODE | With the LA or L4 option: 16381 bytes in UTF-8 encoding. Without these options: 253 bytes in UTF-8 encoding (see note below) |
Note:
The length of a Unicode field depends on the encoding used.
Internally, Adabas uses UTF-8 encoding to store Unicode fields, but the Adabas
user can use other encodings to access Unicode fields; in those encodings
there is no fixed maximum size for a field.
Standard length may not be specified with a group name.
Standard length does not limit the size of any given field value (unless the FI option is used). The user may issue a READ or UPDATE command in which a length greater than the standard length is specified.
If standard length is omitted for a field, the field is assumed to be a variable-length field. Variable-length fields have no standard (default) length. If a variable-length field is referenced without a length override during an Adabas command, the value of the field will be returned preceded by a one-byte field which contains the length of the value (including the length byte). The user must supply this length byte when the field is updated.
The standard format of the field (expressed as a one-character code):
Code | Format |
---|---|
A | Alphanumeric (left-justified) |
B | Binary (right-justified, unsigned) |
F | Fixed point (right-justified, signed) |
G | Floating point (single or double precision) |
P | Packed decimal (right-justified, signed) |
U | Unpacked decimal (right-justified, signed) |
W | Unicode (see note below) |
Note:
The field is stored internally in UTF-8 encoding, but when you
access the field you can specify a different encoding, to or from which the
value is converted.
The standard format is used to define the standard (default) format to be used by Adabas during command processing. The standard format specified is entered in the field definition table and is used when the field is read/updated, unless the user specifies a format override.
Standard format must be specified for a field. It may not be specified with a group name. A group has no default format. When a group is referenced, the fields within the group are always returned, or must be provided, according to the standard format of each individual field.
Definition options are specified by two-character codes as described below. These codes may be specified in any order, separated by a comma, as the last entries of a data definition statement.
DE indicates that the field is to be a descriptor. Entries will be made in the Associator inverted list for the field, enabling this field to be used in a search expression, as a sort key in a FIND command or to control logical sequential reading.
A maximum of 256 descriptors (including phonetic descriptors, subdescriptors, superdescriptors and hyperdescriptors) may be specified for a file.
The descriptor option should be used judiciously, particularly if the file is large and the field being considered as a descriptor is updated frequently.
There are various ways in which date/time values can be stored in the database, e.g.:
Timestamps in the format YYYYMMDDhhmmss
Natural date/time fields
UNIX time_t
The date/time edit mask specified with the DT option defines which date/time format is used to store the date/time values internally.
The syntax for the field option DT in a field definition is
DT=date_time_edit_mask
where date_time_edit_mask is
E(date_time_edit_mask_name)
The following date / time edit masks are supported:
Date_time_edit_mask | Description | Value 0 is |
---|---|---|
E(DATE) | Date: YYYYMMDD | Invalid date – unknown |
E(TIME) | Time: HHIISS | 00:00:00 |
E(DATETIME) | Date and time: YYYYMMDDHHIISS | Invalid date – unknown |
E(TIMESTAMP) | Numeric timestamp with microsecond precision: YYYYMMDDHHIISS6 | Invalid date – unknown |
E(NATTIME) | Natural T format (tenths of seconds since 0000-01-02) | 0000-01-02:00:00:00.0 before year 1 – unknown |
E(NATDATE) | Natural D format (days since year 0000) | 0000-01-02 before year 1 – unknown |
E(UNIXTIME) | UNIX time_t type (seconds since 1970), always UTC (Coordinated Universal Time) based | 1970-01-01:00:00:00 |
E(XTIMESTAMP) | UNIX timestamp with microseconds since 1970 (UNIXTIME * 1000000) + microseconds, always UTC-based | 1970-01-01:00:00:00.000000 |
The DT option is only allowed with the formats B, F, P and U. The length specified for a field with the DT option must be large enough to store the date/time values. The following table shows the required minimum lengths and whether the field option TZ (local time zone) is allowed (for more information, see the description of TZ):
Date_time_edit_mask | Minimum length for format B | Minimum length for format F | Minimum length for format P | Minimum length for format U | TZ option allowed |
---|---|---|---|---|---|
E(DATE) | 4 | 4 | 5 | 8 | no |
E(TIME) | 3 | 4 | 4 | 6 | no |
E(DATETIME) | 6 | 8 | 8 | 14 | yes |
E(TIMESTAMP) | - | - | 11 | 20 | yes |
E(NATTIME) | 5 | 8 | 7 | 12 | yes |
E(NATDATE) | 3 | 4 | 4 | 6 | no |
E(UNIXTIME) | 4 | 4 | 6 | 10 | yes |
E(XTIMESTAMP) | 7 | 8 | 9 | 16 | yes |
The formats B and F are not allowed with E(TIMESTAMP).
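The table lends itself to a simple lookup when checking a field definition. The sketch below transcribes the table values into a dictionary. The function name and the returned messages are hypothetical and are not ADAFDU output.

```python
# mask: ({format: minimum length}, TZ allowed); values taken from the table above
MIN_LENGTH = {
    "DATE":       ({"B": 4, "F": 4, "P": 5, "U": 8}, False),
    "TIME":       ({"B": 3, "F": 4, "P": 4, "U": 6}, False),
    "DATETIME":   ({"B": 6, "F": 8, "P": 8, "U": 14}, True),
    "TIMESTAMP":  ({"P": 11, "U": 20}, True),           # B and F not allowed
    "NATTIME":    ({"B": 5, "F": 8, "P": 7, "U": 12}, True),
    "NATDATE":    ({"B": 3, "F": 4, "P": 4, "U": 6}, False),
    "UNIXTIME":   ({"B": 4, "F": 4, "P": 6, "U": 10}, True),
    "XTIMESTAMP": ({"B": 7, "F": 8, "P": 9, "U": 16}, True),
}


def check_dt_field(mask, fmt, length, tz=False):
    """Return a list of problems (empty if the combination looks valid)."""
    problems = []
    lengths, tz_allowed = MIN_LENGTH[mask]
    if fmt not in lengths:
        problems.append(f"format {fmt} not allowed with E({mask})")
    elif length < lengths[fmt]:
        problems.append(f"length {length} is below the minimum {lengths[fmt]}")
    if tz and not tz_allowed:
        problems.append(f"TZ not allowed with E({mask})")
    return problems


print(check_dt_field("DATETIME", "U", 14, tz=True))
print(check_dt_field("TIMESTAMP", "B", 8))
```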
If you use date/time edit masks, date/time values between 0001-01-01:00:00:00.000000 and 9999-12-31:23:59:59.999999 are allowed. The value 0 is also allowed for those date/time edit masks where 0 is not a valid date/time value; in these cases, the value 0 means “unknown”. If this value is specified for an NC field, the significance indicator is set to -1, independent of any significance indicator provided in the format/record buffer. If you convert a value 0 that represents an unknown date to a field with another date/time edit mask, the result is always 0. If the target field is an NC field, the significance indicator is set to -1.
Dates before 1582, when the Gregorian calendar was introduced, are handled as if the Gregorian calendar had already been in effect at the dates in question (proleptic Gregorian calendar):
You can enter dates that did not exist historically;
Dates that existed historically, but which are not defined in the proleptic Gregorian calendar are rejected for NATDATE, NATTIME, UNIXTIME and XTIMESTAMP;
If you compute time intervals, you may get results that are not equal to the historical time intervals.
For DATE, DATETIME and TIMESTAMP, it is possible to specify dates that existed historically according to the Julian calendar, but which do not exist in the proleptic Gregorian calendar; it is the user's responsibility to decide which semantics to assign to such date/time fields. However, if you try to convert such a date to another date/time edit mask, you get an error (Adabas response code 55).
Depending on the format/length used for UNIXTIME or XTIMESTAMP fields, it is possible that only a subset of the range between years 0 and 9999 is supported:
If you use the format B for UNIXTIME or XTIMESTAMP fields, it is not possible to store date/time values before 1970 - this would require negative values that are not supported with the format B.
If you use the format F with a length of 4 for UNIXTIME, you will not be able to store date/time values after January 19, 2038.
If you use the format B with a length of 4 for UNIXTIME, you can specify date/time values until the year 2106.
If you use the format U with a length of 10, the maximum date for UNIXTIME is in the 23rd century.
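The limits listed above follow from the largest number each format can hold. The sketch below computes them, assuming that format F is a signed two's-complement integer, format B an unsigned integer, and format U one decimal digit per byte. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def max_unixtime(fmt, length):
    """Largest UNIXTIME value (as a UTC datetime) that fits the given format."""
    if fmt == "F":                       # signed fixed point
        seconds = 2 ** (8 * length - 1) - 1
    elif fmt == "B":                     # unsigned binary
        seconds = 2 ** (8 * length) - 1
    elif fmt == "U":                     # unpacked decimal, one digit per byte
        seconds = 10 ** length - 1
    else:
        raise ValueError(f"unsupported format {fmt}")
    return EPOCH + timedelta(seconds=seconds)


for fmt, length in (("F", 4), ("B", 4), ("U", 10)):
    print(fmt, length, max_unixtime(fmt, length).isoformat())
```

For larger lengths the result would exceed the range of Python's `datetime`, so the sketch is only meant for the small lengths discussed here.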
Fields with date/time edit masks should not be used to store time intervals.
It is possible to add the DT option to fields in files that already exist. In order to guarantee compatibility with existing applications, fields defined with the DT option (but without the TZ option) are handled as follows:
Adabas does not check whether all values stored before adding the DT option are correct date/time values; this is the responsibility of the user, who must ensure the integrity of the file.
If you attempt to read the field with a date/time edit mask and the field value is not a valid date/time value, you will get an Adabas response code 55. If you read the field without a date/time edit mask, Adabas does not check whether the value is a correct date/time value.
If you don’t specify a date/time edit mask for the field in the format buffer for an add/update command, the field is processed as if the field was defined without the DT option - no checks are made for correct date/time values. In order to ensure that the field contains correct date/time values, it is recommended to use date/time edit masks in the format buffer for all updates made to date/time fields - in this case, invalid date/time values are rejected with an Adabas response code 55.
FI indicates that the field is to occupy a fixed amount of storage and is not to be compressed.
In the Data Storage, the field value is stored without an internal length byte.
The FI option is recommended for fields with a length of 1 or 2 bytes which have a low probability of containing a null value, as well as for fields containing non-compressible values.
The FI option is not recommended for fields defined as multiple-value fields or for fields in a periodic group at the end of a record. Any null values for such fields will not be suppressed (or compressed), which may result in considerable waste of disk storage and increased processing times.
Item | Without FI option | With FI option |
---|---|---|
Definition | 01,AA,3,P | 01,AA,3,P,FI |
User data | 33104C | 33104C |
Internal representation | 0433104C (4 bytes) | 33104C (3 bytes) |
User data | 00003C | 00003C |
Internal representation | 023C (2 bytes) | 00003C (3 bytes) |
Restrictions on FI usage:
The FI, NC and NU options are mutually exclusive;
The FI option must not be specified for variable-length fields (standard length omitted);
A field defined with the FI option cannot be updated with a value which exceeds the standard length of the field.
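The internal representations in the FI example can be reproduced with a toy model. It assumes that compression of a right-justified numeric value simply drops leading zero bytes and adds an inclusive length byte. This is inferred from the example values only and is not a description of the actual Adabas compression algorithm.

```python
def internal_representation(value_hex, fixed=False):
    """Toy model of how a right-justified numeric value is stored."""
    value = bytes.fromhex(value_hex)
    if fixed:                            # FI: stored as is, no length byte
        return value.hex().upper()
    stripped = value.lstrip(b"\x00")     # assumed: leading zero bytes dropped
    length = len(stripped) + 1           # length byte counts itself
    return (bytes([length]) + stripped).hex().upper()


for user_data in ("33104C", "00003C"):
    print(user_data,
          internal_representation(user_data),
          internal_representation(user_data, fixed=True))
```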
The Adabas binary field format B is used by applications in two ways, either for unsigned integer values or for bit strings with arbitrary bit combinations and length. While in the first case, the values are expected to be ordered according to the hardware architecture and to be swapped if exchanged between different integer architectures, in the second case, the values are always interpreted as high-order-first (Big Endian) values; this means that the values are not swapped when exchanged between different integer architectures.
In order to enable both kinds of usage, the high-order first option (HF) was introduced for binary fields:
If a binary field is defined without the HF option, the values are interpreted as unsigned integers according to the byte order defined on the current hardware.
If a binary field is defined with the HF option, the values are always interpreted as high-order-first values, also on low-order-first platforms.
Note:
Natural expects fields with the Natural format B to always be binary fields defined with the HF option. If you use binary fields without the HF option in Natural, you will get processing errors:
- If you access these fields from a database on a machine with a different integer architecture, the values will be swapped.
- If the fields are descriptors and are stored on a machine with low-order-first architecture, the sort sequence will not be as expected.
If a B field that is defined with the HF option is a part of a superdescriptor and the format of the resulting superdescriptor is not alpha, then the HF option is also applied to the superdescriptor. This means that the superdescriptor values are stored in the high-order first format.
The following fields are defined in the FDT:
1,B1,4,B
1,B2,4,B,HF
The following format buffer is defined:
FB="B1,B2"
and the following array is used as a record buffer:
unsigned char RB[8];
The record buffer on a high-order first machine is now filled with the following commands:
RB[0] = 1;   /* B1 = 0x01020304 high-order first */
RB[1] = 2;
RB[2] = 3;
RB[3] = 4;
RB[4] = 1;   /* B2 = { 1, 2, 3, 4 } */
RB[5] = 2;
RB[6] = 3;
RB[7] = 4;
Reading the values from a low-order first machine returns the following values:
RB[0] = 4;   /* B1 = 0x01020304 low-order first */
RB[1] = 3;
RB[2] = 2;
RB[3] = 1;
RB[4] = 1;   /* B2 = { 1, 2, 3, 4 } */
RB[5] = 2;
RB[6] = 3;
RB[7] = 4;
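The same behaviour can be modelled in a few lines. This is a sketch only: the function name is hypothetical, and it models just the byte swapping seen by a client, not how Adabas stores the value.

```python
def as_seen_by(written_high_order_first, client_byteorder, hf):
    """Record-buffer bytes a client sees for a 4-byte binary field."""
    if hf or client_byteorder == "big":
        return written_high_order_first          # HF values are never swapped
    return written_high_order_first[::-1]        # non-HF: client byte order


value = bytes([1, 2, 3, 4])      # written on a high-order-first machine
b1 = as_seen_by(value, "little", hf=False)
b2 = as_seen_by(value, "little", hf=True)
print(list(b1), hex(int.from_bytes(b1, "little")))
print(list(b2))
```

For B1 the numeric value is preserved while the bytes are reversed. For B2 the byte sequence is preserved regardless of the client architecture.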
The LB/L4 (long alphanumeric, 4-byte length) or LA (long alphanumeric, 2-byte length) option can be specified for alphanumeric and Unicode fields. LB and L4 are synonyms. Only one of the LB/L4 or LA options can be specified for a given field. A field defined with the LB/L4 or LA option can contain a value that is up to 16,381 bytes long
if the field is defined as descriptor;
or if it is a parent field for a derived descriptor;
or if no LOB file is associated to the file
or if the field is a Unicode field.
Note:
In these cases, the field value is always stored in the primary
record. If you define such a field, you should consider that the primary record
must fit into a data block, which can have a size of up to 32 KB. You should
only define such fields if it will not result in a record overflow.
Otherwise the value can be up to 65533 bytes long for LA fields, and up to 2147483543 bytes for LB/L4 fields.
If a LOB file is associated with the file, the field is not a descriptor or a parent field of a derived descriptor, and the value length is greater than 253 bytes, the field value is stored in the LOB file, and a LOB reference is included in the base record. Otherwise the field is compressed the same way as a field without the LB/L4 or LA option. The maximum length that a field with the LA option can actually have is limited by the block size of the block in which the compressed record is stored - the compressed record must fit into one block.
When a field with LA option is updated or read with variable length, its value is either specified or returned in the record buffer, preceded by an inclusive two-byte length value (field length, plus two).
When a field with L4 option is updated or read with variable length, its value is either specified or returned in the record buffer, preceded by an inclusive 4-byte length value (field length, plus 4).
A field with the L4 or LA option
can also have the NU, NC/NN, or MU option;
can be a member of a PE group;
cannot have the FI option;
can be a descriptor field, but in this case only values with a maximum length of 1144 (exclusive field length) can be stored if the field does not have the TR option. If the descriptor field has the TR option, values larger than 1144 bytes are possible, but the descriptor value in the index is truncated to 1144 bytes.
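The inclusive length prefix described above (one byte without the options, two bytes with LA, four bytes with L4) can be sketched as follows. The function name is hypothetical, and the byte order argument stands for the integer architecture of the client.

```python
def with_length_prefix(value, option=None, byteorder="big"):
    """Prefix `value` with its inclusive length for variable-length access."""
    size = {None: 1, "LA": 2, "L4": 4}[option]   # width of the length field
    total = len(value) + size                    # the length counts itself
    return total.to_bytes(size, byteorder) + value


print(with_length_prefix(b"HELLO"))
print(with_length_prefix(b"HELLO", "LA", "big"))
print(with_length_prefix(b"HELLO", "L4", "little"))
```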
Option | Definition | User data (variable length, high order first) | User data (variable length, low order first) |
---|---|---|---|
Without L4 or LA | 01,BA,0,A | "\x06HELLO" | "\x06HELLO" |
With L4 | 01,BA,0,A,L4 | "\x00\x00\x00\x09HELLO" "\x00\x00\x07\xD4" (2000 data bytes) | "\x09\x00\x00\x00HELLO" "\xD4\x07\x00\x00" (2000 data bytes) |
With LA | 01,BA,0,A,LA | "\x00\x07HELLO" "\x07\xD2" (2000 data bytes) | "\x07\x00HELLO" "\xD2\x07" (2000 data bytes) |
MU indicates that the field may consist of 0, 1 or more than one value.
The values are stored internally according to the other options specified for the field. For an NU option field, trailing empty values are suppressed. The MU and NC options are mutually exclusive.
The syntax MU(n), as used in the utility ADACMP, is accepted but the occurrence count is ignored.
Definition: 01,AA,5,A,MU,NU
Original content after file loading:
3 L value A L value B L value C
count field AA1 field AA2 field AA3
L means length of the following value, including the L byte.
After update of value B to empty value:
2 L value A L value C
count field AA1 field AA2
AA count = 2.
Definition: 01,AA,5,A,MU
Original content after file loading:
3 L value A L value B L value C
count field AA1 field AA2 field AA3
After update of value B to null value:
3 L value A L null value L value C
count field AA1 field AA2 field AA3
AA count = 3.
The NB option indicates that trailing blanks are not suppressed when a value is stored; values are always stored in the database with the same length as specified in the record buffer. A string which has a value that corresponds to the beginning of another string will always be considered as having a value less than the other string. This has the following consequences for the order of values:
Without the NB option:
"xxx\x00\x00" < "xxx\x00" < "xxx" = "xxx " = "xxx  " < "xxx0"
With the NB option:
"xxx" < "xxx\x00" < "xxx\x00\x00" < "xxx " < "xxx  " < "xxx0"
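Both orderings can be reproduced with a small model. For NB fields it uses plain bytewise comparison, in which a prefix sorts before any longer string. For fields without NB it assumes that values compare as if padded with trailing blanks. That padding model is an assumption chosen because it reproduces the ordering shown, not a statement about the Adabas implementation.

```python
def key_without_nb(value, width=8):
    """Assumed model: compare as if padded with trailing blanks."""
    return value.ljust(width, b" ")


def key_with_nb(value):
    """With NB: plain bytewise comparison, so a prefix sorts first."""
    return value


values = [b"xxx0", b"xxx  ", b"xxx", b"xxx\x00", b"xxx ", b"xxx\x00\x00"]
print(sorted(values, key=key_with_nb))
print(sorted(set(key_without_nb(v) for v in values)))
```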
The NB option is not allowed together with the FI option. The NB option is only allowed for fields with the format A or W.
If a value defined with the NB option is read with a fixed length that is larger than the value length, the value is filled with trailing blanks, like the value for a field without the NB option. However, if you perform an update with the same format and record buffer, the value is modified in the database – the trailing blanks are appended to the value.
The NC option indicates that the field can represent NULL values that are used by SQL. If this option is used, the field that contains an empty value can be in one of two states:
not present (NULL)
empty (blank)
A special format-buffer element (the S element) indicates whether the field is empty or not present. Please refer to the section Format Buffer Syntax in the Command Reference Manual for further information.
The FI, NU and NC options are mutually exclusive. The NC option is not permitted with a multiple-value field, and must not be specified for a member of a periodic group.
Definition: 01,AA,2,B,NC
Item | Value | Blank | NULL |
---|---|---|---|
User S element | 0 | 0 | -1 |
User data | 0005 | 0000 | 0000 |
Internal representation | 0205 | 01 | C1 |
A field that is defined with the NN option must always be assigned a value during an update or add. A value or blank must be provided in each data record, otherwise Adabas returns a response code. This option may only be specified in conjunction with the NC option.
Definition | 01,AA,2,B,NC | 01,AA,2,B,NC,NN |
---|---|---|
User S element | -1 | -1 |
User data | 0000 | 0000 |
Internal representation | C1 | not permitted |
NU indicates that null values for the field will be suppressed.
Null value suppression results in the internal representation of a null value by a one-byte empty field indicator. The null value is not stored.
A series of consecutive fields, each of which contains a null value and for which the NU option is defined, is represented internally by a one-byte empty field indicator which contains the number of successive fields containing a null value. Hence, fields defined with the NU option should be defined in consecutive order whenever possible.
If the NU option is specified for a descriptor, a null value for the descriptor is not stored in the inverted list. Therefore, a FIND command in which a null value for this descriptor is used will always result in no records found, even though there may be records containing a null value in Data Storage.
If a descriptor defined with the NU option is used to control a logical sequence in a READ LOGICAL SEQUENCE command, those records which contain a null value for the descriptor will not be read. If the descriptor has both the NU and the UQ options, null values could be stored multiple times without there being a uniqueness violation.
The FI, NC and NU options are mutually exclusive.
Normal compression (NU or FI not specified) results in the representation of a null value by 1 byte.
Item | Normal compression | With FI option | With NU option |
---|---|---|---|
Definition | 01,AA,2,B | 01,AA,2,B,FI | 01,AA,2,B,NU |
User data | 0000 | 0000 | 0000 |
Internal representation | 01 (1 byte) | 0000 (2 bytes) | C1 (1 byte) |
C1 indicates that 1 empty field follows.
A field that is defined with the NV option will not be converted if an UPDATE or READ command is received from a machine with a different architecture.
The NV option cannot be specified for Unicode fields.
Definition | 01,AA,2,A | 01,AA,2,A,NV |
---|---|---|
EBCDIC data has to be stored in a database on an ASCII machine | convert value of AA from EBCDIC to ASCII | no conversion |
PE indicates that a periodic group is to be defined.
A periodic group may consist of one or more fields and may occur zero times, once or more than once within a given record.
The periodic group is defined at the 01 level. All of the fields to be included in the periodic group must follow immediately and must be defined at level 02 or higher (in increments of 1 to a maximum of 7). The next 01 level definition indicates the end of the periodic group.
PE may only be specified with a group name. Length and format parameters may not be specified with the group name. A periodic group may contain descriptors and/or multiple-value fields and other groups but may not contain another periodic group.
01,GA,PE            ; PERIODIC GROUP GA
02,A1,6,A,NU
02,A2,2,B,NU
02,A3,4,P,NU
01,GB,PE            ; PERIODIC GROUP GB
02,B1,4,A,DE,NU
02,B2,5,A,MU,NU     ; MU fields in PE groups
                    ; are permitted.
02,B3               ; Grouping of fields within
03,B4,20,A,NU       ; PE groups is permitted.
03,B5,7,U,NU
01,XA,PE
02,X1,3,A,NU
02,X2,4,U,NU
02,YA,PE            ; Invalid. Nested periodic group not permitted.
      ^
%ADAFDU-E-PGL1 periodic group may only be defined at level 1
The NU option is recommended for fields within a periodic group. This permits maximum compression and results in less processing time during read/update of the fields.
The values for system generated fields are generated automatically by Adabas; values specified in the record buffer of an update or store command are ignored. A system generated field must not be a field in a periodic group. A system generated field with the CR option must not be a multiple-value field.
System generated fields are defined with the following syntax:
SY = keyword [,CR]
where keyword can take the following values:
- TIME
Creation or last update timestamp. The field must be defined with the DT option. The values are stored as UTC values. If you want to access the values as local time values, the field must be defined with the TZ option.
Example:
1,CR,14,U,DE,DT=E(DATETIME),TZ,SY=TIME,CR ; Creation timestamp
- SESSIONID
The Adabas session ID of the Adabas user session in which the record was created. The field must be defined with the options A,NV. The recommended field length is 28. If a smaller length is provided in the field definition, the value is truncated. The layout is shown below:
Bytes | Meaning |
---|---|
unsigned char s_node[8] | Adabas client node name |
unsigned char s_user[8] | Adabas client user ID |
unsigned int s_pid[4] | Process identification |
unsigned char s_timestamp[8] | Session timestamp: microseconds since 1970, as binary value |
Notes:
- The Adabas session ID is a binary string which identifies an Adabas session; there is no conversion between platforms. At a given time, the Adabas session ID is unique, but later on Adabas session IDs can be reused. On mainframes the layout is different than on open systems.
- You can change the Adabas session ID with the function lnk_set_adabas_id. This means that there is no guarantee that the components of this Adabas session ID really contain information on the user.
- The issues mentioned above have to be taken into account if applications want to access the components of SESSIONID.
- The session timestamp is defined as unsigned char[8] because of alignment reasons, but it contains a binary value.
- The session timestamp on open systems platforms is 0 if an Adabas version < 6.2 SP2 or a Net-Work version 7.3 is used, because earlier versions still used a 20-byte session ID without a timestamp.
- These rules for the layout of the Adabas session ID only apply to open systems platforms; on mainframes there is also a 28-byte Adabas session ID, but the components are different. Please refer to the mainframe documentation for details.
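The 28-byte open systems layout described above can be split into its components as in the sketch below. It assumes little-endian byte order for the process ID and the timestamp, and blank or zero padding of the node and user names. Both are assumptions that depend on the platform that created the session ID, and the sample values are invented.

```python
import struct

# 8-byte node, 8-byte user, 4-byte pid, 8-byte timestamp = 28 bytes.
# "<" (little-endian, no padding) is an assumption, see the notes above.
SESSION_ID = struct.Struct("<8s8sI8s")


def parse_session_id(raw):
    """Split a 28-byte open systems session ID into its components."""
    node, user, pid, timestamp = SESSION_ID.unpack(raw)
    return {
        "node": node.rstrip(b" \x00"),
        "user": user.rstrip(b" \x00"),
        "pid": pid,
        "timestamp_us": int.from_bytes(timestamp, "little"),
    }


# Invented sample value for illustration only.
sample = SESSION_ID.pack(b"NODE1   ", b"USER1   ", 4711,
                         (1_700_000_000_000_000).to_bytes(8, "little"))
print(parse_session_id(sample))
```

Because the session ID can be changed by the application and is reused over time, the parsed components should be treated as informational only.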
Example (open systems):
1,CA,16,A,NV,SY=SESSIONID ; Node ID and client user ID of last update
UN=CA(9,16) ; Subdescriptor for user ID of last update
- SESSIONUSER
The login ID of the Adabas user session in which the record was created or updated. The field must be defined with format A. The recommended field length is 8. The value of a SESSIONUSER field is bytes 9 - 16 of a SESSIONID field.
Note:
If you have also defined a SESSIONID field, you can define a subdescriptor of this field instead of the SESSIONUSER field - see the example for SESSIONID.
Example:
1,CU,8,A,DE,SY=SESSIONUSER,CR ; Login ID of creator
- OPUSER
The user ID specified in Additions 1 of the OP command for the Adabas session in which the record was created. The field must have format A and length 8.
Example:
1,CO,8,A,DE,SY=OPUSER ; User ID specified in OP command for last update
The values for system generated fields with the option CR are automatically generated by Adabas when a record is created. They are not changed by further update operations.
The values for system generated fields without the option CR and without the option MU are automatically generated by Adabas when a record is created. The values are updated during each following update operation.
System generated fields with the option MU can have up to SYFMAX (file parameter, for further information refer to the documentation of the utility ADAFDU) values.
When a record is created, the first value of the MU field is generated.
When a record is updated, a new value is generated and added before the first existing value to the MU field.
Afterwards, if values with an MU index > SYFMAX exist, these values are removed, e.g. assume the field name for the SY fields is SY, then for indices 1 < i <= SYFMAX, the new value of SY(i) is the old value of SY(i-1).
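The update rule for system generated MU fields amounts to a bounded history with the newest value first. A minimal sketch follows, with a hypothetical function name and plain strings standing in for the generated values.

```python
def record_update(values, new_value, syfmax):
    """Return the MU values after an update generates `new_value`."""
    # The new value goes in front; values beyond index SYFMAX are dropped.
    return ([new_value] + values)[:syfmax]


history = []
for generated in ("t1", "t2", "t3", "t4"):
    history = record_update(history, generated, syfmax=3)
    print(history)
```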
If you use ADACMP in order to perform a bulk load of external data with ADAMUP afterwards, these external data may either already contain the values for the system generated fields or not. Therefore, you can specify via the ADACMP parameter SYFINPUT how to handle system generated fields in ADACMP:
If you specify SYFINPUT=SYSTEM, ADACMP will create the values for the system generated fields as if inserted by the ADACMP process in the database;
If you specify SYFINPUT=USER, the system generated fields are handled by ADACMP as fields without the SY option.
For further information refer to the documentation of the utility ADACMP.
The TR option must be specified with the L4/LA option and the DE option.
The maximum length of a descriptor value is 1144 bytes. If the descriptor is not defined with the TR option, all update operations that insert a descriptor value larger than 1144 bytes are rejected. If the TR option is specified, these values can be inserted in the database, but the descriptor value will be truncated in the index. The consequence of this is that search operations no longer return the exact result if there is more than 1 record with the same descriptor value truncated to 1144 bytes in the index. If this happens, a warning will be issued. The detailed behaviour of descriptors defined with the TR option is as follows:
If a descriptor value which is larger than 1144 bytes is inserted in the database, the value is truncated in the index, and you receive a response code 2.
If you perform a search operation for which the result may be not exact as a consequence of truncation, you receive a response code 2.
If you sort by a descriptor that is defined with the TR option, and there is more than one record with the same, possibly truncated descriptor value in the index, you receive a response code 2.
A read logical operation (L3/6/9) receives a response code 2 if there is more than one record with the same, possibly truncated descriptor value in the index.
A check truncation option is available for the S1 command: if you specify this option, the search buffer should contain the name of a descriptor defined with the TR option, and the value buffer should contain the value to be checked. You receive a response code 0 if the value is not truncated, and a response code 2 if the value is truncated. A search operation in the database is not performed.
If you specify the TR option together with the UQ option, a uniqueness error will occur if you store two different descriptor values which are identical following truncation to 1144 bytes.
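The effect of truncation on uniqueness and search results can be illustrated with a toy model of the index. The function name is hypothetical, and the exception merely stands in for the rejected update. Actual response codes are as described above.

```python
MAX_DESCRIPTOR_LENGTH = 1144


def index_value(value, tr=False):
    """Return (value as stored in the index, whether it was truncated)."""
    if len(value) <= MAX_DESCRIPTOR_LENGTH:
        return value, False
    if not tr:
        raise ValueError("descriptor value longer than 1144 bytes is rejected")
    return value[:MAX_DESCRIPTOR_LENGTH], True


# Two different values that agree in their first 1144 bytes.
first = b"a" * 1144 + b"-one"
second = b"a" * 1144 + b"-two"
print(index_value(first, tr=True)[0] == index_value(second, tr=True)[0])
```

Since both values map to the same index entry, a search cannot tell them apart, and with the UQ option the second store would be treated as a duplicate.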
If this option is specified, the output field values are expected to be used in local time (or according to a user-defined time zone), and internally the values are stored in UTC. This option is only allowed together with the option DT and the date/time edit mask names DATETIME, TIMESTAMP, NATTIME, UNIXTIME and XTIMESTAMP.
TZ is not allowed with DATE, TIME and NATDATE, since time zones are only relevant if both date and time information is available.
By definition, UNIXTIME and XTIMESTAMP are based on UTC; the standard conversion routines available for these values include the time zone handling. However, you must define UNIXTIME and XTIMESTAMP fields with the option TZ if you want to convert them to or from local time with one of the other date/time edit masks. If the field is defined without the option TZ, it is assumed that the time zone of the external value is UTC.
You can decide whether a field that is defined with one of the date/time edit masks DATETIME, TIMESTAMP or NATTIME and without the TZ option is used to store UTC time values or local time values. However, the following must be taken into consideration:
If you convert such a field to or from the date/time edit mask UNIXTIME or XTIMESTAMP, Adabas assumes that the internal values contain UTC time values.
If you use such a field to store local time values, it is not possible to uniquely specify the hour that occurs twice, when the daylight saving time is switched back to standard time.
If either of these points would be a problem, you should define the field with the option TZ.
If a field in an existing file contains UTC values, and you want to add the DT and TZ option with one of the date/time edit masks DATETIME, TIMESTAMP or NATTIME, you can do so by adding the new options with ADADBM. If the file contains local time values, you must unload and decompress the original file. Then you can compress and reload the file with the new options.
If you access fields with the TZ option, and don’t specify a date/time edit mask in the format buffer, the fields are processed in the same way as if the date/time edit masks in their field definitions were specified in the format buffer.
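The ambiguity of the repeated hour can be shown with a small Python sketch. The offsets and the date are assumptions chosen for illustration (a zone that switches from UTC+2 to UTC+1 on 31 October 2021 at 01:00 UTC); nothing here calls Adabas.

```python
from datetime import datetime, timedelta, timezone

# Assumed example offsets: daylight saving time and standard time.
DST_OFFSET = timezone(timedelta(hours=2))
STANDARD_OFFSET = timezone(timedelta(hours=1))

# Two different instants, one hour apart, on either side of the switch.
before_switch = datetime(2021, 10, 31, 0, 30, tzinfo=timezone.utc)
after_switch = datetime(2021, 10, 31, 1, 30, tzinfo=timezone.utc)

# The wall-clock values a field without TZ would store as local time.
local_before = before_switch.astimezone(DST_OFFSET).replace(tzinfo=None)
local_after = after_switch.astimezone(STANDARD_OFFSET).replace(tzinfo=None)

if __name__ == "__main__":
    print(local_before, local_after)   # identical local values
    print(before_switch, after_switch)  # distinct UTC values
```

Both instants yield the same local value, so only the UTC representation (which a field with the TZ option stores internally) distinguishes them.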
UQ indicates that the field is to be a unique descriptor. A unique descriptor must contain a different value for each record in the file. However, a multiple-value field may contain the same value several times in one record.
The UQ option must be specified together with the DE option. It is possible to specify the UQ option for more than one field in a file.
A subdescriptor is a descriptor derived from a portion of an elementary field. The elementary field may or may not be a descriptor itself. A subdescriptor may also be defined for a multiple-value field or a field in a periodic group, but may not be defined for a particular value of a multiple-value field or for a particular occurrence of a periodic group.
Subdescriptors must be defined after the last field definition.
A subdescriptor has the same format as the field from which it is derived, except fixed point and floating point, which become binary, and Unicode, which becomes alphanumeric.
A subdescriptor which is derived from a packed value has the sign of the source value appended.
name [,UQ]= field-name (from, to)
- name
The name of the subdescriptor. The naming conventions for a subdescriptor are identical to those defined for Adabas names.
- field-name
The name of the source field from which the subdescriptor element is to be derived.
The source field may be:
an elementary field;
a multiple-value field;
in a periodic group;
a descriptor or non-descriptor.
The source field must NOT be:
a particular multiple-value field value;
a particular periodic group occurrence;
another superdescriptor, subdescriptor, or phonetic descriptor.
A subdescriptor has the NU/NC option when the source field is defined with the NU/NC option. Therefore, when the source field is empty, the subdescriptor is empty and is not entered in the inverted list.
- from
Indicates the relative byte position within the source field where the subdescriptor definition is to begin.
- to
Indicates the relative byte position within the source field where the subdescriptor definition is to end.
`from' and `to' are counted from left to right, beginning with 1, for alphanumeric fields and Unicode fields.
`from' and `to' are counted from right to left, beginning with 1, for unpacked and packed fields. If the source field is defined with P format, the sign of the resulting subdescriptor value is taken from the four low-order bits of the low-order byte (byte 1).
`from' and `to' are counted from low order to high order, beginning with 1, for binary, fixed point and floating point fields.
`to' must be less than or equal to 253.
- UQ
A subdescriptor can be defined as a unique descriptor.
A subdescriptor's standard length is defined by the length of the sub-elements and is used by Adabas while processing search commands. For example, a search buffer containing only a subdescriptor name, without length override, will use this standard length.
Subdescriptor components that are derived from W fields are created from the internal encoding of the W field (UTF-8). A conversion to or from the user encoding defined for the user session is not performed.
Source Field Definition: 01,AR,10,A,NU
Subdescriptor Definition: SB = AR(1,5)
The values for subdescriptor SB are derived from the first 5 bytes (counting from left to right) of all the values for the source field AR.
AR values    SB values
---------    ---------
DAVENPORT    DAVEN
FORD         FORD
WILSON       WILSO
Source Field Definition: 02,PF,6,P
Subdescriptor Definition: PS = PF(4,6)
The values for subdescriptor PS are derived from bytes 4 to 6 (counting from right to left) of all the values for the source field PF.
PF values         PS values
---------         ---------
(shown in hex)    (shown in hex)
00243182655C      02431C
00000000186C      0C*
78426281448D      0784262D
* If the NU option had been specified for PF, no value would have been created for PS for this value.
Source Field Definition: 02,PF,6,P
Subdescriptor Definition: PT = PF(1,3)
The values for PT are derived from bytes 1 to 3 (counting from right to left) of all the values for PF.
PF values         PT values
---------         ---------
(shown in hex)    (shown in hex)
00243182655C      82655C
00000000186C      186C
78426281448D      81448D
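The counting rules used in the three examples can be reproduced with the following Python sketch. It is not Adabas code: the function names are hypothetical, and the removal of leading zero bytes is an assumption inferred from the values shown in the tables.

```python
def alpha_subdescriptor(value: str, start: int, end: int) -> str:
    """Bytes start..end of an alphanumeric value, counted from the left."""
    return value[start - 1:end]


def packed_subdescriptor(pf_hex: str, start: int, end: int) -> str:
    """Bytes start..end of a packed value, counted from the right.

    pf_hex is the packed value as a hex string, sign in the last half-byte.
    """
    raw = bytes.fromhex(pf_hex)
    # Byte 1 is the rightmost byte, so translate to left-based offsets.
    part = raw[len(raw) - end:len(raw) - start + 1].hex().upper()
    if start > 1:
        # The selected bytes do not include the sign: append the sign
        # half-byte of the source value and pad to a whole number of bytes.
        part += pf_hex[-1].upper()
        if len(part) % 2:
            part = "0" + part
    # Assumption: leading zero bytes are dropped, as in the tables.
    while len(part) > 2 and part.startswith("00"):
        part = part[2:]
    return part


if __name__ == "__main__":
    print(alpha_subdescriptor("DAVENPORT", 1, 5))
    print(packed_subdescriptor("00243182655C", 4, 6))
    print(packed_subdescriptor("00243182655C", 1, 3))
```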
A superdescriptor is a descriptor derived from several fields, portions of fields, or a combination thereof. Each source field (or portion of a field) used to define a superdescriptor is termed an element. A superdescriptor may be defined using from 2 to 20 elements.
Superdescriptors must be defined after the last field definition (before and/or after subdescriptor definitions).
All field formats are accepted as part of a superdescriptor.
Notes:
name [,format] [,PF] [,UQ] = field-name (from, to[, encoding]), field-name (from, to[, encoding]) [[,field-name (from, to[, encoding])]...]
- name
The name of the superdescriptor. The naming conventions for superdescriptors are identical to those for Adabas names.
- format
The format may only be specified in one of the following cases:
All parent fields have unpacked format. Then A (alphanumeric), B (binary) or U (unpacked) can be specified. For reasons of compatibility with earlier versions of Adabas, the default is B, but it is strongly recommended to always specify either A or U, as the superdescriptor behaviour on low-order-first platforms may lead to strange results.
At least one parent field has W format. Then A or W may be specified. The default is A.
- PF
Specifying this option ensures compatibility with Adabas databases on mainframe systems if the superdescriptor includes a packed field. On a mainframe database, the sign half-byte of a packed value is 0x0F, whereas under UNIX/Windows it is 0x0C. Using the PF option means that packed positive signs are stored as 0x0F within the superdescriptor.
Note:
Although in the index the sign half-byte is 0x0F, you don't get the 0x0F if you specify the superdescriptor in the format buffer for a read command - the sign half-byte is converted to 0x0C. This also means that the sort sequence of the values in the index may be different from the sort sequence that you get if you perform an alphanumeric comparison of the superdescriptor values you have read. If you want to read a range of descriptor values, it is recommended that you specify the end criterion in the search buffer for the L3 or L9 command, and not to check the read descriptor values in order to find out if you have met the end criterion.
- UQ
A superdescriptor can be defined as a unique descriptor.
- field-name
The name of the source field from which a superdescriptor element is to be derived.
The source field may be:
an elementary field;
a multiple-value field but only one per superdescriptor;
any elementary field in a periodic group;
a descriptor or non-descriptor.
The source field must NOT be:
a particular multiple-value field value;
a particular periodic group occurrence;
another superdescriptor, subdescriptor, hyperdescriptor or phonetic descriptor.
A superdescriptor has the NU/NC option when one or more source fields are defined with the NU/NC option. Therefore, when one or more of the elements is empty, the superdescriptor is empty and is not entered in the inverted list.
- from
Indicates the relative byte position within the source field where the superdescriptor element is to begin.
- to
Indicates the relative byte position within the source field where the superdescriptor element is to end.
`from' and `to' are counted from left to right, beginning with 1, for alphanumeric fields and Unicode fields.
`from' and `to' are counted from right to left, beginning with 1, for unpacked and packed fields.
`from' and `to' are counted from low order to high order, beginning with 1, for binary, fixed point and floating point fields.
`to' must be less than or equal to 253.
`from' must be less than or equal to `to'. The total length of any superdescriptor value may not exceed 1144 bytes in the case of alphanumeric, 126 bytes in the case of binary and 29 bytes in the case of unpacked.
- encoding
encoding is only allowed for W fields. encoding must be a Unicode encoding. If encoding is specified, the field value is converted to the specified encoding before selecting the specified bytes from the field value.
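The limits stated for `from' and `to', together with the 2 to 20 element rule, can be checked mechanically. The following Python sketch is a hypothetical pre-check, not part of ADAFDU; the limit for W format is not stated in the text, so the sketch rejects it rather than guessing.

```python
# Maximum total superdescriptor length per format, as stated in the text.
MAX_TOTAL_BYTES = {"A": 1144, "B": 126, "U": 29}


def check_superdescriptor(elements, fmt):
    """Validate (from, to) pairs and return the total length in bytes.

    elements: list of (from, to) tuples, one per superdescriptor element.
    fmt: 'A', 'B' or 'U'. Raises ValueError if a rule is violated.
    """
    if not 2 <= len(elements) <= 20:
        raise ValueError("a superdescriptor needs 2 to 20 elements")
    if fmt not in MAX_TOTAL_BYTES:
        raise ValueError("no length limit known for format " + repr(fmt))
    total = 0
    for start, end in elements:
        if start < 1 or start > end:
            raise ValueError("'from' must be >= 1 and <= 'to'")
        if end > 253:
            raise ValueError("'to' must be <= 253")
        total += end - start + 1
    if total > MAX_TOTAL_BYTES[fmt]:
        raise ValueError("total length exceeds the limit for this format")
    return total


if __name__ == "__main__":
    # Element ranges of SD = LN(1,4),ID(1,2),AG(2,3)
    print(check_superdescriptor([(1, 4), (1, 2), (2, 3)], "A"))
```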
The description of format provides information concerning the standard format of a superdescriptor where all components are unpacked or at least one component is Unicode.
The format is alphanumeric if at least one parent field is alphanumeric, otherwise it is binary.
The format of a superdescriptor can only be specified if all of the parent fields are unpacked, in which case only unpacked and binary can be specified: the default is binary. If not all parent fields are unpacked, the format is alphanumeric if at least one parent field is alphanumeric or Unicode, otherwise it is binary.
The superdescriptor's standard length is defined by the sum of its elements and is used by Adabas while processing search commands. For example, a search buffer containing only a superdescriptor name, without length override, will use this standard length.
If encoding has been specified for a superdescriptor parent field, the superdescriptor element is derived from the W field value converted to the specified encoding. If encoding has not been specified for a superdescriptor element derived from a W field, the value of the superdescriptor element is created from the internal encoding of the W field (UTF-8). A conversion to and from the user encoding defined for the user session is done superdescriptor element by superdescriptor element – the conversion is only done for superdescriptor elements for which encoding has been specified. It is not permitted to specify different encodings for the same superdescriptor.
If you use a superdescriptor with format W, the superdescriptor value is generally not a valid Unicode field, because the superdescriptor can contain elements that are not Unicode fields, and it may happen that a superdescriptor element derived from a Unicode field may begin or end in the middle of a character.
If you want to use a UTF-16 or UTF-32 encoding, it is strongly recommended to always specify UTF-16BE or UTF-32BE, but not UTF-16LE or UTF-32LE. The expected search order is only achieved with the big endian encodings, because the sort order for Unicode elements is alphanumeric.
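Why only the big endian encodings give the expected order can be seen by sorting a few strings by their encoded bytes. This Python sketch is purely illustrative; the sample characters are arbitrary.

```python
# U+0041 (A), U+007A (z) and U+0100 (A with macron)
words = ["A", "\u0100", "z"]

# Order by Unicode code point.
by_code_point = sorted(words)

# Byte-wise (alphanumeric) order of the encoded values.
by_utf16_be = sorted(words, key=lambda s: s.encode("utf-16-be"))
by_utf16_le = sorted(words, key=lambda s: s.encode("utf-16-le"))

if __name__ == "__main__":
    print([w.encode("utf-16-be").hex() for w in by_utf16_be])
    print([w.encode("utf-16-le").hex() for w in by_utf16_le])
```

With UTF-16BE the byte-wise order matches the code point order; with UTF-16LE the low-order byte is compared first, so U+0100 (bytes 00 01) sorts before U+0041 (bytes 41 00).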
If you use superdescriptors with a W field parent that has a user encoding different from the encoding specified for the Unicode superdescriptor elements, you may get incorrect or undefined results. For example, assume you have defined a superdescriptor element FN(1,2,UTF-16BE) to include the first character of the field in the superdescriptor, and the user encoding is UTF-8. If you try to search for a value where the first character of FN is a 3-byte UTF-8 character, the value in the search buffer contains only a part of the character. As a result, it is not possible to convert the superdescriptor element from UTF-8 to UTF-16.
If you read a superdescriptor with a W field parent that has a length greater than the superdescriptor length, the following rules are used for padding:
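The problem with a 3-byte UTF-8 character can be reproduced outside Adabas. The Python sketch below uses hypothetical helper names and the euro sign as an arbitrary example of a character that needs three bytes in UTF-8.

```python
def first_bytes(text: str, encoding: str, length: int) -> bytes:
    """Encode text and keep only the first `length` bytes."""
    return text.encode(encoding)[:length]


def is_complete(data: bytes, encoding: str) -> bool:
    """True if the bytes form whole characters in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    name = "\u20acURO"  # first character is 3 bytes long in UTF-8
    print(is_complete(first_bytes(name, "utf-16-be", 2), "utf-16-be"))
    print(is_complete(first_bytes(name, "utf-8", 2), "utf-8"))
```

Two bytes hold the whole first character in UTF-16BE, but only a fragment of it in UTF-8, and a fragment cannot be converted to another encoding.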
If the last parent field is a W field, the W field is extended until the end byte according to the specified length.
If the last parent field is not a W field, the superdescriptor is padded with A field blanks.
It makes a difference whether you specify a superdescriptor parent with encoding UTF-8 or without encoding: only if you explicitly specify encoding UTF-8, will a conversion to or from the user encoding be performed when you use the superdescriptor in an Adabas call.
If you explicitly specify the format when you access a superdescriptor, it must be the format of the superdescriptor. However, the processing of the superdescriptor is the same, independent of the format used for the superdescriptor.
If a superdescriptor contains binary parent fields (without the HF option), the value of the superdescriptor depends on the platform on which it is used:
on a high-order first platform, the binary components are defined high-order first.
on a low-order first platform, the binary components are defined low-order first.
However, the collation of the superdescriptor on low-order first platforms is the same as on high-order first platforms. Although in this respect there is a difference to normal values, the values are handled like other values of the same format. If, for example, you specify a superdescriptor with the A format in the search buffer with a length less than the superdescriptor length, the value is padded with blanks in order to get the complete superdescriptor value.
The following definitions are used in the next two examples:
01,LN,40,W,DE,NU ;Last-Name
01,FN,40,W,MU,NU ;First-Name
01,ID,4,B,NU ;Identification
01,AG,3,U ;Age
01,AD,PE ;Address
02,CI,20,A,NU ;City
02,ST,20,A,NU ;Street
01,FA,PE ;Relatives
02,NR,20,A,NU ;R-Last-Name
02,FR,20,A,MU,NU ;R-First-Name
Superdescriptor definition: SD = LN(1,4),ID(1,2),AG(2,3)
Superdescriptor SD is to be created. The values for the superdescriptor are to be derived from bytes 1 to 4 of field LN (counting from left to right), bytes 1 to 2 of field ID (counting from the low-order byte to the high-order byte), and bytes 2 to 3 of field AG (counting from right to left). Because no encoding has been specified for field LN, the internally-used encoding UTF-8 is kept. All values are shown in hexadecimal. In the following, the internal value shows how the value is represented internally to control the collating sequence of the values, the high-order (h-o) first value shows the representation of the value in the record buffer or value buffer on a high-order first platform, and the low-order (l-o) first value shows the representation of the value in the record buffer or value buffer on a low-order first platform.
LN              ID                         AG      SD
464C454D494E47  0x862143 (logical value)   303433  464C454D21433034 (internal)
                00862143 (h-o first)               464C454D21433034 (h-o first)
                43218600 (l-o first)               464C454D43213034 (l-o first)
4D4F52524953    0x2461866 (logical value)  303338  4D4F525218663033 (internal)
                02461866 (h-o first)               4D4F525218663033 (h-o first)
                66184602 (l-o first)               4D4F525266183033 (l-o first)
5041524B4552    00000000                   303336  No value is stored with index
202020202020    0x432144 (logical value)   303030  No value is stored with index
                00432144 (h-o first)
                44214300 (l-o first)
414141414141    0x144 (logical value)      313131  4141414101443131 (internal)
                00000144 (h-o first)               4141414101443131 (h-o first)
                44010000 (l-o first)               4141414144013131 (l-o first)
The format for SD is alphanumeric since at least one element (LN) is defined with W format, and no explicit format has been specified.
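The SD values in the table can be recomputed with the following Python sketch. It is an illustration under stated assumptions, not the Adabas implementation: `sd_value` is a hypothetical name, a blank LN or a zero ID is treated as the null value of an NU field, and `byteorder` stands for the platform ("big" for high-order first, "little" for low-order first).

```python
def sd_value(ln: str, id_value: int, ag: str, byteorder: str):
    """Build SD = LN(1,4),ID(1,2),AG(2,3) as an uppercase hex string.

    Returns None if an NU source field (LN or ID) is empty.
    """
    if not ln.strip() or id_value == 0:
        return None  # no value is stored in the index
    # LN(1,4): bytes 1 to 4 from the left of the UTF-8 value.
    ln_part = ln.encode("utf-8")[:4]
    # ID(1,2): the two low-order bytes, laid out in platform order.
    id_part = (id_value & 0xFFFF).to_bytes(2, byteorder)
    # AG(2,3): bytes 2 to 3 counted from the right of the unpacked value.
    ag_bytes = ag.encode("ascii")
    ag_part = ag_bytes[len(ag_bytes) - 3:len(ag_bytes) - 1]
    return (ln_part + id_part + ag_part).hex().upper()


if __name__ == "__main__":
    print(sd_value("FLEMING", 0x862143, "043", "big"))
    print(sd_value("FLEMING", 0x862143, "043", "little"))
```

On a high-order first platform the result equals the internal value; on a low-order first platform only the binary element differs.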
If you specify a truncated superdescriptor value by specifying the following in the search buffer:
SD,5
then a value in the search buffer
464C454D21
is padded with blanks to get the complete superdescriptor value:
464C454D21202020
If this value has been specified on a high-order first platform, it is also the internal value that is used to resolve the query. If the value has been specified on a low-order first platform, the corresponding internal value is:
464C454D20212020
Superdescriptor definition: SY,W = LN(1,8,UTF-16BE),FN(1,2,UTF-16BE)
Superdescriptor SY is to be created from fields LN and FN (which is a multiple-value field). All values are shown in character format. The format is W.
Field  Value    UTF-16BE
LN     FLEMING  0046 004C 0045 004D 0049 004E 0047
FN     DAVID    0044 0041 0056 0049 0044
SY     FLEMD    0046 004C 0045 004D 0044
LN     WILSON   0057 0049 004C 0053 004F 004E
FN     JOHN     004A 004F 0048 004E
SY     WILSJ    0057 0049 004C 0053 004A
FN     SONNY    0053 004F 004E 004E 0059
SY     WILSS    0057 0049 004C 0053 0053
As long as all values consist only of 1- or 2-byte UTF-8 characters, you can also work with the user encoding UTF-8. Then the superdescriptor value created for FLEMING, DAVID is converted to 46 4C 45 4D 20 20 20 20 44. Also, if you create a superdescriptor value in an application program from the field values, it works as expected: The value created is 46 4C 45 4D 49 4E 47 20 44 41. The first element of the superdescriptor value is converted to 0046 004C 0045 004D 0049 004E 0047 0020 and then truncated to 0046 004C 0045 004D. The second element of the superdescriptor is converted to 0044 0041 and then truncated to 0044. This means that the converted superdescriptor value is 0046 004C 0045 004D 0044 - as expected.
However, if there are values containing 3-byte UTF-8 characters, working with user encoding UTF-8 will cause problems!
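The SY values for this example can be derived with standard string encoding. The Python sketch below is illustrative only; `sy_values` is a hypothetical helper that returns one value per occurrence of the multiple-value field FN.

```python
def sy_values(last_name: str, first_names: list) -> list:
    """Build SY = LN(1,8,UTF-16BE),FN(1,2,UTF-16BE) for each FN value."""
    # LN(1,8): first 8 bytes of the UTF-16BE value, i.e. 4 characters here.
    ln_part = last_name.encode("utf-16-be")[:8]
    # FN(1,2): first 2 bytes of each UTF-16BE value, i.e. 1 character here.
    return [ln_part + fn.encode("utf-16-be")[:2] for fn in first_names if fn]


if __name__ == "__main__":
    for value in sy_values("WILSON", ["JOHN", "SONNY"]):
        print(value.hex().upper(), value.decode("utf-16-be"))
```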
Field Definitions:
01,PN,6,U,NU
01,NA,20,A,DE,NU
01,DP,1,B,FI
Superdescriptor Definition: SZ = PN(3,6),DP(1,1)
Source Field Values      SZ Values
-------------------      ---------
(shown in hex)           (shown in hex)
PN              DP       SZ
303234363732    04       3032343604
383430333938    00       3834303300
303030303131    06       3030303006
303030303031    00       3030303000
The format of SZ is binary because no element is derived from a source field defined with A format. A null value is stored for the last value shown because the superdescriptor format is binary and the first value contains unpacked zeros (hexadecimal value '30') and not binary zeros (hexadecimal value '00').
Field Definitions:
01,PF,4,P,NU
01,PN,2,P,NU
Superdescriptor Definition: SP = PF(3,4),PN(1,2)
Source Field Values      SP values
-------------------      ---------
(shown in hex)           (shown in hex)
PF          PN           SP
0002463C    003C         0002003C
0000045C    043C         0000043C
0032464C    000C         No value is stored with index
0038000C    044C         0038044C
The format of SP is binary since no element is derived from a source field defined with A format.
Field Definitions:
01,AD,PE
02,CI,4,A,NU
02,ST,5,A,NU
Superdescriptor Definition: XY = CI(1,4),ST(1,5)
Source Field Values                    XY values
-------------------                    ---------
Occurrence  CI      ST                 XY
1st         BALT    MAIN               BALTMAIN
2nd         CHI     SPRUCE             CHI SPRUCE
3rd         WASH    11TH               WASH11TH
4th         DENV    <null value>       No value stored with index
The format of XY is alphanumeric since at least one element is derived from a source field which is defined with A format.
A phonetic descriptor can be defined in order to perform phonetic searches. The use of a phonetic descriptor in a FIND command returns all of the records with similar phonetic values. The phonetic value for a phonetic descriptor is based on the first 20 bytes of the source field value. Only upper/lower case alphabetic values are allowed; numeric values, special characters and blanks are ignored.
Phonetic descriptors may be defined after the last field definition. Phonetic descriptors may appear before and/or after any subdescriptor or superdescriptor definitions.
pn = PHON(fn)
- pn
The name of the phonetic descriptor. The naming conventions as described previously for Adabas names must be observed.
- PHON(fn)
The literal PHON followed by the name of the source field to be phoneticized.
The source field may be an elementary or a multiple-value field and must be defined with alphanumeric format. The source field may or may not be a descriptor. A subdescriptor or superdescriptor may not be specified.
The source field may be contained within a periodic group.
Source Field Definition: 01,AA,20,A,DE,NU
Phonetic Definition: PA = PHON(AA)
A hyperdescriptor is a descriptor whose value is based on a user-supplied algorithm.
The values are based on algorithms coded in special user exits (hyperexit 1 to 255). Each exit may handle multiple hyperdescriptors. Each hyperdescriptor must be assigned to a hyperexit.
The hyperexit is called whenever a hyperdescriptor value is to be generated by the Adabas nucleus, or by the ADAINV, ADACMP or ADAULD utility.
One or more values may be returned depending on the options (PE, MU) assigned to the hyperdescriptor. The original ISN assigned to the input value(s) may be changed.
The format, the length, and the options of a hyperdescriptor are user-defined. They are not taken from the parent fields defined by the hyperdescriptor specification.
A search using a hyperdescriptor value is performed in the same manner as that for standard descriptors.
The user is responsible for creating correct hyperdescriptor values. There is no standard way to check the values of a hyperdescriptor for completeness against the Data Storage. The user must set the rules for each value definition, and check the value for correctness.
If a hyperdescriptor is defined as packed or unpacked format, Adabas will check the returned values for validity.
Please refer to the chapter User Exits and Hyperexits for more information about hyperdescriptors.
hy-name,length,format[,option]... = HYPER(exit_number,parent_field [,parent_field]...)
- hy-name
The name to be used for the hyperdescriptor. The naming conventions as described previously for Adabas names must be observed.
- length
The default length of the hyperdescriptor.
- format
The format of the hyperdescriptor. The following formats are supported:
Format                 Maximum Length
Alphanumeric (A)       253 bytes
Binary (B)             126 bytes
Fixed Point (F)        4 bytes (always 4 bytes)
Floating Point (G)     8 bytes (always 4 or 8 bytes)
Packed Decimal (P)     15 bytes
Unpacked Decimal (U)   29 bytes
- option
The options to be assigned to the hyperdescriptor. The following options may be used together with a hyperdescriptor:
Option  Meaning
HE      Search value generation: allowed only if the number of parent fields = 1. You must specify not the internal search value, but rather the corresponding parent field value. Adabas then calls the hyperexit to convert the value to the internal search value.
MU      Multiple-value descriptor
NU      Null value suppression
PE      Periodic group index usage
UQ      Unique descriptor
- exit_number
The hyperexit number to be assigned to the hyperdescriptor. This number will be used by the Adabas nucleus and utilities to determine the hyperdescriptor user exit to be called.
- parent_field
The names of between one and 20 elementary fields. The field names and addresses are passed to the user exit.
The following definitions are used for this example:
01,LN,20,A,DE,NU ;Last-Name
01,FN,20,A,MU,NU ;First-Name
01,ID,4,B,NU ;Identification
01,AG,3,U ;Age
01,AD,PE ;Address
02,CI,20,A,NU ;City
02,ST,20,A,NU ;Street
01,FA,PE ;Relatives
02,NR,20,A,NU ;R-Last-Name
02,FR,20,A,MU,NU ;R-First-Name
Hyperdescriptor definition: HN,60,A,MU,NU=HYPER(2,LN,FN,FR)
Hyperexit 2 is assigned to this hyperdescriptor, and the name is HN.
The hyperdescriptor length is 60, the format is alphanumeric. The hyperdescriptor is a multiple-value (MU) descriptor with null suppression (NU).
The values for the hyperdescriptor are to be derived from the fields LN, FN and FR.
Hyperdescriptor definition: SN,20,A,HE,NU=HYPER(3,LN)
Hyperexit 3 is assigned to this hyperdescriptor, and the name is SN.
The hyperdescriptor length is 20, the format is alphanumeric with null suppression (NU). The hyperexit is called to perform a search value generation (HE) for search and read commands that use a search and value buffer.
The value for the hyperdescriptor is to be derived from the field LN.
A collation descriptor is a descriptor that is based on an ICU collating key for a Unicode field, where the ICU collating key is a binary string produced from the original character string by applying a Unicode Collation Algorithm and language-specific rules. When you perform a binary comparison between the collating keys produced this way for character strings, you perform a comparison between the strings that is appropriate to your locale.
Notes:
col-name [,max_length] [,LA|L4] [,HE] [,UQ] = COLLATING(parent_field[,collation_attribute]...)
- col-name
The name to be used for the collation descriptor. The naming conventions as described previously for Adabas names must be observed.
- max_length
The maximum number of bytes that are stored as a descriptor value. If the collation key derived from the parent field is larger, the collation key is truncated. The default and maximum value is the maximum descriptor value length (1144).
- LA, L4
If you specify one of these options, the length indicator is 2 bytes (LA option) or 4 bytes (L4 option) long if you access the descriptor with variable length.
- HE
If you specify this option, you must specify the corresponding parent field value in the value buffer for search operations, rather than the internal collation key. It is not possible to read the descriptor values (L9 command).
If this option is not specified, you can specify either the internal collation key or the corresponding parent field value in the value buffer, depending on the search buffer. In this case, it is possible to read the descriptor values (L9 command).
Notes:
- In most cases, you won't want to handle internal collation keys, an exception being if you also use ICU in your application programs. Therefore you should usually specify the HE option.
- If you don't use the HE option, you should remember that collation keys are much larger than their parent fields (4 times the length of the parent value is a typical length for a collation key). This means that one byte is often not sufficient for the length of the collation key, although the parent value is defined without the LA/L4 option, and therefore it is recommended to specify either the LA or the L4 option for the collation descriptor. However, larger collation keys are also stored in the index without the LA or L4 option, but they cannot be read with variable length in an L9 command - trying to do so will result in an Adabas response code 55.
- UQ
A collation descriptor can be defined as a unique descriptor.
- parent_field
The name of the source field from which the collation descriptor is to be derived. It must have the format W.
- collation_attribute
All collation attributes are optional, and they can be specified in any order. The following collation attributes can be specified:
- Locale string
One of the locales supported by ICU. This usually is a 2-character ISO 639 language code. It can be followed by "@" and "collation=" <collation specifier>. This string must be enclosed in single quotes. Example: 'de@collation=phonebook'
The default is '' (empty string). Collating keys are then compatible with the Unicode Default Collation Table (this is language-independent, but provides good results for many languages).
- Collation strength
You can specify one of the following keywords: PRIMARY, SECONDARY, TERTIARY, QUARTERNARY, IDENTICAL. The value specified represents the comparison levels. See references 1 and 2 below for further information.
If you specify PRIMARY, case and diacritic differences are ignored. SECONDARY means that case differences are ignored, and punctuation is ignored if you specify TERTIARY. QUARTERNARY allows you to distinguish between words with and without punctuation, e.g. with TERTIARY "ab" = "a-b" and with QUARTERNARY "ab" < "a-b". If you specify IDENTICAL, only words with the same canonical decomposition are considered as equal.
The default is TERTIARY.
- case-first option
You can specify one of the following keywords: UPPERFIRST or LOWERFIRST.
If you specify UPPERFIRST, uppercase letters will be sorted before lowercase letters, e.g. 'AB' < 'ab'.
If you specify LOWERFIRST, lowercase letters will be sorted before uppercase letters, e.g. 'ab' < 'AB'.
If not specified, the case-first processing is undefined.
- alternate_option
You can specify one of the following keywords: SHIFTED or NON_IGNORABLE.
These keywords affect the sorting sequence for punctuation characters such as space or hyphen: for example, the words "bi-weekly" and "biweekly" will be sorted close together if you specify SHIFTED, and they will not be sorted close together if you specify NON_IGNORABLE. See references 1 and 2 below for further information.
The default is NON_IGNORABLE.
- case_level_option
You can specify one of the following keywords: CASELEVEL or NO_CASELEVEL.
If you specify CASELEVEL, an additional case level is formed between secondary and tertiary. Currently, the case level is used for Japanese, but it could also be used in other situations, such as Pinyin. See reference 2 below for further information.
The default is NO_CASELEVEL.
- french_option
You can specify one of the following keywords: FRENCH or NO_FRENCH.
The setting of this option determines whether or not diacritics will be sorted as in French.
The default is NO_FRENCH.
- normalization_option
You can specify one of the following keywords: NORMALIZATION or NO_NORMALIZATION.
The setting of this option determines whether or not Unicode canonical equivalence is to be taken into account. Even if NO_NORMALIZATION is set, ICU will still produce correct results for non-normalized text for most world languages. However, languages that can use two or more diacritic marks in one character (e.g. Hebrew, Thai or Vietnamese) require this option to be set if the input is not normalized according to Unicode normalization form D. See reference 2 below for further information.
The default is NO_NORMALIZATION.
Collation descriptor definition: C1,HE,UQ=COLLATING(W1,'en',PRIMARY)
A unique collation descriptor is defined with HE option, language is English, and the collation strength is PRIMARY.
Collation descriptor definition: C2,HE=COLLATING(W2,'de@collation=phonebook')
A collation descriptor is defined with HE option, language is German, and the phonebook order is to be used. The collation strength is default (TERTIARY).
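The idea of comparing binary collating keys instead of the original strings can be illustrated without ICU. The Python sketch below builds a toy key that ignores case and diacritics, roughly in the spirit of the PRIMARY strength. It is an assumption-laden stand-in: real ICU collating keys are produced by the Unicode Collation Algorithm and look different, and `toy_primary_key` is a hypothetical name.

```python
import unicodedata


def toy_primary_key(text: str, max_length: int = 1144) -> bytes:
    """Return a toy binary key that ignores case and diacritics.

    max_length mimics the max_length parameter: longer keys are truncated.
    """
    # Split accented characters into base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, then fold case.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold().encode("utf-8")[:max_length]


if __name__ == "__main__":
    names = ["zebra", "\u00c9mile", "apple"]
    print(sorted(names))                       # plain code point order
    print(sorted(names, key=toy_primary_key))  # order by binary key
```

Sorting by the binary key places the accented name between the other two, whereas plain code point order puts it last; this is the effect a collation descriptor achieves in the index.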
1. Mark Davis, Ken Whistler: "Unicode Technical Standard #10, Unicode Collation Algorithm" (http://www.unicode.org/reports/tr10/)
2. International Components for Unicode homepage (https://www-01.ibm.com/software/globalization/icu/)
A referential constraint ensures referential integrity between two keys. Keys can be descriptors, superdescriptors or ISNs. Referential integrity means that for every value in a descriptor called "foreign key", there must be a value in a descriptor called "primary key" in a primary file. The primary key must be defined as unique and the options NC and NN must be set. For the foreign key, the option NC must be set. A pair of primary and foreign keys must have the same format. ISNs can be used as primary keys, and the corresponding foreign key must be binary. If primary and foreign keys are superdescriptors, then:
The corresponding key must be a superdescriptor;
The superdescriptors must have the same number of parent fields;
The corresponding parent fields must have the same format;
No parent field may occur twice in a superdescriptor;
No parent field may occur in more than one foreign key;
All parent fields must have the option NC;
All parent fields of the primary key must have the option NN;
The primary key superdescriptor must be unique.
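The superdescriptor rules above can be sketched as a small check. This is only an illustration in Python, not part of Adabas or ADAFDU: the classes `ParentField` and `SuperDescriptor` and the function `check_superdescriptor_pair` are hypothetical names introduced here. The sketch assumes that parent fields correspond by position, and it does not cover the rule that a parent field may not occur in more than one foreign key, because that rule needs knowledge of all constraints in the file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParentField:
    # Hypothetical model of a parent field: name, format and option keywords.
    name: str
    fmt: str
    options: frozenset


@dataclass(frozen=True)
class SuperDescriptor:
    # Hypothetical model of a superdescriptor and its ordered parent fields.
    name: str
    parents: tuple
    unique: bool = False


def check_superdescriptor_pair(primary, foreign):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if len(primary.parents) != len(foreign.parents):
        problems.append("different number of parent fields")
    # Assumption: parent fields correspond by position.
    for p, f in zip(primary.parents, foreign.parents):
        if p.fmt != f.fmt:
            problems.append(f"format mismatch: {p.name}/{f.name}")
    for key in (primary, foreign):
        names = [pf.name for pf in key.parents]
        if len(names) != len(set(names)):
            problems.append(f"parent field repeated in {key.name}")
        for pf in key.parents:
            if "NC" not in pf.options:
                problems.append(f"{pf.name} lacks option NC")
    for pf in primary.parents:
        if "NN" not in pf.options:
            problems.append(f"{pf.name} lacks option NN")
    if not primary.unique:
        problems.append("primary key superdescriptor is not unique")
    return problems
```

An empty result only means that none of the rules modelled here is violated. The utility itself remains the authority on whether a definition is accepted.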
You can specify a referential action that is performed on the records containing the foreign key when a record in the primary file is deleted or its primary key value is updated.
The referential constraint is added to the file to which the foreign key belongs.
Constraint-name = REFINT(foreign-key, primary-file, primary-key[/referential-action[,referential-action]])
- foreign-key
The name of the foreign key field.
- primary-file
The number of the file to which the primary key belongs.
- primary-key
The name of the primary key.
- referential-action
One of the following keywords:
Keyword Description
DC On delete cascade: If a record in the primary file is deleted, the records containing the primary key as a foreign key are also deleted. If these records also contain a primary key of a referential constraint, then the corresponding referential action is also performed for these keys.
DX On delete no action (default): If a record in the primary file is deleted, no further records that contain the primary key as a foreign key may still exist. Otherwise the delete operation fails with an Adabas response code 196.
DN On delete set NULL: If a record in the primary file is deleted, the foreign key field is set to NULL in all records that contain the primary key as a foreign key. This option is not allowed if the foreign key field is defined with the option NN.
UC On update cascade: If a primary key is updated in a record in the primary file, the foreign key is also set to the new value of the primary key in the records that contain the primary key as a foreign key. If the foreign key is also the primary key of a referential constraint, then the corresponding referential action is also performed for these keys.
UX On update no action (default): If the primary key in a record in the primary file is updated, no further records that contain the old primary key value as a foreign key may still exist. Otherwise the update operation fails with an Adabas response code 196.
UN On update set NULL: If the primary key in a record in the primary file is updated, the foreign key field is set to NULL in all records that contain the old primary key value as a foreign key. This option is not allowed if the foreign key field is defined with the option NN.
You can specify at most one delete action and at most one update action for a referential constraint.
For constraints that refer to ISNs as the primary key, only the actions delete cascade and update no action (DC,UX) are possible.
Primary key definition in file 9:
1, AA,8,A,DE,UQ,NC,NN
Foreign key definition in file 12:
1, AC,8,A,DE,NC
HT=REFINT(AC,9,AA)
HT=REFINT(AC,9,AA/UC)
HT=REFINT(AC,9,AA/UC,DN)
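To make the effect of the optional action list concrete, the following Python sketch parses a REFINT definition such as those above and resolves the delete and update actions, applying the documented defaults DX and UX. It is an illustration only and not an Adabas interface: the function `parse_refint`, its `foreign_key_is_nn` parameter and the regular expression are assumptions made for this example, and the pattern accepts only simple alphanumeric names.

```python
import re

DELETE_ACTIONS = {"DC", "DX", "DN"}
UPDATE_ACTIONS = {"UC", "UX", "UN"}

# Hypothetical pattern for: name = REFINT(foreign-key, primary-file, primary-key[/actions])
_REFINT = re.compile(
    r"^\s*(?P<name>\w+)\s*=\s*REFINT\(\s*(?P<fk>\w+)\s*,\s*(?P<file>\d+)\s*,"
    r"\s*(?P<pk>\w+)\s*(?:/\s*(?P<actions>[\w\s,]+?))?\s*\)\s*$"
)


def parse_refint(text, foreign_key_is_nn=False):
    """Parse one REFINT definition and return its parts as a dict."""
    m = _REFINT.match(text)
    if m is None:
        raise ValueError(f"not a REFINT definition: {text!r}")
    chosen = {}
    raw = m.group("actions") or ""
    for action in [a.strip().upper() for a in raw.split(",") if a.strip()]:
        if action in DELETE_ACTIONS:
            kind = "delete"
        elif action in UPDATE_ACTIONS:
            kind = "update"
        else:
            raise ValueError(f"unknown referential action: {action}")
        # At most one delete action and at most one update action.
        if kind in chosen:
            raise ValueError(f"more than one {kind} action")
        # Set NULL is not allowed if the foreign key has option NN.
        if action in ("DN", "UN") and foreign_key_is_nn:
            raise ValueError(f"{action} is not allowed with option NN")
        chosen[kind] = action
    return {
        "name": m.group("name"),
        "foreign_key": m.group("fk"),
        "primary_file": int(m.group("file")),
        "primary_key": m.group("pk"),
        "on_delete": chosen.get("delete", "DX"),  # documented default
        "on_update": chosen.get("update", "UX"),  # documented default
    }


result = parse_refint("HT=REFINT(AC,9,AA/UC,DN)")
print(result["on_delete"], result["on_update"])  # prints "DN UC"
```

With this sketch, the first example (no action list) resolves to DX and UX, and the second (only UC) resolves to DX and UC. The restriction for ISN primary keys is not modelled.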