Version 6.3.1
 —  Adabas Administration  —

Database Design

This document covers the following:


Technical Introduction to Adabas

If a program is to access data in an Adabas database, it must issue Adabas commands (for further information, refer to the Command Reference section). Because the Adabas direct-call command interface is a low-level interface, Software AG also offers several higher-level interfaces to Adabas:

A program that issues Adabas commands is called an Adabas client. The Adabas commands are executed by a database server called the Adabas nucleus.

In order to access the database, the Adabas client must be linked with an Adabas interface, for example ADALNKX, which is part of the Adabas client package. For further information, refer to Command Reference, Linking Application Programs.

The Adabas client may either run locally on the same (physical or virtual) machine as the Adabas nucleus, or remotely on another machine. The additional product Entire Net-Work is required for remote access.

In addition to the Adabas nucleus, there are also a number of Adabas utilities for database administration purposes, which access the Adabas database.

The following figure shows how programs access Adabas:

graphics/access.png

Top of page

Performance Control During System Design

The performance of a system is measured by the time and computer resources required to run it. These may be important for the following reasons:

Performance may not, however, be the most important objective. Trade-offs will often have to be made between performance and the following:

In some cases, performance may be a constraint to be met rather than an objective to be optimized. If the system meets its time and volume requirements, attention may be turned from performance to other areas.

Methodology for Performance Control in System Design

The need to achieve satisfactory performance may affect one or more of the following:

Performance requirements must be considered early in the system design process. The following procedure may be used as a basis for controlling performance:

  1. Obtain from the users the time constraints for each major system function. These requirements are likely to be absolute, i.e., the system is probably useless if it cannot meet them;

  2. Evaluate if the available hardware resources are sufficient, either by experience with other databases, or by simulating the expected load. If you need new hardware consider the following:

  3. Describe each function in terms of the logical design model specifying:

  4. Decide which programs are most performance-critical. The choice may involve volumes, frequency, deadlines and the effect on the performance or scheduling of other systems. Other programs may also have minimum performance requirements which may constrain the extent to which critical functions can be optimized;

  5. Optimize the performance of each critical function by shortening its access paths, optimizing its logic, eliminating database features which increase overheads, etc. In the first pass, an attempt should be made to optimize performance without sacrificing flexibility, accessibility of information, or other functional requirements of the system;

  6. Estimate the performance of each critical function. If this does not yield a satisfactory solution, a relaxation of the time constraints or the functional requirements will have to be negotiated or a hardware upgrade may be required;

  7. Estimate the performance of other system functions. Calculate the total cost and compare the cost and peak period resource requirements with the economic constraints. If the estimates do not meet the constraints, then a solution must be negotiated with the user, operations or senior management;

  8. If possible, validate the estimates by loading a test database in order to time various functions. The test database should be similar to the planned one in terms of the number of records contained in each file and the number of values for descriptors. The size of each record is less important except for tests of sequential processing and then only if records are to be processed in something close to their physical sequence.

Top of page

Unicode Support

Adabas supports Unicode on the basis of International Components for Unicode (ICU) libraries (V3.2). Please refer to the ICU homepage at http://www.ibm.com/software/globalization/icu for further information about ICU. The wide character field format (W) has been introduced for Unicode fields. The Adabas user can specify the external encoding used in Adabas calls or for the compression and decompression utilities ADACMP and ADADCU, but internally all data is stored in UTF-8.

The external encoding can be specified in:

Mainframe Compatibility Considerations

The following points should be taken into consideration if you intend to write applications using Unicode character sets and when you intend to run the applications on both mainframes and UNIX/Windows/OpenVMS platforms:

Top of page

File and Record Design

It is possible to design an Adabas database with one file for each record type as identified during the conceptual design stage. Although such a structure would support any application functions required of it and is the easiest to manipulate in ad hoc queries, it may not be the best from the performance point of view:

It is, therefore, often advisable to reduce the number of Adabas files used by critical programs. The following techniques may be used for this procedure:

Each of these techniques is described in the following sections.

Multiple-Value Fields and Periodic Groups

In the example shown below, ORDER-ITEM is defined as a periodic group in the ORDER file. Each order record contains a variable number of order items.

Order   Order   Date     Customer Item
Number  Date    Required Number   Code     Quantity
    
A1234E  29MAR   10JUN    UK432M   24801K   200     1ST OCCUR.
                                  30419T   100     2ND OCCUR.
                                  273952   300     3RD OCCUR.

A multiple-value field or a periodic group may be retrieved/updated in the same call and with the same I/Os as the main record. This can result in a saving in both CPU time and I/O requirements.

There are certain constraints concerning the usage of multiple-value fields and periodic groups that the user should be aware of:

Descriptors contained within a periodic group or derived from fields within a periodic group cannot be used as a sort key in FIND and SORT commands. In addition, specific rules apply to the methods in which search requests that involve one or more descriptors defined as multiple-value fields and/or contained within a periodic group may be used. These rules are described in the Command Reference Manual.

Multiple Record Types in a Single Adabas File

Another method of reducing the number of files is to store data belonging to two logical record types in the same Adabas file. For example, the figure Multiple Record Types (i) below shows how a customer file and an order file might be combined. This technique takes advantage of the Adabas null-value suppression facility.

Fields in the Field Definition Table for the combined file:

Key, Record Type, Order Data, Order Item Data

Stored records:

Key  Type  Order Data   *
Key  Type  * Order Item Data

* indicates suppressed null values.

The key of an order item record could be order number plus line sequence number within this order.

This technique reduces I/Os by allowing the customer and order record types to share various control blocks and higher-level index blocks. Thus fewer blocks have to be read before processing of the file can start, and more space is left free in the buffer pool for other types of blocks.

The customer and order records can be grouped together in Data Storage reducing the number of blocks which have to be read to retrieve all the orders for a given customer. If all the orders are added at the same time as the customer, the total I/Os required will also be reduced.

The key must be designed carefully to ensure that both customer and order data can be accessed efficiently. The key for a customer record will usually have the null value of the suffix which distinguishes different orders belonging to the same customer appended to it as shown in the figure Multiple Record Types (ii) below.

A00231    000                Order header for order A00231
A00231    001                Order item 1
A00231    002                Order item 2
A00231    003                Order item 3
A00232    000                Order header for order A00232
A00232    001                Order item 1

A record type field is not necessary if the program can tell whether it is dealing with a customer or order record by the contents of the key suffix. It may be necessary for a program to reread a record in order to read additional fields or else have Adabas return all the fields relevant in any of the record types.

Data Duplication

Physical Duplication

In some cases, a few fields from a header record are required almost every time a detail record is accessed. For example, the production of an invoice may require both the order item data and the product description which is part of the PRODUCT record. The simplest way to make this information quickly available to the invoicing program is to hold a copy of the product description in the order item data. This is termed physical duplication because it involves holding a duplicate copy of data which is already physically represented elsewhere.

Physical duplication can also be in effect if some fields from each detail record are stored as a periodic group in a header record.

Logical Duplication

Assume that a credit control routine needs the sum of all invoices sent to the customer. This information can be derived by reading and totalling the relevant invoices, but this might involve accessing randomly quite a large number of records. It can be obtained much more quickly if it is stored permanently in the customer record, provided it is correctly maintained. This is termed logical duplication because the duplicate information is not already stored elsewhere but is implied by the contents of other records.

Programs which update information that is physically or logically duplicated are likely to run more slowly because they must also update the duplicate copies. Physical duplication seldom causes much of a problem because it usually involves fields which are infrequently updated. Logical duplication almost always requires duplicate updating because the change of any one record can affect data in other records.

Adabas Record Design

Once an Adabas file structure has been determined, the next step is usually to define the field definition table for the file. The field definitions are entered as described in the chapter FDT Record Structure. This section describes the performance implications of some of the options which may be used for fields.

The fields should be arranged such that those which are read or updated most often are nearest to the start of the record. This will reduce the CPU time required to locate the fields within the record. Fields which are seldom read but are mainly used as search criteria should be placed last.

The use of groups results in more efficient internal processing of read and update commands. This is the result of shorter format buffers which take less time to process and require less space in the internal format-buffer pool maintained by Adabas.

Numeric fields should be loaded in the format in which they will most frequently be used. This will minimize the amount of format conversion required. However, the relation between CPU time saved and extra disk space required has to be considered.

The use of the fixed storage (FI) option normally reduces the processing time of the field, but may result in a larger disk storage requirement, particularly if the field is contained in a periodic group. NU fields should be grouped together wherever possible. FI fields should be grouped together wherever possible.

It is important that each Adabas base record must fit into one data block, which can be up to 32KB in size (see the section Container Files for further information about block sizes). For further information about the space required for the Adabas fields, see the section Disk Space Usage. In addition to the space required for the Adabas fields, 4 bytes are needed for each data block and 6 bytes for each Adabas record.

Large Object Values

Normally, each Adabas record must fit into one data block, but there is the exception that values larger than 253 bytes (so-called large object values - LOB values) can be stored by Adabas in a separate internal Adabas file, the associated LOB file. The original Adabas file is called the base file, the records in the base file are called base records, and the records in the LOB file are called LOB records. The following rules apply to LOB values:

The following table provides an overview of the terms used in conjunction with LOB values in the Adabas documentation:

Term Definition
Base file An Adabas file with a user-defined FDT that contains one or more LOB fields
Base record A record in a base file. The LOB values in the record are represented by LOB value references pointing to LOB segment records in the LOB file. The actual LOB values are contained in these segment records
LOB Large Object. Initially, a LOB value in Adabas can have a size of up to ca. 2GB (theoretically)
LOB field A new type of field in an Adabas file that stores LOB values
LOB file An Adabas file with a predefined FDT containing LOB values that are spread over one or more LOB segment records
LOB file group The pair consisting of base file and LOB file, viewed as a single unit
LOB segment One portion of a LOB value that was partitioned. A LOB value consists of one or more LOB segments
LOB segment record A record in a LOB file that contains a LOB segment as payload data and other information as control data
LOB value An instance of a LOB
LOB value reference A reference or pointer from a base record to the LOB segment records that contain the partitioned LOB value

It is possible to define the following types of LOB fields:

Notes:

  1. Unicode LOB fields (fields with format W (UNICODE)) are currently not supported. You can store large Unicode values in BLOB fields, but then a conversion to other encodings as provided by format W is not supported.
  2. For detailed information on the Adabas formats and field options, please refer to the section FDT Record Structure.

Top of page

Adabas System Files

The Adabas nucleus uses some internal Adabas files, the so-called Adabas system files, to store internal data. While the checkpoint file, the security file and the user data file are always required for an Adabas database and are defined when the database is created, the replication system files are only created when you initialize the Adabas-to-Adabas replication - please refer to the section Adabas-to-Adabas (A2A) Replication for further details.

The following Adabas system files exist:

Checkpoint File

The Adabas checkpoint file is used to log some important events, the Adabas checkpoints; these checkpoints are written for:

The checkpoints can be displayed with the utility ADAREP parameter CHECKPOINTS. The documentation of this parameter contains a description of the different checkpoint types.

The checkpoints are especially important for the utility ADAREC (database recovery), which re-applies all database updates performed after a database backup: because ADAREC cannot recover some utility operations, it stops when it detects a SYNP checkpoint that indicates the execution of a utility which cannot be recovered by ADAREC, and which must be re-executed manually.

Note:
The information stored in the checkpoint file does not contain all of the information required to re-execute the utilities. Software AG therefore strongly recommends that you document all utility executions in order to be able to recover the database if necessary.

Security File

The Adabas security file contains the Adabas security definitions. For more information on Adabas security, For further information, please refer to the documentation of the ADASCR utility, which is used to maintain the Adabas security definitions.

User Data File

The user data file is used to store information about the last transaction for all User IDs (ETIDs) specified in the Additions1 field for an OP command. The idea of specifying ETIDs is to enable the implementation of restart processing for programs using Adabas, following a crash. If you don't do this, it doesn't make sense to specify ETIDs.

Notes:

  1. Software AG recommends that you only use ETIDs if you really intend to store data in the user data file.
  2. If you use Natural to access your Adabas database, please refer to the Natural documentation for more information how to use ETIDs with Natural.
  3. Software AG recommends that you do not use the process ID as ETID (for example by specifying “$$” as ETID in Natural); you can only access the user data if you know the process ID of the process which generated the user data, and on some operating systems the process ID can become very long, and as a consequence you can get a very large number records in the user data file.

You can delete the data in the user data file by using the ADADBM REFRESH function; of course then all user data stored in the file are lost.

Replication System Files

Please refer to the section Adabas-to-Adabas (A2A) Replication for detailed information about the replication system files.

Top of page

Data Access Strategies

Efficient Use of Descriptors

Descriptors are fields for which Adabas has created an index for efficient search operations, to control a logical sequential read and as a sort key in certain Adabas commands such as FIND and SORT ISN LIST. The use of descriptors is, therefore, closely related to the access strategy to be used for a file. Additional disk space and processing overhead is required for each descriptor, particularly those which are updated frequently. The following guidelines may be used in determining the number and type of descriptors to be defined for a file:

Superdescriptor

A superdescriptor is a descriptor which is created from a combination of up to 20 fields (or portions thereof). The fields from which a superdescriptor is derived may or may not be descriptors. Superdescriptors are more efficient than combinations of ordinary descriptors when the search criteria involves a combination of values. This is because Adabas has only to access one inverted list instead of several and does not have to AND several ISN lists to produce the final list of qualifying records. Superdescriptors may also be used in the same manner as ordinary descriptors to control the logical sequence in which a file is read and to sort ISN lists.

The values for search criteria which use superdescriptors must be provided in the format of the superdescriptor (binary for superdescriptors derived from all numeric fields, otherwise alphanumeric). If the superdescriptor format is binary, the input of the search value using an ad hoc query or report facility may be difficult.

Subdescriptor

A subdescriptor is a descriptor which is derived from a portion of a field. The field used to derive the subdescriptor may or may not be a descriptor. If a search criteria involves a range of values which is contained in the first n bytes of an alphanumeric field or the last n bytes of a numeric field, a subdescriptor derived from the relevant bytes of the field may be defined. This will enable the search criterion to be represented as a single value rather than a range which will, in turn, result in more efficient searching since Adabas will not need to merge intermediate ISN lists. For example, assume an alphanumeric field AREA of 8 bytes, the first 3 of which represent the REGION and the last 5 the DEPARTMENT. If only records for REGION 111 are desired, a search criteria of AREA = 11100000 through 11199999 would be required. If the first three bytes of AREA were defined as a subdescriptor, a search criteria equal to REGION = 111 could be specified.

Phonetic Descriptor

A phonetic descriptor may be defined to perform phonetic searches. The use of a phonetic descriptor in a FIND command results in the return of all the records which contain similar phonetic values. The phonetic value of a descriptor is based on the first 20 bytes of the field value with only alphabetic values being considered (numeric values, special characters and blanks are ignored).

Hyperdescriptor

The hyperdescriptor enables descriptor values to be generated based on a user-supplied algorithm coded in a hyperexit. Up to 255 different user-written hyperexits can be defined for a single Adabas database, and each hyperexit can handle multiple hyperdescriptors.

Hyperdescriptors can be used to implement n-component superdescriptors, derived keys, or other key constructs. See FDT Record Structure and User Exits and Hyperexits in this manual for more information about hyperdescriptors.

Soft Coupling

A multi-file query may be performed by specifying a field to be used for inter-file linkage in the search criteria. This feature is called soft coupling and does not require the files to be physically coupled.

ISNs

Each record in an Adabas file has an internal sequence number (ISN), which is a 4 byte unsigned integer >0. ISNs are used by Adabas internally to perfrom queries efficiently, and the result of Adabas FIND commands represents the result as an ISN list.

If the ISN of a record is known, it is very efficient to access the record via its ISN (Adabas L1 command).

User-Assigned ISNs

The user has the option of assigning the ISN of each record in a file rather than having this done by Adabas. This technique permits subsequent retrieval using the ISN directly rather than using the inverted lists. This requires that the user develop his own algorithm for the assignment of ISNs. The resulting range of ISNs must be within the space allocated for the Address Converter for the file (please refer to the description of the MAXISN parameter in the chapter ADAFDU in the Utilities Manual for more information), and each application which adds records to the file must contain the user's ISN assignment algorithm.

Using ISNs as a Descriptor

The user may store the ISNs of related records in another record in order to be able to read the related records directly without using the inverted lists.

For example, assume an application needs to read an order record and then find and read all customer records for the order. If the ISN of the customer record (if there is more than one customer per order, a multiple-value field could be used) were stored in the order record, the customer record could be read directly since the ISN is available in the order record. This would avoid the necessity of issuing a FIND command to the customer file to determine the customer records for the order. This technique requires that the field containing the ISNs of the customer records be established and maintained in the order record, and assumes that the ISN assignment in the customer file will not be changed as a result of a file unload and reload in which the same ISN assignment is not retained.

ADAM Usage

The Adabas Direct Access Method (ADAM) facility permits the retrieval of records directly from Data Storage without access to the inverted lists. The Data Storage block number in which a record is located is calculated using a randomizing algorithm based on the ADAM key of the record. The use of ADAM is completely transparent to application programs.

The main performance advantage of using ADAM descriptors is the reduction in the number of accesses made to inverted lists. The main advantage of using an ISN as an ADAM key is the reduction in the number of accesses to the Address Converter.

ADAM will generally use an average of 1.2 to 1.5 (logical) I/O operations (including an average of overflow records stored under Associator control in other blocks of the file) to search for a record via the Adam key, as opposed to the three to four I/O operations required to search fora record using the inverted lists. Overflow records are also retrieved using normal Associator inverted-list references.

The ADAM key of each record must be a unique value. The ISN of a record may also be used as the ADAM key.

Notes:

  1. Access to ADAM files via the ADAM key is only efficient if the file was defined with sufficent size to avoid having too many records being stored in the overflow area. If a record did not fit into the block determined by the hash algorithm and searching the record in this block was not successful, the normal search algorithm is performed afterwards. In this case, access to the record requires more time than is required for non-ADAM files.
  2. In order to avoid block overflows and the resulting records in the overflow area, you should normally define ADAM files with enough size such that the average fill factor of a block is relativly small. This means that utilities will generally require more time than for non-ADAM files with the same records, because they must perform more I/Os. This applies in particular to ADAMUP if the sort sequence of the records to be loaded is not the sequence of the hash keys, because then the next record to be loaded, in most cases, belongs to a different block.

The file definition utility ADAFDU is used to define a descriptor or ISN as an ADAM key. There are 3 parameters:

Parameter Meaning
ADAM_KEY define ADAM key
ADAM_PARAMETER parameter to influence data record distribution algorithm
ADAM_OVERFLOW number of overflow blocks

The data space for ADAM is calculated as (DSSIZE (in blocks) - ADAM_OVERFLOW):

graphics/adam004.png

The data space for ADAM cannot be subsequently extended, only the ADAM overflow area can grow. However, the ADAM area can cover multiple DS extents within the initialization. The ADAM area is formatted and marked as in use during the execution of ADAFDU.

The number of blocks to be used for the overflow area is defined with the ADAM_OVERFLOW parameter. A minimum of one block is required, and more blocks can be added later. The overflow blocks are used if there is no space for the ADAM-calculated block for a new record. The gain in performance obtained by using ADAM is decreased if a large number of records is stored in the overflow area. The distribution of records in the ADAM file can be checked using the file information utility ADAFIN.

If the space reusage option has been set for the file, it only applies to the overflow area. The DATA padding factor applies to both areas (DATA and overflow).

The ADAM_PARAMETER parameter is used to influence the distribution of the data records.

If the ADAM key is ISN or a fixed-point descriptor, it determines the number of consecutive values that are to be stored in one block. The basic algorithm is

DS number = (actual value/ADAM_PARAMETER) modulo number_adam_blocks

If the format of the ADAM key is alphanumeric, binary or floating point, then the ADAM parameter defines the offset from the end of the value for an 8-byte extraction.

Example: ADAM key with format A:

ADAM_PARAMETER = 3
Value's lengths = 5, 10, 15

graphics/adam001.png

If the value is less than or equal to 8 bytes long, the complete value is taken as the extraction value.

graphics/adam002.png

Otherwise, if (value length - ADAM parameter) is less than or equal to 8 bytes long, the first 8 bytes are taken as the extraction value.

graphics/adam003.png

Otherwise the last 8 bytes after removing (ADAM parameter) bytes are taken as the extraction value.

The algorithm for calculating a relative DS number is:

DS number = (extraction value) modulo (number of ADAM blocks)

If the format of the ADAM key is packed or unpacked, then the ADAM parameter defines the offset from the end of the value to the position of the value to be considered for the ADAM value. This ADAM value from position 1 to the position (value length - ADAM_parameter) will be converted to a 8-byte integer value.

Example: ADAM key with format P or U

ADAM_PARAMETER = 2
Value's length = 8

graphics/adam02.png

The algorithm for calculating a relative DS number is:

DS number = (integer value) modulo (number of ADAM blocks)

If the format of the ADAM key is fixed point or if the ADAM key is the ISN, the extraction value is (ADAM key value) / (ADAM parameter).

Example: ADAM key with format F

The following values are entered in ADAFDU:

DSSIZE = 40 B
ADAM_KEY = FF
ADAM_OVERFLOW = 10
ADAM_PARAMETER = 12

The file is an ADAM file, and the unique descriptor FF is used as an ADAM key. 30 blocks will be used for the ADAM DS area, with 10 blocks reserved for the overflow area. 12 consecutive values will be stored in each block.

The values will be stored in the DS blocks as follows:

FF Value DS Block
0 - 11 1
12 - 23 2
24 - 35 3
... ...
348 - 359 30
360 - 371 1
372 - 383 2
... ...

Top of page

Disk Space Usage

The efficient use of disk space is especially important in a database environment because:

Decisions concerning the efficient usage of disk space must be made while considering other objectives of the system (performance, flexibility, ease of use). This section discusses the techniques and considerations involved in making trade-offs between these objectives and the efficient usage of disk space.

Compression

Each field may be defined to Adabas with one of three compression options:

  1. Ordinary compression (the default) which causes Adabas to remove trailing blanks from alphanumeric fields and leading zeros from numeric fields, but requires one additional length byte if the compressed value length is <= 126, or two if the compressed value length is larger. The null value is compressed to a length byte = 1.

  2. Null value suppression which results in ordinary compression and, in addition, suppresses the null value for the field.

  3. Fixed storage (FI), in which the field is not compressed at all, but the additional length byte in Data Storage is omitted.

The figure Adabas Compression below illustrates how various values of a five-byte alphanumeric field are stored using each compression option. The number preceding each stored value is an inclusive length byte (not used for FI fields). The asterisk shown under null value suppression indicates a suppressed field count. This is a one-byte field which indicates the number of empty (suppressed) fields present at this point in the record. A `b' means a blank.

Field     Ordinary      Fixed      Null Value
Value     Compression   Storage    Suppression

ABCbb     04414243      4142432020 04414243
          (4 bytes)     (5 bytes)  (4 bytes)

ABCDb     0541424344    4142434420 0541424344
          (5 bytes)     (5 bytes)  (5 bytes)

ABCDE     064142434445  4142434445 064142434445
          (6 bytes)     (5 bytes)  (6 bytes)
bbbbb     01            2020202020  *
          (2 bytes)     (5 bytes)  (1 byte)

The compression options chosen also affect the creation of the inverted list for the field (if it is a descriptor) and the processing time needed for compression and decompression of the field.

Fixed Storage

Fixed storage indicates that no compression is to be performed on the field. The field is stored according to its standard length with no length byte. Fixed storage is useful for small fields and for fields for which little or no compression is possible. See FDT Record Structure for information about the various restrictions related to the use of FI fields.

Ordinary Compression

Ordinary compression results in the removal of trailing blanks from alphanumeric fields and leading zeros from numeric fields. As can be seen in the figure Adabas Compression above, ordinary compression will result in a saving in disk space if at least 2 bytes of trailing blanks or leading zeros are removed.

Null Value Suppression

If null value suppression is specified for a field, and the field value is null, a one-byte empty field indicator will be stored instead of a length byte and the compressed null value (see figure above). This empty field indicator specifies the number of consecutive null-value suppressed fields which contain null values at this point in the record. Up to 63 empty fields can be represented by one byte. It is, therefore, advantageous to physically position fields which are frequently empty next to one another in the record and to define each with the null-value suppression option.

If the field is a descriptor, the use of null value suppression will result in the omission of the null value from the inverted lists. This means that a FIND command, in which the null value of the descriptor is used will always result in no qualifying records even if there are records in Data Storage which contain a null value for the descriptor. This applies also to subdescriptors and superdescriptors derived from a field defined with null value suppression. No entry will be made for a subdescriptor if the bytes of the field from which it is derived contain a null value and the field is defined with the null-value suppression option. No entry will be made for a superdescriptor if the bytes of any of the fields from which it is derived contain a null value and the field is defined with the null-value suppression option.

The use of null value suppression with descriptor fields, therefore, depends on the need to search for null values, and, if the descriptor is used to control logical sequential reading or sorting, the need to read records containing a null value. If such a need does not exist, null value suppression is normally used (unless the FI option is used).

Null value suppression is normally recommended for multiple-value fields and fields within periodic groups in order to reduce both the amount of disk space required and the internal processing requirements of these types of fields. The updating of such fields varies according to the compression option used. If a multiple-value field defined with null value suppression is updated with a null value, all values to the right are shifted left, and the value count is reduced accordingly. If all the fields of a periodic group are defined with null value suppression, and the entire group is updated to a null value, the occurrence count will be reduced only if the occurrence updated is the highest (last) occurrence. For detailed information about the updating of multiple-value fields and periodic groups, see FDT Record Structure and the Command Reference, A1 command and Command Reference, N1/N2 command.

Multiple-Value Fields and Periodic Groups

The values for multiple-value fields and periodic groups are normally preceded by an 8-byte header (or sometimes by a one byte MU or PE count). Each occurrence of a periodic group is preceded by a two-byte length indicator. If a periodic group contains empty occurrences, up to 32767 empty occurrences are compressed to a 2-byte empty periodic group occurrence counter.

Padding Factors

A large amount of record update activity (A1 command) may result in a considerable amount of record migration, i.e. moving the record from its current block to another block in which there is sufficient space for the expanded record. Record migration may be considerably reduced by defining a larger padding factor for Data Storage for the file when it is loaded. The padding factor represents the percentage of each physical block that is to be reserved for record expansion. The padding area is not used during file loading or when adding new records to a file. A large padding factor should not be used if only a small percentage of the records are likely to expand, since the padding area of all the blocks in which non-expanding records are located would be wasted.

If a large amount of record update/addition is to be performed, in which a large number of new values must be inserted into the current value range of one or more descriptors, a considerable amount of migration may also occur within the Associator. This may be reduced by assigning a larger padding factor for the Associator.

The disadvantages of a large padding factor are a larger disk-space requirement (less records or entries per block) and possible degradation of sequential processing, since more physical blocks will have to be read.

Padding factors are specified when a file is defined (using utility ADAFDU) and can be changed (using utility ADAORD).

Top of page

Security Planning

This section describes the general considerations which should be made concerning database security and explains the Adabas facilities which may be used to secure data contained within the database.

Effective security measures must take account of the following points:

The cost of security measures is usually much easier to quantify than the risk or cost of a security violation. The calculation may, however, be complicated by the fact that some security measures may offer benefits outside the area of security. The cost of a security violation depends on the nature of the violation. Possible types of cost include:

Top of page

Adabas Security Facilities

This section contains an overview of the security facilities provided by Adabas and its subsystems. For more detailed information about the facilities discussed in this section, please refer to the chapter Database Security Overview.

Password Security

Password security allows the DBA to control a user's use of the database in the following ways:

Ciphering

Adabas can cipher the data that it stores in container files. This, however, only applies to the data records that are stored in the Data storage, but not to the inverted lists on the Associator.

Ciphering prevents the unauthorized analysis of Adabas container files. If ciphering is enabled (see below), data records are ciphered when they are stored in a database by either the Adabas nucleus or by the mass update utility ADAMUP. The data records are then deciphered when they are requested by a user or application: this means that the ciphering is completely transparent to the user or application.

Ciphering can be enabled for individual Adabas files. This is done when defining the file with ADAFDU by setting the CIPHER/NOCIPHER option. The ciphering process uses internal parameters in order to achieve a maximum level of security. In some systems, identical fields and records present a possible security risk: if an unauthorized user can decipher one, the other can also be deciphered. The Adabas ciphering process, however, treats identical fields and records as follows:

The following example demonstrates this on the basis of two fields in a record which both contain the value `TEST' (representations are hexadecimal):

Record 1  Unciphered=0x54455354  Ciphered=0xDD022537
Record 2  Unciphered=0x54455354  Ciphered=0x55EF0A51

Note:
The ciphered values shown above are just examples, and do not represent the actual ciphering mechanisms used.

The Adabas ciphering mechanism is characterized by the following features and restrictions:

Top of page

Transaction Concept

An important concept for all databases is the availability of a transaction concept in order to guarantee database integrity. A transaction guarantees that a set of database update operations will either be committed, i.e. they all the updates become persistent in the database, or in case of an error, the update operations already performed will be completely be rolled back.

This section is just a short overview on the transaction concept; please refer to the Command Reference section for further information.

Adabas has the following database commands to support the transaction concept:

For some files, it can be desirable that they do not take part in normal transaction logic, and that all database modifications for the file are kept in the database even if a transaction is rolled back. An example for such a file is a log file, in which all activities of a user are to be logged including activities within a transaction that is later backed out.

Lock Concept

In order to guarantee database integrity, it must not be possible for another user to update records that required for a transaction. To this end, Adabas lets you lock records for the duration of a transaction.

Adabas supports the following types of lock:

Lock Type Usage
Share or read lock (S) You can acquire a shared or read lock if no other user has already acquired an exclusive lock for the record. S locks allow you to guarantee that nobody else can update one or more of the records as long as you have these records locked, while other users can still also get a shared lock.
Exclusive or write lock (X) You can only acquire an exclusive or write lock for an Adabas record if no other user has already acquired an S or X lock for this record. Modification or deletion of a record is only possible with an X lock of the record. If you create a new record, this record is automatically locked exclusively.

Subtransactions

Sometimes it is can be necessary to roll back not the complete transaction, but only a subset of the transaction. To this end, Adabas has a subtransaction concept, which is implemented via special options of the ET and BT commands.

ET Synchronization

Sometimes it is necessary for the database to be in a consistent state:

Note:
Switching to a new PLOG extent does not require ET synchronization, because all extents of a PLOG are considered as one PLOG.

In all of these cases, an ET synchronization must be performed for the database - this means:

If you create a backup on the file level using ADABCK DUMP without the option NEW_PLOG, an ET synchronization is only performed on the file level:

Top of page

Recovery/Restart Design

This section discusses the design aspects of database recovery/restart.

Correct recovery/restart planning is an important part of the design of the system, particularly one in which a database is used. Most of the causes of failure can be anticipated, evaluated and resolved as part of the basic system design process.

Recoverability is often an implied objective. Everyone assumes that, regardless of what happens, the system can be recovered and restarted. There are, however, specific facts to be determined about the level of recovery needed by the various users of the system. Recoverability is an area where the DBA has to take the initiative and establish necessary facts. Initially, each potential user of the system should be asked about their recovery/restart requirements. The most important considerations are:

Planning and Incorporating Recoverability

Once the recovery/restart requirements have been established, the DBA can proceed to plan the measures necessary to meet these requirements. The methodology provided in this section may be used as a basic guideline.

  1. A determination should be made as to the level and degree to which data is shared by the various users of the system.

  2. The recovery parameters for the system should be established. This includes a predicted/actual breakdown rate, an average delay and items affected, and items subject to security.

  3. An outline containing recovery design points should be prepared. Information in this outline should include:

    The recovery strategy should be subjected to all possible breakdown situations to determine the suitability, effectiveness, and cost of the strategy.

  4. Operations personnel should be consulted to determine whether all resources required for recovery/restart can be made available if and when they are needed.

  5. The final recovery design should be documented and reviewed with users, operations personnel and any others involved with the system.

Locations of Database Containers, Backup Files, and Protection Logs

When you restart the database after a database crash, an autorestart is performed: All transactions that were active when the nucleus crashed are rolled back, and all missing database updates are written to the ASSO and DATA containers. For this purpose, the update operations have been logged on the WORK container. Nevertheless, in case of a disk corruption, it may be that the autorestart fails. In this case, it is important that you can recover the state of your database from a backup and the protection logs. This can be guaranteed only if your backup files and protection logs (PLOGs) are stored on separate, independent disks. Note that Adabas logs the update operations twice: once on WORK for the autorestart, and once on PLOGs for restore/recover to enable the recovery of the current database state in case a disk where a log is stored becomes corrupted.

Notes:

  1. In order to avoid disk problems, you can also consider hardware-based solutions such as RAID systems. Nevertheless, it is recommended that you create backups and PLOGs, because a RAID system doesn't protect you from software or handling errors.
  2. You should also consider disaster recovery: what do you do if your complete computer system should be destroyed?

Matching Requirements and Facilities

Once the general recovery requirements have been designed, the next step is to select the relevant Adabas and non-Adabas facilities to be used to implement recovery/restart. The following sections describe the Adabas facilities related to recovery/restart.

Transaction Recovery

Many batch-update programs process streams of input transactions that have the following characteristics:

The Adabas End Transaction (ET) Command

The use of the Adabas ET command will:

Adabas Close (CL) Command

The Adabas CL command can be used to update the user's current ET data (for example, to set a job completed flag).

Reading ET Data

A user may retrieve his ET data after a user restart or at the start of a new user or Adabas session with the Adabas OP command. The user is required to provide a user identification (USERID) with the OP command. This USERID is used by Adabas to locate the user's ET data.

Another user's ET data may be read by using the RE command, provided that the USERID of the other user is known. This may be useful, for example, for staff supervision of an online update operation.

System or Transaction Failure

In the event of an abnormal termination of an Adabas session, the Adabas AUTOBACKOUT routine, which is automatically invoked at the beginning of every Adabas session, will remove the effects of all interrupted transactions from the database.

If an individual transaction is interrupted, Adabas will automatically remove all the changes the transaction has made to the database. An application program can explicitly cause its current transaction to be backed out by issuing the Adabas BT command.

No-BT Files

In the case of a nucleus crashing, the following points should be taken into consideration:

Limitations of Adabas Transaction Recovery

The following limitations of Adabas transaction recovery should be considered:

Adabas Checkpoint Command

Some programs cannot conveniently use ET commands because:

Such programs can use the Adabas checkpoint command (C1) to establish a point at which the file or files the program is updating can be restored if necessary. The specific command used depends on the type of updating (exclusive control) being performed.

Exclusive File Control

A user can request exclusive update control of one or more Adabas files. Exclusive control is requested with the OP command and will be given only if the file is not currently being updated by another user. Once exclusive control is assigned to a user, other users may read but not update the file.

Programs which read and/or update long sequences of records, either in logical sequence or as a result of searches, may use exclusive control to prevent other users from updating the records used. This avoids the necessity of issuing a record hold command for each record.

Checkpointing Exclusive Control Files

Exclusive control users may or may not use ET commands. If ET commands are not used, checkpoints can be taken by issuing a C1 command (if user data is to be stored).

System or Program Failure

In the event of a system or program failure, the file or files being updated under exclusive control may have to be restored (using ADABCK or ADAMUP) to the state before the start of the execution of the program which failed.

Limitations of Exclusive File Control

The following limitations apply to exclusive file control:

User Restart Data

The Adabas ET and CL commands provide an option of storing up to 2000 bytes of user data in an Adabas system file.

One record of user data is maintained for each user. This record is overwritten each time new user data is provided by the user. The data is maintained across Adabas sessions only if the user provides a user identification (USERID) with the OP command. User data may be read with an OP or RE command. A user may read another user's data with the RE command, provided that the USERID of the other user is known.

The primary purpose of user data is to enable programs to be self-restarting and to check that recovery procedures have been properly carried out. The type of information which may be useful as user data includes:

Top of page