There are several ways of loading data into a database:
Using the utility ADAMUP: you can load compressed data that was generated by ADACMP, ADAMUP or ADAULD.
Using the utility ADABCK: you can restore a file or database from a backup that was created by ADABCK.
Using the IMPORT function of the utility ADAORD: you can import a file that was created by the EXPORT function of ADAORD.
Data that is loaded into the database by the utility ADAMUP must be input to ADAMUP in a special, compressed data format. Compressed data can be generated by the utility ADACMP, which converts uncompressed data, as described in the section Uncompressed Data Format, to the necessary format. Compressed data can also be generated by the utilities ADAMUP and ADAULD, when existing data is unloaded from the database.
This is a very flexible data format. You can load a compressed data file into several different databases if required, or load it several times into a single database as separate copies of the same file. You can also load just a subset of the records into a file.
A disadvantage of this file format is that when the data is loaded into the database, ADABAS has to build a sort index for each file. For large files, this can require large amounts of CPU time, and SORT and TEMP container files are required.
Backup data is generated by the utility ADABCK. Such data can be used to build a long-term data archive, and can also be used for restoring files to databases, or to restore whole databases.
The backup and restore operations are faster than the other methods of saving and restoring data. However, you must always copy backup files back to the database from which they originated. Also, you cannot copy the files back with different file numbers.
Data that was exported from a database using the ADAORD EXPORT function can be imported to a database using the ADAORD IMPORT function. The exported data is very similar to the compressed data format described above, but the main difference is that the index information of the exported files is also exported. This means that when data is subsequently imported, the index does not have to be rebuilt, so the load procedure is much faster than the corresponding operation for ADAMUP. Also, SORT and TEMP container files are not required.
Like compressed data, this is a flexible data format. You can import an exported data file into several different databases if required, or import it several times into a single database as separate copies of the same file. However, because the index information is stored in the export file, you cannot import just a subset of the records into a file in the database.
The situation may occur in which you want to copy data from one Adabas database to another database on a computer with a different hardware architecture, for example from a Linux platform to a Windows platform.
You can use the utility ADABCK (Version 6.4 and higher) for this purpose - you can restore a backup created on one hardware architecture into a database on a computer with another architecture.
This section describes the format of data records that are input to the utility ADACMP and output from the utility ADADCU. This format is called uncompressed data format (also called raw data format). The utility ADACMP reads in data in this format and compresses it for subsequent input to the mass update utility ADAMUP. The utility ADADCU performs the opposite operation: it takes compressed data that was generated by ADACMP and decompresses it. Note that compressed data can also be generated by the utilities ADAMUP and ADAULD when data is deleted or unloaded from a database.
Unless otherwise indicated, the data formats described apply to both the input data for ADACMP and the output data for ADADCU.
Uncompressed data records are a sequence of the following syntax elements:
format_buffer_element field_references
The syntax elements, except field_references, are the same as the format buffer elements described in Command Reference, Calling Adabas, Format and Record Buffers. Note that here a format buffer element is nX, a literal, or a field definition including the length and format specifications, if they exist. The difference is how the syntax elements are separated:
The complete syntax element must be entered in one line.
The syntax elements are separated by a comma or a newline.
You can insert comments between the syntax elements: a semicolon indicates that the following characters, until end of line, are comments.
You can insert FDT between the syntax elements; FDT must be entered in a new line. This indicates to the utility where the uncompressed data record syntax is specified that the FDT is to be displayed.
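The separation rules above can be illustrated with a small Python sketch. It is not part of any Adabas utility: the function name `split_record_syntax` is hypothetical, literals are assumed to contain no commas or semicolons, and matching the FDT keyword case-insensitively is also an assumption. The function returns comma-separated tokens, so a length or format override such as `3,U` shows up as separate tokens after the field name.

```python
def split_record_syntax(text):
    """Split a record-structure specification into tokens.

    Assumptions of this sketch: literals contain no commas or semicolons,
    and the FDT keyword is matched case-insensitively.
    """
    tokens = []
    for line in text.splitlines():
        # A semicolon starts a comment that runs to the end of the line.
        line = line.split(";", 1)[0].strip()
        if not line:
            continue
        if line.upper() == "FDT":
            # FDT stands alone on its own line and requests an FDT display.
            tokens.append("FDT")
            continue
        # Within a line, tokens are separated by commas.
        tokens.extend(part.strip() for part in line.split(",") if part.strip())
    return tokens
```

For example, `split_record_syntax("AA,AB ; two fields\nfdt\nAT1-12C")` yields the tokens `AA`, `AB`, `FDT` and `AT1-12C`.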
The following special considerations apply for format buffer elements specified in a decompressed record-structure specification:
Edit masks are not allowed.
N elements are not allowed.
1-N elements must be preceded directly in the same line by the corresponding C element. Unlike the format buffer for an update or store command, they are also allowed in format buffers for compression.
Assume GB is a periodic group with the fields BA, BB and BC.
GBC,GB1-N | The number of occurrences of the periodic group and all occurrences of the periodic group are processed (GBC, BA1, BB1, BC1, BA2, BB2, BC2, ...). |
1-N elements are not allowed for fields within a PE.
GB2-GB4 | Incorrect syntax. |
GB4-2 | Descending range. |
GBC, followed by GB1-N on the next line | GBC and GB1-N must be specified in the same line. |
GBC,BA1-N | name 1-N must not be specified for fields within a periodic group. |
name R [mu_pe_index] [, length ]
name must be the name of a LOB field. If the decompressed record specification contains field references, the decompressed record does not contain the LOB values themselves, but the names of the files that contain the LOB values.
length may be a number >= 0 or ‘*’. length=* is allowed only at the end of the record for a single LOB value. If length=0 is specified, a 2-byte inclusive length is put in front of the file name (analogous to LA fields). The default is 0.
{ i [-j] | N} [ (m [-n] | N | 1-N) ]| 1 – N
If MU fields or fields in periodic groups are LOB fields, you can specify the MU or PE indices for field references the same way as you do for field values.
Notes:
AA,AB              ; 2 fields specified in the same line
9X,'LITERAL'       ; Compress: 16 bytes are ignored
                   ; Decompress: "         LITERAL" in decompressed record
fdt                ; Display FDT before next field is specified
AT1-12C            ; Number of values of MU field in the first 12 elements of PE
                   ; Only allowed for decompress
AT1-12(1-2),8,U    ; Values of MU field in a PE
                   ; Length is 8 bytes, Format is U
P1C,P11-N          ; Periodic group count and all groups
                   ; Allowed for compress, too.
LMR1-4,20          ; File names of files containing values
This section provides record definition examples. All the examples in this section refer to the sample ADABAS files in Appendix A of the Command Reference Manual.
Syntax : AA,5X,AB.
Record : AA value (8 bytes alphanumeric)
         5 spaces
         AB value (2 bytes packed)

Syntax : AA,5X,AB,3,U.
Record : AA value (8 bytes alphanumeric)
         5 spaces
         AB value (3 bytes unpacked)

Syntax : GB1.
Record : BA1 value (1 byte binary)
         BB1 value (5 bytes packed)
         BC1 value (10 bytes alphanumeric)

Syntax : GB1-2.
Record : GB1: BA1 value (1 byte binary)
              BB1 value (5 bytes packed)
              BC1 value (10 bytes alphanumeric)
         GB2: BA2 value (1 byte binary)
              BB2 value (5 bytes packed)
              BC2 value (10 bytes alphanumeric)

Syntax : MF6.
Record : MF value 6 (3 bytes alphanumeric)

Syntax : MF01-02.
Record : MF value 1 (3 bytes alphanumeric)
         MF value 2 (3 bytes alphanumeric)

Syntax : GCC,MFC.
Record : Highest occurrence count for GC (1 byte binary)
         Value count for MF (1 byte binary)
The utility ADADCU returns the requested field values in the order specified by the record definition syntax. A value is returned in the standard length and format defined for the field, unless a length and/or format override was specified. If the value is a null value, it is returned in the format in effect for the field:
Format | Null Value |
---|---|
ALPHANUMERIC (A) | Blanks (ASCII: hex '20' or EBCDIC: hex '40') |
BINARY (B) | Binary zeros |
FIXED-POINT (F) | Binary zeros |
FLOATING POINT (G) | Binary zeros |
PACKED DECIMAL (P) | Packed decimal 0 |
UNPACKED DECIMAL (U) | Unpacked decimal 0, depending on the target architecture |
UNICODE (W) | Blanks depending on WCHARSET specified |
Note:
For packed decimals, C is used as sign. For unpacked decimals, 3 is used as sign for target architecture ASCII, F for target architecture EBCDIC.
Adabas returns the number of bytes equal to the combined lengths (standard or overridden) of all requested fields.
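As an illustration of the null values listed above, the following Python sketch builds the null value for a field of a given format and length. The function name `null_value` and its parameters are hypothetical, and format W is left out because its blanks depend on WCHARSET. For unpacked decimals the sketch assumes that every byte carries the same zone nibble as the sign (3 for ASCII, F for EBCDIC).

```python
def null_value(fmt, length, architecture="ASCII"):
    """Return the null value bytes for a field format (sketch only)."""
    ebcdic = architecture.upper() == "EBCDIC"
    if fmt == "A":
        # Alphanumeric: blanks in the target character set.
        return (b"\x40" if ebcdic else b"\x20") * length
    if fmt in ("B", "F", "G"):
        # Binary, fixed-point and floating point: binary zeros.
        return b"\x00" * length
    if fmt == "P":
        # Packed decimal 0: zero digits, sign nibble C in the last byte.
        return b"\x00" * (length - 1) + b"\x0c"
    if fmt == "U":
        # Unpacked decimal 0: zone/sign nibble 3 (ASCII) or F (EBCDIC).
        return (b"\xf0" if ebcdic else b"\x30") * length
    raise ValueError(f"format {fmt!r} is not covered by this sketch")
```

For a 2-byte packed field this gives hex 00 0C, and for a 3-byte alphanumeric field on an EBCDIC target it gives hex 40 40 40.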
User data which is input to ADACMP must be contained in a sequential file. There are four ways in which the records in the input file can be separated; please refer to the parameter RECORD_STRUCTURE in the chapter ADACMP in the Utilities Manual for more detailed information. The fields in each record must be structured according to the data definition statements provided.
If a user exit routine is used, the structure must agree with the data definitions after user exit processing. Any trailing information in an input record for which there is no corresponding data-definition statement will not be processed and will not be contained in the output produced by ADACMP.
Fields defined as UNPACKED must contain a valid sign value in the four high-order bits of the low-order byte. The sign must be in zoned-numeric format. ADABAS represents the signs in zoned format.
Fields defined as PACKED must contain a valid sign value in the four low-order bits of the low-order byte. Valid positive signs are A, C, E and F. Valid negative signs are B and D. ADABAS represents a positive value with a C and a negative value with a D.
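The packed sign rule can be sketched in Python as follows. This is an illustration only, not ADACMP code: the function name `normalize_packed_sign` is hypothetical, and the low-order byte is assumed to be the last byte of the value.

```python
POSITIVE_SIGNS = {0xA, 0xC, 0xE, 0xF}
NEGATIVE_SIGNS = {0xB, 0xD}


def normalize_packed_sign(value):
    """Return a copy of a packed value with its sign nibble set to C or D."""
    if not value:
        raise ValueError("empty packed value")
    # The sign is held in the four low-order bits of the last byte.
    sign = value[-1] & 0x0F
    if sign in POSITIVE_SIGNS:
        new_sign = 0xC
    elif sign in NEGATIVE_SIGNS:
        new_sign = 0xD
    else:
        raise ValueError(f"invalid packed sign nibble {sign:X}")
    return value[:-1] + bytes([(value[-1] & 0xF0) | new_sign])
```

A value ending in sign nibble F, such as hex 12 3F, comes back as hex 12 3C, while a digit in the sign position is rejected.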
If the input file does not contain any records, a warning message is displayed and the utility aborts. However, a CMPDTA output file that contains the FDT information is created.
If the structure of the decompressed record is not described via the FIELDS parameter, please consider the following:
The values for a multiple value field must be preceded by a 1, 2 or 4 byte binary count, depending on the setting of the ADACMP parameter MUPE_C_L, to indicate the number of values of the multiple-value field in the record. The minimum number of values which may be specified is 1.
If the number of values is constant for each record, this number may be specified in the field definition table used to define the multiple-value field. In this case, the count byte in the input record must be omitted. This option is only enabled if the FDT keyword is used. FDTs that are read from the database always default to variable occurrence counts. These variable occurrence counts can be overwritten by using the FIELDS keyword.
Multiple-value fields within periodic groups must not be specified with an occurrence count when the periodic group has been specified with a variable occurrence count.
01,PG,PE
02,P1,4,A,NU
02,PM,4,A,NU,MU(4)
               ^
%ADACMP-E-FIXOCC, specification of occurrences not allowed at this position
The count provided by the user may be modified by ADACMP if the NU option is defined for the field. Null values are suppressed and the count field is modified accordingly.
Field Definition: 01,MF,4,A,MU,NU
Each record contains a variable number of values for MF.
Input Records | Before ADACMP | After ADACMP |
---|---|---|
Input Record 1 (3 values) | MF count = 3: AAAA, BBBB, CCCC | MF count = 3: AAAA, BBBB, CCCC |
Input Record 3 (3 values) | MF count = 3: AAAA, <null value>, CCCC | MF count = 2: AAAA, CCCC |
Input Record 4 (1 value) | MF count = 1: <null value> | MF count = 0 |
Field Definition: 01,MF,4,A,MU(3),NU
Each record contains 3 values for MF.
Input Records | Before ADACMP | After ADACMP |
---|---|---|
Input Record 1 | AAAA, BBBB, CCCC | MF count = 3: AAAA, BBBB, CCCC |
Input Record 2 | AAAA, BBBB, <null value> | MF count = 2: AAAA, BBBB |
Input Record 3 | AAAA, <null value>, CCCC | MF count = 2: AAAA, CCCC |
Input Record 4 | <null value>, <null value>, <null value> | MF count = 0 |
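The effect of the NU option on the value count of a multiple-value field can be sketched in Python. The function name `compress_mu_values` is hypothetical, and modelling a null value as `None` is an assumption of the sketch.

```python
def compress_mu_values(values, nu_option=True):
    """Return (count, stored_values) for one multiple-value field.

    A null value is modelled as None (an assumption of this sketch).
    """
    if nu_option:
        # With NU, null values are suppressed wherever they occur.
        stored = [value for value in values if value is not None]
    else:
        # Without NU, the count supplied with the record is kept.
        stored = list(values)
    return len(stored), stored
```

With NU, the values AAAA, null, CCCC are stored as two values, and a record whose only value is null ends up with a count of 0.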
If the structure of the decompressed record is not described via the FIELDS parameter, please consider the following:
The first occurrence of a periodic group must be preceded by a 1, 2 or 4 byte binary count, depending on the setting of the ADACMP parameter MUPE_C_L, which indicates the number of occurrences of the periodic group in the record. The minimum number of occurrences which may be specified is 1.
If the number of occurrences is constant for each record, this number may be specified in the field definition table used to define the periodic group. In this case, the count byte in the input record must be omitted.
This option is only enabled when the FDT keyword is used. FDTs that are read from the database always default to variable occurrence counts. These variable occurrence counts can be overwritten by using the FIELDS keyword.
The occurrence count provided may be modified by ADACMP only if all the fields in the periodic group are defined with the NU option. If all the fields in a given occurrence contain null values and there are no following occurrences which contain non-null values, the occurrence will be suppressed and the periodic group occurrence count will be modified accordingly.
Field Definitions:
01,GA,PE
02,A1,4,A,NU
02,A2,4,A,NU
The input records contain a variable number of occurrences for GA.
Input Records | Before ADACMP | After ADACMP |
---|---|---|
Input Record 1 | GA count = 2 | GA count = 2 |
GA (1st occ.) | A1 = AAAA, A2 = BBBB | A1 = AAAA, A2 = BBBB |
GA (2nd occ.) | A1 = CCCC, A2 = DDDD | A1 = CCCC, A2 = DDDD |
Input Record 2 | GA count = 1 | GA count = 0 |
GA (1st occ.) | A1 = <null value>, A2 = <null value> | A1 suppressed *, A2 suppressed * |
Input Record 3 | GA count = 3 | GA count = 3 |
GA (1st occ.) | A1 = AAAA, A2 = <null value> | A1 = AAAA, A2 suppressed |
GA (2nd occ.) | A1 = BBBB, A2 = <null value> | A1 = BBBB, A2 suppressed |
GA (3rd occ.) | A1 = CCCC, A2 = <null value> | A1 = CCCC, A2 suppressed |
* The fields are suppressed, but the suppression is indicated by an empty field count of 2. Up to 63 consecutive empty fields are indicated by a single empty field count.
Field Definitions:
01,GA,PE(3)
02,A1,4,A,NU
02,A2,4,A,NU
All input records contain 3 occurrences for GA.
Input Records | Before ADACMP | After ADACMP |
---|---|---|
Input Record 1 | | GA count = 3 |
GA (1st occ.) | A1 = AAAA, A2 = <null value> | A1 = AAAA, A2 suppressed |
GA (2nd occ.) | A1 = BBBB, A2 = <null value> | A1 = BBBB, A2 suppressed |
GA (3rd occ.) | A1 = CCCC, A2 = <null value> | A1 = CCCC, A2 suppressed |
Input Record 2 | | GA count = 2 * |
GA (1st occ.) | A1 = <null value>, A2 = <null value> | A1 suppressed, A2 suppressed |
GA (2nd occ.) | A1 = BBBB, A2 = <null value> | A1 = BBBB, A2 suppressed |
GA (3rd occ.) | A1 = <null value>, A2 = <null value> | A1 suppressed, A2 suppressed |
Input Record 3 | All occurrences contain null values | GA count = 0, all occurrences are suppressed ** |
* The first occurrence is included in the count since occurrences follow which contain non-null values. The third occurrence is not included in the count since no occurrences follow which contain non-null values.
** All occurrences are suppressed, but this is indicated by an empty field count of 2.
Field Definitions:
01,GA,PE(3)
02,A1,4,A
02,A2,4,A
All input records contain 3 occurrences for GA.
Input Records | Before ADACMP | After ADACMP |
---|---|---|
Input Record 1 | | GA count = 3 |
GA (1st occ.) | A1 = <null value>, A2 = <null value> | A1 = <null value>, A2 = <null value> |
GA (2nd occ.) | A1 = <null value>, A2 = <null value> | A1 = <null value>, A2 = <null value> |
GA (3rd occ.) | A1 = CCCC, A2 = <null value> | A1 = CCCC, A2 = <null value> |
Input Record 2 | | GA count = 3 |
GA (1st occ.) | A1 = <null value>, A2 = AAAA | A1 = <null value>, A2 = AAAA |
GA (2nd occ.) | A1 = <null value>, A2 = <null value> | A1 = <null value>, A2 = <null value> |
GA (3rd occ.) | A1 = <null value>, A2 = <null value> | A1 = <null value>, A2 = <null value> |
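The rule for adjusting the periodic group occurrence count can be sketched in Python. The function name `pe_occurrence_count` is hypothetical, each occurrence is modelled as a list of field values, and `None` stands for a null value; none of this is ADACMP code.

```python
def pe_occurrence_count(occurrences, all_fields_nu):
    """Return the occurrence count stored for a periodic group (sketch).

    Each occurrence is a list of field values; None models a null value.
    """
    if not all_fields_nu:
        # Unless every field has the NU option, the count is left unchanged.
        return len(occurrences)
    count = len(occurrences)
    # Drop trailing occurrences in which every field is null. Empty
    # occurrences that are followed by a non-empty one stay in the count.
    while count > 0 and all(value is None for value in occurrences[count - 1]):
        count -= 1
    return count
```

With NU on all fields, three occurrences of which only the second holds a value give a count of 2. Without NU, the same three occurrences keep a count of 3.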
Each value of a variable-length field (length set to zero in the field definition) must be preceded by a length indicator (in binary format) which indicates the value length (including the length indicator).
The length of the length indicator is:
4 bytes, if the field has the L4 option
2 bytes, if the field has the LA option
1 byte, if the field has neither of these options
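The length indicator rule can be illustrated with a short Python sketch. The function name `encode_variable_length` and its parameters are hypothetical; the byte order argument corresponds to high-order first (`"big"`) or low-order first (`"little"`) input.

```python
def encode_variable_length(value, option=None, byteorder="big"):
    """Prefix a value with its inclusive binary length indicator (sketch).

    option is "L4", "LA" or None; byteorder is "big" (high-order first)
    or "little" (low-order first).
    """
    size = {"L4": 4, "LA": 2, None: 1}[option]
    # The stored length includes the length indicator itself.
    total = len(value) + size
    return total.to_bytes(size, byteorder)  + value
```

An 8-byte value without LA or L4 is therefore preceded by hex 09, with LA by the 2-byte value 10, and with L4 by the 4-byte value 12.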
Field Definitions:
01,AA,8,A,DE
01,V1,0,A
01,V2,0,A,LA
01,V4,0,A,L4
Input records (high-order first)
"FIELD AA\x09FIELD V1\x00\x0aFIELD V2\x00\x00\x00\x0cFIELD V4"
"FIELD AA\x09FIELD V1\x07\xD2 (2000 data bytes)\x00\x00\x07\xD4 (2000 data bytes)"
Input records (low-order first)
"FIELD AA\x09FIELD V1\x0a\x00FIELD V2\x0c\x00\x00\x00FIELD V4"
"FIELD AA\x09FIELD V1\xD2\x07 (2000 data bytes)\xD4\x07\x00\x00 (2000 data bytes)"
The values for fields with the NC option are defined without the indicator when the FDT is used to describe the input record.
Field Definitions:
01,AA,5,A,NC
01,AB,5,A,NC
Input Record:
Field AA (5 bytes) | Field AB (5 bytes) |
If the input record contains values for the NC option, then either the NULL_VALUE parameter must be set, or the structure of the records must be described using the FIELDS parameter.
ADACMP modifies all input records as follows:
Fields defined with format U or P are checked to ensure that the field value is numeric and in the correct format.
If a value is null, it must contain characters which correspond to the format specified for the field:
Format | Null Value |
---|---|
ALPHANUMERIC (A) | Blanks (ASCII: hex '20' or EBCDIC: hex '40') |
BINARY (B) | Binary zeros |
FIXED-POINT (F) | Binary zeros |
FLOATING POINT (G) | Binary zeros |
PACKED DECIMAL (P) | Packed decimal 0 |
UNPACKED DECIMAL (U) | Unpacked decimal 0, depending on the source architecture |
UNICODE (W) | Blanks depending on WCHARSET specified |
For a packed or unpacked decimal field, -0 is converted to +0.
Any record which contains invalid data is written to the ADACMP error file and will not be written to the compressed file.
The value for each field is compressed (unless the FI option is specified) as follows:
Trailing blanks are removed for fields defined with A format;
Leading zeros are removed for fields defined with B, P or U format;
If the field is defined with the NU option and the value is a null value, a one-byte indicator is stored. Hexadecimal 'C1' indicates that one empty field follows, 'C2' two, etc.;
Empty fields located at the end of the record are not stored.
The following data definitions and corresponding values would be processed by ADACMP as shown in the following figure:
01,ID,4,B,DE       ; ID
01,BD,6,U,DE,NU    ; BIRTHDATE
01,SA,5,P          ; SALARY
01,DI,2,P,NU       ; DAYS ILL
01,FN,8,A,NU       ; FIRST_NAME
01,LN,9,A,NU       ; LAST_NAME
01,SE,1,A,FI       ; SEX
01,HO,7,A,NU       ; HOBBY
Field | Format | Before compression | After compression |
---|---|---|---|
ID | B | 67 12 00 00 | 03 67 12 |
BD | U | 31 36 30 35 35 39 | 07 31 36 30 35 35 39 |
SA | P | 00 00 05 00 0C | 04 05 00 0C |
DI | P | 00 0C | C2 (two empty fields, DI and FN) |
FN | A | 20 20 20 20 20 20 20 20 | (covered by the C2 indicator above) |
LN | A | 4E 41 4D 45 20 20 20 20 20 | 05 4E 41 4D 45 |
SE | A | 4D | 4D |
HO | A | 20 20 20 20 20 20 20 | C1 (one empty field) |
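Parts of the compression rules can be sketched in Python. This is a simplified illustration, not the ADACMP algorithm: the function names are hypothetical, ASCII blanks are assumed, the one-byte inclusive length prefix is inferred from the example values above, and formats B and U as well as the FI option are not covered.

```python
def compress_alpha(value):
    """Strip trailing blanks and prefix a one-byte inclusive length."""
    stripped = value.rstrip(b" ")
    return bytes([len(stripped) + 1]) + stripped


def compress_packed(value):
    """Strip leading zero bytes and prefix a one-byte inclusive length."""
    stripped = value.lstrip(b"\x00")
    return bytes([len(stripped) + 1]) + stripped


def empty_field_indicator(count):
    """Return the indicator byte for a run of 1 to 63 empty NU fields."""
    if not 1 <= count <= 63:
        raise ValueError("count must be between 1 and 63")
    # Hex C1 stands for one empty field, C2 for two, and so on.
    return bytes([0xC0 + count])
```

Applied to the LN and SA values of the example, the sketch produces hex 05 4E 41 4D 45 and hex 04 05 00 0C, and a run of two empty fields is represented by hex C2.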
When adding records to or deleting records from an ADABAS database file, entries have to be inserted/removed in the Address Converter (AC), Data Storage (DS) and in the index. The data storage space table (DSST) has to be modified accordingly.
If the USERISN option is set, the ISN given with the input data is used. If this ISN exceeds the current limit (MAXISN) for the file or has already been assigned to another record, ADAMUP terminates execution and returns an error message. As with an ADABAS N2 command, there is no automatic extension of the file's Address Converter. The file's first free ISN is set to a value that is one greater than the highest USERISN provided if there is a USERISN which is greater than or equal to the file's current first free ISN.
If the USERISN option is omitted or NOUSERISN is specified, the ISN of each record is assigned by ADAMUP. ISNs are assigned in ascending sequence. If ISN-reusage is enabled, ADAMUP first scans the file's Address Converter for unused ISNs. Once all ISNs have been reused or if ISN-reusage is disabled, ADAMUP assigns new ISNs starting at the file's first free ISN. Whenever a new Address Converter block is required, it is taken from the extents that are currently available. When these blocks are exhausted, an automatic extension is carried out according to the rules described in this chapter. Processing continues if the extension is successful, otherwise ADAMUP terminates with an error message.
ISNs deleted by a mass delete that is running in parallel can be reused immediately for the records being added.
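The ISN assignment rules can be modelled with a toy Python class. Everything in it is hypothetical: the class and method names, the assumption that ISNs start at 1, and the use of a set in place of the Address Converter. Automatic extension of the Address Converter is not modelled.

```python
class IsnAllocator:
    """Toy model of ISN assignment; names and structure are hypothetical."""

    def __init__(self, max_isn, first_free=1, reuse=False):
        self.max_isn = max_isn
        self.first_free = first_free
        self.reuse = reuse
        self.used = set()

    def assign_user_isn(self, isn):
        # With USERISN, an ISN beyond MAXISN or already in use is an error.
        if isn > self.max_isn or isn in self.used:
            raise ValueError(f"ISN {isn} cannot be assigned")
        self.used.add(isn)
        # Keep the first free ISN one above the highest user ISN seen.
        if isn >= self.first_free:
            self.first_free = isn + 1
        return isn

    def assign_next_isn(self):
        if self.reuse:
            # Reuse the lowest unused ISN below the first free ISN, if any.
            for isn in range(1, self.first_free):
                if isn not in self.used:
                    self.used.add(isn)
                    return isn
        isn = self.first_free
        if isn > self.max_isn:
            raise ValueError("no ISN available; extension is not modelled")
        self.used.add(isn)
        self.first_free = isn + 1
        return isn
```

With ISN reusage enabled, gaps below the first free ISN are filled first, and only then are new ISNs taken in ascending sequence from the first free ISN.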
If DS-reusage has been enabled, ADAMUP scans the DSST for a DS RABN with sufficient space to store the current data record. One DSST RABN is scanned at a time, as in the ADABAS nucleus, and the first free DS RABN is used if no space is found via the DSST. When a mass delete is run in parallel, the DS RABNs from which records are deleted are reused first. This differs from the procedure used by the ADABAS nucleus, but it saves scanning the DSST and minimizes the number of I/Os to the Data Storage, because those RABNs have to be read and written by the delete routines in any case.
If DS-reusage is disabled, or if no space is found via the DSST, ADAMUP assigns a new DS block starting at the first free DS RABN.
Whenever new records are added to a Data Storage block, the padding factor specified for the file is considered. If a new Data Storage block is required, it is taken from the extents that are currently available. When these blocks are exhausted, an automatic extension is carried out. Processing continues if the extension is successful, otherwise ADAMUP terminates with an error message.
In the first step, all input records on the file that contains the ISNs to be deleted are read and validated. If any invalid records are found, the line number and offset are reported, and ADAMUP terminates execution and returns an error status once the input file has been parsed completely.
At the end of this step, ADAMUP builds a table of the ISNs to be deleted in virtual memory. This table is used in the next steps when performing the updates required on the file's Address Converter, Data Storage and index. The space required for this table (one bit per entry) depends on the lowest and highest ISN specified on the input file. ADAMUP terminates execution and returns an error message if there is not sufficient space.
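A table with one bit per ISN can be sketched in Python as follows. The class name `IsnDeleteTable` and its layout are hypothetical; the sketch only shows why the space required depends on the lowest and highest ISN rather than on the number of ISNs.

```python
class IsnDeleteTable:
    """One bit per ISN between the lowest and highest ISN to be deleted."""

    def __init__(self, isns):
        isns = list(isns)
        if not isns:
            raise ValueError("no ISNs given")
        self.low = min(isns)
        self.high = max(isns)
        # Space depends only on the ISN range, not on the number of ISNs.
        self.bits = bytearray((self.high - self.low) // 8 + 1)
        for isn in isns:
            offset = isn - self.low
            self.bits[offset // 8] |= 1 << (offset % 8)

    def __contains__(self, isn):
        if not self.low <= isn <= self.high:
            return False
        offset = isn - self.low
        return bool(self.bits[offset // 8] & (1 << (offset % 8)))
```

Three ISNs spread over the range 10 to 1000 need 991 bits, that is 124 bytes, which is the same amount of space that 991 consecutive ISNs would need.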
In the second step, the file's Address Converter is processed. Because the ISNs to be deleted are pre-sorted, the number of Address Converter IOs can be reduced to a minimum in this step.
The corresponding Address Converter entry of each ISN specified is retrieved. For unused ISNs, an entry is written to the error log and processing continues if NOT_PRESENT=IGNORE is specified (default), otherwise ADAMUP terminates and an error message is returned. For ISNs that are used, the corresponding Data Storage RABN is put into the SORT and the Address Converter entry is deleted. Consecutive references to the same Data Storage RABN are skipped. Each Data Storage RABN put into the SORT is prefixed with the extent number to indicate its location in the File Control Block (FCB). This allows the next step to process the file's Data Storage according to the sequence in which the Data Storage extents were allocated.
At the end of this step, if ISN-reusage is enabled and the highest ISN of the range of records to be deleted is identical to the last used ISN on the file, the first free ISN on the file is reset to the first ISN of the highest range of ISNs to be deleted.
In the third step, the file's Data Storage and Data Storage Space Table are processed. Because the Data Storage RABNs to be modified are now pre-sorted, the number of Data Storage and Data Storage Space Table IOs can be reduced to a minimum in this step.
The relevant Data Storage blocks are read using the values returned by the SORT. Within each block, the records identified by an ISN in the table of ISNs to be deleted are removed, the block is refilled with records to be added (when a mass add is run in parallel and DS reusage is enabled) and the Data Storage Space Table is modified accordingly. At the end of this step, if DS reusage is enabled and the end RABN of the last range of Data Storage RABNs from which all data were deleted is identical to the last used Data Storage RABN on the file, the first free Data Storage RABN is reset to the start RABN of that range.
Once the Address Converter, Data Storage and Data Storage Space Table have been modified, ADAMUP copies the file's Normal Index (NI) to an intermediate file and resets the file's index extents. Index entries that correspond to deleted records are omitted in this step.
In order to build the Normal Index and Main Index, the Descriptor Value Table (DVT) entries contained on the input file have to be read and sorted according to ascending descriptor values and ISNs. The output of this sort is merged with the Normal Index entries saved on the intermediate file, and is then used to build the new Normal Index and Main Index.
Descriptors defined with the unique option are checked to ensure that the new Normal Index contains only one ISN per descriptor value. If there is more than one ISN, the conflicting ISNs are written to the error log, the unique flag is reset in the FDT and processing continues if UQ_CONFLICT=RESET is specified. Otherwise ADAMUP terminates with an error message.
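The merge of the sorted new entries with the saved Normal Index entries, followed by the check for unique descriptors, can be sketched in Python. The function names and the representation of an index entry as a `(descriptor_value, isn)` tuple are hypothetical, and the saved entries are assumed to be sorted already.

```python
import heapq
from itertools import groupby


def merge_index_entries(saved_entries, new_entries, deleted_isns=()):
    """Merge sorted saved entries with new (descriptor_value, isn) pairs."""
    deleted = set(deleted_isns)
    # Entries that belong to deleted records are omitted.
    kept = (entry for entry in saved_entries if entry[1] not in deleted)
    # New entries are sorted by ascending descriptor value and ISN first.
    return list(heapq.merge(kept, sorted(new_entries)))


def unique_conflicts(entries):
    """Return {value: [isns]} for values that occur with more than one ISN."""
    conflicts = {}
    for value, group in groupby(entries, key=lambda entry: entry[0]):
        isns = [isn for _, isn in group]
        if len(isns) > 1:
            conflicts[value] = isns
    return conflicts
```

If the merged list contains the same descriptor value for two ISNs, `unique_conflicts` reports both ISNs, which corresponds to the conflicting ISNs written to the error log.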
Besides sorting the descriptor values, reading the Descriptor Value Tables is very time-consuming as a result of the large number of I/Os to the sequential input file. Therefore, if there are many descriptors, ADAMUP attempts to minimize the number of passes required to read through the Descriptor Value Tables by using the information contained in the Descriptor Space Summary (DSS). During each pass through the Descriptor Value Tables, the values for one descriptor are directly given to the SORT. The values of additional descriptors, if they exist, are written to the TEMP data set. The greater the number of descriptors using the TEMP in parallel during each pass, the faster this step will be. ADAMUP displays the total number of passes required at the end of the run.
All index blocks are filled in accordance with the padding factor specified when the file was loaded. Whenever a new index block is required, it is taken from the existing extents (which have been reset at the start of this step). If these blocks are exhausted, an automatic extension is carried out. Processing continues if the extension is successful, otherwise ADAMUP terminates with an error message.
Whereas the Normal Index and Main Index are organized on a descriptor-by-descriptor basis, the Upper Index, index level 3 and higher, contains all descriptors. In order to link in the new Main Index, an entry is made in the Upper Index for each new Main Index block. The whole Upper Index is rebuilt. The padding factor specified when loading the file is re-established. All pre-allocated blocks are used before additional blocks are allocated. If additional blocks are required, the procedure as described for Normal Index and Main Index loading is used.
Any rejected data is written to the ADAMUP error file. The contents of this error file should be displayed using the ADAERR utility. Do not print the error file using the standard operating system print utilities, since the records contain unprintable characters.
Please refer to the ADAERR utility in the Utilities Manual for further information.
When dumping a complete database (DUMP=*), the database's global information and all loaded files are dumped to an ADABAS backup copy. Therefore, a database can be restored from a database backup copy. Single files contained in such ADABAS backup copies can also be restored.
Dumping only selected files allows a controlled backup of certain parts of a database in cases where backing up the complete database is unnecessary.
The DUMP/EXU_DUMP function may be used when the nucleus is active or inactive. If the nucleus is active during a DUMP, all updates are dumped to the backup copy.
The DUMP/EXU_DUMP function cannot be used when an AUTORESTART is pending. In this case, the nucleus must first be started to resolve the pending AUTORESTART.
When the DUMP is about to terminate, all transactions have to be synchronized on ET status. An active nucleus does this automatically on request of ADABCK. During synchronization, the nucleus will only schedule commands which
enable ET users to attain ET status;
complete any active update commands;
are read/search commands.
The nucleus may come up while the DUMP function is running. In this case, the nucleus and the DUMP function will synchronize with each other. The nucleus can be shut down with ADAOPR CANCEL while the DUMP function is active. If the nucleus terminates abnormally, ADABCK displays a message requesting the nucleus to be started. Then it waits until the nucleus performs its autorestart, after which it terminates normally.
Sometimes it can be useful to dump single files in parallel using multiple ADABCK jobs. This is generally possible with EXU_DUMP, but if the nucleus is active, only one DUMP function is permitted.
Note:
Parallel backups are not supported on Windows platforms.
A backup copy can be used to restore/overlay either selected files or a database if single files or the database's global information is corrupted.
When restoring/overlaying files, the nucleus may be either active or inactive. A check is made that all of the RABNs required by the files to be restored/overlaid are available. If all RABNs are available, the file is restored to the same position as before. If one or more of the required RABNs are not available in the database, a completely new set of RABNs will be allocated.
The nucleus may not be active when restoring/overlaying a database, since exclusive control over the database container files is required.
When restoring/overlaying a complete database, the underlying database may be larger, containing more blocks or more containers than the backup save set. However, the block sizes covered by the save set must be identical. The unused blocks from the underlying database will be kept and their space will be returned to the free space table.
When restoring/overlaying files, the underlying database can be smaller or larger than the backup copy.
When restoring/overlaying files, ADABCK tries by default to restore the blocks to the original block numbers. If this space is not available because it is occupied by another file, the file will be completely restored to other block numbers, and an attempt is made to combine several file extents into one.
Sometimes it can be useful to restore single files in parallel using multiple ADABCK jobs. This is possible with both the RESTORE and the OVERLAY function, regardless of whether the nucleus is active or inactive.
When restoring/overlaying the security file, only the passwords and the associated permission levels are re-established; the protection levels of the files loaded are not re-established. Therefore, if the file is restored to a newly-formatted database, the protection levels have to be re-enabled using the ADASCR security utility.
The protection levels of all files are only re-established if a database is restored/overlaid.
ADABCK has no restart capability. An abnormally-terminated ADABCK execution must be rerun from the beginning.
An interrupted RESTORE/OVERLAY of one or more files will result in lost RABNs which can be recovered by executing the RECOVER function of the utility ADADBM. An interrupted RESTORE/OVERLAY of a database results in a database that cannot be accessed.
When exporting one or more files, ADAORD copies the content of each file's Data Storage together with the information required to re-establish its index to a sequential output file (ORDEXP). Exporting a file's data records is identical to unloading them, and ADAORD supports the same processing sequences as the ADAULD utility. There are, however, differences in the way in which the information required to re-establish the file's index is provided. ADAORD does not generate descriptor value table (DVT) entries based on the data records (like ADAULD), but rather retrieves and exports the file's inverted lists. This requires access to a valid index and results in additional I/Os on the one hand, while saving CPU time on the other.
All files to be processed are written to a single sequential output file (ORDEXP) in ascending file number sequence. Splitting the export into separate runs and thus creating several versions of the sequential output file should be considered if non-default allocation quantities or placements are to be used when subsequently re-importing a file. If non-default values and placements are used, each file requires a separate run, and splitting the export procedure helps prevent lengthy and time-consuming positioning during the re-import process.
When importing one or more files, ADAORD retrieves the information contained on the sequential input file (ORDEXP) to re-establish each file's Data Storage, Address Converter and index. Importing a file's data records and building the Address Converter is identical to loading them using the utility ADAMUP (with the USERISN option). However, the process of building the file's index is faster in ADAORD because the descriptor values and ISNs are provided in their correct sequence. This eliminates the necessity of sorting (and of using the SORT and TEMP files) and more than compensates for the additional expenditure that results from reading through the index during the EXPORT phase.
The format of the sequential input file (ORDEXP) is independent of any database device types. Therefore, the process of exporting and then re-importing can be used to migrate files between databases that reside on different device types.
When importing the security file, only the passwords and the associated permission levels are re-established; the protection levels of the files imported are not re-established. Therefore, if a file is imported to a newly-formatted database, the protection levels have to be re-enabled using the utility ADASCR (refer to the Utilities Manual for further information).
When importing a file, both the placement and initial allocation quantities can be controlled by the user or left to ADAORD.
Unless positioning is forced by the specification of a start RABN, ADAORD will use the following sequence for the initial allocation of a file's extents: Address Converter (AC), Upper Index (UI), Normal Index (NI) and Data Storage (DS).
This allows the two extent types with the highest probability of being exhausted (NI and DS) to be extended without breaking into another extent.
If the number of blocks or megabytes to be allocated is omitted, ADAORD calculates the allocation quantity as follows:
ALQN = ALQO * (100 - PFACO) / (100 - PFACN)
where:
- ALQN
New allocation quantity in blocks or megabytes.
- ALQO
Old allocation quantity in blocks or megabytes.
- PFACN
New padding factor.
- PFACO
Old padding factor.
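The allocation formula above can be sketched in Python. This is only an illustration: the function name and the sample figures are hypothetical, and the source does not state how ADAORD rounds the result, so the raw quotient is returned.

```python
def new_allocation(alqo: float, pfaco: float, pfacn: float) -> float:
    """ALQN = ALQO * (100 - PFACO) / (100 - PFACN).

    alqo  : old allocation quantity (blocks or megabytes)
    pfaco : old padding factor in percent
    pfacn : new padding factor in percent
    """
    if not (0 <= pfaco < 100 and 0 <= pfacn < 100):
        raise ValueError("padding factors must satisfy 0 <= p < 100")
    return alqo * (100 - pfaco) / (100 - pfacn)


# Hypothetical figures: 1000 blocks, padding factor raised from 10 to 25 percent.
print(new_allocation(1000, 10, 25))  # 1200.0
```

Raising the padding factor leaves less usable space per block, so the same data needs a larger allocation; lowering it has the opposite effect.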
By default, the initial and all subsequent allocations will be made using a contiguous-best-try method.
The ISN provided with each data record (and also contained in the inverted lists) is used. ADAORD will terminate execution and return an error message if the limit (MAXISN) for a file has been decreased to a value less than the file's first free ISN and an ISN that exceeds the new limit is found. The file's new first free ISN is set to a value one greater than the highest ISN found in the data records.
In order to change the ISN assignment, the file has to be unloaded using ADAULD and then reloaded using ADAMUP.
Reordering consists of implicit EXPORT and IMPORT functions.
When reordering at the database level, all of the files in the database have to be exported in the first step. A single version of ORDEXP will be created, independently of where it physically resides.
The second step consists of rearranging the database's FCB and FDT area and reallocating the DSST behind it.
The final step is to re-import the files. Each file is relocated, multiple logical extents are condensed into a single logical extent and the padding factors are re-established.
The created sequential file (ORDEXP) will not be deleted at the end of this function.
Because the new index is based on the content of the old index (and not on the file's data records), an index which is logically inconsistent cannot be repaired by exporting and re-importing the file. Furthermore, an index which is physically corrupted may cause ADAORD's EXPORT function to loop or terminate abnormally.
The index can only be repaired by either reinverting (using ADAINV) or unloading and reloading the file (using ADAULD and ADAMUP).
This section contains formulas for calculating the Associator and Data Storage space requirements for a file.
The following pages of this chapter describe how to get a reasonably accurate estimate of the disk space requirements for your file or database before you load the data. A simple way of getting a first approximation is to load a small amount of your data, for example 1%-2%, into the database, then run the ADAREP utility and check the figures output for "allocated" and "unused" blocks. Then extrapolate these figures to calculate how much space would be required for the full 100% of the data. This is the approach often used by experienced database administrators at customer sites to calculate space requirements.
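The extrapolation described above can be sketched in Python. This is only an illustration under stated assumptions: the function name and the sample figures are hypothetical, the space in use is taken as the "allocated" figure minus the "unused" figure, and growth is assumed to be linear in the number of records.

```python
import math


def extrapolate_blocks(allocated: int, unused: int, sample_percent: float) -> int:
    """Scale the blocks used by a small sample load up to 100% of the data.

    allocated, unused : block counts read manually from the ADAREP output
    sample_percent    : share of the full data that was loaded, e.g. 2 for 2%
    """
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must satisfy 0 < p <= 100")
    used = allocated - unused
    return math.ceil(used * 100 / sample_percent)


# Hypothetical figures: 500 blocks allocated, 120 unused, after loading 2% of the data.
print(extrapolate_blocks(500, 120, 2))  # 19000
```

The estimate should be made separately for each extent type, because index and Data Storage space do not necessarily grow at the same rate.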
The Associator space required for a file is the sum of the space requirements for the following Associator elements:
Normal Index
The Normal Index is the lowest level of the index structure. It contains the inverted lists. Each inverted list is composed of a descriptor value and the list of ISNs of all the records that refer to this descriptor value.
Upper Index
The Upper Index consists of the Main Index and the other upper index levels. The Main Index is the next-highest level of the index structure after the Normal Index. It is used to manage the Normal Index. Up to this level, each index block may contain entries for only one descriptor.
The Upper Index (index levels 3 and higher) contains entries for all descriptors that are present. Level 3 is used to manage the Main Index. As long as there is more than one Upper Index block at the current level, more levels will be added, each level managing the level below.
Address Converter
The Address Converter consists of a table of RABNs, each of which indicates the Data Storage location of the record identified by a given ISN.
The space required for the Normal Index depends on the number and the characteristics of the descriptors contained in the file.
An estimate of the Normal Index space required for each descriptor can be made using the formula:
NIBY = (IL * UV * MAXISN) + DV * (L + 2)
where
- NIBY
Normal Index space requirement (in bytes).
- UV
The average number of unique values in each record for the descriptor.
If the descriptor is not defined with the MU option, UV is equal to or less than 1.
If the descriptor is defined with the NU option, UV is equal to the average number of values per record minus the average number of null values per record. For example, if the average number of values per record is 1 and 20 percent of the values are null, UV is equal to 1 - 0.2 = 0.8.
- MAXISN
The number of records permitted for the file (see MAXISN parameter of the utility ADAFDU).
- DV
The number of different values of the descriptor in the file.
- L
The average length of each different value of the descriptor. If the descriptor is not defined with the FI option, L is equal to the average length. If the descriptor is defined with the FI option, L is equal to the standard length of the descriptor.
- IL
The ISN size (2 or 4 bytes).
The term IL*UV*MAXISN represents the space required to store the ISNs, and the term DV*(L + 2) represents the space required to store the descriptor values.
For descriptors with numerous duplicate values, IL*UV*MAXISN is the dominant term. For descriptors with a large proportion of unique values, DV*(L + 2) is the dominant term.
This estimate is only valid if the data is loaded using the mass update utility ADAMUP or if the index is created with the inverted list utility ADAINV. If the data is loaded using S1 calls, twice as much space may be required (in the worst case), and the blocks are not filled completely. New values must be added to a block in sort sequence. If there is not enough space available in a block, an index block is split.
Descriptor AA has an average of 1 value in each record. There are 50 different values for AA in the file. There are no null values for AA. The average value length is 3 bytes. The MAXISN setting for the file is 20000, the ISN size is 2 bytes.
Field Definition: 01,AA,5,U,DE
NIBY = (2 * 1 * 20,000) + 50 * (3 + 2)
NIBY = 40,000 + 250
NIBY = 40,250 bytes
Descriptor BB has an average of 1 value in each record. There are 20000 different values for BB in the file. There are no null values for BB. The average value length is 10 bytes. The MAXISN setting for the file is 20000, the ISN size is 4 bytes.
Field Definition: 01,BB,15,A,DE
NIBY = (4 * 1 * 20,000) + 20,000 * (10 + 2)
NIBY = 80,000 + 240,000
NIBY = 320,000 bytes
Descriptor CC is a multiple-value field with an average of 10 values in each record. There are approximately 300 different values for CC in the file. The average value length is 4 bytes. There is an average of 3 null values in each record. The MAXISN setting for the file is 20000, the ISN size is 4 bytes.
Field Definition: 01,CC,12,A,DE,MU,NU
UV = 10 - 3 = 7
NIBY = (4 * 7 * 20,000) + 300 * (4 + 2)
NIBY = 560,000 + 1,800
NIBY = 561,800 bytes
Descriptor DD is a field within a periodic group. Each record has an average of 5 values for DD. There are 10 different values for DD in the file. Each record has an average of 3 null values. The MAXISN setting for the file is 20000. The average value length is 5 bytes, the ISN size is 2 bytes.
Field Definition:
01,PX
02,DD,8,A,NU
UV = 5 - 3 = 2
NIBY = (2 * 2 * 20,000) + 10 * (5 + 2)
NIBY = 80,000 + 70
NIBY = 80,070 bytes
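The four examples above can be checked with a short Python sketch of the Normal Index formula. The function name and the layout of the example table are illustrative only; the figures are taken from the examples.

```python
def normal_index_bytes(il: int, uv: float, maxisn: int, dv: int, l: float) -> float:
    """NIBY = (IL * UV * MAXISN) + DV * (L + 2)."""
    return il * uv * maxisn + dv * (l + 2)


# (IL, UV, MAXISN, DV, L) for descriptors AA, BB, CC and DD.
# UV is already net of null values: CC = 10 - 3, DD = 5 - 3.
examples = {
    "AA": (2, 1, 20000, 50, 3),
    "BB": (4, 1, 20000, 20000, 10),
    "CC": (4, 7, 20000, 300, 4),
    "DD": (2, 2, 20000, 10, 5),
}

for name, args in examples.items():
    print(name, normal_index_bytes(*args))
# AA 40250
# BB 320000
# CC 561800
# DD 80070
```

Summing the per-descriptor results gives the total byte requirement for the Normal Index of the file.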
Once the number of bytes required for the Normal Index has been determined, an estimate of the number of blocks required can be made using the following formula:
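NIBL = NIBY / (BL * (1 - P / 100) - 3)
where
- NIBL
Normal Index space requirement (in blocks).
- NIBY
Normal Index space requirement (in bytes).
- BL
Associator block size (in bytes).
- P
Associator block padding factor (in percent).
The result of the division should be rounded up to the next integer.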
NI requirement in bytes = 60,250
Device type 2 KB
Associator block padding factor = 10 percent
NIBL = 60,250 / (2048 * (1 - 10 / 100) - 3)
NIBL = 32+ = 33 blocks
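The conversion from bytes to blocks can be sketched in Python as follows. The function name is illustrative, and the block size of 2048 bytes is taken from the "2 KB" device type in the example.

```python
import math


def normal_index_blocks(niby: float, bl: int, p: float) -> int:
    """NIBL = NIBY / (BL * (1 - P / 100) - 3), rounded up to the next integer."""
    usable = bl * (1 - p / 100) - 3  # usable bytes per block after padding
    if usable <= 0:
        raise ValueError("block size and padding factor leave no usable space")
    return math.ceil(niby / usable)


# Figures from the example: 60,250 bytes, 2 KB blocks, padding factor 10 percent.
print(normal_index_blocks(60250, 2048, 10))  # 33
```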
The Upper Index consists of the Main Index and other upper index levels. Each Normal Index representation in the Main Index consists of a 9 byte fixed part and the descriptor value. The Main Index space requirement for each descriptor may be calculated using the formula:
MIBY = NIBL * (L + 9)
where
- MIBY
Main Index space requirement (in bytes)
- NIBL
Normal Index space requirement (in blocks)
- L
The average length of each different value of the descriptor. If the descriptor is not defined with the FI option, L is equal to the average length. If the descriptor is defined with the FI option, L is equal to the standard length of the descriptor. For fields with format A and W, the length of truncated descriptor values must be considered; the descriptor values are truncated at the first byte where they differ from the previous descriptor value.
NI block requirement = 45 blocks
Average value length = 3 bytes
MIBY = 45 * (3 + 9)
MIBY = 540 bytes
The following formula may be used to convert the Main Index byte requirement to blocks:
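MIBL = MIBY / (BL * (1 - P / 100))
where
- MIBL
Main Index space requirement (in blocks).
- MIBY
Main Index space requirement (in bytes).
- BL
Associator block size (in bytes).
- P
Associator block padding factor (in percent).
The result of the division is rounded up to the next integer.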
MI byte requirement = 540 bytes
Device type 2 KB
Associator block padding factor = 5 percent
MIBL = 540 / (2048 * (1 - 5 / 100))
MIBL = 0+ = 1 block
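The two Main Index formulas can be combined in a short Python sketch. The function names are illustrative; the figures are those of the two examples above.

```python
import math


def main_index_bytes(nibl: int, l: float) -> float:
    """MIBY = NIBL * (L + 9): one entry per Normal Index block."""
    return nibl * (l + 9)


def main_index_blocks(miby: float, bl: int, p: float) -> int:
    """MIBL = MIBY / (BL * (1 - P / 100)), rounded up to the next integer."""
    return math.ceil(miby / (bl * (1 - p / 100)))


miby = main_index_bytes(45, 3)              # 45 NI blocks, average value length 3
print(miby)                                 # 540
print(main_index_blocks(miby, 2048, 5))     # 1
```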
The highest upper index levels (level 3 and higher) contain entries for all descriptors of a file. The overall space requirements for the upper index can be obtained using the following formula:
UIBL = M * (1 + C + C**2 + C**3 + ... + C**13)
where
C is given by the following formula:
C = (L + 13) / (BL * (1 - P/100))
where
The Address Converter for a file consists of a list of the relative ADABAS block numbers (RABNs), each of which represents the Data Storage block number in which a given record is stored. The block numbers are stored in ISN sequence, with the nth entry containing the Data Storage RABN for ISN n. Three bytes are required for each entry.
The space requirement for the Address Converter can be calculated using the formula:
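AC = MAXISN * 3 / BL
where
- AC
Address Converter space requirement (in blocks).
- MAXISN
The number of records permitted for the file.
- BL
Associator block size (in bytes).
The result of the division is rounded up to the next integer.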
MAXISN = 2,000,000
Device type 2 KB
AC = 2,000,000 * 3 / 2048
AC = 6,000,000 / 2048
AC = 2929+ = 2930 blocks
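The Address Converter estimate can be sketched in Python as follows. The function name is illustrative; the three bytes per entry and the rounding rule come from the text above.

```python
import math

BYTES_PER_ENTRY = 3  # one 3-byte RABN per ISN, as stated above


def address_converter_blocks(maxisn: int, bl: int) -> int:
    """AC = MAXISN * 3 / BL, rounded up to the next integer."""
    return math.ceil(maxisn * BYTES_PER_ENTRY / bl)


# Figures from the example: MAXISN = 2,000,000 and 2 KB blocks.
print(address_converter_blocks(2_000_000, 2048))  # 2930
```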
The Data Storage space requirement can be estimated using the formula:
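DS = N/(BW/L) + 1
where
- DS
Data Storage space requirement (in blocks).
- N
The number of records.
- BW
The usable Data Storage block size in bytes, that is, the block size reduced by the Data Storage block padding factor.
- L
The average compressed record length (in bytes).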
Number of records = 1,000,000
Average compressed record length = 50
Device type = 4 KB
Data Storage block padding factor = 5 percent
BW = 4096 * (1 - 5/100) = 3891
BW/L = 3891/50 = 77 records per block (rounded down)
DS = 1,000,000/77 + 1 = 12,988 blocks
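The Data Storage estimate can be sketched in Python as follows. The function name is illustrative. The rounding is an assumption inferred from the example figures: BW and the number of records per block (BW/L) are rounded down to whole numbers, and the quotient N/(BW/L) is truncated before 1 is added. With these rules the sketch reproduces the 12,988 blocks of the example.

```python
def data_storage_blocks(n: int, l: int, bl: int, p: int) -> int:
    """DS = N / (BW / L) + 1, using integer (rounded-down) divisions.

    n  : number of records
    l  : average compressed record length in bytes
    bl : Data Storage block size in bytes
    p  : Data Storage block padding factor in percent
    """
    bw = bl * (100 - p) // 100      # usable bytes per block: 4096 -> 3891
    records_per_block = bw // l     # assumption: only whole records counted: 77
    if records_per_block == 0:
        raise ValueError("average record length exceeds usable block size")
    return n // records_per_block + 1


# Figures from the example: 1,000,000 records of 50 bytes, 4 KB blocks, 5 percent padding.
print(data_storage_blocks(1_000_000, 50, 4096, 5))  # 12988
```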