Maintaining Tamino Indexes

The X-Machine command _admin offers various functions for maintaining Tamino indexes. This section provides general guidelines for maintaining Tamino indexes and describes which functions of _admin to use for the various possibilities.


General

Indexes are defined for a doctype by adding corresponding definitions to Tamino schema documents. In normal operation, these indexes are maintained when adding, removing or modifying documents in the doctype. However, there are a few scenarios where special actions might be required:

  • an index is disabled during an upgrade from a previous version of Tamino

  • the Tamino server was aborted during a schema update which requested creation of new indexes

  • the Tamino server was aborted while executing a function used to recreate or repair an index

  • an index has been corrupted

All these scenarios lead to an index which is marked as unusable. An appropriate message will show up in the job log of the Tamino server. Depending on the scenario, an additional message may be returned in a response to the X-Machine request which caused or detected the problem.

The command _admin=ino:DisplayIndex(...) may be used to identify the indexes that are disabled. A disabled index is marked as follows:

<ino:index ino:indexcoll="myCollection"
           ino:indexpath="myDocument/myElement"
           ino:indextype="standard"
           ino:status="not-available">
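
To pick out the disabled indexes from a longer ino:DisplayIndex response, the output can be filtered on the ino:status attribute. The following Python lines are a minimal sketch, not part of Tamino: the wrapper element, the namespace URI and the second index entry (including its status value) in SAMPLE_RESPONSE are assumptions made for illustration, and only the attribute names and the "not-available" value are taken from the fragment above.

```python
import xml.etree.ElementTree as ET

# Hypothetical response excerpt. The <ino:response> wrapper, the namespace URI
# and the status value of the second entry are assumptions for illustration.
SAMPLE_RESPONSE = """\
<ino:response xmlns:ino="http://namespaces.softwareag.com/tamino/response2">
  <ino:index ino:indexcoll="myCollection"
             ino:indexpath="myDocument/myElement"
             ino:indextype="standard"
             ino:status="not-available"/>
  <ino:index ino:indexcoll="myCollection"
             ino:indexpath="myDocument/otherElement"
             ino:indextype="text"
             ino:status="available"/>
</ino:response>
"""


def local_name(qualified):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return qualified.rsplit("}", 1)[-1]


def unavailable_indexes(xml_text):
    """Return (collection, path, type) for every index marked not-available."""
    result = []
    for elem in ET.fromstring(xml_text).iter():
        if local_name(elem.tag) != "index":
            continue
        attrs = {local_name(key): value for key, value in elem.attrib.items()}
        if attrs.get("status") == "not-available":
            result.append(
                (attrs.get("indexcoll"), attrs.get("indexpath"), attrs.get("indextype"))
            )
    return result


if __name__ == "__main__":
    for collection, path, index_type in unavailable_indexes(SAMPLE_RESPONSE):
        print(collection, path, index_type)
```

Matching on local names keeps the sketch independent of the exact namespace URI used by the server.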

The following diagram illustrates possible states of an index in a Tamino doctype and the transitions which may occur:

[Figure: graphics/transit.png (state transition diagram for an index)]

The states are:

State            Meaning
OK               the index is fully operational: it can be used for query processing
non-existing     the index does not exist
(re-)creating    the index is being newly created or regenerated
to be repaired   the index is not usable and must be repaired

Both the "(re-)creating" and the "to be repaired" state are reflected as "not-available" by the ino:DisplayIndex function.

Here are the possible reasons for the state transitions (the numbers refer to the arrows in the above diagram):

State Transition   Meaning

1   Update an existing schema and add a new index or a new unique constraint:

      (a) in session context

      (b) without session context

2   An operation which led to the "(re-)creating" state has finished successfully

3   Update an existing schema and remove an existing index or unique constraint

4   Started _admin=ino:RepairIndex(...,"drop"). This affects all indexes of a doctype that are in the "to be repaired" state.

5   Tamino server restarted after being aborted while in the "(re-)creating" state

6   Started _admin=ino:RepairIndex(...,"continue"). This affects all indexes of a doctype that are in the "to be repaired" state.

7   Different possible reasons, as listed at the beginning of this section

8   Started one of the following commands:

      • _admin=ino:RecreateIndex(...): this affects all indexes of a doctype

      • _admin=ino:RecreateTextIndex(...): this affects all text indexes of the doctype

      (a) in session context

      (b) without session context
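
The states and transitions can also be written down as a small lookup table. The Python sketch below is illustrative only: the source and target state of each arrow are inferred from the textual descriptions above, not read from the diagram, so the diagram remains authoritative. In particular, the starting state assumed for arrows 7 and 8 ("OK") is an assumption.

```python
# Arrow number -> (source state, target state).
# Endpoints are inferred from the transition descriptions; arrows 7 and 8
# are assumed to start from "OK".
TRANSITIONS = {
    1: ("non-existing", "(re-)creating"),    # schema update adds an index
    2: ("(re-)creating", "OK"),              # (re-)creation finished successfully
    3: ("OK", "non-existing"),               # schema update removes an index
    4: ("to be repaired", "non-existing"),   # ino:RepairIndex(...,"drop")
    5: ("(re-)creating", "to be repaired"),  # restart after abort during (re-)creation
    6: ("to be repaired", "(re-)creating"),  # ino:RepairIndex(...,"continue")
    7: ("OK", "to be repaired"),             # upgrade, abort or corruption
    8: ("OK", "(re-)creating"),              # ino:RecreateIndex / ino:RecreateTextIndex
}

# States that ino:DisplayIndex reports as "not-available".
NOT_AVAILABLE_STATES = {"(re-)creating", "to be repaired"}


def apply_transition(state, arrow):
    """Return the state reached by following the numbered arrow from state."""
    source, target = TRANSITIONS[arrow]
    if state != source:
        raise ValueError(
            "arrow {} starts at {!r}, not at {!r}".format(arrow, source, state)
        )
    return target


def reported_as_not_available(state):
    """True if ino:DisplayIndex shows the index as not-available in this state."""
    return state in NOT_AVAILABLE_STATES


if __name__ == "__main__":
    # An index whose creation was interrupted by a server abort and then resumed.
    state = "non-existing"
    for arrow in (1, 5, 6, 2):
        state = apply_transition(state, arrow)
        print(arrow, state, reported_as_not_available(state))
```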

Most _admin functions mentioned above operate on sets of indexes. If you want to recreate a single index, you can use the Tamino schema editor as follows:

To recreate an index using the Tamino Schema Editor

  1. Start the Tamino Schema Editor

  2. Get the schema defining the index(es) to be recreated, remove the index from the schema definition, then use Database > Define Schema to define the schema again. You will be asked whether you want to update the existing schema. Please answer "yes".

  3. Use Edit > Undo Set physical property to reintroduce the index(es), then use Database > Define Schema to define the schema again. You will be asked whether you want to update the existing schema. Please answer "yes". This will cause the index(es) to be recreated.

Note that this operation can run for a considerable length of time.

Special Considerations for Indexes

Special Considerations for Multipath Indexes

Due to the nature of multipath indexes, _admin=ino:RepairIndex(…,"drop") will be rejected if there are multipath indexes defined for a doctype.

Special Considerations for Computed Indexes

If a schema defining a computed index for a doctype is updated, all computed indexes will be recreated.

The construction of a computed index relies on the XQuery module where the referenced XQuery function is defined, and potentially also other modules that are imported directly or indirectly by that primary module. If one of these modules has been modified, the index is in general corrupted if the results returned by the indexing function have changed.

However, Tamino does not automatically recreate all computed indexes for potentially affected doctypes. Instead, it is up to the database administrator to determine the set of potentially affected doctypes and to invoke _admin=ino:RecreateIndex(…) for each of them.

The following XQuery can be used to determine the set of potentially affected doctypes if a module with targetNamespace URI has been modified:

import module namespace si="http://namespaces.softwareag.com/tamino/schemaInfo"

for $dt in si:getDoctypesUsingModule("URI") 
return $dt/../*/@name
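
Once the affected doctypes are known, one _admin request per doctype has to be issued. The following Python sketch only assembles the request URLs; it does not send them. The database URL, the collection and doctype names, and the assumption that ino:RecreateIndex takes a collection name and a doctype name as quoted string arguments are all hypothetical and should be checked against the _admin reference before use.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def admin_request_url(database_url, function, *args):
    """Build the URL of an X-Machine _admin request.

    Every argument is rendered as a double-quoted string, mirroring the
    ino:RepairIndex(...,"drop") form used in this section.
    """
    for arg in args:
        if '"' in arg:
            raise ValueError("argument must not contain a double quote: " + repr(arg))
    call = "{}({})".format(function, ",".join('"{}"'.format(arg) for arg in args))
    # urlencode percent-encodes the quotes, parentheses, colon and commas.
    return database_url + "?" + urlencode({"_admin": call})


def decode_admin_command(url):
    """Recover the plain _admin command from a URL built above (for checking)."""
    return parse_qs(urlsplit(url).query)["_admin"][0]


if __name__ == "__main__":
    # Hypothetical database URL and (collection, doctype) pairs.
    database_url = "http://localhost/tamino/mydb"
    affected = [("myCollection", "myDocument"), ("myCollection", "otherDocument")]
    for collection, doctype in affected:
        url = admin_request_url(database_url, "ino:RecreateIndex", collection, doctype)
        print(decode_admin_command(url))
        print(url)
```

Issuing each request as a separate, non-session call is in line with the recommendation given under "Dependence on Session Context".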

On the other hand, it may happen that a computed index cannot be recreated at all due to misconfiguration, for example if

  • the indexing function's signature has changed

  • the indexing function or the enclosing module as a whole has been deleted

In any of these cases it will also no longer be possible to store documents in the affected doctypes.

Note:
Queries using the computed index with a modified or broken indexing function may return invalid results.

To recover from a broken computed index, do one of the following, depending on the status of the computed indexes:

  • update or re-create the module

  • update the schema to remove the computed index

  • execute _admin=ino:RepairIndex(…,"drop")

Dependence on Session Context

All index manipulation commands, such as _admin=ino:RecreateIndex(...), _admin=ino:RecreateTextIndex(...) and _admin=ino:RepairIndex(...), may require a long time for execution and perform a lot of changes.

If the command is executed inside a session context, there are potential problems regarding transaction timeouts and journal overflow. In addition, the entire collection is locked exclusively during operation. Hence, it is recommended to use the commands listed above outside a session context (i.e. in autocommit mode).

Performance and Locking Aspects

If an index is being repaired, i.e. it is in the "(re-)creating" or "to be repaired" state, it is disabled. This means that it cannot be used for queries, which degrades the performance of queries that could otherwise take advantage of that index. In most cases, parallel inserts and updates on the respective doctype remain possible while an index manipulation command is running, except during the short preparation and termination phases of the command. If, however, an index underlying a unique constraint is disabled, the doctype is locked and no parallel insert or update operations are permitted.

Optimization

There are also scenarios where query performance is degraded even though the index is not disabled. This may happen if an index is not as selective as it could be, for example:

  • a standard or compound index contained long index values which had been truncated to a length of 1000 bytes

  • a condensed structure index contains paths with no corresponding documents being stored in the doctype any more

The command _admin=ino:Index("optimize", ...) can be used if any of these scenarios might have occurred.