Performance Considerations
Push-down of Query Logic
In order to understand potential performance implications when using TQL API, it is worth understanding the mechanisms running in the background. TQL API incorporates an internal query processing engine. When a TQL query is submitted, an optimization process decomposes the query into fragments. For each fragment it is checked whether it can be expressed in terms of the native stream API. Those fragments are then delegated to the TCStore client, which in turn tries to delegate them to TCStore server. In order to maximize server-side execution, the process of rewriting TQL query fragments in terms of native API tries to leverage portable functions of TCStore DSL whenever possible. The fragments which cannot be expressed in the native API are executed by the internal engine.
As the query processing engine is a logical part of the TCStore client, executing query fragments by the internal engine might require transferring a lot of data from TCStore server to client. Therefore, as a rule of thumb the queries should be written so that as much functionality as possible can be delegated to TCStore client.
The query execution plan can be utilized as tool for that purpose. By investigating the plan, the user can determine:
1. which fragments of the query cannot be expressed in TCStore's query DSL and therefore are executed by the internal engine on the client side;
2. which fragments of the query can be expressed in TCStore's query DSL; the resulting pipeline of stream operations is analyzed and decomposed into a sequence of portable operations and a sequence of non-portable operations;
3. which portable operations are evaluated remotely on the server and how the execution plan of the server is defined.
For details on portable and non-portable operations see also section
Stream Optimizations.
Example
Let us examine a concrete query example and its execution plan.
Result result = env.query("SELECT * FROM Comedians WHERE Age**2 > 4"); // 1
try(ResultStream resultStream = result.stream()) {
resultStream.explainQuery(System.out::println) // 2
.forEach(System.out::println);
}
1 | The query filters all records where the squared age is greater than 4. |
2 | The corresponding query plan is printed. |
Query Plan - No Push-Down
======= Query Plan Start =======
==== SQL Query: // 1
SELECT * FROM Comedians WHERE Age**2 > 4
### WARN: Pushdown failed due to the following reasons:
### WEPRME1953 - The expression of type EXPONENTIATION cannot be pushed down:
### Age^CONSTANT_INTEGER.
==== Query operations evaluated by SQL engine: // 2
SELECTION(filter[Age^CONSTANT_INTEGER>CONSTANT_INTEGER])
==== Query operations evaluated by TCStore Stream Query API: // 3
SOURCE(Comedians)
==== Code generated for TCStore Stream Query API: // 4
IntCellDefinition Age_INT = CellDefinition.defineInt("Age");
StringCellDefinition FirstName_STRING = CellDefinition.defineString("FirstName");
IntCellDefinition Weight_INT = CellDefinition.defineInt("Weight");
StringCellDefinition LastName_STRING = CellDefinition.defineString("LastName");;
reader.records()
==== TCStore query plan: // 5
Stream Plan
Structure:
Portable:
None
Non-Portable:
PipelineOperation{MAP(de.rtm.adapters.tcstore.source.translation.operators.
OperatorTranslationResult$RecordStreamResult$$Lambda$470/28956604@1b5975f)}
PipelineOperation{ITERATOR()}
Server Plan: [stream id: a6e5eb73-47d7-4748-b260-53759d22471b]
Stream Planning Time (Nanoseconds): 4290237
Sorted Index Used In Filter: false
Filter Expression: true
Unknown Filter Count: 0
Unused Filter Count And Filters (If Any): 0
Selected Plan: Full Dataset Scan
======= Query Plan End =======
The query plan provides the following information:
1 | The query is available under section "SQL Query". The warning indicates which fragments cannot be pushed down, in this example the exponentiation operation. |
2 | The query operations being executed by the internal query engine are listed. In this example this list comprises a selection using a filter on column age. |
3 | The query operations being executed by the native API only comprises the source itself, i.e. the access to the Dataset. |
4 | The optimization process rewrites query fragments in the native API. The corresponding code being generated is listed. In this example, only the method 'records' is called, delivering a stream of all instances in the dataset. |
5 | Finally the query plan of TCStore is listed. |
The example query can be rewritten, delivering the same results (under the assumption that age is positive).
result = env.query("SELECT * FROM Comedians WHERE Age > 2"); // 1
try(ResultStream resultStream = result.stream()) {
resultStream.explainQuery(System.out::println)
.forEach(System.out::println);
}
1 | The filter predicate of the query has been rewritten. |
The according query plan now illustrates the effects of rewriting the query on the push-down capabilities.
Query Plan - Push-Down
======= Query Plan Start =======
==== SQL Query: // 1
SELECT * FROM Comedians WHERE Age > 2
==== Query operations evaluated by TCStore Stream Query API: // 2
SELECTION(filter[Age>CONSTANT_INTEGER])
SOURCE(Comedians)
==== Code generated for TCStore Stream Query API: // 3
IntCellDefinition Age_INT = CellDefinition.defineInt("Age");
StringCellDefinition FirstName_STRING = CellDefinition.defineString("FirstName");
IntCellDefinition Weight_INT = CellDefinition.defineInt("Weight");
StringCellDefinition LastName_STRING = CellDefinition.defineString("LastName");;
reader.records()
.filter(Age_INT.exists()
.and(Age_INT.intValueOr(0)
.boxed()
.isGreaterThan(2)))
==== TCStore query plan: // 4
Stream Plan
Structure:
Portable:
PipelineOperation{FILTER((CellDefinition[name='Age' type='Type<Integer>']
.exists()&&(Age.intValueOr(0)>2)))}
Non-Portable:
PipelineOperation{MAP(de.rtm.adapters.tcstore.source.translation.operators
.OperatorTranslationResult$RecordStreamResult$$Lambda$470/28956604@fe32c2)}
PipelineOperation{ITERATOR()}
Server Plan: [stream id: a6e5eb74-47d7-4748-b260-53759d22471b]
Stream Planning Time (Nanoseconds): 10704854
Sorted Index Used In Filter: false
Filter Expression: (CellDefinition[name='Age' type='Type<Integer>']
.exists()&&(Age.valueOr(0)>2))
Unknown Filter Count: 0
Unused Filter Count And Filters (If Any): 0
Selected Plan: Full Dataset Scan
======= Query Plan End =======
1 | The query has been rewritten, no longer using the square operation. |
2 | Now the native API not only includes the source connection, but also the filter operation. |
3 | The generated code shows how the filter predicate is implemented. |
4 | The TCStore query plan also includes now the filter operation. |