Best Practices for Optimizing Searches
*Construct searches by including only the data that is actually required.
*Only use includeKeys() and/or includeAttribute() if those values are required for your application logic.
*If you do not need values or attributes, be careful not to burden your queries with unnecessary work. For example, if result.getValue() is never called on the search results, do not use includeValues() in the query.
*Consider whether it would be sufficient to get attributes or keys on demand. For example, instead of running a search query with includeValues() and then calling result.getValue(), query for keys only and call cache.get() for each individual key, as in the sketch after the note below.
Note:
The includeKeys() and includeValues() methods have lazy deserialization, meaning that keys and values are deserialized only when result.getKey() or result.getValue() is called. However, calls to includeKeys() and includeValues() do take time, so consider carefully when constructing your queries.
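A minimal sketch of this keys-on-demand pattern follows. The cache name "myCache", the searchable attribute "age", and the criterion are illustrative assumptions, and an initialized CacheManager is assumed to be in scope.
import net.sf.ehcache.Cache;
import net.sf.ehcache.Element;
import net.sf.ehcache.search.Attribute;
import net.sf.ehcache.search.Query;
import net.sf.ehcache.search.Result;
import net.sf.ehcache.search.Results;

Cache cache = cacheManager.getCache("myCache");
Attribute<Integer> age = cache.getSearchAttribute("age");

// Ask for keys only; no values are deserialized or returned with the results.
Query query = cache.createQuery().includeKeys().addCriteria(age.gt(30));
Results results = query.execute();

for (Result result : results.all()) {
    // Fetch the value on demand, only for the keys you actually need.
    Element element = cache.get(result.getKey());
    if (element != null) {
        Object value = element.getObjectValue();
        // ... use the value ...
    }
}
results.discard(); // release the result set when finished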
*Searchable keys and values are automatically indexed by default. If you are not including them in your query, turn off automatic indexing with the following:
<cache name="cacheName" ...>
    <searchable keys="false" values="false">
        ...
    </searchable>
</cache>
*Limit the size of the result set. Depending on your use case, you might consider maxResults, an Aggregator, or pagination:
*If getting a subset of all possible results quickly is more important than receiving every result, consider query.maxResults(int number_of_results). maxResults is most useful when the result set is ordered so that the items you want most appear within the first number_of_results.
*If all you want is a summary statistic, use a built-in Aggregator function, such as count(). For details, see the net.sf.ehcache.search.aggregator package in the Ehcache Javadoc at http://www.ehcache.org/apidocs/2.10.1/.
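Both options are shown in this sketch, reusing the illustrative cache and age attribute from the sketch above:
// Cap the result set: return at most 100 matches.
Query capped = cache.createQuery()
        .includeKeys()
        .addCriteria(age.gt(30))
        .maxResults(100);

// Summary statistic only: a count() aggregator, no per-element results.
// Requires import net.sf.ehcache.search.aggregator.Aggregators.
Query counting = cache.createQuery()
        .addCriteria(age.gt(30))
        .includeAggregator(Aggregators.count());
int matches = (Integer) counting.execute().getAggregatorResults().get(0);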
*Make your search as specific as possible.
*Queries with ilike criteria and fuzzy (wildcard) searches might take longer than more specific queries.
*If you are using a wildcard, try making it the trailing part of the string instead of the leading part ("321*" on a reversed copy of the value instead of "*123"; see the tip below).
Tip:
If you need leading-wildcard searches, create a <searchAttribute> containing the string value reversed, so that your query can use a trailing wildcard instead.
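For example, a sketch of this technique. The attribute "reversedCode" is a hypothetical <searchAttribute> populated with the reversed form of the original string:
Attribute<String> reversedCode = cache.getSearchAttribute("reversedCode");

// Match values whose original form ends in "123" by using a trailing
// wildcard on the reversed copy instead of a leading wildcard.
Query query = cache.createQuery()
        .includeKeys()
        .addCriteria(reversedCode.ilike("321*"));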
*When possible, use the query criterion "Between" instead of combining "LessThan" and "GreaterThan", or "LessThanOrEqual" and "GreaterThanOrEqual". For example, instead of combining le(startDate) and ge(endDate), try not(between(startDate, endDate)), as in the sketch below.
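A sketch of the difference, assuming an illustrative integer attribute dateAttr and int bounds startDate and endDate. Note that between() is inclusive of its endpoints, so the two forms differ only at the exact boundaries:
// Requires import net.sf.ehcache.search.expression.Criteria.
Attribute<Integer> dateAttr = cache.getSearchAttribute("dateAttr");

// Everything outside the range, expressed two ways:
Criteria viaLeGe = dateAttr.le(startDate).or(dateAttr.ge(endDate));
Criteria viaBetween = dateAttr.between(startDate, endDate).not(); // preferred

Query query = cache.createQuery().includeKeys().addCriteria(viaBetween);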
*Index dates as integers. This makes comparing and sorting cheaper, and it avoids repeated conversions at search time (see the sketch below).
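A sketch of one way to do this; the yyyyMMdd encoding and the someDate variable are assumptions:
// Encode the date as a sortable integer such as 20240131, and declare the
// <searchAttribute> on this int field rather than on a java.util.Date.
java.text.SimpleDateFormat fmt = new java.text.SimpleDateFormat("yyyyMMdd");
int dateAsInt = Integer.parseInt(fmt.format(someDate)); // someDate is a java.util.Date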
*Searches of "eventually consistent" BigMemory Max data sets are fast because queries are executed immediately, without waiting for the commit of pending transactions at the local node.
Note:
If a thread adds an element to an eventually consistent cache and immediately runs a query to fetch it, the element will not be visible in the search results until the update is published to the server.
*If you want to avoid an OutOfMemoryError while allowing your Terracotta client to receive an extremely large result set, consider using the pagination feature. Pagination limits how many of the total results appear on the client at a time, so that you can view the results in page-sized batches. Instead of calling the parameterless version of the execute method, query.execute(), pass in an ExecutionHints object that specifies the page size you want:
query.execute(new ExecutionHints().setResultBatchSize(pageSize))
If you call for results after issuing a query with ExecutionHints, all results are returned (the same behavior as a regular query), except that only the number of results specified as the ResultBatchSize is held on the client at a time. For example, if your query has 500 results and you use a ResultBatchSize of 100, you still get all 500 results, but you can scroll through them in pages of 100.
You can enable search result pagination for the execution phase of a query whether the query was constructed using the Search API or BigMemory SQL.
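A sketch of scrolling through a paginated result set; the page size is an assumption, and ExecutionHints lives in net.sf.ehcache.search:
int pageSize = 100;
Results results = query.execute(new ExecutionHints().setResultBatchSize(pageSize));

// All matches are still returned; the client holds only one batch at a time.
for (int start = 0; start < results.size(); start += pageSize) {
    for (Result result : results.range(start, pageSize)) {
        // process one page-sized batch of results here
    }
}
results.discard();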
Limitations:
*Results from GroupBy queries (created with the Query.addGroupBy() method) cannot be paginated regardless of server topology.
*In multi-stripe (active/active) Terracotta Server Array topologies, pagination is not supported for the following query types:
*Result-size capped queries with aggregate functions, for example, those constructed with Query.includeAggregator().maxResults(). The exception is count(), the one aggregator that works with all topologies.
*Queries that request result ordering, for example, those created with Query.addOrderBy().