Search text based on word stems.
tf:stem(string $searchString) => unspecified
The function tf:stem is specific to Tamino. It takes a search string as argument and returns all strings that share the same stem as the search string. It can only be used within the scope of the following functions:
tf:containsAdjacentText
tf:containsNearText
tf:containsText
tf:createAdjacentTextReference
tf:createNearTextReference
tf:createTextReference
Determining the word tokens that have the same word stem as the search string requires language-specific information. Currently, the pre-defined stemming information is only suitable for German.
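As a minimal sketch of tf:stem inside one of the functions listed above, the following query passes a stemmed search string to tf:containsText. The sample chapter element and its contents are made up for illustration. The sketch assumes that the pre-defined German stemming information covers the word "Filiale"; if so, inflected forms such as "Filialen" should match as well.

```xquery
(: Sketch only: the <chapter> sample data is hypothetical. :)
let $text :=
  <chapter>
    <para>Die Bank eröffnete drei neue Filialen.</para>
    <para>Die alte Dame setzte sich auf die Bank im Stadtpark.</para>
  </chapter>
for $a in $text//para
(: tf:stem is used here within the scope of tf:containsText :)
where tf:containsText($a, tf:stem("Filiale"))
return $a
```

Because tf:stem appears only as an argument of tf:containsText, the query stays within the scope restriction described above.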
Arguments:
$searchString: a string value
In the paragraphs of some chapter, retrieve all occurrences of the German word "Bank" in the sense of a bank dealing with money:
let $text :=
  <chapter>
    <para>Die Bank eröffnete drei neue Filialen im Verlauf der letzten fünf Jahre.</para>
    <para>Ermüdet von dem Spaziergang setzte sich die alte Dame erleichtert auf die gepflegt wirkende Bank mitten im Stadtpark.</para>
    <para>Die aktuelle Bilanz der Bank zeigt einen Anstieg der liquiden Mittel im Vergleich zum Vorjahresquartal.</para>
  </chapter>
for $a in $text//para
let $check :=
  for $value in ("Geld", "Bilanz", "Filiale", "monetär", "Aktie")
  return tf:containsNearText($a, 10, tf:stem($value), tf:stem("Bank"))
where count($check[. eq true()]) > 0
return $a
The sequence ("Geld", "Bilanz", "Filiale", "monetär", "Aktie") forms a word family that fits one of the two readings of the German word "Bank". For each of these related words, the query checks whether the current paragraph contains an inflected form that is no more than ten unmatched word tokens away from an inflected form of the word "Bank". The second let clause returns a sequence of five Boolean values. If at least one of them is true, as tested by the where clause, the corresponding para element is returned as part of the result.