Search for word tokens within some distance.
ft:proximity-contains(node $node, string $searchString, integer $distance, boolean $ordered) => boolean
This text retrieval function searches a node for a sequence of one or more word tokens (passed as a string) within a specified proximity distance and in a specified order. If the argument $node evaluates to the empty sequence, false is returned.
The $distance argument determines how far apart the matched word tokens may be in the string value of the node. The computed distance is the maximum number of unmatched tokens between the first and the last matched word token from $searchString. The function returns true if $distance is larger than this computed distance. For example, a value of 1 means the word tokens must follow one another immediately, a value of 2 allows a gap of one word in between, and so on.
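As a hedged illustration of this rule, consider the remarks text "Patient is responding to treatment." from the patient sample data. Between "responding" and "treatment" there is one unmatched token ("to"), so by the rule above the computed distance is 1. The following sketch reuses the query shape of the examples in this section and returns the result of both calls for each patient. Under the stated rule, the first call should yield false for that remark (1 is not larger than 1) and the second should yield true (2 is larger than 1). These outcomes are inferred from the description and have not been checked against a Tamino server.

```xquery
declare namespace ft="http://www.w3.org/2002/04/xquery-operators-text"
for $a in input()/patient
return ($a/name,
        ft:proximity-contains($a/remarks, "responding treatment", 1, true()),
        ft:proximity-contains($a/remarks, "responding treatment", 2, true()))
```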
With ft:proximity-contains you can perform search operations that include a wildcard character. The section Using Wildcard Characters in the XQuery 4 User Guide explains this in detail.
There are no defaults defined, so you need to invoke it with all arguments. This function is bound to the namespace http://www.w3.org/2002/04/xquery-operators-text and you need to declare that namespace in the query prolog.
Note:
This function is deprecated and will be removed in future versions of Tamino. You should use one of the functions tf:containsText, tf:containsAdjacentText or tf:containsNearText instead. See the examples for details.
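$node | node to be searched |
---|---|
$searchString | string containing a sequence of one or more words to be searched for |
$distance | integer value denoting proximity distance |
$ordered | if true, the word tokens must occur in the order given in $searchString; if false, they may occur in either order |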
In the patient sample data, there is a remarks element for the patient Bloggs that reads: "Patient is responding to treatment. Dr. Shamir."
Retrieve all patients who are responding to current treatment:
declare namespace ft="http://www.w3.org/2002/04/xquery-operators-text" for $a in input()/patient where ft:proximity-contains($a/remarks, "to treatment", 1, true()) return $a/name
This query returns the names of all patients for which ft:proximity-contains returns true, that is, those whose remarks contain the words "to" and "treatment" immediately following one another in that order. You can rewrite queries of this kind with tf:containsAdjacentText. Note that you have to specify each of the search words as a separate argument:
for $a in input()/patient where tf:containsAdjacentText($a/remarks, 1, "to", "treatment") return $a/name
Retrieve all patients who are responding to current treatment:
declare namespace ft="http://www.w3.org/2002/04/xquery-operators-text" for $a in input()/patient where ft:proximity-contains($a/remarks, "treatment responding", 2, false()) return $a/name
This query returns the names of all patients for which ft:proximity-contains returns true. In the text contents of the remarks node, the word tokens "treatment" and "responding" may have at most one word token in between and may appear in either order. You can rewrite queries of this kind with tf:containsNearText. Note that you have to specify each of the search words as a separate argument:
for $a in input()/patient where tf:containsNearText($a/remarks, 2, "treatment", "responding") return $a/name
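To see the effect of the $ordered argument, the unordered ft:proximity-contains search in this example can be compared with its ordered counterpart. The following is a minimal sketch, assuming the remarks text "Patient is responding to treatment." from the patient sample data. Since "responding" precedes "treatment" in that text, the call with true() should not match, while the call with false() should. These outcomes follow from the description of $ordered and have not been verified against a Tamino server.

```xquery
declare namespace ft="http://www.w3.org/2002/04/xquery-operators-text"
for $a in input()/patient
return ($a/name,
        ft:proximity-contains($a/remarks, "treatment responding", 2, true()),
        ft:proximity-contains($a/remarks, "treatment responding", 2, false()))
```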
Check for each patient whether the word "treatment" occurs in the remarks:
declare namespace ft="http://www.w3.org/2002/04/xquery-operators-text" for $a in input()/patient return ($a/name, ft:proximity-contains($a/remarks, "treatment", 0, true()))
This form of ft:proximity-contains degenerates to a simple text search: a distance of 0 allows no delimiters within $searchString, restricting it to a single word token. The order has no effect here. You should use tf:containsText instead:
for $a in input()/patient return ($a/name, tf:containsText($a/remarks, "treatment"))