Jean-Francois Dockes 2021-07-09 09:25:24 +02:00
parent e6055681d4
commit 03c1e4ce8a
2 changed files with 615 additions and 556 deletions


@ -257,20 +257,25 @@ alink="#0000FF">
<dd> <dd>
<dl> <dl>
<dt><span class="sect2">3.5.1. <a href= <dt><span class="sect2">3.5.1. <a href=
"#RCL.SEARCH.LANG.SYNTAX">General
syntax</a></span></dt>
<dt><span class="sect2">3.5.2. <a href=
"#RCL.SEARCH.LANG.SPECIALFIELDS">Special field-like
specifiers</a></span></dt>
<dt><span class="sect2">3.5.3. <a href=
"#RCL.SEARCH.LANG.RANGES">Range "#RCL.SEARCH.LANG.RANGES">Range
clauses</a></span></dt> clauses</a></span></dt>
<dt><span class="sect2">3.5.2. <a href= <dt><span class="sect2">3.5.4. <a href=
"#RCL.SEARCH.LANG.MODIFIERS">Modifiers</a></span></dt> "#RCL.SEARCH.LANG.MODIFIERS">Modifiers</a></span></dt>
</dl> </dl>
</dd> </dd>
<dt><span class="sect1">3.6. <a href= <dt><span class="sect1">3.6. <a href=
"#RCL.SEARCH.ANCHORWILD">Anchored searches and "#RCL.SEARCH.ANCHORWILD">Wildcards and anchored
wildcards</a></span></dt> searches</a></span></dt>
<dd> <dd>
<dl> <dl>
<dt><span class="sect2">3.6.1. <a href= <dt><span class="sect2">3.6.1. <a href=
"#RCL.SEARCH.WILDCARDS">More about "#RCL.SEARCH.WILDCARDS">Wildcards</a></span></dt>
wildcards</a></span></dt>
<dt><span class="sect2">3.6.2. <a href= <dt><span class="sect2">3.6.2. <a href=
"#RCL.SEARCH.ANCHOR">Anchored "#RCL.SEARCH.ANCHOR">Anchored
searches</a></span></dt> searches</a></span></dt>
@ -423,7 +428,7 @@ alink="#0000FF">
<div class="list-of-tables"> <div class="list-of-tables">
<p><b>List of Tables</b></p> <p><b>List of Tables</b></p>
<dl> <dl>
<dt>3.1. <a href="#idm1471">Keyboard shortcuts</a></dt> <dt>3.1. <a href="#idm1472">Keyboard shortcuts</a></dt>
</dl> </dl>
</div> </div>
<div class="chapter"> <div class="chapter">
@ -2133,22 +2138,23 @@ metadatacmds = ; <em class=
language, through any of its aliases: <em class= language, through any of its aliases: <em class=
"replaceable"><code>tags:some/alternate/values</code></em> "replaceable"><code>tags:some/alternate/values</code></em>
or <em class= or <em class=
"replaceable"><code>tags:all,these,values</code></em> (the "replaceable"><code>tags:all,these,values</code></em>. The
compact field search syntax is supported for recoll 1.20 compact comma- or slash-based field search syntax is
and later. For older versions, you would need to repeat the supported for recoll 1.20 and later. For older versions,
<em class="replaceable"><code>tags:</code></em> specifier you would need to repeat the <em class=
for each term, e.g. <em class= "replaceable"><code>tags:</code></em> specifier for each
term, e.g. <em class=
"replaceable"><code>tags:some</code></em> <code class= "replaceable"><code>tags:some</code></em> <code class=
"literal">OR</code> <em class= "literal">OR</code> <em class=
"replaceable"><code>tags:alternate</code></em>).</p> "replaceable"><code>tags:alternate</code></em>.</p>
<p>Tags changes will not be detected by the indexer if the <p>Tags changes will not be detected by the indexer if the
file itself did not change. One possible workaround would file itself did not change. One possible workaround would
be to update the file <code class="literal">ctime</code> be to update the file <code class="literal">ctime</code>
when you modify the tags, which would be consistent with when you modify the tags, which would be consistent with
how extended attributes function. A pair of <span class= how extended attributes function. A pair of <span class=
"command"><strong>chmod</strong></span> commands could "command"><strong>chmod</strong></span> commands could
accomplish this, or a <code class="literal">touch -a</code> accomplish this, or a <code class="literal">touch
. Alternatively, just couple the tag update with a -a</code>. Alternatively, just couple the tag update with a
<code class="literal">recollindex -e -i</code> <em class= <code class="literal">recollindex -e -i</code> <em class=
"replaceable"><code>/path/to/the/file</code></em>.</p> "replaceable"><code>/path/to/the/file</code></em>.</p>
</div> </div>
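The hunk above says that the compact comma- or slash-based field syntax (recoll 1.20 and later) stands for what older versions needed spelled out by repeating the field specifier for each term. A minimal Python sketch of that equivalence follows. The helper name `expand_compact_field` is hypothetical and is not part of Recoll. It assumes a single unquoted clause that uses either commas or slashes but not both, and it relies on the manual's statements that the field separator is the last colon in the element and that elements are combined with an implicit AND.

```python
def expand_compact_field(clause: str) -> str:
    """Rewrite a compact field clause into the repeated-specifier form.

    Hypothetical helper for illustration only; it does not call Recoll.
    """
    # The field separator is the last colon in the element.
    field, sep, value = clause.rpartition(":")
    if not sep or not field:
        # No field prefix: the compact list syntax does not apply.
        return clause
    has_slash = "/" in value
    has_comma = "," in value
    if has_slash and has_comma:
        # Mixing both separators is not covered by this sketch.
        return clause
    if has_slash:
        # Slash-separated values form an OR list.
        return " OR ".join(f"{field}:{term}" for term in value.split("/"))
    if has_comma:
        # Comma-separated values form an AND list (AND is implicit).
        return " ".join(f"{field}:{term}" for term in value.split(","))
    return clause


if __name__ == "__main__":
    print(expand_compact_field("tags:some/alternate/values"))
    print(expand_compact_field("tags:all,these,values"))
```

This only illustrates the rewriting rule described in the text. The real query parser also handles quoting, modifiers, and the special field-like specifiers, which this sketch ignores.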
@ -2771,11 +2777,16 @@ fs.inotify.max_user_watches=32768
documents containing all your input terms.</p> documents containing all your input terms.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">Query Language</code> mode <p>The <code class="literal">Query Language</code>
behaves like <code class="literal">All Terms</code> mode behaves like <code class="literal">All
in the absence of special input, but it can also do Terms</code> in the absence of special input, but it
much more. This is the best mode for getting the most can also do much more. This is the best mode for
of <span class="application">Recoll</span>.</p> getting the most of <span class=
"application">Recoll</span>. It is usable from all
possible interfaces (GUI, command line, WEB UI, ...),
and is <a class="link" href="#RCL.SEARCH.LANG" title=
"3.5.&nbsp;The query language">described
here</a>.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p>In <code class="literal">Any Term</code> mode, <p>In <code class="literal">Any Term</code> mode,
@ -2906,8 +2917,8 @@ fs.inotify.max_user_watches=32768
<code class="literal">?</code>, <code class= <code class="literal">?</code>, <code class=
"literal">[]</code>). See the <a class="link" href= "literal">[]</code>). See the <a class="link" href=
"#RCL.SEARCH.WILDCARDS" title= "#RCL.SEARCH.WILDCARDS" title=
"3.6.1.&nbsp;More about wildcards">section about "3.6.1.&nbsp;Wildcards">section about wildcards</a> for
wildcards</a> for more details.</p> more details.</p>
<p>In all modes except <span class="guilabel">File <p>In all modes except <span class="guilabel">File
name</span>, you can search for exact phrases (adjacent name</span>, you can search for exact phrases (adjacent
words in a given order) by enclosing the input inside words in a given order) by enclosing the input inside
@ -2964,9 +2975,9 @@ fs.inotify.max_user_watches=32768
complex searches.</p> complex searches.</p>
<p>The <span class="guilabel">File name</span> search <p>The <span class="guilabel">File name</span> search
mode will specifically look for file names. The point of mode will specifically look for file names. The point of
having a separate file name search is that wild card having a separate file name search is that wildcard
expansion can be performed more efficiently on a small expansion can be performed more efficiently on a small
subset of the index (allowing wild cards on the left of subset of the index (allowing wildcards on the left of
terms without excessive cost). Things to know:</p> terms without excessive cost). Things to know:</p>
<div class="itemizedlist"> <div class="itemizedlist">
<ul class="itemizedlist" style= <ul class="itemizedlist" style=
@ -2981,7 +2992,7 @@ fs.inotify.max_user_watches=32768
accents, independently of the type of index.</p> accents, independently of the type of index.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p>An entry without any wild card character and not <p>An entry without any wildcard character and not
capitalized will be prepended and appended with '*' capitalized will be prepended and appended with '*'
(ie: <em class="replaceable"><code>etc</code></em> (ie: <em class="replaceable"><code>etc</code></em>
-&gt; <em class= -&gt; <em class=
@ -3841,7 +3852,7 @@ fs.inotify.max_user_watches=32768
<em class="replaceable"><code>xapi</code></em>. <em class="replaceable"><code>xapi</code></em>.
(More about wildcards <a class="link" href= (More about wildcards <a class="link" href=
"#RCL.SEARCH.WILDCARDS" title= "#RCL.SEARCH.WILDCARDS" title=
"3.6.1.&nbsp;More about wildcards">here</a> ).</p> "3.6.1.&nbsp;Wildcards">here</a> ).</p>
</dd> </dd>
<dt><span class="term">Regular expression</span></dt> <dt><span class="term">Regular expression</span></dt>
<dd> <dd>
@ -4064,7 +4075,7 @@ fs.inotify.max_user_watches=32768
given context (e.g. within a preview window, within the given context (e.g. within a preview window, within the
result table).</p> result table).</p>
<div class="table"> <div class="table">
<a name="idm1471" id="idm1471"></a> <a name="idm1472" id="idm1472"></a>
<p class="title"><b>Table&nbsp;3.1.&nbsp;Keyboard <p class="title"><b>Table&nbsp;3.1.&nbsp;Keyboard
shortcuts</b></p> shortcuts</b></p>
<div class="table-contents"> <div class="table-contents">
@ -4291,8 +4302,7 @@ fs.inotify.max_user_watches=32768
<p><b>Wildcards.&nbsp;</b>Wildcards can be used inside <p><b>Wildcards.&nbsp;</b>Wildcards can be used inside
search terms in all forms of searches. <a class="link" search terms in all forms of searches. <a class="link"
href="#RCL.SEARCH.WILDCARDS" title= href="#RCL.SEARCH.WILDCARDS" title=
"3.6.1.&nbsp;More about wildcards">More about "3.6.1.&nbsp;Wildcards">More about wildcards</a>.</p>
wildcards</a>.</p>
<p><b>Automatic suffixes.&nbsp;</b>Words like <p><b>Automatic suffixes.&nbsp;</b>Words like
<code class="literal">odt</code> or <code class= <code class="literal">odt</code> or <code class=
"literal">ods</code> can be automatically turned into "literal">ods</code> can be automatically turned into
@ -4361,7 +4371,7 @@ fs.inotify.max_user_watches=32768
Example: "user manual"p would also match "manual user". Example: "user manual"p would also match "manual user".
Also see <a class="link" href= Also see <a class="link" href=
"#RCL.SEARCH.LANG.MODIFIERS" title= "#RCL.SEARCH.LANG.MODIFIERS" title=
"3.5.2.&nbsp;Modifiers">the modifier section</a> from "3.5.4.&nbsp;Modifiers">the modifier section</a> from
the query language documentation.</p> the query language documentation.</p>
<p><b>AutoPhrases.&nbsp;</b>This option can be set in <p><b>AutoPhrases.&nbsp;</b>This option can be set in
the preferences dialog. If it is set, a phrase will be the preferences dialog. If it is set, a phrase will be
@ -5213,389 +5223,447 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
</div> </div>
</div> </div>
</div> </div>
<p>The <span class="application">Recoll</span> query
language was based on the now defunct <a class="ulink"
href="http://www.xesam.org/main/XesamUserSearchLanguage95"
target="_top">Xesam</a> user search language specification.
It allows defining general boolean searches within the main
body text or specific fields, and has many additional
features, broadly equivalent to those provided by
<span class="emphasis"><em>complex search</em></span>
interface in the GUI.</p>
<p>The query language processor is activated in the GUI <p>The query language processor is activated in the GUI
simple search entry when the search mode selector is set to simple search entry when the search mode selector is set to
<span class="guilabel">Query Language</span>. It can also <code class="literal">Query Language</code>. It can also be
be used with the KIO slave or the command line search. It used from the command line search, the KIO slave, or the
broadly has the same capabilities as the complex search WEB UI.</p>
interface in the GUI.</p>
<p>The language was based on the now defunct <a class=
"ulink" href=
"http://www.xesam.org/main/XesamUserSearchLanguage95"
target="_top">Xesam</a> user search language
specification.</p>
<p>If the results of a query language search puzzle you and <p>If the results of a query language search puzzle you and
you doubt what has been actually searched for, you can use you doubt what has been actually searched for, you can use
the GUI <code class="literal">Show Query</code> link at the the GUI <code class="literal">Show Query</code> link at the
top of the result list to check the exact query which was top of the result list to check the exact query which was
finally executed by Xapian.</p> finally executed by Xapian.</p>
<p>Here follows a sample request that we are going to <div class="sect2">
explain:</p> <div class="titlepage">
<pre class="programlisting"> <div>
<div>
<h3 class="title"><a name="RCL.SEARCH.LANG.SYNTAX"
id="RCL.SEARCH.LANG.SYNTAX"></a>3.5.1.&nbsp;General
syntax</h3>
</div>
</div>
</div>
<p>Here follows a sample request that we are going to
explain:</p>
<pre class="programlisting">
author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
</pre> </pre>
<p>This would search for all documents with <em class= <p>This would search for all documents with <em class=
"replaceable"><code>John Doe</code></em> appearing as a "replaceable"><code>John Doe</code></em> appearing as a
phrase in the author field (exactly what this is would phrase in the author field (exactly what this is would
depend on the document type, ie: the <code class= depend on the document type, ie: the <code class=
"literal">From:</code> header, for an email message), and "literal">From:</code> header, for an email message), and
containing either <em class= containing either <em class=
"replaceable"><code>beatles</code></em> or <em class= "replaceable"><code>beatles</code></em> or <em class=
"replaceable"><code>lennon</code></em> and either "replaceable"><code>lennon</code></em> and either
<em class="replaceable"><code>live</code></em> or <em class="replaceable"><code>live</code></em> or
<em class="replaceable"><code>unplugged</code></em> but not <em class="replaceable"><code>unplugged</code></em> but
<em class="replaceable"><code>potatoes</code></em> (in any not <em class="replaceable"><code>potatoes</code></em>
part of the document).</p> (in any part of the document).</p>
<p>An element is composed of an optional field <p>An element is composed of an optional field
specification, and a value, separated by a colon (the field specification, and a value, separated by a colon (the
separator is the last colon in the element). Examples: field separator is the last colon in the element).
<em class="replaceable"><code>Eugenie</code></em>, Examples:</p>
<em class="replaceable"><code>author:balzac</code></em>, <div class="itemizedlist">
<em class="replaceable"><code>dc:title:grandet</code></em> <ul class="itemizedlist" style=
<em class="replaceable"><code>dc:title:"eugenie "list-style-type: disc;">
grandet"</code></em></p> <li class="listitem"><em class=
<p>The colon, if present, means "contains". Xesam defines "replaceable"><code>Eugenie</code></em></li>
other relations, which are mostly unsupported for now <li class="listitem"><em class=
(except in special cases, described further down).</p> "replaceable"><code>author:balzac</code></em></li>
<p>All elements in the search entry are normally combined <li class="listitem"><em class=
with an implicit AND. It is possible to specify that "replaceable"><code>dc:title:grandet</code></em></li>
elements be OR'ed instead, as in <em class= <li class="listitem"><em class=
"replaceable"><code>Beatles</code></em> <code class= "replaceable"><code>dc:title:"eugenie
"literal">OR</code> <em class= grandet"</code></em></li>
"replaceable"><code>Lennon</code></em>. The <code class= </ul>
"literal">OR</code> must be entered literally (capitals), </div>
and it has priority over the AND associations: <em class= <p>The colon, if present, means "contains". Xesam defines
"replaceable"><code>word1</code></em> <em class= other relations, which are mostly unsupported for now
"replaceable"><code>word2</code></em> <code class= (except in special cases, described further down).</p>
"literal">OR</code> <em class= <p>All elements in the search entry are normally combined
"replaceable"><code>word3</code></em> means <em class= with an implicit AND. It is possible to specify that
"replaceable"><code>word1</code></em> AND (<em class= elements be OR'ed instead, as in <em class=
"replaceable"><code>word2</code></em> <code class= "replaceable"><code>Beatles</code></em> <code class=
"literal">OR</code> <em class= "literal">OR</code> <em class=
"replaceable"><code>word3</code></em>) not (<em class= "replaceable"><code>Lennon</code></em>. The <code class=
"replaceable"><code>word1</code></em> AND <em class= "literal">OR</code> must be entered literally (capitals),
"replaceable"><code>word2</code></em>) <code class= and it has priority over the AND associations: <em class=
"literal">OR</code> <em class= "replaceable"><code>word1</code></em> <em class=
"replaceable"><code>word3</code></em>.</p> "replaceable"><code>word2</code></em> <code class=
<p><span class="application">Recoll</span> versions 1.21 "literal">OR</code> <em class=
and later, allow using parentheses to group elements, which "replaceable"><code>word3</code></em> means <em class=
will sometimes make things clearer, and may allow "replaceable"><code>word1</code></em> AND (<em class=
expressing combinations which would have been difficult "replaceable"><code>word2</code></em> <code class=
otherwise.</p> "literal">OR</code> <em class=
<p>An element preceded by a <code class="literal">-</code> "replaceable"><code>word3</code></em>) not (<em class=
specifies a term that should <span class= "replaceable"><code>word1</code></em> AND <em class=
"emphasis"><em>not</em></span> appear.</p> "replaceable"><code>word2</code></em>) <code class=
<p>As usual, words inside quotes define a phrase (the order "literal">OR</code> <em class=
of words is significant), so that <em class= "replaceable"><code>word3</code></em>.</p>
"replaceable"><code>title:"prejudice pride"</code></em> is <p>You can use parentheses to group elements (from
not the same as <em class= version 1.21), which will sometimes make things clearer,
"replaceable"><code>title:prejudice and may allow expressing combinations which would have
title:pride</code></em>, and is unlikely to find a been difficult otherwise.</p>
result.</p> <p>An element preceded by a <code class=
<p>Words inside phrases and capitalized words are not "literal">-</code> specifies a term that should
stem-expanded. Wildcards may be used anywhere inside a <span class="emphasis"><em>not</em></span> appear.</p>
term. Specifying a wild-card on the left of a term can <p>As usual, words inside quotes define a phrase (the
produce a very slow search (or even an incorrect one if the order of words is significant), so that <em class=
expansion is truncated because of excessive size). Also see "replaceable"><code>title:"prejudice pride"</code></em>
<a class="link" href="#RCL.SEARCH.WILDCARDS" title= is not the same as <em class=
"3.6.1.&nbsp;More about wildcards">More about "replaceable"><code>title:prejudice
wildcards</a>.</p> title:pride</code></em>, and is unlikely to find a
<p>To save you some typing, recent <span class= result.</p>
"application">Recoll</span> versions (1.20 and later) <p>Words inside phrases and capitalized words are not
interpret a comma-separated list of terms for a field as an stem-expanded. Wildcards may be used anywhere inside a
AND list inside the field. Use slash characters ('/') for term. Specifying a wildcard on the left of a term can
an OR list. No white space is allowed. So</p> produce a very slow search (or even an incorrect one if
<pre class="programlisting">author:john,lennon</pre> the expansion is truncated because of excessive size).
<p>will search for documents with <code class= Also see <a class="link" href="#RCL.SEARCH.WILDCARDS"
"literal">john</code> and <code class= title="3.6.1.&nbsp;Wildcards">More about
"literal">lennon</code> inside the <code class= wildcards</a>.</p>
"literal">author</code> field (in any order), and</p> <p>To save you some typing, <span class=
<pre class="programlisting">author:john/ringo</pre> "application">Recoll</span> versions 1.20 and later
<p>would search for <code class="literal">john</code> or interpret a field value given as a comma-separated list
<code class="literal">ringo</code>. This behaviour only of terms as an AND list and a slash-separated list as an
happens for field queries (input without a field, comma- or OR list. No white space is allowed. So</p>
slash- separated input will produce a phrase search). You <pre class="programlisting">author:john,lennon</pre>
can use a <code class="literal">text</code> field name to <p>will search for documents with <code class=
search the main text this way.</p> "literal">john</code> and <code class=
<p>Modifiers can be set on a double-quote value, for "literal">lennon</code> inside the <code class=
example to specify a proximity search (unordered). See "literal">author</code> field (in any order), and</p>
<a class="link" href="#RCL.SEARCH.LANG.MODIFIERS" title= <pre class="programlisting">author:john/ringo</pre>
"3.5.2.&nbsp;Modifiers">the modifier section</a>. No space <p>would search for <code class="literal">john</code> or
must separate the final double-quote and the modifiers <code class="literal">ringo</code>. This behaviour is
value, e.g. <em class="replaceable"><code>"two only triggered by a field prefix: without it, comma- or
one"po10</code></em></p> slash- separated input will produce a phrase search.
<p><span class="application">Recoll</span> currently However, you can use a <code class="literal">text</code>
manages the following default fields:</p> field name to search the main text this way, as an
<div class="itemizedlist"> alternate to using an explicit <code class=
<ul class="itemizedlist" style="list-style-type: disc;"> "literal">OR</code>, e.g. <code class=
<li class="listitem"> "literal">text:napoleon/bonaparte</code> would generate a
<p><code class="literal">title</code>, <code class= search for <em class=
"literal">subject</code> or <code class= "replaceable"><code>napoleon</code></em> or <em class=
"literal">caption</code> are synonyms which specify "replaceable"><code>bonaparte</code></em> in the main
data to be searched for in the document title or text body.</p>
subject.</p> <p>Modifiers can be set on a double-quote value, for
</li> example to specify a proximity search (unordered). See
<li class="listitem"> <a class="link" href="#RCL.SEARCH.LANG.MODIFIERS" title=
<p><code class="literal">author</code> or "3.5.4.&nbsp;Modifiers">the modifier section</a>. No
<code class="literal">from</code> for searching the space must separate the final double-quote and the
documents originators.</p> modifiers value, e.g. <em class="replaceable"><code>"two
</li> one"po10</code></em></p>
<li class="listitem"> <p><span class="application">Recoll</span> currently
<p><code class="literal">recipient</code> or manages the following default fields:</p>
<code class="literal">to</code> for searching the <div class="itemizedlist">
documents recipients.</p> <ul class="itemizedlist" style=
</li> "list-style-type: disc;">
<li class="listitem"> <li class="listitem">
<p><code class="literal">keyword</code> for searching <p><code class="literal">title</code>, <code class=
the document-specified keywords (few documents "literal">subject</code> or <code class=
actually have any).</p> "literal">caption</code> are synonyms which specify
</li> data to be searched for in the document title or
<li class="listitem"> subject.</p>
<p><code class="literal">filename</code> for the </li>
document's file name. This is not necessarily set for <li class="listitem">
all documents: internal documents contained inside a <p><code class="literal">author</code> or
compound one (for example an EPUB section) do not <code class="literal">from</code> for searching the
inherit the container file name any more, this was documents originators.</p>
replaced by an explicit field (see next). </li>
Sub-documents can still have a specific <code class= <li class="listitem">
"literal">filename</code>, if it is implied by the <p><code class="literal">recipient</code> or
document format, for example the attachment file name <code class="literal">to</code> for searching the
for an email attachment.</p> documents recipients.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">containerfilename</code>. <p><code class="literal">keyword</code> for
This is set for all documents, both top-level and searching the document-specified keywords (few
contained sub-documents, and is always the name of documents actually have any).</p>
the filesystem directory entry which contains the </li>
data. The terms from this field can only be matched <li class="listitem">
by an explicit field specification (as opposed to <p><code class="literal">filename</code> for the
terms from <code class="literal">filename</code> document's file name. You can use the shorter
which are also indexed as general document content). <code class="literal">fn</code> alias. This value
This avoids getting matches for all the sub-documents is not set for all documents: internal documents
when searching for the container file name.</p> contained inside a compound one (for example an
</li> EPUB section) do not inherit the container file
<li class="listitem"> name any more, this was replaced by an explicit
<p><code class="literal">ext</code> specifies the field (see next). Sub-documents can still have a
file name extension (Ex: <code class= <code class="literal">filename</code>, if it is
"literal">ext:html</code>).</p> implied by the document format, for example the
</li> attachment file name for an email attachment.</p>
<li class="listitem"> </li>
<p><code class="literal">rclmd5</code> the MD5 <li class="listitem">
checksum for the document. This is used for <p><code class="literal">containerfilename</code>,
displaying the duplicates of a search result (when aliased as <code class="literal">cfn</code>. This
querying with the option to collapse duplicate is set for all documents, both top-level and
results). Incidentally, this could be used to find contained sub-documents, and is always the name of
the duplicates of any given file by computing its MD5 the filesystem file which contains the data. The
checksum and executing a query with just the terms from this field can only be matched by an
<code class="literal">rclmd5</code> value.</p> explicit field specification (as opposed to terms
</li> from <code class="literal">filename</code> which
</ul> are also indexed as general document content). This
avoids getting matches for all the sub-documents
when searching for the container file name.</p>
</li>
<li class="listitem">
<p><code class="literal">ext</code> specifies the
file name extension (Ex: <code class=
"literal">ext:html</code>).</p>
</li>
<li class="listitem">
<p><code class="literal">rclmd5</code> the MD5
checksum for the document. This is used for
displaying the duplicates of a search result (when
querying with the option to collapse duplicate
results). Incidentally, this could be used to find
the duplicates of any given file by computing its
MD5 checksum and executing a query with just the
<code class="literal">rclmd5</code> value.</p>
</li>
</ul>
</div>
<p>You can define aliases for field names, in order to
use your preferred denomination or to save typing (e.g.
the predefined <code class="literal">fn</code> and
<code class="literal">cfn</code> aliases defined for
<code class="literal">filename</code> and <code class=
"literal">containerfilename</code>). See the <a class=
"link" href="#RCL.INSTALL.CONFIG.FIELDS" title=
"5.4.3.&nbsp;The fields file">section about the
<code class="filename">fields</code> file</a>.</p>
<p>The document input handlers have the possibility to
create other fields with arbitrary names, and aliases may
be defined in the configuration, so that the exact field
search possibilities may be different for you if someone
took care of the customisation.</p>
</div> </div>
<p><span class="application">Recoll</span> 1.20 and later <div class="sect2">
have a way to specify aliases for the field names, which <div class="titlepage">
will save typing, for example by aliasing <code class= <div>
"literal">filename</code> to <em class= <div>
"replaceable"><code>fn</code></em> or <code class= <h3 class="title"><a name=
"literal">containerfilename</code> to <em class= "RCL.SEARCH.LANG.SPECIALFIELDS" id=
"replaceable"><code>cfn</code></em>. See the <a class= "RCL.SEARCH.LANG.SPECIALFIELDS"></a>3.5.2.&nbsp;Special
"link" href="#RCL.INSTALL.CONFIG.FIELDS" title= field-like specifiers</h3>
"5.4.3.&nbsp;The fields file">section about the
<code class="filename">fields</code> file</a>.</p>
<p>The document input handlers used while indexing have the
possibility to create other fields with arbitrary names,
and aliases may be defined in the configuration, so that
the exact field search possibilities may be different for
you if someone took care of the customisation.</p>
<p>The field syntax also supports a few field-like, but
special, criteria:</p>
<div class="itemizedlist">
<ul class="itemizedlist" style="list-style-type: disc;">
<li class="listitem">
<p><code class="literal">dir</code> for filtering the
results on file location (Ex: <code class=
"literal">dir:/home/me/somedir</code>). <code class=
"literal">-dir</code> also works to find results not
in the specified directory (release &gt;= 1.15.8).
Tilde expansion will be performed as usual (except
for a bug in versions 1.19 to 1.19.11p1). Wildcards
will be expanded, but please <a class="link" href=
"#RCL.SEARCH.WILDCARDS.PATH" title=
"Wildcards and path filtering">have a look</a> at an
important limitation of wildcards in path
filters.</p>
<p>Relative paths also make sense, for example,
<code class="literal">dir:share/doc</code> would
match either <code class=
"filename">/usr/share/doc</code> or <code class=
"filename">/usr/local/share/doc</code></p>
<p>Several <code class="literal">dir</code> clauses
can be specified, both positive and negative. For
example the following makes sense:</p>
<pre class="programlisting">
dir:recoll dir:src -dir:utils -dir:common
</pre>
<p>This would select results which have both
<code class="filename">recoll</code> and <code class=
"filename">src</code> in the path (in any order), and
which have not either <code class=
"filename">utils</code> or <code class=
"filename">common</code>.</p>
<p>You can also use <code class="literal">OR</code>
conjunctions with <code class="literal">dir:</code>
clauses.</p>
<p>A special aspect of <code class=
"literal">dir</code> clauses is that the values in
the index are not transcoded to UTF-8, and never
lower-cased or unaccented, but stored as binary. This
means that you need to enter the values in the exact
lower or upper case, and that searches for names with
diacritics may sometimes be impossible because of
character set conversion issues. Non-ASCII UNIX file
paths are an unending source of trouble and are best
avoided.</p>
<p>You need to use double-quotes around the path
value if it contains space characters.</p>
</li>
<li class="listitem">
<p><code class="literal">size</code> for filtering
the results on file size. Example: <code class=
"literal">size&lt;10000</code>. You can use
<code class="literal">&lt;</code>, <code class=
"literal">&gt;</code> or <code class=
"literal">=</code> as operators. You can specify a
range like the following: <code class=
"literal">size&gt;100 size&lt;1000</code>. The usual
<code class="literal">k/K, m/M, g/G, t/T</code> can
be used as (decimal) multipliers. Ex: <code class=
"literal">size&gt;1k</code> to search for files
bigger than 1000 bytes.</p>
</li>
<li class="listitem">
<p><code class="literal">date</code> for searching or
filtering on dates. The syntax for the argument is
based on the ISO8601 standard for dates and time
intervals. Only dates are supported, no times. The
general syntax is 2 elements separated by a
<code class="literal">/</code> character. Each
element can be a date or a period of time. Periods
are specified as <code class=
"literal">P</code><em class=
"replaceable"><code>n</code></em><code class=
"literal">Y</code><em class=
"replaceable"><code>n</code></em><code class=
"literal">M</code><em class=
"replaceable"><code>n</code></em><code class=
"literal">D</code>. The <em class=
"replaceable"><code>n</code></em> numbers are the
respective numbers of years, months or days, any of
which may be missing. Dates are specified as
<em class=
"replaceable"><code>YYYY</code></em>-<em class=
"replaceable"><code>MM</code></em>-<em class=
"replaceable"><code>DD</code></em>. The days and
months parts may be missing. If the <code class=
"literal">/</code> is present but an element is
missing, the missing element is interpreted as the
lowest or highest date in the index. Examples:</p>
<div class="itemizedlist">
<ul class="itemizedlist" style=
"list-style-type: circle;">
<li class="listitem">
<p><code class=
"literal">2001-03-01/2002-05-01</code> the
basic syntax for an interval of dates.</p>
</li>
<li class="listitem">
<p><code class=
"literal">2001-03-01/P1Y2M</code> the same
specified with a period.</p>
</li>
<li class="listitem">
<p><code class="literal">2001/</code> from the
beginning of 2001 to the latest date in the
index.</p>
</li>
<li class="listitem">
<p><code class="literal">2001</code> the whole
year of 2001.</p>
</li>
<li class="listitem">
<p><code class="literal">P2D/</code> means 2
days ago up to now if there are no documents
with dates in the future.</p>
</li>
<li class="listitem">
<p><code class="literal">/2003</code> all
documents from 2003 or older.</p>
</li>
</ul>
</div> </div>
<p>Periods can also be specified with small letters </div>
(ie: p2y).</p> </div>
</li> <p>The field syntax also supports a few field-like, but
<li class="listitem"> special, criteria, for which the values are interpreted
<p><code class="literal">mime</code> or <code class= differently. Regular processing does not apply (for
"literal">format</code> for specifying the MIME type. example the slash- or comma- separated lists don't work).
These clauses are processed besides the normal A list follows.</p>
Boolean logic of the search. Multiple values will be <div class="itemizedlist">
OR'ed (instead of the normal AND). You can specify <ul class="itemizedlist" style=
types to be excluded, with the usual <code class= "list-style-type: disc;">
"literal">-</code>, and use wildcards. Example: <li class="listitem">
<em class="replaceable"><code>mime:text/* <p><a name="RCL.SEARCH.LANG.SPECIALFIELDS.DIR" id=
-mime:text/plain</code></em> Specifying an explicit "RCL.SEARCH.LANG.SPECIALFIELDS.DIR"></a><code class="literal">dir</code>
boolean operator before a <code class= for filtering the results on file location. For
"literal">mime</code> specification is not supported example, <code class=
and will produce strange results.</p> "literal">dir:/home/me/somedir</code> will restrict
</li> the search to results found anywhere under the
<li class="listitem"> <em class=
<p><code class="literal">type</code> or <code class= "replaceable"><code>/home/me/somedir</code></em>
"literal">rclcat</code> for specifying the category directory (including subdirectories).</p>
(as in text/media/presentation/etc.). The <p>Tilde expansion will be performed as usual.
classification of MIME types in categories is defined Wildcards will be expanded, but please <a class=
in the <span class="application">Recoll</span> "link" href="#RCL.SEARCH.WILDCARDS.PATH" title=
configuration (<code class= "Wildcards and path filtering">have a look</a> at
"filename">mimeconf</code>), and can be modified or an important limitation of wildcards in path
extended. The default category names are those which filters.</p>
permit filtering results in the main GUI screen. <p>You can also use relative paths. For example,
Categories are OR'ed like MIME types above, and can <code class="literal">dir:share/doc</code> would
be negated with <code class="literal">-</code>.</p> match either <code class=
</li> "filename">/usr/share/doc</code> or <code class=
<li class="listitem"> "filename">/usr/local/share/doc</code>.</p>
<p><code class="literal">issub</code> for specifying <p><code class="literal">-dir</code> will find
that only standalone (<code class= results <span class="emphasis"><em>not</em></span>
"literal">issub:0</code>) or only embedded in the specified location.</p>
(<code class="literal">issub:1</code>) documents <p>Several <code class="literal">dir</code> clauses
should be returned as results.</p> can be specified, both positive and negative. For
</li> example the following makes sense:</p>
</ul> <pre class=
</div> "programlisting">dir:recoll dir:src -dir:utils -dir:common</pre>
<div class="note" style= <p>This would select results which have both
"margin-left: 0.5in; margin-right: 0.5in;"> <code class="filename">recoll</code> and
<h3 class="title">Note</h3> <code class="filename">src</code> in the path (in
<p><code class="literal">mime</code>, <code class= any order), and which have not either <code class=
"literal">rclcat</code>, <code class= "filename">utils</code> or <code class=
"literal">size</code>, <code class="literal">issub</code> "filename">common</code>.</p>
and <code class="literal">date</code> criteria always <p>You can also use <code class="literal">OR</code>
affect the whole query (they are applied as a final conjunctions with <code class="literal">dir:</code>
filter), even if set with other terms inside a clauses.</p>
parenthese.</p> <p>A special aspect of <code class=
</div> "literal">dir</code> clauses is that the values in
<div class="note" style= the index are not transcoded to UTF-8, and never
"margin-left: 0.5in; margin-right: 0.5in;"> lower-cased or unaccented, but stored as binary.
<h3 class="title">Note</h3> This means that you need to enter the values in the
<p><code class="literal">mime</code> (or the equivalent exact lower or upper case, and that searches for
<code class="literal">rclcat</code>) is the <span class= names with diacritics may sometimes be impossible
"emphasis"><em>only</em></span> field with an because of character set conversion issues.
<code class="literal">OR</code> default. You do need to Non-ASCII UNIX file paths are an unending source of
use <code class="literal">OR</code> with <code class= trouble and are best avoided.</p>
"literal">ext</code> terms for example.</p> <p>You need to use double-quotes around the path
value if it contains space characters.</p>
<p>The shortcut syntax for lists within a field
(commas for AND, slashes for OR) is not available for
<code class="literal">dir</code> clauses.</p>
</li>
<li class="listitem">
<p><code class="literal">size</code> for filtering
the results on file size. Example: <code class=
"literal">size&lt;10000</code>. You can use
<code class="literal">&lt;</code>, <code class=
"literal">&gt;</code> or <code class=
"literal">=</code> as operators. You can specify a
range like the following: <code class=
"literal">size&gt;100 size&lt;1000</code>. The
usual <code class="literal">k/K, m/M, g/G,
t/T</code> can be used as (decimal) multipliers.
Ex: <code class="literal">size&gt;1k</code> to
search for files bigger than 1000 bytes.</p>
</li>
<li class="listitem">
<p><code class="literal">date</code> for searching
or filtering on dates. The syntax for the argument
is based on the ISO8601 standard for dates and time
intervals. Only dates are supported, no times. The
general syntax is 2 elements separated by a
<code class="literal">/</code> character. Each
element can be a date or a period of time. Periods
are specified as <code class=
"literal">P</code><em class=
"replaceable"><code>n</code></em><code class=
"literal">Y</code><em class=
"replaceable"><code>n</code></em><code class=
"literal">M</code><em class=
"replaceable"><code>n</code></em><code class=
"literal">D</code>. The <em class=
"replaceable"><code>n</code></em> numbers are the
respective numbers of years, months or days, any of
which may be missing. Dates are specified as
<em class=
"replaceable"><code>YYYY</code></em>-<em class=
"replaceable"><code>MM</code></em>-<em class=
"replaceable"><code>DD</code></em>. The days and
months parts may be missing. If the <code class=
"literal">/</code> is present but an element is
missing, the missing element is interpreted as the
lowest or highest date in the index. Examples:</p>
<div class="itemizedlist">
<ul class="itemizedlist" style=
"list-style-type: circle;">
<li class="listitem">
<p><code class=
"literal">2001-03-01/2002-05-01</code> the
basic syntax for an interval of dates.</p>
</li>
<li class="listitem">
<p><code class=
"literal">2001-03-01/P1Y2M</code> the same
specified with a period.</p>
</li>
<li class="listitem">
<p><code class="literal">2001/</code> from
the beginning of 2001 to the latest date in
the index.</p>
</li>
<li class="listitem">
<p><code class="literal">2001</code> the
whole year of 2001.</p>
</li>
<li class="listitem">
<p><code class="literal">P2D/</code> means 2
days ago up to now if there are no documents
with dates in the future.</p>
</li>
<li class="listitem">
<p><code class="literal">/2003</code> all
documents from 2003 or older.</p>
</li>
</ul>
</div>
<p>Periods can also be specified with lower-case letters
(e.g. <code class="literal">p2y</code>).</p>
</li>
<li class="listitem">
<p><code class="literal">mime</code> or
<code class="literal">format</code> for specifying
the MIME type. These clauses are processed apart
from the normal Boolean logic of the search:
multiple values will be OR'ed (instead of the
normal AND). You can specify types to be excluded,
with the usual <code class="literal">-</code>, and
use wildcards. Example: <em class=
"replaceable"><code>mime:text/*
-mime:text/plain</code></em>. Specifying an
explicit boolean operator before a <code class=
"literal">mime</code> specification is not
supported and will produce strange results.</p>
</li>
<li class="listitem">
<p><code class="literal">type</code> or
<code class="literal">rclcat</code> for specifying
the category (as in text/media/presentation/etc.).
The classification of MIME types in categories is
defined in the <span class=
"application">Recoll</span> configuration
(<code class="filename">mimeconf</code>), and can
be modified or extended. The default category names
are those which permit filtering results in the
main GUI screen. Categories are OR'ed like MIME
types above, and can be negated with <code class=
"literal">-</code>.</p>
</li>
<li class="listitem">
<p><code class="literal">issub</code> for
specifying that only standalone (<code class=
"literal">issub:0</code>) or only embedded
(<code class="literal">issub:1</code>) documents
should be returned as results.</p>
</li>
</ul>
</div>
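<p>As a rough illustration only, the queries below combine
the special specifiers described in the list above. The path,
sizes and dates are made-up values, and the <code class=
"literal">date:</code> form assumes the usual
field-colon-value syntax:</p>
<pre class=
"programlisting">size&gt;100k size&lt;2m
date:2001-03-01/P1Y2M
mime:text/* -mime:text/plain
dir:"/home/me/my documents" issub:0</pre>
<p>Under those assumptions, the first query would keep files
between 100000 and 2000000 bytes, the second documents dated
within 1 year and 2 months after 2001-03-01, the third all
text types except plain text, and the last standalone
documents under a directory whose name contains a space.</p>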
<div class="note" style=
"margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p><code class="literal">mime</code>, <code class=
"literal">rclcat</code>, <code class=
"literal">size</code>, <code class=
"literal">issub</code> and <code class=
"literal">date</code> criteria always affect the whole
query (they are applied as a final filter), even if set
with other terms inside parentheses.</p>
</div>
<div class="note" style=
"margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p><code class="literal">mime</code> (or the equivalent
<code class="literal">rclcat</code>) is the
<span class="emphasis"><em>only</em></span> field with
an <code class="literal">OR</code> default. Other fields,
<code class="literal">ext</code> for example, need an
explicit <code class="literal">OR</code>.</p>
</div>
</div> </div>
<div class="sect2"> <div class="sect2">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h3 class="title"><a name="RCL.SEARCH.LANG.RANGES" <h3 class="title"><a name="RCL.SEARCH.LANG.RANGES"
id="RCL.SEARCH.LANG.RANGES"></a>3.5.1.&nbsp;Range id="RCL.SEARCH.LANG.RANGES"></a>3.5.3.&nbsp;Range
clauses</h3> clauses</h3>
</div> </div>
</div> </div>
@ -5634,7 +5702,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<div> <div>
<h3 class="title"><a name= <h3 class="title"><a name=
"RCL.SEARCH.LANG.MODIFIERS" id= "RCL.SEARCH.LANG.MODIFIERS" id=
"RCL.SEARCH.LANG.MODIFIERS"></a>3.5.2.&nbsp;Modifiers</h3> "RCL.SEARCH.LANG.MODIFIERS"></a>3.5.4.&nbsp;Modifiers</h3>
</div> </div>
</div> </div>
</div> </div>
@ -5698,8 +5766,8 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<div> <div>
<h2 class="title" style="clear: both"><a name= <h2 class="title" style="clear: both"><a name=
"RCL.SEARCH.ANCHORWILD" id= "RCL.SEARCH.ANCHORWILD" id=
"RCL.SEARCH.ANCHORWILD"></a>3.6.&nbsp;Anchored "RCL.SEARCH.ANCHORWILD"></a>3.6.&nbsp;Wildcards and
searches and wildcards</h2> anchored searches</h2>
</div> </div>
</div> </div>
</div> </div>
@ -5714,8 +5782,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<div> <div>
<div> <div>
<h3 class="title"><a name="RCL.SEARCH.WILDCARDS" <h3 class="title"><a name="RCL.SEARCH.WILDCARDS"
id="RCL.SEARCH.WILDCARDS"></a>3.6.1.&nbsp;More id="RCL.SEARCH.WILDCARDS"></a>3.6.1.&nbsp;Wildcards</h3>
about wildcards</h3>
</div> </div>
</div> </div>
</div> </div>

View File

@ -1399,26 +1399,22 @@ metadatacmds = ; <replaceable>tags</replaceable> = tmsu tags %f
extend the <link linkend="RCL.PROGRAM.FIELDS">field extend the <link linkend="RCL.PROGRAM.FIELDS">field
configuration</link>.</para> configuration</link>.</para>
<para>Once re-indexing is performed (you will need to force the file <para>Once re-indexing is performed (you will need to force the file reindexing, &RCL; will
reindexing, &RCL; will not detect the need by itself), you will be not detect the need by itself), you will be able to search from the query language, through
able to search from the query language, through any of its aliases: any of its aliases: <replaceable>tags:some/alternate/values</replaceable>
<replaceable>tags:some/alternate/values</replaceable> or or <replaceable>tags:all,these,values</replaceable>. The compact comma- or slash-based field
<replaceable>tags:all,these,values</replaceable> (the compact field search search syntax is supported for recoll 1.20 and later. For older versions, you would need to
syntax is supported for recoll 1.20 and later. For older versions, repeat the <replaceable>tags:</replaceable> specifier for each term,
you would need to repeat the <replaceable>tags:</replaceable> e.g. <replaceable>tags:some</replaceable> <literal>OR</literal>
specifier for each term, e.g. <replaceable>tags:some</replaceable> <replaceable>tags:alternate</replaceable>.</para>
<literal>OR</literal>
<replaceable>tags:alternate</replaceable>).</para>
<para>Tags changes will not be detected by <para>Tags changes will not be detected by the indexer if the file itself did not change. One
the indexer if the file itself did not change. One possible possible workaround would be to update the file <literal>ctime</literal> when you modify the
workaround would be to update the file <literal>ctime</literal> when tags, which would be consistent with how extended attributes function. A pair
you modify the tags, which of <command>chmod</command> commands could accomplish this, or a
would be consistent with how extended attributes function. A pair of <literal>touch -a</literal>.
<command>chmod</command> commands could accomplish this, or a Alternatively, just couple the tag update with a
<literal>touch -a</literal> . Alternatively, just <literal>recollindex -e -i</literal> <replaceable>/path/to/the/file</replaceable>.</para>
couple the tag update with a
<literal>recollindex -e -i</literal> <replaceable>/path/to/the/file</replaceable>.</para>
</sect1> </sect1>
@ -1918,11 +1914,12 @@ fs.inotify.max_user_watches=32768
<itemizedlist> <itemizedlist>
<listitem><para>In <literal>All Terms</literal> mode, &RCL; looks <listitem><para>In <literal>All Terms</literal> mode, &RCL; looks
for documents containing all your input terms.</para></listitem> for documents containing all your input terms.</para></listitem>
<listitem><para><literal>Query Language</literal> mode behaves like
<literal>All Terms</literal> in the absence of special input, but <listitem><para>The <literal>Query Language</literal> mode behaves like <literal>All
it can also do much more. This is the best mode for getting the Terms</literal> in the absence of special input, but it can also do much more. This is the
most of &RCL;.</para></listitem> best mode for getting the most of &RCL;. It is usable from all possible interfaces (GUI,
command line, WEB UI, ...), and is <link linkend="RCL.SEARCH.LANG">described
here</link>.</para></listitem>
<listitem><para>In <literal>Any Term</literal> mode, &RCL; looks <listitem><para>In <literal>Any Term</literal> mode, &RCL; looks
for documents containing any of your input terms, preferring those for documents containing any of your input terms, preferring those
@ -2067,8 +2064,8 @@ fs.inotify.max_user_watches=32768
<para>The <guilabel>File name</guilabel> search mode will <para>The <guilabel>File name</guilabel> search mode will
specifically look for file names. The point of having a separate specifically look for file names. The point of having a separate
file name search is that wild card expansion can be performed more file name search is that wildcard expansion can be performed more
efficiently on a small subset of the index (allowing wild cards on efficiently on a small subset of the index (allowing wildcards on
the left of terms without excessive cost). Things to know: the left of terms without excessive cost). Things to know:
<itemizedlist> <itemizedlist>
<listitem><para>White space in the entry should match white <listitem><para>White space in the entry should match white
@ -2077,7 +2074,7 @@ fs.inotify.max_user_watches=32768
<listitem><para>The search is insensitive to character case and <listitem><para>The search is insensitive to character case and
accents, independently of the type of index.</para> accents, independently of the type of index.</para>
</listitem> </listitem>
<listitem><para>An entry without any wild card <listitem><para>An entry without any wildcard
character and not capitalized will be prepended and appended character and not capitalized will be prepended and appended
with '*' (ie: <replaceable>etc</replaceable> -> with '*' (ie: <replaceable>etc</replaceable> ->
<replaceable>*etc*</replaceable>, but <replaceable>*etc*</replaceable>, but
@ -3940,24 +3937,26 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<sect1 id="RCL.SEARCH.LANG"> <sect1 id="RCL.SEARCH.LANG">
<title>The query language</title> <title>The query language</title>
<para>The query language processor is activated in the GUI <para>The &RCL; query language was based on the now defunct
simple search entry when the search mode selector is set to <ulink url="http://www.xesam.org/main/XesamUserSearchLanguage95">
<guilabel>Query Language</guilabel>. It can also be used with the KIO Xesam</ulink> user search language specification. It allows defining general boolean
slave or the command line search. It broadly has the same searches within the main body text or specific fields, and has many additional features,
capabilities as the complex search interface in the broadly equivalent to those provided by <emphasis>complex search</emphasis> interface in the
GUI.</para> GUI.</para>
<para>The language was based on the now defunct <para>The query language processor is activated in the GUI simple search entry when the search
<ulink url="http://www.xesam.org/main/XesamUserSearchLanguage95"> mode selector is set to <literal>Query Language</literal>. It can also be used from the
Xesam</ulink> user search language specification.</para> command line search, the KIO slave, or the WEB UI.</para>
<para>If the results of a query language search puzzle you and you <para>If the results of a query language search puzzle you and you
doubt what has been actually searched for, you can use the GUI doubt what has been actually searched for, you can use the GUI <literal>Show Query</literal>
<literal>Show Query</literal> link at the top of the result list to link at the top of the result list to check the exact query which was finally executed by
check the exact query which was finally executed by Xapian.</para> Xapian.</para>
<para>Here follows a sample request that we are going to <sect2 id="RCL.SEARCH.LANG.SYNTAX">
explain:</para> <title>General syntax</title>
<para>Here follows a sample request that we are going to explain:</para>
<programlisting> <programlisting>
author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
@ -3977,10 +3976,12 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<para>An element is composed of an optional field specification, <para>An element is composed of an optional field specification,
and a value, separated by a colon (the field separator is the last and a value, separated by a colon (the field separator is the last
colon in the element). Examples: colon in the element). Examples:
<replaceable>Eugenie</replaceable>, <itemizedlist>
<replaceable>author:balzac</replaceable>, <listitem><replaceable>Eugenie</replaceable></listitem>
<replaceable>dc:title:grandet</replaceable> <listitem><replaceable>author:balzac</replaceable></listitem>
<replaceable>dc:title:"eugenie grandet"</replaceable> <listitem><replaceable>dc:title:grandet</replaceable></listitem>
<listitem><replaceable>dc:title:"eugenie grandet"</replaceable></listitem>
</itemizedlist>
</para> </para>
<para>The colon, if present, means "contains". Xesam defines other <para>The colon, if present, means "contains". Xesam defines other
@ -4005,41 +4006,38 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<replaceable>word2</replaceable>) <literal>OR</literal> <replaceable>word2</replaceable>) <literal>OR</literal>
<replaceable>word3</replaceable>. </para> <replaceable>word3</replaceable>. </para>
<para>&RCL; versions 1.21 and later, allow using parentheses to <para>You can use parentheses to group elements (from version 1.21), which will sometimes make
group elements, which will sometimes make things clearer, and may things clearer, and may allow expressing combinations which would have been difficult
allow expressing combinations which would have been difficult
otherwise.</para> otherwise.</para>
<para>An element preceded by a <literal>-</literal> specifies a <para>An element preceded by a <literal>-</literal> specifies a
term that should <emphasis>not</emphasis> appear.</para> term that should <emphasis>not</emphasis> appear.</para>
<para>As usual, words inside quotes define a phrase <para>As usual, words inside quotes define a phrase (the order of words is significant), so
(the order of words is significant), so that that <replaceable>title:"prejudice pride"</replaceable> is not the same
<replaceable>title:"prejudice pride"</replaceable> is not the same as as <replaceable>title:prejudice title:pride</replaceable>, and is unlikely to find a
<replaceable>title:prejudice title:pride</replaceable>, and is result.</para>
unlikely to find a result.</para>
<para>Words inside phrases and capitalized words are not <para>Words inside phrases and capitalized words are not stem-expanded. Wildcards may be used
stem-expanded. Wildcards may be used anywhere inside a term. anywhere inside a term. Specifying a wildcard on the left of a term can produce a very slow
Specifying a wild-card on the left of a term can produce a very search (or even an incorrect one if the expansion is truncated because of excessive
slow search (or even an incorrect one if the expansion is size). Also see <link linkend="RCL.SEARCH.WILDCARDS">More about wildcards</link>.
truncated because of excessive size). Also see
<link linkend="RCL.SEARCH.WILDCARDS">More about wildcards</link>.
</para> </para>
<para>To save you some typing, recent &RCL; versions (1.20 and later) <para>To save you some typing, &RCL; versions 1.20 and later
interpret a comma-separated list of terms for a field as an AND list interpret a field value given as a comma-separated list of terms as an AND list and a
inside the field. Use slash characters ('/') for an OR list. No white slash-separated list as an OR list. No white space is
space is allowed. So allowed. So <programlisting>author:john,lennon</programlisting> will search for documents
<programlisting>author:john,lennon</programlisting> will search for with <literal>john</literal> and <literal>lennon</literal> inside
documents with <literal>john</literal> and <literal>lennon</literal> the <literal>author</literal> field (in any order),
inside the <literal>author</literal> field (in any order), and and <programlisting>author:john/ringo</programlisting> would search
<programlisting>author:john/ringo</programlisting> would search for for <literal>john</literal> or <literal>ringo</literal>. This behaviour is only triggered by
<literal>john</literal> or <literal>ringo</literal>. This behaviour a field prefix: without it, comma- or slash- separated input will produce a phrase
only happens for field queries (input without a field, comma- or search. However, you can use a <literal>text</literal> field name to search the main text
slash- separated input will produce a phrase search). You can use a this way, as an alternate to using an explicit <literal>OR</literal>,
<literal>text</literal> field name to search the main text this e.g. <literal>text:napoleon/bonaparte</literal> would generate a search
way.</para> for <replaceable>napoleon</replaceable> or <replaceable>bonaparte</replaceable> in the main
text body.</para>
<para>Modifiers can be set on a double-quote value, for example to specify <para>Modifiers can be set on a double-quote value, for example to specify
a proximity search (unordered). See a proximity search (unordered). See
@ -4073,23 +4071,20 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
</listitem> </listitem>
<listitem><para><literal>filename</literal> for the document's <listitem><para><literal>filename</literal> for the document's
file name. This is not necessarily set for all documents: file name. You can use the shorter <literal>fn</literal> alias. This value is not set
internal documents contained inside a compound one (for example for all documents: internal documents contained inside a compound one (for example an
an EPUB section) do not inherit the container file name any more, EPUB section) do not inherit the container file name any more, this was replaced by an
this was replaced by an explicit field (see next). Sub-documents explicit field (see next). Sub-documents can still have a <literal>filename</literal>,
can still have a specific <literal>filename</literal>, if it is if it is implied by the document format, for example the attachment file name for an
implied by the document format, for example the attachment file email attachment.</para></listitem>
name for an email attachment.</para></listitem>
<listitem><para><literal>containerfilename</literal>. This is <listitem><para><literal>containerfilename</literal>, aliased
set for all documents, both top-level and contained as <literal>cfn</literal>. This is set for all documents, both top-level and contained
sub-documents, and is always the name of the filesystem directory sub-documents, and is always the name of the filesystem file which contains the
entry which contains the data. The terms from this field can data. The terms from this field can only be matched by an explicit field specification
only be matched by an explicit field specification (as opposed (as opposed to terms from <literal>filename</literal> which are also indexed as general
to terms from <literal>filename</literal> which are also indexed document content). This avoids getting matches for all the sub-documents when searching
as general document content). This avoids getting matches for for the container file name.</para></listitem>
all the sub-documents when searching for the container file
name.</para></listitem>
<listitem><para><literal>ext</literal> specifies the file <listitem><para><literal>ext</literal> specifies the file
name extension name extension
@ -4106,66 +4101,69 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
</itemizedlist> </itemizedlist>
<para>&RCL; 1.20 and later have a way to specify aliases for the <para>You can define aliases for field names, in order to use your preferred denomination or
field names, which will save typing, for example by aliasing to save typing (e.g. the predefined <literal>fn</literal> and <literal>cfn</literal> aliases
<literal>filename</literal> to <replaceable>fn</replaceable> or defined for <literal>filename</literal> and <literal>containerfilename</literal>). See
<literal>containerfilename</literal> to the <link linkend="RCL.INSTALL.CONFIG.FIELDS">section about the <filename>fields</filename>
<replaceable>cfn</replaceable>. See the file</link>.
<link linkend="RCL.INSTALL.CONFIG.FIELDS">section about the <filename>fields</filename> file</link>.
</para> </para>
<para>The document input handlers used while indexing have the <para>The document input handlers have the possibility to create other fields with arbitrary
possibility to create other fields with arbitrary names, and names, and aliases may be defined in the configuration, so that the exact field search
aliases may be defined in the configuration, so that the exact possibilities may be different for you if someone took care of the customisation.</para>
field search possibilities may be different for you if someone </sect2>
took care of the customisation.</para>
<para>The field syntax also supports a few field-like, but <sect2 id="RCL.SEARCH.LANG.SPECIALFIELDS">
special, criteria:</para> <title>Special field-like specifiers</title>
<para>The field syntax also supports a few field-like, but special, criteria, for which the
values are interpreted differently. Regular processing does not apply (for example the
slash- or comma- separated lists don't work). A list follows.</para>
<itemizedlist> <itemizedlist>
<listitem><para><literal>dir</literal> for filtering the <listitem>
results on file location <para id="RCL.SEARCH.LANG.SPECIALFIELDS.DIR"><literal>dir</literal> for filtering the
(Ex: <literal>dir:/home/me/somedir</literal>). results on file location. For example, <literal>dir:/home/me/somedir</literal> will
<literal>-dir</literal> restrict the search to results found anywhere under
also works to find results not in the specified directory the <replaceable>/home/me/somedir</replaceable> directory (including
(release >= 1.15.8). Tilde expansion will be performed as subdirectories).</para>
usual (except for a bug in versions 1.19 to
1.19.11p1). Wildcards will be expanded, but
please
<link linkend="RCL.SEARCH.WILDCARDS.PATH"> have a look</link>
at an important limitation of wildcards in path filters.</para>
<para>Relative paths also make sense, for example, <para>Tilde expansion will be performed as usual. Wildcards will be expanded, but
<literal>dir:share/doc</literal> would match either please <link linkend="RCL.SEARCH.WILDCARDS.PATH"> have a look</link> at an important
<filename>/usr/share/doc</filename> or limitation of wildcards in path filters.</para>
<filename>/usr/local/share/doc</filename> </para>
<para>Several <literal>dir</literal> clauses can be specified, <para>You can also use relative paths. For example, <literal>dir:share/doc</literal> would
both positive and negative. For example the following makes sense: match either <filename>/usr/share/doc</filename>
<programlisting> or <filename>/usr/local/share/doc</filename>.</para>
dir:recoll dir:src -dir:utils -dir:common
</programlisting> This would select results which have both
<filename>recoll</filename> and <filename>src</filename> in the
path (in any order), and which have not either
<filename>utils</filename> or
<filename>common</filename>.</para>
<para>You can also use <literal>OR</literal> conjunctions <para><literal>-dir</literal> will find
with <literal>dir:</literal> clauses.</para> results <emphasis>not</emphasis> in the specified location.</para>
<para>Several <literal>dir</literal> clauses can be specified,
both positive and negative. For example the following makes sense:
<programlisting>dir:recoll dir:src -dir:utils -dir:common</programlisting>
This would select results which have both
<filename>recoll</filename> and <filename>src</filename> in the
path (in any order), and which have not either
<filename>utils</filename> or
<filename>common</filename>.</para>
<para>You can also use <literal>OR</literal> conjunctions
with <literal>dir:</literal> clauses.</para>
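<para>As a minimal illustration (the directory names are
hypothetical), the following would select results which have either
<filename>src</filename> or <filename>include</filename> in the path:
<programlisting>dir:src OR dir:include</programlisting></para>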
<para>A special aspect of <literal>dir</literal> clauses is <para>A special aspect of <literal>dir</literal> clauses is
that the values in the index are not transcoded to UTF-8, and that the values in the index are not transcoded to UTF-8, and never lower-cased or
never lower-cased or unaccented, but stored as binary. This means unaccented, but stored as binary. This means that you need to enter the values in the
that you need to enter the values in the exact lower or upper exact lower or upper case, and that searches for names with diacritics may sometimes be
case, and that searches for names with diacritics may sometimes impossible because of character set conversion issues. Non-ASCII UNIX file paths are an
be impossible because of character set conversion unending source of trouble and are best avoided.</para>
issues. Non-ASCII UNIX file paths are an unending source of
trouble and are best avoided.</para>
<para>You need to use double-quotes around the path value if it <para>You need to use double-quotes around the path value if it contains space
contains space characters.</para> characters.</para>
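<para>As a small sketch of the quoting rule (the path is a made-up
example), a directory name containing a space would be entered as:
<programlisting>dir:"/home/me/my documents"</programlisting></para>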
<para>The shortcut syntax for defining OR or AND lists within a field, using commas or slash
characters, is not available for <literal>dir</literal> clauses.</para>
</listitem> </listitem>
@ -4219,17 +4217,13 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
p2y).</para> p2y).</para>
</listitem> </listitem>
<listitem><para><literal>mime</literal> or <listitem><para><literal>mime</literal> or <literal>format</literal> for specifying the MIME
<literal>format</literal> for specifying the type. These clauses are processed apart from the normal Boolean logic of the search:
MIME type. These clauses are processed besides the normal multiple values will be OR'ed (instead of the normal AND). You can specify types to be
Boolean logic of the search. Multiple values will be OR'ed
(instead of the normal AND). You can specify types to be
excluded, with the usual <literal>-</literal>, and use excluded, with the usual <literal>-</literal>, and use
wildcards. Example: <replaceable>mime:text/* wildcards. Example: <replaceable>mime:text/* -mime:text/plain</replaceable>. Specifying an
-mime:text/plain</replaceable> explicit boolean operator before a <literal>mime</literal> specification is not supported
Specifying an explicit boolean and will produce strange results. </para>
operator before a <literal>mime</literal> specification is not
supported and will produce strange results. </para>
</listitem> </listitem>
<listitem><para><literal>type</literal> or <listitem><para><literal>type</literal> or
@ -4264,6 +4258,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
field with an <literal>OR</literal> default. You do need to use field with an <literal>OR</literal> default. You do need to use
<literal>OR</literal> with <literal>ext</literal> terms for <literal>OR</literal> with <literal>ext</literal> terms for
example.</para> </note> example.</para> </note>
</sect2>
<sect2 id="RCL.SEARCH.LANG.RANGES"> <sect2 id="RCL.SEARCH.LANG.RANGES">
<title>Range clauses</title> <title>Range clauses</title>
@ -4343,20 +4338,18 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
</sect1> <!-- rcl.search.lang --> </sect1> <!-- rcl.search.lang -->
<sect1 id="RCL.SEARCH.ANCHORWILD"> <sect1 id="RCL.SEARCH.ANCHORWILD">
<title>Anchored searches and wildcards</title> <title>Wildcards and anchored searches</title>
<para>Some special characters are interpreted by &RCL; in search <para>Some special characters are interpreted by &RCL; in search
strings to expand or specialize the search. Wildcards expand a root strings to expand or specialize the search. Wildcards expand a root term in controlled
term in controlled ways. Anchor characters can restrict a search to ways. Anchor characters can restrict a search to succeed only if the match is found at or
succeed only if the match is found at or near the beginning of the near the beginning of the document or one of its fields.</para>
document or one of its fields.</para>
<sect2 id="RCL.SEARCH.WILDCARDS"> <sect2 id="RCL.SEARCH.WILDCARDS">
<title>More about wildcards</title> <title>Wildcards</title>
<para>All words entered in &RCL; search fields will be processed <para>All words entered in &RCL; search fields will be processed
for wildcard expansion before the request is finally for wildcard expansion before the request is finally executed.</para>
executed.</para>
<para>The wildcard characters are:</para> <para>The wildcard characters are:</para>
@ -4376,8 +4369,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para>You should be aware of a few things when using <para>You should be aware of a few things when using wildcards.</para>
wildcards.</para>
<itemizedlist> <itemizedlist>
<listitem><para>Using a wildcard character at the beginning of <listitem><para>Using a wildcard character at the beginning of