Jean-Francois Dockes 2019-04-12 12:01:12 +02:00
parent ad89225b24
commit 3ebf1a7db2
3 changed files with 819 additions and 863 deletions

@ -17,8 +17,9 @@ XSLDIR="/usr/share/xml/docbook/stylesheet/docbook-xsl/"
# Options common to the single-file and chunked versions # Options common to the single-file and chunked versions
commonoptions=--stringparam section.autolabel 1 \ commonoptions=--stringparam section.autolabel 1 \
--stringparam section.autolabel.max.depth 3 \ --stringparam section.autolabel.max.depth 2 \
--stringparam section.label.includes.component.label 1 \ --stringparam section.label.includes.component.label 1 \
--stringparam toc.max.depth 3 \
--stringparam autotoc.label.in.hyperlink 0 \ --stringparam autotoc.label.in.hyperlink 0 \
--stringparam abstract.notitle.enabled 1 \ --stringparam abstract.notitle.enabled 1 \
--stringparam html.stylesheet docbook-xsl.css \ --stringparam html.stylesheet docbook-xsl.css \

@ -1429,7 +1429,7 @@ alink="#0000FF">
other constraints. Most of the relevant parameters are other constraints. Most of the relevant parameters are
described in the <a class="link" href= described in the <a class="link" href=
"#RCL.INSTALL.CONFIG.RECOLLCONF.TERMS" title= "#RCL.INSTALL.CONFIG.RECOLLCONF.TERMS" title=
"6.4.2.2.&nbsp;Parameters affecting how we generate terms and organize the index"> "Parameters affecting how we generate terms and organize the index">
linked section</a>.</p> linked section</a>.</p>
<p>The different search interfaces (GUI, command line, <p>The different search interfaces (GUI, command line,
...) have different methods to define the set of indexes ...) have different methods to define the set of indexes
@ -2362,7 +2362,7 @@ recoll -c <em class=
"varname">mondelaypatterns</code> parameter in the "varname">mondelaypatterns</code> parameter in the
<a class="link" href= <a class="link" href=
"#RCL.INSTALL.CONFIG.RECOLLCONF.MISC" title= "#RCL.INSTALL.CONFIG.RECOLLCONF.MISC" title=
"6.4.2.5.&nbsp;Miscellaneous parameters">configuration "Miscellaneous parameters">configuration
section</a>.</p> section</a>.</p>
</div> </div>
</div> </div>
@ -2655,8 +2655,7 @@ recoll -c <em class=
<p>The format of the result list entries is entirely <p>The format of the result list entries is entirely
configurable by using the preference dialog to <a class= configurable by using the preference dialog to <a class=
"link" href="#RCL.SEARCH.GUI.CUSTOM.RESLIST" title= "link" href="#RCL.SEARCH.GUI.CUSTOM.RESLIST" title=
"3.1.15.1.&nbsp;The result list format">edit an HTML "The result list format">edit an HTML fragment</a>.</p>
fragment</a>.</p>
<p>You can click on the <code class="literal">Query <p>You can click on the <code class="literal">Query
details</code> link at the top of the results page to see details</code> link at the top of the results page to see
the query actually performed, after stem expansion and the query actually performed, after stem expansion and
@ -2674,8 +2673,8 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.RESLIST.SUGGS" id= "RCL.SEARCH.GUI.RESLIST.SUGGS" id=
"RCL.SEARCH.GUI.RESLIST.SUGGS"></a>3.1.2.1.&nbsp;No "RCL.SEARCH.GUI.RESLIST.SUGGS"></a>No results:
results: the spelling suggestions</h4> the spelling suggestions</h4>
</div> </div>
</div> </div>
</div> </div>
@ -2696,8 +2695,8 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.RESULTLIST.MENU" id= "RCL.SEARCH.GUI.RESULTLIST.MENU" id=
"RCL.SEARCH.GUI.RESULTLIST.MENU"></a>3.1.2.2.&nbsp;The "RCL.SEARCH.GUI.RESULTLIST.MENU"></a>The result
result list right-click menu</h4> list right-click menu</h4>
</div> </div>
</div> </div>
</div> </div>
@ -2992,7 +2991,7 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.PREVIEW.SEARCH" id= "RCL.SEARCH.GUI.PREVIEW.SEARCH" id=
"RCL.SEARCH.GUI.PREVIEW.SEARCH"></a>3.1.6.1.&nbsp;Searching "RCL.SEARCH.GUI.PREVIEW.SEARCH"></a>Searching
inside the preview</h4> inside the preview</h4>
</div> </div>
</div> </div>
@ -3153,8 +3152,7 @@ recoll -c <em class=
<p><span class="application">Recoll</span> keeps a <p><span class="application">Recoll</span> keeps a
history of searches. See <a class="link" href= history of searches. See <a class="link" href=
"#RCL.SEARCH.GUI.COMPLEX.HISTORY" title= "#RCL.SEARCH.GUI.COMPLEX.HISTORY" title=
"3.1.8.3.&nbsp;Avanced search history">Advanced search "Avanced search history">Advanced search history</a>.</p>
history</a>.</p>
<p>The dialog has two tabs:</p> <p>The dialog has two tabs:</p>
<div class="orderedlist"> <div class="orderedlist">
<ol class="orderedlist" type="1"> <ol class="orderedlist" type="1">
@ -3184,7 +3182,7 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.COMPLEX.TERMS" id= "RCL.SEARCH.GUI.COMPLEX.TERMS" id=
"RCL.SEARCH.GUI.COMPLEX.TERMS"></a>3.1.8.1.&nbsp;Avanced "RCL.SEARCH.GUI.COMPLEX.TERMS"></a>Avanced
search: the "find" tab</h4> search: the "find" tab</h4>
</div> </div>
</div> </div>
@ -3256,7 +3254,7 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.COMPLEX.FILTER" id= "RCL.SEARCH.GUI.COMPLEX.FILTER" id=
"RCL.SEARCH.GUI.COMPLEX.FILTER"></a>3.1.8.2.&nbsp;Avanced "RCL.SEARCH.GUI.COMPLEX.FILTER"></a>Avanced
search: the "filter" tab</h4> search: the "filter" tab</h4>
</div> </div>
</div> </div>
@ -3324,7 +3322,7 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.COMPLEX.HISTORY" id= "RCL.SEARCH.GUI.COMPLEX.HISTORY" id=
"RCL.SEARCH.GUI.COMPLEX.HISTORY"></a>3.1.8.3.&nbsp;Avanced "RCL.SEARCH.GUI.COMPLEX.HISTORY"></a>Avanced
search history</h4> search history</h4>
</div> </div>
</div> </div>
@ -3590,8 +3588,8 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.TIPS.TERMS" id= "RCL.SEARCH.GUI.TIPS.TERMS" id=
"RCL.SEARCH.GUI.TIPS.TERMS"></a>3.1.13.1.&nbsp;Terms "RCL.SEARCH.GUI.TIPS.TERMS"></a>Terms and search
and search expansion</h4> expansion</h4>
</div> </div>
</div> </div>
</div> </div>
@ -3654,8 +3652,8 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.TIPS.PHRASES" id= "RCL.SEARCH.GUI.TIPS.PHRASES" id=
"RCL.SEARCH.GUI.TIPS.PHRASES"></a>3.1.13.2.&nbsp;Working "RCL.SEARCH.GUI.TIPS.PHRASES"></a>Working with
with phrases and proximity</h4> phrases and proximity</h4>
</div> </div>
</div> </div>
</div> </div>
@ -3711,7 +3709,7 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.TIPS.MISC" id= "RCL.SEARCH.GUI.TIPS.MISC" id=
"RCL.SEARCH.GUI.TIPS.MISC"></a>3.1.13.3.&nbsp;Others</h4> "RCL.SEARCH.GUI.TIPS.MISC"></a>Others</h4>
</div> </div>
</div> </div>
</div> </div>
@ -4019,8 +4017,8 @@ recoll -c <em class=
presentation of each result list entry. See the presentation of each result list entry. See the
<a class="link" href= <a class="link" href=
"#RCL.SEARCH.GUI.CUSTOM.RESLIST" title= "#RCL.SEARCH.GUI.CUSTOM.RESLIST" title=
"3.1.15.1.&nbsp;The result list format">result list "The result list format">result list customisation
customisation section</a>.</p> section</a>.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><a name="RCL.SEARCH.GUI.CUSTOM.RESULTHEAD" id= <p><a name="RCL.SEARCH.GUI.CUSTOM.RESULTHEAD" id=
@ -4030,8 +4028,8 @@ recoll -c <em class=
at the end of the result page HTML header. More at the end of the result page HTML header. More
detail in the <a class="link" href= detail in the <a class="link" href=
"#RCL.SEARCH.GUI.CUSTOM.RESLIST" title= "#RCL.SEARCH.GUI.CUSTOM.RESLIST" title=
"3.1.15.1.&nbsp;The result list format">result list "The result list format">result list customisation
customisation section.</a></p> section.</a></p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><span class="guilabel">Date format</span>: <p><span class="guilabel">Date format</span>:
@ -4158,8 +4156,8 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.GUI.CUSTOM.RESLIST" id= "RCL.SEARCH.GUI.CUSTOM.RESLIST" id=
"RCL.SEARCH.GUI.CUSTOM.RESLIST"></a>3.1.15.1.&nbsp;The "RCL.SEARCH.GUI.CUSTOM.RESLIST"></a>The result
result list format</h4> list format</h4>
</div> </div>
</div> </div>
</div> </div>
@ -4915,9 +4913,9 @@ recoll -c <em class=
for a bug in versions 1.19 to 1.19.11p1). Wildcards for a bug in versions 1.19 to 1.19.11p1). Wildcards
will be expanded, but please <a class="link" href= will be expanded, but please <a class="link" href=
"#RCL.SEARCH.WILDCARDS.PATH" title= "#RCL.SEARCH.WILDCARDS.PATH" title=
"3.8.1.1.&nbsp;Wildcards and path filtering">have a "Wildcards and path filtering">have a look</a> at an
look</a> at an important limitation of wildcards in important limitation of wildcards in path
path filters.</p> filters.</p>
<p>Relative paths also make sense, for example, <p>Relative paths also make sense, for example,
<code class="literal">dir:share/doc</code> would <code class="literal">dir:share/doc</code> would
match either <code class= match either <code class=
@ -5365,8 +5363,8 @@ recoll -c <em class=
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.SEARCH.WILDCARDS.PATH" id= "RCL.SEARCH.WILDCARDS.PATH" id=
"RCL.SEARCH.WILDCARDS.PATH"></a>3.8.1.1.&nbsp;Wildcards "RCL.SEARCH.WILDCARDS.PATH"></a>Wildcards and
and path filtering</h4> path filtering</h4>
</div> </div>
</div> </div>
</div> </div>
@ -6382,12 +6380,12 @@ recollindex -c "$confdir"
the result list by using the appropriate directive in the result list by using the appropriate directive in
the definition of the <a class="link" href= the definition of the <a class="link" href=
"#RCL.SEARCH.GUI.CUSTOM.RESLIST" title= "#RCL.SEARCH.GUI.CUSTOM.RESLIST" title=
"3.1.15.1.&nbsp;The result list format">result list "The result list format">result list paragraph
paragraph format</a>. All fields are displayed on the format</a>. All fields are displayed on the fields
fields screen of the preview window (which you can screen of the preview window (which you can reach
reach through the right-click menu). This is through the right-click menu). This is independant of
independant of the fact that the search which the fact that the search which produced the results
produced the results used the field or not.</p> used the field or not.</p>
</li> </li>
</ul> </ul>
</div> </div>
@ -6423,14 +6421,16 @@ recollindex -c "$confdir"
</div> </div>
</div> </div>
</div> </div>
<p><span class="application">Recoll</span> versions after <p>The <span class="application">Recoll</span> Python
1.11 define a Python programming interface, both for programming interface can be used both for searching and
searching and creating/updating an index.</p> for creating/updating an index. Bindings exist for
<p>The search interface is used in the <span class= Python2 and Python3.</p>
"application">Recoll</span> Ubuntu Unity Lens and the <p>The search interface is used in a number of active
<span class="application">Recoll</span> Web UI. It can projects: the <span class="application">Recoll</span>
run queries on any <span class= <span class="application">Gnome Shell Search
"application">Recoll</span> configuration.</p> Provider</span>, the <span class=
"application">Recoll</span> Web UI, and the upmpdcli UPnP
Media Server, in addition to many small scripts.</p>
<p>The index update section of the API may be used to <p>The index update section of the API may be used to
create and update <span class="application">Recoll</span> create and update <span class="application">Recoll</span>
indexes on specific configurations (separate from the indexes on specific configurations (separate from the
@ -6467,6 +6467,23 @@ recollindex -c "$confdir"
here. A paragraph at the end of this section will explain here. A paragraph at the end of this section will explain
a few differences and ways to write code compatible with a few differences and ways to write code compatible with
both versions.</p> both versions.</p>
<p>The <code class="literal">recoll</code> package now
contains two modules:</p>
<div class="itemizedlist">
<ul class="itemizedlist" style=
"list-style-type: disc;">
<li class="listitem">
<p>The <code class="literal">recoll</code> module
contains functions and classes used to query (or
update) the index.</p>
</li>
<li class="listitem">
<p>The <code class="literal">rclextract</code>
module contains functions and classes used at query
time to access document data.</p>
</li>
</ul>
</div>
<p>There is a good chance that your system repository has <p>There is a good chance that your system repository has
packages for the Recoll Python API, sometimes in a packages for the Recoll Python API, sometimes in a
package separate from the main one (maybe named something package separate from the main one (maybe named something
@ -6493,15 +6510,17 @@ recollindex -c "$confdir"
nres = query.execute("some query") nres = query.execute("some query")
results = query.fetchmany(20) results = query.fetchmany(20)
for doc in results: for doc in results:
print(doc.url, doc.title) print("%s %s" % (doc.url, doc.title))
</pre> </pre>
<p>You can also take a look at the source for the <p>You can also take a look at the source for the
<a class="ulink" href= <a class="ulink" href=
"https://github.com/koniu/recoll-webui" target= "https://opensourceprojects.eu/p/recollwebui/code/ci/78ddb20787b2a894b5e4661a8d5502c4511cf71e/tree/"
"_top">Recoll WebUI</a>, or the <a class="ulink" href= target="_top">Recoll WebUI</a>, the <a class="ulink"
"https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py" href="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py"
target="_top">upmpdcli local media server</a>, which are target="_top">upmpdcli local media server</a>, or the
both based on the Python API.</p> <a class="ulink" href=
"https://opensourceprojects.eu/p/recollgssp/code/ci/3f120108e099f9d687306c0be61593994326d52d/tree/gssp-recoll.py"
target="_top">Gnome Shell Search Provider</a>.</p>
</div> </div>
<div class="sect2"> <div class="sect2">
<div class="titlepage"> <div class="titlepage">
@ -6604,11 +6623,19 @@ recollindex -c "$confdir"
<dt><span class="term">Stored and indexed <dt><span class="term">Stored and indexed
fields</span></dt> fields</span></dt>
<dd> <dd>
<p>The <code class="filename">fields</code> file <p>The <a class="link" href=
inside the <span class="application">Recoll</span> "#RCL.INSTALL.CONFIG.FIELDS" title=
"6.4.3.&nbsp;The fields file"><code class=
"filename">fields</code> file</a> inside the
<span class="application">Recoll</span>
configuration defines which document fields are configuration defines which document fields are
either "indexed" (searchable), "stored" either <code class="literal">indexed</code>
(retrievable with search results), or both.</p> (searchable), <code class="literal">stored</code>
(retrievable with search results), or both. Apart
from a few standard/internal fields, only the
<code class="literal">stored</code> fields are
retrievable through the Python search
interface.</p>
</dd> </dd>
</dl> </dl>
</div> </div>
@ -6624,113 +6651,64 @@ recollindex -c "$confdir"
</div> </div>
</div> </div>
</div> </div>
<div class="sect3">
<div class="titlepage">
<div>
<div>
<h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.PACKAGE" id=
"RCL.PROGRAM.PYTHONAPI.PACKAGE"></a>5.3.3.1.&nbsp;Recoll
package</h4>
</div>
</div>
</div>
<p>The <code class="literal">recoll</code> package
contains two modules:</p>
<div class="itemizedlist">
<ul class="itemizedlist" style=
"list-style-type: disc;">
<li class="listitem">
<p>The <code class="literal">recoll</code> module
contains functions and classes used to query (or
update) the index. This section will only
describe the query part, see further for the
update part.</p>
</li>
<li class="listitem">
<p>The <code class="literal">rclextract</code>
module contains functions and classes used to
access document data.</p>
</li>
</ul>
</div>
</div>
<div class="sect3"> <div class="sect3">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL" id= "RCL.PROGRAM.PYTHONAPI.RECOLL" id=
"RCL.PROGRAM.PYTHONAPI.RECOLL"></a>5.3.3.2.&nbsp;The "RCL.PROGRAM.PYTHONAPI.RECOLL"></a>The recoll
recoll module</h4> module</h4>
</div> </div>
</div> </div>
</div> </div>
<div class="sect4"> <div class="simplesect">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h5 class="title"><a name= <h5 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL.FUNCTIONS" id= "RCL.PROGRAM.PYTHONAPI.RECOLL.CONNECT" id=
"RCL.PROGRAM.PYTHONAPI.RECOLL.FUNCTIONS"></a>Functions</h5> "RCL.PROGRAM.PYTHONAPI.RECOLL.CONNECT"></a>connect(confdir=None,
extra_dbs=None, writable = False)</h5>
</div> </div>
</div> </div>
</div> </div>
<div class="variablelist">
<dl class="variablelist">
<dt><span class="term">connect(confdir=None,
extra_dbs=None, writable = False)</span></dt>
<dd>
<p>The <code class="literal">connect()</code> <p>The <code class="literal">connect()</code>
function connects to one or several function connects to one or several <span class=
<span class="application">Recoll</span> "application">Recoll</span> index(es) and returns a
index(es) and returns a <code class= <code class="literal">Db</code> object.</p>
"literal">Db</code> object.</p> <p>This call initializes the recoll module, and it
should always be performed before any other call or
object creation.</p>
<div class="itemizedlist"> <div class="itemizedlist">
<ul class="itemizedlist" style= <ul class="itemizedlist" style=
"list-style-type: disc;"> "list-style-type: disc;">
<li class="listitem"> <li class="listitem">
<p><code class="literal">confdir</code> <p><code class="literal">confdir</code> may
may specify a configuration directory. specify a configuration directory. The usual
The usual defaults apply.</p> defaults apply.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">extra_dbs</code> <p><code class="literal">extra_dbs</code> is a
is a list of additional indexes (Xapian list of additional indexes (Xapian
directories).</p> directories).</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">writable</code> <p><code class="literal">writable</code>
decides if we can index new data through decides if we can index new data through this
this connection.</p> connection.</p>
</li> </li>
</ul> </ul>
</div> </div>
<p>This call initializes the recoll module, and
it should always be performed before any other
call or object creation.</p>
</dd>
</dl>
</div> </div>
</div> <div class="simplesect">
<div class="sect4">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h5 class="title"><a name= <h5 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES" id= "RCL.PROGRAM.PYTHONAPI.RECOLL.DB" id=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES"></a>Classes</h5> "RCL.PROGRAM.PYTHONAPI.RECOLL.DB"></a>The Db
</div> class</h5>
</div>
</div>
<div class="sect5">
<div class="titlepage">
<div>
<div>
<h6 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DB" id=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DB"></a>The
Db class</h6>
</div> </div>
</div> </div>
</div> </div>
@ -6741,9 +6719,9 @@ recollindex -c "$confdir"
<dl class="variablelist"> <dl class="variablelist">
<dt><span class="term">Db.close()</span></dt> <dt><span class="term">Db.close()</span></dt>
<dd> <dd>
<p>Closes the connection. You can't do <p>Closes the connection. You can't do anything
anything with the <code class= with the <code class="literal">Db</code> object
"literal">Db</code> object after this.</p> after this.</p>
</dd> </dd>
<dt><span class="term">Db.query(), <dt><span class="term">Db.query(),
Db.cursor()</span></dt> Db.cursor()</span></dt>
@ -6768,9 +6746,9 @@ recollindex -c "$confdir"
expr, field='', maxlen=-1, casesens=False, expr, field='', maxlen=-1, casesens=False,
diacsens=False, lang='english')</span></dt> diacsens=False, lang='english')</span></dt>
<dd> <dd>
<p>Expand an expression against the index <p>Expand an expression against the index term
term list. Performs the basic function from list. Performs the basic function from the GUI
the GUI term explorer tool. <code class= term explorer tool. <code class=
"literal">match_type</code> can be either of "literal">match_type</code> can be either of
<code class="literal">wildcard</code>, <code class="literal">wildcard</code>,
<code class="literal">regexp</code> or <code class="literal">regexp</code> or
@ -6781,23 +6759,21 @@ recollindex -c "$confdir"
</dl> </dl>
</div> </div>
</div> </div>
<div class="sect5"> <div class="simplesect">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h6 class="title"><a name= <h5 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY" "RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY"
id= id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY"></a>The </a>The Query class</h5>
Query class</h6>
</div> </div>
</div> </div>
</div> </div>
<p>A <code class="literal">Query</code> object <p>A <code class="literal">Query</code> object
(equivalent to a cursor in the Python DB API) is (equivalent to a cursor in the Python DB API) is
created by a <code class= created by a <code class="literal">Db.query()</code>
"literal">Db.query()</code> call. It is used to call. It is used to execute index searches.</p>
execute index searches.</p>
<div class="variablelist"> <div class="variablelist">
<dl class="variablelist"> <dl class="variablelist">
<dt><span class="term">Query.sortby(fieldname, <dt><span class="term">Query.sortby(fieldname,
@ -6810,14 +6786,13 @@ recollindex -c "$confdir"
</dd> </dd>
<dt><span class= <dt><span class=
"term">Query.execute(query_string, stemming=1, "term">Query.execute(query_string, stemming=1,
stemlang="english", stemlang="english", fetchtext=False)</span></dt>
fetchtext=False)</span></dt>
<dd> <dd>
<p>Starts a search for <em class= <p>Starts a search for <em class=
"replaceable"><code>query_string</code></em>, "replaceable"><code>query_string</code></em>, a
a <span class="application">Recoll</span> <span class="application">Recoll</span> search
search language string. If the index stores language string. If the index stores the
the document texts and <code class= document texts and <code class=
"literal">fetchtext</code> is True, store the "literal">fetchtext</code> is True, store the
document extracted text in <code class= document extracted text in <code class=
"literal">doc.text</code>.</p> "literal">doc.text</code>.</p>
@ -6826,9 +6801,9 @@ recollindex -c "$confdir"
"term">Query.executesd(SearchData, "term">Query.executesd(SearchData,
fetchtext=False)</span></dt> fetchtext=False)</span></dt>
<dd> <dd>
<p>Starts a search for the query defined by <p>Starts a search for the query defined by the
the SearchData object. If the index stores SearchData object. If the index stores the
the document texts and <code class= document texts and <code class=
"literal">fetchtext</code> is True, store the "literal">fetchtext</code> is True, store the
document extracted text in <code class= document extracted text in <code class=
"literal">doc.text</code>.</p> "literal">doc.text</code>.</p>
@ -6838,8 +6813,8 @@ recollindex -c "$confdir"
<dd> <dd>
<p>Fetches the next <code class= <p>Fetches the next <code class=
"literal">Doc</code> objects in the current "literal">Doc</code> objects in the current
search results, and returns them as an array search results, and returns them as an array of
of the required size, which is by default the the required size, which is by default the
value of the <code class= value of the <code class=
"literal">arraysize</code> data member.</p> "literal">arraysize</code> data member.</p>
</dd> </dd>
@ -6851,8 +6826,7 @@ recollindex -c "$confdir"
search results. Generates a StopIteration search results. Generates a StopIteration
exception if there are no results left.</p> exception if there are no results left.</p>
</dd> </dd>
<dt><span class= <dt><span class="term">Query.close()</span></dt>
"term">Query.close()</span></dt>
<dd> <dd>
<p>Closes the query. The object is unusable <p>Closes the query. The object is unusable
after the call.</p> after the call.</p>
@ -6868,14 +6842,13 @@ recollindex -c "$confdir"
<dt><span class= <dt><span class=
"term">Query.getgroups()</span></dt> "term">Query.getgroups()</span></dt>
<dd> <dd>
<p>Retrieves the expanded query terms as a <p>Retrieves the expanded query terms as a list
list of pairs. Meaningful only after of pairs. Meaningful only after executexx In
executexx In each pair, the first entry is a each pair, the first entry is a list of user
list of user terms (of size one for simple terms (of size one for simple terms, or more
terms, or more for group and phrase clauses), for group and phrase clauses), the second a
the second a list of query terms as derived list of query terms as derived from the user
from the user terms and used in the Xapian terms and used in the Xapian Query.</p>
Query.</p>
</dd> </dd>
<dt><span class= <dt><span class=
"term">Query.getxquery()</span></dt> "term">Query.getxquery()</span></dt>
@ -6890,26 +6863,24 @@ recollindex -c "$confdir"
<p>Will insert &lt;span "class=rclmatch"&gt;, <p>Will insert &lt;span "class=rclmatch"&gt;,
&lt;/span&gt; tags around the match areas in &lt;/span&gt; tags around the match areas in
the input text and return the modified text. the input text and return the modified text.
<code class="literal">ishtml</code> can be <code class="literal">ishtml</code> can be set
set to indicate that the input text is HTML to indicate that the input text is HTML and
and that HTML special characters should not that HTML special characters should not be
be escaped. <code class= escaped. <code class="literal">methods</code>
"literal">methods</code> if set should be an if set should be an object with methods
object with methods startMatch(i) and startMatch(i) and endMatch() which will be
endMatch() which will be called for each called for each match and should return a begin
match and should return a begin and end and end tag</p>
tag</p>
</dd> </dd>
<dt><span class= <dt><span class="term">Query.makedocabstract(doc,
"term">Query.makedocabstract(doc, methods = methods = object))</span></dt>
object))</span></dt>
<dd> <dd>
<p>Create a snippets abstract for <p>Create a snippets abstract for <code class=
<code class="literal">doc</code> (a "literal">doc</code> (a <code class=
<code class="literal">Doc</code> object) by "literal">Doc</code> object) by selecting text
selecting text around the match terms. If around the match terms. If methods is set, will
methods is set, will also perform also perform highlighting. See the highlight
highlighting. See the highlight method.</p> method.</p>
</dd> </dd>
<dt><span class="term">Query.__iter__() and <dt><span class="term">Query.__iter__() and
Query.next()</span></dt> Query.next()</span></dt>
@ -6928,8 +6899,7 @@ recollindex -c "$confdir"
<p>Default number of records processed by <p>Default number of records processed by
fetchmany (r/w).</p> fetchmany (r/w).</p>
</dd> </dd>
<dt><span class= <dt><span class="term">Query.rowcount</span></dt>
"term">Query.rowcount</span></dt>
<dd> <dd>
<p>Number of records returned by the last <p>Number of records returned by the last
execute.</p> execute.</p>
@ -6938,39 +6908,38 @@ recollindex -c "$confdir"
"term">Query.rownumber</span></dt> "term">Query.rownumber</span></dt>
<dd> <dd>
<p>Next index to be fetched from results. <p>Next index to be fetched from results.
Normally increments after each fetchone() Normally increments after each fetchone() call,
call, but can be set/reset before the call to but can be set/reset before the call to effect
effect seeking (equivalent to using seeking (equivalent to using <code class=
<code class="literal">scroll()</code>). "literal">scroll()</code>). Starts at 0.</p>
Starts at 0.</p>
</dd> </dd>
</dl> </dl>
</div> </div>
</div> </div>
<div class="sect5"> <div class="simplesect">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h6 class="title"><a name= <h5 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC" "RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC" id=
id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC"> "RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC"></a>The
</a>The Doc class</h6> Doc class</h5>
</div> </div>
</div> </div>
</div> </div>
<p>A <code class="literal">Doc</code> object <p>A <code class="literal">Doc</code> object contains
contains index data for a given document. The data index data for a given document. The data is
is extracted from the index when searching, or set extracted from the index when searching, or set by
by the indexer program when updating. The Doc the indexer program when updating. The Doc object has
object has many attributes to be read or set by its many attributes to be read or set by its user. It
user. It matches exactly the Rcl::Doc C++ object. mostly matches the Rcl::Doc C++ object. Some of the
Some of the attributes are predefined, but, attributes are predefined, but, especially when
especially when indexing, others can be set, the indexing, others can be set, the name of which will
name of which will be processed as field names by be processed as field names by the indexing
the indexing configuration. Inputs can be specified configuration. Inputs can be specified as Unicode or
as Unicode or strings. Outputs are Unicode objects. strings. Outputs are Unicode objects. All dates are
All dates are specified as Unix timestamps, printed specified as Unix timestamps, printed as strings.
as strings. Please refer to the <code class= Please refer to the <code class=
"filename">rcldb/rcldoc.cpp</code> C++ file for a "filename">rcldb/rcldoc.cpp</code> C++ file for a
full description of the predefined attributes. Here full description of the predefined attributes. Here
follows a short list.</p> follows a short list.</p>
@ -6984,23 +6953,21 @@ recollindex -c "$confdir"
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">ipath</code> the <p><code class="literal">ipath</code> the
document <code class="literal">ipath</code> document <code class="literal">ipath</code> for
for embedded documents.</p> embedded documents.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">fbytes, <p><code class="literal">fbytes, dbytes</code>
dbytes</code> the document file and text the document file and text sizes.</p>
sizes.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">fmtime, <p><code class="literal">fmtime, dmtime</code>
dmtime</code> the document file and document the document file and document times.</p>
times.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
<p><code class="literal">xdocid</code> the <p><code class="literal">xdocid</code> the
document Xapian document ID. This is useful document Xapian document ID. This is useful if
if you want to access the document through a you want to access the document through a
direct Xapian operation.</p> direct Xapian operation.</p>
</li> </li>
<li class="listitem"> <li class="listitem">
@ -7016,13 +6983,15 @@ recollindex -c "$confdir"
</li> </li>
</ul> </ul>
</div> </div>
<p>At query time, only the fields that are defined <p>At query time, only the fields that are defined as
as <code class="literal">stored</code> either by <code class="literal">stored</code> either by default
default or in the <code class= or in the <code class="filename">fields</code>
"filename">fields</code> configuration file will be configuration file will be meaningful in the
meaningful in the <code class="literal">Doc</code> <code class="literal">Doc</code> object. The processed
object. Especially this will not be the case for document text may or may not be present, depending on
the document text. See the <code class= whether the index stores the text at all and, if it does, on
the <code class="literal">fetchtext</code> query
execute option. See also the <code class=
"literal">rclextract</code> module for accessing "literal">rclextract</code> module for accessing
document contents.</p> document contents.</p>
<div class="variablelist"> <div class="variablelist">
@ -7031,26 +7000,24 @@ recollindex -c "$confdir"
operator</span></dt> operator</span></dt>
<dd> <dd>
<p>Retrieve the named document attribute. You <p>Retrieve the named document attribute. You
can also use <code class= can also use <code class="literal">getattr(doc,
"literal">getattr(doc, key)</code> or key)</code> or <code class=
<code class="literal">doc.key</code>.</p> "literal">doc.key</code>.</p>
</dd> </dd>
<dt><span class="term">doc.key = <dt><span class="term">doc.key =
value</span></dt> value</span></dt>
<dd> <dd>
<p>Set the named document attribute. You <p>Set the named document attribute. You
can also use <code class= can also use <code class="literal">setattr(doc,
"literal">setattr(doc, key, key, value)</code>.</p>
value)</code>.</p>
</dd> </dd>
<dt><span class="term">getbinurl()</span></dt> <dt><span class="term">getbinurl()</span></dt>
<dd> <dd>
<p>Retrieve the URL in byte array format (no <p>Retrieve the URL in byte array format (no
transcoding), for use as parameter to a transcoding), for use as parameter to a system
system call.</p> call.</p>
</dd> </dd>
<dt><span class= <dt><span class="term">setbinurl(url)</span></dt>
"term">setbinurl(url)</span></dt>
<dd> <dd>
<p>Set the URL in byte array format (no <p>Set the URL in byte array format (no
transcoding).</p> transcoding).</p>
@ -7068,25 +7035,25 @@ recollindex -c "$confdir"
</dl> </dl>
</div> </div>
</div> </div>
<div class="sect5"> <div class="simplesect">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h6 class="title"><a name= <h5 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA" "RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA"
id= id=
"RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA"> "RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
</a>The SearchData class</h6> </a>The SearchData class</h5>
</div> </div>
</div> </div>
</div> </div>
<p>A <code class="literal">SearchData</code> object <p>A <code class="literal">SearchData</code> object
allows building a query by combining clauses, for allows building a query by combining clauses, for
execution by <code class= execution by <code class=
"literal">Query.executesd()</code>. It can be used "literal">Query.executesd()</code>. It can be used in
in replacement of the query language approach. The replacement of the query language approach. The
interface is going to change a little, so no interface is going to change a little, so no detailed
detailed doc for now...</p> doc for now...</p>
<div class="variablelist"> <div class="variablelist">
<dl class="variablelist"> <dl class="variablelist">
<dt><span class= <dt><span class=
@ -7098,21 +7065,21 @@ recollindex -c "$confdir"
</div> </div>
</div> </div>
</div> </div>
</div>
<div class="sect3"> <div class="sect3">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.RCLEXTRACT" id= "RCL.PROGRAM.PYTHONAPI.RCLEXTRACT" id=
"RCL.PROGRAM.PYTHONAPI.RCLEXTRACT"></a>5.3.3.3.&nbsp;The "RCL.PROGRAM.PYTHONAPI.RCLEXTRACT"></a>The
rclextract module</h4> rclextract module</h4>
</div> </div>
</div> </div>
</div> </div>
<p>Prior to <span class="application">Recoll</span> <p>Prior to <span class="application">Recoll</span>
1.25, index queries never provide document content 1.25, index queries could not provide document content
because it is not stored. More recent versions usually because it was never stored. <span class=
"application">Recoll</span> 1.25 and later usually
store the document text, which can be optionally store the document text, which can be optionally
retrieved when running a query (see <code class= retrieved when running a query (see <code class=
"literal">query.execute()</code> above - the result is "literal">query.execute()</code> above - the result is
@ -7126,7 +7093,7 @@ recollindex -c "$confdir"
<p>You need to import the <code class= <p>You need to import the <code class=
"literal">recoll</code> module before the <code class= "literal">recoll</code> module before the <code class=
"literal">rclextract</code> module.</p> "literal">rclextract</code> module.</p>
<div class="sect4"> <div class="simplesect">
<div class="titlepage"> <div class="titlepage">
<div> <div>
<div> <div>
@ -7207,7 +7174,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.SEARCH.EXAMPLE" id= "RCL.PROGRAM.PYTHONAPI.SEARCH.EXAMPLE" id=
"RCL.PROGRAM.PYTHONAPI.SEARCH.EXAMPLE"></a>5.3.3.4.&nbsp;Search "RCL.PROGRAM.PYTHONAPI.SEARCH.EXAMPLE"></a>Search
API usage example</h4> API usage example</h4>
</div> </div>
</div> </div>
@ -7305,7 +7272,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.UPDATE.UPDATE" id= "RCL.PROGRAM.PYTHONAPI.UPDATE.UPDATE" id=
"RCL.PROGRAM.PYTHONAPI.UPDATE.UPDATE"></a>5.3.4.1.&nbsp;Python "RCL.PROGRAM.PYTHONAPI.UPDATE.UPDATE"></a>Python
update interface</h4> update interface</h4>
</div> </div>
</div> </div>
@ -7399,7 +7366,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.UPDATE.ACCESS" id= "RCL.PROGRAM.PYTHONAPI.UPDATE.ACCESS" id=
"RCL.PROGRAM.PYTHONAPI.UPDATE.ACCESS"></a>5.3.4.2.&nbsp;Query "RCL.PROGRAM.PYTHONAPI.UPDATE.ACCESS"></a>Query
data access for external indexers (1.23)</h4> data access for external indexers (1.23)</h4>
</div> </div>
</div> </div>
@ -7449,7 +7416,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.PROGRAM.PYTHONAPI.UPDATE.SAMPLES" id= "RCL.PROGRAM.PYTHONAPI.UPDATE.SAMPLES" id=
"RCL.PROGRAM.PYTHONAPI.UPDATE.SAMPLES"></a>5.3.4.3.&nbsp;External "RCL.PROGRAM.PYTHONAPI.UPDATE.SAMPLES"></a>External
indexer samples</h4> indexer samples</h4>
</div> </div>
</div> </div>
@ -8404,7 +8371,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.WHATDOCS" id= "RCL.INSTALL.CONFIG.RECOLLCONF.WHATDOCS" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.WHATDOCS"></a>6.4.2.1.&nbsp;Parameters "RCL.INSTALL.CONFIG.RECOLLCONF.WHATDOCS"></a>Parameters
affecting what documents we index</h4> affecting what documents we index</h4>
</div> </div>
</div> </div>
@ -8738,7 +8705,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.TERMS" id= "RCL.INSTALL.CONFIG.RECOLLCONF.TERMS" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.TERMS"></a>6.4.2.2.&nbsp;Parameters "RCL.INSTALL.CONFIG.RECOLLCONF.TERMS"></a>Parameters
affecting how we generate terms and organize the affecting how we generate terms and organize the
index</h4> index</h4>
</div> </div>
@ -9008,7 +8975,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.STORE" id= "RCL.INSTALL.CONFIG.RECOLLCONF.STORE" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.STORE"></a>6.4.2.3.&nbsp;Parameters "RCL.INSTALL.CONFIG.RECOLLCONF.STORE"></a>Parameters
affecting where and how we store things</h4> affecting where and how we store things</h4>
</div> </div>
</div> </div>
@ -9163,7 +9130,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.PERFS" id= "RCL.INSTALL.CONFIG.RECOLLCONF.PERFS" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.PERFS"></a>6.4.2.4.&nbsp;Parameters "RCL.INSTALL.CONFIG.RECOLLCONF.PERFS"></a>Parameters
affecting indexing performance and resource affecting indexing performance and resource
usage</h4> usage</h4>
</div> </div>
@ -9264,7 +9231,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.MISC" id= "RCL.INSTALL.CONFIG.RECOLLCONF.MISC" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.MISC"></a>6.4.2.5.&nbsp;Miscellaneous "RCL.INSTALL.CONFIG.RECOLLCONF.MISC"></a>Miscellaneous
parameters</h4> parameters</h4>
</div> </div>
</div> </div>
@ -9541,7 +9508,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.QUERY" id= "RCL.INSTALL.CONFIG.RECOLLCONF.QUERY" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.QUERY"></a>6.4.2.6.&nbsp;Query-time "RCL.INSTALL.CONFIG.RECOLLCONF.QUERY"></a>Query-time
parameters (no impact on the index)</h4> parameters (no impact on the index)</h4>
</div> </div>
</div> </div>
@ -9616,7 +9583,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.PDF" id= "RCL.INSTALL.CONFIG.RECOLLCONF.PDF" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.PDF"></a>6.4.2.7.&nbsp;Parameters "RCL.INSTALL.CONFIG.RECOLLCONF.PDF"></a>Parameters
for the PDF input script</h4> for the PDF input script</h4>
</div> </div>
</div> </div>
@ -9687,7 +9654,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.SPECLOCATIONS" id= "RCL.INSTALL.CONFIG.RECOLLCONF.SPECLOCATIONS" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.SPECLOCATIONS"></a>6.4.2.8.&nbsp;Parameters "RCL.INSTALL.CONFIG.RECOLLCONF.SPECLOCATIONS"></a>Parameters
set for specific locations</h4> set for specific locations</h4>
</div> </div>
</div> </div>
@ -9820,7 +9787,7 @@ for i in range(nres):
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.FIELDS.XATTR" id= "RCL.INSTALL.CONFIG.FIELDS.XATTR" id=
"RCL.INSTALL.CONFIG.FIELDS.XATTR"></a>6.4.3.1.&nbsp;Extended "RCL.INSTALL.CONFIG.FIELDS.XATTR"></a>Extended
attributes in the fields file</h4> attributes in the fields file</h4>
</div> </div>
</div> </div>
@ -10150,7 +10117,7 @@ other = rclcat:other
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.EXAMPLES.ADDVIEW" id= "RCL.INSTALL.CONFIG.EXAMPLES.ADDVIEW" id=
"RCL.INSTALL.CONFIG.EXAMPLES.ADDVIEW"></a>6.4.8.1.&nbsp;Adding "RCL.INSTALL.CONFIG.EXAMPLES.ADDVIEW"></a>Adding
an external viewer for a non-indexed type</h4> an external viewer for a non-indexed type</h4>
</div> </div>
</div> </div>
@ -10213,7 +10180,7 @@ other = rclcat:other
<div> <div>
<h4 class="title"><a name= <h4 class="title"><a name=
"RCL.INSTALL.CONFIG.EXAMPLES.ADDINDEX" id= "RCL.INSTALL.CONFIG.EXAMPLES.ADDINDEX" id=
"RCL.INSTALL.CONFIG.EXAMPLES.ADDINDEX"></a>6.4.8.2.&nbsp;Adding "RCL.INSTALL.CONFIG.EXAMPLES.ADDINDEX"></a>Adding
indexing support for a new file type</h4> indexing support for a new file type</h4>
</div> </div>
</div> </div>

@ -4966,13 +4966,14 @@ recollindex -c "$confdir"
<sect2 id="RCL.PROGRAM.PYTHONAPI.INTRO"> <sect2 id="RCL.PROGRAM.PYTHONAPI.INTRO">
<title>Introduction</title> <title>Introduction</title>
<para>&RCL; versions after 1.11 define a Python programming <para>The &RCL; Python programming interface can be used both for
interface, both for searching and creating/updating an searching and for creating/updating an index. Bindings exist for
index.</para> Python2 and Python3.</para>
<para>The search interface is used in the &RCL; Ubuntu Unity Lens <para>The search interface is used in a number of active projects:
and the &RCL; Web UI. It can run queries on any &RCL; the &RCL; <application>Gnome Shell Search Provider</application>,
configuration.</para> the &RCL; Web UI, and the upmpdcli UPnP Media Server, in addition
to many small scripts.</para>
<para>The index update section of the API may be used to create and <para>The index update section of the API may be used to create and
update &RCL; indexes on specific configurations (separate from the update &RCL; indexes on specific configurations (separate from the
@ -4998,6 +4999,19 @@ recollindex -c "$confdir"
paragraph at the end of this section will explain a few differences paragraph at the end of this section will explain a few differences
and ways to write code compatible with both versions.</para> and ways to write code compatible with both versions.</para>
<para>The <literal>recoll</literal> package now contains two
modules:</para>
<itemizedlist>
<listitem><para>The <literal>recoll</literal> module contains
functions and classes used to query (or update) the
index.</para></listitem>
<listitem><para>The <literal>rclextract</literal> module contains
functions and classes used at query time to access document
data.</para>
</listitem>
</itemizedlist>
<para>There is a good chance that your system repository has <para>There is a good chance that your system repository has
packages for the Recoll Python API, sometimes in a package separate packages for the Recoll Python API, sometimes in a package separate
from the main one (maybe named something like python-recoll). Else from the main one (maybe named something like python-recoll). Else
@ -5022,13 +5036,17 @@ recollindex -c "$confdir"
nres = query.execute("some query") nres = query.execute("some query")
results = query.fetchmany(20) results = query.fetchmany(20)
for doc in results: for doc in results:
print(doc.url, doc.title) print("%s %s" % (doc.url, doc.title))
]]></programlisting> ]]></programlisting>
<para>You can also take a look at the source for the <ulink <para>You can also take a look at the source for the
url="https://github.com/koniu/recoll-webui">Recoll <ulink url="https://opensourceprojects.eu/p/recollwebui/code/ci/78ddb20787b2a894b5e4661a8d5502c4511cf71e/tree/">Recoll
WebUI</ulink>, or the <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli local media server</ulink>, which are both WebUI</ulink>, the
based on the Python API.</para> <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli
local media server</ulink>, or the
<ulink
url="https://opensourceprojects.eu/p/recollgssp/code/ci/3f120108e099f9d687306c0be61593994326d52d/tree/gssp-recoll.py">Gnome
Shell Search Provider</ulink>.</para>
</sect2> </sect2>
@ -5104,10 +5122,14 @@ recollindex -c "$confdir"
<varlistentry> <varlistentry>
<term>Stored and indexed fields</term> <term>Stored and indexed fields</term>
<listitem><para>The <filename>fields</filename> file inside <listitem><para>The <link
the &RCL; configuration defines which document fields are linkend="RCL.INSTALL.CONFIG.FIELDS"><filename>fields</filename>
either "indexed" (searchable), "stored" (retrievable with file</link> inside the &RCL; configuration defines which
search results), or both.</para> document fields are either <literal>indexed</literal>
(searchable), <literal>stored</literal> (retrievable with
search results), or both. Apart from a few standard/internal
fields, only the <literal>stored</literal> fields are
retrievable through the Python search interface.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
@ -5118,37 +5140,18 @@ recollindex -c "$confdir"
<sect2 id="RCL.PROGRAM.PYTHONAPI.SEARCH"> <sect2 id="RCL.PROGRAM.PYTHONAPI.SEARCH">
<title>Python search interface</title> <title>Python search interface</title>
<sect3 id="RCL.PROGRAM.PYTHONAPI.PACKAGE">
<title>Recoll package</title>
<para>The <literal>recoll</literal> package contains two
modules:
<itemizedlist>
<listitem><para>The <literal>recoll</literal> module contains
functions and classes used to query (or update) the
index. This section will only describe the query part, see
further for the update part.</para></listitem>
<listitem><para>The <literal>rclextract</literal> module contains
functions and classes used to access document
data.</para></listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="RCL.PROGRAM.PYTHONAPI.RECOLL"> <sect3 id="RCL.PROGRAM.PYTHONAPI.RECOLL">
<title>The recoll module</title> <title>The recoll module</title>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.FUNCTIONS"> <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CONNECT">
<title>Functions</title> <title>connect(confdir=None, extra_dbs=None, writable = False)</title>
<variablelist>
<varlistentry>
<term>connect(confdir=None, extra_dbs=None,
writable = False)</term>
<listitem>
<para>The <literal>connect()</literal> function connects to <para>The <literal>connect()</literal> function connects to
one or several &RCL; index(es) and returns one or several &RCL; index(es) and returns
a <literal>Db</literal> object.</para> a <literal>Db</literal> object.</para>
<para>This call initializes the recoll module, and it should
always be performed before any other call or object
creation.</para>
<itemizedlist> <itemizedlist>
<listitem><para><literal>confdir</literal> may specify <listitem><para><literal>confdir</literal> may specify
a configuration directory. The usual defaults a configuration directory. The usual defaults
@ -5159,24 +5162,13 @@ recollindex -c "$confdir"
we can index new data through this we can index new data through this
connection.</para></listitem> connection.</para></listitem>
</itemizedlist> </itemizedlist>
<para>This call initializes the recoll module, and it should </simplesect>
always be performed before any other call or object
creation.</para>
</listitem>
</varlistentry>
</variablelist>
</sect4>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.DB">
<sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES">
<title>Classes</title>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DB">
<title>The Db class</title> <title>The Db class</title>
<para>A Db object is created by <para>A Db object is created by a <literal>connect()</literal>
a <literal>connect()</literal> call and holds a call and holds a connection to a Recoll index.</para>
connection to a Recoll index.</para>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
<term>Db.close()</term> <term>Db.close()</term>
@ -5216,10 +5208,8 @@ recollindex -c "$confdir"
</variablelist> </variablelist>
</sect5> </simplesect>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
<title>The Query class</title> <title>The Query class</title>
<para>A <literal>Query</literal> object (equivalent to a <para>A <literal>Query</literal> object (equivalent to a
@ -5355,17 +5345,15 @@ recollindex -c "$confdir"
</variablelist> </variablelist>
</sect5> </simplesect>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
<title>The Doc class</title> <title>The Doc class</title>
<para>A <literal>Doc</literal> object contains index data <para>A <literal>Doc</literal> object contains index data
for a given document. The data is extracted from the for a given document. The data is extracted from the
index when searching, or set by the indexer program when index when searching, or set by the indexer program when
updating. The Doc object has many attributes to be read or updating. The Doc object has many attributes to be read or
set by its user. It matches exactly the Rcl::Doc C++ set by its user. It mostly matches the Rcl::Doc C++
object. Some of the attributes are predefined, but, object. Some of the attributes are predefined, but,
especially when indexing, others can be set, the name of especially when indexing, others can be set, the name of
which will be processed as field names by the indexing which will be processed as field names by the indexing
@ -5405,13 +5393,14 @@ recollindex -c "$confdir"
</itemizedlist> </itemizedlist>
</para> </para>
<para>At query time, only the fields that are defined <para>At query time, only the fields that are defined as
as <literal>stored</literal> either by default or in <literal>stored</literal> either by default or in the
the <filename>fields</filename> configuration file will be <filename>fields</filename> configuration file will be meaningful
meaningful in the <literal>Doc</literal> in the <literal>Doc</literal> object. The processed document text
object. Especially this will not be the case for the may or may not be present, depending on whether the index stores the
document text. See the <literal>rclextract</literal> text at all and, if it does, on the <literal>fetchtext</literal> query
module for accessing document contents.</para> execute option. See also the <literal>rclextract</literal> module
for accessing document contents.</para>
<variablelist> <variablelist>
@ -5460,9 +5449,9 @@ recollindex -c "$confdir"
</varlistentry> </varlistentry>
</variablelist> </variablelist>
</sect5> <!-- Doc --> </simplesect> <!-- Doc -->
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA"> <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
<title>The SearchData class</title> <title>The SearchData class</title>
<para>A <literal>SearchData</literal> object allows building <para>A <literal>SearchData</literal> object allows building
@ -5482,17 +5471,16 @@ recollindex -c "$confdir"
</varlistentry> </varlistentry>
</variablelist> </variablelist>
</sect5> <!-- SearchData --> </simplesect> <!-- SearchData -->
</sect4> <!-- recoll.classes -->
</sect3> <!-- Recoll module --> </sect3> <!-- Recoll module -->
<sect3 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT"> <sect3 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT">
<title>The rclextract module</title> <title>The rclextract module</title>
<para>Prior to &RCL; 1.25, index queries never provide document <para>Prior to &RCL; 1.25, index queries could not provide document
content because it is not stored. More recent versions usually content because it was never stored. &RCL; 1.25 and later usually
store the document text, which can be optionally retrieved when store the document text, which can be optionally retrieved when
running a query (see <literal>query.execute()</literal> running a query (see <literal>query.execute()</literal>
above - the result is always plain text).</para> above - the result is always plain text).</para>
@ -5506,7 +5494,7 @@ recollindex -c "$confdir"
<para>You need to import the <literal>recoll</literal> module <para>You need to import the <literal>recoll</literal> module
before the <literal>rclextract</literal> module.</para> before the <literal>rclextract</literal> module.</para>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR"> <simplesect id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR">
<title>The Extractor class</title> <title>The Extractor class</title>
<variablelist> <variablelist>
@ -5565,7 +5553,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
</variablelist> </variablelist>
</sect4> </simplesect>
</sect3> <!-- rclextract module --> </sect3> <!-- rclextract module -->