Jean-Francois Dockes 2019-04-12 12:01:12 +02:00
parent ad89225b24
commit 3ebf1a7db2
3 changed files with 819 additions and 863 deletions

@ -17,8 +17,9 @@ XSLDIR="/usr/share/xml/docbook/stylesheet/docbook-xsl/"
# Options common to the single-file and chunked versions # Options common to the single-file and chunked versions
commonoptions=--stringparam section.autolabel 1 \ commonoptions=--stringparam section.autolabel 1 \
--stringparam section.autolabel.max.depth 3 \ --stringparam section.autolabel.max.depth 2 \
--stringparam section.label.includes.component.label 1 \ --stringparam section.label.includes.component.label 1 \
--stringparam toc.max.depth 3 \
--stringparam autotoc.label.in.hyperlink 0 \ --stringparam autotoc.label.in.hyperlink 0 \
--stringparam abstract.notitle.enabled 1 \ --stringparam abstract.notitle.enabled 1 \
--stringparam html.stylesheet docbook-xsl.css \ --stringparam html.stylesheet docbook-xsl.css \

File diff suppressed because it is too large

@ -4966,13 +4966,14 @@ recollindex -c "$confdir"
<sect2 id="RCL.PROGRAM.PYTHONAPI.INTRO"> <sect2 id="RCL.PROGRAM.PYTHONAPI.INTRO">
<title>Introduction</title> <title>Introduction</title>
<para>&RCL; versions after 1.11 define a Python programming <para>The &RCL; Python programming interface can be used both for
interface, both for searching and creating/updating an searching and for creating/updating an index. Bindings exist for
index.</para> Python2 and Python3.</para>
<para>The search interface is used in the &RCL; Ubuntu Unity Lens <para>The search interface is used in a number of active projects:
and the &RCL; Web UI. It can run queries on any &RCL; the &RCL; <application>Gnome Shell Search Provider</application>,
configuration.</para> the &RCL; Web UI, and the upmpdcli UPnP Media Server, in addition
to many small scripts.</para>
<para>The index update section of the API may be used to create and <para>The index update section of the API may be used to create and
update &RCL; indexes on specific configurations (separate from the update &RCL; indexes on specific configurations (separate from the
@ -4998,6 +4999,19 @@ recollindex -c "$confdir"
paragraph at the end of this section will explain a few differences paragraph at the end of this section will explain a few differences
and ways to write code compatible with both versions.</para> and ways to write code compatible with both versions.</para>
<para>The <literal>recoll</literal> package now contains two
modules:</para>
<itemizedlist>
<listitem><para>The <literal>recoll</literal> module contains
functions and classes used to query (or update) the
index.</para></listitem>
<listitem><para>The <literal>rclextract</literal> module contains
functions and classes used at query time to access document
data.</para>
</listitem>
</itemizedlist>
<para>There is a good chance that your system repository has <para>There is a good chance that your system repository has
packages for the Recoll Python API, sometimes in a package separate packages for the Recoll Python API, sometimes in a package separate
from the main one (maybe named something like python-recoll). Else from the main one (maybe named something like python-recoll). Else
@ -5022,13 +5036,17 @@ recollindex -c "$confdir"
nres = query.execute("some query") nres = query.execute("some query")
results = query.fetchmany(20) results = query.fetchmany(20)
for doc in results: for doc in results:
print(doc.url, doc.title) print("%s %s" % (doc.url, doc.title))
]]></programlisting> ]]></programlisting>
<para>You can also take a look at the source for the <ulink <para>You can also take a look at the source for the
url="https://github.com/koniu/recoll-webui">Recoll <ulink url="https://opensourceprojects.eu/p/recollwebui/code/ci/78ddb20787b2a894b5e4661a8d5502c4511cf71e/tree/">Recoll
WebUI</ulink>, or the <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli local media server</ulink>, which are both WebUI</ulink>, the
based on the Python API.</para> <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli
local media server</ulink>, or the
<ulink
url="https://opensourceprojects.eu/p/recollgssp/code/ci/3f120108e099f9d687306c0be61593994326d52d/tree/gssp-recoll.py">Gnome
Shell Search Provider</ulink>.</para>
</sect2> </sect2>
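The listing in this hunk fetches a single batch of 20 documents. As a minimal sketch, draining a whole result set in batches could look like the following. It relies only on the `fetchmany()` method and the `rowcount` member described later in this diff. `FakeQuery` is a hypothetical stand-in (not part of the recoll module) so that the sketch runs without an index, and the empty-batch check assumes the usual Python DB API behaviour at the end of the results.

```python
class FakeQuery:
    """Hypothetical stand-in for a recoll Query, only for running this sketch."""

    def __init__(self, docs):
        self._docs = list(docs)
        self.rowcount = len(self._docs)   # number of results of the last execute
        self.rownumber = 0                # next index to be fetched
        self.arraysize = 2                # default batch size for fetchmany

    def fetchmany(self, size=None):
        size = self.arraysize if size is None else size
        batch = self._docs[self.rownumber:self.rownumber + size]
        self.rownumber += len(batch)
        return batch


def fetch_all(query, batch_size=20):
    """Collect every result of an already executed query, batch by batch."""
    docs = []
    while len(docs) < query.rowcount:
        batch = query.fetchmany(batch_size)
        if not batch:  # assumption: an exhausted query yields an empty batch
            break
        docs.extend(batch)
    return docs


all_docs = fetch_all(FakeQuery(["a", "b", "c", "d", "e"]), batch_size=2)
print(all_docs)
```

With a real index, `query` would be the object returned by `db.query()` after a call to `execute()`, as in the listing above.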
@ -5104,10 +5122,14 @@ recollindex -c "$confdir"
<varlistentry> <varlistentry>
<term>Stored and indexed fields</term> <term>Stored and indexed fields</term>
<listitem><para>The <filename>fields</filename> file inside <listitem><para>The <link
the &RCL; configuration defines which document fields are linkend="RCL.INSTALL.CONFIG.FIELDS"><filename>fields</filename>
either "indexed" (searchable), "stored" (retrievable with file</link> inside the &RCL; configuration defines which
search results), or both.</para> document fields are either <literal>indexed</literal>
(searchable), <literal>stored</literal> (retrievable with
search results), or both. Apart from a few standard/internal
fields, only the <literal>stored</literal> fields are
retrievable through the Python search interface.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
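Since only stored fields come back with search results, client code may want to tolerate missing fields. A small sketch, assuming only the `keys()` and `get()` methods of the `Doc` class listed later in this diff. `FakeDoc` and the `myfield` name are hypothetical and are used here so that the example runs without an index.

```python
class FakeDoc(dict):
    """Hypothetical stand-in for a recoll Doc; dict already has keys() and get()."""


def pick_fields(doc, wanted):
    """Return a dict with the wanted fields which the document actually has."""
    available = set(doc.keys())
    # Only ask for fields reported by keys(), so a non-stored field is skipped
    return {name: doc.get(name) for name in wanted if name in available}


doc = FakeDoc(url="file:///tmp/a.txt", author="someone")
print(pick_fields(doc, ["author", "myfield"]))
```

Checking against `keys()` first avoids depending on what `get()` does for a field that is absent, which the diff does not specify.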
@ -5118,381 +5140,347 @@ recollindex -c "$confdir"
<sect2 id="RCL.PROGRAM.PYTHONAPI.SEARCH"> <sect2 id="RCL.PROGRAM.PYTHONAPI.SEARCH">
<title>Python search interface</title> <title>Python search interface</title>
<sect3 id="RCL.PROGRAM.PYTHONAPI.PACKAGE">
<title>Recoll package</title>
<para>The <literal>recoll</literal> package contains two
modules:
<itemizedlist>
<listitem><para>The <literal>recoll</literal> module contains
functions and classes used to query (or update) the
index. This section will only describe the query part, see
further for the update part.</para></listitem>
<listitem><para>The <literal>rclextract</literal> module contains
functions and classes used to access document
data.</para></listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="RCL.PROGRAM.PYTHONAPI.RECOLL"> <sect3 id="RCL.PROGRAM.PYTHONAPI.RECOLL">
<title>The recoll module</title> <title>The recoll module</title>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.FUNCTIONS"> <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CONNECT">
<title>Functions</title> <title>connect(confdir=None, extra_dbs=None, writable = False)</title>
<variablelist> <para>The <literal>connect()</literal> function connects to
<varlistentry> one or several &RCL; index(es) and returns
<term>connect(confdir=None, extra_dbs=None, a <literal>Db</literal> object.</para>
writable = False)</term> <para>This call initializes the recoll module, and it should
<listitem> always be performed before any other call or object
<para>The <literal>connect()</literal> function connects to creation.</para>
one or several &RCL; index(es) and returns <itemizedlist>
a <literal>Db</literal> object.</para> <listitem><para><literal>confdir</literal> may specify
<itemizedlist> a configuration directory. The usual defaults
<listitem><para><literal>confdir</literal> may specify apply.</para></listitem>
a configuration directory. The usual defaults <listitem><para><literal>extra_dbs</literal> is a list of
apply.</para></listitem> additional indexes (Xapian directories).</para></listitem>
<listitem><para><literal>extra_dbs</literal> is a list of <listitem><para><literal>writable</literal> decides if
additional indexes (Xapian directories).</para></listitem> we can index new data through this
<listitem><para><literal>writable</literal> decides if connection.</para></listitem>
we can index new data through this </itemizedlist>
connection.</para></listitem> </simplesect>
</itemizedlist>
<para>This call initializes the recoll module, and it should
always be performed before any other call or object
creation.</para>
</listitem>
</varlistentry>
</variablelist>
</sect4>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.DB">
<title>The Db class</title>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES"> <para>A Db object is created by a <literal>connect()</literal>
<title>Classes</title> call and holds a connection to a Recoll index.</para>
<variablelist>
<varlistentry>
<term>Db.close()</term>
<listitem><para>Closes the connection. You can't do anything
with the <literal>Db</literal> object after
this.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.query(), Db.cursor()</term> <listitem><para>These
aliases return a blank <literal>Query</literal> object
for this index.</para></listitem>
</varlistentry>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DB"> <varlistentry>
<title>The Db class</title> <term>Db.setAbstractParams(maxchars,
contextwords)</term> <listitem><para>Set the parameters used
to build snippets (sets of keywords in context text
fragments). <literal>maxchars</literal> defines the
maximum total size of the abstract.
<literal>contextwords</literal> defines how many
terms are shown around the keyword.</para></listitem>
</varlistentry>
<para>A Db object is created by <varlistentry>
a <literal>connect()</literal> call and holds a <term>Db.termMatch(match_type, expr, field='',
connection to a Recoll index.</para> maxlen=-1, casesens=False, diacsens=False, lang='english')
<variablelist> </term>
<varlistentry> <listitem><para>Expand an expression against the
<term>Db.close()</term> index term list. Performs the basic function from the
<listitem><para>Closes the connection. You can't do anything GUI term explorer tool. <literal>match_type</literal>
with the <literal>Db</literal> object after can be either
this.</para></listitem> of <literal>wildcard</literal>, <literal>regexp</literal>
</varlistentry> or <literal>stem</literal>. Returns a list of terms
<varlistentry> expanded from the input expression.
<term>Db.query(), Db.cursor()</term> <listitem><para>These </para></listitem>
aliases return a blank <literal>Query</literal> object </varlistentry>
for this index.</para></listitem>
</varlistentry>
<varlistentry> </variablelist>
<term>Db.setAbstractParams(maxchars,
contextwords)</term> <listitem><para>Set the parameters used
to build snippets (sets of keywords in context text
fragments). <literal>maxchars</literal> defines the
maximum total size of the abstract.
<literal>contextwords</literal> defines how many
terms are shown around the keyword.</para></listitem>
</varlistentry>
<varlistentry> </simplesect>
<term>Db.termMatch(match_type, expr, field='', <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
maxlen=-1, casesens=False, diacsens=False, lang='english') <title>The Query class</title>
</term>
<listitem><para>Expand an expression against the
index term list. Performs the basic function from the
GUI term explorer tool. <literal>match_type</literal>
can be either
of <literal>wildcard</literal>, <literal>regexp</literal>
or <literal>stem</literal>. Returns a list of terms
expanded from the input expression.
</para></listitem>
</varlistentry>
</variablelist> <para>A <literal>Query</literal> object (equivalent to a
cursor in the Python DB API) is created by
a <literal>Db.query()</literal> call. It is used to
execute index searches.</para>
</sect5> <variablelist>
<varlistentry>
<term>Query.sortby(fieldname, ascending=True)</term>
<listitem><para>Sort results
by <replaceable>fieldname</replaceable>, in ascending
or descending order. Must be called before executing
the search.</para></listitem>
</varlistentry>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY"> <varlistentry>
<title>The Query class</title> <term>Query.execute(query_string, stemming=1,
stemlang="english", fetchtext=False)</term>
<listitem><para>Starts a search
for <replaceable>query_string</replaceable>, a &RCL;
search language string. If the index stores the document
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<para>A <literal>Query</literal> object (equivalent to a <varlistentry>
cursor in the Python DB API) is created by <term>Query.executesd(SearchData, fetchtext=False)</term>
a <literal>Db.query()</literal> call. It is used to <listitem><para>Starts a search for the query defined by
execute index searches.</para> the SearchData object. If the index stores the document
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<variablelist> <varlistentry>
<term>Query.fetchmany(size=query.arraysize)</term>
<varlistentry> <listitem><para>Fetches
<term>Query.sortby(fieldname, ascending=True)</term> the next <literal>Doc</literal> objects in the current
<listitem><para>Sort results search results, and returns them as an array of the
by <replaceable>fieldname</replaceable>, in ascending required size, which is by default the value of
or descending order. Must be called before executing the <literal>arraysize</literal> data member.</para></listitem>
the search.</para></listitem> </varlistentry>
</varlistentry>
<varlistentry> <varlistentry>
<term>Query.execute(query_string, stemming=1, <term>Query.fetchone()</term> <listitem><para>Fetches the
stemlang="english", fetchtext=False)</term> next <literal>Doc</literal> object from the current
<listitem><para>Starts a search search results. Generates a StopIteration exception if
for <replaceable>query_string</replaceable>, a &RCL; there are no results left.</para></listitem>
search language string. If the index stores the document </varlistentry>
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>Query.executesd(SearchData, fetchtext=False)</term> <term>Query.close()</term>
<listitem><para>Starts a search for the query defined by <listitem><para>Closes the query. The object is unusable
the SearchData object. If the index stores the document after the call.</para></listitem>
texts and <literal>fetchtext</literal> is True, store the </varlistentry>
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>Query.fetchmany(size=query.arraysize)</term> <term>Query.scroll(value, mode='relative')</term>
<listitem><para>Adjusts the position in the current result
set. <literal>mode</literal> can
be <literal>relative</literal>
or <literal>absolute</literal>. </para></listitem>
</varlistentry>
<listitem><para>Fetches <varlistentry>
the next <literal>Doc</literal> objects in the current <term>Query.getgroups()</term>
search results, and returns them as an array of the <listitem><para>Retrieves the expanded query terms as a list
required size, which is by default the value of of pairs. Meaningful only after executexx. In each
the <literal>arraysize</literal> data member.</para></listitem> pair, the first entry is a list of user terms (of size
</varlistentry> one for simple terms, or more for group and phrase
clauses), the second a list of query terms as derived
from the user terms and used in the Xapian
Query.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>Query.fetchone()</term> <listitem><para>Fetches the <term>Query.getxquery()</term>
next <literal>Doc</literal> object from the current <listitem><para>Return the Xapian query description as a
search results. Generates a StopIteration exception if Unicode string.
there are no results left.</para></listitem> Meaningful only after executexx.</para></listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>Query.close()</term> <term>Query.highlight(text, ishtml = 0, methods = object)</term>
<listitem><para>Closes the query. The object is unusable <listitem><para>Will insert &lt;span class="rclmatch">,
after the call.</para></listitem> &lt;/span> tags around the match areas in the input text
</varlistentry> and return the modified text. <literal>ishtml</literal>
can be set to indicate that the input text is HTML and
that HTML special characters should not be escaped.
<literal>methods</literal> if set should be an object
with methods startMatch(i) and endMatch() which will be
called for each match and should return a begin and end
tag.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>Query.scroll(value, mode='relative')</term> <term>Query.makedocabstract(doc, methods = object)</term>
<listitem><para>Adjusts the position in the current result <listitem><para>Create a snippets abstract
set. <literal>mode</literal> can for <literal>doc</literal> (a <literal>Doc</literal>
be <literal>relative</literal> object) by selecting text around the match terms.
or <literal>absolute</literal>. </para></listitem> If methods is set, will also perform highlighting. See
</varlistentry> the highlight method.
</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>Query.getgroups()</term> <term>Query.__iter__() and Query.next()</term>
<listitem><para>Retrieves the expanded query terms as a list <listitem><para>So that things like <literal>for doc in
of pairs. Meaningful only after executexx. In each query:</literal> will work.</para></listitem>
pair, the first entry is a list of user terms (of size </varlistentry>
one for simple terms, or more for group and phrase </variablelist>
clauses), the second a list of query terms as derived
from the user terms and used in the Xapian
Query.</para></listitem>
</varlistentry>
<varlistentry> <variablelist>
<term>Query.getxquery()</term>
<listitem><para>Return the Xapian query description as a
Unicode string.
Meaningful only after executexx.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry><term>Query.arraysize</term>
<term>Query.highlight(text, ishtml = 0, methods = object)</term> <listitem><para>Default number of records processed by fetchmany
<listitem><para>Will insert &lt;span class="rclmatch">, (r/w).</para></listitem>
&lt;/span> tags around the match areas in the input text </varlistentry>
and return the modified text. <literal>ishtml</literal> <varlistentry><term>Query.rowcount</term><listitem><para>Number
can be set to indicate that the input text is HTML and of records returned by the last
that HTML special characters should not be escaped. execute.</para></listitem></varlistentry>
<literal>methods</literal> if set should be an object <varlistentry><term>Query.rownumber</term><listitem><para>Next index
with methods startMatch(i) and endMatch() which will be to be fetched from results. Normally increments after
called for each match and should return a begin and end each fetchone() call, but can be set/reset before the
tag.</para></listitem> call to effect seeking (equivalent to
</varlistentry> using <literal>scroll()</literal>). Starts at
0.</para></listitem>
</varlistentry>
<varlistentry> </variablelist>
<term>Query.makedocabstract(doc, methods = object)</term>
<listitem><para>Create a snippets abstract
for <literal>doc</literal> (a <literal>Doc</literal>
object) by selecting text around the match terms.
If methods is set, will also perform highlighting. See
the highlight method.
</para></listitem>
</varlistentry>
<varlistentry> </simplesect>
<term>Query.__iter__() and Query.next()</term> <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
<listitem><para>So that things like <literal>for doc in <title>The Doc class</title>
query:</literal> will work.</para></listitem>
</varlistentry>
</variablelist>
<variablelist> <para>A <literal>Doc</literal> object contains index data
for a given document. The data is extracted from the
index when searching, or set by the indexer program when
updating. The Doc object has many attributes to be read or
set by its user. It mostly matches the Rcl::Doc C++
object. Some of the attributes are predefined, but,
especially when indexing, others can be set, the name of
which will be processed as field names by the indexing
configuration. Inputs can be specified as Unicode or
strings. Outputs are Unicode objects. All dates are
specified as Unix timestamps, printed as strings. Please
refer to the <filename>rcldb/rcldoc.cpp</filename> C++ file
for a full description of the predefined attributes. Here
follows a short list.</para>
<varlistentry><term>Query.arraysize</term> <para><itemizedlist>
<listitem><para>Default number of records processed by fetchmany <listitem><para><literal>url</literal> the document URL but
(r/w).</para></listitem> see also <literal>getbinurl()</literal></para></listitem>
</varlistentry>
<varlistentry><term>Query.rowcount</term><listitem><para>Number
of records returned by the last
execute.</para></listitem></varlistentry>
<varlistentry><term>Query.rownumber</term><listitem><para>Next index
to be fetched from results. Normally increments after
each fetchone() call, but can be set/reset before the
call to effect seeking (equivalent to
using <literal>scroll()</literal>). Starts at
0.</para></listitem>
</varlistentry>
</variablelist> <listitem><para><literal>ipath</literal> the document
<literal>ipath</literal> for embedded
documents.</para></listitem>
</sect5> <listitem><para><literal>fbytes, dbytes</literal> the document
file and text sizes.</para></listitem>
<listitem><para><literal>fmtime, dmtime</literal> the document
file and document times.</para></listitem>
<listitem><para><literal>xdocid</literal> the document
Xapian document ID. This is useful if you want to access
the document through a direct Xapian
operation.</para></listitem>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC"> <listitem><para><literal>mtype</literal> the document
<title>The Doc class</title> MIME type.</para></listitem>
<para>A <literal>Doc</literal> object contains index data <listitem><para>Fields stored by default:
for a given document. The data is extracted from the <literal>author</literal>, <literal>filename</literal>,
index when searching, or set by the indexer program when <literal>keywords</literal>,
updating. The Doc object has many attributes to be read or <literal>recipient</literal></para></listitem>
set by its user. It matches exactly the Rcl::Doc C++
object. Some of the attributes are predefined, but,
especially when indexing, others can be set, the name of
which will be processed as field names by the indexing
configuration. Inputs can be specified as Unicode or
strings. Outputs are Unicode objects. All dates are
specified as Unix timestamps, printed as strings. Please
refer to the <filename>rcldb/rcldoc.cpp</filename> C++ file
for a full description of the predefined attributes. Here
follows a short list.</para>
<para><itemizedlist> </itemizedlist>
<listitem><para><literal>url</literal> the document URL but </para>
see also <literal>getbinurl()</literal></para></listitem>
<listitem><para><literal>ipath</literal> the document <para>At query time, only the fields that are defined as
<literal>ipath</literal> for embedded <literal>stored</literal> either by default or in the
documents.</para></listitem> <filename>fields</filename> configuration file will be meaningful
in the <literal>Doc</literal> object. The document processed text
may be present or not, depending on whether the index stores the text at
all, and if it does, on the <literal>fetchtext</literal> query
execute option. See also the <literal>rclextract</literal> module
for accessing document contents.</para>
<listitem><para><literal>fbytes, dbytes</literal> the document <variablelist>
file and text sizes.</para></listitem>
<listitem><para><literal>fmtime, dmtime</literal> the document
file and document times.</para></listitem>
<listitem><para><literal>xdocid</literal> the document <varlistentry>
Xapian document ID. This is useful if you want to access <term>get(key), [] operator</term>
the document through a direct Xapian
operation.</para></listitem>
<listitem><para><literal>mtype</literal> the document <listitem><para>Retrieve the named document
MIME type.</para></listitem> attribute. You can also use <literal>getattr(doc,
key)</literal> or
<literal>doc.key</literal>.</para></listitem>
</varlistentry>
<listitem><para>Fields stored by default: <varlistentry>
<literal>author</literal>, <literal>filename</literal>, <term>doc.key = value</term>
<literal>keywords</literal>,
<literal>recipient</literal></para></listitem>
</itemizedlist> <listitem><para>Set the named document attribute. You
</para> can also use <literal>setattr(doc, key,
value)</literal>.</para></listitem>
</varlistentry>
<para>At query time, only the fields that are defined <varlistentry>
as <literal>stored</literal> either by default or in <term>getbinurl()</term>
the <filename>fields</filename> configuration file will be
meaningful in the <literal>Doc</literal>
object. Especially this will not be the case for the
document text. See the <literal>rclextract</literal>
module for accessing document contents.</para>
<variablelist> <listitem><para>Retrieve the URL in byte array format (no
transcoding), for use as parameter to a system
call.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>get(key), [] operator</term> <term>setbinurl(url)</term>
<listitem><para>Retrieve the named document <listitem><para>Set the URL in byte array format (no
attribute. You can also use <literal>getattr(doc, transcoding).</para></listitem>
key)</literal> or </varlistentry>
<literal>doc.key</literal>.</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>doc.key = value</term> <term>items()</term>
<listitem><para>Return a dictionary of doc object
keys/values</para></listitem>
</varlistentry>
<listitem><para>Set the named document attribute. You <varlistentry>
can also use <literal>setattr(doc, key, <term>keys()</term>
value)</literal>.</para></listitem> <listitem><para>list of doc object keys (attribute
</varlistentry> names).</para></listitem>
</varlistentry>
</variablelist>
<varlistentry> </simplesect> <!-- Doc -->
<term>getbinurl()</term>
<listitem><para>Retrieve the URL in byte array format (no <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
transcoding), for use as parameter to a system <title>The SearchData class</title>
call.</para></listitem>
</varlistentry>
<varlistentry> <para>A <literal>SearchData</literal> object allows building
<term>setbinurl(url)</term> a query by combining clauses, for execution
by <literal>Query.executesd()</literal>. It can be used
in replacement of the query language approach. The
interface is going to change a little, so no detailed doc
for now...</para>
<listitem><para>Set the URL in byte array format (no <variablelist>
transcoding).</para></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term>items()</term> <term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
<listitem><para>Return a dictionary of doc object qstring=string, slack=0, field='', stemming=1,
keys/values</para></listitem> subSearch=SearchData)</term>
</varlistentry> <listitem><para></para></listitem>
</varlistentry>
</variablelist>
<varlistentry> </simplesect> <!-- SearchData -->
<term>keys()</term>
<listitem><para>list of doc object keys (attribute
names).</para></listitem>
</varlistentry>
</variablelist>
</sect5> <!-- Doc --> </sect3> <!-- Recoll module -->
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
<title>The SearchData class</title>
<para>A <literal>SearchData</literal> object allows building
a query by combining clauses, for execution
by <literal>Query.executesd()</literal>. It can be used
in replacement of the query language approach. The
interface is going to change a little, so no detailed doc
for now...</para>
<variablelist>
<varlistentry>
<term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
qstring=string, slack=0, field='', stemming=1,
subSearch=SearchData)</term>
<listitem><para></para></listitem>
</varlistentry>
</variablelist>
</sect5> <!-- SearchData -->
</sect4> <!-- recoll.classes -->
</sect3> <!-- Recoll module -->
<sect3 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT"> <sect3 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT">
<title>The rclextract module</title> <title>The rclextract module</title>
<para>Prior to &RCL; 1.25, index queries never provide document <para>Prior to &RCL; 1.25, index queries could not provide document
content because it is not stored. More recent versions usually content because it was never stored. &RCL; 1.25 and later usually
store the document text, which can be optionally retrieved when store the document text, which can be optionally retrieved when
running a query (see <literal>query.execute()</literal> running a query (see <literal>query.execute()</literal>
above - the result is always plain text).</para> above - the result is always plain text).</para>
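The `methods` argument of `Query.highlight()` and `Query.makedocabstract()`, described earlier in this hunk, is an object providing `startMatch(i)` and `endMatch()`. A hedged sketch of such an object follows. The class name, the `<b>` tags, and the reading of `i` as a match index are assumptions, and the commented call at the end only restates the signature shown in the diff.

```python
class BoldHighlighter:
    """Hypothetical methods object: wraps each match in <b>...</b>."""

    def startMatch(self, idx):
        # idx is assumed to identify the match; this sketch ignores it
        return "<b>"

    def endMatch(self):
        return "</b>"


hl = BoldHighlighter()
# Simulate what the library would do around one matched term
sample = hl.startMatch(0) + "recoll" + hl.endMatch()
print(sample)

# With a real, executed Query object this would be passed as:
#   marked = query.highlight(text, methods=hl)
```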
@ -5506,7 +5494,7 @@ recollindex -c "$confdir"
<para>You need to import the <literal>recoll</literal> module <para>You need to import the <literal>recoll</literal> module
before the <literal>rclextract</literal> module.</para> before the <literal>rclextract</literal> module.</para>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR"> <simplesect id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR">
<title>The Extractor class</title> <title>The Extractor class</title>
<variablelist> <variablelist>
@ -5565,7 +5553,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
</variablelist> </variablelist>
</sect4> </simplesect>
</sect3> <!-- rclextract module --> </sect3> <!-- rclextract module -->
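The context line of the last hunk header shows the test `not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")`. Wrapped in a function, it reads as below. The interpretation (a top-level document stored on the file system, as opposed to an embedded one or one coming from another backend) is an assumption, since the surrounding text is not part of this diff. `FakeDoc` is a hypothetical stand-in for a `Doc` object.

```python
class FakeDoc(dict):
    """Hypothetical stand-in for a recoll Doc, with attribute access to fields."""

    def __getattr__(self, name):
        # Missing fields read as an empty string in this stand-in (assumption)
        return self.get(name, "")


def is_plain_file(doc):
    """Same test as in the hunk context, written with the idiomatic 'not in'."""
    return not doc.ipath and ("rclbes" not in doc.keys() or doc["rclbes"] == "FS")


print(is_plain_file(FakeDoc(url="file:///tmp/a.txt")))
print(is_plain_file(FakeDoc(url="file:///tmp/b.zip", ipath="1")))
```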