Jean-Francois Dockes 2019-04-12 12:01:12 +02:00
parent ad89225b24
commit 3ebf1a7db2
3 changed files with 819 additions and 863 deletions


@ -17,8 +17,9 @@ XSLDIR="/usr/share/xml/docbook/stylesheet/docbook-xsl/"
# Options common to the single-file and chunked versions
commonoptions=--stringparam section.autolabel 1 \
--stringparam section.autolabel.max.depth 3 \
--stringparam section.autolabel.max.depth 2 \
--stringparam section.label.includes.component.label 1 \
--stringparam toc.max.depth 3 \
--stringparam autotoc.label.in.hyperlink 0 \
--stringparam abstract.notitle.enabled 1 \
--stringparam html.stylesheet docbook-xsl.css \

File diff suppressed because it is too large


@ -4966,13 +4966,14 @@ recollindex -c "$confdir"
<sect2 id="RCL.PROGRAM.PYTHONAPI.INTRO">
<title>Introduction</title>
<para>&RCL; versions after 1.11 define a Python programming
interface, both for searching and creating/updating an
index.</para>
<para>The &RCL; Python programming interface can be used both for
searching and for creating/updating an index. Bindings exist for
Python2 and Python3.</para>
<para>The search interface is used in the &RCL; Ubuntu Unity Lens
and the &RCL; Web UI. It can run queries on any &RCL;
configuration.</para>
<para>The search interface is used in a number of active projects:
the &RCL; <application>Gnome Shell Search Provider</application>,
the &RCL; Web UI, and the upmpdcli UPnP Media Server, in addition
to many small scripts.</para>
<para>The index update section of the API may be used to create and
update &RCL; indexes on specific configurations (separate from the
@ -4998,6 +4999,19 @@ recollindex -c "$confdir"
paragraph at the end of this section will explain a few differences
and ways to write code compatible with both versions.</para>
<para>The <literal>recoll</literal> package now contains two
modules:</para>
<itemizedlist>
<listitem><para>The <literal>recoll</literal> module contains
functions and classes used to query (or update) the
index.</para></listitem>
<listitem><para>The <literal>rclextract</literal> module contains
functions and classes used at query time to access document
data.</para>
</listitem>
</itemizedlist>
<para>There is a good chance that your system repository has
packages for the Recoll Python API, sometimes in a package separate
from the main one (maybe named something like python-recoll). Else
@ -5022,13 +5036,17 @@ recollindex -c "$confdir"
nres = query.execute("some query")
results = query.fetchmany(20)
for doc in results:
print(doc.url, doc.title)
print("%s %s" % (doc.url, doc.title))
]]></programlisting>
<para>You can also take a look at the source for the <ulink
url="https://github.com/koniu/recoll-webui">Recoll
WebUI</ulink>, or the <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli local media server</ulink>, which are both
based on the Python API.</para>
<para>You can also take a look at the source for the
<ulink url="https://opensourceprojects.eu/p/recollwebui/code/ci/78ddb20787b2a894b5e4661a8d5502c4511cf71e/tree/">Recoll
WebUI</ulink>, the
<ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli
local media server</ulink>, or the
<ulink
url="https://opensourceprojects.eu/p/recollgssp/code/ci/3f120108e099f9d687306c0be61593994326d52d/tree/gssp-recoll.py">Gnome
Shell Search Provider</ulink>.</para>
</sect2>
@ -5104,10 +5122,14 @@ recollindex -c "$confdir"
<varlistentry>
<term>Stored and indexed fields</term>
<listitem><para>The <filename>fields</filename> file inside
the &RCL; configuration defines which document fields are
either "indexed" (searchable), "stored" (retrievable with
search results), or both.</para>
<listitem><para>The <link
linkend="RCL.INSTALL.CONFIG.FIELDS"><filename>fields</filename>
file</link> inside the &RCL; configuration defines which
document fields are either <literal>indexed</literal>
(searchable), <literal>stored</literal> (retrievable with
search results), or both. Apart from a few standard/internal
fields, only the <literal>stored</literal> fields are
retrievable through the Python search interface.</para>
</listitem>
</varlistentry>
@ -5118,381 +5140,347 @@ recollindex -c "$confdir"
<sect2 id="RCL.PROGRAM.PYTHONAPI.SEARCH">
<title>Python search interface</title>
<sect3 id="RCL.PROGRAM.PYTHONAPI.PACKAGE">
<title>Recoll package</title>
<para>The <literal>recoll</literal> package contains two
modules:
<itemizedlist>
<listitem><para>The <literal>recoll</literal> module contains
functions and classes used to query (or update) the
index. This section will only describe the query part, see
further for the update part.</para></listitem>
<listitem><para>The <literal>rclextract</literal> module contains
functions and classes used to access document
data.</para></listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="RCL.PROGRAM.PYTHONAPI.RECOLL">
<title>The recoll module</title>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.FUNCTIONS">
<title>Functions</title>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CONNECT">
<title>connect(confdir=None, extra_dbs=None, writable = False)</title>
<variablelist>
<varlistentry>
<term>connect(confdir=None, extra_dbs=None,
writable = False)</term>
<listitem>
<para>The <literal>connect()</literal> function connects to
one or several &RCL; index(es) and returns
a <literal>Db</literal> object.</para>
<itemizedlist>
<listitem><para><literal>confdir</literal> may specify
a configuration directory. The usual defaults
apply.</para></listitem>
<listitem><para><literal>extra_dbs</literal> is a list of
additional indexes (Xapian directories).</para></listitem>
<listitem><para><literal>writable</literal> decides if
we can index new data through this
connection.</para></listitem>
</itemizedlist>
<para>This call initializes the recoll module, and it should
always be performed before any other call or object
creation.</para>
</listitem>
</varlistentry>
</variablelist>
</sect4>
<para>The <literal>connect()</literal> function connects to
one or several &RCL; index(es) and returns
a <literal>Db</literal> object.</para>
<para>This call initializes the recoll module, and it should
always be performed before any other call or object
creation.</para>
<itemizedlist>
<listitem><para><literal>confdir</literal> may specify
a configuration directory. The usual defaults
apply.</para></listitem>
<listitem><para><literal>extra_dbs</literal> is a list of
additional indexes (Xapian directories).</para></listitem>
<listitem><para><literal>writable</literal> decides if
we can index new data through this
connection.</para></listitem>
</itemizedlist>
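<para>As a minimal sketch of how these parameters fit together, the helper below opens a read-only connection and runs one query. The name <literal>first_results</literal> and its arguments are assumptions made for illustration. The <literal>recoll</literal> module is passed in as a parameter, and optional arguments are only forwarded when they are set.</para>

```python
def first_results(recoll_module, query_string, confdir=None, extra_dbs=None, count=10):
    """Run query_string and return up to count (url, title) pairs.

    recoll_module is assumed to be the recoll module from the recoll
    package. first_results is a hypothetical helper, not part of the API.
    """
    # Forward only the arguments that were given, so that the usual
    # defaults apply for the others.
    kwargs = {}
    if confdir is not None:
        kwargs["confdir"] = confdir
    if extra_dbs is not None:
        kwargs["extra_dbs"] = extra_dbs

    # connect() comes first: it initializes the module. writable is left
    # at its default (False), so the connection is read-only.
    db = recoll_module.connect(**kwargs)
    try:
        query = db.query()
        query.execute(query_string)
        return [(doc.url, doc.title) for doc in query.fetchmany(count)]
    finally:
        # The Db object cannot be used after close().
        db.close()
```

<para>With the package installed, a call could look like <literal>first_results(recoll, "some query", confdir="/path/to/confdir")</literal>, where the path is a placeholder.</para>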
</simplesect>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.DB">
<title>The Db class</title>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES">
<title>Classes</title>
<para>A Db object is created by a <literal>connect()</literal>
call and holds a connection to a Recoll index.</para>
<variablelist>
<varlistentry>
<term>Db.close()</term>
<listitem><para>Closes the connection. You can't do anything
with the <literal>Db</literal> object after
this.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.query(), Db.cursor()</term> <listitem><para>These
aliases return a blank <literal>Query</literal> object
for this index.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.setAbstractParams(maxchars,
contextwords)</term> <listitem><para>Set the parameters used
to build snippets (sets of keywords in context text
fragments). <literal>maxchars</literal> defines the
maximum total size of the abstract.
<literal>contextwords</literal> defines how many
terms are shown around the keyword.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.termMatch(match_type, expr, field='',
maxlen=-1, casesens=False, diacsens=False, lang='english')
</term>
<listitem><para>Expand an expression against the
index term list. Performs the basic function from the
GUI term explorer tool. <literal>match_type</literal>
can be either
of <literal>wildcard</literal>, <literal>regexp</literal>
or <literal>stem</literal>. Returns a list of terms
expanded from the input expression.
</para></listitem>
</varlistentry>
</variablelist>
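<para>The term expansion call can be wrapped as in the following sketch. The helper name <literal>expand_terms</literal> and its <literal>limit</literal> argument are assumptions made for illustration. The result is truncated on the Python side, because the exact meaning of <literal>maxlen</literal> is not described here.</para>

```python
def expand_terms(db, expression, match_type="wildcard", limit=20):
    """Return at most limit index terms expanded from expression.

    db is assumed to be a Db object returned by connect().
    expand_terms is a hypothetical helper, not part of the API.
    """
    # These are the three match types listed for Db.termMatch().
    if match_type not in ("wildcard", "regexp", "stem"):
        raise ValueError("unsupported match_type: %r" % (match_type,))
    # termMatch() returns a list of terms; keep only the first ones.
    return list(db.termMatch(match_type, expression))[:limit]
```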
</simplesect>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
<title>The Query class</title>
<para>A <literal>Query</literal> object (equivalent to a
cursor in the Python DB API) is created by
a <literal>Db.query()</literal> call. It is used to
execute index searches.</para>
<variablelist>
<varlistentry>
<term>Query.sortby(fieldname, ascending=True)</term>
<listitem><para>Sort results
by <replaceable>fieldname</replaceable>, in ascending
or descending order. Must be called before executing
the search.</para></listitem>
</varlistentry>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DB">
<title>The Db class</title>
<varlistentry>
<term>Query.execute(query_string, stemming=1,
stemlang="english", fetchtext=False)</term>
<listitem><para>Starts a search
for <replaceable>query_string</replaceable>, a &RCL;
search language string. If the index stores the document
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<para>A Db object is created by
a <literal>connect()</literal> call and holds a
connection to a Recoll index.</para>
<variablelist>
<varlistentry>
<term>Db.close()</term>
<listitem><para>Closes the connection. You can't do anything
with the <literal>Db</literal> object after
this.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.query(), Db.cursor()</term> <listitem><para>These
aliases return a blank <literal>Query</literal> object
for this index.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.executesd(SearchData, fetchtext=False)</term>
<listitem><para>Starts a search for the query defined by
the SearchData object. If the index stores the document
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.setAbstractParams(maxchars,
contextwords)</term> <listitem><para>Set the parameters used
to build snippets (sets of keywords in context text
fragments). <literal>maxchars</literal> defines the
maximum total size of the abstract.
<literal>contextwords</literal> defines how many
terms are shown around the keyword.</para></listitem>
</varlistentry>
<varlistentry>
<term>Db.termMatch(match_type, expr, field='',
maxlen=-1, casesens=False, diacsens=False, lang='english')
</term>
<listitem><para>Expand an expression against the
index term list. Performs the basic function from the
GUI term explorer tool. <literal>match_type</literal>
can be either
of <literal>wildcard</literal>, <literal>regexp</literal>
or <literal>stem</literal>. Returns a list of terms
expanded from the input expression.
</para></listitem>
</varlistentry>
</variablelist>
</sect5>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
<title>The Query class</title>
<para>A <literal>Query</literal> object (equivalent to a
cursor in the Python DB API) is created by
a <literal>Db.query()</literal> call. It is used to
execute index searches.</para>
<variablelist>
<varlistentry>
<term>Query.sortby(fieldname, ascending=True)</term>
<listitem><para>Sort results
by <replaceable>fieldname</replaceable>, in ascending
or descending order. Must be called before executing
the search.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.execute(query_string, stemming=1,
stemlang="english", fetchtext=False)</term>
<listitem><para>Starts a search
for <replaceable>query_string</replaceable>, a &RCL;
search language string. If the index stores the document
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.executesd(SearchData, fetchtext=False)</term>
<listitem><para>Starts a search for the query defined by
the SearchData object. If the index stores the document
texts and <literal>fetchtext</literal> is True, store the
document extracted text in
<literal>doc.text</literal>.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.fetchmany(size=query.arraysize)</term>
<listitem><para>Fetches
the next <literal>Doc</literal> objects in the current
search results, and returns them as an array of the
required size, which is by default the value of
the <literal>arraysize</literal> data member.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.fetchone()</term> <listitem><para>Fetches the
next <literal>Doc</literal> object from the current
search results. Raises a StopIteration exception if
there are no results left.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.close()</term>
<listitem><para>Closes the query. The object is unusable
after the call.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.scroll(value, mode='relative')</term>
<listitem><para>Adjusts the position in the current result
set. <literal>mode</literal> can
be <literal>relative</literal>
or <literal>absolute</literal>. </para></listitem>
</varlistentry>
<varlistentry>
<term>Query.getgroups()</term>
<listitem><para>Retrieves the expanded query terms as a list
of pairs. Meaningful only after <literal>execute()</literal> or <literal>executesd()</literal>. In each
pair, the first entry is a list of user terms (of size
one for simple terms, or more for group and phrase
clauses), the second a list of query terms as derived
from the user terms and used in the Xapian
Query.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.getxquery()</term>
<listitem><para>Return the Xapian query description as a
Unicode string.
Meaningful only after <literal>execute()</literal> or <literal>executesd()</literal>.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.highlight(text, ishtml = 0, methods = object)</term>
<listitem><para>Inserts <literal>&lt;span class="rclmatch"&gt;</literal>
and <literal>&lt;/span&gt;</literal> tags around the match areas in the input text
and returns the modified text. <literal>ishtml</literal>
can be set to indicate that the input text is HTML and
that HTML special characters should not be escaped.
<literal>methods</literal>, if set, should be an object
with methods <literal>startMatch(i)</literal> and <literal>endMatch()</literal>, which are
called for each match and should return the begin and end
tags respectively.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.makedocabstract(doc, methods = object)</term>
<listitem><para>Create a snippets abstract
for <literal>doc</literal> (a <literal>Doc</literal>
object) by selecting text around the match terms.
If methods is set, will also perform highlighting. See
the highlight method.
</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.__iter__() and Query.next()</term>
<listitem><para>Implement the iterator protocol, so that constructs like <literal>for doc in
query:</literal> work.</para></listitem>
</varlistentry>
</variablelist>
<variablelist>
<varlistentry><term>Query.arraysize</term>
<listitem><para>Default number of records processed by fetchmany
(r/w).</para></listitem>
</varlistentry>
<varlistentry><term>Query.rowcount</term><listitem><para>Number
of records returned by the last
execute.</para></listitem></varlistentry>
<varlistentry><term>Query.rownumber</term><listitem><para>Next index
to be fetched from results. Normally increments after
each fetchone() call, but can be set/reset before the
call to effect seeking (equivalent to
using <literal>scroll()</literal>). Starts at
0.</para></listitem>
</varlistentry>
</variablelist>
</sect5>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
<title>The Doc class</title>
<para>A <literal>Doc</literal> object contains index data
for a given document. The data is extracted from the
index when searching, or set by the indexer program when
updating. The Doc object has many attributes to be read or
set by its user. It matches exactly the Rcl::Doc C++
object. Some of the attributes are predefined, but,
especially when indexing, others can be set, the name of
which will be processed as field names by the indexing
configuration. Inputs can be specified as Unicode or
strings. Outputs are Unicode objects. All dates are
specified as Unix timestamps, printed as strings. Please
refer to the <filename>rcldb/rcldoc.cpp</filename> C++ file
for a full description of the predefined attributes. Here
follows a short list.</para>
<para><itemizedlist>
<listitem><para><literal>url</literal> the document URL but
see also <literal>getbinurl()</literal></para></listitem>
<varlistentry>
<term>Query.fetchmany(size=query.arraysize)</term>
<listitem><para><literal>ipath</literal> the document
<literal>ipath</literal> for embedded
documents.</para></listitem>
<listitem><para>Fetches
the next <literal>Doc</literal> objects in the current
search results, and returns them as an array of the
required size, which is by default the value of
the <literal>arraysize</literal> data member.</para></listitem>
</varlistentry>
<listitem><para><literal>fbytes, dbytes</literal> the document
file and text sizes.</para></listitem>
<listitem><para><literal>fmtime, dmtime</literal> the document
file and document times.</para></listitem>
<listitem><para><literal>xdocid</literal> the document
Xapian document ID. This is useful if you want to access
the document through a direct Xapian
operation.</para></listitem>
<varlistentry>
<term>Query.fetchone()</term> <listitem><para>Fetches the
next <literal>Doc</literal> object from the current
search results. Raises a StopIteration exception if
there are no results left.</para></listitem>
</varlistentry>
<listitem><para><literal>mtype</literal> the document
MIME type.</para></listitem>
<varlistentry>
<term>Query.close()</term>
<listitem><para>Closes the query. The object is unusable
after the call.</para></listitem>
</varlistentry>
<listitem><para>Fields stored by default:
<literal>author</literal>, <literal>filename</literal>,
<literal>keywords</literal>,
<literal>recipient</literal></para></listitem>
<varlistentry>
<term>Query.scroll(value, mode='relative')</term>
<listitem><para>Adjusts the position in the current result
set. <literal>mode</literal> can
be <literal>relative</literal>
or <literal>absolute</literal>. </para></listitem>
</varlistentry>
</itemizedlist>
</para>
<para>At query time, only the fields that are defined
as <literal>stored</literal> either by default or in
the <filename>fields</filename> configuration file will be
meaningful in the <literal>Doc</literal>
object. Especially this will not be the case for the
document text. See the <literal>rclextract</literal>
module for accessing document contents.</para>
<varlistentry>
<term>Query.getgroups()</term>
<listitem><para>Retrieves the expanded query terms as a list
of pairs. Meaningful only after <literal>execute()</literal> or <literal>executesd()</literal>. In each
pair, the first entry is a list of user terms (of size
one for simple terms, or more for group and phrase
clauses), the second a list of query terms as derived
from the user terms and used in the Xapian
Query.</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.getxquery()</term>
<listitem><para>Return the Xapian query description as a
Unicode string.
Meaningful only after <literal>execute()</literal> or <literal>executesd()</literal>.</para></listitem>
</varlistentry>
<variablelist>
<varlistentry>
<term>Query.highlight(text, ishtml = 0, methods = object)</term>
<listitem><para>Inserts <literal>&lt;span class="rclmatch"&gt;</literal>
and <literal>&lt;/span&gt;</literal> tags around the match areas in the input text
and returns the modified text. <literal>ishtml</literal>
can be set to indicate that the input text is HTML and
that HTML special characters should not be escaped.
<literal>methods</literal>, if set, should be an object
with methods <literal>startMatch(i)</literal> and <literal>endMatch()</literal>, which are
called for each match and should return the begin and end
tags respectively.</para></listitem>
</varlistentry>
<varlistentry>
<term>get(key), [] operator</term>
<varlistentry>
<term>Query.makedocabstract(doc, methods = object)</term>
<listitem><para>Create a snippets abstract
for <literal>doc</literal> (a <literal>Doc</literal>
object) by selecting text around the match terms.
If methods is set, will also perform highlighting. See
the highlight method.
</para></listitem>
</varlistentry>
<varlistentry>
<term>Query.__iter__() and Query.next()</term>
<listitem><para>Implement the iterator protocol, so that constructs like <literal>for doc in
query:</literal> work.</para></listitem>
</varlistentry>
</variablelist>
<listitem><para>Retrieve the named document
attribute. You can also use <literal>getattr(doc,
key)</literal> or
<literal>doc.key</literal>.</para></listitem>
</varlistentry>
<variablelist>
<varlistentry>
<term>doc.key = value</term>
<varlistentry><term>Query.arraysize</term>
<listitem><para>Default number of records processed by fetchmany
(r/w).</para></listitem>
</varlistentry>
<varlistentry><term>Query.rowcount</term><listitem><para>Number
of records returned by the last
execute.</para></listitem></varlistentry>
<varlistentry><term>Query.rownumber</term><listitem><para>Next index
to be fetched from results. Normally increments after
each fetchone() call, but can be set/reset before the
call to effect seeking (equivalent to
using <literal>scroll()</literal>). Starts at
0.</para></listitem>
</varlistentry>
<listitem><para>Set the named document attribute. You
can also use <literal>setattr(doc, key,
value)</literal>.</para></listitem>
</varlistentry>
</variablelist>
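<para>The <literal>methods</literal> object accepted by <literal>highlight()</literal> and <literal>makedocabstract()</literal> can be sketched as follows. The class name <literal>MarkerHighlighter</literal>, the helper <literal>highlighted_abstract</literal> and the plain-text <literal>**</literal> markers are assumptions made for illustration; only <literal>startMatch(i)</literal> and <literal>endMatch()</literal> come from the description above.</para>

```python
class MarkerHighlighter:
    """Supplies the begin and end tags requested for each match.

    Hypothetical example class: it wraps matches in plain-text markers
    instead of HTML tags, which suits terminal output.
    """

    def __init__(self, begin="**", end="**"):
        self.begin = begin
        self.end = end

    def startMatch(self, idx):
        # idx identifies the match; this sketch ignores it.
        return self.begin

    def endMatch(self):
        return self.end


def highlighted_abstract(query, doc, methods=None):
    """Build a snippets abstract for doc with the match terms marked.

    query is assumed to be a Query object on which a search was already
    executed, and doc one of its results.
    """
    if methods is None:
        methods = MarkerHighlighter()
    return query.makedocabstract(doc, methods=methods)
```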
<varlistentry>
<term>getbinurl()</term>
</simplesect>
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
<title>The Doc class</title>
<listitem><para>Retrieve the URL in byte array format (no
transcoding), for use as parameter to a system
call.</para></listitem>
</varlistentry>
<para>A <literal>Doc</literal> object contains index data
for a given document. The data is extracted from the
index when searching, or set by the indexer program when
updating. The Doc object has many attributes to be read or
set by its user. It mostly matches the Rcl::Doc C++
object. Some of the attributes are predefined, but,
especially when indexing, others can be set, the name of
which will be processed as field names by the indexing
configuration. Inputs can be specified as Unicode or
strings. Outputs are Unicode objects. All dates are
specified as Unix timestamps, printed as strings. Please
refer to the <filename>rcldb/rcldoc.cpp</filename> C++ file
for a full description of the predefined attributes. Here
follows a short list.</para>
<varlistentry>
<term>setbinurl(url)</term>
<para><itemizedlist>
<listitem><para><literal>url</literal> the document URL but
see also <literal>getbinurl()</literal></para></listitem>
<listitem><para><literal>ipath</literal> the document
<literal>ipath</literal> for embedded
documents.</para></listitem>
<listitem><para>Set the URL in byte array format (no
transcoding).</para></listitem>
</varlistentry>
<listitem><para><literal>fbytes, dbytes</literal> the document
file and text sizes.</para></listitem>
<listitem><para><literal>fmtime, dmtime</literal> the document
file and document times.</para></listitem>
<listitem><para><literal>xdocid</literal> the document
Xapian document ID. This is useful if you want to access
the document through a direct Xapian
operation.</para></listitem>
<varlistentry>
<term>items()</term>
<listitem><para>Return a dictionary of doc object
keys/values.</para></listitem>
</varlistentry>
<listitem><para><literal>mtype</literal> the document
MIME type.</para></listitem>
<varlistentry>
<term>keys()</term>
<listitem><para>Return the list of doc object keys (attribute
names).</para></listitem>
</varlistentry>
</variablelist>
<listitem><para>Fields stored by default:
<literal>author</literal>, <literal>filename</literal>,
<literal>keywords</literal>,
<literal>recipient</literal></para></listitem>
</sect5> <!-- Doc -->
</itemizedlist>
</para>
<para>At query time, only the fields that are defined as
<literal>stored</literal> either by default or in the
<filename>fields</filename> configuration file will be meaningful
in the <literal>Doc</literal> object. The processed document text
may or may not be present, depending on whether the index stores the
text at all and, if it does, on the <literal>fetchtext</literal> option
of the query execute call. See also the <literal>rclextract</literal> module
for accessing document contents.</para>
<sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
<title>The SearchData class</title>
<variablelist>
<para>A <literal>SearchData</literal> object allows building
a query by combining clauses, for execution
by <literal>Query.executesd()</literal>. It can be used
instead of the query language. The
interface is expected to change, so it is not documented
in detail for now.</para>
<varlistentry>
<term>get(key), [] operator</term>
<variablelist>
<listitem><para>Retrieve the named document
attribute. You can also use <literal>getattr(doc,
key)</literal> or
<literal>doc.key</literal>.</para></listitem>
</varlistentry>
<varlistentry>
<term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
qstring=string, slack=0, field='', stemming=1,
subSearch=SearchData)</term>
<listitem><para></para></listitem>
</varlistentry>
</variablelist>
<varlistentry>
<term>doc.key = value</term>
</sect5> <!-- SearchData -->
<listitem><para>Set the named document attribute. You
can also use <literal>setattr(doc, key,
value)</literal>.</para></listitem>
</varlistentry>
</sect4> <!-- recoll.classes -->
</sect3> <!-- Recoll module -->
<varlistentry>
<term>getbinurl()</term>
<listitem><para>Retrieve the URL in byte array format (no
transcoding), for use as parameter to a system
call.</para></listitem>
</varlistentry>
<varlistentry>
<term>setbinurl(url)</term>
<listitem><para>Set the URL in byte array format (no
transcoding).</para></listitem>
</varlistentry>
<varlistentry>
<term>items()</term>
<listitem><para>Return a dictionary of doc object
keys/values.</para></listitem>
</varlistentry>
<varlistentry>
<term>keys()</term>
<listitem><para>Return the list of doc object keys (attribute
names).</para></listitem>
</varlistentry>
</variablelist>
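<para>The accessors above can be combined to copy selected fields out of a result, as in this sketch. The helper name <literal>doc_fields</literal> and its default field list are assumptions made for illustration. Fields that the <literal>Doc</literal> does not report through <literal>keys()</literal> are skipped.</para>

```python
def doc_fields(doc, wanted=("url", "mtype", "author", "filename")):
    """Return a plain dict holding the wanted fields present in doc.

    doc is assumed to be a Doc object from a query result.
    doc_fields is a hypothetical helper, not part of the API.
    """
    # keys() lists the attribute names that this Doc holds.
    available = set(doc.keys())
    # get(key) retrieves the named attribute; doc[key] would also work.
    return {name: doc.get(name) for name in wanted if name in available}
```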
</simplesect> <!-- Doc -->
<simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
<title>The SearchData class</title>
<para>A <literal>SearchData</literal> object allows building
a query by combining clauses, for execution
by <literal>Query.executesd()</literal>. It can be used
instead of the query language. The
interface is expected to change, so it is not documented
in detail for now.</para>
<variablelist>
<varlistentry>
<term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
qstring=string, slack=0, field='', stemming=1,
subSearch=SearchData)</term>
<listitem><para></para></listitem>
</varlistentry>
</variablelist>
</simplesect> <!-- SearchData -->
</sect3> <!-- Recoll module -->
<sect3 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT">
<title>The rclextract module</title>
<para>Prior to &RCL; 1.25, index queries never provide document
content because it is not stored. More recent versions usually
<para>Prior to &RCL; 1.25, index queries could not provide document
content because it was never stored. &RCL; 1.25 and later usually
store the document text, which can be optionally retrieved when
running a query (see <literal>query.execute()</literal>
above - the result is always plain text).</para>
@ -5506,7 +5494,7 @@ recollindex -c "$confdir"
<para>You need to import the <literal>recoll</literal> module
before the <literal>rclextract</literal> module.</para>
<sect4 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR">
<simplesect id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR">
<title>The Extractor class</title>
<variablelist>
@ -5565,7 +5553,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
</variablelist>
</sect4>
</simplesect>
</sect3> <!-- rclextract module -->