doc

2019-04-12 12:01:12 +02:00 · 2019-04-12 12:01:12 +02:00 · 3ebf1a7db2
commit 3ebf1a7db2
parent ad89225b24
3 changed files with 819 additions and 863 deletions
--- a/src/doc/user/Makefile
+++ b/src/doc/user/Makefile
@@ -17,8 +17,9 @@ XSLDIR="/usr/share/xml/docbook/stylesheet/docbook-xsl/"
 # Options common to the single-file and chunked versions
 commonoptions=--stringparam section.autolabel 1 \
-  --stringparam section.autolabel.max.depth 3 \
+  --stringparam section.autolabel.max.depth 2 \
  --stringparam section.label.includes.component.label 1 \
  --stringparam toc.max.depth 3 \
  --stringparam autotoc.label.in.hyperlink 0 \
  --stringparam abstract.notitle.enabled 1 \
  --stringparam html.stylesheet docbook-xsl.css \
--- a/src/doc/user/usermanual.html
+++ b/src/doc/user/usermanual.html
--- a/src/doc/user/usermanual.xml
+++ b/src/doc/user/usermanual.xml
@@ -4966,13 +4966,14 @@ recollindex -c "$confdir"
      <sect2 id="RCL.PROGRAM.PYTHONAPI.INTRO">
        <title>Introduction</title>
-        <para>&RCL; versions after 1.11 define a Python programming
+        <para>The &RCL; Python programming interface can be used both for
-        interface, both for searching and creating/updating an
+        searching and for creating/updating an index. Bindings exist for
-        index.</para>
+        Python 2 and Python 3.</para>
-        <para>The search interface is used in the &RCL; Ubuntu Unity Lens
+        <para>The search interface is used in a number of active projects:
-        and the &RCL; Web UI. It can run queries on any &RCL;
+        the &RCL; <application>Gnome Shell Search Provider</application>,
-        configuration.</para>
+        the &RCL; Web UI, and the upmpdcli UPnP Media Server, in addition
        to many small scripts.</para>
        <para>The index update section of the API may be used to create and
        update &RCL; indexes on specific configurations (separate from the
@@ -4998,6 +4999,19 @@ recollindex -c "$confdir"
        paragraph at the end of this section will explain a few differences
        and ways to write code compatible with both versions.</para>
        <para>The <literal>recoll</literal> package now contains two
        modules:</para>
        <itemizedlist>
          <listitem><para>The <literal>recoll</literal> module contains
          functions and classes used to query (or update) the
          index.</para></listitem>
          <listitem><para>The <literal>rclextract</literal> module contains
          functions and classes used at query time to access document
          data.</para>
          </listitem>
        </itemizedlist>
        <para>There is a good chance that your system repository has
        packages for the Recoll Python API, sometimes in a package separate
        from the main one (maybe named something like python-recoll).  Else
@@ -5022,13 +5036,17 @@ recollindex -c "$confdir"
        nres = query.execute("some query")
        results = query.fetchmany(20)
        for doc in results:
-        print(doc.url, doc.title)
+            print("%s %s" % (doc.url, doc.title))
        ]]></programlisting>
-        <para>You can also take a look at the source for the <ulink
+        <para>You can also take a look at the source for the
-        url="https://github.com/koniu/recoll-webui">Recoll
+        <ulink  url="https://opensourceprojects.eu/p/recollwebui/code/ci/78ddb20787b2a894b5e4661a8d5502c4511cf71e/tree/">Recoll
-        WebUI</ulink>, or the <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli local media server</ulink>, which are both
+        WebUI</ulink>, the
-        based on the Python API.</para>
+        <ulink url="https://opensourceprojects.eu/p/upmpdcli/code/ci/c8c8e75bd181ad9db2df14da05934e53ca867a06/tree/src/mediaserver/cdplugins/uprcl/uprclfolders.py">upmpdcli 
        local media server</ulink>, or the
        <ulink
            url="https://opensourceprojects.eu/p/recollgssp/code/ci/3f120108e099f9d687306c0be61593994326d52d/tree/gssp-recoll.py">Gnome
        Shell Search Provider</ulink>.</para>
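As a variation on the sample program above, the following sketch drains results one at a time with fetchone(), which this manual describes as raising StopIteration when no results are left. The helper name collect_results and its limit parameter are illustrative assumptions, not part of the Recoll API; the query argument is expected to be a Query on which execute() has already been called.

```python
# Sketch only: collect_results and limit are hypothetical names.
# query is assumed to be a recoll Query that has already been executed.

def collect_results(query, limit=20):
    """Return a list of (url, title) pairs for at most `limit` results."""
    pairs = []
    for _ in range(limit):
        try:
            doc = query.fetchone()
        except StopIteration:
            # No results left in the current result set.
            break
        pairs.append((doc.url, doc.title))
    return pairs
```

Typical use would be to call collect_results(query) right after query.execute("some query"), in place of the fetchmany() loop.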
      </sect2>
@@ -5104,10 +5122,14 @@ recollindex -c "$confdir"
          <varlistentry> 
            <term>Stored and indexed fields</term> 
-            <listitem><para>The <filename>fields</filename> file inside
+            <listitem><para>The <link
-            the &RCL; configuration defines which document fields are
+            linkend="RCL.INSTALL.CONFIG.FIELDS"><filename>fields</filename>
-            either "indexed" (searchable), "stored" (retrievable with
+            file</link> inside the &RCL; configuration defines which
-            search results), or both.</para>
+            document fields are either <literal>indexed</literal>
            (searchable), <literal>stored</literal> (retrievable with
            search results), or both. Apart from a few standard/internal
            fields, only the <literal>stored</literal> fields are
            retrievable through the Python search interface.</para>
            </listitem>
          </varlistentry>
@@ -5118,381 +5140,347 @@ recollindex -c "$confdir"
      <sect2 id="RCL.PROGRAM.PYTHONAPI.SEARCH">
        <title>Python search interface</title>
        <sect3 id="RCL.PROGRAM.PYTHONAPI.PACKAGE">
          <title>Recoll package</title>
          <para>The <literal>recoll</literal> package contains two
          modules:
          <itemizedlist>
            <listitem><para>The <literal>recoll</literal> module contains
            functions and classes used to query (or update) the
            index. This section will only describe the query part, see
            further for the update part.</para></listitem> 
            <listitem><para>The <literal>rclextract</literal> module contains
            functions and classes used to access document
            data.</para></listitem> 
          </itemizedlist>
          </para>            
        </sect3>
        <sect3 id="RCL.PROGRAM.PYTHONAPI.RECOLL">
          <title>The recoll module</title>
-          <sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.FUNCTIONS">
+        <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CONNECT">
-            <title>Functions</title>
+          <title>connect(confdir=None, extra_dbs=None, writable = False)</title>
-            <variablelist>
+          <para>The <literal>connect()</literal> function connects to
-              <varlistentry>
+          one or several &RCL; index(es) and returns
-                <term>connect(confdir=None, extra_dbs=None,
+          a <literal>Db</literal> object.</para>
-                writable = False)</term>
+          <para>This call initializes the recoll module, and it should
-                <listitem>
+          always be performed before any other call or object
-                  <para>The <literal>connect()</literal> function connects to
+          creation.</para> 
-                  one or several &RCL; index(es) and returns
+          <itemizedlist>
-                  a <literal>Db</literal> object.</para>
+            <listitem><para><literal>confdir</literal> may specify
-                  <itemizedlist>
+            a configuration directory. The usual defaults
-                    <listitem><para><literal>confdir</literal> may specify
+            apply.</para></listitem> 
-                    a configuration directory. The usual defaults
+            <listitem><para><literal>extra_dbs</literal> is a list of
-                    apply.</para></listitem> 
+            additional indexes (Xapian directories).</para></listitem>
-                    <listitem><para><literal>extra_dbs</literal> is a list of
+            <listitem><para><literal>writable</literal> decides if
-                    additional indexes (Xapian directories).</para></listitem>
+            we can index new data through this
-                    <listitem><para><literal>writable</literal> decides if
+            connection.</para></listitem>
-                    we can index new data through this
+          </itemizedlist> 
-                    connection.</para></listitem>
+        </simplesect>
                  </itemizedlist> 
                  <para>This call initializes the recoll module, and it should
                  always be performed before any other call or object
                  creation.</para> 
                </listitem>
              </varlistentry>
            </variablelist>
          </sect4>
        <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.DB">
          <title>The Db class</title>
-          <sect4 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES">
+          <para>A Db object is created by a <literal>connect()</literal>
-            <title>Classes</title>
+          call and holds a connection to a Recoll index.</para>
          <variablelist>
            <varlistentry>
              <term>Db.close()</term>
              <listitem><para>Closes the connection. You can't do anything
              with the <literal>Db</literal> object after
              this.</para></listitem>
            </varlistentry>
            <varlistentry>
              <term>Db.query(), Db.cursor()</term> <listitem><para>These
              aliases return a blank <literal>Query</literal> object
              for this index.</para></listitem>
            </varlistentry>
-            <sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DB">
+            <varlistentry>
-              <title>The Db class</title>
+              <term>Db.setAbstractParams(maxchars,
              contextwords)</term> <listitem><para>Set the parameters used
              to build snippets (sets of keywords in context text
              fragments). <literal>maxchars</literal> defines the
              maximum total size of the abstract. 
              <literal>contextwords</literal> defines how many
              terms are shown around the keyword.</para></listitem>
            </varlistentry>
-              <para>A Db object is created by
+            <varlistentry>
-              a <literal>connect()</literal> call and holds a 
+              <term>Db.termMatch(match_type, expr, field='',
-              connection to a Recoll index.</para>
+              maxlen=-1, casesens=False, diacsens=False, lang='english')
-              <variablelist>
+              </term> 
-                <varlistentry>
+              <listitem><para>Expand an expression against the
-                  <term>Db.close()</term>
+              index term list. Performs the basic function from the
-                  <listitem><para>Closes the connection. You can't do anything
+              GUI term explorer tool. <literal>match_type</literal>
-                  with the <literal>Db</literal> object after
+              can be one
-                  this.</para></listitem>
+              of <literal>wildcard</literal>, <literal>regexp</literal>
-                </varlistentry>
+              or <literal>stem</literal>. Returns a list of terms
-                <varlistentry>
+              expanded from the input expression.
-                  <term>Db.query(), Db.cursor()</term> <listitem><para>These
+              </para></listitem>
-                  aliases return a blank <literal>Query</literal> object
+            </varlistentry>
                  for this index.</para></listitem>
                </varlistentry>
-                <varlistentry>
+          </variablelist>
                  <term>Db.setAbstractParams(maxchars,
                  contextwords)</term> <listitem><para>Set the parameters used
                  to build snippets (sets of keywords in context text
                  fragments). <literal>maxchars</literal> defines the
                  maximum total size of the abstract. 
                  <literal>contextwords</literal> defines how many
                  terms are shown around the keyword.</para></listitem>
                </varlistentry>
-                <varlistentry>
+        </simplesect>
-                  <term>Db.termMatch(match_type, expr, field='',
+        <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
-                  maxlen=-1, casesens=False, diacsens=False, lang='english')
+          <title>The Query class</title>
                  </term> 
                  <listitem><para>Expand an expression against the
                  index term list. Performs the basic function from the
                  GUI term explorer tool. <literal>match_type</literal>
                  can be either
                  of <literal>wildcard</literal>, <literal>regexp</literal>
                  or <literal>stem</literal>. Returns a list of terms
                  expanded from the input expression.
                  </para></listitem>
                </varlistentry>
-              </variablelist>
+          <para>A <literal>Query</literal> object (equivalent to a
          cursor in the Python DB API) is created by
          a <literal>Db.query()</literal> call. It is used to
          execute index searches.</para>
-            </sect5>
+          <variablelist>
            <varlistentry>
              <term>Query.sortby(fieldname, ascending=True)</term>
              <listitem><para>Sort results
              by <replaceable>fieldname</replaceable>, in ascending
              or descending order. Must be called before executing
              the search.</para></listitem>
            </varlistentry>
-            <sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.QUERY">
+            <varlistentry>
-              <title>The Query class</title>
+              <term>Query.execute(query_string, stemming=1, 
              stemlang="english", fetchtext=False)</term>
              <listitem><para>Starts a search
              for <replaceable>query_string</replaceable>, a &RCL;
              search language string. If the index stores the document
              texts and <literal>fetchtext</literal> is True, store the
              document extracted text in
              <literal>doc.text</literal>.</para></listitem> 
            </varlistentry>
-              <para>A <literal>Query</literal> object (equivalent to a
+            <varlistentry>
-              cursor in the Python DB API) is created by
+              <term>Query.executesd(SearchData, fetchtext=False)</term>
-              a <literal>Db.query()</literal> call. It is used to
+              <listitem><para>Starts a search for the query defined by
-              execute index searches.</para>
+              the SearchData object. If the index stores the document
              texts and <literal>fetchtext</literal> is True, store the
              document extracted text in
              <literal>doc.text</literal>.</para></listitem>
            </varlistentry>
-              <variablelist>
+            <varlistentry>
              <term>Query.fetchmany(size=query.arraysize)</term> 
-                <varlistentry>
+              <listitem><para>Fetches
-                  <term>Query.sortby(fieldname, ascending=True)</term>
+              the next <literal>Doc</literal> objects in the current
-                  <listitem><para>Sort results
+              search results, and returns them as an array of the
-                  by <replaceable>fieldname</replaceable>, in ascending
+              required size, which is by default the value of
-                  or descending order. Must be called before executing
+              the <literal>arraysize</literal> data member.</para></listitem>
-                  the search.</para></listitem>
+            </varlistentry>
                </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.execute(query_string, stemming=1, 
+              <term>Query.fetchone()</term> <listitem><para>Fetches the
-                  stemlang="english", fetchtext=False)</term>
+              next <literal>Doc</literal> object from the current
-                  <listitem><para>Starts a search
+              search results. Generates a StopIteration exception if
-                  for <replaceable>query_string</replaceable>, a &RCL;
+              there are no results left.</para></listitem>
-                  search language string. If the index stores the document
+            </varlistentry>
                  texts and <literal>fetchtext</literal> is True, store the
                  document extracted text in
                  <literal>doc.text</literal>.</para></listitem> 
                </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.executesd(SearchData, fetchtext=False)</term>
+              <term>Query.close()</term>
-                  <listitem><para>Starts a search for the query defined by
+              <listitem><para>Closes the query. The object is unusable
-                  the SearchData object. If the index stores the document
+              after the call.</para></listitem>
-                  texts and <literal>fetchtext</literal> is True, store the
+            </varlistentry>
                  document extracted text in
                  <literal>doc.text</literal>.</para></listitem>
                </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.fetchmany(size=query.arraysize)</term> 
+              <term>Query.scroll(value, mode='relative')</term>
              <listitem><para>Adjusts the position in the current result
              set. <literal>mode</literal> can
              be <literal>relative</literal>
              or <literal>absolute</literal>. </para></listitem>
            </varlistentry>
-                  <listitem><para>Fetches
+            <varlistentry>
-                  the next <literal>Doc</literal> objects in the current
+              <term>Query.getgroups()</term>
-                  search results, and returns them as an array of the
+              <listitem><para>Retrieves the expanded query terms as a list
-                  required size, which is by default the value of
+              of pairs. Meaningful only after execute() or executesd(). In each
-                  the <literal>arraysize</literal> data member.</para></listitem>
+              pair, the first entry is a list of user terms (of size
-                </varlistentry>
+              one for simple terms, or more for group and phrase
              clauses), the second a list of query terms as derived
              from the user terms and used in the Xapian
              Query.</para></listitem>
            </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.fetchone()</term> <listitem><para>Fetches the
+              <term>Query.getxquery()</term>
-                  next <literal>Doc</literal> object from the current
+              <listitem><para>Return the Xapian query description as a
-                  search results. Generates a StopIteration exception if
+              Unicode string. 
-                  there are no results left.</para></listitem>
+              Meaningful only after execute() or executesd().</para></listitem>
-                </varlistentry>
+            </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.close()</term>
+              <term>Query.highlight(text, ishtml = 0, methods = object)</term>
-                  <listitem><para>Closes the query. The object is unusable
+              <listitem><para>Will insert &lt;span class="rclmatch"&gt;,
-                  after the call.</para></listitem>
+              &lt;/span> tags around the match areas in the input text
-                </varlistentry>
+              and return the modified text.  <literal>ishtml</literal>
              can be set to indicate that the input text is HTML and
              that HTML special characters should not be escaped.
              <literal>methods</literal> if set should be an object
              with methods startMatch(i) and endMatch() which will be
              called for each match and should return the begin and end
              tags, respectively.</para></listitem>
            </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.scroll(value, mode='relative')</term>
+              <term>Query.makedocabstract(doc, methods = object)</term>
-                  <listitem><para>Adjusts the position in the current result
+              <listitem><para>Create a snippets abstract
-                  set. <literal>mode</literal> can
+              for <literal>doc</literal> (a <literal>Doc</literal>
-                  be <literal>relative</literal>
+              object) by selecting text around the match terms.
-                  or <literal>absolute</literal>. </para></listitem>
+              If methods is set, will also perform highlighting. See
-                </varlistentry>
+              the highlight method.
              </para></listitem>
            </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>Query.getgroups()</term>
+              <term>Query.__iter__() and Query.next()</term>
-                  <listitem><para>Retrieves the expanded query terms as a list
+              <listitem><para>So that things like <literal>for doc in
-                  of pairs. Meaningful only after executexx In each
+              query:</literal> will work.</para></listitem>
-                  pair, the first entry is a list of user terms (of size
+            </varlistentry>
-                  one for simple terms, or more for group and phrase
+          </variablelist>
                  clauses), the second a list of query terms as derived
                  from the user terms and used in the Xapian
                  Query.</para></listitem>
                </varlistentry>
-                <varlistentry>
+          <variablelist>
                  <term>Query.getxquery()</term>
                  <listitem><para>Return the Xapian query description as a
                  Unicode string. 
                  Meaningful only after executexx.</para></listitem>
                </varlistentry>
-                <varlistentry>
+            <varlistentry><term>Query.arraysize</term>
-                  <term>Query.highlight(text, ishtml = 0, methods = object)</term>
+            <listitem><para>Default number of records processed by fetchmany
-                  <listitem><para>Will insert &lt;span "class=rclmatch">,
+            (r/w).</para></listitem>  
-                  &lt;/span> tags around the match areas in the input text
+            </varlistentry>
-                  and return the modified text.  <literal>ishtml</literal>
+            <varlistentry><term>Query.rowcount</term><listitem><para>Number
-                  can be set to indicate that the input text is HTML and
+            of records returned by the last
-                  that HTML special characters should not be escaped.
+            execute.</para></listitem></varlistentry>
-                  <literal>methods</literal> if set should be an object
+            <varlistentry><term>Query.rownumber</term><listitem><para>Next index
-                  with methods startMatch(i) and endMatch() which will be
+            to be fetched from results. Normally increments after
-                  called for each match and should return a begin and end
+            each fetchone() call, but can be set/reset before the
-                  tag</para></listitem>
+            call to effect seeking (equivalent to
-                </varlistentry>
+            using <literal>scroll()</literal>). Starts at
            0.</para></listitem> 
            </varlistentry>
-                <varlistentry>
+          </variablelist>
                  <term>Query.makedocabstract(doc, methods = object))</term>
                  <listitem><para>Create a snippets abstract
                  for <literal>doc</literal> (a <literal>Doc</literal>
                  object) by selecting text around the match terms.
                  If methods is set, will also perform highlighting. See
                  the highlight method.
                  </para></listitem>
                </varlistentry>
-                <varlistentry>
+        </simplesect>
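The methods object accepted by Query.highlight() and Query.makedocabstract() can be sketched as a small class with startMatch(i) and endMatch() methods returning the begin and end tags. This is a minimal sketch: the names MatchTagger and abstract_for are hypothetical, the tag strings are only an example, and the query and doc arguments are assumed to come from an already executed Query.

```python
# Sketch only: MatchTagger and abstract_for are illustrative names,
# not part of the Recoll API.

class MatchTagger(object):
    """Supplies the begin/end tags placed around each match area."""

    def startMatch(self, idx):
        # idx is the value passed in for the current match; it is ignored
        # here because every match gets the same begin tag.
        return '<span class="rclmatch">'

    def endMatch(self):
        return '</span>'


def abstract_for(query, doc, tagger=None):
    """Return a highlighted snippets abstract for doc.

    query is assumed to be an executed recoll Query and doc one of its
    results; the methods keyword follows the signature given above.
    """
    if tagger is None:
        tagger = MatchTagger()
    return query.makedocabstract(doc, methods=tagger)
```

A different tag pair, for example bold markers for terminal output, only needs another class with the same two methods.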
-                  <term>Query.__iter__() and Query.next()</term>
+        <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
-                  <listitem><para>So that things like <literal>for doc in
+          <title>The Doc class</title>
                  query:</literal> will work.</para></listitem>
                </varlistentry>
              </variablelist>
-              <variablelist>
+          <para>A <literal>Doc</literal> object contains index data
          for a given document. The data is extracted from the
          index when searching, or set by the indexer program when
          updating. The Doc object has many attributes to be read or
          set by its user. It mostly matches the Rcl::Doc C++
          object. Some of the attributes are predefined, but,
          especially when indexing, others can be set, the name of
          which will be processed as field names by the indexing
          configuration.  Inputs can be specified as Unicode or
          strings. Outputs are Unicode objects. All dates are
          specified as Unix timestamps, printed as strings. Please
          refer to the <filename>rcldb/rcldoc.cpp</filename> C++ file
          for a full description of the predefined attributes. Here
          follows a short list.</para>
-                <varlistentry><term>Query.arraysize</term>
+          <para><itemizedlist>
-                <listitem><para>Default number of records processed by fetchmany
+            <listitem><para><literal>url</literal> the document URL but
-                (r/w).</para></listitem>  
+            see also <literal>getbinurl()</literal></para></listitem>
                </varlistentry>
                <varlistentry><term>Query.rowcount</term><listitem><para>Number
                of records returned by the last
                execute.</para></listitem></varlistentry>
                <varlistentry><term>Query.rownumber</term><listitem><para>Next index
                to be fetched from results. Normally increments after
                each fetchone() call, but can be set/reset before the
                call to effect seeking (equivalent to
                using <literal>scroll()</literal>). Starts at
                0.</para></listitem> 
                </varlistentry>
-              </variablelist>
+            <listitem><para><literal>ipath</literal> the document
            <literal>ipath</literal> for embedded
            documents.</para></listitem> 
-            </sect5>
+            <listitem><para><literal>fbytes, dbytes</literal> the document
            file and text sizes.</para></listitem> 
            <listitem><para><literal>fmtime, dmtime</literal> the document
            file and document times.</para></listitem> 
            <listitem><para><literal>xdocid</literal> the document
            Xapian document ID. This is useful if you want to access
            the document through a direct Xapian
            operation.</para></listitem>
-            <sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.DOC">
+            <listitem><para><literal>mtype</literal> the document
-              <title>The Doc class</title>
+            MIME type.</para></listitem>
-              <para>A <literal>Doc</literal> object contains index data
+            <listitem><para>Fields stored by default:
-              for a given document. The data is extracted from the
+            <literal>author</literal>, <literal>filename</literal>,
-              index when searching, or set by the indexer program when
+            <literal>keywords</literal>,
-              updating. The Doc object has many attributes to be read or
+            <literal>recipient</literal></para></listitem>
              set by its user. It matches exactly the Rcl::Doc C++
              object. Some of the attributes are predefined, but,
              especially when indexing, others can be set, the name of
              which will be processed as field names by the indexing
              configuration.  Inputs can be specified as Unicode or
              strings. Outputs are Unicode objects. All dates are
              specified as Unix timestamps, printed as strings. Please
              refer to the <filename>rcldb/rcldoc.cpp</filename> C++ file
              for a full description of the predefined attributes. Here
              follows a short list.</para>
-              <para><itemizedlist>
+          </itemizedlist>                
-                <listitem><para><literal>url</literal> the document URL but
+          </para>
                see also <literal>getbinurl()</literal></para></listitem>
-                <listitem><para><literal>ipath</literal> the document
+          <para>At query time, only the fields that are defined as
-                <literal>ipath</literal> for embedded
+          <literal>stored</literal> either by default or in the
-                documents.</para></listitem> 
+          <filename>fields</filename> configuration file will be meaningful
          in the <literal>Doc</literal> object. The processed document
          text may or may not be present, depending on whether the index
          stores the text at all and, if it does, on the
          <literal>fetchtext</literal> option of the query execute call.
          See also the <literal>rclextract</literal> module for accessing
          document contents.</para>
-                <listitem><para><literal>fbytes, dbytes</literal> the document
+          <variablelist>
                file and text sizes.</para></listitem> 
                <listitem><para><literal>fmtime, dmtime</literal> the document
                file and document times.</para></listitem> 
-                <listitem><para><literal>xdocid</literal> the document
+            <varlistentry>
-                Xapian document ID. This is useful if you want to access
+              <term>get(key), [] operator</term>
                the document through a direct Xapian
                operation.</para></listitem>
-                <listitem><para><literal>mtype</literal> the document
+              <listitem><para>Retrieve the named document
-                MIME type.</para></listitem>
+              attribute. You can also use <literal>getattr(doc,
              key)</literal> or
              <literal>doc.key</literal>.</para></listitem>
            </varlistentry>
-                <listitem><para>Fields stored by default:
+            <varlistentry>
-                <literal>author</literal>, <literal>filename</literal>,
+              <term>doc.key = value</term>
                <literal>keywords</literal>,
                <literal>recipient</literal></para></listitem>
-              </itemizedlist>                
+              <listitem><para>Set the named document attribute. You
-              </para>
+              can also use <literal>setattr(doc, key,
              value)</literal>.</para></listitem>
            </varlistentry>
-              <para>At query time, only the fields that are defined
+            <varlistentry>
-              as <literal>stored</literal> either by default or in
+              <term>getbinurl()</term>
              the <filename>fields</filename> configuration file will be
              meaningful in the <literal>Doc</literal>
              object. Especially this will not be the case for the
              document text. See the <literal>rclextract</literal>
              module for accessing document contents.</para> 
-              <variablelist>
+              <listitem><para>Retrieve the URL in byte array format (no
              transcoding), for use as parameter to a system
              call.</para></listitem>
            </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>get(key), [] operator</term>
+              <term>setbinurl(url)</term>
-                  <listitem><para>Retrieve the named document
+              <listitem><para>Set the URL in byte array format (no
-                  attribute. You can also use <literal>getattr(doc,
+              transcoding).</para></listitem>
-                  key)</literal> or
+            </varlistentry>
                  <literal>doc.key</literal>.</para></listitem>
                </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>doc.key = value</term>
+              <term>items()</term>
              <listitem><para>Return a dictionary of doc object
              keys/values</para></listitem> 
            </varlistentry>
-                  <listitem><para>Set the the named document attribute. You
+            <varlistentry>
-                  can also use <literal>setattr(doc, key,
+              <term>keys()</term>
-                  value)</literal>.</para></listitem>
+              <listitem><para>Return the list of doc object keys (attribute
-                </varlistentry>
+              names).</para></listitem>
            </varlistentry>
          </variablelist>
-                <varlistentry>
+        </simplesect> <!-- Doc -->
                  <term>getbinurl()</term>
-                  <listitem><para>Retrieve the URL in byte array format (no
+        <simplesect id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
-                  transcoding), for use as parameter to a system
+          <title>The SearchData class</title>
                  call.</para></listitem>
                </varlistentry>
-                <varlistentry>
+          <para>A <literal>SearchData</literal> object allows building
-                  <term>setbinurl(url)</term>
+          a query by combining clauses, for execution
          by <literal>Query.executesd()</literal>. It can be used
          instead of the query language. The interface is expected
          to change somewhat, so it is not documented in detail
          here.</para>
-                  <listitem><para>Set the URL in byte array format (no
+          <variablelist>
                  transcoding).</para></listitem>
                </varlistentry>
-                <varlistentry>
+            <varlistentry>
-                  <term>items()</term>
+              <term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
-                  <listitem><para>Return a dictionary of doc object
+              qstring=string, slack=0, field='', stemming=1,
-                  keys/values</para></listitem> 
+              subSearch=SearchData)</term>
-                </varlistentry>
+              <listitem><para></para></listitem>
            </varlistentry>
          </variablelist>
-                <varlistentry>
+        </simplesect> <!-- SearchData -->
                  <term>keys()</term>
                  <listitem><para>list of doc object keys (attribute
                  names).</para></listitem>
                </varlistentry>
              </variablelist>
-            </sect5> <!-- Doc -->
+      </sect3> <!-- Recoll module -->
            <sect5 id="RCL.PROGRAM.PYTHONAPI.RECOLL.CLASSES.SEARCHDATA">
              <title>The SearchData class</title>
              <para>A <literal>SearchData</literal> object allows building
              a query by combining clauses, for execution
              by <literal>Query.executesd()</literal>. It can be used
              in replacement of the query language approach. The
              interface is going to change a little, so no detailed doc
              for now...</para>
              <variablelist>
                <varlistentry>
                  <term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
                  qstring=string, slack=0, field='', stemming=1,
                  subSearch=SearchData)</term>
                  <listitem><para></para></listitem>
                </varlistentry>
              </variablelist>
            </sect5> <!-- SearchData -->
          </sect4> <!-- recoll.classes -->
        </sect3> <!-- Recoll module -->
        <sect3 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT">
          <title>The rclextract module</title>
-          <para>Prior to &RCL; 1.25, index queries never provide document
+          <para>Prior to &RCL; 1.25, index queries could not provide document
-          content because it is not stored. More recent versions usually
+          content because it was never stored. &RCL; 1.25 and later usually
          store the document text, which can be optionally retrieved when
          running a query (see <literal>query.execute()</literal>
          above - the result is always plain text).</para>
@ -5506,7 +5494,7 @@ recollindex -c "$confdir"
          <para>You need to import the <literal>recoll</literal> module
          before the <literal>rclextract</literal> module.</para>
-          <sect4 id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR">
+          <simplesect id="RCL.PROGRAM.PYTHONAPI.RCLEXTRACT.CLASSES.EXTRACTOR">
            <title>The Extractor class</title>
            <variablelist>
@ -5565,7 +5553,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
              </variablelist>
-          </sect4>
+          </simplesect>
        </sect3> <!-- rclextract module -->