python doc update

Jean-Francois Dockes 2021-01-08 14:34:32 +01:00
parent 727d619d8b
commit 51761b7aa6
2 changed files with 73 additions and 86 deletions

@ -6681,7 +6681,8 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<p>The <span class="application">Recoll</span> Python <p>The <span class="application">Recoll</span> Python
programming interface can be used both for searching and programming interface can be used both for searching and
for creating/updating an index. Bindings exist for for creating/updating an index. Bindings exist for
Python2 and Python3.</p> Python2 and Python3 (Jan 2021: python2 support will be
dropped soon).</p>
<p>The search interface is used in a number of active <p>The search interface is used in a number of active
projects: the <a class="ulink" href= projects: the <a class="ulink" href=
"https://www.lesbonscomptes.com/recoll/pages/download.html#gssp" "https://www.lesbonscomptes.com/recoll/pages/download.html#gssp"
@ -6739,17 +6740,17 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
exercising the extension more completely, and especially exercising the extension more completely, and especially
its data extraction features.</p> its data extraction features.</p>
<pre class="programlisting"> <pre class="programlisting">
#!/usr/bin/env python #!/usr/bin/python3
from recoll import recoll from recoll import recoll
db = recoll.connect() db = recoll.connect()
query = db.query() query = db.query()
nres = query.execute("some query") nres = query.execute("some query")
results = query.fetchmany(20) results = query.fetchmany(20)
for doc in results: for doc in results:
print("%s %s" % (doc.url, doc.title)) print("%s %s" % (doc.url, doc.title))
</pre> </pre>
<p>You can also take a look at the source for the <p>You can also take a look at the source for the
<a class="ulink" href= <a class="ulink" href=
"https://framagit.org/medoc92/recollwebui/-/blob/master/webui.py" "https://framagit.org/medoc92/recollwebui/-/blob/master/webui.py"
@ -7345,7 +7346,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<p>The <code class="literal">rclextract</code> module <p>The <code class="literal">rclextract</code> module
can give access to the original document and to the can give access to the original document and to the
document text content (if not stored by the index, or document text content (if not stored by the index, or
to access an HTML version of the text). Acessing the to access an HTML version of the text). Accessing the
original document is particularly useful if it is original document is particularly useful if it is
embedded (e.g. an email attachment).</p> embedded (e.g. an email attachment).</p>
<p>You need to import the <code class= <p>You need to import the <code class=
@ -7446,7 +7447,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
embryonic GUI which demonstrates the highlighting and embryonic GUI which demonstrates the highlighting and
data extraction functions.</p> data extraction functions.</p>
<pre class="programlisting"> <pre class="programlisting">
#!/usr/bin/env python #!/usr/bin/python3
from recoll import recoll from recoll import recoll
@ -7455,17 +7456,15 @@ db.setAbstractParams(maxchars=80, contextwords=4)
query = db.query() query = db.query()
nres = query.execute("some user question") nres = query.execute("some user question")
print "Result count: ", nres print("Result count: %d" % nres)
if nres &gt; 5: if nres &gt; 5:
nres = 5 nres = 5
for i in range(nres): for i in range(nres):
doc = query.fetchone() doc = query.fetchone()
print "Result #%d" % (query.rownumber,) print("Result #%d" % (query.rownumber))
for k in ("title", "size"): for k in ("title", "size"):
print k, ":", getattr(doc, k).encode('utf-8') print("%s : %s" % (k, getattr(doc, k)))
abs = db.makeDocAbstract(doc, query).encode('utf-8') print("%s\n" % db.makeDocAbstract(doc, query))
print abs
print
</pre> </pre>
</div> </div>
</div> </div>
@ -7651,9 +7650,9 @@ for i in range(nres):
Recoll source (which sets <code class= Recoll source (which sets <code class=
"literal">rclbes="MBOX"</code>):</p> "literal">rclbes="MBOX"</code>):</p>
<pre class="programlisting">[MBOX] <pre class="programlisting">[MBOX]
fetch = /path/to/recoll/src/python/samples/rclmbox.py fetch fetch = /path/to/recoll/src/python/samples/rclmbox.py fetch
makesig = path/to/recoll/src/python/samples/rclmbox.py makesig makesig = path/to/recoll/src/python/samples/rclmbox.py makesig
</pre> </pre>
<p><code class="literal">fetch</code> and <code class= <p><code class="literal">fetch</code> and <code class=
"literal">makesig</code> define two commands to execute "literal">makesig</code> define two commands to execute
to respectively retrieve the document text and compute to respectively retrieve the document text and compute
@ -7708,27 +7707,21 @@ for i in range(nres):
of course).</p> of course).</p>
<p>Adapting to the new package structure:</p> <p>Adapting to the new package structure:</p>
<pre class="programlisting"> <pre class="programlisting">
try:
try: from recoll import recoll
from recoll import recoll from recoll import rclextract
from recoll import rclextract hasextract = True
hasextract = True except:
except: import recoll
import recoll hasextract = False
hasextract = False </pre>
</pre>
<p>Adapting to the change of nature of the <code class= <p>Adapting to the change of nature of the <code class=
"literal">next</code> <code class="literal">Query</code> "literal">next</code> <code class="literal">Query</code>
member. The same test can be used to choose to use the member. The same test can be used to choose to use the
<code class="literal">scroll()</code> method (new) or set <code class="literal">scroll()</code> method (new) or set
the <code class="literal">next</code> value (old).</p> the <code class="literal">next</code> value (old).</p>
<pre class="programlisting"> <pre class=
"programlisting">rownum = query.next if type(query.next) == int else query.rownumber</pre>
rownum = query.next if type(query.next) == int else \
query.rownumber
</pre>
</div> </div>
</div> </div>
</div> </div>

@ -5144,7 +5144,8 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<para>The &RCL; Python programming interface can be used both for <para>The &RCL; Python programming interface can be used both for
searching and for creating/updating an index. Bindings exist for searching and for creating/updating an index. Bindings exist for
Python2 and Python3.</para> Python2 and Python3 (Jan 2021: python2 support will be dropped
soon).</para>
<para>The search interface is used in a number of active projects: <para>The search interface is used in a number of active projects:
the <ulink the <ulink
@ -5192,18 +5193,18 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
extension more completely, and especially its data extraction extension more completely, and especially its data extraction
features.</para> features.</para>
<programlisting><![CDATA[ <programlisting><![CDATA[
#!/usr/bin/env python #!/usr/bin/python3
from recoll import recoll from recoll import recoll
db = recoll.connect() db = recoll.connect()
query = db.query() query = db.query()
nres = query.execute("some query") nres = query.execute("some query")
results = query.fetchmany(20) results = query.fetchmany(20)
for doc in results: for doc in results:
print("%s %s" % (doc.url, doc.title)) print("%s %s" % (doc.url, doc.title))
]]></programlisting> ]]></programlisting>
<para>You can also take a look at the source for the <para>You can also take a look at the source for the
<ulink <ulink
@ -5670,7 +5671,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
<para>The <literal>rclextract</literal> module can give access to <para>The <literal>rclextract</literal> module can give access to
the original document and to the document text content (if not the original document and to the document text content (if not
stored by the index, or to access an HTML version of the text). stored by the index, or to access an HTML version of the text).
Acessing the original document is particularly useful if it is Accessing the original document is particularly useful if it is
embedded (e.g. an email attachment).</para> embedded (e.g. an email attachment).</para>
<para>You need to import the <literal>recoll</literal> module <para>You need to import the <literal>recoll</literal> module
@ -5703,19 +5704,20 @@ qdoc = query.fetchone()
extractor = recoll.Extractor(qdoc) extractor = recoll.Extractor(qdoc)
doc = extractor.textextract(qdoc.ipath) doc = extractor.textextract(qdoc.ipath)
# use doc.text, e.g. for previewing</programlisting> # use doc.text, e.g. for previewing</programlisting>
<para>Passing <literal>qdoc.ipath</literal> to
<para>Passing <literal>qdoc.ipath</literal> to
<literal>textextract()</literal> is redundant, but <literal>textextract()</literal> is redundant, but
reflects the fact that the <literal>Extractor</literal> reflects the fact that the <literal>Extractor</literal>
object actually has the capability to access the other object actually has the capability to access the other
entries in a compound document.</para> entries in a compound document.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>Extractor.idoctofile(ipath, targetmtype, outfile='')</term> <term>Extractor.idoctofile(ipath, targetmtype, outfile='')</term>
<listitem><para>Extracts document into an output file, <listitem><para>Extracts document into an output file,
which can be given explicitly or will be created as a which can be given explicitly or will be created as a
temporary file to be deleted by the caller. Typical temporary file to be deleted by the caller. Typical
use:</para> use:</para>
<programlisting> <programlisting>
from recoll import recoll, rclextract from recoll import recoll, rclextract
@ -5750,7 +5752,7 @@ not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
highlighting and data extraction functions.</para> highlighting and data extraction functions.</para>
<programlisting><![CDATA[ <programlisting><![CDATA[
#!/usr/bin/env python #!/usr/bin/python3
from recoll import recoll from recoll import recoll
@ -5759,17 +5761,15 @@ db.setAbstractParams(maxchars=80, contextwords=4)
query = db.query() query = db.query()
nres = query.execute("some user question") nres = query.execute("some user question")
print "Result count: ", nres print("Result count: %d" % nres)
if nres > 5: if nres > 5:
nres = 5 nres = 5
for i in range(nres): for i in range(nres):
doc = query.fetchone() doc = query.fetchone()
print "Result #%d" % (query.rownumber,) print("Result #%d" % (query.rownumber))
for k in ("title", "size"): for k in ("title", "size"):
print k, ":", getattr(doc, k).encode('utf-8') print("%s : %s" % (k, getattr(doc, k)))
abs = db.makeDocAbstract(doc, query).encode('utf-8') print("%s\n" % db.makeDocAbstract(doc, query))
print abs
print
]]></programlisting> ]]></programlisting>
</sect3> </sect3>
@ -5911,10 +5911,11 @@ for i in range(nres):
access data from the specified indexer. Example, for the mbox access data from the specified indexer. Example, for the mbox
indexing sample found in the Recoll source (which sets indexing sample found in the Recoll source (which sets
<literal>rclbes="MBOX"</literal>):</para> <literal>rclbes="MBOX"</literal>):</para>
<programlisting>[MBOX] <programlisting>[MBOX]
fetch = /path/to/recoll/src/python/samples/rclmbox.py fetch fetch = /path/to/recoll/src/python/samples/rclmbox.py fetch
makesig = path/to/recoll/src/python/samples/rclmbox.py makesig makesig = path/to/recoll/src/python/samples/rclmbox.py makesig
</programlisting> </programlisting>
<para><literal>fetch</literal> and <literal>makesig</literal> <para><literal>fetch</literal> and <literal>makesig</literal>
define two commands to execute to respectively retrieve the define two commands to execute to respectively retrieve the
document text and compute the document signature (the example document text and compute the document signature (the example
@ -5953,17 +5954,15 @@ for i in range(nres):
course).</para> course).</para>
<para>Adapting to the new package structure:</para> <para>Adapting to the new package structure:</para>
<programlisting> <programlisting><![CDATA[
<![CDATA[ try:
try: from recoll import recoll
from recoll import recoll from recoll import rclextract
from recoll import rclextract hasextract = True
hasextract = True except:
except: import recoll
import recoll hasextract = False
hasextract = False ]]></programlisting>
]]>
</programlisting>
<para>Adapting to the change of nature of <para>Adapting to the change of nature of
the <literal>next</literal> <literal>Query</literal> the <literal>next</literal> <literal>Query</literal>
@ -5971,12 +5970,7 @@ for i in range(nres):
the <literal>scroll()</literal> method (new) or set the <literal>scroll()</literal> method (new) or set
the <literal>next</literal> value (old).</para> the <literal>next</literal> value (old).</para>
<programlisting> <programlisting><![CDATA[rownum = query.next if type(query.next) == int else query.rownumber]]></programlisting>
<![CDATA[
rownum = query.next if type(query.next) == int else \
query.rownumber
]]>
</programlisting>
</sect2> <!-- compat with previous version --> </sect2> <!-- compat with previous version -->
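
Note on the compatibility idiom in the last hunk: the one-liner takes the row index from whichever attribute holds an integer. The sketch below exercises that check in isolation. It is only an illustration: `OldStyleQuery`, `NewStyleQuery` and `current_row` are hypothetical names, and the attribute layout of the two stand-in classes is an assumption inferred from the documentation text, not taken from the recoll module itself.

```python
# Stand-in objects only: these classes are hypothetical and merely mimic the
# two Query layouts that the documentation text describes.

class OldStyleQuery:
    """Assumed older layout: `next` is an integer data member."""

    def __init__(self, position):
        self.next = position


class NewStyleQuery:
    """Assumed newer layout: `next` is a method, `rownumber` holds the index."""

    def __init__(self, position):
        self.rownumber = position

    def next(self):
        # Iteration is outside the scope of this sketch.
        raise NotImplementedError


def current_row(query):
    """Return the row index for either layout, mirroring the documented one-liner."""
    return query.next if isinstance(query.next, int) else query.rownumber


if __name__ == "__main__":
    print(current_row(OldStyleQuery(3)))
    print(current_row(NewStyleQuery(3)))
```

`isinstance` is used here in place of the `type(...) == int` comparison from the documentation; for plain integers the two behave the same. Per the documentation text, the same test could also select between calling `scroll()` and assigning to `next`.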