From fe2eb103ecbe98ec6f25da4694e1ca31ec239e63 Mon Sep 17 00:00:00 2001
From: Jean-Francois Dockes
rclextract module
- currently provides a single class which can be used to
- access the data content for result documents.
rclextract module provides a single
+ class which can be used to access the data content for
+ result documents.
Extract document defined by ipath and
return a Doc
- object. The doc.text field has the document
- text converted to either text/plain or
- text/html according to doc.mimetype. The
- typical use would be as follows:
doc.text field has the
+ document text converted to either text/plain
+ or text/html according to doc.mimetype. The typical
+ use would be as follows:
-qdoc = query.fetchone()
-extractor = recoll.Extractor(qdoc)
-doc = extractor.textextract(qdoc.ipath)
-# use doc.text, e.g. for previewing
-
+qdoc = query.fetchone()
+extractor = recoll.Extractor(qdoc)
+doc = extractor.textextract(qdoc.ipath)
+# use doc.text, e.g. for previewing
+
Passing qdoc.ipath to textextract() is redundant,
+ but reflects the fact that the Extractor object actually
+ has the capability to access the other
+ entries in a compound document.
-qdoc = query.fetchone()
-extractor = recoll.Extractor(qdoc)
-filename = extractor.idoctofile(qdoc.ipath, qdoc.mimetype)
+qdoc = query.fetchone()
+extractor = recoll.Extractor(qdoc)
+filename = extractor.idoctofile(qdoc.ipath, qdoc.mimetype)
+
In all cases the output is a copy, even if
+ the requested document is a regular system
+ file, which may be wasteful in some cases. If
+ you want to avoid this, you can test for a
+ simple file document as follows:
+
+not doc.ipath and (not "rclbes" in doc.keys() or doc["rclbes"] == "FS")
+
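The simple-file test added above can be wrapped in a small helper so that callers skip the copy when the result is already a regular file. This is only a sketch: `is_simple_file` and `FakeDoc` are hypothetical names, and `FakeDoc` merely imitates the parts of a Doc the expression relies on (an `ipath` attribute, `keys()`, and item access).

```python
def is_simple_file(doc):
    """Return True if doc looks like a plain file-system document.

    Same logic as the expression in the text: an empty ipath, and either
    no "rclbes" key at all or an "rclbes" value of "FS".
    """
    if doc.ipath:
        return False
    return "rclbes" not in doc.keys() or doc["rclbes"] == "FS"


class FakeDoc(dict):
    """Hypothetical stand-in for a Doc object, for illustration only."""

    def __init__(self, ipath="", **fields):
        super().__init__(**fields)
        self.ipath = ipath


# "OTHER" is a made-up backend value used only to exercise the False branch.
for candidate in (FakeDoc(), FakeDoc(rclbes="FS"),
                  FakeDoc(rclbes="OTHER"), FakeDoc(ipath="1")):
    print(dict(candidate), repr(candidate.ipath), is_simple_file(candidate))
```

When the helper returns True, the caller would presumably open the original file directly instead of calling idoctofile(); how the file path is obtained from the Doc is not covered by this patch.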
- #!/usr/bin/env python
-
- from recoll import recoll
+#!/usr/bin/env python
+
+from recoll import recoll
db = recoll.connect()
db.setAbstractParams(maxchars=80, contextwords=4)
@@ -6769,18 +6786,16 @@ query = db.query()
nres = query.execute("some user question")
print "Result count: ", nres
if nres > 5:
-nres = 5
+    nres = 5
for i in range(nres):
-doc = query.fetchone()
-print "Result #%d" % (query.rownumber,)
-for k in ("title", "size"):
-print k, ":", getattr(doc, k).encode('utf-8')
-abs = db.makeDocAbstract(doc, query).encode('utf-8')
-print abs
-print
-
-
-
+    doc = query.fetchone()
+    print "Result #%d" % (query.rownumber,)
+    for k in ("title", "size"):
+        print k, ":", getattr(doc, k).encode('utf-8')
+    abs = db.makeDocAbstract(doc, query).encode('utf-8')
+    print abs
+    print
+
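The sample script in this hunk uses Python 2 print statements. The sketch below shows the same result loop in Python 3 syntax, collecting the output lines in a list instead of printing them one by one. `summarize_results`, `FakeQuery` and `FakeDb` are hypothetical names introduced here. The stand-ins only mimic the calls the script makes (fetchone(), rownumber, makeDocAbstract()) and assume that rownumber advances on each fetch and that field values are already strings.

```python
from types import SimpleNamespace


def summarize_results(db, query, nres, limit=5, fields=("title", "size")):
    """Return printable lines for at most `limit` results of an executed query."""
    lines = []
    for _ in range(min(nres, limit)):
        doc = query.fetchone()
        lines.append("Result #%d" % (query.rownumber,))
        for key in fields:
            lines.append("%s : %s" % (key, getattr(doc, key)))
        lines.append(db.makeDocAbstract(doc, query))
        lines.append("")  # blank separator, like the bare print in the script
    return lines


# Hypothetical stand-ins so the sketch runs without a Recoll index.
class FakeQuery:
    def __init__(self, docs):
        self._docs = list(docs)
        self.rownumber = 0  # assumed to advance by one on each fetch

    def fetchone(self):
        doc = self._docs[self.rownumber]
        self.rownumber += 1
        return doc


class FakeDb:
    def makeDocAbstract(self, doc, query):
        return "abstract of " + doc.title


docs = [SimpleNamespace(title="doc%d" % i, size=str(i * 100)) for i in range(7)]
summary = summarize_results(FakeDb(), FakeQuery(docs), nres=len(docs))
print("\n".join(summary))
```

With a real index, the db and query arguments would be the objects returned by recoll.connect() and db.query() after query.execute(), as in the script above.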