- connect(confdir=None, extra_dbs=None, writable=False)
- The connect() function connects to one or several Recoll index(es) and returns a Db object. This call initializes the recoll module, and it should always be performed before any other call or object creation. confdir may specify a configuration directory. The usual defaults apply. extra_dbs is a list of additional indexes (Xapian directories). writable determines whether new data can be indexed through this connection.
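The connection lifecycle can be sketched as a small context manager. This is a minimal sketch, not part of the Recoll API: the open_index helper is hypothetical, and it assumes that the connect callable passed in is recoll.connect, typically obtained with `from recoll import recoll` (the import path is an assumption not stated above). It relies on Db.close(), listed under the Db methods.

```python
from contextlib import contextmanager


@contextmanager
def open_index(connect, confdir=None, extra_dbs=None, writable=False):
    """Yield a Db object and close it on exit (illustrative helper).

    `connect` is expected to be recoll.connect; it is passed in as a
    parameter so the pattern can be exercised without an installed index.
    """
    db = connect(confdir=confdir, extra_dbs=extra_dbs, writable=writable)
    try:
        yield db
    finally:
        # After close() the Db object cannot be used any more.
        db.close()
```

With a real installation, this would be used as `with open_index(recoll.connect) as db:` followed by the queries to run.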
A Db object is created by
a connect() call and holds a
connection to a Recoll index.
Methods
- Db.close()
- Closes the connection. You can't do anything with the Db object after this.
- Db.query(), Db.cursor()
- These aliases return a blank Query object for this index.
- Db.setAbstractParams(maxchars, contextwords)
- Set the parameters used to build snippets (sets of keywords in context text fragments). maxchars defines the maximum total size of the abstract. contextwords defines how many terms are shown around the keyword.
- Db.termMatch(match_type, expr, field='', maxlen=-1, casesens=False, diacsens=False, lang='english')
- Expand an expression against the index term list. Performs the basic function from the GUI term explorer tool. match_type can be one of wildcard, regexp or stem. Returns a list of terms expanded from the input expression.
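The two Db helper calls above can be sketched as follows. This is an illustrative sketch: expand_terms and configure_snippets are hypothetical wrappers, the db argument is assumed to be a Db object returned by connect(), and the default values 300 and 6 are arbitrary choices, not Recoll defaults.

```python
MATCH_TYPES = ("wildcard", "regexp", "stem")


def expand_terms(db, expr, match_type="wildcard"):
    """Return the index terms that expr expands to, via Db.termMatch()."""
    if match_type not in MATCH_TYPES:
        raise ValueError("match_type must be one of %s" % (MATCH_TYPES,))
    # termMatch() returns a list of terms expanded from the expression.
    return list(db.termMatch(match_type, expr))


def configure_snippets(db, maxchars=300, contextwords=6):
    """Set the abstract size limit and the context shown around keywords.

    The default values here are arbitrary illustration values.
    """
    db.setAbstractParams(maxchars, contextwords)
```

For example, `expand_terms(db, "recol*")` would ask the index for all terms matching the wildcard, much as the GUI term explorer does.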
A Query object (equivalent to a
cursor in the Python DB API) is created by
a Db.query() call. It is used to
execute index searches.
Methods
- Query.sortby(fieldname, ascending=True)
- Sort results by fieldname, in ascending or descending order. Must be called before executing the search.
- Query.execute(query_string, stemming=1, stemlang="english")
- Starts a search for query_string, a Recoll search language string.
- Query.executesd(SearchData)
- Starts a search for the query defined by the SearchData object.
- Query.fetchmany(size=query.arraysize)
- Fetches the next Doc objects from the current search results and returns them as an array of the requested size, which defaults to the value of the arraysize data member.
- Query.fetchone()
- Fetches the next Doc object from the current search results.
- Query.close()
- Closes the query. The object is unusable after the call.
- Query.scroll(value, mode='relative')
- Adjusts the position in the current result set. mode can be relative or absolute.
- Query.getgroups()
- Retrieves the expanded query terms as a list of pairs. Meaningful only after a call to execute() or executesd(). In each pair, the first entry is a list of user terms (of size one for simple terms, or more for group and phrase clauses), and the second is a list of query terms as derived from the user terms and used in the Xapian Query.
- Query.getxquery()
- Return the Xapian query description as a Unicode string. Meaningful only after a call to execute() or executesd().
- Query.highlight(text, ishtml=0, methods=object)
- Will insert <span class="rclmatch">, </span> tags around the match areas in the input text and return the modified text. ishtml can be set to indicate that the input text is HTML and that HTML special characters should not be escaped. methods, if set, should be an object with methods startMatch(i) and endMatch(), which will be called for each match and should return a begin and end tag, respectively.
- Query.makedocabstract(doc, methods=object)
- Create a snippets abstract for doc (a Doc object) by selecting text around the match terms. If methods is set, will also perform highlighting. See the highlight method.
- Query.__iter__() and Query.next()
- Make the Query object iterable, so that things like for doc in query: will work.
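The Query methods above combine roughly as follows. This is a hedged sketch rather than an official sample: TagHighlighter and search_with_abstracts are hypothetical names, db is assumed to be a Db object returned by connect(), and arguments are passed positionally to avoid assumptions about keyword support.

```python
class TagHighlighter:
    """Object passed as `methods`: supplies the tags wrapped around matches."""

    def __init__(self, begin="<b>", end="</b>"):
        self.begin = begin
        self.end = end

    def startMatch(self, idx):
        # idx is the value passed for each match; it is not used here.
        return self.begin

    def endMatch(self):
        return self.end


def search_with_abstracts(db, query_string, limit=10, sort_field=None):
    """Run a query-language search and return up to `limit` abstracts."""
    query = db.query()
    try:
        if sort_field is not None:
            # sortby() must be called before the search is executed.
            query.sortby(sort_field, False)  # False: descending order
        query.execute(query_string)
        highlighter = TagHighlighter()
        abstracts = []
        # zip() stops once range(limit) is exhausted, so no extra
        # document is fetched past the limit.
        for _, doc in zip(range(limit), query):
            abstracts.append(query.makedocabstract(doc, highlighter))
        return abstracts
    finally:
        # The Query object is unusable after close().
        query.close()
```

Passing a custom object as methods replaces the default span tags with whatever startMatch() and endMatch() return, which is convenient when the output is not HTML.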
Data descriptors
- Query.arraysize
- Default number of records processed by fetchmany (r/w).
- Query.rowcount
- Number of records returned by the last execute.
- Query.rownumber
- Next index
to be fetched from results. Normally increments after
each fetchone() call, but can be set/reset before the
call to effect seeking (equivalent to
using
scroll()). Starts at 0.
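The data descriptors lend themselves to paging through results. The following is a minimal sketch under stated assumptions: iter_pages is a hypothetical helper, query is assumed to be a Query object on which execute() has already been called, and fetchmany() is assumed to return an empty list once the results are exhausted.

```python
def iter_pages(query, page_size=20):
    """Yield lists of Doc objects, page_size at a time.

    Expects execute() to have been called on `query` already.
    """
    # arraysize is read/write and gives fetchmany() its default size.
    query.arraysize = page_size
    remaining = query.rowcount
    while remaining > 0:
        page = query.fetchmany()
        if not page:
            # Assumed end-of-results behaviour: an empty page.
            break
        yield page
        remaining -= len(page)
```

Each call to fetchmany() advances rownumber, so the generator simply continues from where the previous page stopped.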
A Doc object contains index data
for a given document. The data is extracted from the
index when searching, or set by the indexer program when
updating. The Doc object has many attributes to be read or
set by its user. It matches exactly the Rcl::Doc C++
object. Some of the attributes are predefined but,
especially when indexing, others can be set, and their
names will be processed as field names by the indexing
configuration. Inputs can be specified as Unicode or
strings. Outputs are Unicode objects. All dates are
specified as Unix timestamps, printed as strings. Please
refer to the rcldb/rcldoc.h C++ file
for a description of the predefined attributes.
At query time, only the fields that are defined
as stored, either by default or in
the fields configuration file, will be
meaningful in the Doc
object. In particular, this does not include the
document text. See the rclextract
module for accessing document contents.
Methods
- get(key), [] operator
- Retrieve the named doc attribute.
- getbinurl()
- Retrieve the URL in byte array format (no transcoding), for use as a parameter to a system call.
- items()
- Return a dictionary of doc object keys/values.
- keys()
- Return the list of doc object keys (attribute names).
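The Doc accessors can be used to copy selected fields into a plain dictionary. In this sketch, doc_fields is a hypothetical helper, and field names such as "title" or "url" in the usage note are assumptions; the authoritative list of predefined attributes is in rcldb/rcldoc.h.

```python
def doc_fields(doc, wanted=None):
    """Return a plain dict mapping attribute names to values for a Doc.

    With wanted=None, every name reported by doc.keys() is copied.
    """
    names = doc.keys() if wanted is None else wanted
    return {name: doc.get(name) for name in names}
```

For instance, `doc_fields(doc, ("title", "url"))` would collect only those two attributes, provided they are stored fields in the index configuration.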
A SearchData object allows building
a query by combining clauses, for execution
by Query.executesd(). It can be used
instead of the query language approach. The
interface is going to change a little, so no detailed doc
for now...
Methods
- addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub', qstring=string, slack=0, field='', stemming=1, subSearch=SearchData)
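Since the interface is described as subject to change, the following is only a tentative sketch based on the addclause() signature above. The add_clauses helper is hypothetical, sd is assumed to be an existing SearchData object (how it is created is not covered here), and the meaning attached to the 'and', 'or' and 'excl' clause types (all words, any word, excluded words) is an assumption drawn from their names.

```python
def add_clauses(sd, all_of=(), any_of=(), none_of=(), field=""):
    """Add simple clauses to an existing SearchData object and return it."""
    groups = (("and", all_of), ("or", any_of), ("excl", none_of))
    for clause_type, words in groups:
        if words:
            # One clause per group; the words are joined into one qstring.
            sd.addclause(type=clause_type, qstring=" ".join(words), field=field)
    return sd
```

The resulting object would then be handed to Query.executesd() instead of passing a query language string to Query.execute().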

