release 2896
parent
e2185379b5
commit
c1ce9caa36
85
src/INSTALL
@ -333,7 +333,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
a configuration directory. There can be several such directories, each of
|
||||
which defines the parameters for one index.
|
||||
|
||||
The configuration files can be edited by hand or through the Indexing
|
||||
The configuration files can be edited by hand or through the Index
|
||||
configuration dialog (Preferences menu). The GUI tool will try to respect
|
||||
your formatting and comments as much as possible, so it is quite possible
|
||||
to use both ways.
|
||||
@ -526,6 +526,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
window. A size of a few megabytes would seem reasonable (default:
|
||||
1MB).
|
||||
|
||||
membermaxkbs
|
||||
|
||||
This defines the maximum size in kilobytes for an archive member
|
||||
(zip, tar or rar at the moment). Bigger entries will be skipped.
|
||||
|
||||
indexallfilenames
|
||||
|
||||
Recoll indexes file names in a special section of the database to
|
||||
@ -562,6 +567,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
share the values for these parameters, because they usually affect both
|
||||
search and index operations.
|
||||
|
||||
indexStripChars
|
||||
|
||||
Decide if we strip characters of diacritics and convert them to
|
||||
lower-case before terms are indexed. If we don't, searches
|
||||
sensitive to case and diacritics can be performed, but the index
|
||||
will be bigger, and some marginal weirdness may sometimes occur.
|
||||
The default is a stripped index (indexStripChars = 1) for now.
|
||||
When using multiple indexes for a search, this parameter must be
|
||||
defined identically for all. Changing the value implies an index
|
||||
reset.
|
||||
|
||||
maxTermExpand
|
||||
|
||||
Maximum expansion count for a single term (e.g.: when using
|
||||
wildcards). The default of 10000 is reasonable and will avoid
|
||||
queries that appear frozen while the engine is walking the term
|
||||
list.
|
||||
|
||||
maxXapianClauses
|
||||
|
||||
Maximum number of elementary clauses we can add to a single Xapian
|
||||
query. In some cases, the result of term expansion can be
|
||||
multiplicative, and we want to avoid using excessive memory. The
|
||||
default of 100 000 should be both high enough in most cases and
|
||||
compatible with current typical hardware configurations.
|
||||
|
||||
nonumbers
|
||||
|
||||
If this is set to true, no terms will be generated for numbers. For
|
||||
@ -699,6 +730,22 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
5.4.1.4. Miscellaneous parameters:
|
||||
|
||||
autodiacsens
|
||||
|
||||
If the index is not stripped, decide if we automatically trigger
|
||||
diacritics sensitivity if the search term has accented characters
|
||||
(not in unac_except_trans). Else you need to use the query
|
||||
language and the D modifier to specify diacritics sensitivity.
|
||||
Default is no.
|
||||
|
||||
autocasesens
|
||||
|
||||
If the index is not stripped, decide if we automatically trigger
|
||||
character case sensitivity if the search term has upper-case
|
||||
characters in any but the first position. Else you need to use the
|
||||
query language and the C modifier to specify character-case
|
||||
sensitivity. Default is yes.
|
||||
|
||||
loglevel,daemloglevel
|
||||
|
||||
Verbosity level for recoll and recollindex. A value of 4 lists
|
||||
@ -737,6 +784,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
the auxiliary databases (spelling, stemming) if needed. The
|
||||
default is one hour.
|
||||
|
||||
monioniceclass, monioniceclassdata
|
||||
|
||||
These allow defining the ionice class and data used by the indexer
|
||||
(default class 3, no data).
|
||||
|
||||
filtermaxseconds
|
||||
|
||||
Maximum filter execution time, after which it is aborted. Some
|
||||
@ -781,6 +833,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
Useful for cases where you don't need the functionality or when it
|
||||
is unusable because aspell crashes during dictionary generation.
|
||||
|
||||
mhmboxquirks
|
||||
|
||||
This allows defining location-related quirks for the mailbox
|
||||
handler. Currently only the tbird flag is defined, and it should
|
||||
be set for directories which hold Thunderbird data, as their
|
||||
folder format is weird.
|
||||
|
||||
5.4.2. The fields file
|
||||
|
||||
This file contains information about dynamic fields handling in Recoll.
|
||||
@ -885,19 +944,24 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
oofice instead of openoffice etc.
|
||||
|
||||
Changes to this file can be done by direct editing, or through the recoll
|
||||
user preferences dialog.
|
||||
GUI preferences dialog.
|
||||
|
||||
If Use desktop preferences to choose document editor is checked in the
|
||||
Recoll GUI user preferences, all mimeview entries will be ignored except
|
||||
the one labelled application/x-all (which is set to use xdg-open by
|
||||
default).
|
||||
Recoll GUI preferences, all mimeview entries will be ignored except the
|
||||
one labelled application/x-all (which is set to use xdg-open by default).
|
||||
|
||||
In this case, the xallexcepts top level variable defines a list of mime
|
||||
type exceptions which will be processed according to the local entries
|
||||
instead of being passed to the desktop. This is so that specific Recoll
|
||||
options such as a page number or a search string can be passed to
|
||||
applications that support them, such as the evince viewer.
|
||||
|
||||
As for the other configuration files, the normal usage is to have a
|
||||
mimeview inside your own configuration directory, with just the
|
||||
non-default entries, which will override those from the central
|
||||
configuration file.
|
||||
|
||||
Please note that these entries must be placed under a [view] section.
|
||||
All viewer definition entries must be placed under a [view] section.
|
||||
|
||||
The keys in the file are normally mime types. You can add an application
|
||||
tag to specialize the choice for an area of the filesystem (using a
|
||||
@ -927,6 +991,15 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
* %M. Mime type
|
||||
|
||||
* %p. Page index. Only significant for a subset of document types,
|
||||
currently only PDF, Postscript and DVI files. Can be used to start the
|
||||
editor at the right page for a match or snippet.
|
||||
|
||||
* %s. Search term. The value will only be set for documents with indexed
|
||||
page numbers (i.e. PDF). The value will be one of the matched search
|
||||
terms. It would allow pre-setting the value in the "Find" entry inside
|
||||
Evince for example, for easy highlighting of the term.
|
||||
|
||||
* %U, %u. Url.
|
||||
|
||||
In addition to the predefined values above, all strings like %(fieldname)
|
||||
|
||||
346
src/README
@ -46,9 +46,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
2.2.2. Security aspects
|
||||
|
||||
2.3. Indexing configuration
|
||||
2.3. Index configuration
|
||||
|
||||
2.3.1. The indexing configuration GUI
|
||||
2.3.1. Index case and diacritics sensitivity
|
||||
|
||||
2.3.2. The index configuration GUI
|
||||
|
||||
2.4. Using Beagle WEB browser plugins
|
||||
|
||||
@ -102,19 +104,21 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
3.4.1. Modifiers
|
||||
|
||||
3.5. Anchored searches and wildcards
|
||||
3.5. Search case and diacritics sensitivity
|
||||
|
||||
3.5.1. More about wildcards
|
||||
3.6. Anchored searches and wildcards
|
||||
|
||||
3.5.2. Anchored searches
|
||||
3.6.1. More about wildcards
|
||||
|
||||
3.6. Desktop integration
|
||||
3.6.2. Anchored searches
|
||||
|
||||
3.6.1. Hotkeying recoll
|
||||
3.7. Desktop integration
|
||||
|
||||
3.6.2. The KDE Kicker Recoll applet
|
||||
3.7.1. Hotkeying recoll
|
||||
|
||||
3.7. Multiple databases
|
||||
3.7.2. The KDE Kicker Recoll applet
|
||||
|
||||
3.8. Multiple databases
|
||||
|
||||
4. Programming interface
|
||||
|
||||
@ -126,6 +130,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
4.1.3. Filter HTML output
|
||||
|
||||
4.1.4. Page numbers
|
||||
|
||||
4.2. Field data processing
|
||||
|
||||
4.3. API
|
||||
@ -250,20 +256,36 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
plural (floor, floors), or on a verb tense (flooring, floored). Because
|
||||
the mechanisms used for stemming depend on the specific grammatical rules
|
||||
for each language, there is a separate stemmer module for most common
|
||||
languages where stemming makes sense. Storing documents written in
|
||||
different languages in the same index is possible, and commonly done. In
|
||||
this situation, you can specify several stemming languages for the index.
|
||||
languages where stemming makes sense.
|
||||
|
||||
Recoll stores the unstemmed versions of terms in the main index and uses
|
||||
auxiliary databases for term expansion (one for each stemming language),
|
||||
which means that you can switch stemming languages between searches, or
|
||||
add a language without needing a full reindex. Recoll currently makes no
|
||||
attempt at automatic language recognition, which means that the stemmer
|
||||
will sometimes be applied to terms from other languages with potentially
|
||||
strange results. In practice, even if this introduces possibilities of
|
||||
confusion, this approach has been proven quite useful, and, awaiting the
|
||||
addition of an automatic language recognition module to Recoll, it is much
|
||||
less cumbersome than separating your documents according to what language
|
||||
they are written in.
|
||||
add a language without needing a full reindex.
|
||||
|
||||
Storing documents written in different languages in the same index is
|
||||
possible, and commonly done. In this situation, you can specify several
|
||||
stemming languages for the index.
|
||||
|
||||
Recoll currently makes no attempt at automatic language recognition, which
|
||||
means that the stemmer will sometimes be applied to terms from other
|
||||
languages with potentially strange results. In practice, even if this
|
||||
introduces possibilities of confusion, this approach has been proven quite
|
||||
useful, and, awaiting the addition of an automatic language recognition
|
||||
module to Recoll, it is much less cumbersome than separating your
|
||||
documents according to what language they are written in.
|
||||
|
||||
Before version 1.18, Recoll always stripped most accents and diacritics
|
||||
from terms, and converted them to lower case before storing them in the
|
||||
index. As a consequence, it was impossible to search for a particular
|
||||
capitalization of a term (US / us), or to discriminate two terms based on
|
||||
diacritics (sake / saké, mate / maté).
|
||||
|
||||
As of version 1.18, Recoll can optionally store the raw terms, without
|
||||
accent stripping or case conversion. Expansions necessary for searches
|
||||
insensitive to case and/or diacritics are then performed when searching.
|
||||
This is described in more detail in the section about index case and
|
||||
diacritics sensitivity.
|
||||
|
||||
Recoll has many parameters which define exactly what to index, and how to
|
||||
classify and decode the source documents. These are kept in configuration
|
||||
@ -352,8 +374,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
The generated indexes can be queried concurrently in a transparent manner.
|
||||
|
||||
For index generation, multiple configurations are totally independent from
|
||||
each other. When multiple indexes are used for searches, some parameters
|
||||
should be consistent among the configurations.
|
||||
each other. When multiple indexes need to be used for a single search,
|
||||
some parameters should be consistent among the configurations.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
@ -480,7 +502,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
2.3. Indexing configuration
|
||||
2.3. Index configuration
|
||||
|
||||
Variables set inside the Recoll configuration files control which areas of
|
||||
the file system are indexed, and how files are processed. These variables
|
||||
@ -506,25 +528,63 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
2.3.1. The indexing configuration GUI
|
||||
2.3.1. Index case and diacritics sensitivity
|
||||
|
||||
Most parameters for a given indexing configuration can be set from a
|
||||
recoll GUI running on this configuration (either as default, or by setting
|
||||
As of Recoll version 1.18 you have a choice of building an index with
|
||||
terms stripped of character case and diacritics, or one with raw terms.
|
||||
For a source term of Résumé, the former will store resume, the latter
Résumé.
|
||||
|
||||
Each type of index allows performing searches insensitive to case and
|
||||
diacritics: with a raw index, the user entry will be expanded to match all
|
||||
case and diacritics variations present in the index. With a stripped
|
||||
index, the search term will be stripped before searching.
|
||||
|
||||
A raw index allows for another possibility which a stripped index cannot
|
||||
offer: using case and diacritics to discriminate between terms, returning
|
||||
different results when searching for US and us or resume and résumé. Read
|
||||
the section about search case and diacritics sensitivity for more details.
|
||||
|
||||
The type of index to be created is controlled by the indexStripChars
|
||||
configuration variable which can only be changed by editing the
|
||||
configuration file. Any change implies an index reset (not automated by
|
||||
Recoll), and all indexes in a search must be set in the same way (again,
|
||||
not checked by Recoll).
|
||||
|
||||
If indexStripChars is not set, Recoll 1.18 creates a stripped index by
|
||||
default, for compatibility with previous versions.
|
||||
|
||||
As a cost for added capability, a raw index will be slightly bigger than a
|
||||
stripped one (around 10%). Also, searches will be more complex, so
|
||||
probably slightly slower, and the feature is still young, and a certain
|
||||
amount of weirdness cannot be excluded.
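
The difference between the two index types can be illustrated with a short Python sketch. This is not Recoll's code: the function names and the sample term list are hypothetical, and the stripping is approximated with Unicode decomposition, ignoring any exceptions Recoll may be configured with (such as unac_except_trans).

```python
import unicodedata


def strip_term(term: str) -> str:
    """Remove diacritics and fold case, roughly what a stripped index stores."""
    decomposed = unicodedata.normalize("NFD", term)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.lower()


def expand_raw(query: str, raw_terms: list[str]) -> list[str]:
    """Raw-index terms matching `query` regardless of case and diacritics."""
    key = strip_term(query)
    return [t for t in raw_terms if strip_term(t) == key]


# Hypothetical terms as they could appear in a raw index.
raw_index_terms = ["Résumé", "resume", "RESUME", "US", "us"]

# A stripped index only keeps the folded forms.
stripped_index_terms = sorted({strip_term(t) for t in raw_index_terms})

# An insensitive search on a raw index expands the entry to every variant.
print(expand_raw("resume", raw_index_terms))
print(stripped_index_terms)
```

With a stripped index the distinction between US and us is lost at indexing time, which is why a sensitive search is only possible on a raw index.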
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
2.3.2. The index configuration GUI
|
||||
|
||||
Most parameters for a given index configuration can be set from a recoll
|
||||
GUI running on this configuration (either as default, or by setting
|
||||
RECOLL_CONFDIR or the -c option.)
|
||||
|
||||
The interface is started from the Preferences->Indexing Configuration menu
|
||||
entry. It is divided in three tabs, Global parameters, Local parameters,
|
||||
and Beagle web history, which is explained in the next section.
|
||||
The interface is started from the Preferences->Index Configuration menu
|
||||
entry. It is divided into four tabs, Global parameters, Local parameters,
|
||||
Beagle web history (which is explained in the next section) and Search
|
||||
parameters.
|
||||
|
||||
The first tab allows setting global variables, like the lists of top
|
||||
directories, skipped paths, or stemming languages.
|
||||
The Global parameters tab allows setting global variables, like the lists
|
||||
of top directories, skipped paths, or stemming languages.
|
||||
|
||||
The second tab allows setting variables that can be redefined for
|
||||
subdirectories. This second tab has an initially empty list of
|
||||
The Local parameters tab allows setting variables that can be redefined
|
||||
for subdirectories. This second tab has an initially empty list of
|
||||
customisation directories, to which you can add. The variables are then
|
||||
set for the currently selected directory (or at the top level if the empty
|
||||
line is selected).
|
||||
|
||||
The Search parameters section defines parameters which are used at query
|
||||
time, but are global to an index and affect all search tools, not only the
|
||||
GUI.
|
||||
|
||||
The meaning for most entries in the interface is self-evident and
|
||||
documented by a ToolTip popup on the text label. For more detail, you will
|
||||
need to refer to the configuration section of this guide.
|
||||
@ -550,7 +610,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
the Beagle queue directory. This supposes that Beagle is not running, else
|
||||
both programs will fight for the same files.
|
||||
|
||||
This feature can be enabled in the GUI indexing configuration panel, or by
|
||||
This feature can be enabled in the GUI Index configuration panel, or by
|
||||
editing the configuration file (set processbeaglequeue to 1).
|
||||
|
||||
There are more recent instructions about how to find and install the
|
||||
@ -855,7 +915,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
Clicking the Open link will attempt to start an external viewer. The
|
||||
viewer for each document type can be configured through the user
|
||||
preferences dialog, or by editing the mimeview configuration file. You can
|
||||
also check the Use desktop preferences option in the user preferences
|
||||
also check the Use desktop preferences option in the GUI preferences
|
||||
dialog to use the desktop defaults for all documents. This is probably the
|
||||
best option if you are using a well configured Gnome or KDE desktop.
|
||||
|
||||
@ -904,6 +964,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
* Open Parent document
|
||||
|
||||
* Open Snippets Window
|
||||
|
||||
The Preview and Open entries do the same thing as the corresponding links.
|
||||
|
||||
The Copy File Name and Copy Url copy the relevant data to the clipboard,
|
||||
@ -930,6 +992,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
this case. In other cases, the Open option makes sense, for example to
|
||||
start a chm viewer on the parent document for a help page.
|
||||
|
||||
The Open Snippets Window entry will only appear for documents which
|
||||
support page breaks (typically PDF, Postscript, DVI). The snippets window
|
||||
lists extracts from the document, taken around search term occurrences,
|
||||
along with the corresponding page number, as links which can be used to
|
||||
start the native viewer on the appropriate page. If the viewer supports
|
||||
it, its search function will also be primed with one of the search terms.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.1.3. The result table
|
||||
@ -1428,6 +1497,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
mimeview. xdg-open will in turn use your desktop preferences to choose
|
||||
an appropriate application.
|
||||
|
||||
* Exceptions: when using the desktop preferences for opening documents,
|
||||
these are mime types that will still be opened according to Recoll
|
||||
preferences. This is useful for passing parameters like page numbers
|
||||
or search strings to applications that support them (e.g. evince).
|
||||
|
||||
* Choose editor applications: this will let you choose the command
|
||||
started by the Open links inside the result list, for specific
|
||||
document types.
|
||||
@ -1569,6 +1643,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
* %D. Date
|
||||
|
||||
* %E. Precooked Snippets link (will only appear for documents indexed
|
||||
with page numbers)
|
||||
|
||||
* %I. Icon image name. This is normally determined from the mime type.
|
||||
The associations are defined inside the mimeconf configuration file.
|
||||
If a thumbnail for the file is found at the standard Freedesktop
|
||||
@ -1826,13 +1903,34 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
The field syntax also supports a few field-like, but special, criteria:
|
||||
|
||||
* dir for filtering the results on file location (Ex:
|
||||
dir:/home/me/somedir). -dir also works to find results out of the
|
||||
specified directory, only after release 1.15.8. A tilde inside the
|
||||
value will be expanded to the home directory. dir is not a regular
|
||||
field and only one value makes sense in a query (you can't use
|
||||
dir:dir1 OR dir:dir2). Relative paths make sense, for example,
|
||||
dir:share/doc would match either /usr/share/doc or
|
||||
/usr/local/share/doc
|
||||
dir:/home/me/somedir). -dir also works to find results not in the
|
||||
specified directory (release >= 1.15.8). A tilde inside the value will
|
||||
be expanded to the home directory. Wildcards will not be expanded. You
|
||||
cannot use OR with dir clauses (this restriction may go away in the
|
||||
future).
|
||||
|
||||
Relative paths also make sense, for example, dir:share/doc would match
|
||||
either /usr/share/doc or /usr/local/share/doc
|
||||
|
||||
Several dir clauses can be specified, both positive and negative. For
|
||||
example the following makes sense:
|
||||
|
||||
dir:recoll dir:src -dir:utils -dir:common
|
||||
|
||||
|
||||
This would select results which have both recoll and src in the path
|
||||
(in any order), and which have neither utils nor common.
|
||||
|
||||
Another special aspect of dir clauses is that the values in the index
|
||||
are not transcoded to UTF-8, and never lower-cased or unaccented, but
|
||||
stored as binary. This means that you need to enter the values in the
|
||||
exact lower or upper case, and that searches for names with diacritics
|
||||
may sometimes be impossible because of character set conversion
|
||||
issues. Non-ASCII UNIX file paths are an unending source of trouble
|
||||
and are best avoided.
|
||||
|
||||
You need to use double-quotes around the path value if it contains
|
||||
space characters.
|
||||
|
||||
* size for filtering the results on file size. Example: size<10000. You
|
||||
can use <, > or = as operators. You can specify a range like the
|
||||
@ -1913,12 +2011,68 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
* p can be used to turn the default phrase search into a proximity one
|
||||
(unordered). Example: "order any in"p
|
||||
|
||||
* C will turn on case sensitivity (if the index supports it).
|
||||
|
||||
* D will turn on diacritics sensitivity (if the index supports it).
|
||||
|
||||
* A weight can be specified for a query element by specifying a decimal
|
||||
value at the start of the modifiers. Example: "Important"2.5.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.5. Anchored searches and wildcards
|
||||
3.5. Search case and diacritics sensitivity
|
||||
|
||||
For Recoll versions 1.18 and later, and when working with a raw index (not
|
||||
the default), searches can be made sensitive to character case and
|
||||
diacritics. How this happens is controlled by configuration variables and
|
||||
what search data is entered.
|
||||
|
||||
The general default is that searches are insensitive to case and
|
||||
diacritics. An entry of resume will match any of Résumé, RESUME, resume,
Resumé etc.
|
||||
|
||||
Two configuration variables can automate switching on sensitivity:
|
||||
|
||||
autodiacsens
|
||||
|
||||
If this is set, search sensitivity to diacritics will be turned on
|
||||
as soon as an accented character exists in a search term. When the
|
||||
variable is set to true, resume will start a
|
||||
diacritics-insensitive search, but résumé will be matched exactly.
|
||||
The default value is false.
|
||||
|
||||
autocasesens
|
||||
|
||||
If this is set, search sensitivity to character case will be
|
||||
turned on as soon as an upper-case character exists in a search
|
||||
term except for the first one. When the variable is set to true,
|
||||
us or Us will start a case-insensitive search, but US will
|
||||
be matched exactly. The default value is true (contrary to
|
||||
autodiacsens).
|
||||
|
||||
As in the past, capitalizing the first letter of a word will turn off its
|
||||
stem expansion and have no effect on case-sensitivity.
|
||||
|
||||
You can also explicitly activate case and diacritics sensitivity by using
|
||||
modifiers with the query language. C will make the term case-sensitive,
|
||||
and D will make it diacritics-sensitive. Examples:
|
||||
|
||||
"us"C
|
||||
|
||||
|
||||
will search for the term us exactly (Us will not be a match).
|
||||
|
||||
"resume"D
|
||||
|
||||
|
||||
will search for the term resume exactly (résumé will not be a match).
|
||||
|
||||
When either case or diacritics sensitivity is activated, stem expansion is
|
||||
turned off. Having both does not make much sense.
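
The rules in this section can be summarized as a small decision function. The Python sketch below is only an illustration of the documented behaviour, not the actual implementation: the function and argument names are hypothetical, the modifiers are passed as a plain string, and exceptions such as unac_except_trans are ignored.

```python
import unicodedata


def has_diacritics(term: str) -> bool:
    """True if any character of the term carries a combining mark."""
    return any(unicodedata.combining(c) for c in unicodedata.normalize("NFD", term))


def has_inner_uppercase(term: str) -> bool:
    """True if an upper-case character exists anywhere but the first position."""
    return any(c.isupper() for c in term[1:])


def search_sensitivity(term, modifiers="", autocasesens=True, autodiacsens=False):
    """Return (case_sensitive, diacritics_sensitive, stem_expansion) for a term.

    Assumes a raw index; with a stripped index no sensitivity is possible.
    """
    case_sens = "C" in modifiers or (autocasesens and has_inner_uppercase(term))
    diac_sens = "D" in modifiers or (autodiacsens and has_diacritics(term))
    # Stem expansion is off when sensitivity is on, or when the first
    # letter is capitalized.
    stem = not (case_sens or diac_sens) and not term[:1].isupper()
    return case_sens, diac_sens, stem


for entry in ("us", "Us", "US", "résumé"):
    print(entry, search_sensitivity(entry))
```

With the default values, only an entry such as US switches on case sensitivity, and résumé stays diacritics-insensitive unless autodiacsens is set or the D modifier is used.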
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.6. Anchored searches and wildcards
|
||||
|
||||
Some special characters are interpreted by Recoll in search strings to
|
||||
expand or specialize the search. Wildcards expand a root term in
|
||||
@ -1928,7 +2082,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.5.1. More about wildcards
|
||||
3.6.1. More about wildcards
|
||||
|
||||
All words entered in Recoll search fields will be processed for wildcard
|
||||
expansion before the request is finally executed.
|
||||
@ -1959,7 +2113,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.5.2. Anchored searches
|
||||
3.6.2. Anchored searches
|
||||
|
||||
Two characters are used to specify that a search hit should occur at the
|
||||
beginning or at the end of the text. ^ at the beginning of a term or
|
||||
@ -1984,14 +2138,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.6. Desktop integration
|
||||
3.7. Desktop integration
|
||||
|
||||
Being independent of the desktop type has its drawbacks: Recoll desktop
|
||||
integration is minimal. Here follow a few things that may help.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.6.1. Hotkeying recoll
|
||||
3.7.1. Hotkeying recoll
|
||||
|
||||
It is surprisingly convenient to be able to show or hide the Recoll GUI
|
||||
with a single keystroke. Recoll comes with a small Python script, based on
|
||||
@ -2000,7 +2154,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.6.2. The KDE Kicker Recoll applet
|
||||
3.7.2. The KDE Kicker Recoll applet
|
||||
|
||||
The Recoll source tree contains the source code to the recoll_applet, a
|
||||
small application derived from the find_applet. This can be used to add a
|
||||
@ -2023,7 +2177,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.7. Multiple databases
|
||||
3.8. Multiple databases
|
||||
|
||||
Multiple Recoll databases or indexes can be created by using several
|
||||
configuration directories which are usually set to index different areas
|
||||
@ -2216,6 +2370,15 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.1.4. Page numbers
|
||||
|
||||
The indexer will interpret ^L characters in the filter output as
|
||||
indicating page breaks, and will record them. At query time, this allows
|
||||
starting a viewer on the right page for a hit or a snippet. Currently,
|
||||
only the PDF, Postscript and DVI filters generate page breaks.
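
As an illustration of the principle, the following Python sketch maps a character offset in filter output to a page number by counting the form feed (^L) characters before it. The function names and the sample text are hypothetical, and the way the indexer actually records page breaks is not described here.

```python
import bisect


def page_break_offsets(text: str) -> list[int]:
    """Offsets of the form feed (^L, 0x0C) characters in the filter output."""
    return [i for i, c in enumerate(text) if c == "\x0c"]


def page_of(offset: int, breaks: list[int]) -> int:
    """1-based page number of the character at `offset`."""
    return bisect.bisect_right(breaks, offset) + 1


# Hypothetical filter output: three pages separated by form feeds.
text = "first page\x0csecond page with hit\x0cthird"
breaks = page_break_offsets(text)

print(page_of(text.index("hit"), breaks))
```

A viewer could then be started on the page returned for the offset of a match.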
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.2. Field data processing
|
||||
|
||||
Fields are named pieces of information in or about documents, like title,
|
||||
@ -2824,7 +2987,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
a configuration directory. There can be several such directories, each of
|
||||
which defines the parameters for one index.
|
||||
|
||||
The configuration files can be edited by hand or through the Indexing
|
||||
The configuration files can be edited by hand or through the Index
|
||||
configuration dialog (Preferences menu). The GUI tool will try to respect
|
||||
your formatting and comments as much as possible, so it is quite possible
|
||||
to use both ways.
|
||||
@ -3021,6 +3184,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
window. A size of a few megabytes would seem reasonable (default:
|
||||
1MB).
|
||||
|
||||
membermaxkbs
|
||||
|
||||
This defines the maximum size in kilobytes for an archive member
|
||||
(zip, tar or rar at the moment). Bigger entries will be skipped.
|
||||
|
||||
indexallfilenames
|
||||
|
||||
Recoll indexes file names in a special section of the database to
|
||||
@ -3059,6 +3227,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
share the values for these parameters, because they usually affect both
|
||||
search and index operations.
|
||||
|
||||
indexStripChars
|
||||
|
||||
Decide if we strip characters of diacritics and convert them to
|
||||
lower-case before terms are indexed. If we don't, searches
|
||||
sensitive to case and diacritics can be performed, but the index
|
||||
will be bigger, and some marginal weirdness may sometimes occur.
|
||||
The default is a stripped index (indexStripChars = 1) for now.
|
||||
When using multiple indexes for a search, this parameter must be
|
||||
defined identically for all. Changing the value implies an index
|
||||
reset.
|
||||
|
||||
maxTermExpand
|
||||
|
||||
Maximum expansion count for a single term (e.g.: when using
|
||||
wildcards). The default of 10000 is reasonable and will avoid
|
||||
queries that appear frozen while the engine is walking the term
|
||||
list.
|
||||
|
||||
maxXapianClauses
|
||||
|
||||
Maximum number of elementary clauses we can add to a single Xapian
|
||||
query. In some cases, the result of term expansion can be
|
||||
multiplicative, and we want to avoid using excessive memory. The
|
||||
default of 100 000 should be both high enough in most cases and
|
||||
compatible with current typical hardware configurations.
|
||||
|
||||
nonumbers
|
||||
|
||||
If this is set to true, no terms will be generated for numbers. For
|
||||
@ -3200,6 +3394,22 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
5.4.1.4. Miscellaneous parameters:
|
||||
|
||||
autodiacsens
|
||||
|
||||
If the index is not stripped, decide if we automatically trigger
|
||||
diacritics sensitivity if the search term has accented characters
|
||||
(not in unac_except_trans). Else you need to use the query
|
||||
language and the D modifier to specify diacritics sensitivity.
|
||||
Default is no.
|
||||
|
||||
autocasesens
|
||||
|
||||
If the index is not stripped, decide if we automatically trigger
|
||||
character case sensitivity if the search term has upper-case
|
||||
characters in any but the first position. Else you need to use the
|
||||
query language and the C modifier to specify character-case
|
||||
sensitivity. Default is yes.
|
||||
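A hedged example of the two switches, spelling out the defaults given above. It assumes that boolean parameters take 0/1 values, as indexStripChars does, and that the index is not stripped (otherwise neither setting has any effect).

```
# Assumed prerequisite: an unstripped index.
indexStripChars = 0
# Do not switch to diacritics sensitivity automatically (default: no).
autodiacsens = 0
# Switch to case sensitivity when a term has upper-case characters
# in any but the first position (default: yes).
autocasesens = 1
```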
|
||||
loglevel,daemloglevel
|
||||
|
||||
Verbosity level for recoll and recollindex. A value of 4 lists
|
||||
@ -3238,6 +3448,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
the auxiliary databases (spelling, stemming) if needed. The
|
||||
default is one hour.
|
||||
|
||||
monioniceclass, monioniceclassdata
|
||||
|
||||
These allow defining the ionice class and data used by the indexer
|
||||
(default class 3, no data).
|
||||
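A possible setting, shown only as a sketch. It assumes that the class numbers are those of the ionice command (1 realtime, 2 best-effort, 3 idle) and that the data value is the priority level within the class.

```
# Default behaviour: idle class, which takes no data value.
monioniceclass = 3

# Assumed alternative: best-effort class at its lowest priority.
# monioniceclass = 2
# monioniceclassdata = 7
```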
|
||||
filtermaxseconds
|
||||
|
||||
Maximum filter execution time, after which it is aborted. Some
|
||||
@ -3282,6 +3497,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
Useful for cases where you don't need the functionality or when it
|
||||
is unusable because aspell crashes during dictionary generation.
|
||||
|
||||
mhmboxquirks
|
||||
|
||||
This allows defining location-related quirks for the mailbox
|
||||
handler. Currently only the tbird flag is defined, and it should
|
||||
be set for directories which hold Thunderbird data, as their
|
||||
folder format is weird.
|
||||
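Since the quirk is location-related, it would typically be set inside a directory section of the main configuration file. The path below is hypothetical and should be replaced with the directory that actually holds the Thunderbird data.

```
# Hypothetical location of a Thunderbird profile tree.
[/home/me/.thunderbird]
mhmboxquirks = tbird
```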
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.2. The fields file
|
||||
@ -3394,19 +3616,24 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
oofice instead of openoffice etc.
|
||||
|
||||
Changes to this file can be done by direct editing, or through the recoll
|
||||
user preferences dialog.
|
||||
GUI preferences dialog.
|
||||
|
||||
If Use desktop preferences to choose document editor is checked in the
|
||||
Recoll GUI user preferences, all mimeview entries will be ignored except
|
||||
the one labelled application/x-all (which is set to use xdg-open by
|
||||
default).
|
||||
Recoll GUI preferences, all mimeview entries will be ignored except the
|
||||
one labelled application/x-all (which is set to use xdg-open by default).
|
||||
|
||||
In this case, the xallexcepts top level variable defines a list of mime
|
||||
type exceptions which will be processed according to the local entries
|
||||
instead of being passed to the desktop. This is so that specific Recoll
|
||||
options such as a page number or a search string can be passed to
|
||||
applications that support them, such as the evince viewer.
|
||||
|
||||
As for the other configuration files, the normal usage is to have a
|
||||
mimeview inside your own configuration directory, with just the
|
||||
non-default entries, which will override those from the central
|
||||
configuration file.
|
||||
|
||||
Please note that these entries must be placed under a [view] section.
|
||||
All viewer definition entries must be placed under a [view] section.
|
||||
|
||||
The keys in the file are normally mime types. You can add an application
|
||||
tag to specialize the choice for an area of the filesystem (using a
|
||||
@ -3436,6 +3663,15 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
* %M. Mime type
|
||||
|
||||
* %p. Page index. Only significant for a subset of document types,
|
||||
currently only PDF, Postscript and DVI files. Can be used to start the
|
||||
editor at the right page for a match or snippet.
|
||||
|
||||
* %s. Search term. The value will only be set for documents with indexed
|
||||
page numbers (e.g. PDF). The value will be one of the matched search
|
||||
terms. It would allow pre-setting the value in the "Find" entry inside
|
||||
Evince for example, for easy highlighting of the term.
|
||||
|
||||
* %U, %u. Url.
|
||||
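As an illustration of how %p and %s can be combined with xallexcepts, here is a sketch of a personal mimeview file. It assumes that evince accepts the --page-index and --find options and that %f stands for the file name; adjust the viewer and the options to what is installed locally.

```
# Top-level variable: these types use the local entries below even
# when desktop preferences are used for everything else.
xallexcepts = application/pdf

[view]
# Open the PDF at the matching page with the search term preset.
application/pdf = evince --page-index=%p --find=%s %f
```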
|
||||
In addition to the predefined values above, all strings like %(fieldname)
|
||||
|
||||