release 2896
parent
e2185379b5
commit
c1ce9caa36
85
src/INSTALL
@ -333,7 +333,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
a configuration directory. There can be several such directories, each of
|
a configuration directory. There can be several such directories, each of
|
||||||
which define the parameters for one index.
|
which define the parameters for one index.
|
||||||
|
|
||||||
The configuration files can be edited by hand or through the Indexing
|
The configuration files can be edited by hand or through the Index
|
||||||
configuration dialog (Preferences menu). The GUI tool will try to respect
|
configuration dialog (Preferences menu). The GUI tool will try to respect
|
||||||
your formatting and comments as much as possible, so it is quite possible
|
your formatting and comments as much as possible, so it is quite possible
|
||||||
to use both ways.
|
to use both ways.
|
||||||
@ -526,6 +526,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
window. A size of a few megabytes would seem reasonable (default:
|
window. A size of a few megabytes would seem reasonable (default:
|
||||||
1MB).
|
1MB).
|
||||||
|
|
||||||
|
membermaxkbs
|
||||||
|
|
||||||
|
This defines the maximum size in kilobytes for an archive member
|
||||||
|
(zip, tar or rar at the moment). Bigger entries will be skipped.
|
||||||
|
|
||||||
indexallfilenames
|
indexallfilenames
|
||||||
|
|
||||||
Recoll indexes file names in a special section of the database to
|
Recoll indexes file names in a special section of the database to
|
||||||
@ -562,6 +567,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
share the values for these parameters, because they usually affect both
|
share the values for these parameters, because they usually affect both
|
||||||
search and index operations.
|
search and index operations.
|
||||||
|
|
||||||
|
indexStripChars
|
||||||
|
|
||||||
|
Decide if we strip characters of diacritics and convert them to
|
||||||
|
lower-case before terms are indexed. If we don't, searches
|
||||||
|
sensitive to case and diacritics can be performed, but the index
|
||||||
|
will be bigger, and some marginal weirdness may sometimes occur.
|
||||||
|
The default is a stripped index (indexStripChars = 1) for now.
|
||||||
|
When using multiple indexes for a search, this parameter must be
|
||||||
|
defined identically for all. Changing the value implies an index
|
||||||
|
reset.
|
||||||
|
|
||||||
|
maxTermExpand
|
||||||
|
|
||||||
|
Maximum expansion count for a single term (e.g.: when using
|
||||||
|
wildcards). The default of 10000 is reasonable and will avoid
|
||||||
|
queries that appear frozen while the engine is walking the term
|
||||||
|
list.
|
||||||
|
|
||||||
|
maxXapianClauses
|
||||||
|
|
||||||
|
Maximum number of elementary clauses we can add to a single Xapian
|
||||||
|
query. In some cases, the result of term expansion can be
|
||||||
|
multiplicative, and we want to avoid using excessive memory. The
|
||||||
|
default of 100 000 should be both high enough in most cases and
|
||||||
|
compatible with current typical hardware configurations.
|
||||||
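The two limits above can be pictured with a small sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the term list stands in for the index, and raising an error when a limit is reached is an assumption (the text only says that the limits exist and what their defaults are).

```python
import fnmatch
import math


def expand_wildcard(pattern, index_terms, max_term_expand=10000):
    """Return the index terms matching a wildcard pattern, up to the cap.

    Hypothetical helper: stopping with an error at the cap is an assumption.
    """
    matches = []
    for term in index_terms:
        if fnmatch.fnmatchcase(term, pattern):
            if len(matches) >= max_term_expand:
                raise ValueError(
                    f"{pattern!r} expands to more than {max_term_expand} terms")
            matches.append(term)
    return matches


def clause_count(expansions):
    """Clauses needed when every combination of expanded terms is used."""
    return math.prod(len(terms) for terms in expansions)


def check_clauses(expansions, max_xapian_clauses=100000):
    """Refuse a query whose multiplied-out expansion exceeds the clause cap."""
    count = clause_count(expansions)
    if count > max_xapian_clauses:
        raise ValueError(f"query needs {count} clauses")
    return count


terms = ["floor", "floors", "flooring", "flute", "wood", "wooden"]
first = expand_wildcard("floor*", terms)
second = expand_wildcard("wood*", terms)
print(first, second, check_clauses([first, second]))
```

The point of the second function is the multiplication: two wildcard terms that each expand modestly can still produce a large number of elementary clauses when combined.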
|
|
||||||
nonumbers
|
nonumbers
|
||||||
|
|
||||||
If this is set to true, no terms will be generated for numbers. For
|
If this is set to true, no terms will be generated for numbers. For
|
||||||
@ -699,6 +730,22 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
5.4.1.4. Miscellaneous parameters:
|
5.4.1.4. Miscellaneous parameters:
|
||||||
|
|
||||||
|
autodiacsens
|
||||||
|
|
||||||
|
If the index is not stripped, decide if we automatically trigger
|
||||||
|
diacritics sensitivity if the search term has accented characters
|
||||||
|
(not in unac_except_trans). Else you need to use the query
|
||||||
|
language and the D modifier to specify diacritics sensitivity.
|
||||||
|
Default is no.
|
||||||
|
|
||||||
|
autocasesens
|
||||||
|
|
||||||
|
If the index is not stripped, decide if we automatically trigger
|
||||||
|
character case sensitivity if the search term has upper-case
|
||||||
|
characters in any but the first position. Else you need to use the
|
||||||
|
query language and the C modifier to specify character-case
|
||||||
|
sensitivity. Default is yes.
|
||||||
|
|
||||||
loglevel,daemloglevel
|
loglevel,daemloglevel
|
||||||
|
|
||||||
Verbosity level for recoll and recollindex. A value of 4 lists
|
Verbosity level for recoll and recollindex. A value of 4 lists
|
||||||
@ -737,6 +784,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
the auxiliary databases (spelling, stemming) if needed. The
|
the auxiliary databases (spelling, stemming) if needed. The
|
||||||
default is one hour.
|
default is one hour.
|
||||||
|
|
||||||
|
monioniceclass, monioniceclassdata
|
||||||
|
|
||||||
|
These allow defining the ionice class and data used by the indexer
|
||||||
|
(default class 3, no data).
|
||||||
|
|
||||||
filtermaxseconds
|
filtermaxseconds
|
||||||
|
|
||||||
Maximum filter execution time, after which it is aborted. Some
|
Maximum filter execution time, after which it is aborted. Some
|
||||||
@ -781,6 +833,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
Useful for cases where you don't need the functionality or when it
|
Useful for cases where you don't need the functionality or when it
|
||||||
is unusable because aspell crashes during dictionary generation.
|
is unusable because aspell crashes during dictionary generation.
|
||||||
|
|
||||||
|
mhmboxquirks
|
||||||
|
|
||||||
|
This allows defining location-related quirks for the mailbox
|
||||||
|
handler. Currently only the tbird flag is defined, and it should
|
||||||
|
be set for directories which hold Thunderbird data, as their
|
||||||
|
folder format is weird.
|
||||||
|
|
||||||
5.4.2. The fields file
|
5.4.2. The fields file
|
||||||
|
|
||||||
This file contains information about dynamic fields handling in Recoll.
|
This file contains information about dynamic fields handling in Recoll.
|
||||||
@ -885,19 +944,24 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
oofice instead of openoffice etc.
|
oofice instead of openoffice etc.
|
||||||
|
|
||||||
Changes to this file can be done by direct editing, or through the recoll
|
Changes to this file can be done by direct editing, or through the recoll
|
||||||
user preferences dialog.
|
GUI preferences dialog.
|
||||||
|
|
||||||
If Use desktop preferences to choose document editor is checked in the
|
If Use desktop preferences to choose document editor is checked in the
|
||||||
Recoll GUI user preferences, all mimeview entries will be ignored except
|
Recoll GUI preferences, all mimeview entries will be ignored except the
|
||||||
the one labelled application/x-all (which is set to use xdg-open by
|
one labelled application/x-all (which is set to use xdg-open by default).
|
||||||
default).
|
|
||||||
|
In this case, the xallexcepts top level variable defines a list of mime
|
||||||
|
type exceptions which will be processed according to the local entries
|
||||||
|
instead of being passed to the desktop. This is so that specific Recoll
|
||||||
|
options such as a page number or a search string can be passed to
|
||||||
|
applications that support them, such as the evince viewer.
|
||||||
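The selection rule described above can be sketched as follows. This is a minimal illustration, not Recoll code: the function names and the in-memory VIEW table are hypothetical, the evince command line and the use of %f for the file name are assumptions, and no shell quoting is attempted.

```python
import re

# Hypothetical in-memory copy of a [view] section.
# Assumption: both command lines are illustrative, not the shipped defaults.
VIEW = {
    "application/x-all": "xdg-open %f",
    "application/pdf": "evince --page-index=%p --find=%s %f",
}


def choose_command(mime, view, use_desktop_prefs, xallexcepts):
    """Pick the viewer command for a mime type."""
    if use_desktop_prefs and mime not in xallexcepts:
        # Desktop preferences are in charge: only the catch-all entry is used.
        return view["application/x-all"]
    # Assumption: fall back to the catch-all entry if no local entry exists.
    return view.get(mime, view["application/x-all"])


def expand(command, values):
    """Replace %X escapes in one pass, leaving unknown escapes untouched."""
    return re.sub(r"%([A-Za-z])",
                  lambda m: values.get(m.group(1), m.group(0)),
                  command)


cmd = choose_command("application/pdf", VIEW, True, {"application/pdf"})
print(expand(cmd, {"p": "3", "s": "xapian", "f": "/tmp/a.pdf"}))
```

With application/pdf listed as an exception, the local entry is kept even though desktop preferences are enabled, which is what lets the page number and search term reach the viewer.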
|
|
||||||
As for the other configuration files, the normal usage is to have a
|
As for the other configuration files, the normal usage is to have a
|
||||||
mimeview inside your own configuration directory, with just the
|
mimeview inside your own configuration directory, with just the
|
||||||
non-default entries, which will override those from the central
|
non-default entries, which will override those from the central
|
||||||
configuration file.
|
configuration file.
|
||||||
|
|
||||||
Please note that these entries must be placed under a [view] section.
|
All viewer definition entries must be placed under a [view] section.
|
||||||
|
|
||||||
The keys in the file are normally mime types. You can add an application
|
The keys in the file are normally mime types. You can add an application
|
||||||
tag to specialize the choice for an area of the filesystem (using a
|
tag to specialize the choice for an area of the filesystem (using a
|
||||||
@ -927,6 +991,15 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
* %M. Mime type
|
* %M. Mime type
|
||||||
|
|
||||||
|
* %p. Page index. Only significant for a subset of document types,
|
||||||
|
currently only PDF, Postscript and DVI files. Can be used to start the
|
||||||
|
editor at the right page for a match or snippet.
|
||||||
|
|
||||||
|
* %s. Search term. The value will only be set for documents with indexed
|
||||||
|
page numbers (i.e. PDF). The value will be one of the matched search
|
||||||
|
terms. It would allow pre-setting the value in the "Find" entry inside
|
||||||
|
Evince for example, for easy highlighting of the term.
|
||||||
|
|
||||||
* %U, %u. Url.
|
* %U, %u. Url.
|
||||||
|
|
||||||
In addition to the predefined values above, all strings like %(fieldname)
|
In addition to the predefined values above, all strings like %(fieldname)
|
||||||
|
|||||||
346
src/README
@ -46,9 +46,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
2.2.2. Security aspects
|
2.2.2. Security aspects
|
||||||
|
|
||||||
2.3. Indexing configuration
|
2.3. Index configuration
|
||||||
|
|
||||||
2.3.1. The indexing configuration GUI
|
2.3.1. Index case and diacritics sensitivity
|
||||||
|
|
||||||
|
2.3.2. The index configuration GUI
|
||||||
|
|
||||||
2.4. Using Beagle WEB browser plugins
|
2.4. Using Beagle WEB browser plugins
|
||||||
|
|
||||||
@ -102,19 +104,21 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
3.4.1. Modifiers
|
3.4.1. Modifiers
|
||||||
|
|
||||||
3.5. Anchored searches and wildcards
|
3.5. Search case and diacritics sensitivity
|
||||||
|
|
||||||
3.5.1. More about wildcards
|
3.6. Anchored searches and wildcards
|
||||||
|
|
||||||
3.5.2. Anchored searches
|
3.6.1. More about wildcards
|
||||||
|
|
||||||
3.6. Desktop integration
|
3.6.2. Anchored searches
|
||||||
|
|
||||||
3.6.1. Hotkeying recoll
|
3.7. Desktop integration
|
||||||
|
|
||||||
3.6.2. The KDE Kicker Recoll applet
|
3.7.1. Hotkeying recoll
|
||||||
|
|
||||||
3.7. Multiple databases
|
3.7.2. The KDE Kicker Recoll applet
|
||||||
|
|
||||||
|
3.8. Multiple databases
|
||||||
|
|
||||||
4. Programming interface
|
4. Programming interface
|
||||||
|
|
||||||
@ -126,6 +130,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
4.1.3. Filter HTML output
|
4.1.3. Filter HTML output
|
||||||
|
|
||||||
|
4.1.4. Page numbers
|
||||||
|
|
||||||
4.2. Field data processing
|
4.2. Field data processing
|
||||||
|
|
||||||
4.3. API
|
4.3. API
|
||||||
@ -250,20 +256,36 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
plural (floor, floors), or on a verb tense (flooring, floored). Because
|
plural (floor, floors), or on a verb tense (flooring, floored). Because
|
||||||
the mechanisms used for stemming depend on the specific grammatical rules
|
the mechanisms used for stemming depend on the specific grammatical rules
|
||||||
for each language, there is a separate stemmer module for most common
|
for each language, there is a separate stemmer module for most common
|
||||||
languages where stemming makes sense. Storing documents written in
|
languages where stemming makes sense.
|
||||||
different languages in the same index is possible, and commonly done. In
|
|
||||||
this situation, you can specify several stemming languages for the index.
|
|
||||||
Recoll stores the unstemmed versions of terms in the main index and uses
|
Recoll stores the unstemmed versions of terms in the main index and uses
|
||||||
auxiliary databases for term expansion (one for each stemming language),
|
auxiliary databases for term expansion (one for each stemming language),
|
||||||
which means that you can switch stemming languages between searches, or
|
which means that you can switch stemming languages between searches, or
|
||||||
add a language without needing a full reindex. Recoll currently makes no
|
add a language without needing a full reindex.
|
||||||
attempt at automatic language recognition, which means that the stemmer
|
|
||||||
will sometimes be applied to terms from other languages with potentially
|
Storing documents written in different languages in the same index is
|
||||||
strange results. In practise, even if this introduces possibilities of
|
possible, and commonly done. In this situation, you can specify several
|
||||||
confusion, this approach has been proven quite useful, and, awaiting the
|
stemming languages for the index.
|
||||||
addition of an automatic language recognition module to Recoll, it is much
|
|
||||||
less cumbersome than separating your documents according to what language
|
Recoll currently makes no attempt at automatic language recognition, which
|
||||||
they are written in.
|
means that the stemmer will sometimes be applied to terms from other
|
||||||
|
languages with potentially strange results. In practice, even if this
|
||||||
|
introduces possibilities of confusion, this approach has been proven quite
|
||||||
|
useful, and, awaiting the addition of an automatic language recognition
|
||||||
|
module to Recoll, it is much less cumbersome than separating your
|
||||||
|
documents according to what language they are written in.
|
||||||
|
|
||||||
|
Before version 1.18, Recoll always stripped most accents and diacritics
|
||||||
|
from terms, and converted them to lower case before storing them in the
|
||||||
|
index. As a consequence, it was impossible to search for a particular
|
||||||
|
capitalization of a term (US / us), or to discriminate two terms based on
|
||||||
|
diacritics (sake / saké, mate / maté).
|
||||||
|
|
||||||
|
As of version 1.18, Recoll can optionally store the raw terms, without
|
||||||
|
accent stripping or case conversion. Expansions necessary for searches
|
||||||
|
insensitive to case and/or diacritics are then performed when searching.
|
||||||
|
This is described in more detail in the section about index case and
|
||||||
|
diacritics sensitivity.
|
||||||
|
|
||||||
Recoll has many parameters which define exactly what to index, and how to
|
Recoll has many parameters which define exactly what to index, and how to
|
||||||
classify and decode the source documents. These are kept in configuration
|
classify and decode the source documents. These are kept in configuration
|
||||||
@ -352,8 +374,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
The generated indexes can be queried concurrently in a transparent manner.
|
The generated indexes can be queried concurrently in a transparent manner.
|
||||||
|
|
||||||
For index generation, multiple configurations are totally independent from
|
For index generation, multiple configurations are totally independent from
|
||||||
each other. When multiple indexes are used for searches, some parameters
|
each other. When multiple indexes need to be used for a single search,
|
||||||
should be consistent among the configurations.
|
some parameters should be consistent among the configurations.
|
||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
@ -480,7 +502,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
2.3. Indexing configuration
|
2.3. Index configuration
|
||||||
|
|
||||||
Variables set inside the Recoll configuration files control which areas of
|
Variables set inside the Recoll configuration files control which areas of
|
||||||
the file system are indexed, and how files are processed. These variables
|
the file system are indexed, and how files are processed. These variables
|
||||||
@ -506,25 +528,63 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
2.3.1. The indexing configuration GUI
|
2.3.1. Index case and diacritics sensitivity
|
||||||
|
|
||||||
Most parameters for a given indexing configuration can be set from a
|
As of Recoll version 1.18 you have a choice of building an index with
|
||||||
recoll GUI running on this configuration (either as default, or by setting
|
terms stripped of character case and diacritics, or one with raw terms.
|
||||||
|
For a source term of Résumé, the former will store resume, the latter
|
||||||
|
Résumé.
|
||||||
|
|
||||||
|
Each type of index allows performing searches insensitive to case and
|
||||||
|
diacritics: with a raw index, the user entry will be expanded to match all
|
||||||
|
case and diacritics variations present in the index. With a stripped
|
||||||
|
index, the search term will be stripped before searching.
|
||||||
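The difference between the two index types can be sketched in a few lines. This is only an approximation under stated assumptions: strip_term is a hypothetical helper that removes Unicode combining marks and lower-cases, which is not necessarily the exact transformation Recoll applies, and plain Python sets stand in for the index.

```python
import unicodedata


def strip_term(term):
    """Approximate stripping: drop combining marks, then lower-case."""
    decomposed = unicodedata.normalize("NFD", term)
    bare = "".join(c for c in decomposed if not unicodedata.combining(c))
    return bare.lower()


def search_stripped(stripped_index, query):
    """Stripped index: strip the query, then look it up directly."""
    key = strip_term(query)
    return {t for t in stripped_index if t == key}


def search_raw_insensitive(raw_index, query):
    """Raw index: expand the query to every stored variant of the term."""
    key = strip_term(query)
    return {t for t in raw_index if strip_term(t) == key}


raw_index = {"Résumé", "resume", "RESUME", "US", "us"}
stripped_index = {strip_term(t) for t in raw_index}

print(search_stripped(stripped_index, "Résumé"))
print(search_raw_insensitive(raw_index, "resume"))
```

Both searches are insensitive to case and diacritics, but only the raw index still holds the distinct variants.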
|
|
||||||
|
A raw index allows for another possibility which a stripped index cannot
|
||||||
|
offer: using case and diacritics to discriminate between terms, returning
|
||||||
|
different results when searching for US and us or resume and résumé. Read
|
||||||
|
the section about search case and diacritics sensitivity for more details.
|
||||||
|
|
||||||
|
The type of index to be created is controlled by the indexStripChars
|
||||||
|
configuration variable which can only be changed by editing the
|
||||||
|
configuration file. Any change implies an index reset (not automated by
|
||||||
|
Recoll), and all indexes in a search must be set in the same way (again,
|
||||||
|
not checked by Recoll).
|
||||||
|
|
||||||
|
If indexStripChars is not set, Recoll 1.18 creates a stripped index by
|
||||||
|
default, for compatibility with previous versions.
|
||||||
|
|
||||||
|
As a cost for added capability, a raw index will be slightly bigger than a
|
||||||
|
stripped one (around 10%). Also, searches will be more complex, so
|
||||||
|
probably slightly slower, and the feature is still young, and a certain
|
||||||
|
amount of weirdness cannot be excluded.
|
||||||
|
|
||||||
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
|
2.3.2. The index configuration GUI
|
||||||
|
|
||||||
|
Most parameters for a given index configuration can be set from a recoll
|
||||||
|
GUI running on this configuration (either as default, or by setting
|
||||||
RECOLL_CONFDIR or the -c option.)
|
RECOLL_CONFDIR or the -c option.)
|
||||||
|
|
||||||
The interface is started from the Preferences->Indexing Configuration menu
|
The interface is started from the Preferences->Index Configuration menu
|
||||||
entry. It is divided in three tabs, Global parameters, Local parameters,
|
entry. It is divided in four tabs, Global parameters, Local parameters,
|
||||||
and Beagle web history, which is explained in the next section.
|
Beagle web history (which is explained in the next section) and Search
|
||||||
|
parameters.
|
||||||
|
|
||||||
The first tab allows setting global variables, like the lists of top
|
The Global parameters tab allows setting global variables, like the lists
|
||||||
directories, skipped paths, or stemming languages.
|
of top directories, skipped paths, or stemming languages.
|
||||||
|
|
||||||
The second tab allows setting variables that can be redefined for
|
The Local parameters tab allows setting variables that can be redefined
|
||||||
subdirectories. This second tab has an initially empty list of
|
for subdirectories. This second tab has an initially empty list of
|
||||||
customisation directories, to which you can add. The variables are then
|
customisation directories, to which you can add. The variables are then
|
||||||
set for the currently selected directory (or at the top level if the empty
|
set for the currently selected directory (or at the top level if the empty
|
||||||
line is selected).
|
line is selected).
|
||||||
|
|
||||||
|
The Search parameters section defines parameters which are used at query
|
||||||
|
time, but are global to an index and affect all search tools, not only the
|
||||||
|
GUI.
|
||||||
|
|
||||||
The meaning for most entries in the interface is self-evident and
|
The meaning for most entries in the interface is self-evident and
|
||||||
documented by a ToolTip popup on the text label. For more detail, you will
|
documented by a ToolTip popup on the text label. For more detail, you will
|
||||||
need to refer to the configuration section of this guide.
|
need to refer to the configuration section of this guide.
|
||||||
@ -550,7 +610,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
the Beagle queue directory. This supposes that Beagle is not running, else
|
the Beagle queue directory. This supposes that Beagle is not running, else
|
||||||
both programs will fight for the same files.
|
both programs will fight for the same files.
|
||||||
|
|
||||||
This feature can be enabled in the GUI indexing configuration panel, or by
|
This feature can be enabled in the GUI Index configuration panel, or by
|
||||||
editing the configuration file (set processbeaglequeue to 1).
|
editing the configuration file (set processbeaglequeue to 1).
|
||||||
|
|
||||||
There are more recent instructions about how to find and install the
|
There are more recent instructions about how to find and install the
|
||||||
@ -855,7 +915,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
Clicking the Open link will attempt to start an external viewer. The
|
Clicking the Open link will attempt to start an external viewer. The
|
||||||
viewer for each document type can be configured through the user
|
viewer for each document type can be configured through the user
|
||||||
preferences dialog, or by editing the mimeview configuration file. You can
|
preferences dialog, or by editing the mimeview configuration file. You can
|
||||||
also check the Use desktop preferences option in the user preferences
|
also check the Use desktop preferences option in the GUI preferences
|
||||||
dialog to use the desktop defaults for all documents. This is probably the
|
dialog to use the desktop defaults for all documents. This is probably the
|
||||||
best option if you are using a well configured Gnome or KDE desktop.
|
best option if you are using a well configured Gnome or KDE desktop.
|
||||||
|
|
||||||
@ -904,6 +964,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
* Open Parent document
|
* Open Parent document
|
||||||
|
|
||||||
|
* Open Snippets Window
|
||||||
|
|
||||||
The Preview and Open entries do the same thing as the corresponding links.
|
The Preview and Open entries do the same thing as the corresponding links.
|
||||||
|
|
||||||
The Copy File Name and Copy Url copy the relevant data to the clipboard,
|
The Copy File Name and Copy Url copy the relevant data to the clipboard,
|
||||||
@ -930,6 +992,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
this case. In other cases, the Open option makes sense, for example to
|
this case. In other cases, the Open option makes sense, for example to
|
||||||
start a chm viewer on the parent document for a help page.
|
start a chm viewer on the parent document for a help page.
|
||||||
|
|
||||||
|
The Open Snippets Window entry will only appear for documents which
|
||||||
|
support page breaks (typically PDF, Postscript, DVI). The snippets window
|
||||||
|
lists extracts from the document, taken around search term occurrences,
|
||||||
|
along with the corresponding page number, as links which can be used to
|
||||||
|
start the native viewer on the appropriate page. If the viewer supports
|
||||||
|
it, its search function will also be primed with one of the search terms.
|
||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.1.3. The result table
|
3.1.3. The result table
|
||||||
@ -1428,6 +1497,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
mimeview. xdg-open will in turn use your desktop preferences to choose
|
mimeview. xdg-open will in turn use your desktop preferences to choose
|
||||||
an appropriate application.
|
an appropriate application.
|
||||||
|
|
||||||
|
* Exceptions: when using the desktop preferences for opening documents,
|
||||||
|
these are mime types that will still be opened according to Recoll
|
||||||
|
preferences. This is useful for passing parameters like page numbers
|
||||||
|
or search strings to applications that support them (e.g. evince).
|
||||||
|
|
||||||
* Choose editor applications: this will let you choose the command
|
* Choose editor applications: this will let you choose the command
|
||||||
started by the Open links inside the result list, for specific
|
started by the Open links inside the result list, for specific
|
||||||
document types.
|
document types.
|
||||||
@ -1569,6 +1643,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
* %D. Date
|
* %D. Date
|
||||||
|
|
||||||
|
* %E. Precooked Snippets link (will only appear for documents indexed
|
||||||
|
with page numbers)
|
||||||
|
|
||||||
* %I. Icon image name. This is normally determined from the mime type.
|
* %I. Icon image name. This is normally determined from the mime type.
|
||||||
The associations are defined inside the mimeconf configuration file.
|
The associations are defined inside the mimeconf configuration file.
|
||||||
If a thumbnail for the file is found at the standard Freedesktop
|
If a thumbnail for the file is found at the standard Freedesktop
|
||||||
@ -1826,13 +1903,34 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
The field syntax also supports a few field-like, but special, criteria:
|
The field syntax also supports a few field-like, but special, criteria:
|
||||||
|
|
||||||
* dir for filtering the results on file location (Ex:
|
* dir for filtering the results on file location (Ex:
|
||||||
dir:/home/me/somedir). -dir also works to find results out of the
|
dir:/home/me/somedir). -dir also works to find results not in the
|
||||||
specified directory, only after release 1.15.8. A tilde inside the
|
specified directory (release >= 1.15.8). A tilde inside the value will
|
||||||
value will be expanded to the home directory. dir is not a regular
|
be expanded to the home directory. Wildcards will not be expanded. You
|
||||||
field and only one value makes sense in a query (you can't use
|
cannot use OR with dir clauses (this restriction may go away in the
|
||||||
dir:dir1 OR dir:dir2). Relative paths make sense, for example,
|
future).
|
||||||
dir:share/doc would match either /usr/share/doc or
|
|
||||||
/usr/local/share/doc
|
Relative paths also make sense, for example, dir:share/doc would match
|
||||||
|
either /usr/share/doc or /usr/local/share/doc
|
||||||
|
|
||||||
|
Several dir clauses can be specified, both positive and negative. For
|
||||||
|
example the following makes sense:
|
||||||
|
|
||||||
|
dir:recoll dir:src -dir:utils -dir:common
|
||||||
|
|
||||||
|
|
||||||
|
This would select results which have both recoll and src in the path
|
||||||
|
(in any order), and which have neither utils nor common.
|
||||||
|
|
||||||
|
Another special aspect of dir clauses is that the values in the index
|
||||||
|
are not transcoded to UTF-8, and never lower-cased or unaccented, but
|
||||||
|
stored as binary. This means that you need to enter the values in the
|
||||||
|
exact lower or upper case, and that searches for names with diacritics
|
||||||
|
may sometimes be impossible because of character set conversion
|
||||||
|
issues. Non-ASCII UNIX file paths are an unending source of trouble
|
||||||
|
and are best avoided.
|
||||||
|
|
||||||
|
You need to use double-quotes around the path value if it contains
|
||||||
|
space characters.
|
||||||
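The way several dir clauses combine can be sketched as a plain filter over file paths. This is an illustration only: in Recoll the filtering happens inside the index, the function names here are hypothetical, and matching a clause against consecutive directory components is an assumption drawn from the relative path example above.

```python
from pathlib import PurePosixPath


def dir_matches(file_path, clause):
    """True if the clause's components appear consecutively in the directory
    part of file_path. Comparison is exact, so case matters."""
    dirs = PurePosixPath(file_path).parent.parts
    want = PurePosixPath(clause).parts
    n = len(want)
    return any(dirs[i:i + n] == want for i in range(len(dirs) - n + 1))


def select(paths, include=(), exclude=()):
    """Keep paths matching every positive clause and no negative clause."""
    return [p for p in paths
            if all(dir_matches(p, c) for c in include)
            and not any(dir_matches(p, c) for c in exclude)]


paths = [
    "/home/me/recoll/src/index/a.cpp",
    "/home/me/recoll/src/utils/b.cpp",
    "/home/me/src/recoll/common/c.h",
    "/home/me/other/d.txt",
]
# Equivalent of: dir:recoll dir:src -dir:utils -dir:common
print(select(paths, include=["recoll", "src"], exclude=["utils", "common"]))
```

Positive clauses are combined with AND and negative ones each exclude on their own, which is why no OR is needed for this kind of query.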
|
|
||||||
* size for filtering the results on file size. Example: size<10000. You
|
* size for filtering the results on file size. Example: size<10000. You
|
||||||
can use <, > or = as operators. You can specify a range like the
|
can use <, > or = as operators. You can specify a range like the
|
||||||
@ -1913,12 +2011,68 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
* p can be used to turn the default phrase search into a proximity one
|
* p can be used to turn the default phrase search into a proximity one
|
||||||
(unordered). Example:"order any in"p
|
(unordered). Example:"order any in"p
|
||||||
|
|
||||||
|
* C will turn on case sensitivity (if the index supports it).
|
||||||
|
|
||||||
|
* D will turn on diacritics sensitivity (if the index supports it).
|
||||||
|
|
||||||
* A weight can be specified for a query element by specifying a decimal
|
* A weight can be specified for a query element by specifying a decimal
|
||||||
value at the start of the modifiers. Example: "Important"2.5.
|
value at the start of the modifiers. Example: "Important"2.5.
|
||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.5. Anchored searches and wildcards
|
3.5. Search case and diacritics sensitivity
|
||||||
|
|
||||||
|
For Recoll versions 1.18 and later, and when working with a raw index (not
|
||||||
|
the default), searches can be made sensitive to character case and
|
||||||
|
diacritics. How this happens is controlled by configuration variables and
|
||||||
|
what search data is entered.
|
||||||
|
|
||||||
|
The general default is that searches are insensitive to case and
|
||||||
|
diacritics. An entry of resume will match any of Résumé, RESUME, résumé,
|
||||||
|
Resume etc.
|
||||||
|
|
||||||
|
Two configuration variables can automate switching on sensitivity:
|
||||||
|
|
||||||
|
autodiacsens
|
||||||
|
|
||||||
|
If this is set, search sensitivity to diacritics will be turned on
|
||||||
|
as soon as an accented character exists in a search term. When the
|
||||||
|
variable is set to true, resume will start a
|
||||||
|
diacritics-insensitive search, but résumé will be matched exactly.
|
||||||
|
The default value is false.
|
||||||
|
|
||||||
|
autocasesens
|
||||||
|
|
||||||
|
If this is set, search sensitivity to character case will be
|
||||||
|
turned on as soon as an upper-case character exists in a search
|
||||||
|
term except for the first one. When the variable is set to true,
|
||||||
|
us or Us will start a case-insensitive search, but US will
|
||||||
|
be matched exactly. The default value is true (contrary to
|
||||||
|
autodiacsens).
|
||||||
|
|
||||||
|
As in the past, capitalizing the first letter of a word will turn off its
|
||||||
|
stem expansion and have no effect on case-sensitivity.
|
||||||
|
|
||||||
|
You can also explicitly activate case and diacritics sensitivity by using
|
||||||
|
modifiers with the query language. C will make the term case-sensitive,
|
||||||
|
and D will make it diacritics-sensitive. Examples:
|
||||||
|
|
||||||
|
"us"C
|
||||||
|
|
||||||
|
|
||||||
|
will search for the term us exactly (Us will not be a match).
|
||||||
|
|
||||||
|
"résumé"D
|
||||||
|
|
||||||
|
|
||||||
|
will search for the term résumé exactly (resume will not be a match).
|
||||||
|
|
||||||
|
When either case or diacritics sensitivity is activated, stem expansion is
|
||||||
|
turned off. Combining stemming with either sensitivity does not make much sense.
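The rules in this section can be summarized with a small Python sketch. It
is only an illustration of the documented behaviour for a raw index, not
Recoll's actual code: the function names and the returned dictionary are
hypothetical, and accented characters are detected with a simple Unicode
decomposition, which is an assumption.

```python
import unicodedata


def has_diacritics(term):
    # True if any character decomposes into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", term)
    return any(unicodedata.combining(c) for c in decomposed)


def has_inner_uppercase(term):
    # Upper-case character in any position except the first one.
    return any(c.isupper() for c in term[1:])


def search_flags(term, modifiers="", autodiacsens=False, autocasesens=True):
    # Hypothetical helper: decide how one search term would be treated.
    # The defaults mirror the documented ones (autodiacsens off, autocasesens on).
    diac = "D" in modifiers or (autodiacsens and has_diacritics(term))
    case = "C" in modifiers or (autocasesens and has_inner_uppercase(term))
    # Stem expansion is off when the term is capitalized or when either
    # kind of sensitivity is active.
    stem = not (diac or case or term[:1].isupper())
    return {"case": case, "diac": diac, "stem": stem}


print(search_flags("us"))
print(search_flags("US"))
print(search_flags("résumé", autodiacsens=True))
```

With the default settings, only US switches on case sensitivity, and résumé
switches on diacritics sensitivity only when autodiacsens is set.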
|
||||||
|
|
||||||
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
|
3.6. Anchored searches and wildcards
|
||||||
|
|
||||||
Some special characters are interpreted by Recoll in search strings to
|
Some special characters are interpreted by Recoll in search strings to
|
||||||
expand or specialize the search. Wildcards expand a root term in
|
expand or specialize the search. Wildcards expand a root term in
|
||||||
@ -1928,7 +2082,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.5.1. More about wildcards
|
3.6.1. More about wildcards
|
||||||
|
|
||||||
All words entered in Recoll search fields will be processed for wildcard
|
All words entered in Recoll search fields will be processed for wildcard
|
||||||
expansion before the request is finally executed.
|
expansion before the request is finally executed.
|
||||||
@ -1959,7 +2113,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.5.2. Anchored searches
|
3.6.2. Anchored searches
|
||||||
|
|
||||||
Two characters are used to specify that a search hit should occur at the
|
Two characters are used to specify that a search hit should occur at the
|
||||||
beginning or at the end of the text. ^ at the beginning of a term or
|
beginning or at the end of the text. ^ at the beginning of a term or
|
||||||
@ -1984,14 +2138,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.6. Desktop integration
|
3.7. Desktop integration
|
||||||
|
|
||||||
Being independent of the desktop type has its drawbacks: Recoll desktop
|
Being independent of the desktop type has its drawbacks: Recoll desktop
|
||||||
integration is minimal. Here follow a few things that may help.
|
integration is minimal. Here follow a few things that may help.
|
||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.6.1. Hotkeying recoll
|
3.7.1. Hotkeying recoll
|
||||||
|
|
||||||
It is surprisingly convenient to be able to show or hide the Recoll GUI
|
It is surprisingly convenient to be able to show or hide the Recoll GUI
|
||||||
with a single keystroke. Recoll comes with a small Python script, based on
|
with a single keystroke. Recoll comes with a small Python script, based on
|
||||||
@ -2000,7 +2154,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.6.2. The KDE Kicker Recoll applet
|
3.7.2. The KDE Kicker Recoll applet
|
||||||
|
|
||||||
The Recoll source tree contains the source code to the recoll_applet, a
|
The Recoll source tree contains the source code to the recoll_applet, a
|
||||||
small application derived from the find_applet. This can be used to add a
|
small application derived from the find_applet. This can be used to add a
|
||||||
@ -2023,7 +2177,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
3.7. Multiple databases
|
3.8. Multiple databases
|
||||||
|
|
||||||
Multiple Recoll databases or indexes can be created by using several
|
Multiple Recoll databases or indexes can be created by using several
|
||||||
configuration directories which are usually set to index different areas
|
configuration directories which are usually set to index different areas
|
||||||
@ -2216,6 +2370,15 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
|
4.1.4. Page numbers
|
||||||
|
|
||||||
|
The indexer will interpret ^L characters in the filter output as
|
||||||
|
indicating page breaks, and will record them. At query time, this allows
|
||||||
|
starting a viewer on the right page for a hit or a snippet. Currently,
|
||||||
|
only the PDF, Postscript and DVI filters generate page breaks.
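As a rough illustration of the idea (not the indexer's actual code), the page
for a hit could be derived by counting the ^L (form feed) characters that
precede it in the filter output. The function name and the use of a character
offset are assumptions made for this sketch.

```python
def page_for_offset(text, offset):
    # 1-based page number: one more than the number of form feeds
    # found before the given character offset.
    return text.count("\f", 0, offset) + 1


filter_output = "page one\fpage two\fpage three"
print(page_for_offset(filter_output, filter_output.index("two")))
```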
|
||||||
|
|
||||||
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
4.2. Field data processing
|
4.2. Field data processing
|
||||||
|
|
||||||
Fields are named pieces of information in or about documents, like title,
|
Fields are named pieces of information in or about documents, like title,
|
||||||
@ -2824,7 +2987,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
a configuration directory. There can be several such directories, each of
|
a configuration directory. There can be several such directories, each of
|
||||||
which define the parameters for one index.
|
which define the parameters for one index.
|
||||||
|
|
||||||
The configuration files can be edited by hand or through the Indexing
|
The configuration files can be edited by hand or through the Index
|
||||||
configuration dialog (Preferences menu). The GUI tool will try to respect
|
configuration dialog (Preferences menu). The GUI tool will try to respect
|
||||||
your formatting and comments as much as possible, so it is quite possible
|
your formatting and comments as much as possible, so it is quite possible
|
||||||
to use both ways.
|
to use both ways.
|
||||||
@ -3021,6 +3184,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
window. A size of a few megabytes would seem reasonable (default:
|
window. A size of a few megabytes would seem reasonable (default:
|
||||||
1MB).
|
1MB).
|
||||||
|
|
||||||
|
membermaxkbs
|
||||||
|
|
||||||
|
This defines the maximum size in kilobytes for an archive member
|
||||||
|
(zip, tar or rar at the moment). Bigger entries will be skipped.
|
||||||
|
|
||||||
indexallfilenames
|
indexallfilenames
|
||||||
|
|
||||||
Recoll indexes file names in a special section of the database to
|
Recoll indexes file names in a special section of the database to
|
||||||
@ -3059,6 +3227,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
share the values for these parameters, because they usually affect both
|
share the values for these parameters, because they usually affect both
|
||||||
search and index operations.
|
search and index operations.
|
||||||
|
|
||||||
|
indexStripChars
|
||||||
|
|
||||||
|
Decide if we strip characters of diacritics and convert them to
|
||||||
|
lower-case before terms are indexed. If we don't, searches
|
||||||
|
sensitive to case and diacritics can be performed, but the index
|
||||||
|
will be bigger, and some marginal weirdness may sometimes occur.
|
||||||
|
The default is a stripped index (indexStripChars = 1) for now.
|
||||||
|
When using multiple indexes for a search, this parameter must be
|
||||||
|
defined identically for all. Changing the value implies an index
|
||||||
|
reset.
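To make the effect of a stripped index concrete, here is a minimal Python
sketch of what stripping a term could look like. It is an approximation
based on Unicode decomposition, not Recoll's own implementation, and it
ignores any configured exceptions; strip_term is a hypothetical name.

```python
import unicodedata


def strip_term(term):
    # Decompose accented characters, drop the combining marks,
    # then convert the result to lower-case.
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


print(strip_term("Résumé"))
```

In a stripped index, Résumé, résumé and RESUME would all be stored under the
same term, which is why sensitive searches need a raw index.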
|
||||||
|
|
||||||
|
maxTermExpand
|
||||||
|
|
||||||
|
Maximum expansion count for a single term (e.g.: when using
|
||||||
|
wildcards). The default of 10000 is reasonable and will avoid
|
||||||
|
queries that appear frozen while the engine is walking the term
|
||||||
|
list.
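The following Python sketch shows why such a cap is useful. It expands a
wildcard against a term list and stops once the limit is exceeded. The
function name, the use of fnmatch and the choice to raise an error are
assumptions for illustration; how Recoll reacts when the limit is reached is
not described here.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern, index_terms, max_term_expand=10000):
    # Collect the index terms matching the pattern, giving up as soon as
    # the expansion grows beyond the configured maximum.
    matches = []
    for term in index_terms:
        if fnmatchcase(term, pattern):
            matches.append(term)
            if len(matches) > max_term_expand:
                raise ValueError("too many expansions for " + pattern)
    return matches


print(expand_wildcard("rec*", ["recoll", "record", "xapian"]))
```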
|
||||||
|
|
||||||
|
maxXapianClauses
|
||||||
|
|
||||||
|
Maximum number of elementary clauses we can add to a single Xapian
|
||||||
|
query. In some cases, the result of term expansion can be
|
||||||
|
multiplicative, and we want to avoid using excessive memory. The
|
||||||
|
default of 100 000 should be both high enough in most cases and
|
||||||
|
compatible with current typical hardware configurations.
|
||||||
|
|
||||||
nonumbers
|
nonumbers
|
||||||
|
|
||||||
If this is set to true, no terms will be generated for numbers. For
|
If this is set to true, no terms will be generated for numbers. For
|
||||||
@ -3200,6 +3394,22 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
5.4.1.4. Miscellaneous parameters:
|
5.4.1.4. Miscellaneous parameters:
|
||||||
|
|
||||||
|
autodiacsens
|
||||||
|
|
||||||
|
If the index is not stripped, decide if we automatically trigger
|
||||||
|
diacritics sensitivity if the search term has accented characters
|
||||||
|
(not in unac_except_trans). Else you need to use the query
|
||||||
|
language and the D modifier to specify diacritics sensitivity.
|
||||||
|
Default is no.
|
||||||
|
|
||||||
|
autocasesens
|
||||||
|
|
||||||
|
If the index is not stripped, decide if we automatically trigger
|
||||||
|
character case sensitivity if the search term has upper-case
|
||||||
|
characters in any but the first position. Else you need to use the
|
||||||
|
query language and the C modifier to specify character-case
|
||||||
|
sensitivity. Default is yes.
|
||||||
|
|
||||||
loglevel,daemloglevel
|
loglevel,daemloglevel
|
||||||
|
|
||||||
Verbosity level for recoll and recollindex. A value of 4 lists
|
Verbosity level for recoll and recollindex. A value of 4 lists
|
||||||
@ -3238,6 +3448,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
the auxiliary databases (spelling, stemming) if needed. The
|
the auxiliary databases (spelling, stemming) if needed. The
|
||||||
default is one hour.
|
default is one hour.
|
||||||
|
|
||||||
|
monioniceclass, monioniceclassdata
|
||||||
|
|
||||||
|
These allow defining the ionice class and data used by the indexer
|
||||||
|
(default class 3, no data).
|
||||||
|
|
||||||
filtermaxseconds
|
filtermaxseconds
|
||||||
|
|
||||||
Maximum filter execution time, after which it is aborted. Some
|
Maximum filter execution time, after which it is aborted. Some
|
||||||
@ -3282,6 +3497,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
Useful for cases where you don't need the functionality or when it
|
Useful for cases where you don't need the functionality or when it
|
||||||
is unusable because aspell crashes during dictionary generation.
|
is unusable because aspell crashes during dictionary generation.
|
||||||
|
|
||||||
|
mhmboxquirks
|
||||||
|
|
||||||
|
This allows defining location-related quirks for the mailbox
|
||||||
|
handler. Currently only the tbird flag is defined, and it should
|
||||||
|
be set for directories which hold Thunderbird data, as their
|
||||||
|
folder format is weird.
|
||||||
|
|
||||||
----------------------------------------------------------------------
|
----------------------------------------------------------------------
|
||||||
|
|
||||||
5.4.2. The fields file
|
5.4.2. The fields file
|
||||||
@ -3394,19 +3616,24 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
oofice instead of openoffice etc.
|
oofice instead of openoffice etc.
|
||||||
|
|
||||||
Changes to this file can be done by direct editing, or through the recoll
|
Changes to this file can be done by direct editing, or through the recoll
|
||||||
user preferences dialog.
|
GUI preferences dialog.
|
||||||
|
|
||||||
If Use desktop preferences to choose document editor is checked in the
|
If Use desktop preferences to choose document editor is checked in the
|
||||||
Recoll GUI user preferences, all mimeview entries will be ignored except
|
Recoll GUI preferences, all mimeview entries will be ignored except the
|
||||||
the one labelled application/x-all (which is set to use xdg-open by
|
one labelled application/x-all (which is set to use xdg-open by default).
|
||||||
default).
|
|
||||||
|
In this case, the xallexcepts top level variable defines a list of mime
|
||||||
|
type exceptions which will be processed according to the local entries
|
||||||
|
instead of being passed to the desktop. This is so that specific Recoll
|
||||||
|
options such as a page number or a search string can be passed to
|
||||||
|
applications that support them, such as the evince viewer.
|
||||||
|
|
||||||
As for the other configuration files, the normal usage is to have a
|
As for the other configuration files, the normal usage is to have a
|
||||||
mimeview inside your own configuration directory, with just the
|
mimeview inside your own configuration directory, with just the
|
||||||
non-default entries, which will override those from the central
|
non-default entries, which will override those from the central
|
||||||
configuration file.
|
configuration file.
|
||||||
|
|
||||||
Please note that these entries must be placed under a [view] section.
|
All viewer definition entries must be placed under a [view] section.
|
||||||
|
|
||||||
The keys in the file are normally mime types. You can add an application
|
The keys in the file are normally mime types. You can add an application
|
||||||
tag to specialize the choice for an area of the filesystem (using a
|
tag to specialize the choice for an area of the filesystem (using a
|
||||||
@ -3436,6 +3663,15 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
|||||||
|
|
||||||
* %M. Mime type
|
* %M. Mime type
|
||||||
|
|
||||||
|
* %p. Page index. Only significant for a subset of document types,
|
||||||
|
currently only PDF, Postscript and DVI files. Can be used to start the
|
||||||
|
editor at the right page for a match or snippet.
|
||||||
|
|
||||||
|
* %s. Search term. The value will only be set for documents with indexed
|
||||||
|
page numbers (i.e. PDF). The value will be one of the matched search
|
||||||
|
terms. It would allow pre-setting the value in the "Find" entry inside
|
||||||
|
Evince for example, for easy highlighting of the term.
|
||||||
|
|
||||||
* %U, %u. Url.
|
* %U, %u. Url.
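As an illustration of how such substitutions work, the sketch below expands
%p and %s in a viewer command template. The viewer name and its options are
made up for the example, and expand_viewer_command is a hypothetical helper,
not part of Recoll.

```python
def expand_viewer_command(template, values):
    # Replace each %X sequence whose letter X has a known value;
    # any other text, including unknown % sequences, is copied unchanged.
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in values:
            out.append(values[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical viewer and options, for illustration only.
command = expand_viewer_command("viewer --page=%p --find=%s",
                                {"p": "3", "s": "recoll"})
print(command)
```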
|
||||||
|
|
||||||
In addition to the predefined values above, all strings like %(fieldname)
|
In addition to the predefined values above, all strings like %(fieldname)
|
||||||
|
|||||||