comments and doc
parent a1e98c1bdc
commit 9e2e73a995
@@ -49,7 +49,7 @@ usermanual.pdf: usermanual.xml recoll.conf.xml
dblatex --xslt-opts="--xinclude" -tpdf $<
UTILBUILDS=/home/dockes/tmp/builds/medocutils/
recoll-conf-xml:
recoll.conf.xml: ../../sampleconf/recoll.conf
$(UTILBUILDS)/confxml --docbook \
--idprefix=RCL.INSTALL.CONFIG.RECOLLCONF \
../../sampleconf/recoll.conf > recoll.conf.xml
@@ -65,7 +65,7 @@ recoll-conf-xml:
# script.
# Also could not get readthedocs to generate the left pane TOC? could
# probably be fixed...
#usermanual-rst: recoll-conf-xml
#usermanual-rst: recoll.conf.xml
# tail -n +2 recoll.conf.xml > rcl-conf-tail.xml
# sed -e '/xi:include/r rcl-conf-tail.xml' \
# < usermanual.xml > full-man.xml
@@ -8,26 +8,34 @@
<listitem><para>Space-separated list of files or
directories to recursively index. Default to ~ (indexes
$HOME). You can use symbolic links in the list; they will be followed,
independently of the value of the followLinks variable.</para></listitem></varlistentry>
independently of the value of the followLinks variable.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MONITORDIRS">
<term><varname>monitordirs</varname></term>
<listitem><para>Space-separated list of files or directories to monitor for
updates. When running the real-time indexer, this allows monitoring only a
subset of the whole indexed area. The elements must be included in the
tree defined by the 'topdirs' members.</para></listitem></varlistentry>
tree defined by the 'topdirs' members.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES">
<term><varname>skippedNames</varname></term>
<listitem><para>Files and directories which should be ignored. White space separated list of wildcard patterns (simple ones, not paths, must contain no
'/' characters), which will be tested against file and directory names. Have a look at the default
configuration for the initial value, some entries may not suit your situation. The easiest way to
see it is through the GUI Index configuration "local parameters" panel. The list in the default
configuration does not exclude hidden directories (names beginning with a dot), which means that
it may index quite a few things that you do not want. On the other hand, email user agents like
Thunderbird usually store messages in hidden directories, and you probably want this indexed. One
possible solution is to have ".*" in "skippedNames", and add things like "~/.thunderbird"
"~/.evolution" to "topdirs". Not even the file names are indexed for patterns in this list, see
the "noContentSuffixes" variable for an alternative approach which indexes the file names. Can be
redefined for any subtree.</para></listitem></varlistentry>
'/' characters), which will be tested against file and directory names.
</para><para>
Have a look at the default configuration for the initial value; some entries may not suit your
situation. The easiest way to see it is through the GUI Index configuration "local parameters"
panel.
</para><para>
The list in the default configuration does not exclude hidden directories (names beginning with a
dot), which means that it may index quite a few things that you do not want. On the other hand,
email user agents like Thunderbird usually store messages in hidden directories, and you probably
want this indexed. One possible solution is to have ".*" in "skippedNames", and add things like
"~/.thunderbird" "~/.evolution" to "topdirs".
</para><para>
Not even the file names are indexed for patterns in this list; see the "noContentSuffixes"
variable for an alternative approach which indexes the file names. Can be redefined for any
subtree.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES-">
<term><varname>skippedNames-</varname></term>
<listitem><para>List of name endings to remove from the default skippedNames
@@ -40,7 +48,8 @@ list. </para></listitem></varlistentry>
<term><varname>onlyNames</varname></term>
<listitem><para>Regular file name filter patterns. If this is set, only the file names not in skippedNames and
matching one of the patterns will be considered for indexing. Can be
redefined per subtree. Does not apply to directories.</para></listitem></varlistentry>
redefined per subtree. Does not apply to directories.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.NOCONTENTSUFFIXES">
<term><varname>noContentSuffixes</varname></term>
<listitem><para>List of name endings (not necessarily dot-separated suffixes) for
@@ -51,7 +60,8 @@ which will go away in a future release (the move from mimemap to
recoll.conf allows editing the list through the GUI). This is different
from skippedNames because these are name ending matches only (not
wildcard patterns), and the file name itself gets indexed normally. This
can be redefined for subdirectories.</para></listitem></varlistentry>
can be redefined for subdirectories.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.NOCONTENTSUFFIXES-">
<term><varname>noContentSuffixes-</varname></term>
<listitem><para>List of name endings to remove from the default noContentSuffixes
@@ -62,19 +72,26 @@ list. </para></listitem></varlistentry>
list. </para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDPATHS">
<term><varname>skippedPaths</varname></term>
<listitem><para>Absolute paths we should not go into. Space-separated list of wildcard expressions for absolute
filesystem paths. Must be defined at the top level of the configuration
file, not in a subsection. Can contain files and directories. The database and
configuration directories will automatically be added. The expressions
are matched using 'fnmatch(3)' with the FNM_PATHNAME flag set by
default. This means that '/' characters must be matched explicitly. You
can set 'skippedPathsFnmPathname' to 0 to disable the use of FNM_PATHNAME
(meaning that '/*/dir3' will match '/dir1/dir2/dir3'). The default value
contains the usual mount point for removable media to remind you that it
is a bad idea to have Recoll work on these (esp. with the monitor: media
gets indexed on mount, all data gets erased on unmount). Explicitly
adding '/media/xxx' to the 'topdirs' variable will override
this.</para></listitem></varlistentry>
<listitem><para>Absolute paths we should not go into. Space-separated list of wildcard expressions for absolute filesystem paths (for files or
directories). The variable must be defined at the top level of the configuration file, not in a
subsection.
</para><para>
Any value in the list must be textually consistent with the values in topdirs; no attempts are
made to resolve symbolic links. In practice, if, as is frequently the case, /home is a link to
/usr/home, your default topdirs will have a single entry '~' which will be translated to
'/home/yourlogin'. In this case, any skippedPaths entry should start with '/home/yourlogin' *not*
with '/usr/home/yourlogin'.
</para><para>
The index and configuration directories will automatically be added to the list.
</para><para>
The expressions are matched using 'fnmatch(3)' with the FNM_PATHNAME flag set by default. This
means that '/' characters must be matched explicitly. You can set 'skippedPathsFnmPathname' to 0
to disable the use of FNM_PATHNAME (meaning that '/*/dir3' will match '/dir1/dir2/dir3').
</para><para>
The default value contains the usual mount point for removable media to remind you that it is in
most cases a bad idea to have Recoll work on these. Explicitly adding '/media/xxx' to the 'topdirs'
variable will override this.
</para></listitem></varlistentry>
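The effect of FNM_PATHNAME on skippedPaths patterns can be illustrated with a small Python sketch. Python's standard fnmatch module has no equivalent flag, so the helper below translates the pattern to a regular expression by hand. The function names are hypothetical, bracket expressions are not handled, and this is not Recoll's actual matching code.

```python
import re


def glob_to_regex(pattern: str, pathname: bool) -> str:
    """Translate a simple wildcard pattern ('*' and '?') to a regex.

    With pathname=True, wildcards never match '/', which mimics the
    FNM_PATHNAME flag of fnmatch(3). Hypothetical helper for illustration.
    """
    any_char = "[^/]" if pathname else "."
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(any_char + "*")
        elif ch == "?":
            parts.append(any_char)
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def path_matches(pattern: str, path: str, pathname: bool = True) -> bool:
    """Return True if the whole path matches the wildcard pattern."""
    regex = glob_to_regex(pattern, pathname)
    return re.fullmatch(regex, path, flags=re.DOTALL) is not None


# With FNM_PATHNAME semantics, '*' cannot span two directory levels.
print(path_matches("/*/dir3", "/dir1/dir2/dir3"))                  # False
# Without it (skippedPathsFnmPathname = 0), the same pattern matches.
print(path_matches("/*/dir3", "/dir1/dir2/dir3", pathname=False))  # True
print(path_matches("/*/dir3", "/dir1/dir3"))                       # True
```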
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDPATHSFNMPATHNAME">
<term><varname>skippedPathsFnmPathname</varname></term>
<listitem><para>Set to 0 to
@@ -83,13 +100,15 @@ paths. </para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.NOWALKFN">
<term><varname>nowalkfn</varname></term>
<listitem><para>File name which will cause its parent directory to be skipped. Any directory containing a file with this name will be skipped as
if it was part of the skippedPaths list. Ex: .recoll-noindex</para></listitem></varlistentry>
if it was part of the skippedPaths list. Ex: .recoll-noindex
</para></listitem></varlistentry>
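The nowalkfn behaviour described above (a marker file causing its directory to be skipped) can be sketched with a directory walk in Python. The function name is hypothetical and the code only illustrates the idea; it is not how the Recoll indexer is implemented.

```python
import os


def walk_skipping(top, nowalkfn=".recoll-noindex"):
    """Yield file paths under top, skipping any directory (and its
    subtree) that contains a file named nowalkfn."""
    for dirpath, dirnames, filenames in os.walk(top):
        if nowalkfn in filenames:
            # Clearing dirnames in place stops os.walk from descending.
            dirnames[:] = []
            continue
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning the subtree, rather than filtering results afterwards, avoids reading directories that are going to be ignored anyway.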
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DAEMSKIPPEDPATHS">
<term><varname>daemSkippedPaths</varname></term>
<listitem><para>skippedPaths equivalent specific to
real time indexing. This enables having parts of the tree
which are initially indexed but not monitored. If daemSkippedPaths is
not set, the daemon uses skippedPaths.</para></listitem></varlistentry>
not set, the daemon uses skippedPaths.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ZIPUSESKIPPEDNAMES">
<term><varname>zipUseSkippedNames</varname></term>
<listitem><para>Use skippedNames inside Zip archives. Fetched
@@ -115,7 +134,8 @@ multiple indexing of linked files. No effort is made to avoid duplication
when this option is set to true. This option can be set individually for
each of the 'topdirs' members by using sections. It can not be changed
below the 'topdirs' level. Links in the 'topdirs' list itself are always
followed.</para></listitem></varlistentry>
followed.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.INDEXEDMIMETYPES">
<term><varname>indexedmimetypes</varname></term>
<listitem><para>Restrictive list of
@@ -124,14 +144,16 @@ supported types are indexed). If it is set, only the types from the list
will have their contents indexed. The names will be indexed anyway if
indexallfilenames is set (default). MIME type names should be taken from
the mimemap file (the values may be different from xdg-mime or file -i
output in some cases). Can be redefined for subtrees.</para></listitem></varlistentry>
output in some cases). Can be redefined for subtrees.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.EXCLUDEDMIMETYPES">
<term><varname>excludedmimetypes</varname></term>
<listitem><para>List of excluded MIME
types. Lets you exclude some types from indexing. MIME type
names should be taken from the mimemap file (the values may be different
from xdg-mime or file -i output in some cases). Can be redefined for
subtrees.</para></listitem></varlistentry>
subtrees.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.NOMD5TYPES">
<term><varname>nomd5types</varname></term>
<listitem><para>Don't compute md5 for these types. md5 checksums are used only for deduplicating results, and can be
@@ -140,32 +162,37 @@ lets you turn off md5 computation for selected types. It is global (no
redefinition for subtrees). At the moment, it only has an effect for
external handlers (exec and execm). The file types can be specified by
listing either MIME types (e.g. audio/mpeg) or handler names
(e.g. rclaudio).</para></listitem></varlistentry>
(e.g. rclaudio).
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.COMPRESSEDFILEMAXKBS">
<term><varname>compressedfilemaxkbs</varname></term>
<listitem><para>Size limit for compressed
files. We need to decompress these in a
temporary directory for identification, which can be wasteful in some
cases. Limit the waste. Negative means no limit. 0 results in no
processing of any compressed file. Default 50 MB.</para></listitem></varlistentry>
processing of any compressed file. Default 50 MB.
</para></listitem></varlistentry>
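The three cases for compressedfilemaxkbs (negative, zero, positive) can be written out as a small Python sketch. The function name is hypothetical, the limit of 50000 is only an illustrative value, and treating a file exactly at the limit as acceptable is an assumption, since the text does not say whether the bound is inclusive.

```python
def may_decompress(size_kb: int, compressedfilemaxkbs: int) -> bool:
    """Return True if a compressed file of size_kb kilobytes may be processed."""
    if compressedfilemaxkbs < 0:
        # Negative value: no limit at all.
        return True
    if compressedfilemaxkbs == 0:
        # Zero: no compressed file is ever processed.
        return False
    # Positive value: assumed to be an inclusive upper bound.
    return size_kb <= compressedfilemaxkbs


print(may_decompress(10_000, -1))      # True
print(may_decompress(10_000, 0))       # False
print(may_decompress(10_000, 50_000))  # True
print(may_decompress(60_000, 50_000))  # False
```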
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.TEXTFILEMAXMBS">
<term><varname>textfilemaxmbs</varname></term>
<listitem><para>Size limit for text
files. Mostly for skipping monster
logs. Default 20 MB.</para></listitem></varlistentry>
logs. Default 20 MB.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.INDEXALLFILENAMES">
<term><varname>indexallfilenames</varname></term>
<listitem><para>Index the file names of
unprocessed files. Index the names of files the contents of
which we don't index because of an excluded or unsupported MIME
type.</para></listitem></varlistentry>
type.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.USESYSTEMFILECOMMAND">
<term><varname>usesystemfilecommand</varname></term>
<listitem><para>Use a system command
for file MIME type guessing as a final step in file type
identification. This is generally useful, but will usually
cause the indexing of many bogus 'text' files. See 'systemfilecommand'
for the command used.</para></listitem></varlistentry>
for the command used.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SYSTEMFILECOMMAND">
<term><varname>systemfilecommand</varname></term>
<listitem><para>Command used to guess
@@ -173,12 +200,14 @@ MIME types if the internal methods fails This should be a
"file -i" workalike. The file path will be added as a last parameter to
the command line. "xdg-mime" works better than the traditional "file"
command, and is now the configured default (with a hard-coded fallback to
"file").</para></listitem></varlistentry>
"file").
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.PROCESSWEBQUEUE">
<term><varname>processwebqueue</varname></term>
<listitem><para>Decide if we process the
Web queue. The queue is a directory where the Recoll Web
browser plugins create the copies of visited pages.</para></listitem></varlistentry>
browser plugins create the copies of visited pages.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.TEXTFILEPAGEKBS">
<term><varname>textfilepagekbs</varname></term>
<listitem><para>Page size for text
@@ -187,12 +216,14 @@ into documents of approximately this size. Will reduce memory usage at
index time and help with loading data in the preview window at query
time. Particularly useful with very big files, such as application or
system logs. Also see textfilemaxmbs and
compressedfilemaxkbs.</para></listitem></varlistentry>
compressedfilemaxkbs.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MEMBERMAXKBS">
<term><varname>membermaxkbs</varname></term>
<listitem><para>Size limit for archive
members. This is passed to the filters in the environment
as RECOLL_FILTER_MAXMEMBERKB.</para></listitem></varlistentry>
as RECOLL_FILTER_MAXMEMBERKB.
</para></listitem></varlistentry>
</variablelist></sect3>
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.TERMS">
<title>Parameters affecting how we generate terms and organize the index </title><variablelist>
@@ -204,28 +235,34 @@ searches sensitive to case and diacritics can be performed, but the index
will be bigger, and some marginal weirdness may sometimes occur. The
default is a stripped index. When using multiple indexes for a search,
this parameter must be defined identically for all. Changing the value
implies an index reset.</para></listitem></varlistentry>
implies an index reset.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.INDEXSTOREDOCTEXT">
<term><varname>indexStoreDocText</varname></term>
<listitem><para>Decide if we store the
documents' text content in the index. Storing the text
allows extracting snippets from it at query time, instead of building
them from index position data.
</para><para>
Newer Xapian index formats have rendered our use of positions list
unacceptably slow in some cases. The last Xapian index format with good
performance for the old method is Chert, which is default for 1.2, still
supported but not default in 1.4 and will be dropped in 1.6.
</para><para>
The stored document text is translated from its original format to UTF-8
plain text, but not stripped of upper-case, diacritics, or punctuation
signs. Storing it increases the index size by 10-20% typically, but also
allows for nicer snippets, so it may be worth enabling it even if not
strictly needed for performance if you can afford the space.
</para><para>
The variable only has an effect when creating an index, meaning that the
xapiandb directory must not exist yet. Its exact effect depends on the
Xapian version.
</para><para>
For Xapian 1.4, if the variable is set to 0, the Chert format will be
used, and the text will not be stored. If the variable is 1, Glass will
be used, and the text stored.
</para><para>
For Xapian 1.2, and for versions 1.5 and newer, the index format is
always the default, but the variable controls if the text is stored or
not, and the abstract generation method. With Xapian 1.5 and later, and
@@ -242,26 +279,31 @@ still be). Numbers are often quite interesting to search for, and this
should probably not be set except for special situations, i.e., scientific
documents with huge amounts of numbers in them, where setting nonumbers
will reduce the index size. This can only be set for a whole index, not
for a subtree.</para></listitem></varlistentry>
for a subtree.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DEHYPHENATE">
<term><varname>dehyphenate</varname></term>
<listitem><para>Determines if we index 'coworker'
also when the input is 'co-worker'. This is new
in version 1.22, and on by default. Setting the variable to off allows
restoring the previous behaviour.</para></listitem></varlistentry>
restoring the previous behaviour.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.BACKSLASHASLETTER">
<term><varname>backslashasletter</varname></term>
<listitem><para>Process backslash as normal letter. This may make sense for people wanting to index TeX commands as
such but is not of much general use.</para></listitem></varlistentry>
such but is not of much general use.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.UNDERSCOREASLETTER">
<term><varname>underscoreasletter</varname></term>
<listitem><para>Process underscore as normal letter. This makes sense in so many cases that one wonders if it should
not be the default.</para></listitem></varlistentry>
not be the default.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MAXTERMLENGTH">
<term><varname>maxtermlength</varname></term>
<listitem><para>Maximum term length. Words longer than this will be discarded.
The default is 40 and used to be hard-coded, but it can now be
adjusted. You need an index reset if you change the value.</para></listitem></varlistentry>
adjusted. You need an index reset if you change the value.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.NOCJK">
<term><varname>nocjk</varname></term>
<listitem><para>Decides if specific East Asian
@@ -269,20 +311,23 @@ adjusted. You need an index reset if you change the value.</para></listitem></va
off. This will save a small amount of CPU if you have no CJK
documents. If your document base does include such text but you are not
interested in searching it, setting nocjk may be a
significant time and space saver.</para></listitem></varlistentry>
significant time and space saver.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.CJKNGRAMLEN">
<term><varname>cjkngramlen</varname></term>
<listitem><para>This lets you adjust the size of
n-grams used for indexing CJK text. The default value of 2 is
probably appropriate in most cases. A value of 3 would allow more precision
and efficiency on longer words, but the index will be approximately twice
as large.</para></listitem></varlistentry>
as large.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.INDEXSTEMMINGLANGUAGES">
<term><varname>indexstemminglanguages</varname></term>
<listitem><para>Languages for which to create stemming expansion
data. Stemmer names can be found by executing 'recollindex
-l', or this can also be set from a list in the GUI. The values are full
language names, e.g. english, french...</para></listitem></varlistentry>
language names, e.g. english, french...
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DEFAULTCHARSET">
<term><varname>defaultcharset</varname></term>
<listitem><para>Default character
@@ -293,37 +338,39 @@ set, the default character set is the one defined by the NLS environment
($LC_ALL, $LC_CTYPE, $LANG), or ultimately iso-8859-1 (cp-1252 in fact).
If for some reason you want a general default which does not match your
LANG and is not 8859-1, use this variable. This can be redefined for any
sub-directory.</para></listitem></varlistentry>
sub-directory.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.UNAC_EXCEPT_TRANS">
<term><varname>unac_except_trans</varname></term>
<listitem><para>A list of characters,
encoded in UTF-8, which should be handled specially
when converting text to unaccented lowercase. For
example, in Swedish, the letter a with diaeresis has full alphabet
citizenship and should not be turned into an a.
Each element in the space-separated list has the special character as
first element and the translation following. The handling of both the
lowercase and upper-case versions of a character should be specified, as
membership in the list will turn off both standard accent and case
processing. The value is global and affects both indexing and querying.
<listitem><para>A list of characters, encoded in UTF-8, which should be handled specially when converting
text to unaccented lowercase. For example, in Swedish, the letter a with diaeresis has full alphabet citizenship and
should not be turned into an a. Each element in the space-separated list has the special
character as first element and the translation following. The handling of both the lowercase and
upper-case versions of a character should be specified, as membership in the list will turn off
both standard accent and case processing. The value is global and affects both indexing and
querying. We also convert a few confusing Unicode characters (quotes, hyphen) to their ASCII
equivalent to avoid "invisible" search failures.
</para><para>
Examples:
Swedish:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl åå Åå
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl åå Åå ’' ❜' ʼ' ‐-
. German:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl ’' ❜' ʼ' ‐-
. French: you probably want to decompose oe and ae and nobody would type
a German ß
unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl
unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl ’' ❜' ʼ' ‐-
. The default for all until someone protests follows. These decompositions
are not performed by unac, but it is unlikely that someone would type the
composed forms in a search.
unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl</para></listitem></varlistentry>
unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl ’' ❜' ʼ' ‐-
</para></listitem></varlistentry>
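The element format of unac_except_trans (special character first, translation following) can be illustrated with a short Python sketch. The function names are hypothetical, the characters are written as Unicode escapes so that the example does not depend on file encoding, and only the exception table is modelled here, not the general accent and case stripping.

```python
def parse_unac_except_trans(value: str) -> dict:
    """Map each special character to its translation.

    Each space-separated element starts with the special character;
    the rest of the element is what it is translated to.
    """
    return {element[0]: element[1:] for element in value.split()}


def apply_exceptions(text: str, table: dict) -> str:
    """Replace every character found in the table, leave others alone."""
    return "".join(table.get(ch, ch) for ch in text)


# "\u00df" is the German sharp s, "\u0153" the oe ligature,
# "\u00c4" / "\u00e4" upper and lower case a with diaeresis.
table = parse_unac_except_trans("\u00dfss \u0153oe \u00c4\u00e4 \u00e4\u00e4")
print(apply_exceptions("Stra\u00dfe", table))   # Strasse
print(table["\u00c4"] == "\u00e4")              # True
```

Listing both the lowercase and upper-case forms, as in the last two elements, is what lets the table itself define the case folding for such characters.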
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MAILDEFCHARSET">
<term><varname>maildefcharset</varname></term>
<listitem><para>Overrides the default
character set for email messages which don't specify
one. This is mainly useful for readpst (libpst) dumps,
which are utf-8 but do not say so.</para></listitem></varlistentry>
which are utf-8 but do not say so.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.LOCALFIELDS">
<term><varname>localfields</varname></term>
<listitem><para>Set fields on all files
@@ -331,7 +378,8 @@ which are utf-8 but do not say so.</para></listitem></varlistentry>
name = value ; attr1 = val1 ; [...]
value is empty so this needs an initial semi-colon. This is useful, e.g.,
for setting the rclaptg field for application selection inside
mimeview.</para></listitem></varlistentry>
mimeview.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.TESTMODIFUSEMTIME">
<term><varname>testmodifusemtime</varname></term>
<listitem><para>Use mtime instead of
@@ -353,12 +401,12 @@ undetected). Perform a full index reset after changing this.
<term><varname>noxattrfields</varname></term>
<listitem><para>Disable extended attributes
conversion to metadata fields. This probably needs to be
set if testmodifusemtime is set.</para></listitem></varlistentry>
set if testmodifusemtime is set.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.METADATACMDS">
<term><varname>metadatacmds</varname></term>
<listitem><para>Define commands to
gather external metadata, e.g. tmsu tags.
There can be several entries, separated by semi-colons, each defining
gather external metadata, e.g. tmsu tags. There can be several entries, separated by semi-colons, each defining
which field name the data goes into and the command to use. Don't forget the
initial semi-colon. All the field names must be different. You can use
aliases in the "field" file if necessary.
@@ -383,13 +431,15 @@ cachedir is ~/.cache/recoll, the default dbdir would be
mboxcachedir, aspellDicDir, which can still be individually specified to
override cachedir. Note that if you have multiple configurations, each
must have a different cachedir, there is no automatic computation of a
subpath under cachedir.</para></listitem></varlistentry>
subpath under cachedir.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MAXFSOCCUPPC">
<term><varname>maxfsoccuppc</varname></term>
<listitem><para>Maximum file system occupation
over which we stop indexing. The value is a percentage,
corresponding to what the "Capacity" df output column shows. The default
value is 0, meaning no checking.</para></listitem></varlistentry>
value is 0, meaning no checking.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DBDIR">
<term><varname>dbdir</varname></term>
<listitem><para>Xapian database directory
@@ -397,36 +447,43 @@ location. This will be created on first indexing. If the
value is not an absolute path, it will be interpreted as relative to
cachedir if set, or the configuration directory (-c argument or
$RECOLL_CONFDIR). If nothing is specified, the default is then
~/.recoll/xapiandb/</para></listitem></varlistentry>
~/.recoll/xapiandb/
</para></listitem></varlistentry>
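The dbdir resolution rule (absolute paths used as-is, relative ones placed under cachedir if set, otherwise under the configuration directory) can be sketched in Python as follows. The function name and the sample paths are hypothetical, and the default directory name "xapiandb" is inferred from the documented default of ~/.recoll/xapiandb/.

```python
import posixpath


def resolve_dbdir(dbdir: str, confdir: str, cachedir: str = "") -> str:
    """Return the index directory implied by dbdir, confdir and cachedir."""
    name = dbdir or "xapiandb"   # assumed default name when dbdir is unset
    if posixpath.isabs(name):
        return name              # absolute values are used unchanged
    base = cachedir or confdir   # cachedir takes precedence when set
    return posixpath.join(base, name)


print(resolve_dbdir("", "/home/me/.recoll"))
# /home/me/.recoll/xapiandb
print(resolve_dbdir("idx", "/home/me/.recoll", "/home/me/.cache/recoll"))
# /home/me/.cache/recoll/idx
print(resolve_dbdir("/var/idx", "/home/me/.recoll"))
# /var/idx
```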
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXSTATUSFILE">
<term><varname>idxstatusfile</varname></term>
<listitem><para>Name of the scratch file where the indexer process updates its
status. Default: idxstatus.txt inside the configuration
directory.</para></listitem></varlistentry>
directory.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MBOXCACHEDIR">
<term><varname>mboxcachedir</varname></term>
<listitem><para>Directory location for storing mbox message offsets cache
files. This is normally 'mboxcache' under cachedir if set,
or else under the configuration directory, but it may be useful to share
a directory between different configurations.</para></listitem></varlistentry>
a directory between different configurations.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MBOXCACHEMINMBS">
<term><varname>mboxcacheminmbs</varname></term>
<listitem><para>Minimum mbox file size over which we cache the offsets. There is really no sense in caching offsets for small files. The
default is 5 MB.</para></listitem></varlistentry>
default is 5 MB.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MBOXMAXMSGMBS">
<term><varname>mboxmaxmsgmbs</varname></term>
<listitem><para>Maximum mbox member message size in megabytes. Size over which we assume that the mbox format is bad or we
misinterpreted it, at which point we just stop processing the file.</para></listitem></varlistentry>
misinterpreted it, at which point we just stop processing the file.
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.WEBCACHEDIR">
<term><varname>webcachedir</varname></term>
<listitem><para>Directory where we store the archived web pages. This is only used by the web history indexing code.
Default: cachedir/webcache if cachedir is set, else
$RECOLL_CONFDIR/webcache</para></listitem></varlistentry>
$RECOLL_CONFDIR/webcache
</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.WEBCACHEMAXMBS">
<term><varname>webcachemaxmbs</varname></term>
<listitem><para>Maximum size in MB of the Web archive. This is only used by the web history indexing code.
Default: 40 MB.
|
||||
Reducing the size will not physically truncate the file.</para></listitem></varlistentry>
|
||||
Reducing the size will not physically truncate the file.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.WEBQUEUEDIR">
|
||||
<term><varname>webqueuedir</varname></term>
|
||||
<listitem><para>The path to the Web indexing queue. This used to be
|
||||
@ -434,36 +491,42 @@ hard-coded in the old plugin as ~/.recollweb/ToIndex so there would be no
|
||||
need or possibility to change it, but the WebExtensions plugin now downloads
|
||||
the files to the user Downloads directory, and a script moves them to
|
||||
webqueuedir. The script reads this value from the config so it has become
|
||||
possible to change it.</para></listitem></varlistentry>
|
||||
possible to change it.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.WEBDOWNLOADSDIR">
|
||||
<term><varname>webdownloadsdir</varname></term>
|
||||
<listitem><para>The path to the browser downloads directory. This is
|
||||
where the new browser add-on extension has to create the files. They are
|
||||
then moved by a script to webqueuedir.</para></listitem></varlistentry>
|
||||
then moved by a script to webqueuedir.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.WEBCACHEKEEPINTERVAL">
|
||||
<term><varname>webcachekeepinterval</varname></term>
|
||||
<listitem><para>Page recycle interval. By default, only one instance of a URL is kept in the cache. This
|
||||
can be changed by setting this to a value determining at what frequency
|
||||
we keep multiple instances ('day', 'week', 'month',
|
||||
'year'). Note that increasing the interval will not erase existing
|
||||
entries.</para></listitem></varlistentry>
|
||||
entries.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR">
|
||||
<term><varname>aspellDicDir</varname></term>
|
||||
<listitem><para>Aspell dictionary storage directory location. The
|
||||
aspell dictionary (aspdict.(lang).rws) is normally stored in the
|
||||
directory specified by cachedir if set, or under the configuration
|
||||
directory.</para></listitem></varlistentry>
|
||||
directory.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.FILTERSDIR">
|
||||
<term><varname>filtersdir</varname></term>
|
||||
<listitem><para>Directory location for executable input handlers. If
|
||||
RECOLL_FILTERSDIR is set in the environment, we use it instead. Defaults
|
||||
to $prefix/share/recoll/filters. Can be redefined for
|
||||
subdirectories.</para></listitem></varlistentry>
|
||||
subdirectories.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ICONSDIR">
|
||||
<term><varname>iconsdir</varname></term>
|
||||
<listitem><para>Directory location for icons. The only reason to
|
||||
change this would be if you want to change the icons displayed in the
|
||||
result list. Defaults to $prefix/share/recoll/images</para></listitem></varlistentry>
|
||||
result list. Defaults to $prefix/share/recoll/images
|
||||
</para></listitem></varlistentry>
|
||||
</variablelist></sect3>
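Several entries in this section (mboxcachedir, webcachedir, aspellDicDir) share the same default rule: a subdirectory of cachedir if that is set, else of the configuration directory. The following is a minimal Python sketch of that rule for illustration only. The function name and the sample paths are hypothetical; this is not the Recoll implementation.

```python
import posixpath


def default_cache_subdir(confdir, name, cachedir=None):
    """Return the default location of a cache subdirectory.

    Hypothetical helper, not Recoll code. 'name' would be 'mboxcache'
    for mboxcachedir or 'webcache' for webcachedir. The subdirectory
    lives under cachedir when that is set, otherwise under the
    configuration directory.
    """
    base = cachedir if cachedir else confdir
    return posixpath.join(base, name)


# Sample paths are made up for the example.
print(default_cache_subdir("/home/me/.recoll", "mboxcache"))
print(default_cache_subdir("/home/me/.recoll", "webcache",
                           cachedir="/var/cache/recoll"))
```

Setting the variable explicitly bypasses this rule, which is what makes it possible to share one directory between several configurations.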
|
||||
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.PERFS">
|
||||
<title>Parameters affecting indexing performance and resource usage </title><variablelist>
|
||||
@ -481,13 +544,15 @@ value (from this file) is now 50 MB, and should be ok in many cases.
|
||||
You can set it as low as 10 to conserve memory, but if you are looking
|
||||
for maximum speed, you may want to experiment with values between 20 and
|
||||
200. In my experience, values beyond this are always counterproductive. If
|
||||
you find otherwise, please drop me a note.</para></listitem></varlistentry>
|
||||
you find otherwise, please drop me a note.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.FILTERMAXSECONDS">
|
||||
<term><varname>filtermaxseconds</varname></term>
|
||||
<listitem><para>Maximum external filter execution time in
|
||||
seconds. Default 1200 (20 minutes). Set to 0 for no limit. This
|
||||
is mainly to avoid infinite loops in postscript files
|
||||
(loop.ps)</para></listitem></varlistentry>
|
||||
(loop.ps)
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.FILTERMAXMBYTES">
|
||||
<term><varname>filtermaxmbytes</varname></term>
|
||||
<listitem><para>Maximum virtual memory space for filter processes
|
||||
@ -495,7 +560,8 @@ is mainly to avoid infinite loops in postscript files
|
||||
Linux way to limit the data space only), so we need to be a bit generous
|
||||
here. Anything over 2000 will be ignored on 32-bit machines. The
|
||||
previous default value of 2000 would prevent java pdftk from working when
|
||||
executed from Python rclpdf.py.</para></listitem></varlistentry>
|
||||
executed from Python rclpdf.py.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.THRQSIZES">
|
||||
<term><varname>thrQSizes</varname></term>
|
||||
<listitem><para>Stage input queues configuration. There are three
|
||||
@ -507,7 +573,8 @@ next stage. In practise, deep queues have not been shown to increase
|
||||
performance. Default: a value of 0 for the first queue tells Recoll to
|
||||
perform autoconfiguration based on the detected number of CPUs (no need
|
||||
for the two other values in this case). Use thrQSizes = -1 -1 -1 to
|
||||
disable multithreading entirely.</para></listitem></varlistentry>
|
||||
disable multithreading entirely.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.THRTCOUNTS">
|
||||
<term><varname>thrTCounts</varname></term>
|
||||
<listitem><para>Number of threads used for each indexing stage. The
|
||||
@ -517,7 +584,8 @@ in thrQSizes: if the first queue depth is 0, all counts are ignored
|
||||
(autoconfigured); if a value of -1 is used for a queue depth, the
|
||||
corresponding thread count is ignored. It makes no sense to use a value
|
||||
other than 1 for the last stage because updating the Xapian index is
|
||||
necessarily single-threaded (and protected by a mutex).</para></listitem></varlistentry>
|
||||
necessarily single-threaded (and protected by a mutex).
|
||||
</para></listitem></varlistentry>
|
||||
</variablelist></sect3>
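The interaction between thrQSizes and thrTCounts described above can be summarized with a small Python sketch. It only restates the rules given in the text (a first queue depth of 0 means autoconfiguration, a depth of -1 means the corresponding thread count is ignored). The function name and the sample values "2 2 2" and "4 2 1" are illustrative assumptions, not Recoll code or recommended settings.

```python
def effective_thread_counts(thr_q_sizes, thr_t_counts):
    """Interpret thrQSizes/thrTCounts values as described in the text.

    Hypothetical helper. Returns the string 'auto' when the first queue
    depth is 0, else a list with one entry per stage: the thread count,
    or None when the queue depth is -1 (count ignored for that stage).
    """
    depths = [int(v) for v in thr_q_sizes.split()]
    if depths and depths[0] == 0:
        # All counts are ignored, Recoll autoconfigures from the CPU count.
        return "auto"
    counts = [int(v) for v in thr_t_counts.split()]
    return [None if depth == -1 else count
            for depth, count in zip(depths, counts)]


print(effective_thread_counts("0", "4 2 1"))
print(effective_thread_counts("-1 -1 -1", "4 2 1"))
print(effective_thread_counts("2 2 2", "4 2 1"))
```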
|
||||
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.MISC">
|
||||
<title>Miscellaneous parameters </title><variablelist>
|
||||
@ -525,7 +593,8 @@ necessarily single-threaded (and protected by a mutex).</para></listitem></varli
|
||||
<term><varname>loglevel</varname></term>
|
||||
<listitem><para>Log file verbosity 1-6. A value of 2 will print
|
||||
only errors and warnings. 3 will print information like document updates,
|
||||
4 is quite verbose and 6 very verbose.</para></listitem></varlistentry>
|
||||
4 is quite verbose and 6 very verbose.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.LOGFILENAME">
|
||||
<term><varname>logfilename</varname></term>
|
||||
<listitem><para>Log file destination. Use 'stderr' (default) to write to the
|
||||
@ -541,17 +610,20 @@ console. </para></listitem></varlistentry>
|
||||
<listitem><para>Destination file for external helpers standard error output. The external program error output is left alone by default,
|
||||
e.g. going to the terminal when the recoll[index] program is executed
|
||||
from the command line. Use /dev/null or a file inside a non-existent
|
||||
directory to completely suppress the output.</para></listitem></varlistentry>
|
||||
directory to completely suppress the output.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL">
|
||||
<term><varname>daemloglevel</varname></term>
|
||||
<listitem><para>Override loglevel for the indexer in real time
|
||||
mode. The default is to use the idx... values if set, else
|
||||
the log... values.</para></listitem></varlistentry>
|
||||
the log... values.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGFILENAME">
|
||||
<term><varname>daemlogfilename</varname></term>
|
||||
<listitem><para>Override logfilename for the indexer in real time
|
||||
mode. The default is to use the idx... values if set, else
|
||||
the log... values.</para></listitem></varlistentry>
|
||||
the log... values.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.PYLOGLEVEL">
|
||||
<term><varname>pyloglevel</varname></term>
|
||||
<listitem><para>Override loglevel for the python module. </para></listitem></varlistentry>
|
||||
@ -564,7 +636,8 @@ the log... values.</para></listitem></varlistentry>
|
||||
configuration directory inside the directory tree makes it possible to
|
||||
provide automatic query time path translations once the data set has
|
||||
moved (for example, because it has been mounted on another
|
||||
location).</para></listitem></varlistentry>
|
||||
location).
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.CURIDXCONFDIR">
|
||||
<term><varname>curidxconfdir</varname></term>
|
||||
<listitem><para>Current location of the configuration directory. Complement orgidxconfdir for movable datasets. This should be used
|
||||
@ -576,7 +649,8 @@ example if a dataset originally indexed as '/home/me/mydata/config' has
|
||||
been mounted to '/media/me/mydata', and the GUI is running from a copied
|
||||
configuration, orgidxconfdir would be '/home/me/mydata/config', and
|
||||
curidxconfdir (as set in the copied configuration) would be
|
||||
'/media/me/mydata/config'.</para></listitem></varlistentry>
|
||||
'/media/me/mydata/config'.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXRUNDIR">
|
||||
<term><varname>idxrundir</varname></term>
|
||||
<listitem><para>Indexing process current directory. The input
|
||||
@ -585,19 +659,22 @@ makes sense to have recollindex chdir to some temporary directory. If the
|
||||
value is empty, the current directory is not changed. If the
|
||||
value is (literal) tmp, we use the temporary directory as set by the
|
||||
environment (RECOLL_TMPDIR else TMPDIR else /tmp). If the value is an
|
||||
absolute path to a directory, we go there.</para></listitem></varlistentry>
|
||||
absolute path to a directory, we go there.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.CHECKNEEDRETRYINDEXSCRIPT">
|
||||
<term><varname>checkneedretryindexscript</varname></term>
|
||||
<listitem><para>Script used to heuristically check if we need to retry indexing
|
||||
files which previously failed. The default script checks
|
||||
the modified dates on /usr/bin and /usr/local/bin. A relative path will
|
||||
be looked up in the filters dirs, then in the path. Use an absolute path
|
||||
to do otherwise.</para></listitem></varlistentry>
|
||||
to do otherwise.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.RECOLLHELPERPATH">
|
||||
<term><varname>recollhelperpath</varname></term>
|
||||
<listitem><para>Additional places to search for helper executables. This is used, e.g., on Windows by the Python code, and on Mac OS by the bundled recoll.app
|
||||
(because I could find no reliable way to tell launchd to set the PATH). The example below is for
|
||||
Windows. Use ':' as entry separator for Mac and Ux-like systems, ';' is for Windows only.</para></listitem></varlistentry>
|
||||
Windows. Use ':' as entry separator for Mac and Ux-like systems, ';' is for Windows only.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXABSMLEN">
|
||||
<term><varname>idxabsmlen</varname></term>
|
||||
<listitem><para>Length of abstracts we store while indexing. Recoll stores an abstract for each indexed file.
|
||||
@ -609,62 +686,72 @@ defines the size of the stored abstract. The default value is 250
|
||||
bytes. The search interface gives you the choice to display this stored
|
||||
text or a synthetic abstract built by extracting text around the search
|
||||
terms. If you always prefer the synthetic abstract, you can reduce this
|
||||
value and save a little space.</para></listitem></varlistentry>
|
||||
value and save a little space.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXMETASTOREDLEN">
|
||||
<term><varname>idxmetastoredlen</varname></term>
|
||||
<listitem><para>Truncation length of stored metadata fields. This
|
||||
does not affect indexing (the whole field is processed anyway), just the
|
||||
amount of data stored in the index for the purpose of displaying fields
|
||||
inside result lists or previews. The default value is 150 bytes which
|
||||
may be too low if you have custom fields.</para></listitem></varlistentry>
|
||||
may be too low if you have custom fields.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXTEXTTRUNCATELEN">
|
||||
<term><varname>idxtexttruncatelen</varname></term>
|
||||
<listitem><para>Truncation length for all document texts. Only index
|
||||
the beginning of documents. This is not recommended except if you are
|
||||
sure that the interesting keywords are at the top and have severe disk
|
||||
space issues.</para></listitem></varlistentry>
|
||||
space issues.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXSYNONYMS">
|
||||
<term><varname>idxsynonyms</varname></term>
|
||||
<listitem><para>Name of the index-time synonyms file. This is used for indexing multiword synonyms as single terms,
|
||||
which in turn is only useful if you want to perform proximity searches
|
||||
with such terms.</para></listitem></varlistentry>
|
||||
with such terms.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE">
|
||||
<term><varname>aspellLanguage</varname></term>
|
||||
<listitem><para>Language definitions to use when creating the aspell
|
||||
dictionary. The value must match a set of aspell language
|
||||
definition files. You can type "aspell dicts" to see a list. The default
|
||||
if this is not set is to use the NLS environment to guess the value. The
|
||||
values are the 2-letter language codes (e.g. 'en', 'fr'...)</para></listitem></varlistentry>
|
||||
values are the 2-letter language codes (e.g. 'en', 'fr'...)
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLADDCREATEPARAM">
|
||||
<term><varname>aspellAddCreateParam</varname></term>
|
||||
<listitem><para>Additional option and parameter to aspell dictionary creation
|
||||
command. Some aspell packages may need an additional option
|
||||
(e.g. on Debian Jessie: --local-data-dir=/usr/lib/aspell). See Debian bug
|
||||
772415.</para></listitem></varlistentry>
|
||||
772415.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLKEEPSTDERR">
|
||||
<term><varname>aspellKeepStderr</varname></term>
|
||||
<listitem><para>Set this to have a look at aspell dictionary creation
|
||||
errors. There are always many, so this is mostly for
|
||||
debugging.</para></listitem></varlistentry>
|
||||
debugging.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.NOASPELL">
|
||||
<term><varname>noaspell</varname></term>
|
||||
<listitem><para>Disable aspell use. The aspell dictionary generation
|
||||
takes time, and some combinations of aspell version, language, and local
|
||||
terms, result in aspell crashing, so it sometimes makes sense to just
|
||||
disable the thing.</para></listitem></varlistentry>
|
||||
disable the thing.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MONAUXINTERVAL">
|
||||
<term><varname>monauxinterval</varname></term>
|
||||
<listitem><para>Auxiliary database update interval. The real time
|
||||
indexer only updates the auxiliary databases (stemdb, aspell)
|
||||
periodically, because it would be too costly to do it for every document
|
||||
change. The default period is one hour.</para></listitem></varlistentry>
|
||||
change. The default period is one hour.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MONIXINTERVAL">
|
||||
<term><varname>monixinterval</varname></term>
|
||||
<listitem><para>Minimum interval (seconds) between processings of the indexing
|
||||
queue. The real time indexer does not process each event
|
||||
when it comes in, but lets the queue accumulate, to diminish overhead and
|
||||
to aggregate multiple events affecting the same file. Default 30
|
||||
S.</para></listitem></varlistentry>
|
||||
S.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MONDELAYPATTERNS">
|
||||
<term><varname>mondelaypatterns</varname></term>
|
||||
<listitem><para>Timing parameters for the real time indexing. Definitions for files which get a longer delay before reindexing
|
||||
@ -673,21 +760,25 @@ reindexed once in a while. A list of wildcardPattern:seconds pairs. The
|
||||
patterns are matched with fnmatch(pattern, path, 0). You can quote entries
|
||||
containing white space with double quotes (quote the whole entry, not the
|
||||
pattern). The default is empty.
|
||||
Example: mondelaypatterns = *.log:20 "*with spaces.*:30"</para></listitem></varlistentry>
|
||||
Example: mondelaypatterns = *.log:20 "*with spaces.*:30"
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXNICEPRIO">
|
||||
<term><varname>idxniceprio</varname></term>
|
||||
<listitem><para>"nice" process priority for the indexing processes. Default: 19
|
||||
(lowest). Appeared with 1.26.5. Prior versions were fixed at 19.</para></listitem></varlistentry>
|
||||
(lowest). Appeared with 1.26.5. Prior versions were fixed at 19.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MONIONICECLASS">
|
||||
<term><varname>monioniceclass</varname></term>
|
||||
<listitem><para>ionice class for the indexing process. Despite the misleading name, and on platforms where this is
|
||||
supported, this affects all indexing processes,
|
||||
not only the real time/monitoring ones. The default value is 3 (use
|
||||
lowest "Idle" priority).</para></listitem></varlistentry>
|
||||
lowest "Idle" priority).
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MONIONICECLASSDATA">
|
||||
<term><varname>monioniceclassdata</varname></term>
|
||||
<listitem><para>ionice class level parameter if the class supports it. The default is empty, as the default "Idle" class has no
|
||||
levels.</para></listitem></varlistentry>
|
||||
levels.
|
||||
</para></listitem></varlistentry>
|
||||
</variablelist></sect3>
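Two of the values in this section follow small rule sets that are easy to get wrong: idxrundir (empty, the literal tmp, or an absolute path) and mondelaypatterns (a list of wildcardPattern:seconds pairs, with double quotes around entries containing white space). The Python sketch below restates those rules under stated assumptions. The function names are hypothetical, the handling of a relative idxrundir value is a guess since the text does not cover it, and Python's fnmatch module only approximates the C fnmatch(pattern, path, 0) call.

```python
import fnmatch
import posixpath
import shlex


def resolve_idxrundir(value, environ):
    """Return the directory recollindex would chdir to, or None to stay put.

    Hypothetical helper. 'environ' is a mapping such as os.environ.
    """
    value = value.strip()
    if not value:
        return None  # empty value: current directory is not changed
    if value == "tmp":
        # RECOLL_TMPDIR, else TMPDIR, else /tmp
        return environ.get("RECOLL_TMPDIR") or environ.get("TMPDIR") or "/tmp"
    if posixpath.isabs(value):
        return value
    # Assumption: the text does not say what a relative path does.
    raise ValueError("idxrundir should be empty, 'tmp' or an absolute path")


def parse_mondelaypatterns(value):
    """Split a mondelaypatterns value into (pattern, seconds) pairs."""
    pairs = []
    for entry in shlex.split(value):  # honours double-quoted entries
        pattern, _, seconds = entry.rpartition(":")
        pairs.append((pattern, int(seconds)))
    return pairs


def delay_for(path, pairs, default=0):
    """Return the delay of the first pattern matching the whole path."""
    for pattern, seconds in pairs:
        if fnmatch.fnmatchcase(path, pattern):
            return seconds
    return default


pairs = parse_mondelaypatterns('*.log:20 "*with spaces.*:30"')
print(pairs)
print(delay_for("/var/log/app.log", pairs))
print(resolve_idxrundir("tmp", {"TMPDIR": "/scratch"}))
```

Note that the quotes enclose the whole entry, including the :seconds part, which is why the split on ':' happens after unquoting.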
|
||||
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.QUERY">
|
||||
<title>Query-time parameters (no impact on the index) </title><variablelist>
|
||||
@ -696,7 +787,8 @@ levels.</para></listitem></varlistentry>
|
||||
<listitem><para>Auto-trigger diacritics sensitivity (raw index only). If the index is not stripped, decide if we automatically trigger
|
||||
diacritics sensitivity if the search term has accented characters (not in
|
||||
unac_except_trans). Else you need to use the query language and the "D"
|
||||
modifier to specify diacritics sensitivity. Default is no.</para></listitem></varlistentry>
|
||||
modifier to specify diacritics sensitivity. Default is no.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.AUTOCASESENS">
|
||||
<term><varname>autocasesens</varname></term>
|
||||
<listitem><para>Auto-trigger case sensitivity (raw index only). If
|
||||
@ -704,40 +796,46 @@ the index is not stripped (see indexStripChars), decide if we
|
||||
automatically trigger character case sensitivity if the search term has
|
||||
upper-case characters in any but the first position. Else you need to use
|
||||
the query language and the "C" modifier to specify character-case
|
||||
sensitivity. Default is yes.</para></listitem></varlistentry>
|
||||
sensitivity. Default is yes.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MAXTERMEXPAND">
|
||||
<term><varname>maxTermExpand</varname></term>
|
||||
<listitem><para>Maximum query expansion count
|
||||
for a single term (e.g.: when using wildcards). This only
|
||||
affects queries, not indexing. We used to not limit this at all (except
|
||||
for filenames where the limit was too low at 1000), but it is
|
||||
unreasonable with a big index. Default 10000.</para></listitem></varlistentry>
|
||||
unreasonable with a big index. Default 10000.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MAXXAPIANCLAUSES">
|
||||
<term><varname>maxXapianClauses</varname></term>
|
||||
<listitem><para>Maximum number of clauses
|
||||
we add to a single Xapian query. This only affects queries,
|
||||
not indexing. In some cases, the result of term expansion can be
|
||||
multiplicative, and we want to avoid eating all the memory. Default
|
||||
50000.</para></listitem></varlistentry>
|
||||
50000.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SNIPPETMAXPOSWALK">
|
||||
<term><varname>snippetMaxPosWalk</varname></term>
|
||||
<listitem><para>Maximum number of positions we walk while populating a snippet for
|
||||
the result list. The default of 1,000,000 may be
|
||||
insufficient for very big documents; the consequence would be snippets
|
||||
with possibly meaning-altering missing words.</para></listitem></varlistentry>
|
||||
with possibly meaning-altering missing words.
|
||||
</para></listitem></varlistentry>
|
||||
</variablelist></sect3>
|
||||
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.PDF">
|
||||
<title>Parameters for the PDF input script </title><variablelist>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.PDFOCR">
|
||||
<term><varname>pdfocr</varname></term>
|
||||
<listitem><para>Attempt OCR of PDF files with no text content. This can be defined in subdirectories. The default is off because
|
||||
OCR is so very slow.</para></listitem></varlistentry>
|
||||
OCR is so very slow.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.PDFATTACH">
|
||||
<term><varname>pdfattach</varname></term>
|
||||
<listitem><para>Enable PDF attachment extraction by executing pdftk (if
|
||||
available). This is
|
||||
normally disabled, because it does slow down PDF indexing a bit even if
|
||||
not one attachment is ever found.</para></listitem></varlistentry>
|
||||
not one attachment is ever found.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.PDFEXTRAMETA">
|
||||
<term><varname>pdfextrameta</varname></term>
|
||||
<listitem><para>Extract text from selected XMP metadata tags. This
|
||||
@ -745,7 +843,8 @@ is a space-separated list of qualified XMP tag names. Each element can also
|
||||
include a translation to a Recoll field name, separated by a '|'
|
||||
character. If the second element is absent, the tag name is used as the
|
||||
Recoll field name. You will also need to add specifications to the
|
||||
"fields" file to direct processing of the extracted data.</para></listitem></varlistentry>
|
||||
"fields" file to direct processing of the extracted data.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.PDFEXTRAMETAFIX">
|
||||
<term><varname>pdfextrametafix</varname></term>
|
||||
<listitem><para>Define name of XMP field editing script. This
|
||||
@ -754,7 +853,8 @@ values. The script should define a 'MetaFixer' class with a metafix()
|
||||
method which will be called with the qualified tag name and value of each
|
||||
selected field, for editing or erasing. A new instance is created for
|
||||
each document, so that the object can keep state for, e.g. eliminating
|
||||
duplicate values.</para></listitem></varlistentry>
|
||||
duplicate values.
|
||||
</para></listitem></varlistentry>
|
||||
</variablelist></sect3>
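As an illustration of the pdfextrametafix description above, here is a minimal sketch of what such a script could look like. Only the class name MetaFixer, the method name metafix(), and the per-document instance lifetime come from the text. The argument names, the convention that the method returns the edited value, the use of an empty string to erase a field, and the 'bibtex:pages' tag are assumptions made for the example.

```python
import re


class MetaFixer(object):
    """Edit or erase XMP field values for one document (sketch only)."""

    def __init__(self):
        # A new instance is created for each document, so this set
        # only remembers values already seen in the current document.
        self._seen = set()

    def metafix(self, name, value):
        # 'name' is the qualified XMP tag name, 'value' its text.
        if name == "bibtex:pages":  # hypothetical tag, for illustration
            value = re.sub(r"--", "-", value)
        key = (name, value)
        if key in self._seen:
            # Assumption: returning an empty string erases a duplicate.
            return ""
        self._seen.add(key)
        return value


fixer = MetaFixer()
print(fixer.metafix("bibtex:pages", "10--20"))
print(repr(fixer.metafix("bibtex:pages", "10--20")))
```

Keeping the state on the instance rather than in a module-level variable is what lets duplicate elimination reset naturally between documents.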
|
||||
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.OCR">
|
||||
<title>Parameters for OCR processing </title><variablelist>
|
||||
@ -766,17 +866,20 @@ the input file. Modules for tesseract (tesseract) and ABBYY FineReader
|
||||
(abbyy) are present in the standard distribution. For compatibility with
|
||||
the previous version, if this is not defined at all, the default value is
|
||||
"tesseract". Use an explicit empty value if needed. A value of "abbyy
|
||||
tesseract" will try everything.</para></listitem></varlistentry>
|
||||
tesseract" will try everything.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.OCRCACHEDIR">
|
||||
<term><varname>ocrcachedir</varname></term>
|
||||
<listitem><para>Location for caching OCR data. The default if this is empty or undefined is to store the cached
|
||||
OCR data under $RECOLL_CONFDIR/ocrcache.</para></listitem></varlistentry>
|
||||
OCR data under $RECOLL_CONFDIR/ocrcache.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.TESSERACTLANG">
|
||||
<term><varname>tesseractlang</varname></term>
|
||||
<listitem><para>Language to assume for tesseract OCR. Important for improving the OCR accuracy. This can also be set
|
||||
through the contents of a file in
|
||||
the currently processed directory. See the rclocrtesseract.py
|
||||
script. Example values: eng, fra... See the tesseract documentation.</para></listitem></varlistentry>
|
||||
script. Example values: eng, fra... See the tesseract documentation.
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.TESSERACTCMD">
|
||||
<term><varname>tesseractcmd</varname></term>
|
||||
<listitem><para>Path for the tesseract command. Do not quote. This is mostly useful on Windows, or for specifying a non-default
|
||||
@ -800,6 +903,7 @@ script. Typical values: English, French... See the ABBYY documentation.
|
||||
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.MHMBOXQUIRKS">
|
||||
<term><varname>mhmboxquirks</varname></term>
|
||||
<listitem><para>Enable thunderbird/mozilla-seamonkey mbox format quirks. Set this for the directory where the email mbox files are
|
||||
stored.</para></listitem></varlistentry>
|
||||
stored.
|
||||
</para></listitem></varlistentry>
|
||||
</variablelist></sect3>
|
||||
</sect2>
|
||||
|
||||
@ -8929,24 +8929,26 @@ hasextract = False
|
||||
White space separated list of wildcard patterns
|
||||
(simple ones, not paths, must contain no '/'
|
||||
characters), which will be tested against file
|
||||
and directory names. Have a look at the default
|
||||
configuration for the initial value, some entries
|
||||
may not suit your situation. The easiest way to
|
||||
see it is through the GUI Index configuration
|
||||
"local parameters" panel. The list in the default
|
||||
configuration does not exclude hidden directories
|
||||
(names beginning with a dot), which means that it
|
||||
may index quite a few things that you do not
|
||||
want. On the other hand, email user agents like
|
||||
Thunderbird usually store messages in hidden
|
||||
directories, and you probably want this indexed.
|
||||
One possible solution is to have ".*" in
|
||||
"skippedNames", and add things like
|
||||
"~/.thunderbird" "~/.evolution" to "topdirs". Not
|
||||
even the file names are indexed for patterns in
|
||||
this list, see the "noContentSuffixes" variable
|
||||
for an alternative approach which indexes the
|
||||
file names. Can be redefined for any subtree.</p>
|
||||
and directory names.</p>
|
||||
<p>Have a look at the default configuration for
|
||||
the initial value, some entries may not suit your
|
||||
situation. The easiest way to see it is through
|
||||
the GUI Index configuration "local parameters"
|
||||
panel.</p>
|
||||
<p>The list in the default configuration does not
|
||||
exclude hidden directories (names beginning with
|
||||
a dot), which means that it may index quite a few
|
||||
things that you do not want. On the other hand,
|
||||
email user agents like Thunderbird usually store
|
||||
messages in hidden directories, and you probably
|
||||
want this indexed. One possible solution is to
|
||||
have ".*" in "skippedNames", and add things like
|
||||
"~/.thunderbird" "~/.evolution" to "topdirs".</p>
|
||||
<p>Not even the file names are indexed for
|
||||
patterns in this list, see the
|
||||
"noContentSuffixes" variable for an alternative
|
||||
approach which indexes the file names. Can be
|
||||
redefined for any subtree.</p>
|
||||
</dd>
|
||||
<dt><a name=
|
||||
"RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES-" id=
|
||||
@ -9013,23 +9015,33 @@ hasextract = False
|
||||
<dd>
|
||||
<p>Absolute paths we should not go into.
|
||||
Space-separated list of wildcard expressions for
|
||||
absolute filesystem paths. Must be defined at the
|
||||
absolute filesystem paths (for files or
|
||||
directories). The variable must be defined at the
|
||||
top level of the configuration file, not in a
|
||||
subsection. Can contain files and directories.
|
||||
The database and configuration directories will
|
||||
automatically be added. The expressions are
|
||||
matched using 'fnmatch(3)' with the FNM_PATHNAME
|
||||
flag set by default. This means that '/'
|
||||
characters must be matched explicitly. You can
|
||||
set 'skippedPathsFnmPathname' to 0 to disable the
|
||||
use of FNM_PATHNAME (meaning that '/*/dir3' will
|
||||
match '/dir1/dir2/dir3'). The default value
|
||||
contains the usual mount point for removable
|
||||
media to remind you that it is a bad idea to have
|
||||
Recoll work on these (esp. with the monitor:
|
||||
media gets indexed on mount, all data gets erased
|
||||
on unmount). Explicitly adding '/media/xxx' to
|
||||
the 'topdirs' variable will override this.</p>
|
||||
subsection.</p>
|
||||
<p>Any value in the list must be textually
|
||||
consistent with the values in topdirs, no
|
||||
attempts are made to resolve symbolic links. In
|
||||
practice, if, as is frequently the case, /home is
|
||||
a link to /usr/home, your default topdirs will
|
||||
have a single entry '~' which will be translated
|
||||
to '/home/yourlogin'. In this case, any
|
||||
skippedPaths entry should start with
|
||||
'/home/yourlogin' *not* with
|
||||
'/usr/home/yourlogin'.</p>
|
||||
<p>The index and configuration directories will
|
||||
automatically be added to the list.</p>
|
||||
<p>The expressions are matched using 'fnmatch(3)'
|
||||
with the FNM_PATHNAME flag set by default. This
|
||||
means that '/' characters must be matched
|
||||
explicitly. You can set 'skippedPathsFnmPathname'
|
||||
to 0 to disable the use of FNM_PATHNAME (meaning
|
||||
that '/*/dir3' will match '/dir1/dir2/dir3').</p>
|
||||
<p>The default value contains the usual mount
|
||||
point for removable media to remind you that it
|
||||
is in most cases a bad idea to have Recoll work
|
||||
on these. Explicitly adding '/media/xxx' to the
|
||||
'topdirs' variable will override this.</p>
|
||||
</dd>
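The effect of FNM_PATHNAME on skippedPaths matching can be illustrated with a short Python sketch. Python's fnmatch module has no FNM_PATHNAME flag, so the sketch emulates it by matching one path component at a time. This is an approximation for illustration (it ignores corner cases such as bracket expressions containing '/'), and the function name is hypothetical.

```python
import fnmatch


def skipped_path_matches(pattern, path, fnm_pathname=True):
    """Approximate fnmatch(3) with or without FNM_PATHNAME (sketch only)."""
    if not fnm_pathname:
        # Without FNM_PATHNAME, '*' is free to match '/' characters.
        return fnmatch.fnmatchcase(path, pattern)
    # With FNM_PATHNAME, '/' must be matched explicitly, so the pattern
    # and the path need the same number of components.
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(fnmatch.fnmatchcase(part, pat)
               for part, pat in zip(path_parts, pattern_parts))


# Default behaviour: '*' does not cross directory boundaries.
print(skipped_path_matches("/*/dir3", "/dir1/dir2/dir3"))
# With skippedPathsFnmPathname set to 0.
print(skipped_path_matches("/*/dir3", "/dir1/dir2/dir3", fnm_pathname=False))
```

Because no symbolic links are resolved, the comparison is purely textual: a pattern has to be written with the same prefix as the corresponding topdirs entry.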
|
||||
<dt><a name=
|
||||
"RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDPATHSFNMPATHNAME"
|
||||
@ -9271,36 +9283,37 @@ hasextract = False
<p>Decide if we store the documents' text content
in the index. Storing the text allows extracting
snippets from it at query time, instead of
building them from index position data. Newer
Xapian index formats have rendered our use of
positions list unacceptably slow in some cases.
The last Xapian index format with good
building them from index position data.</p>
<p>Newer Xapian index formats have rendered our
use of positions list unacceptably slow in some
cases. The last Xapian index format with good
performance for the old method is Chert, which is
default for 1.2, still supported but not default
in 1.4 and will be dropped in 1.6. The stored
document text is translated from its original
format to UTF-8 plain text, but not stripped of
upper-case, diacritics, or punctuation signs.
Storing it increases the index size by 10-20%
typically, but also allows for nicer snippets, so
it may be worth enabling it even if not strictly
needed for performance if you can afford the
space. The variable only has an effect when
creating an index, meaning that the xapiandb
directory must not exist yet. Its exact effect
depends on the Xapian version. For Xapian 1.4, if
the variable is set to 0, the Chert format will
be used, and the text will not be stored. If the
variable is 1, Glass will be used, and the text
stored. For Xapian 1.2, and for versions after
1.5 and newer, the index format is always the
default, but the variable controls if the text is
stored or not, and the abstract generation
method. With Xapian 1.5 and later, and the
variable set to 0, abstract generation may be
very slow, but this setting may still be useful
to save space if you do not use abstract
generation at all.</p>
in 1.4 and will be dropped in 1.6.</p>
<p>The stored document text is translated from
its original format to UTF-8 plain text, but not
stripped of upper-case, diacritics, or
punctuation signs. Storing it increases the index
size by 10-20% typically, but also allows for
nicer snippets, so it may be worth enabling it
even if not strictly needed for performance if
you can afford the space.</p>
<p>The variable only has an effect when creating
an index, meaning that the xapiandb directory
must not exist yet. Its exact effect depends on
the Xapian version.</p>
<p>For Xapian 1.4, if the variable is set to 0,
the Chert format will be used, and the text will
not be stored. If the variable is 1, Glass will
be used, and the text stored.</p>
<p>For Xapian 1.2, and for versions after 1.5 and
newer, the index format is always the default,
but the variable controls if the text is stored
or not, and the abstract generation method. With
Xapian 1.5 and later, and the variable set to 0,
abstract generation may be very slow, but this
setting may still be useful to save space if you
do not use abstract generation at all.</p>
</dd>
<dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.NONUMBERS" id=
@ -9425,19 +9438,23 @@ hasextract = False
should be specified, as appartenance to the list
will turn-off both standard accent and case
processing. The value is global and affects both
indexing and querying. Examples: Swedish:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe
æae Æae ffff fifi flfl åå Åå . German:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe
æae Æae ffff fifi flfl . French: you probably want
to decompose oe and ae and nobody would type a
German ß unac_except_trans = ßss œoe Œoe æae Æae
ffff fifi flfl . The default for all until someone
protests follows. These decompositions are not
performed by unac, but it is unlikely that
someone would type the composed forms in a
indexing and querying. We also convert a few
confusing Unicode characters (quotes, hyphen) to
their ASCII equivalent to avoid "invisible"
search failures.</p>
<p>Examples: Swedish: unac_except_trans = ää Ää
öö Öö üü Üü ßss œoe Œoe æae Æae ffff fifi flfl åå Åå
’' ❜' ʼ' ‐- . German: unac_except_trans = ää Ää
öö Öö üü Üü ßss œoe Œoe æae Æae ffff fifi flfl ’' ❜'
ʼ' ‐- . French: you probably want to decompose oe
and ae and nobody would type a German ß
unac_except_trans = ßss œoe Œoe æae Æae ffff fifi
flfl ’' ❜' ʼ' ‐- . The default for all until
someone protests follows. These decompositions
are not performed by unac, but it is unlikely
that someone would type the composed forms in a
search. unac_except_trans = ßss œoe Œoe æae Æae
ffff fifi flfl</p>
ffff fifi flfl ’' ❜' ʼ' ‐-</p>
</dd>
<dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.MAILDEFCHARSET" id=
@ -33,16 +33,21 @@ topdirs = ~
# <brief>Files and directories which should be ignored.</brief>
#
# <descr> White space separated list of wildcard patterns (simple ones, not paths, must contain no
# '/' characters), which will be tested against file and directory names. Have a look at the default
# configuration for the initial value, some entries may not suit your situation. The easiest way to
# see it is through the GUI Index configuration "local parameters" panel. The list in the default
# configuration does not exclude hidden directories (names beginning with a dot), which means that
# it may index quite a few things that you do not want. On the other hand, email user agents like
# Thunderbird usually store messages in hidden directories, and you probably want this indexed. One
# possible solution is to have ".*" in "skippedNames", and add things like "~/.thunderbird"
# "~/.evolution" to "topdirs". Not even the file names are indexed for patterns in this list, see
# the "noContentSuffixes" variable for an alternative approach which indexes the file names. Can be
# redefined for any subtree.</descr>
# '/' characters), which will be tested against file and directory names.
#
# Have a look at the default configuration for the initial value, some entries may not suit your
# situation. The easiest way to see it is through the GUI Index configuration "local parameters"
# panel.
#
# The list in the default configuration does not exclude hidden directories (names beginning with a
# dot), which means that it may index quite a few things that you do not want. On the other hand,
# email user agents like Thunderbird usually store messages in hidden directories, and you probably
# want this indexed. One possible solution is to have ".*" in "skippedNames", and add things like
# "~/.thunderbird" "~/.evolution" to "topdirs".
#
# Not even the file names are indexed for patterns in this list, see the "noContentSuffixes"
# variable for an alternative approach which indexes the file names. Can be redefined for any
# subtree.</descr>
#
#</var>
skippedNames = #* CVS Cache cache* .cache caughtspam tmp \
@ -104,19 +109,26 @@ noContentSuffixes+ =
# <var name="skippedPaths" type="string">
#
# <brief>Absolute paths we should not go into.</brief>
# <descr>Space-separated list of wildcard expressions for absolute
# filesystem paths. Must be defined at the top level of the configuration
# file, not in a subsection. Can contain files and directories. The database and
# configuration directories will automatically be added. The expressions
# are matched using 'fnmatch(3)' with the FNM_PATHNAME flag set by
# default. This means that '/' characters must be matched explicitly. You
# can set 'skippedPathsFnmPathname' to 0 to disable the use of FNM_PATHNAME
# (meaning that '/*/dir3' will match '/dir1/dir2/dir3'). The default value
# contains the usual mount point for removable media to remind you that it
# is a bad idea to have Recoll work on these (esp. with the monitor: media
# gets indexed on mount, all data gets erased on unmount). Explicitly
# adding '/media/xxx' to the 'topdirs' variable will override
# this.</descr></var>
#
# <descr>Space-separated list of wildcard expressions for absolute filesystem paths (for files or
# directories). The variable must be defined at the top level of the configuration file, not in a
# subsection.
#
# Any value in the list must be textually consistent with the values in topdirs, no attempts are
# made to resolve symbolic links. In practise, if, as is frequently the case, /home is a link to
# /usr/home, your default topdirs will have a single entry '~' which will be translated to
# '/home/yourlogin'. In this case, any skippedPaths entry should start with '/home/yourlogin' *not*
# with '/usr/home/yourlogin'.
#
# The index and configuration directories will automatically be added to the list.
#
# The expressions are matched using 'fnmatch(3)' with the FNM_PATHNAME flag set by default. This
# means that '/' characters must be matched explicitly. You can set 'skippedPathsFnmPathname' to 0
# to disable the use of FNM_PATHNAME (meaning that '/*/dir3' will match '/dir1/dir2/dir3').
#
# The default value contains the usual mount point for removable media to remind you that it is in
# most cases a bad idea to have Recoll work on these Explicitly adding '/media/xxx' to the 'topdirs'
# variable will override this.</descr></var>
skippedPaths = /media

# <var name="skippedPathsFnmPathname" type="bool"><brief>Set to 0 to