Jean-Francois Dockes 2021-10-05 18:36:04 +02:00
parent feb2a3ec59
commit b5013b41e1
2 changed files with 107 additions and 39 deletions


@ -17,19 +17,17 @@ subset of the whole indexed area. The elements must be included in the
tree defined by the 'topdirs' members.</para></listitem></varlistentry> tree defined by the 'topdirs' members.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES">
<term><varname>skippedNames</varname></term> <term><varname>skippedNames</varname></term>
<listitem><para>Files and directories which should be ignored. <listitem><para>Files and directories which should be ignored. White space separated list of wildcard patterns (simple ones, not paths, must contain no
White space separated list of wildcard patterns (simple ones, not paths, '/' characters), which will be tested against file and directory names. Have a look at the default
must contain no / ), which will be tested against file and directory configuration for the initial value; some entries may not suit your situation. The easiest way to
names. The list in the default configuration does not exclude hidden see it is through the GUI Index configuration "local parameters" panel. The list in the default
directories (names beginning with a dot), which means that it may index configuration does not exclude hidden directories (names beginning with a dot), which means that
quite a few things that you do not want. On the other hand, email user it may index quite a few things that you do not want. On the other hand, email user agents like
agents like Thunderbird usually store messages in hidden directories, and Thunderbird usually store messages in hidden directories, and you probably want this indexed. One
you probably want this indexed. One possible solution is to have ".*" in possible solution is to have ".*" in "skippedNames", and add things like "~/.thunderbird"
"skippedNames", and add things like "~/.thunderbird" "~/.evolution" to "~/.evolution" to "topdirs". Not even the file names are indexed for patterns in this list, see
"topdirs". Not even the file names are indexed for patterns in this the "noContentSuffixes" variable for an alternative approach which indexes the file names. Can be
list, see the "noContentSuffixes" variable for an alternative approach redefined for any subtree.</para></listitem></varlistentry>
which indexes the file names. Can be redefined for any
subtree.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES-"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES-">
<term><varname>skippedNames-</varname></term> <term><varname>skippedNames-</varname></term>
<listitem><para>List of name endings to remove from the default skippedNames <listitem><para>List of name endings to remove from the default skippedNames
@ -313,7 +311,7 @@ Swedish:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ffff fifi flfl åå Åå unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ffff fifi flfl åå Åå
. German: . German:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ffff fifi flfl unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ffff fifi flfl
In French, you probably want to decompose oe and ae and nobody would type . French: you probably want to decompose oe and ae and nobody would type
a German ß a German ß
unac_except_trans = ßss œoe Œoe æae Æae ffff fifi flfl unac_except_trans = ßss œoe Œoe æae Æae ffff fifi flfl
. The default for all until someone protests follows. These decompositions . The default for all until someone protests follows. These decompositions
@ -442,6 +440,13 @@ possible to change it.</para></listitem></varlistentry>
<listitem><para>The path to browser downloads directory. This is <listitem><para>The path to browser downloads directory. This is
where the new browser add-on extension has to create the files. They are where the new browser add-on extension has to create the files. They are
then moved by a script to webqueuedir.</para></listitem></varlistentry> then moved by a script to webqueuedir.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.WEBCACHEKEEPINTERVAL">
<term><varname>webcachekeepinterval</varname></term>
<listitem><para>Page recycle interval. By default, only one instance of a URL is kept in the cache. This
can be changed by setting this to a value determining at what frequency
we keep multiple instances ('day', 'week', 'month',
'year'). Note that increasing the interval will not erase existing
entries.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR">
<term><varname>aspellDicDir</varname></term> <term><varname>aspellDicDir</varname></term>
<listitem><para>Aspell dictionary storage directory location. The <listitem><para>Aspell dictionary storage directory location. The
@ -486,10 +491,11 @@ is mainly to avoid infinite loops in postscript files
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.FILTERMAXMBYTES"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.FILTERMAXMBYTES">
<term><varname>filtermaxmbytes</varname></term> <term><varname>filtermaxmbytes</varname></term>
<listitem><para>Maximum virtual memory space for filter processes <listitem><para>Maximum virtual memory space for filter processes
(setrlimit(RLIMIT_AS)), in megabytes. Note that this (setrlimit(RLIMIT_AS)), in megabytes. Note that this includes any mapped libs (there is no reliable
includes any mapped libs (there is no reliable Linux way to limit the Linux way to limit the data space only), so we need to be a bit generous
data space only), so we need to be a bit generous here. Anything over here. Anything over 2000 will be ignored on 32 bits machines. The
2000 will be ignored on 32 bits machines.</para></listitem></varlistentry> previous default value of 2000 would prevent java pdftk from working when
executed from Python rclpdf.py.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.THRQSIZES"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.THRQSIZES">
<term><varname>thrQSizes</varname></term> <term><varname>thrQSizes</varname></term>
<listitem><para>Stage input queues configuration. There are three <listitem><para>Stage input queues configuration. There are three
@ -530,6 +536,12 @@ console. </para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXLOGFILENAME"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXLOGFILENAME">
<term><varname>idxlogfilename</varname></term> <term><varname>idxlogfilename</varname></term>
<listitem><para>Override logfilename for the indexer. </para></listitem></varlistentry> <listitem><para>Override logfilename for the indexer. </para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.HELPERLOGFILENAME">
<term><varname>helperlogfilename</varname></term>
<listitem><para>Destination file for external helpers' standard error output. The external program error output is left alone by default,
e.g. going to the terminal when the recoll[index] program is executed
from the command line. Use /dev/null or a file inside a non-existent
directory to completely suppress the output.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL">
<term><varname>daemloglevel</varname></term> <term><varname>daemloglevel</varname></term>
<listitem><para>Override loglevel for the indexer in real time <listitem><para>Override loglevel for the indexer in real time
@ -583,7 +595,9 @@ be looked up in the filters dirs, then in the path. Use an absolute path
to do otherwise.</para></listitem></varlistentry> to do otherwise.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.RECOLLHELPERPATH"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.RECOLLHELPERPATH">
<term><varname>recollhelperpath</varname></term> <term><varname>recollhelperpath</varname></term>
<listitem><para>Additional places to search for helper executables. This is only used on Windows for now.</para></listitem></varlistentry> <listitem><para>Additional places to search for helper executables. This is used, e.g., on Windows by the Python code, and on Mac OS by the bundled recoll.app
(because I could find no reliable way to tell launchd to set the PATH). The example below is for
Windows. Use ':' as the entry separator on Mac and Unix-like systems; ';' is for Windows only.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXABSMLEN"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXABSMLEN">
<term><varname>idxabsmlen</varname></term> <term><varname>idxabsmlen</varname></term>
<listitem><para>Length of abstracts we store while indexing. Recoll stores an abstract for each indexed file. <listitem><para>Length of abstracts we store while indexing. Recoll stores an abstract for each indexed file.
@ -609,6 +623,11 @@ may be too low if you have custom fields.</para></listitem></varlistentry>
the beginning of documents. This is not recommended except if you are the beginning of documents. This is not recommended except if you are
sure that the interesting keywords are at the top and have severe disk sure that the interesting keywords are at the top and have severe disk
space issues.</para></listitem></varlistentry> space issues.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.IDXSYNONYMS">
<term><varname>idxsynonyms</varname></term>
<listitem><para>Name of the index-time synonyms file. This is used for indexing multiword synonyms as single terms,
which in turn is only useful if you want to perform proximity searches
with such terms.</para></listitem></varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE"> <varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE">
<term><varname>aspellLanguage</varname></term> <term><varname>aspellLanguage</varname></term>
<listitem><para>Language definitions to use when creating the aspell <listitem><para>Language definitions to use when creating the aspell


@ -8927,22 +8927,26 @@ hasextract = False
<dd> <dd>
<p>Files and directories which should be ignored. <p>Files and directories which should be ignored.
White space separated list of wildcard patterns White space separated list of wildcard patterns
(simple ones, not paths, must contain no / ), (simple ones, not paths, must contain no '/'
which will be tested against file and directory characters), which will be tested against file
names. The list in the default configuration does and directory names. Have a look at the default
not exclude hidden directories (names beginning configuration for the initial value; some entries
with a dot), which means that it may index quite may not suit your situation. The easiest way to
a few things that you do not want. On the other see it is through the GUI Index configuration
hand, email user agents like Thunderbird usually "local parameters" panel. The list in the default
store messages in hidden directories, and you configuration does not exclude hidden directories
probably want this indexed. One possible solution (names beginning with a dot), which means that it
is to have ".*" in "skippedNames", and add things may index quite a few things that you do not
like "~/.thunderbird" "~/.evolution" to want. On the other hand, email user agents like
"topdirs". Not even the file names are indexed Thunderbird usually store messages in hidden
for patterns in this list, see the directories, and you probably want this indexed.
"noContentSuffixes" variable for an alternative One possible solution is to have ".*" in
approach which indexes the file names. Can be "skippedNames", and add things like
redefined for any subtree.</p> "~/.thunderbird" "~/.evolution" to "topdirs". Not
even the file names are indexed for patterns in
this list, see the "noContentSuffixes" variable
for an alternative approach which indexes the
file names. Can be redefined for any subtree.</p>
</dd> </dd>
<dt><a name= <dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES-" id= "RCL.INSTALL.CONFIG.RECOLLCONF.SKIPPEDNAMES-" id=
@ -9425,7 +9429,7 @@ hasextract = False
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe
æae Æae ffff fifi flfl åå Åå . German: æae Æae ffff fifi flfl åå Åå . German:
unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe
æae Æae ffff fifi flfl In French, you probably want æae Æae ffff fifi flfl . French: you probably want
to decompose oe and ae and nobody would type a to decompose oe and ae and nobody would type a
German ß unac_except_trans = ßss œoe Œoe æae Æae German ß unac_except_trans = ßss œoe Œoe æae Æae
ffff fifi flfl . The default for all until someone ffff fifi flfl . The default for all until someone
@ -9644,6 +9648,21 @@ hasextract = False
to webqueuedir.</p> to webqueuedir.</p>
</dd> </dd>
<dt><a name= <dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.WEBCACHEKEEPINTERVAL"
id=
"RCL.INSTALL.CONFIG.RECOLLCONF.WEBCACHEKEEPINTERVAL">
</a><span class="term"><code class=
"varname">webcachekeepinterval</code></span></dt>
<dd>
<p>Page recycle interval. By default, only one
instance of a URL is kept in the cache. This can
be changed by setting this to a value determining
at what frequency we keep multiple instances
('day', 'week', 'month', 'year'). Note that
increasing the interval will not erase existing
entries.</p>
</dd>
<dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR" id= "RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR"></a><span class="term"><code class="varname">aspellDicDir</code></span></dt> "RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLDICDIR"></a><span class="term"><code class="varname">aspellDicDir</code></span></dt>
<dd> <dd>
@ -9734,7 +9753,9 @@ hasextract = False
no reliable Linux way to limit the data space no reliable Linux way to limit the data space
only), so we need to be a bit generous here. only), so we need to be a bit generous here.
Anything over 2000 will be ignored on 32 bits Anything over 2000 will be ignored on 32 bits
machines.</p> machines. The previous default value of 2000
would prevent java pdftk from working when executed
from Python rclpdf.py.</p>
</dd> </dd>
<dt><a name= <dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.THRQSIZES" id= "RCL.INSTALL.CONFIG.RECOLLCONF.THRQSIZES" id=
@ -9818,6 +9839,19 @@ hasextract = False
<p>Override logfilename for the indexer.</p> <p>Override logfilename for the indexer.</p>
</dd> </dd>
<dt><a name= <dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.HELPERLOGFILENAME"
id=
"RCL.INSTALL.CONFIG.RECOLLCONF.HELPERLOGFILENAME"></a><span class="term"><code class="varname">helperlogfilename</code></span></dt>
<dd>
<p>Destination file for external helpers' standard
error output. The external program error output
is left alone by default, e.g. going to the
terminal when the recoll[index] program is
executed from the command line. Use /dev/null or
a file inside a non-existent directory to
completely suppress the output.</p>
</dd>
<dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL" id= "RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL"></a><span class="term"><code class="varname">daemloglevel</code></span></dt> "RCL.INSTALL.CONFIG.RECOLLCONF.DAEMLOGLEVEL"></a><span class="term"><code class="varname">daemloglevel</code></span></dt>
<dd> <dd>
@ -9915,8 +9949,13 @@ hasextract = False
"varname">recollhelperpath</code></span></dt> "varname">recollhelperpath</code></span></dt>
<dd> <dd>
<p>Additional places to search for helper <p>Additional places to search for helper
executables. This is only used on Windows for executables. This is used, e.g., on Windows by
now.</p> the Python code, and on Mac OS by the bundled
recoll.app (because I could find no reliable way
to tell launchd to set the PATH). The example
below is for Windows. Use ':' as the entry separator
on Mac and Unix-like systems; ';' is for Windows
only.</p>
</dd> </dd>
<dt><a name= <dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.IDXABSMLEN" id= "RCL.INSTALL.CONFIG.RECOLLCONF.IDXABSMLEN" id=
@ -9964,6 +10003,16 @@ hasextract = False
severe disk space issues.</p> severe disk space issues.</p>
</dd> </dd>
<dt><a name= <dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.IDXSYNONYMS" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.IDXSYNONYMS"></a><span class="term"><code class="varname">idxsynonyms</code></span></dt>
<dd>
<p>Name of the index-time synonyms file. This is
used for indexing multiword synonyms as single
terms, which in turn is only useful if you want
to perform proximity searches with such
terms.</p>
</dd>
<dt><a name=
"RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE" id= "RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE" id=
"RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE"></a><span class="term"><code class="varname">aspellLanguage</code></span></dt> "RCL.INSTALL.CONFIG.RECOLLCONF.ASPELLLANGUAGE"></a><span class="term"><code class="varname">aspellLanguage</code></span></dt>
<dd> <dd>
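
For orientation, the variables documented in this change could be combined in a recoll.conf fragment along the following lines. This is only a sketch. The values 'week' and /dev/null come from the descriptions above, while every path and directory name is a hypothetical placeholder. The comment on skippedNames states an assumption about how assignment interacts with the default list.

```ini
# Illustrative recoll.conf fragment (a sketch; all paths are hypothetical).

# Skip hidden entries, but still index mail stores kept in hidden directories.
# Assumption: assigning skippedNames replaces the default list, so check the
# default value before overriding it.
skippedNames = .*
topdirs = ~ ~/.thunderbird ~/.evolution

# Keep multiple instances of a URL in the web cache, at a weekly frequency.
# Documented values: day, week, month, year.
webcachekeepinterval = week

# Suppress the standard error output of external helpers.
helperlogfilename = /dev/null

# Index-time synonyms file (hypothetical location).
idxsynonyms = /home/me/.recoll/idxsynonyms.txt

# Extra directories searched for helper executables (hypothetical entries).
# ':' separates entries on Mac and Unix-like systems; use ';' on Windows.
recollhelperpath = /opt/local/bin:/usr/local/bin
```

Leaving helperlogfilename unset keeps the default behaviour described above, where helper error output goes wherever the indexer's own standard error goes.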