added compressedfilemaxkbs
parent
bf16706d50
commit
d0a8a37298
@ -3,8 +3,8 @@
|
|||||||
.SH NAME
|
.SH NAME
|
||||||
recoll.conf \- main personal configuration file for Recoll
|
recoll.conf \- main personal configuration file for Recoll
|
||||||
.SH DESCRIPTION
|
.SH DESCRIPTION
|
||||||
This file defines the indexation configuration for the full-text search
|
This file defines the indexation configuration for the Recoll full-text search
|
||||||
system Recoll.
|
system.
|
||||||
.LP
|
.LP
|
||||||
The system-wide configuration file is normally located inside
|
The system-wide configuration file is normally located inside
|
||||||
/usr/[local]/share/recoll/examples. Any parameter set in the common file
|
/usr/[local]/share/recoll/examples. Any parameter set in the common file
|
||||||
@ -58,6 +58,11 @@ embedded spaces can be quoted with double-quotes.
|
|||||||
.BI "topdirs = " directories
|
.BI "topdirs = " directories
|
||||||
Specifies the list of directories to index (recursively).
|
Specifies the list of directories to index (recursively).
|
||||||
.TP
|
.TP
|
||||||
|
.BI "dbdir = " directory
|
||||||
|
The name of the Xapian database directory. It will be created if needed
|
||||||
|
when the database is initialized. If this is not an absolute pathname, it
|
||||||
|
will be taken relative to the configuration directory.
|
||||||
|
.TP
|
||||||
.BI "skippedNames = " patterns
|
.BI "skippedNames = " patterns
|
||||||
A space-separated list of patterns for names of files or directories that
|
A space-separated list of patterns for names of files or directories that
|
||||||
should be completely ignored. The list defined in the default file is:
|
should be completely ignored. The list defined in the default file is:
|
||||||
@ -76,6 +81,18 @@ into. Together with topdirs, this allows pruning the indexed tree to one's
|
|||||||
content. daemSkippedPaths can be used to define a specific value for the
|
content. daemSkippedPaths can be used to define a specific value for the
|
||||||
real time indexing monitor.
|
real time indexing monitor.
|
||||||
.TP
|
.TP
|
||||||
|
.BI "followLinks = " boolean
|
||||||
|
Specifies if the indexer should follow
|
||||||
|
symbolic links while walking the file tree. The default is
|
||||||
|
to ignore symbolic links to avoid multiple indexing of
|
||||||
|
linked files. No effort is made to avoid duplication when
|
||||||
|
this option is set to true. This option can be set
|
||||||
|
individually for each of the
|
||||||
|
.I topdirs
|
||||||
|
members by using sections. It cannot be changed below the
|
||||||
|
.I topdirs
|
||||||
|
level.
|
||||||
|
.TP
|
||||||
.BI "loglevel = " value
|
.BI "loglevel = " value
|
||||||
Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
|
Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
|
||||||
debug/information messages. 3 lists only errors.
|
debug/information messages. 3 lists only errors.
|
||||||
@ -87,11 +104,6 @@ Where should the messages go. 'stderr' can be used as a special value.
|
|||||||
.B daemlogfilename
|
.B daemlogfilename
|
||||||
can be used to specify a different value for the real-time indexing daemon.
|
can be used to specify a different value for the real-time indexing daemon.
|
||||||
.TP
|
.TP
|
||||||
.BI "dbdir = " directory
|
|
||||||
The name of the Xapian database directory. It will be created if needed
|
|
||||||
when the database is initialized. If this is not an absolute pathname, it
|
|
||||||
will be taken relative to the configuration directory.
|
|
||||||
.TP
|
|
||||||
.BI "indexstemminglanguages = " languages
|
.BI "indexstemminglanguages = " languages
|
||||||
A list of languages for which the stem expansion databases will be
|
A list of languages for which the stem expansion databases will be
|
||||||
built. See recollindex(1) for possible values.
|
built. See recollindex(1) for possible values.
|
||||||
@ -132,13 +144,6 @@ Try to guess the character set of files if no internal value is available
|
|||||||
(ie: for plain text files). This does not work well in general, and should
|
(ie: for plain text files). This does not work well in general, and should
|
||||||
probably not be used.
|
probably not be used.
|
||||||
.TP
|
.TP
|
||||||
.BI "indexallfilenames = " boolean
|
|
||||||
Recoll indexes file names into a special section of the database to allow
|
|
||||||
specific file names searches using wild cards. This parameter decides if
|
|
||||||
file name indexing is performed only for files with mime types that would
|
|
||||||
qualify them for full text indexation, or for all files inside
|
|
||||||
the selected subtrees, independent of mime type.
|
|
||||||
.TP
|
|
||||||
.BI "usesystemfilecommand = " boolean
|
.BI "usesystemfilecommand = " boolean
|
||||||
Decide if we use the
|
Decide if we use the
|
||||||
.B "file -i"
|
.B "file -i"
|
||||||
@ -147,6 +152,65 @@ system command as a final step for determining the mime type for a file
|
|||||||
.B mimemap
|
.B mimemap
|
||||||
file). This can be useful for files with suffixless names, but it will
|
file). This can be useful for files with suffixless names, but it will
|
||||||
also cause the indexation of many bogus "text" files.
|
also cause the indexation of many bogus "text" files.
|
||||||
|
.TP
|
||||||
|
.BI "indexedmimetypes = " list
|
||||||
|
Recoll normally indexes any file which it knows how to read. This list lets
|
||||||
|
you restrict the indexed mime types to what you specify. If the variable is
|
||||||
|
unspecified or the list empty (the default), all supported types are
|
||||||
|
processed.
|
||||||
|
.TP
|
||||||
|
.BI "compressedfilemaxkbs = " value
|
||||||
|
Size limit for compressed (.gz or .bz2) files. These need to be
|
||||||
|
decompressed in a temporary directory for identification, which can be very
|
||||||
|
wasteful if 'uninteresting' big compressed files are present. Negative
|
||||||
|
means no limit, 0 means no processing of any compressed file. Defaults
|
||||||
|
to -1.
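
The three cases described for compressedfilemaxkbs (negative, zero, positive) can be sketched as a small decision function. This is only an illustration of the documented semantics, not Recoll's actual code: the function name is hypothetical, and it assumes that one "kb" is 1024 bytes and that a file exactly at the limit is still processed, neither of which is stated above.

```python
def should_process_compressed(size_bytes: int, max_kbs: int = -1) -> bool:
    """Return True if a compressed file of this size would be decompressed.

    Hypothetical helper modelling the documented compressedfilemaxkbs rules.
    Assumptions (not from the man page): 1 kb == 1024 bytes, inclusive limit.
    """
    if max_kbs < 0:
        # Negative value: no limit (the documented default is -1).
        return True
    if max_kbs == 0:
        # Zero: no compressed file is processed at all.
        return False
    # Positive value: process only files up to the limit.
    return size_bytes <= max_kbs * 1024
```

Under these assumptions, a recoll.conf line such as "compressedfilemaxkbs = 2000" would correspond to max_kbs=2000, so a 5 MB archive would be skipped while a 1 MB one would still be decompressed.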
|
||||||
|
.TP
|
||||||
|
.BI "indexallfilenames = " boolean
|
||||||
|
Recoll indexes file names into a special section of the database to allow
|
||||||
|
specific file names searches using wild cards. This parameter decides if
|
||||||
|
file name indexing is performed only for files with mime types that would
|
||||||
|
qualify them for full text indexation, or for all files inside
|
||||||
|
the selected subtrees, independent of mime type.
|
||||||
|
.TP
|
||||||
|
.BI "idxabsmlen = " value
|
||||||
|
Recoll stores an abstract for each indexed file inside the database. The
|
||||||
|
text can come from an actual 'abstract' section in the document or will
|
||||||
|
just be the beginning of the document. It is stored in the index so that it
|
||||||
|
can be displayed inside the result lists without decoding the original
|
||||||
|
file. The
|
||||||
|
.I idxabsmlen
|
||||||
|
parameter defines the size of the stored abstract. The default value is 250
|
||||||
|
bytes. The search interface gives you the choice to display this stored
|
||||||
|
text or a synthetic abstract built by extracting text around the search
|
||||||
|
terms. If you always prefer the synthetic abstract, you can reduce this
|
||||||
|
value and save a little space.
|
||||||
|
.TP
|
||||||
|
.BI "aspellLanguage = " lang
|
||||||
|
Language definitions to use when creating the aspell dictionary. The value
|
||||||
|
must match a set of aspell language definition files. You can type "aspell
|
||||||
|
config" to see where these are installed (look for data-dir). The default
|
||||||
|
if the variable is not set is to use your desktop national language
|
||||||
|
environment to guess the value.
|
||||||
|
.TP
|
||||||
|
.BI "noaspell = " boolean
|
||||||
|
If this is set, the aspell dictionary generation is turned off. Useful for
|
||||||
|
cases where you don't need the functionality or when it is unusable because
|
||||||
|
aspell crashes during dictionary generation.
|
||||||
|
.TP
|
||||||
|
.BI "nocjk = " boolean
|
||||||
|
If this is set to true, specific East Asian (Chinese, Korean, Japanese)
|
||||||
|
character/word splitting is turned off. This will save a small amount of
|
||||||
|
cpu if you have no CJK documents. If your document base does include such
|
||||||
|
text but you are not interested in searching it, setting
|
||||||
|
.I nocjk
|
||||||
|
may be a significant time and space saver.
|
||||||
|
.TP
|
||||||
|
.BI "cjkngramlen = " value
|
||||||
|
This lets you adjust the size of n-grams used for indexing CJK text. The
|
||||||
|
default value of 2 is probably appropriate in most cases. A value of 3
|
||||||
|
would allow more precision and efficiency on longer words, but the index
|
||||||
|
will be approximately twice as large.
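
To make the effect of cjkngramlen more concrete, here is a rough sketch of character n-gram generation. It assumes overlapping n-grams, which the text above does not specify, and the function name is hypothetical; it is not taken from the Recoll sources.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Split a run of characters into overlapping n-grams.

    Hypothetical illustration only: the overlapping scheme is an assumption.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(text) <= n:
        # A run shorter than n is kept whole; an empty run yields nothing.
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

With n=2, a five-character run yields four two-character terms; with n=3 it yields three longer and more specific terms.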
|
||||||
.SH SEE ALSO
|
.SH SEE ALSO
|
||||||
.PP
|
.PP
|
||||||
recollindex(1) recoll(1)
|
recollindex(1) recoll(1)
|
||||||
|
|||||||
@ -578,9 +578,9 @@ fvwm
|
|||||||
</chapter>
|
</chapter>
|
||||||
|
|
||||||
<chapter id="rcl.search">
|
<chapter id="rcl.search">
|
||||||
<title>Searching</title>
|
<title>Searching with the Qt graphical user interface</title>
|
||||||
|
|
||||||
<para>The <command>recoll</command> program provides the user
|
<para>The <command>recoll</command> program provides the main user
|
||||||
interface for searching. It is based on the
|
interface for searching. It is based on the
|
||||||
<application>QT</application> library.</para>
|
<application>QT</application> library.</para>
|
||||||
|
|
||||||
@ -1048,6 +1048,23 @@ fvwm
|
|||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
|
|
||||||
|
<formalpara><title>Phrases and Proximity searches</title>
|
||||||
|
<para>These two clauses work in similar ways, with the
|
||||||
|
difference that proximity searches do not impose an order on the
|
||||||
|
words. In both cases, an adjustable number (slack) of non-matched words
|
||||||
|
may be accepted between the searched ones (use the counter on
|
||||||
|
the left to adjust this count). For phrases, the default count
|
||||||
|
is zero (exact match). For proximity it is ten (meaning that two search
|
||||||
|
terms would be matched if found within a window of twelve
|
||||||
|
words). Examples: a phrase search for <literal>quick
|
||||||
|
fox</literal> with a slack of 0 will match <literal>quick
|
||||||
|
fox</literal> but not <literal>quick brown fox</literal>. With
|
||||||
|
a slack of 1 it will match the latter, but not <literal>fox
|
||||||
|
quick</literal>. A proximity search for <literal>quick
|
||||||
|
fox</literal> with the default slack will match the
|
||||||
|
latter, and also <literal>a fox is a cunning and quick animal</literal>.</para>
|
||||||
|
</formalpara>
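
The slack rule above can be modelled with a toy matcher. This is a simplified sketch, not the search engine's implementation: it assumes whitespace tokenization and case folding, no stemming, and a window size equal to the number of terms plus the slack (so two terms with a slack of ten give the twelve-word window mentioned above). The function name is hypothetical.

```python
from itertools import product


def slack_match(text: str, terms: list[str], slack: int, ordered: bool) -> bool:
    """Toy model of phrase (ordered=True) and proximity (ordered=False) clauses.

    A match requires every term to occur, at distinct positions, within a
    window of len(terms) + slack words. Phrases also require the query order.
    """
    if not terms:
        return False
    words = text.lower().split()
    window = len(terms) + slack
    # Candidate word positions for each query term.
    positions = [[i for i, w in enumerate(words) if w == t.lower()] for t in terms]
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # the same word cannot serve two query terms
        if ordered and list(combo) != sorted(combo):
            continue  # phrase search: terms must appear in query order
        if max(combo) - min(combo) + 1 <= window:
            return True
    return False
```

Called with ordered=True and a slack of 0 or 1, this reproduces the phrase examples from the paragraph; with ordered=False and a slack of 10 it accepts both "fox quick" and the longer "cunning and quick animal" sentence.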
|
||||||
|
|
||||||
<para>Click on the <guilabel>Start Search</guilabel> button in
|
<para>Click on the <guilabel>Start Search</guilabel> button in
|
||||||
the advanced search dialog, or type <keycap>Enter</keycap> in
|
the advanced search dialog, or type <keycap>Enter</keycap> in
|
||||||
any text field to start the search. The button in
|
any text field to start the search. The button in
|
||||||
@ -1361,7 +1378,7 @@ fvwm
|
|||||||
quotes. Example: <literal>"user manual"</literal> will look
|
quotes. Example: <literal>"user manual"</literal> will look
|
||||||
only for occurrences of <literal>user</literal> immediately
|
only for occurrences of <literal>user</literal> immediately
|
||||||
followed by <literal>manual</literal>. You can use the
|
followed by <literal>manual</literal>. You can use the
|
||||||
<guilabel>This exact phrase</guilabel> field of the advanced
|
<guilabel>This phrase</guilabel> field of the advanced
|
||||||
search dialog to the same effect. Phrases can be entered along
|
search dialog to the same effect. Phrases can be entered along
|
||||||
simple terms in all simple or advanced search entry fields
|
simple terms in all simple or advanced search entry fields
|
||||||
(except <guilabel>This exact phrase</guilabel>).</para>
|
(except <guilabel>This exact phrase</guilabel>).</para>
|
||||||
@ -1646,6 +1663,135 @@ fvwm
|
|||||||
|
|
||||||
</chapter>
|
</chapter>
|
||||||
|
|
||||||
|
<chapter id="rcl.searchkio">
|
||||||
|
<title>Searching with the KDE KIO slave</title>
|
||||||
|
|
||||||
|
<sect1 id="rcl.searchkio.intro">
|
||||||
|
<title>What's this</title>
|
||||||
|
|
||||||
|
<para>The &RCL; KIO slave allows performing a &RCL; search
|
||||||
|
by entering an appropriate URL in a KDE open dialog, or with an
|
||||||
|
HTML-based interface displayed in
|
||||||
|
<command>Konqueror</command>.</para>
|
||||||
|
|
||||||
|
<para>The HTML-based interface is similar to the QT-based
|
||||||
|
interface, but slightly less powerful for now. Its advantage is
|
||||||
|
that you can perform your search while staying fully within the
|
||||||
|
KDE framework: drag and drop from the result list works normally
|
||||||
|
and you have your normal choice of applications for opening
|
||||||
|
files.</para>
|
||||||
|
|
||||||
|
<para>The alternative interface uses a directory view of search
|
||||||
|
results. Due to limitations in the current KIO slave interface,
|
||||||
|
it is currently not obviously useful (to me).</para>
|
||||||
|
|
||||||
|
<para>The interface is described in more detail inside a help
|
||||||
|
file which you can access by entering
|
||||||
|
<filename>recoll:/</filename> inside the
|
||||||
|
<command>konqueror</command> URL line (this works only if the
|
||||||
|
recoll KIO slave has been previously installed).</para>
|
||||||
|
|
||||||
|
|
||||||
|
<para>The instructions for building this module are located in
|
||||||
|
the source tree. See:
|
||||||
|
<filename>kde/kio/recoll/00README.txt</filename></para>
|
||||||
|
</sect1>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<sect1 id="rcl.searchkio.searchabledocs">
|
||||||
|
<title>Searchable documents</title>
|
||||||
|
|
||||||
|
<para>As a sample application, the &RCL; KIO slave could allow
|
||||||
|
preparing a set of HTML documents (for example a manual) so that
|
||||||
|
they become their own search interface inside
|
||||||
|
<command>konqueror</command>.</para>
|
||||||
|
|
||||||
|
<para>This can be done by either explicitly inserting
|
||||||
|
<literal><a href="recoll:/..."></literal> links
|
||||||
|
around some document areas, or automatically by adding a
|
||||||
|
very small <application>javascript</application> program to the
|
||||||
|
documents, like the following example, which would initiate a search by
|
||||||
|
double-clicking any term:</para>
|
||||||
|
|
||||||
|
<programlisting><script language="JavaScript">
|
||||||
|
function recollsearch() {
|
||||||
|
var t = document.getSelection();
|
||||||
|
window.location.href = 'recoll://search/query?qtp=a&p=0&q=' +
|
||||||
|
encodeURIComponent(t);
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
....
|
||||||
|
<body ondblclick="recollsearch()">
|
||||||
|
|
||||||
|
</programlisting>
|
||||||
|
|
||||||
|
</sect1>
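
For documents generated by a script rather than edited by hand, the same search URL could be built ahead of time. The sketch below mirrors the URL layout used in the JavaScript listing above; the function name is hypothetical, and the qtp and p parameters are copied verbatim from that listing since their meaning is not described here. The safe set passed to quote() is chosen to approximate what encodeURIComponent leaves unescaped.

```python
from urllib.parse import quote


def recoll_search_url(selection: str) -> str:
    """Build a recoll: search URL for a term, as in the JavaScript example.

    Hypothetical helper; the URL prefix is copied from the listing above.
    """
    # quote() with this safe set escapes the same characters as
    # JavaScript's encodeURIComponent for ordinary text.
    return "recoll://search/query?qtp=a&p=0&q=" + quote(selection, safe="!~*'()")
```

The returned string could then be used as the href of an explicit link around a document area.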
|
||||||
|
|
||||||
|
</chapter>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<chapter id="rcl.searchkcl">
|
||||||
|
<title>Searching on the command line</title>
|
||||||
|
|
||||||
|
<para>There are several ways to obtain search results as a text
|
||||||
|
stream, without a graphical interface:</para>
|
||||||
|
<itemizedlist>
|
||||||
|
<listitem><para>By passing option <literal>-t</literal> to the
|
||||||
|
<command>recoll</command> program.</para>
|
||||||
|
</listitem>
|
||||||
|
<listitem><para>By using the <command>recollq</command> program.</para>
|
||||||
|
</listitem>
|
||||||
|
<listitem><para>By writing a custom
|
||||||
|
<application>Python</application> program, using the
|
||||||
|
<link linkend="rcl.program.api.python">Recoll Python API</link>.</para>
|
||||||
|
</listitem>
|
||||||
|
</itemizedlist>
|
||||||
|
|
||||||
|
<para>The first two methods work in the same way and accept/need the same
|
||||||
|
arguments (except for the additional <literal>-t</literal> to
|
||||||
|
<command>recoll</command>). The query to be executed is specified
|
||||||
|
as command line arguments.</para>
|
||||||
|
|
||||||
|
<para><command>recollq</command> is not built by default. You can
|
||||||
|
use the <filename>Makefile</filename> in the
|
||||||
|
<filename>query</filename> directory to build it. This is a very
|
||||||
|
simple program, and it will often be useful to tailor its output format
|
||||||
|
to your needs.</para>
|
||||||
|
|
||||||
|
<para><command>recollq</command> has a man page (not installed by
|
||||||
|
default, look in the <filename>doc/man</filename> directory). The
|
||||||
|
Usage string is as follows:</para>
|
||||||
|
<programlisting>recollq [-o|-a|-f] <query string>
|
||||||
|
Runs a recoll query and displays result lines.
|
||||||
|
Default: will interpret the argument(s) as a query language string
|
||||||
|
-o Emulate the gui simple search in ANY TERM mode
|
||||||
|
-a Emulate the gui simple search in ALL TERMS mode
|
||||||
|
-f Emulate the gui simple search in filename mode
|
||||||
|
Common options:
|
||||||
|
-c <configdir> : specify config directory, overriding $RECOLL_CONFDIR
|
||||||
|
-d also dump file contents
|
||||||
|
-n <cnt> limit the maximum number of results (0->no limit, default 2000)
|
||||||
|
-b : basic. Just output urls, no mime types or titles
|
||||||
|
-m : dump the whole document meta[] array
|
||||||
|
-S fld : sort by field name
|
||||||
|
-D : sort descending
|
||||||
|
</programlisting>
|
||||||
|
|
||||||
|
<para>Sample execution:</para>
|
||||||
|
<programlisting>recollq 'ilur -nautique mime:text/html'
|
||||||
|
Recoll query: ((((ilur:(wqf=11) OR ilurs) AND_NOT (nautique:(wqf=11)
|
||||||
|
OR nautiques OR nautiqu OR nautiquement)) FILTER Ttext/html))
|
||||||
|
4 results
|
||||||
|
text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/comptes.html] [comptes.html] 18593 bytes
|
||||||
|
text/html [file:///Users/uncrypted-dockes/projets/nautique/webnautique/articles/ilur1/index.html] [Constructio...
|
||||||
|
text/html [file:///Users/uncrypted-dockes/projets/pagepers/index.html] [psxtcl/writemime/recoll]...
|
||||||
|
text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/recu-chasse-maree....
|
||||||
|
</programlisting>
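
As an alternative to changing the program itself, the output can be post-processed by a small wrapper. The sketch below only uses options listed in the usage string above (-c, -n, -b) and assumes that recollq has been built and is on the PATH; the function names are hypothetical, and no assumption is made about the output beyond it being line-oriented text.

```python
import subprocess


def recollq_argv(query, max_results=None, urls_only=False, config_dir=None):
    """Build the recollq command line from options in the usage string."""
    argv = ["recollq"]
    if config_dir is not None:
        argv += ["-c", config_dir]        # -c <configdir>
    if max_results is not None:
        argv += ["-n", str(max_results)]  # -n <cnt>, 0 means no limit
    if urls_only:
        argv.append("-b")                 # -b: just output urls
    argv.append(query)                    # query language string
    return argv


def run_recollq(query, **options):
    """Run recollq and return its output lines (requires recollq installed)."""
    result = subprocess.run(
        recollq_argv(query, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

For example, run_recollq('ilur -nautique mime:text/html', urls_only=True, max_results=10) would run the sample query above and return at most ten result lines for further filtering.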
|
||||||
|
|
||||||
|
</chapter>
|
||||||
|
|
||||||
<chapter id="rcl.program">
|
<chapter id="rcl.program">
|
||||||
<title>Programming interface</title>
|
<title>Programming interface</title>
|
||||||
|
|
||||||
@ -2713,6 +2859,16 @@ skippedPaths = ~/somedir/∗.txt
|
|||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
|
<varlistentry><term><literal>compressedfilemaxkbs</literal></term>
|
||||||
|
<listitem><para>Size limit for compressed (.gz or .bz2)
|
||||||
|
files. These need to be decompressed in a temporary
|
||||||
|
directory for identification, which can be very wasteful
|
||||||
|
if 'uninteresting' big compressed files are present.
|
||||||
|
Negative means no limit, 0 means no processing of any
|
||||||
|
compressed file. Defaults to -1.</para>
|
||||||
|
</listitem>
|
||||||
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry><term><literal>indexallfilenames</literal></term>
|
<varlistentry><term><literal>indexallfilenames</literal></term>
|
||||||
<listitem><para>&RCL; indexes file names in a special
|
<listitem><para>&RCL; indexes file names in a special
|
||||||
section of the database to allow specific file names
|
section of the database to allow specific file names
|
||||||
|
|||||||