*** empty log message ***
parent
b955f42655
commit
9001129bf4
@ -1 +1 @@
|
|||||||
1.4.0
|
1.4.1
|
||||||
|
|||||||
@ -24,7 +24,7 @@
|
|||||||
Dockes</holder>
|
Dockes</holder>
|
||||||
</copyright>
|
</copyright>
|
||||||
|
|
||||||
<releaseinfo>$Id: usermanual.sgml,v 1.11 2006-04-07 13:07:34 dockes Exp $</releaseinfo>
|
<releaseinfo>$Id: usermanual.sgml,v 1.12 2006-04-08 14:00:14 dockes Exp $</releaseinfo>
|
||||||
|
|
||||||
<abstract>
|
<abstract>
|
||||||
<para>This document introduces full text search notions
|
<para>This document introduces full text search notions
|
||||||
@ -114,24 +114,24 @@
|
|||||||
in your document files. The acquisition process is called
|
in your document files. The acquisition process is called
|
||||||
indexing. </para>
|
indexing. </para>
|
||||||
|
|
||||||
<para>The resulting database can be big (roughly the size of the
|
<para>The resulting index can be big (roughly the size of the
|
||||||
original document set), but it is not a document
|
original document set), but it is not a document
|
||||||
archive. &RCL; can only display documents that still exist at
|
archive. &RCL; can only display documents that still exist at
|
||||||
the place from which they were indexed. (Actually, there is a
|
the place from which they were indexed. (Actually, there is a
|
||||||
way to reconstruct a document from the information in the
|
way to reconstruct a document from the information in the
|
||||||
database, but the result is not nice, as all formatting,
|
index, but the result is not nice, as all formatting,
|
||||||
punctuation and capitalisation are lost).</para>
|
punctuation and capitalisation are lost).</para>
|
||||||
|
|
||||||
<para>&RCL; stores all internal data in <application>Unicode
|
<para>&RCL; stores all internal data in <application>Unicode
|
||||||
UTF-8</application> format, and it can index files with
|
UTF-8</application> format, and it can index files with
|
||||||
different character sets, encodings, and languages into the same
|
different character sets, encodings, and languages into the same
|
||||||
database. It has input filters for many document types.</para>
|
index. It has input filters for many document types.</para>
|
||||||
|
|
||||||
<para>Stemming depends on the document language. &RCL; stores
|
<para>Stemming depends on the document language. &RCL; stores
|
||||||
the unstemmed versions of terms and uses auxiliary databases for
|
the unstemmed versions of terms and uses auxiliary databases for
|
||||||
term expansion. It can switch stemming languages, or add a
|
term expansion. It can switch stemming languages, or add a
|
||||||
language, without reindexing. Storing documents in different
|
language, without reindexing. Storing documents in different
|
||||||
languages in the same database is possible, and useful in
|
languages in the same index is possible, and useful in
|
||||||
practice, but does introduce possibilities of confusion. &RCL;
|
practice, but does introduce possibilities of confusion. &RCL;
|
||||||
currently makes no attempt at automatic language recognition.</para>
|
currently makes no attempt at automatic language recognition.</para>
|
||||||
|
|
||||||
@ -218,6 +218,37 @@
|
|||||||
|
|
||||||
</sect1>
|
</sect1>
|
||||||
|
|
||||||
|
<sect1 id="rcl.indexing.storage">
|
||||||
|
<title>Index storage</title>
|
||||||
|
|
||||||
|
<para>The default location for the index data is the
|
||||||
|
<filename>$HOME/.recoll/xapiandb/</filename> directory. This can
|
||||||
|
be changed by setting the <literal>RECOLL_CONFDIR</literal>
|
||||||
|
environment variable, or by specifying the
|
||||||
|
<literal>dbdir</literal> parameter in the configuration file
|
||||||
|
(see the <link linkend="rcl.install.config">configuration
|
||||||
|
section</link>).</para>
|
||||||
|
|
||||||
|
<para>The size of the index is determined by the size of the set
|
||||||
|
of documents, but the ratio can vary a lot. For a typical mixed
|
||||||
|
set of documents, the index size will often be close to
|
||||||
|
the data set size. In specific cases (a set of compressed
|
||||||
|
mbox files for example), the index can become much bigger than
|
||||||
|
the documents. It may also be much smaller if the documents
|
||||||
|
contain a lot of images or other non-indexed data (an extreme
|
||||||
|
example being a set of mp3 files where only the tags would be
|
||||||
|
indexed).</para>
|
||||||
|
|
||||||
|
<para>Of course, images, sound and video do not increase the
|
||||||
|
index size, which means that it will be quite typical nowadays
|
||||||
|
(2006), that even a big index will be negligible against the
|
||||||
|
total amount of data on the computer.</para>
|
||||||
|
|
||||||
|
<para>The index data directory only contains data that will be
|
||||||
|
rebuilt by an index run, so that it can be destroyed safely.</para>
|
||||||
|
|
||||||
|
</sect1>
|
||||||
|
|
||||||
<sect1 id="rcl.indexing.config">
|
<sect1 id="rcl.indexing.config">
|
||||||
<title>The indexing configuration</title>
|
<title>The indexing configuration</title>
|
||||||
|
|
||||||
@ -251,14 +282,14 @@
|
|||||||
indexing thread inside the <command>recoll</command>
|
indexing thread inside the <command>recoll</command>
|
||||||
program (use the <guimenu>File</guimenu> menu).
|
program (use the <guimenu>File</guimenu> menu).
|
||||||
|
|
||||||
<para>If the <command>recoll</command> program finds no database
|
<para>If the <command>recoll</command> program finds no index
|
||||||
when it starts, it will automatically start indexing (except
|
when it starts, it will automatically start indexing (except
|
||||||
if cancelled).</para>
|
if cancelled).</para>
|
||||||
|
|
||||||
<para>It is best to avoid interrupting the indexing process, as
|
<para>It is best to avoid interrupting the indexing process, as
|
||||||
this may sometimes leave the database in a bad state. This is
|
this may sometimes leave the database in a bad state. This is
|
||||||
not a serious problem, as you then just need to clear
|
not a serious problem, as you then just need to clear
|
||||||
everything and restart the indexing: the database files are
|
everything and restart the indexing: the index files are
|
||||||
normally stored in the <filename>$HOME/.recoll/xapiandb</filename>
|
normally stored in the <filename>$HOME/.recoll/xapiandb</filename>
|
||||||
directory,
|
directory,
|
||||||
which you can just delete if needed. Alternatively, you can
|
which you can just delete if needed. Alternatively, you can
|
||||||
@ -442,12 +473,13 @@
|
|||||||
</formalpara>
|
</formalpara>
|
||||||
|
|
||||||
<formalpara><title>File names</title>
|
<formalpara><title>File names</title>
|
||||||
<para>All file name elements (the broken up file path) are
|
<para>File names are added as terms during indexing, and you can
|
||||||
entered as terms during indexing, and you can specify them
|
specify them as ordinary terms in normal search fields (&RCL; used
|
||||||
as ordinary terms in normal search fields. Alternatively, you
|
to index all directories in the file path as terms. This has been
|
||||||
|
abandonned as it did not seem really useful). Alternatively, you
|
||||||
can use specific file name search which will
|
can use specific file name search which will
|
||||||
<emphasis>only</emphasis> look for file names and can use
|
<emphasis>only</emphasis> look for file names and can use wildcard
|
||||||
wildcard expansion.</para>
|
expansion.</para>
|
||||||
</formalpara>
|
</formalpara>
|
||||||
|
|
||||||
<formalpara><title>Quitting</title>
|
<formalpara><title>Quitting</title>
|
||||||
@ -487,7 +519,7 @@
|
|||||||
</listitem>
|
</listitem>
|
||||||
|
|
||||||
<listitem><para><guilabel>Html help browser</guilabel>: this
|
<listitem><para><guilabel>Html help browser</guilabel>: this
|
||||||
will let you chose your the preferred browser which will be
|
will let you chose your preferred browser which will be
|
||||||
started from the <guimenu>Help</guimenu> menu to read the user
|
started from the <guimenu>Help</guimenu> menu to read the user
|
||||||
manual. You can enter a simple name if the command is in your
|
manual. You can enter a simple name if the command is in your
|
||||||
PATH, or browse for a full pathname.</para>
|
PATH, or browse for a full pathname.</para>
|
||||||
@ -735,10 +767,8 @@
|
|||||||
they define default values for the system. A parallel set of
|
they define default values for the system. A parallel set of
|
||||||
files exists in the <filename>.recoll</filename> directory in
|
files exists in the <filename>.recoll</filename> directory in
|
||||||
your home (this can be changed with the
|
your home (this can be changed with the
|
||||||
<literal>RECOLL_CONFDIR</literal> environment variable.
|
<literal>RECOLL_CONFDIR</literal> environment variable.</para>
|
||||||
The database is also kept in <filename>.recoll</filename> by
|
|
||||||
default, (this can be changed by a configuration
|
|
||||||
parameter).</para>
|
|
||||||
<para>If the <filename>.recoll</filename> directory does not
|
<para>If the <filename>.recoll</filename> directory does not
|
||||||
exist when <command>recoll</command> or
|
exist when <command>recoll</command> or
|
||||||
<command>recollindex</command> are started, it
|
<command>recollindex</command> are started, it
|
||||||
@ -806,11 +836,11 @@
|
|||||||
configuration file. It defines things like
|
configuration file. It defines things like
|
||||||
what to index (top directories and things to ignore), and the
|
what to index (top directories and things to ignore), and the
|
||||||
default character set to use for document types which do not
|
default character set to use for document types which do not
|
||||||
specify it internally. </para>
|
specify it internally.</para>
|
||||||
|
|
||||||
<para>The default configuration will index your home
|
<para>The default configuration will index your home
|
||||||
directory. If this is not appropriate, use
|
directory. If this is not appropriate, start
|
||||||
<command>recoll</command> to copy the sample
|
<command>recoll</command> to create a blank
|
||||||
configuration, click <guimenu>Cancel</guimenu>, and edit
|
configuration, click <guimenu>Cancel</guimenu>, and edit
|
||||||
the configuration file before restarting the command. This
|
the configuration file before restarting the command. This
|
||||||
will start the initial indexing, which may take some time.</para>
|
will start the initial indexing, which may take some time.</para>
|
||||||
@ -865,8 +895,8 @@
|
|||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry><term><literal>logfilename</literal></term>
|
<varlistentry><term><literal>logfilename</literal></term>
|
||||||
<listitem><para>Where should the messages go. 'stderr' can
|
<listitem><para>Where the messages should go. 'stderr' can
|
||||||
be used as a special value. </para>
|
be used as a special value, and is the default. </para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
@ -899,9 +929,9 @@
|
|||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry><term><literal>dbdir</literal></term>
|
<varlistentry><term><literal>dbdir</literal></term>
|
||||||
<listitem><para>The name of the Xapian database
|
<listitem><para>The name of the Xapian data directory. It
|
||||||
directory. It will be created if needed when the database
|
will be created if needed when the index is
|
||||||
is initialized. </para>
|
initialized. </para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
@ -958,11 +988,6 @@
|
|||||||
executed to determine the mime type (this can be switched off
|
executed to determine the mime type (this can be switched off
|
||||||
inside the main configuration file).</para>
|
inside the main configuration file).</para>
|
||||||
|
|
||||||
<para><filename>mimemap</filename> also has a list of
|
|
||||||
extensions which should be ignored totally (to avoid losing
|
|
||||||
time by executing <command>file</command>
|
|
||||||
for things that certainly should not be indexed).</para>
|
|
||||||
|
|
||||||
<para>The mappings can be specified on a per-subtree basis,
|
<para>The mappings can be specified on a per-subtree basis,
|
||||||
which may be useful in some cases. Example:
|
which may be useful in some cases. Example:
|
||||||
<application>gaim</application> logs have a
|
<application>gaim</application> logs have a
|
||||||
|
|||||||