use indexing instead of indexation
parent c23b7e452b
commit 678e661190
@@ -24,7 +24,7 @@
 Dockes</holder>
 </copyright>
 
-<releaseinfo>$Id: usermanual.sgml,v 1.10 2006-04-05 13:30:00 dockes Exp $</releaseinfo>
+<releaseinfo>$Id: usermanual.sgml,v 1.11 2006-04-07 13:07:34 dockes Exp $</releaseinfo>
 
 <abstract>
 <para>This document introduces full text search notions
@@ -108,11 +108,11 @@
 mature package using <ulink
 url="http://www.xapian.org/docs/intro_ir.html">a sophisticated
 probabilistic ranking model</ulink>. &RCL; provides the interface
-to get data into (indexation) and out (searching) of the system.</para>
+to get data into (indexing) and out (searching) of the system.</para>
 
 <para>In practice, &XAP; works by remembering where terms appear
 in your document files. The acquisition process is called
-indexation. </para>
+indexing. </para>
 
 <para>The resulting database can be big (roughly the size of the
 original document set), but it is not a document
@@ -151,7 +151,7 @@
 giving &RCL; a try, but you may want to adjust it
 later.</para>
 
-<para><link linkend="rcl.indexing.exec">Indexation</link> is started
+<para><link linkend="rcl.indexing.exec">Indexing</link> is started
 automatically the first time you execute the
 <command>recoll</command> search graphical user interface, or by
 executing the <command>recollindex</command> command.</para>
@@ -166,22 +166,22 @@
 
 
 <chapter id="rcl.indexing">
-<title>Indexation</title>
+<title>Indexing</title>
 
 <sect1 id="rcl.indexing.introduction">
 <title>Introduction</title>
 
-<para>Indexation is the process by which the set of documents is
-analyzed and the data entered into the database. &RCL; indexation
+<para>Indexing is the process by which the set of documents is
+analyzed and the data entered into the database. &RCL; indexing
 is normally incremental: documents will only be processed if
 they have been modified. On the first execution, of course, all
 documents will need processing. A full index build can be forced
-later on by specifying an option to the indexation command
+later on by specifying an option to the indexing command
 (<command>recollindex -z</command>).</para>
 
-<para>&RCL; indexation takes place at discrete times. There is
+<para>&RCL; indexing takes place at discrete times. There is
 currently no interface to real time file modification
-monitors. The typical usage is to have a nightly indexation run
+monitors. The typical usage is to have a nightly indexing run
 <link linkend="rcl.indexing.automat">programmed</link> into your
 <command>cron</command> file.</para>
 
@@ -205,7 +205,7 @@
 many individually indexed documents.
 </para>
 
-<para>&RCL; indexation processes plain text, HTML, openoffice
+<para>&RCL; indexing processes plain text, HTML, openoffice
 and e-mail files internally. Other types (ie: postscript, pdf,
 ms-word, rtf) need external applications for preprocessing. The
 list is in the <link
@@ -219,7 +219,7 @@
 </sect1>
 
 <sect1 id="rcl.indexing.config">
-<title>The indexation configuration</title>
+<title>The indexing configuration</title>
 
 <para>Values set in the system-wide configuration file (named
 like
@@ -231,9 +231,9 @@
 
 <para>The most accurate documentation for editing the file is
 given by comments inside the central one. If you want to adjust
-the configuration before indexation, just click
+the configuration before indexing, just click
 <guilabel>Cancel</guilabel> when the program asks if it should
-start initial indexation. This will have created a
+start initial indexing. This will have created a
 <filename>.recoll</filename> directory containing empty
 configuration files.</para>
 
@@ -244,34 +244,34 @@
 </sect1>
 
 <sect1 id="rcl.indexing.exec">
-<title>Starting indexation</title>
+<title>Starting indexing</title>
 
-<para>Indexation is performed either by the
+<para>Indexing is performed either by the
 <command>recollindex</command> program, or by the
-indexation thread inside the <command>recoll</command>
+indexing thread inside the <command>recoll</command>
 program (use the <guimenu>File</guimenu> menu).
 
 <para>If the <command>recoll</command> program finds no database
-when it starts, it will automatically start indexation (except
+when it starts, it will automatically start indexing (except
 if cancelled).</para>
 
-<para>It is best to avoid interrupting the indexation process, as
+<para>It is best to avoid interrupting the indexing process, as
 this may sometimes leave the database in a bad state. This is
 not a serious problem, as you then just need to clear
-everything and restart the indexation: the database files are
+everything and restart the indexing: the database files are
 normally stored in the <filename>$HOME/.recoll/xapiandb</filename>
 directory,
 which you can just delete if needed. Alternatively, you can
 start <command>recollindex -z</command>, which will
-reset the database before indexation.</para>
+reset the database before indexing.</para>
 
 </sect1>
 
 <sect1 id="rcl.indexing.automat">
 <title>Using <command>cron</command> to automate
-indexation</title>
+indexing</title>
 
-<para>The most common way to set up indexation is to have a cron
+<para>The most common way to set up indexing is to have a cron
 task execute it every night. For example the following
 <filename>crontab</filename> entry would do it every day at
 3:30AM (supposing <command>recollindex</command> is in your PATH):</para>
@@ -443,7 +443,7 @@
 
 <formalpara><title>File names</title>
 <para>All file name elements (the broken up file path) are
-entered as terms during indexation, and you can specify them
+entered as terms during indexing, and you can specify them
 as ordinary terms in normal search fields. Alternatively, you
 can use specific file name search which will
 <emphasis>only</emphasis> look for file names and can use
@@ -510,7 +510,7 @@
 file</link>), or later added with
 <command>recollindex -s</command> (See the recollindex
 manual). Stemming languages which are dynamically added will be
-deleted at the next indexation pass unless they are also added in
+deleted at the next indexing pass unless they are also added in
 the configuration file.</para>
 </listitem>
 
@@ -745,7 +745,7 @@
 will be created with a set of empty configuration files.
 <command>recoll</command> will give you a
 chance to edit the configuration file before starting
-indexation. <command>recollindex</command> will
+indexing. <command>recollindex</command> will
 proceed immediately.</para>
 
 <para>Most of the parameters specific to the
@@ -787,7 +787,7 @@
 </itemizedlist>
 
 <para>Section lines allow redefining some parameters for a
-directory subtree. Some of the parameters used for indexation
+directory subtree. Some of the parameters used for indexing
 are looked up hierarchically from the more to the less
 specific. Not all parameters can be meaningfully redefined,
 this is specified for each in the next section. </para>
@@ -813,7 +813,7 @@
 <command>recoll</command> to copy the sample
 configuration, click <guimenu>Cancel</guimenu>, and edit
 the configuration file before restarting the command. This
-will start the initial indexation, which may take some time.</para>
+will start the initial indexing, which may take some time.</para>
 
 <para>Paramers:</para>
 
@@ -824,7 +824,7 @@
 index (recursively for directories). The indexer will not
 follow symbolic links inside the indexed trees. If an entry in
 the <literal>topdirs</literal> list is a symbolic link,
-indexation will not start and will generate an error.</para>
+indexing will not start and will generate an error.</para>
 </listitem>
 </varlistentry>
 
@@ -885,7 +885,7 @@
 possible values. You can add a stem expansion database for
 a different language by using <command>recollindex
 -s</command>, but it will be deleted during the next
-indexation. Only languages listed in the configuration
+indexing. Only languages listed in the configuration
 file are permanent.</para>
 </listitem>
 </varlistentry>
@@ -927,7 +927,7 @@
 type for a file (the main procedure uses suffix
 associations as defined in the <filename>mimemap</filename>
 file). This can be useful for files with suffixless names,
-but it will also cause the indexation of many bogus "text"
+but it will also cause the indexing of many bogus "text"
 files.</para>
 </listitem>
 </varlistentry>
@@ -937,7 +937,7 @@
 section of the database to allow specific file names
 searches using wild cards. This parameter decides if
 file name indexing is performed only for files with mime
-types that would qualify them for full text indexation, or
+types that would qualify them for full text indexing, or
 for all files inside the selected subtrees, independant of
 mime type.</para>
 </listitem>
@@ -985,10 +985,10 @@
 <title>The mimeconf file</title>
 
 <para><filename>mimeconf</filename> specifies how the
-different mime types are handled for indexation, and for
+different mime types are handled for indexing, and for
 display.</para>
 
-<para>Changing the indexation parameters is probably not a
+<para>Changing the indexing parameters is probably not a
 good idea except if you are a &RCL; developper.</para>
 
 <para>You may want to adjust the external viewers defined in