use indexing instead of indexation

dockes 2006-04-07 13:07:34 +00:00
parent c23b7e452b
commit 678e661190

@ -24,7 +24,7 @@
Dockes</holder> Dockes</holder>
</copyright> </copyright>
<releaseinfo>$Id: usermanual.sgml,v 1.10 2006-04-05 13:30:00 dockes Exp $</releaseinfo> <releaseinfo>$Id: usermanual.sgml,v 1.11 2006-04-07 13:07:34 dockes Exp $</releaseinfo>
<abstract> <abstract>
<para>This document introduces full text search notions <para>This document introduces full text search notions
@ -108,11 +108,11 @@
mature package using <ulink mature package using <ulink
url="http://www.xapian.org/docs/intro_ir.html">a sophisticated url="http://www.xapian.org/docs/intro_ir.html">a sophisticated
probabilistic ranking model</ulink>. &RCL; provides the interface probabilistic ranking model</ulink>. &RCL; provides the interface
to get data into (indexation) and out (searching) of the system.</para> to get data into (indexing) and out (searching) of the system.</para>
<para>In practice, &XAP; works by remembering where terms appear <para>In practice, &XAP; works by remembering where terms appear
in your document files. The acquisition process is called in your document files. The acquisition process is called
indexation. </para> indexing. </para>
<para>The resulting database can be big (roughly the size of the <para>The resulting database can be big (roughly the size of the
original document set), but it is not a document original document set), but it is not a document
@ -151,7 +151,7 @@
giving &RCL; a try, but you may want to adjust it giving &RCL; a try, but you may want to adjust it
later.</para> later.</para>
<para><link linkend="rcl.indexing.exec">Indexation</link> is started <para><link linkend="rcl.indexing.exec">Indexing</link> is started
automatically the first time you execute the automatically the first time you execute the
<command>recoll</command> search graphical user interface, or by <command>recoll</command> search graphical user interface, or by
executing the <command>recollindex</command> command.</para> executing the <command>recollindex</command> command.</para>
@ -166,22 +166,22 @@
<chapter id="rcl.indexing"> <chapter id="rcl.indexing">
<title>Indexation</title> <title>Indexing</title>
<sect1 id="rcl.indexing.introduction"> <sect1 id="rcl.indexing.introduction">
<title>Introduction</title> <title>Introduction</title>
<para>Indexation is the process by which the set of documents is <para>Indexing is the process by which the set of documents is
analyzed and the data entered into the database. &RCL; indexation analyzed and the data entered into the database. &RCL; indexing
is normally incremental: documents will only be processed if is normally incremental: documents will only be processed if
they have been modified. On the first execution, of course, all they have been modified. On the first execution, of course, all
documents will need processing. A full index build can be forced documents will need processing. A full index build can be forced
later on by specifying an option to the indexation command later on by specifying an option to the indexing command
(<command>recollindex -z</command>).</para> (<command>recollindex -z</command>).</para>
<para>&RCL; indexation takes place at discrete times. There is <para>&RCL; indexing takes place at discrete times. There is
currently no interface to real time file modification currently no interface to real time file modification
monitors. The typical usage is to have a nightly indexation run monitors. The typical usage is to have a nightly indexing run
<link linkend="rcl.indexing.automat">programmed</link> into your <link linkend="rcl.indexing.automat">programmed</link> into your
<command>cron</command> file.</para> <command>cron</command> file.</para>
@ -205,7 +205,7 @@
many individually indexed documents. many individually indexed documents.
</para> </para>
<para>&RCL; indexation processes plain text, HTML, openoffice <para>&RCL; indexing processes plain text, HTML, openoffice
and e-mail files internally. Other types (ie: postscript, pdf, and e-mail files internally. Other types (ie: postscript, pdf,
ms-word, rtf) need external applications for preprocessing. The ms-word, rtf) need external applications for preprocessing. The
list is in the <link list is in the <link
@ -219,7 +219,7 @@
</sect1> </sect1>
<sect1 id="rcl.indexing.config"> <sect1 id="rcl.indexing.config">
<title>The indexation configuration</title> <title>The indexing configuration</title>
<para>Values set in the system-wide configuration file (named <para>Values set in the system-wide configuration file (named
like like
@ -231,9 +231,9 @@
<para>The most accurate documentation for editing the file is <para>The most accurate documentation for editing the file is
given by comments inside the central one. If you want to adjust given by comments inside the central one. If you want to adjust
the configuration before indexation, just click the configuration before indexing, just click
<guilabel>Cancel</guilabel> when the program asks if it should <guilabel>Cancel</guilabel> when the program asks if it should
start initial indexation. This will have created a start initial indexing. This will have created a
<filename>.recoll</filename> directory containing empty <filename>.recoll</filename> directory containing empty
configuration files.</para> configuration files.</para>
@ -244,34 +244,34 @@
</sect1> </sect1>
<sect1 id="rcl.indexing.exec"> <sect1 id="rcl.indexing.exec">
<title>Starting indexation</title> <title>Starting indexing</title>
<para>Indexation is performed either by the <para>Indexing is performed either by the
<command>recollindex</command> program, or by the <command>recollindex</command> program, or by the
indexation thread inside the <command>recoll</command> indexing thread inside the <command>recoll</command>
program (use the <guimenu>File</guimenu> menu). program (use the <guimenu>File</guimenu> menu).
<para>If the <command>recoll</command> program finds no database <para>If the <command>recoll</command> program finds no database
when it starts, it will automatically start indexation (except when it starts, it will automatically start indexing (except
if cancelled).</para> if cancelled).</para>
<para>It is best to avoid interrupting the indexation process, as <para>It is best to avoid interrupting the indexing process, as
this may sometimes leave the database in a bad state. This is this may sometimes leave the database in a bad state. This is
not a serious problem, as you then just need to clear not a serious problem, as you then just need to clear
everything and restart the indexation: the database files are everything and restart the indexing: the database files are
normally stored in the <filename>$HOME/.recoll/xapiandb</filename> normally stored in the <filename>$HOME/.recoll/xapiandb</filename>
directory, directory,
which you can just delete if needed. Alternatively, you can which you can just delete if needed. Alternatively, you can
start <command>recollindex -z</command>, which will start <command>recollindex -z</command>, which will
reset the database before indexation.</para> reset the database before indexing.</para>
</sect1> </sect1>
<sect1 id="rcl.indexing.automat"> <sect1 id="rcl.indexing.automat">
<title>Using <command>cron</command> to automate <title>Using <command>cron</command> to automate
indexation</title> indexing</title>
<para>The most common way to set up indexation is to have a cron <para>The most common way to set up indexing is to have a cron
task execute it every night. For example the following task execute it every night. For example the following
<filename>crontab</filename> entry would do it every day at <filename>crontab</filename> entry would do it every day at
3:30AM (supposing <command>recollindex</command> is in your PATH):</para> 3:30AM (supposing <command>recollindex</command> is in your PATH):</para>
@ -443,7 +443,7 @@
<formalpara><title>File names</title> <formalpara><title>File names</title>
<para>All file name elements (the broken up file path) are <para>All file name elements (the broken up file path) are
entered as terms during indexation, and you can specify them entered as terms during indexing, and you can specify them
as ordinary terms in normal search fields. Alternatively, you as ordinary terms in normal search fields. Alternatively, you
can use specific file name search which will can use specific file name search which will
<emphasis>only</emphasis> look for file names and can use <emphasis>only</emphasis> look for file names and can use
@ -510,7 +510,7 @@
file</link>), or later added with file</link>), or later added with
<command>recollindex -s</command> (See the recollindex <command>recollindex -s</command> (See the recollindex
manual). Stemming languages which are dynamically added will be manual). Stemming languages which are dynamically added will be
deleted at the next indexation pass unless they are also added in deleted at the next indexing pass unless they are also added in
the configuration file.</para> the configuration file.</para>
</listitem> </listitem>
@ -745,7 +745,7 @@
will be created with a set of empty configuration files. will be created with a set of empty configuration files.
<command>recoll</command> will give you a <command>recoll</command> will give you a
chance to edit the configuration file before starting chance to edit the configuration file before starting
indexation. <command>recollindex</command> will indexing. <command>recollindex</command> will
proceed immediately.</para> proceed immediately.</para>
<para>Most of the parameters specific to the <para>Most of the parameters specific to the
@ -787,7 +787,7 @@
</itemizedlist> </itemizedlist>
<para>Section lines allow redefining some parameters for a <para>Section lines allow redefining some parameters for a
directory subtree. Some of the parameters used for indexation directory subtree. Some of the parameters used for indexing
are looked up hierarchically from the more to the less are looked up hierarchically from the more to the less
specific. Not all parameters can be meaningfully redefined, specific. Not all parameters can be meaningfully redefined,
this is specified for each in the next section. </para> this is specified for each in the next section. </para>
@ -813,7 +813,7 @@
<command>recoll</command> to copy the sample <command>recoll</command> to copy the sample
configuration, click <guimenu>Cancel</guimenu>, and edit configuration, click <guimenu>Cancel</guimenu>, and edit
the configuration file before restarting the command. This the configuration file before restarting the command. This
will start the initial indexation, which may take some time.</para> will start the initial indexing, which may take some time.</para>
<para>Paramers:</para> <para>Paramers:</para>
@ -824,7 +824,7 @@
index (recursively for directories). The indexer will not index (recursively for directories). The indexer will not
follow symbolic links inside the indexed trees. If an entry in follow symbolic links inside the indexed trees. If an entry in
the <literal>topdirs</literal> list is a symbolic link, the <literal>topdirs</literal> list is a symbolic link,
indexation will not start and will generate an error.</para> indexing will not start and will generate an error.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
@ -885,7 +885,7 @@
possible values. You can add a stem expansion database for possible values. You can add a stem expansion database for
a different language by using <command>recollindex a different language by using <command>recollindex
-s</command>, but it will be deleted during the next -s</command>, but it will be deleted during the next
indexation. Only languages listed in the configuration indexing. Only languages listed in the configuration
file are permanent.</para> file are permanent.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
@ -927,7 +927,7 @@
type for a file (the main procedure uses suffix type for a file (the main procedure uses suffix
associations as defined in the <filename>mimemap</filename> associations as defined in the <filename>mimemap</filename>
file). This can be useful for files with suffixless names, file). This can be useful for files with suffixless names,
but it will also cause the indexation of many bogus "text" but it will also cause the indexing of many bogus "text"
files.</para> files.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
@ -937,7 +937,7 @@
section of the database to allow specific file names section of the database to allow specific file names
searches using wild cards. This parameter decides if searches using wild cards. This parameter decides if
file name indexing is performed only for files with mime file name indexing is performed only for files with mime
types that would qualify them for full text indexation, or types that would qualify them for full text indexing, or
for all files inside the selected subtrees, independant of for all files inside the selected subtrees, independant of
mime type.</para> mime type.</para>
</listitem> </listitem>
@ -985,10 +985,10 @@
<title>The mimeconf file</title> <title>The mimeconf file</title>
<para><filename>mimeconf</filename> specifies how the <para><filename>mimeconf</filename> specifies how the
different mime types are handled for indexation, and for different mime types are handled for indexing, and for
display.</para> display.</para>
<para>Changing the indexation parameters is probably not a <para>Changing the indexing parameters is probably not a
good idea except if you are a &RCL; developper.</para> good idea except if you are a &RCL; developper.</para>
<para>You may want to adjust the external viewers defined in <para>You may want to adjust the external viewers defined in