doc updates

Jean-Francois Dockes 2012-03-08 13:23:31 +01:00
parent d40d47ff97
commit 5ff8aadfe9
9 changed files with 69 additions and 430 deletions


@ -195,7 +195,7 @@
<itemizedlist>
<listitem>
<formalpara><title>Periodic indexing:</title>
<formalpara><title>Periodic (or Batch) indexing:</title>
<para>indexing takes place at discrete
times, by executing the <command>recollindex</command>
command. The typical usage is to have a nightly indexing run
@ -336,8 +336,9 @@ recoll
total amount of data on the computer.</para>
<para>The index data directory (<filename>xapiandb</filename>)
only contains data that can be completely rebuilt by an index
run, and it can always be destroyed safely.</para>
only contains data that can be completely rebuilt by an index run
(as long as the original documents exist), and it can always be
destroyed safely.</para>
<sect2 id="rcl.indexing.storage.format">
<title>Xapian index formats</title>
@ -405,7 +406,7 @@ recoll
will be asked whether or not you would like it to build the
index. If you want to adjust the configuration before indexing,
just click <guilabel>Cancel</guilabel> at this point, which will get
you into the configuration interface. If you exit,
you into the configuration interface. If you exit at this point,
<filename>recoll</filename> will have created a ~/.recoll directory
containing empty configuration files, which you can edit by hand.</para>
@ -485,6 +486,12 @@ recoll
<ulink url="https://bitbucket.org/medoc/recoll/wiki/IndexBeagleWeb">
Recoll wiki</ulink>.</para>
<para>Unfortunately, it seems that the plugin does not work anymore
with recent Firefox versions (tried with 10.0). This is not the
trivial installation version check issue: explicit manual indexing
requests still work, but automatic indexing on page load does
not.</para>
</sect1>
<sect1 id="rcl.indexing.periodic">
@ -493,50 +500,26 @@ recoll
<sect2 id="rcl.indexing.periodic.exec">
<title>Running indexing</title>
<para>Indexing is performed either by the
<command>recollindex</command> program, or by the indexing thread
inside the <command>recoll</command> program (start it from the
<guimenu>File</guimenu> menu). Both programs will use the
<para>Indexing is always performed by the
<command>recollindex</command> program, which can be started
either from the command line or from the <guimenu>File</guimenu>
menu in the <command>recoll</command> GUI program. When started
from the GUI, the indexing will run on the same configuration
<command>recoll</command> was started on. When started from the
command line, <command>recollindex</command> will use the
<literal>RECOLL_CONFDIR</literal> variable or accept a
<literal>-c</literal> <replaceable>confdir</replaceable> option
to specify a non-default configuration directory.</para>
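A minimal sketch of the two ways of selecting a non-default configuration directory described above. The directory name is only an example, and the commands are printed rather than executed so that the snippet can be tried on a machine where Recoll is not installed.

```shell
#!/bin/sh
# Example configuration directory (an assumption, use your own).
confdir="$HOME/.recoll-sharedoc"

# Variant 1: select the configuration through the environment.
cmd_env="RECOLL_CONFDIR=$confdir recollindex"

# Variant 2: select the configuration with the -c option.
cmd_opt="recollindex -c $confdir"

# Print the commands instead of running them.
echo "$cmd_env"
echo "$cmd_opt"
```

Either printed line can be pasted into a terminal to run the actual indexing pass.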
<para>There are reasons to use either the indexing thread or the
<command>recollindex</command> command, but it is also a matter of
personal preferences:
<itemizedlist>
<listitem><para>Starting the indexing thread is more convenient,
being just one click away.</para>
</listitem>
<listitem><para>The <command>recollindex</command> command has
more options, especially the one to reset the index
(<literal>-z</literal>).</para>
</listitem>
<listitem><para>The <command>recollindex</command> command will
not take down your GUI if it crashes (a rare occurrence,
but who knows...)</para>
</listitem>
<listitem><para>The <command>recollindex</command> command uses
<command>setpriority/nice</command> to lower its priority
while indexing. When available (and for &RCL; version
1.16.2 and newer), it also uses the
<command>ionice</command> command to lower its IO
priority. The thread can't do it, else it would also slow
down the user/search interface.</para>
</listitem>
</itemizedlist>
</para>
<para>If the <command>recoll</command> program finds no index
when it starts, it will automatically start indexing (except
if canceled).</para>
<para>The <command>recollindex</command> indexing process can be
interrupted by sending an
interrupt (^C, SIGINT) or terminate (SIGTERM) signal. Some time may
elapse before the process exits, because it needs to properly flush
and close the index. The indexing thread can be equivalently
stopped from the menu.</para>
interrupted by sending an interrupt (^C, SIGINT) or terminate
(SIGTERM) signal. Some time may elapse before the process exits,
because it needs to properly flush and close the index. The
indexing thread can be equivalently stopped from the menu.</para>
<para>After such an interruption, the index will be somewhat
inconsistent because some operations which are normally performed
@ -585,6 +568,13 @@ recoll
<programlisting>1 15 su mylogin -c "recollindex recollindex > /tmp/rcltraceme 2>&1"</programlisting>
</para>
<para>As of version 1.17 the &RCL; GUI has dialogs to manage
<filename>crontab</filename> entries for
<command>recollindex</command>. You can reach them from the
<guimenu>Preferences->Indexing Schedule</guimenu> menu. They only
work with the good old <command>cron</command>, and do not give
access to all features of <command>cron</command> scheduling.</para>
<para>The usual command to edit your
<filename>crontab</filename> is
<userinput>crontab -e</userinput> (which will usually start
@ -593,10 +583,11 @@ recoll
system.</para>
<para>Please be aware that there may be differences between your
usual interactive command line environment and the one seen by
crontab commands. Especially the PATH variable may be of
concern. Please check the crontab manual pages about possible
issues.</para>
usual interactive command line environment and the one seen by
crontab commands. Especially the PATH variable may be of
concern. Please check the crontab manual pages about possible
issues.</para>
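One possible way to guard against the PATH difference mentioned above is to call a small wrapper script from the crontab entry instead of calling recollindex directly. The sketch below is only an illustration: the directory list is an assumption and should be adjusted to wherever recollindex is installed on your system.

```shell
#!/bin/sh
# Hypothetical wrapper script, meant to be called from a crontab entry.
# cron usually starts jobs with a minimal environment, so set PATH explicitly.
# The directory list is an assumption: adjust it to where recollindex lives.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

# Check that the command can be found with this PATH before relying on it.
if command -v recollindex >/dev/null 2>&1; then
    status=found
    # A real wrapper would run recollindex at this point.
else
    status=missing
fi
echo "recollindex: $status"
```

Running the wrapper once by hand shows whether the chosen PATH is sufficient before the job is left to cron.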
</sect2>
</sect1>
@ -605,27 +596,28 @@ recoll
<title>Real time indexing</title>
<para>Real time monitoring/indexing is performed by starting the
<command>recollindex -m</command> command. With this option,
<command>recollindex</command> will detach from the terminal and
become a daemon, permanently monitoring file changes and updating
the index.</para>
<command>recollindex -m</command> command. With this option,
<command>recollindex</command> will detach from the terminal and
become a daemon, permanently monitoring file changes and updating
the index.</para>
<para>The real time indexing support can be customised during package
<link linkend="rcl.install.building.build">configuration</link>
with the <literal>--with[out]-fam</literal> or
<literal>--with[out]-inotify</literal> options. The default is
currently to include inotify monitoring on systems that support
it, and, as of recoll 1.17, gamin support on FreeBSD.</para>
<para>Under KDE, Gnome and some other desktop environments, the daemon
can be started automatically when you log in, by creating a desktop
file inside the <filename>~/.config/autostart</filename> directory.
This can be done for you by the &RCL; GUI. Use the
<guimenu>Preferences->Indexing Schedule</guimenu> menu.</para>
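For readers who prefer to create the autostart entry by hand, the following sketch writes a minimal desktop file. The file name and the Name value are hypothetical, and the file that the GUI actually creates may differ. The sketch writes to a scratch directory unless AUTOSTART_DIR is set.

```shell
#!/bin/sh
# Write to a scratch directory by default; set AUTOSTART_DIR to
# "$HOME/.config/autostart" to install the entry for real.
AUTOSTART_DIR="${AUTOSTART_DIR:-${TMPDIR:-/tmp}/recoll-autostart-demo}"
mkdir -p "$AUTOSTART_DIR" || exit 1

# The file name and the Name value are hypothetical.
cat > "$AUTOSTART_DIR/recollindex.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=Recoll real time indexer
Exec=recollindex -m
EOF
echo "wrote $AUTOSTART_DIR/recollindex.desktop"
```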
<para>With older X11 setups, starting the daemon is normally
performed as part of the user session script.</para>
<para>The <filename>rclmon.sh</filename> script can be used to
easily start and stop the daemon. It can be found in the
<filename>examples</filename> directory (typically
<filename>/usr/local/[share/]recoll/examples</filename>).</para>
<para>Starting the daemon is normally performed as part
of the user session script. For example, my out of fashion
xdm-based session has a <filename>.xsession</filename> script
with the following lines at the end:</para>
<para>For example, my out of fashion xdm-based session has a
<filename>.xsession</filename> script with the following lines at
the end:</para>
<programlisting>recollconf=$HOME/.recoll-home
recolldata=/usr/local/share/recoll
@ -636,24 +628,15 @@ fvwm
</programlisting>
<para>The indexing daemon gets started, then the window manager,
for which the session waits.</para> <para>By default the
indexing daemon will monitor the state of the X11 session, and
exit when it finishes, it is not necessary to kill it
explicitly. (The X11 server monitoring can be disabled with option
<literal>-x</literal> to <command>recollindex</command>).
</para>
<para>Under KDE, you can place a small script to start
<command>recollindex -m</command> under
<filename>$HOME/.kde/Autostart</filename>. This will be executed
when the session begins.</para>
<para>There is a similar mechanism under Gnome (find the session
control tool in the menus and use the "Startup programs" tab).</para>
for which the session waits.</para>
<para>By default the
indexing daemon will monitor the state of the X11 session, and
exit when it finishes, so it is not necessary to kill it
explicitly. (The X11 server monitoring can be disabled with option
<literal>-x</literal> to <command>recollindex</command>.)</para>
<para>If you use the daemon completely out of an X11 session, you
need to add option <literal>-x</literal> to disable X11 session
monitoring (else the daemon will not start).</para>
need to add option <literal>-x</literal> to disable X11 session
monitoring (else the daemon will not start).</para>
<para>By default, the messages from the indexing daemon will be
discarded. You may want to change this by setting the
@ -663,6 +646,14 @@ fvwm
daemon runs permanently, the log file may grow quite big, depending
on the log level.</para>
<para>When building &RCL;, the real time indexing support can be
customised during package
<link linkend="rcl.install.building.build">configuration</link>
with the <literal>--with[out]-fam</literal> or
<literal>--with[out]-inotify</literal> options. The default is
currently to include inotify monitoring on systems that support
it, and, as of recoll 1.17, gamin support on FreeBSD.</para>
<para>While it is convenient that data is indexed in real time,
repeated indexing can generate a significant load on the
system when files such as email folders change. Also,
@ -3319,6 +3310,11 @@ while query.next >= 0 and query.next < nres:
real time indexing. Inotify support is enabled by default on
recent Linux systems.</para>
</listitem>
<listitem><para><literal>--disable-webkit</literal> is available
from version 1.17 to implement the result list with a
<application>Qt</application> QTextBrowser instead of a
WebKit widget if you cannot or do not want to depend on the
latter.</para>
</listitem>
<listitem><para><literal>--enable-xattr</literal> will enable
code to fetch data from file extended attributes. This is only
useful if some application stores data in there, and also needs


@ -1,26 +0,0 @@
#!/bin/sh
# Build howto index page from howto subdirs
fatal()
{
echo $*;exit 1
}
#set -x
test -f fraghead.html || \
fatal current directory is not a build directory
cat fraghead.html > index.html
subdirs=`ls -F | grep /`
for dir in $subdirs
do
echo processing $dir
title=`grep '<h1>' $dir/index.html | sed -e 's/<h1>//' -e 's!</h1>!!'`
test "$title" = "" && fatal No title line in $dir/index.html
# Add title/label to list of articles
echo "<li><a href=\"${dir}index.html\">$title</a></li>" >> index.html
done
cat fragend.html >> index.html


@ -1,5 +0,0 @@
</ul>
</div>
</body>
</html>


@ -1,40 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll howtos</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="../styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="../../index.html">Home</a></li>
<li><a href="../../doc.html">Documentation</a></li>
<li><a href="../../download.html">Downloads</a></li>
</ul>
</div>
<div class="content">
<h1>Recoll howtos</h1>
<p>The following short documents contain information
mostly extracted from the main user manual (possibly out of
separate sections), arranged differently in order to explain
how to achieve a given goal.</p>
<ul>


@ -1,47 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll howtos</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="../styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="../../index.html">Home</a></li>
<li><a href="../../doc.html">Documentation</a></li>
<li><a href="../../download.html">Downloads</a></li>
</ul>
</div>
<div class="content">
<h1>Recoll howtos</h1>
<p>The following short documents contain information
mostly extracted from the main user manual (possibly out of
separate sections), arranged differently in order to explain
how to achieve a given goal.</p>
<ul>
<li><a href="prevent_indexing_a_directory/index.html"> Preventing indexing in a given directory</a></li>
<li><a href="use_multiple_indexes/index.html"> Creating and using multiple indexes</a></li>
</ul>
</div>
</body>
</html>


@ -1,22 +0,0 @@
#!/bin/sh
fatal()
{
echo $*; exit 1
}
usage()
{
fatal 'Usage: newdir name'
}
test $# -ge 1 || usage
dir=`echo $* | sed -e 's/ /_/g' -e 's!/!_!g'`
echo dir: $dir
mkdir $dir || fatal mkdir failed
cp -i template.html $dir/index.html
open -a emacs $dir/index.html


@ -1,68 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll howtos</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="../../styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="../../index.html">Home</a></li>
<li><a href="../../doc.html">Documentation</a></li>
<li><a href="../index.html">Howtos</a></li>
</ul>
</div>
<div class="content">
<h1>Preventing indexing in a directory</h1>
<h2>Why would you want to do this ?</h2>
<p>By default, recollindex (or the indexing thread inside the
recoll Qt user interface) will process your home directory and
most of its subdirectories, with the exception of some well-known
places (thumbnails, beagle and web browser caches, etc.).</p>
<p>You may want to prevent indexing in some directories where
you don't expect interesting search results. This will avoid
polluting the search result lists, speed up indexing times and
make the index smaller.</p>
<h2>How to do it</h2>
<p>There are two ways to block indexing at certain points:
either by listing specific paths, or by directory name pattern
matches.</p>
<ul>
<li><em>Blocking specific paths</em>: this is controlled by
the <tt>skippedPaths</tt> variable in the main configuration
file. You can adjust the value either by editing the file or
by using the indexing configuration dialog:
<span class="guimenu">Preferences->Indexing&nbsp;configuration->Global&nbsp;parameters->Skipped&nbsp;paths</span></li>
<li><em>Using pattern matches</em>: these are listed in the
<tt>skippedNames</tt> variable in the main configuration file. You
can adjust the value either by editing the file or by using
the GUI:
<span class="guimenu">Preferences->Indexing&nbsp;configuration->Local&nbsp;parameters->Skipped&nbsp;names</span></li>
</ul>
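As an illustration of the two variables above, the following sketch appends example settings to a configuration file. All the values are made up, and the sketch writes to a scratch file unless CONF is set, so check the manual for the default lists before changing a real configuration.

```shell
#!/bin/sh
# Append to a scratch file by default; point CONF at your real
# configuration (for example "$HOME/.recoll/recoll.conf") once satisfied.
CONF="${CONF:-${TMPDIR:-/tmp}/recoll-skip-demo.conf}"

# Both values are made-up examples: specific paths on the first line,
# name patterns on the second.
cat >> "$CONF" <<'EOF'
skippedPaths = /home/me/tmp /home/me/build
skippedNames = *.o *~ CVS
EOF
echo "updated $CONF"
```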
</div>
</body>
</html>


@ -1,40 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll howtos</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="../../styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="../../index.html">Home</a></li>
<li><a href="../../doc.html">Documentation</a></li>
<li><a href="../index.html">Howtos</a></li>
</ul>
</div>
<div class="content">
<h1>Howto do something</h1>
<h2>Why would you want to do this ?</h2>
<h2>How to do it</h2>
</div>
</body>
</html>


@ -1,109 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll howtos</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="../../styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="../../index.html">Home</a></li>
<li><a href="../../doc.html">Documentation</a></li>
<li><a href="../index.html">Howtos</a></li>
</ul>
</div>
<div class="content">
<h1>Creating and using multiple indexes</h1>
<h2>Why would you want to do this ?</h2>
<ul>
<li>Easy adjustment of search areas: you can filter
results by using the directory filter in the advanced search
panel, but, if you have separate well-defined places where
you store different kinds of data, it is easier to maintain
separate indexes and use the <em>External indexes</em> dialog
to switch them on or off. This will also yield much better
search performance.</li>
<li>Shared indexes: it may be useful to maintain one or
several indexes for shared data, and separate personal
indexes for each user.</li>
</ul>
<h2>How to do it</h2>
<p>As an example we'll suppose that you
have <span class="application">Recoll</span> installed and indexing your
home directory, and that you would like to have a separate index
for <tt class="filename">/usr/share/doc</tt>.</p>
<p>You need to create a separate configuration for the new index,
then add it to the external indexes list in the user
interface, and activate it as needed. </p>
<ol>
<li>Create a directory for the new index:
<pre>cd
mkdir .recoll-sharedoc
</pre>
</li>
<li>Create a minimal configuration file:
<pre>cd .recoll-sharedoc
echo "topdirs = /usr/share/doc" > recoll.conf
</pre>
</li>
<li>Perform initial indexing:
<pre>recollindex -c ~/.recoll-sharedoc</pre>
</li>
<li>Optionally set up cron to perform nightly indexing. Use
<pre>crontab -e</pre>
and insert a line like the following:
<pre>45 20 * * * recollindex -c ~/.recoll-sharedoc</pre>
This would start the indexing at
20:45. <tt class="command">crontab&nbsp;-e</tt> will use the
<tt class="command">vi</tt> editor by default; you can
change this by setting the <tt class="command">EDITOR</tt>
environment variable. Example:
<pre>EDITOR=kate crontab -e</pre>
Your favorite desktop may also have a dedicated tool to add
<tt class="filename">crontab</tt> entries.
</li>
<li>Start <tt class="command">recoll</tt> and choose
the <span
class="guimenu">Preferences->External&nbsp;index&nbsp;dialog</span>
menu entry, then click the <span class="guilabel">Browse</span> button
(near the bottom), and select the new index Xapian database
directory:
<pre><tt class="filename">~/.recoll-sharedoc/xapiandb</tt></pre>
Then click <span class="guilabel">Add&nbsp;index</span>.
</li>
<li>You can then activate or deactivate the new index by
clicking the box in front of the directory name in the list.
</li>
</ol>
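The first steps of the list above can be gathered into one small script. This is only a sketch: it works in a scratch directory unless CONFDIR is set, and RUN_INDEX is a made-up switch for this example which keeps the indexing pass from starting by accident.

```shell
#!/bin/sh
# Steps 1 and 2 from the list above, written to a scratch directory by
# default. Set CONFDIR="$HOME/.recoll-sharedoc" to follow the howto exactly.
CONFDIR="${CONFDIR:-${TMPDIR:-/tmp}/recoll-sharedoc-demo}"

mkdir -p "$CONFDIR" || exit 1
echo "topdirs = /usr/share/doc" > "$CONFDIR/recoll.conf"

# Step 3 only runs on request (RUN_INDEX is a made-up switch for this sketch).
if [ "${RUN_INDEX:-no}" = yes ]; then
    recollindex -c "$CONFDIR"
else
    echo "configuration ready, index it with: recollindex -c $CONFDIR"
fi
```

For example, `CONFDIR="$HOME/.recoll-sharedoc" RUN_INDEX=yes sh script.sh` would create the configuration in the place used by the howto and then perform the initial indexing (the script name is also just an example).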
<p>When adding an index shared by multiple users, it may
be helpful to use
the <tt class="variable">RECOLL_EXTRA_DBS</tt> environment
variable instead of editing individual configurations, see the
manual for more details.</p>
</div>
</body>
</html>