diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml
index 17ef2ef5..36b0548a 100644
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -140,20 +140,20 @@
currently makes no attempt at automatic language recognition.
&RCL; has many parameters which define exactly what to
- index, and how to classify and decode the source
- documents. These are kept in configuration files. A
- default configuration is copied into a standard location
- (usually something like
- /usr/[local/]share/recoll/examples)
- during installation. The default parameters from this file may
- be overridden by values that you set inside your personal
- configuration, found by default in the
- .recoll sub-directory of your home
- directory. The default configuration will index your home
- directory with default parameters and should be sufficient for
- giving &RCL; a try, but you may want to adjust it
- later.
+ index, and how to classify and decode the source documents. These
+ are kept in configuration
+ files. A default configuration is copied into a standard
+ location (usually something like
+ /usr/[local/]share/recoll/examples) during
+ installation. The default parameters from this file may be
+ overridden by values that you set inside your personal
+ configuration, found by default in the .recoll
+ sub-directory of your home directory. The default configuration
+ will index your home directory with default parameters and should
+ be sufficient for giving &RCL; a try, but you may want to adjust it
+ later, which can be done either by editing the text files or by
+ using configuration menus in the recoll
+ GUI.
Indexing
is started automatically the first time you execute the
@@ -184,7 +184,7 @@
Indexing is the process by which the set of documents is
analyzed and the data entered into the database. &RCL; indexing
is normally incremental: documents will only be processed if
- they have been modified. On the first execution, of course, all
+ they have been modified. On the first execution, all
documents will need processing. A full index build can be forced
later by specifying an option to the indexing command
(recollindex -z).
@@ -238,7 +238,7 @@
a folder file archived inside a zip file...
&RCL; indexing processes plain text, HTML, openoffice
- and e-mail files internally (a few more actually).
+ and e-mail files, and a few others internally.
Other file types (ie: postscript, pdf, ms-word, rtf ...)
need external applications for preprocessing. The list is in the
@@ -342,40 +342,23 @@ recoll
Xapian index formats
- If your first installation of &RCL; was 1.9.0 or more
- recent, you can skip this section.
-
- &XAP; has had two possible index formats for quite some
- time. The "old" one named Quartz, and the
- new one named Flint. &XAP; 0.9 used
- Quartz by default, but could use
- Flint if a specific environment variable
- (XAPIAN_PREFER_FLINT) was set. &XAP; 1.0
- still supports Quartz but will use
- Flint by default for new index
- creations.
-
- The number of disk accesses performed during indexing
- has been much optimized in the new Flint
- engine and you may see indexing times improved by 50% in some
- cases (compared to Quartz), typically for
- big indexes where disk accesses dominate the indexing
- time. There is also a more modest improvement of index
- size.
+ &XAP; versions usually support several formats for index
+ storage. A given major &XAP; version will have a current format,
+ used to create new indexes, and will also support the format from
+ the previous major version.
&XAP; will not convert automatically an existing index
- from the Quartz to the
- Flint format. If you have an older index
- and want to take advantage of the new format (which can be
- done without setting the environment variable as of &RCL;
- 1.8.2 and &XAP; 1.0.0), you will have to explicitly delete
- the old index, then run a normal indexing process.
+ from the older format to the newer one. If you want to upgrade to
+ the new format, or if a very old index needs to be converted
+ because its format is not supported any more, you will have to
+ explicitly delete the old index, then run a normal indexing
+ process.
Unfortunately, using the -z option to
recollindex is not sufficient to change the
- format, you have to delete all files inside the index
+ format; you will have to delete all files inside the index
directory (typically ~/.recoll/xapiandb)
- before starting indexing.
+ before starting the indexing.
@@ -387,7 +370,7 @@ recoll
complete reconstruction. If confidential data is indexed,
access to the database directory should be restricted.
- As of version 1.4, &RCL; will create the configuration
+ &RCL; (since version 1.4) will create the configuration
directory with a mode of 0700 (access by owner only). As the
index data directory is by default a sub-directory of the
configuration directory, this should result in appropriate
@@ -511,16 +494,16 @@ recoll
Running indexing
Indexing is performed either by the
- recollindex program, or by the
- indexing thread inside the recoll
- program (use the File menu). Both programs
- will use the RECOLL_CONFDIR
- variable or accept a -c
- confdir option to specify a non-default
- configuration directory.
+ recollindex program, or by the indexing thread
+ inside the recoll program (start it from the
+ File menu). Both programs will use the
+ RECOLL_CONFDIR variable or accept a
+ -c confdir option
+ to specify a non-default configuration directory.
- Reasons to use either the indexing thread or the
- recollindex command:
+ There are reasons to use either the indexing thread or the
+ recollindex command, but it is also a matter of
+ personal preference:
Starting the indexing thread is more convenient,
being just one click away.
@@ -534,14 +517,15 @@ recoll
but who knows...)
The recollindex command uses
- setpriority/nice to lower its priority while
- indexing
- (it will also use ionice when this becomes
- more widely available), the thread can't do it, else it would
- also slow down the user/search interface.
+ setpriority/nice to lower its priority
+ while indexing. When available (and for &RCL; version
+ 1.16.2 and newer), it also uses the
+ ionice command to lower its I/O
+ priority. The thread cannot do this, as it would also slow
+ down the user/search interface.
- I'll let the reader decide where my heart belongs...
+
If the recoll program finds no index
when it starts, it will automatically start indexing (except
@@ -631,7 +615,7 @@ recoll
with the --with[out]-fam or
--with[out]-inotify options. The default is
currently to include inotify monitoring on systems that support
- it.
+ it, and, as of recoll 1.17, gamin support on FreeBSD.
The rclmon.sh script can be used to
easily start and stop the daemon. It can be found in the
@@ -1311,19 +1295,13 @@ fvwm
Sorting search results and collapsing duplicates
The documents in a result list are normally sorted in
- order of relevance. It is possible to specify different sort
- parameters by using the Sort parameters
- dialog (located in the Tools menu).
-
- The tool sorts a specified number of the most
- relevant documents in the result list, according to specified
- criteria. The currently available criteria are
- date and mime
- type.
-
- The sort parameters stay in effect until they are
- explicitly reset, or the program exits. An activated sort is
- indicated in the result list header.
+ order of relevance. It is possible to specify a different sort
+ order, either by using the vertical arrows in the GUI toolbox to
+ sort by date, or by switching to the result table display and clicking
+ on any header. The sort order chosen inside the result table
+ remains active if you switch back to the result list, until you
+ click one of the vertical arrows, or until both are unchecked (you are
+ back to sort by relevance).
Sort parameters are remembered between program
invocations, but result sorting is normally always inactive
@@ -1427,15 +1405,34 @@ fvwm
AutoPhrases
This option can be set in the preferences dialog. If it is
- set, a phrase will be automatically built and added to simple
- searches when looking for Any terms. This
- will not change radically the results, but will give a relevance
- boost to the results where the search terms appear as a
- phrase. Ie: searching for virtual reality
- will still find all documents where either
- virtual or reality or
- both appear, but those which contain virtual
- reality should appear sooner in the list.
+ set, a phrase will be automatically built and added to simple
+ searches when looking for Any terms. This
+ will not radically change the results, but will give a relevance
+ boost to the results where the search terms appear as a
+ phrase. For example, searching for virtual reality
+ will still find all documents where either
+ virtual or reality or
+ both appear, but those which contain virtual
+ reality should appear sooner in the list.
+
+ Phrase searches can strongly slow down a query if most of the
+ terms in the phrase are common. This is why the
+ autophrase option is off by default for &RCL;
+ versions before 1.17. As of version 1.17,
+ autophrase is on by default, but very common
+ terms will be removed from the constructed phrase. The removal
+ threshold can be adjusted from the search preferences.
+
+ Phrases and abbreviations
+ As of
+ &RCL; version 1.17, dotted abbreviations like
+ I.B.M. are also automatically indexed as a word
+ without the dots: IBM. Searching for the word
+ inside a phrase (for example, "the IBM company") will only
+ match the dotted abbreviation if you increase the phrase slack (using the
+ advanced search panel control, or the o query
+ language modifier). Literal occurrences of the word will be matched
+ normally.
+
@@ -3406,6 +3403,13 @@ skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \
skippedPaths = ~/somedir/∗.txt
+ The values in the *skippedPaths
+ variables are currently matched with
+ fnmatch(3), with the FNM_PATHNAME and
+ FNM_LEADING_DIR flags. This means that '/' characters must
+ be matched explicitly, which is probably
+ unfortunate.
+