doc

2019-04-14 16:18:39 +02:00 · 2019-04-14 16:18:39 +02:00 · 567aaa2035
commit 567aaa2035
parent 48bc71da70
4 changed files with 875 additions and 1037 deletions
--- a/src/doc/man/recoll.conf.5
+++ b/src/doc/man/recoll.conf.5
@ -54,12 +54,20 @@ home directory.
 Where values are lists, white space is used for separation, and elements with
 embedded spaces can be quoted with double-quotes.
 .SH OPTIONS
 .TP
 .BI "topdirs = "string
 Space-separated list of files or
 directories to recursively index. Defaults to ~ (indexes
 $HOME). You can use symbolic links in the list, they will be followed,
-independently of the value of the followLinks variable.
+independently of the value of the followLinks variable.
 .TP
 .BI "monitordirs = "string
 Space-separated list of files or directories to monitor for
 updates. When running the real-time indexer, this allows monitoring only a
 subset of the whole indexed area. The elements must be included in the
 tree defined by the 'topdirs' members.
 .TP
 .BI "skippedNames = "string
 Files and directories which should be ignored. 
@ -69,13 +77,21 @@ names.  The list in the default configuration does not exclude hidden
 directories (names beginning with a dot), which means that it may index
 quite a few things that you do not want. On the other hand, email user
 agents like Thunderbird usually store messages in hidden directories, and
-you probably want this indexed. One possible solution is to have '.*'
+you probably want this indexed. One possible solution is to have ".*" in
-in 'skippedNames', and add things like '~/.thunderbird' '~/.evolution'
+"skippedNames", and add things like "~/.thunderbird" "~/.evolution" to
-to 'topdirs'.  Not even the file names are indexed for patterns in this
+"topdirs".  Not even the file names are indexed for patterns in this
-list, see the 'noContentSuffixes' variable for an alternative approach
+list, see the "noContentSuffixes" variable for an alternative approach
 which indexes the file names. Can be redefined for any
 subtree.
 .TP
 .BI "skippedNames- = "string
 List of name endings to remove from the default skippedNames
 list. 
 .TP
 .BI "skippedNames+ = "string
 List of name endings to add to the default skippedNames
 list. 
 .TP
 .BI "noContentSuffixes = "string
 List of name endings (not necessarily dot-separated suffixes) for
 which we don't try MIME type identification, and don't uncompress or
@ -87,38 +103,59 @@ from skippedNames because these are name ending matches only (not
 wildcard patterns), and the file name itself gets indexed normally. This
 can be redefined for subdirectories.
 .TP
 .BI "noContentSuffixes- = "string
 List of name endings to remove from the default noContentSuffixes
 list. 
 .TP
 .BI "noContentSuffixes+ = "string
 List of name endings to add to the default noContentSuffixes
 list. 
 .TP
 .BI "skippedPaths = "string
-Paths we should not go into. Space-separated list of
+Absolute paths we should not go into. Space-separated list of wildcard expressions for absolute
-wildcard expressions for filesystem paths. Can contain files and
+filesystem paths. Must be defined at the top level of the configuration
-directories. The database and configuration directories will
+file, not in a subsection. Can contain files and directories. The database and
-automatically be added. The expressions are matched using 'fnmatch(3)'
+configuration directories will automatically be added. The expressions
-with the FNM_PATHNAME flag set by default. This means that '/' characters
+are matched using 'fnmatch(3)' with the FNM_PATHNAME flag set by
-must be matched explicitly. You can set 'skippedPathsFnmPathname' to 0
+default. This means that '/' characters must be matched explicitly. You
-to disable the use of FNM_PATHNAME (meaning that '/*/dir3' will
+can set 'skippedPathsFnmPathname' to 0 to disable the use of FNM_PATHNAME
-match '/dir1/dir2/dir3').  The default value contains the usual mount point
+(meaning that '/*/dir3' will match '/dir1/dir2/dir3'). The default value
-for removable media to remind you that it is a bad idea to have Recoll work
+contains the usual mount point for removable media to remind you that it
-on these (esp. with the monitor: media gets indexed on mount, all data
+is a bad idea to have Recoll work on these (esp. with the monitor: media
-gets erased on unmount).  Explicitly adding '/media/xxx' to the topdirs
+gets indexed on mount, all data gets erased on unmount). Explicitly
-will override this.
+adding '/media/xxx' to the 'topdirs' variable will override
 this.
 .TP
 .BI "skippedPathsFnmPathname = "bool
 Set to 0 to
 override use of FNM_PATHNAME for matching skipped
 paths. 
 .TP
 .BI "nowalkfn = "string
 File name which will cause its parent directory to be skipped. Any directory containing a file with this name will be skipped as
 if it was part of the skippedPaths list. Ex: .recoll-noindex
 .TP
 .BI "daemSkippedPaths = "string
 skippedPaths equivalent specific to
 real time indexing. This enables having parts of the tree
 which are initially indexed but not monitored. If daemSkippedPaths is
 not set, the daemon uses skippedPaths.
 .TP
 .BI "zipUseSkippedNames = "bool
 Use skippedNames inside Zip archives. Fetched
 directly by the rclzip handler. Skip the patterns defined by skippedNames
 inside Zip archives. Can be redefined for subdirectories.
 See https://www.lesbonscomptes.com/recoll/faqsandhowtos/FilteringOutZipArchiveMembers.html
 .TP
 .BI "zipSkippedNames = "string
 Space-separated list of wildcard expressions for names that should
 be ignored inside zip archives. This is used directly by
-the zip handler, and has a function similar to skippedNames, but works
+the zip handler. If zipUseSkippedNames is not set, zipSkippedNames
-independently. Can be redefined for subdirectories. Supported by recoll
+defines the patterns to be skipped inside archives. If zipUseSkippedNames
-1.20 and newer. See
+is set, the two lists are concatenated and used. Can be redefined for
-https://bitbucket.org/medoc/recoll/wiki/Filtering%20out%20Zip%20archive%20members
+subdirectories.
 See https://www.lesbonscomptes.com/recoll/faqsandhowtos/FilteringOutZipArchiveMembers.html
 .TP
 .BI "followLinks = "bool
@ -133,16 +170,27 @@ followed.
 .BI "indexedmimetypes = "string
 Restrictive list of
 indexed mime types. Normally not set (in which case all
-supported types are indexed). If it is set,
+supported types are indexed). If it is set, only the types from the list
-only the types from the list will have their contents indexed. The names
+will have their contents indexed. The names will be indexed anyway if
-will be indexed anyway if indexallfilenames is set (default). MIME
+indexallfilenames is set (default). MIME type names should be taken from
-type names should be taken from the mimemap file. Can be redefined for
+the mimemap file (the values may be different from xdg-mime or file -i
-subtrees.
+output in some cases). Can be redefined for subtrees.
 .TP
 .BI "excludedmimetypes = "string
 List of excluded MIME
-types. Lets you exclude some types from indexing. Can be
+types. Lets you exclude some types from indexing. MIME type
-redefined for subtrees.
+names should be taken from the mimemap file (the values may be different
 from xdg-mime or file -i output in some cases). Can be redefined for
 subtrees.
 .TP
 .BI "nomd5types = "string
 Don't compute md5 for these types. md5 checksums are used only for deduplicating results, and can be
 very expensive to compute on multimedia or other big files. This list
 lets you turn off md5 computation for selected types. It is global (no
 redefinition for subtrees). At the moment, it only has an effect for
 external handlers (exec and execm). The file types can be specified by
 listing either MIME types (e.g. audio/mpeg) or handler names
 (e.g. rclaudio).
 .TP
 .BI "compressedfilemaxkbs = "int
 Size limit for compressed
@ -173,9 +221,9 @@ for the command used.
 Command used to guess
 MIME types if the internal methods fail. This should be a
 "file -i" workalike.  The file path will be added as a last parameter to
-the command line. 'xdg-mime' works better than the traditional 'file'
+the command line. "xdg-mime" works better than the traditional "file"
-command, and is now the configured default (with a hard-coded fallback
+command, and is now the configured default (with a hard-coded fallback to
-to 'file')
+"file")
 .TP
 .BI "processwebqueue = "bool
 Decide if we process the
@ -204,6 +252,34 @@ will be bigger, and some marginal weirdness may sometimes occur. The
 default is a stripped index. When using multiple indexes for a search,
 this parameter must be defined identically for all. Changing the value
 implies an index reset.
 .TP
 .BI "indexStoreDocText = "bool
 Decide if we store the
 documents' text content in the index. Storing the text
 allows extracting snippets from it at query time, instead of building
 them from index position data.
 Newer Xapian index formats have rendered our use of position lists
 unacceptably slow in some cases. The last Xapian index format with good
 performance for the old method is Chert, which is default for 1.2, still
 supported but not default in 1.4 and will be dropped in 1.6.
 The stored document text is translated from its original format to UTF-8
 plain text, but not stripped of upper-case, diacritics, or punctuation
 signs. Storing it increases the index size by 10-20% typically, but also
 allows for nicer snippets, so it may be worth enabling it even if not
 strictly needed for performance if you can afford the space.
 The variable only has an effect when creating an index, meaning that the
 xapiandb directory must not exist yet. Its exact effect depends on the
 Xapian version.
 For Xapian 1.4, if the variable is set to 0, the Chert format will be
 used, and the text will not be stored. If the variable is 1, Glass will
 be used, and the text stored.
 For Xapian 1.2, and for versions 1.5 and newer, the index format is
 always the default, but the variable controls if the text is stored or
 not, and the abstract generation method. With Xapian 1.5 and later, and
 the variable set to 0, abstract generation may be very slow, but this
 setting may still be useful to save space if you do not use abstract
 generation at all.
 .TP
 .BI "nonumbers = "bool
 Decides if terms will be
@ -216,9 +292,19 @@ will reduce the index size. This can only be set for a whole index, not
 for a subtree.
 .TP
 .BI "dehyphenate = "bool
-Determines if we index 'coworker' also when the input is 'co-worker'.
+Determines if we index
-This is new in version 1.22, and on by default. Setting the variable to off
+'coworker' also when the input is 'co-worker'. This is new
-allows restoring the previous behaviour.
+in version 1.22, and on by default. Setting the variable to off allows
 restoring the previous behaviour.
 .TP
 .BI "backslashasletter = "bool
 Process backslash as a normal letter. This may make sense for people wanting to index TeX commands as
 such but is not of much general use.
 .TP
 .BI "maxtermlength = "int
 Maximum term length. Words longer than this will be discarded.
 The default is 40 and used to be hard-coded, but it can now be
 adjusted. You need an index reset if you change the value.
 .TP
 .BI "nocjk = "bool
 Decides if specific East Asian
@ -263,24 +349,16 @@ lowercase and upper-case versions of a character should be specified, as
 membership in the list will turn off both standard accent and case
 processing. The value is global and affects both indexing and querying.
 Examples:
 Swedish:
 unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl åå Åå
-
+. German:
 German:
 unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl
 In French, you probably want to decompose oe and ae and nobody would type
 a German ß
 unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl
-
+. The default for all until someone protests follows. These decompositions
 The default for all until someone protests follows. These decompositions
 are not performed by unac, but it is unlikely that someone would type the
 composed forms in a search.
 unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl
 .TP
 .BI "maildefcharset = "string
@ -352,7 +430,7 @@ over which we stop indexing. The value is a percentage,
 corresponding to what the "Capacity" df output column shows. The default
 value is 0, meaning no checking.
 .TP
-.BI "xapiandb = "dfn
+.BI "dbdir = "dfn
 Xapian database directory
 location. This will be created on first indexing. If the
 value is not an absolute path, it will be interpreted as relative to
@ -386,9 +464,17 @@ Default: 40 MB.
 Reducing the size will not physically truncate the file.
 .TP
 .BI "webqueuedir = "fn
-The path to the Web indexing queue. This is
+The path to the Web indexing queue. This used to be
-hard-coded in the plugin as ~/.recollweb/ToIndex so there should be no
+hard-coded in the old plugin as ~/.recollweb/ToIndex so there would be no
-need or possibility to change it.
+need or possibility to change it, but the WebExtensions plugin now downloads
 the files to the user Downloads directory, and a script moves them to
 webqueuedir. The script reads this value from the config so it has become
 possible to change it.
 .TP
 .BI "webdownloadsdir = "fn
 The path to browser downloads directory. This is
 where the new browser add-on extension has to create the files. They are
 then moved by a script to webqueuedir.
 .TP
 .BI "aspellDicDir = "dfn
 Aspell dictionary storage directory location. The
@ -415,10 +501,11 @@ which lets Xapian perform its own thing, meaning flushing every
 $XAPIAN_FLUSH_THRESHOLD documents created, modified or deleted: as memory
 usage depends on average document size, not only document count, the
 Xapian approach is not very useful, and you should let Recoll manage
-the flushes.  The default value of idxflushmb is 10 MB, and may be a bit
+the flushes. The program compiled value is 0. The configured default
-low. If you are looking for maximum speed, you may want to experiment
+value (from this file) is now 50 MB, and should be ok in many cases.
-with values between 20 and
+You can set it as low as 10 to conserve memory, but if you are looking
-80. In my experience, values beyond 100 are always counterproductive. If
+for maximum speed, you may want to experiment with values between 20 and
 200. In my experience, values beyond this are always counterproductive. If
 you find otherwise, please drop me a note.
 .TP
 .BI "filtermaxseconds = "int
@ -481,6 +568,25 @@ Override logfilename for the indexer in real time
 mode. The default is to use the idx... values if set, else
 the log... values.
 .TP
 .BI "orgidxconfdir = "dfn
 Original location of the configuration directory. This is used exclusively for movable datasets. Locating the
 configuration directory inside the directory tree makes it possible to
 provide automatic query time path translations once the data set has
 moved (for example, because it has been mounted on another
 location).
 .TP
 .BI "curidxconfdir = "dfn
 Current location of the configuration directory. Complement orgidxconfdir for movable datasets. This should be used
 if the configuration directory has been copied from the dataset to
 another location, either because the dataset is readonly and an r/w copy
 is desired, or for performance reasons. This records the original moved
 location before copy, to allow path translation computations.  For
 example if a dataset originally indexed as '/home/me/mydata/config' has
 been mounted to '/media/me/mydata', and the GUI is running from a copied
 configuration, orgidxconfdir would be '/home/me/mydata/config', and
 curidxconfdir (as set in the copied configuration) would be
 '/media/me/mydata/config'.
 .TP
 .BI "idxrundir = "dfn
 Indexing process current directory. The input
 handlers sometimes leave temporary files in the current directory, so it
@ -519,6 +625,12 @@ amount of data stored in the index for the purpose of displaying fields
 inside result lists or previews. The default value is 150 bytes which
 may be too low if you have custom fields.
 .TP
 .BI "idxtexttruncatelen = "int
 Truncation length for all document texts. Only index
 the beginning of documents. This is not recommended except if you are
 sure that the interesting keywords are at the top and have severe disk
 space issues.
 .TP
 .BI "aspellLanguage = "string
 Language definitions to use when creating the aspell
 dictionary. The value must match a set of aspell language
@ -612,16 +724,39 @@ Attempt OCR of PDF files with no text content if both tesseract and
 pdftoppm are installed. The default is off because OCR is so
 very slow.
 .TP
 .BI "pdfocrlang = "string
 Language to assume for PDF OCR. This is very important for having a reasonable rate of errors
 with tesseract. This can also be set through a configuration variable
 or directory-local parameters. See the rclpdf.py script.
 .TP
 .BI "pdfattach = "bool
 Enable PDF attachment extraction by executing pdftk (if
 available). This is
 normally disabled, because it does slow down PDF indexing a bit even if
 not one attachment is ever found.
 .TP
 .BI "pdfextrameta = "string
 Extract text from selected XMP metadata tags. This
 is a space-separated list of qualified XMP tag names. Each element can also
 include a translation to a Recoll field name, separated by a '|'
 character. If the second element is absent, the tag name is used as the
 Recoll field name. You will also need to add specifications to the
 "fields" file to direct processing of the extracted data.
 .TP
 .BI "pdfextrametafix = "fn
 Define name of XMP field editing script. This
 defines the name of a script to be loaded for editing XMP field
 values. The script should define a 'MetaFixer' class with a metafix()
 method which will be called with the qualified tag name and value of each
 selected field, for editing or erasing. A new instance is created for
 each document, so that the object can keep state for, e.g. eliminating
 duplicate values.
 .TP
 .BI "mhmboxquirks = "string
 Enable thunderbird/mozilla-seamonkey mbox format quirks. Set this for the directory where the email mbox files are
 stored.
 .SH SEE ALSO
 .PP
 recollindex(1) recoll(1)
--- a/src/doc/user/usermanual.html
+++ b/src/doc/user/usermanual.html
--- a/src/doc/user/usermanual.xml
+++ b/src/doc/user/usermanual.xml
@ -8,6 +8,7 @@
 <!ENTITY RCLVERSION "1.25">
 <!ENTITY XAP "<application>Xapian</application>">
 <!ENTITY WIN "<application>Windows</application>">
 <!ENTITY LIN "<application>Unix</application>-like systems">
 <!ENTITY FAQS "https://www.lesbonscomptes.com/recoll/faqsandhowtos/">
 ]>
@ -89,7 +90,7 @@
        </menuchoice>, then adjust the <guilabel>Top
      directories</guilabel> section).</para>
-      <para>On Unix/Linux, you may need to install the
+      <para>On &LIN;, you may need to install the
      appropriate
      <link linkend="RCL.INSTALL.EXTERNAL">supporting applications</link> 
      for document types that need them (for
@ -177,16 +178,13 @@
      <para>The &XAP; index can be big (roughly the size of the original
      document set), but it is not a document archive. &RCL; can only
      display documents that still exist at the place from which they were
-      indexed. (Actually, there is a way to reconstruct a document from the
+      indexed.</para>
      information in the index, but only the pure text is saved, possibly
      without punctuation and capitalization, depending on &RCL;
      version).</para>
      <para>&RCL; stores all internal data in <application>Unicode
-      UTF-8</application> format, and it can index files of many types
+      UTF-8</application> format, and it can index many types of files
      with different character sets, encodings, and languages into the
      same index. It can process documents embedded inside other
-      documents (for example a pdf document stored inside a Zip
+      documents (for example a PDF document stored inside a Zip
      archive sent as an email attachment...), down to an arbitrary
      depth.</para>
@ -233,25 +231,17 @@
      <link linkend="RCL.INDEXING.CONFIG.SENS">index case and diacritics sensitivity</link>.
      </para>
-      <para>&RCL; has many parameters which define exactly what to
+      <para>&RCL; uses many parameters to define exactly what to index,
-      index, and how to classify and decode the source
+      and how to classify and decode the source documents. These are kept
-      documents. These are kept in
+      in <link linkend="RCL.INDEXING.CONFIG">configuration files</link>.  A
-      <link linkend="RCL.INDEXING.CONFIG">configuration files</link>. 
+      default configuration is copied into a standard location (usually
-      A default configuration is copied into a standard location
+      something like <filename>/usr/share/recoll/examples</filename>)
-      (usually something like
+      during installation. The default values set by the configuration
-      <filename>/usr/share/recoll/examples</filename>)
+      files in this directory may be overridden by values set inside your
-      during installation. The default values set by the
+      personal configuration. With the default configuration, &RCL; will
-      configuration files in this directory may be overridden by
+      index your home directory with generic parameters. The configuration
-      values set inside your personal configuration, found
+      can be customized either by editing the text files or by using
-      by default in the <filename>.recoll</filename> sub-directory
+      configuration menus in the <command>recoll</command> GUI.</para>
      of your home directory. The default configuration will index
      your home directory with default parameters and should be
      sufficient for giving &RCL; a try, but you may want to adjust
      it later, which can be done either by editing the text files
      or by using configuration menus in the
      <command>recoll</command> GUI. Some other parameters affecting only
      the <command>recoll</command> GUI are stored in the standard
      location defined by <application>Qt</application>.</para>
      <para>The <link linkend="RCL.INDEXING.PERIODIC.EXEC">indexing process</link>
      is started automatically (after asking permission), the
@ -265,7 +255,7 @@
      <para><link linkend="RCL.SEARCH">Searches</link> are usually
      performed inside the <command>recoll</command> GUI, which has many
      options to help you find what you are looking for. However, there
-      are other ways to perform &RCL; searches:
+      are other ways to query the index:
      <itemizedlist>
        <listitem><para>A
        <link linkend="RCL.SEARCH.COMMANDLINE">command line interface</link>.
@ -328,41 +318,44 @@
      <sect2 id="RCL.INDEXING.INTRODUCTION.MODES">
        <title>Indexing modes</title> 
-        <para>&RCL; indexing can be performed along two main modes:
+        <para>&RCL; indexing can be performed along two main modes:</para>
        <itemizedlist>
          <listitem>
-            <formalpara>
+            <formalpara><title>
-              <title><link linkend="RCL.INDEXING.PERIODIC">Periodic (or batch) indexing:</link></title>
+              <link linkend="RCL.INDEXING.PERIODIC">Periodic (or batch) indexing</link>
            </title>
            <para><command>recollindex</command> is executed
-              at discrete times. The typical usage is to have a nightly run
+            at discrete times. On &LIN;, the typical usage is to have a
-              <link linkend="RCL.INDEXING.PERIODIC.AUTOMAT">programmed</link> into
+            nightly run 
-              your <command>cron</command> file.</para>
+            <link linkend="RCL.INDEXING.PERIODIC.AUTOMAT">programmed</link>
            into your <command>cron</command> file. On &WIN;, this is
            the only mode available, and the indexer is usually started
            from the GUI (but there is nothing to prevent starting it
            from a command script).</para>
            </formalpara>
          </listitem>
          <listitem>
-            <formalpara><title><link linkend="RCL.INDEXING.MONITOR">Real time indexing:</link></title>
+            <formalpara><title>
-            <para><command>recollindex</command> runs permanently as a
+              <link linkend="RCL.INDEXING.MONITOR">Real time indexing</link>
-            daemon and uses a file system alteration monitor
+            </title>
            <para>(Only available on &LIN;). <command>recollindex</command> runs
            permanently as a daemon and uses a file system alteration monitor
            (e.g. <application>inotify</application>) to detect file
-            changes. New or updated files are indexed at once.</para>
+            changes. New or updated files are indexed at once. Monitoring a
            big file system tree can consume 
            significant system resources. </para>
            </formalpara>
          </listitem>
        </itemizedlist>
        </para>
        <simplesect><title>&LIN;: choosing an indexing mode</title>
        <para>The choice between the two methods is mostly a matter of
        preference, and they can be combined by setting up multiple
        indexes (ie: use periodic indexing on a big documentation
        directory, and real time indexing on a small home
-        directory). Monitoring a big file system tree can consume
+        directory), or, with &RCL; 1.24 and newer, by
-        significant system resources.</para>
+        <link linkend="RCL.INDEXING.MONITOR">configuring the index so that only a subset of the tree will be monitored.</link>
-
+        </para>
        <para>With &RCL; 1.24 and newer, it is also possible to set up an
        index so that only a subset of the tree will be monitored and the
        rest will be covered by batch/incremental indexing.  (See the
        details in the <link linkend="RCL.INDEXING.MONITOR">Real time indexing</link>
         section.)</para>
        <para>The choice of method and the parameters used can be
        configured from the <command>recoll</command> GUI:
        <menuchoice>
@ -370,21 +363,7 @@
          <guimenuitem>Indexing schedule</guimenuitem>
        </menuchoice>
        </para>
-
+        </simplesect>
        <para>The GUI <menuchoice><guimenu>File</guimenu>
        </menuchoice> menu also has entries to start or stop
        the current indexing operation. Stopping indexing is performed by
        killing the <command>recollindex</command> process, which will
        checkpoint its state and exit. A later restart of indexing will
        mostly resume from where things stopped (the file tree walk has to
        be restarted from the beginning).</para>
        <para>When the real time indexer is running, two operations are
        available from the menu: 'Stop' and 'Trigger incremental pass'.
        When no indexing is running, you have a choice of updating the
        index or rebuilding it (the first choice only processes changed
        files, the second one zeroes the index before starting so that all
        files are processed).</para>
      </sect2>
@ -396,11 +375,13 @@
        in which several configuration files describe
        what should be indexed and how.</para>
-        <para>A default personal configuration directory
+        <para>When <command>recoll</command> or
-        (<filename>$HOME/.recoll/</filename>) is created
+        <command>recollindex</command> is first executed, it creates a
-        when a &RCL; program is first executed. This configuration is
+        default configuration directory. This configuration is the one used
-        the one used for indexing and querying when no specific
+        for indexing and querying when no specific configuration is
-        configuration is specified.</para>
+        specified. It is located in <filename>$HOME/.recoll/</filename> for
        &LIN; and <filename>%LOCALAPPDATA%</filename> on &WIN;
        (typically <filename>C:\Users\[me]\Appdata\Local</filename>).</para>
        <para>All configuration parameters have defaults, defined in
        system-wide files. Without further customisation, the default
@ -431,33 +412,6 @@
        machines), and then merging them, or querying them in
        parallel.</para>
        <para>A specific configuration can be selected by setting the
        <envar>RECOLL_CONFDIR</envar> environment variable, or giving the
        <option>-c</option> option to any of the &RCL; commands.</para>
        <para>When creating or updating indexes, the different
         configurations are entirely independent (no parameters are ever
        shared between configurations when indexing). The
        <command>recollindex</command> program always works on a single
        index.</para>
        <para>When querying, multiple indexes can be accessed concurrently,
        either from the GUI or the command line. When doing this, there is
        always one main configuration, from which both configuration and
        index data are used. Only the index data from the additional
        indexes is used (their configuration parameters are
        ignored).</para>
        <para>The behaviour of index update and query regarding multiple
        configurations is important and sometimes confusing, so it will be
        rephrased here: for index generation, multiple configurations are
         totally independent from each other. When querying, configuration
        and data are used from the main index (the one designated by
        <literal>-c</literal> or <envar>RECOLL_CONFDIR</envar>), and only
        the data from the additional indexes is used. This implies
        that some parameters should be consistent among the configurations
        for indexes which are to be used together.</para>
        <para>See the section about
        <link linkend="RCL.INDEXING.CONFIG.MULTIPLE">configuring multiple indexes</link>
         for more detail.</para>
@ -751,27 +705,26 @@
      <link linkend="RCL.INDEXING.CONFIG.GUI">dialogs in the <command>recoll</command> GUI</link>.
      </para>
-      <para>The first time you start <command>recoll</command>, you
+      <para>The first time you start <command>recoll</command>, you will be
-      will be asked whether or not you would like it to build the
+      asked whether or not you would like it to build the index. If you
-      index. If you want to adjust the configuration before
+      want to adjust the configuration before indexing, just click
-      indexing, just click <guilabel>Cancel</guilabel> at this
+      <guilabel>Cancel</guilabel> at this point, which will get you into
-      point, which will get you into the configuration interface. If
+      the configuration interface. If you exit at this point,
-      you exit at this point, <filename>recoll</filename> will have
+      <filename>recoll</filename> will have created a default configuration
-      created a <filename>~/.recoll</filename> directory containing
+      directory with empty configuration files, which you can then
-      empty configuration files, which you can edit by hand.</para>
+      edit.</para>
      <para>The configuration is documented inside the 
      <link linkend="RCL.INSTALL.CONFIG">installation chapter</link> 
      of this document, or in the
-      <citerefentry>
+      <ulink url="https://www.lesbonscomptes.com/recoll/manpages/recoll.conf.5.html"><citerefentry><refentrytitle>recoll.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry></ulink>
-        <refentrytitle>recoll.conf</refentrytitle>
+      manual page. Both documents are automatically generated from
-        <manvolnum>5</manvolnum>
+      the comments inside the configuration file.</para>
-      </citerefentry>
+
-      man page, but the most current information will most likely be the
+      <para>The most immediately useful variable
      comments inside the sample file. The most immediately useful variable
      is probably
      <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.TOPDIRS"><varname>topdirs</varname></link>, 
-      which determines what subtrees and files get indexed.</para>
+      which lists the subtrees and files to be indexed.</para>
      <para>The applications needed to index file types other than
      text, HTML or email (e.g. pdf, postscript, ms-word...) are
@ -789,67 +742,62 @@
        <para>Multiple &RCL; indexes can be created by using several
        configuration directories which are typically set to index
-        different areas of the file system. A specific index can be
+        different areas of the file system.</para>
-        selected for updating or searching, using the
+
-        <envar>RECOLL_CONFDIR</envar> environment variable or the
+        <para>A specific index can be selected by setting the
        <envar>RECOLL_CONFDIR</envar> environment variable or giving the
        <option>-c</option> option to <command>recoll</command> and
        <command>recollindex</command>.</para>
-        <para>Index configuration parameters can be set either by using a
+        <para>The <command>recollindex</command> program, used for creating
-        text editor on the files, or, for most parameters, by using the
+        or updating indexes, always works on a single index. The different
-        <command>recoll</command> index configuration GUI. In the latter
+        configurations are entirely independent (no parameters are ever
-        case, the configuration directory for which parameters are modified
+        shared between configurations when indexing). </para>
        is the one which was selected by <envar>RECOLL_CONFDIR</envar> or
        the <option>-c</option> parameter, and there is no way to switch
        configurations within the GUI.</para>
-        <para>As a remainder from a previous section, a
+        <para>All the search interfaces (<command>recoll</command>,
        <command>recollindex</command> program instance can only update one
        specific index, and it will only use parameters from a single
        configuration (no parameters are ever shared between configurations
        when indexing). All the query methods (<command>recoll</command>,
        <command>recollq</command>, the Python API, etc.) operate with a
        main configuration, from which both configuration and index data
-        are used, but can also query data from multiple additional
+        are used, and can also query data from multiple additional
        indexes. Only the index data from the latter is used, their
-        configuration parameters are ignored.</para>
+        configuration parameters are ignored. This implies that some
        parameters should be consistent among index configurations which
        are to be used together.</para>
        <para>When searching, the current main index (defined by
        <envar>RECOLL_CONFDIR</envar> or <option>-c</option>) is always
        active. If this is undesirable, you can set up your base
        configuration to index an empty directory.</para>
-        <para>If a set of multiple indexes are to be used together for
+        <para>Index configuration parameters can be set either by using a
-        searches, some configuration parameters must be consistent
+        text editor on the files, or, for most parameters, by using the
-        among the set. These are parameters which need to be the same
+        <link linkend="RCL.INDEXING.CONFIG.GUI"><command>recoll</command> index configuration GUI</link>.
-        when indexing and searching. As the parameters come from the
+        In the latter case, the configuration directory for which
-        main configuration when searching, they need to be compatible
+        parameters are modified is the one which was selected by
-        with what was set when creating the other indexes (which came
+        <envar>RECOLL_CONFDIR</envar> or the <option>-c</option> parameter,
-        from their respective configuration directories).</para>
+        and there is no way to switch configurations within the GUI.</para>
-        <para>Most importantly, all indexes to be queried concurrently must
+        <para>See the <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF">configuration section</link>
-        have the same option concerning character case and diacritics
+        for a detailed description of the parameters.</para>
-        stripping, but there are other constraints. Most of the
+
-        relevant parameters are described in the 
+        <para>Some configuration parameters must be consistent among a set
-        <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.TERMS">linked section</link>.
+        of multiple indexes used together for searches.  Most importantly,
        all indexes to be queried concurrently must have the same option
        concerning character case and diacritics stripping, but there are
        other constraints. Most of the relevant parameters affect the
        <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.TERMS">term generation</link>.
        </para>
-        <para>The different search interfaces (GUI, command line, ...)
+        <para>Using multiple configurations implies a small
-        have different methods to define the set of indexes to be
+        level of command line or file manager usage. The user must
-        used, see the appropriate section.</para>
+        explicitly create additional configuration directories; the GUI
        will not do it. This is to avoid mistakenly creating additional
        directories when an argument is mistyped. Also, the GUI or the
        indexer must be launched with a specific option or environment to
        work on the right configuration.</para>
-        <para>At the moment, using multiple configurations implies a small
+        <simplesect>
-        level of command line usage. Additional configuration directories
+          <title>In practice: creating and using an additional index</title>
        (beyond <filename>~/.recoll</filename>) must be created by hand
        (<command>mkdir</command> or such), the GUI will not do it. This is
        to avoid mistakenly creating additional directories when an
        argument is mistyped. Also, the GUI or the indexer must be launched
        with a specific option or environment to work on the right
        configuration.</para>
        <para>To be more practical, here follow a few examples of the
        commands needed to create, configure, update, and query an additional
        index.</para>
        <para>Initially creating the configuration and index:<programlisting>
 mkdir <replaceable>/path/to/my/new/config</replaceable></programlisting></para>
@ -858,15 +806,19 @@ mkdir <replaceable>/path/to/my/new/config</replaceable></programlisting></para>
        <command>recoll</command> GUI, launched from the
        command line to pass the <literal>-c</literal> option
        (you could create a desktop file to do it for you), and then using the
-        GUI index configuration tool to set up the index.
+        <link linkend="RCL.INDEXING.CONFIG.GUI">GUI index configuration tool</link>
        to set up the index.
        <programlisting>
 recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
 </para>
         <para>Alternatively, you can just start a text editor on the main
-         configuration file
+         configuration file:
-         <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF"><filename>recoll.conf</filename></link>.</para> 
+         <programlisting>
 <replaceable>someEditor</replaceable> <replaceable>/path/to/my/new/config</replaceable>/<link linkend="RCL.INSTALL.CONFIG.RECOLLCONF"><filename>recoll.conf</filename></link>
 </programlisting>
         </para>
 <para>Creating and updating the index can be done from the command line:
@ -891,7 +843,7 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
          <guimenu>Preferences</guimenu>
          <guimenuitem>External Index Dialog</guimenuitem> 
        </menuchoice> menu.</para> 
-
+      </simplesect>
      </sect2>
@ -911,9 +863,8 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        the index. With a stripped index, the search term will be stripped
        before searching.</para>
-        <para>A raw index allows for another possibility which a stripped
+        <para>A raw index allows using case and diacritics to discriminate
-        index cannot offer: using case and diacritics to discriminate
+        between terms, e.g., returning different results when searching for
        between terms, returning different results when searching for
        <literal>US</literal> and <literal>us</literal> or
        <literal>resume</literal> and <literal>résumé</literal>.
        Read the
@ -927,15 +878,14 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        automated by &RCL;), and all indexes in a search must be set
        in the same way (again, not checked by &RCL;). </para>
-        <para>If the <literal>indexStripChars</literal> is not set, &RCL;
+        <para>&RCL; creates a stripped index by default if
-        1.18 creates a stripped index by default, for
+        <literal>indexStripChars</literal> is not set.</para>
        compatibility with previous versions.</para>
        <para>As a cost for added capability, a raw index will be slightly
        bigger than a stripped one (around 10%). Also, searches will be
        more complex, so probably slightly slower, and the feature is
-        still young, so that a certain amount of weirdness cannot be
+        relatively little used, so that a certain amount of weirdness
-        excluded.</para> 
+        cannot be excluded.</para>
        <para>One of the most adverse consequences of using a raw index
        is that some phrase and proximity searches may become
@ -950,7 +900,7 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
      <sect2 id="RCL.INDEXING.CONFIG.THREADS">
-        <title>Indexing threads configuration</title>
+        <title>Indexing threads configuration (&LIN;)</title>
        <para>The &RCL; indexing process 
        <command>recollindex</command> can use multiple threads to
@ -1363,7 +1313,7 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
    <sect1 id="RCL.INDEXING.PERIODIC">
      <title>Periodic indexing</title>
-      <sect2 id="RCL.INDEXING.PERIODIC.EXEC">
+      <simplesect id="RCL.INDEXING.PERIODIC.EXEC">
        <title>Running indexing</title>
        <para>Indexing is always performed by the
@ -1381,19 +1331,36 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        when it starts, it will automatically start indexing (except
        if canceled).</para>
-        <para>The <command>recollindex</command> indexing process can be
+        <para>The GUI <menuchoice><guimenu>File</guimenu> </menuchoice>
-        interrupted by sending an interrupt (<keysym>Ctrl-C</keysym>, 
+        menu has entries to start or stop the current indexing
-        SIGINT) or terminate
+        operation.</para>
-        (SIGTERM) signal. Some time may elapse before the process exits,
+
-        because it needs to properly flush and close the index. This can
+        <para>When no indexing is running, you have a choice of updating the
-        also be done from the <command>recoll</command> GUI
+        index or rebuilding it (the first choice only processes changed
        files, the second one zeroes the index before starting so that all
        files are processed).</para>
        <para>On Linux, the <command>recollindex</command> indexing process
        can be interrupted by sending an interrupt
        (<keysym>Ctrl-C</keysym>, SIGINT) or terminate (SIGTERM)
        signal. 
        </para>
        <para>On Linux and Windows, the GUI can be used to manage the indexing
        operation. Stopping the indexer can be done
        from the <command>recoll</command> GUI
        <menuchoice>
          <guimenu>File</guimenu>
          <guimenuitem>Stop Indexing</guimenuitem>
        </menuchoice>
-        menu entry.</para>
+        menu entry.
        </para>
-        <para>After such an interruption, the index will be somewhat
+        <para>When stopped, some time may elapse before
        <command>recollindex</command> exits, because it needs to properly
        flush and close the index.</para>
        <para>After an interruption, the index will be somewhat
        inconsistent because some operations which are normally
        performed at the end of the indexing pass will have been
        skipped (for example, the stemming and spelling databases
@ -1404,9 +1371,11 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        to the interruption and for which the index is still up to
        date will not need to be reindexed).</para>
-        <para><command>recollindex</command> has a number of other options
+        <para><command>recollindex</command> has many options
-        which are described in its man page. Only a few will be
+        which are listed in its
-        described here.</para>
+        <ulink url="https://www.lesbonscomptes.com/recoll/manpages/recollindex.1.html">manual page</ulink>.
        Only a few will be described here.</para>
        <para>Option <option>-z</option> will reset the index when
        starting. This is almost the same as destroying the index
        files (the nuance is that the &XAP; format version will not
@ -1446,11 +1415,10 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        but just add them as index entries. It is
        up to the external file selection method to build the complete
        file list.</para>
-      </sect2>
+      </simplesect>
-      <sect2 id="RCL.INDEXING.PERIODIC.AUTOMAT">
+      <simplesect id="RCL.INDEXING.PERIODIC.AUTOMAT">
-        <title>Using <command>cron</command> to automate
+        <title>Linux: using <command>cron</command> to automate indexing</title>
        indexing</title>
        <para>The most common way to set up indexing is to have a cron
        task execute it every night. For example the following
@ -1468,7 +1436,7 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        ]]></screen>
        </para>
-        <para>As of version 1.17 the &RCL; GUI has dialogs to manage
+        <para>The &RCL; GUI has dialogs to manage
        <filename>crontab</filename> entries for
        <command>recollindex</command>. You can reach them from the
        <menuchoice>
@ -1492,11 +1460,11 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        issues.</para>
-      </sect2>
+      </simplesect>
    </sect1>
    <sect1 id="RCL.INDEXING.MONITOR">
-      <title>Real time indexing</title>
+      <title>&LIN;: real time indexing</title>
      <para>Real time monitoring/indexing is performed by starting the
      <command>recollindex</command> <option>-m</option> command.
@ -1504,6 +1472,11 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
      from the terminal and become a daemon, permanently monitoring
      file changes and updating the index.</para>
      <para>In this situation, the <command>recoll</command> GUI
      <menuchoice><guimenu>File</guimenu></menuchoice> menu
      makes two operations available: 'Stop' and 'Trigger incremental pass'.
      </para>
      <para>While it is convenient that data is indexed in real time,
      repeated indexing can generate a significant load on the
      system when files such as email folders change. Also,
@ -1522,8 +1495,8 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
      process. The <command>recoll</command> GUI also has a menu entry for
      this.</para>
-      <sect2 id="RCL.INDEXING.MONITOR.START">
+      <simplesect id="RCL.INDEXING.MONITOR.START">
-        <title>Real time indexing: automatic daemon start</title>
+        <title>Automatic daemon start</title>
        <para>Under <application>KDE</application>,
        <application>Gnome</application> and some other desktop
@ -1542,17 +1515,15 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        <filename>examples</filename> directory (typically
        <filename>/usr/local/[share/]recoll/examples</filename>).</para>
-        <para>For example, my out of fashion
+        <para>For example, a good old <application>xdm</application>-based
-        <application>xdm</application>-based session has a
+        session could have a <filename>.xsession</filename> script with the
-        <filename>.xsession</filename> script with the following lines
+        following lines at the end:</para>
        at the end:</para>
        <programlisting>recollconf=$HOME/.recoll-home
 recolldata=/usr/local/share/recoll
 RECOLL_CONFDIR=$recollconf $recolldata/examples/rclmon.sh start
 fvwm 
 </programlisting>
        <para>The indexing daemon gets started, then the window manager,
@ -1567,10 +1538,10 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        <application>X11</application> session, you need to add option
        <option>-x</option> to disable <application>X11</application>
        session monitoring (else the daemon will not start).</para>
-      </sect2>
+      </simplesect>
-      <sect2 id="RCL.INDEXING.MONITOR.DETAILS">
+      <simplesect id="RCL.INDEXING.MONITOR.DETAILS">
-        <title>Real time indexing: miscellaneous details</title>
+        <title>Miscellaneous details</title>
        <para>By default, the messages from the indexing daemon will be
        sent to the same file as those from the interactive commands
@ -1581,17 +1552,7 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        the daemon runs permanently, the log file may grow quite big,
        depending on the log level.</para>
-        <para>When building &RCL;, the real time indexing support can be
+        <formalpara><title>Increasing resources for inotify</title>
        customised during package
        <link linkend="RCL.INSTALL.BUILDING">configuration</link>
        with the <option>--with[out]-fam</option> or
        <option>--with[out]-inotify</option> options.  The default is
        currently to include <application>inotify</application>
        monitoring on systems that support it, and, as of &RCL; 1.17,
        <application>gamin</application> support on
        <application>FreeBSD</application>.</para>
        <note><title>Increasing resources for inotify</title>
        <para>On Linux systems, monitoring a big tree may need
        increasing the resources available to inotify, which are
        normally defined in <filename>/etc/sysctl.conf</filename>.
@ -1609,29 +1570,28 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
 fs.inotify.max_user_watches=32768
        </programlisting>
-        </para>
+        Especially, you will need to trim your tree or adjust
        <para>Especially, you will need to trim your tree or adjust
        the <literal>max_user_watches</literal> value if indexing exits with
        a message about errno <literal>ENOSPC</literal> (28) from
-        <function>inotify_add_watch</function>.</para>
+        <function>inotify_add_watch</function>.
-        </note>
+        </para>
        </formalpara>
-        <note><title>Slowing down the reindexing rate for fast changing
+        <formalpara><title>Slowing down the reindexing rate for fast changing
        files</title>
        <para>When using the real time monitor, it may happen that some
        files need to be indexed, but change so often that they impose an
-        excessive load for the system.</para>
+        excessive load for the system.
-        <para>&RCL; provides a configuration option to specify the minimum
+        &RCL; provides a configuration option to specify the minimum
        time before which a file, specified by a wildcard pattern, cannot be
        reindexed. See the <varname>mondelaypatterns</varname> parameter in
        the <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.MISC">configuration section</link>.
        </para>
-        </note>
+        </formalpara>
-      </sect2>
+      </simplesect>
    </sect1>
@ -1660,12 +1620,9 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        </listitem>
      </itemizedlist>
-      <para>In most cases, you can enter the terms as you
+      <para>In most cases, you can enter the terms as you think them, even
-      think them, even if they contain embedded punctuation or other
+      if they contain embedded punctuation or other non-textual characters
-      non-textual characters. For
+      (e.g. &RCL; can handle things like email addresses).</para>
      example, &RCL; can handle things like email addresses, or
      arbitrary cut and paste from another text window, punctuation
      and all.</para>
      <para>The main case where you should enter text differently from
      how it is printed is for east-asian languages (Chinese,
@ -1674,10 +1631,10 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
      case (they would typically be printed without white
      space).</para>
-      <para>Some searches can be quite complex, and you may want to
+      <para>Some searches can be quite complex, and you may want to re-use
-      re-use them later, perhaps with some tweaking. &RCL; versions
+      them later, perhaps with some tweaking. &RCL; can save and restore
-      1.21 and later can save and restore searches, using XML files. See 
+      searches. See <link linkend="RCL.SEARCH.SAVING">Saving and restoring
-      <link linkend="RCL.SEARCH.SAVING">Saving and restoring queries</link>.
+      queries</link>.
      </para>
      <sect2 id="RCL.SEARCH.GUI.SIMPLE">
@ -1704,12 +1661,9 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        documents containing all of the search terms (the ones with more
        terms will get better scores), just like the <guilabel>All
        terms</guilabel> mode. <guilabel>Any term</guilabel> will search
-        for documents where at least one of the terms appear.</para>
+        for documents where at least one of the terms
-
+        appears. <guilabel>File name</guilabel> will exclusively look for
-        <para>The <guilabel>Query Language</guilabel> features are
+        file names, not contents.</para>
        described in
        <link linkend="RCL.SEARCH.LANG">a separate section</link>.
        </para>  
        <para>All search modes allow terms to be expanded with wildcard
        characters (<literal>*</literal>, <literal>?</literal>,
@ -1717,11 +1671,21 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        <link linkend="RCL.SEARCH.WILDCARDS">section about wildcards</link> for
        more details.</para>
        <para>In all modes except <guilabel>File name</guilabel>, you can
        search for exact phrases (adjacent words in a given order) by
        enclosing the input inside double quotes. Ex:
        <literal>"virtual reality"</literal>.</para>
        <para>The <guilabel>Query Language</guilabel> features are
        described in
        <link linkend="RCL.SEARCH.LANG">a separate section</link>.
        </para>  
        <para>The <guilabel>File name</guilabel> search mode will
        specifically look for file names. The point of having a separate
        file name search is that wild card expansion can be performed more
        efficiently on a small subset of the index (allowing wild cards on
-        the left of terms without excessive penality).  Things to know:
+        the left of terms without excessive cost). Things to know:
        <itemizedlist>
          <listitem><para>White space in the entry should match white
          space in the file name, and is not treated specially.</para>
@ -1743,11 +1707,6 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
        </itemizedlist>
        </para>
        <para>In all modes except <guilabel>File name</guilabel>, you can
        search for exact phrases (adjacent words in a given order) by
        enclosing the input inside double quotes. Ex:
        <literal>"virtual reality"</literal>.</para>
        <para>When using a stripped index (the default), character case has
        no influence on search, except that you can disable stem expansion
        for any term by capitalizing it. For example, a search for
@ -3403,20 +3362,19 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
      <command>recoll</command>). The query to be executed is specified
      as command line arguments.</para> 
-      <para><command>recollq</command> is not built by default. You can
+      <para><command>recollq</command> is not always built by default. You
-      use the <filename>Makefile</filename> in the
+      can use the <filename>Makefile</filename> in the
      <filename>query</filename> directory to build it. This is a very
      simple program, and if you can program a little C++, you may find it
-      useful to taylor its output format to your needs. Not that recollq is
+      useful to tailor its output format to your needs. Apart from being
-      only really useful on systems where the Qt libraries (or even the X11
+      easily customised, <command>recollq</command> is only really useful
-      ones) are not available. Otherwise, just use
+      on systems where the Qt libraries are not available, else it is
-      <literal>recoll -t</literal>, which takes the exact same
+      redundant with <literal>recoll -t</literal>.</para>
      parameters and options which
      are described for <command>recollq</command></para> 
-      <para><command>recollq</command> has a man page (not installed by
+      <para><command>recollq</command> has a
-      default, look in the <filename>doc/man</filename> directory). The
+      <ulink url="https://www.lesbonscomptes.com/recoll/manpages/recollq.1.html">man page</ulink>. 
-      Usage string is as follows:</para>
+
      The Usage string is as follows:</para>
      <programlisting>
 recollq: usage:
 -P: Show the date span for all the documents present in the index
@ -3455,9 +3413,9 @@ recoll -c <replaceable>/path/to/my/new/config</replaceable></programlisting>
      </programlisting>
      <para>Sample execution:</para>
-      <programlisting>recollq 'ilur -nautique mime:text/html'
+      <programlisting>
-      Recoll query: ((((ilur:(wqf=11) OR ilurs) AND_NOT (nautique:(wqf=11)
+recollq 'ilur -nautique mime:text/html'
-      OR nautiques OR nautiqu OR nautiquement)) FILTER Ttext/html))
+Recoll query: ((((ilur:(wqf=11) OR ilurs) AND_NOT (nautique:(wqf=11) OR nautiques OR nautiqu OR nautiquement)) FILTER Ttext/html))
 4 results
 text/html       [file:///Users/uncrypted-dockes/projets/bateaux/ilur/comptes.html]      [comptes.html]  18593   bytes   
 text/html       [file:///Users/uncrypted-dockes/projets/nautique/webnautique/articles/ilur1/index.html] [Constructio...
@ -5835,9 +5793,8 @@ for i in range(nres):
    <sect1 id="RCL.INSTALL.EXTERNAL">
      <title>Supporting packages</title>
-      <note><para>The &WIN; installation of &RCL; is self-contained, and
+      <note><para>The &WIN; installation of &RCL; is self-contained.
-      only needs Python 2.7 to be externally installed. &WIN; users can
+      &WIN; users can skip this section.</para></note>
      skip this section.</para></note>
      <para>&RCL; uses external applications to index some file
      types. You need to install them for the file types that you wish to
@ -5851,134 +5808,46 @@ for i in range(nres):
      <filename>missing</filename> text file inside the configuration
      directory.</para>
-      <para>A list of common file types which need external
+      <para>The past has proven that I was unable to maintain an up to date
-      commands follows. Many of the handlers need the
+      application list in this manual. Please check &RCLAPPS; for a
-      <command>iconv</command> command, which is not always listed as a
+      complete list along with links to the home pages or best
-      dependancy.</para> 
+      source/patches pages, and misc tips. What follows is only a
      very short extract of the stable essentials.</para>
      <para>Please note that, due to the relatively dynamic nature of this
      information, the most up to date version is now kept on &RCLAPPS;
      along with links to the home pages or best source/patches pages,
      and misc tips. The list below is not updated often and may be quite
      stale.</para>
      <para>For many Linux distributions, most of the commands listed can
      be installed from the package repositories. However, the packages
      are sometimes outdated, or not the best version for &RCL;, so you
      should take a look at &RCLAPPS; if a file
      type is important to you.</para>
      <para>As of &RCL; release 1.14, a number of XML-based formats that
      were handled by ad hoc handler code now use the
      <command>xsltproc</command> command, which usually comes with  
      <application>libxslt</application>. These are: abiword, fb2
      (ebooks), kword, openoffice, svg.</para> 
      <para>Now for the list:</para>
      <itemizedlist>
        <listitem><para>Openoffice files need <command>unzip</command> and
        <command>xsltproc</command>.</para></listitem>
        <listitem><para>PDF files need <command>pdftotext</command>
        which is part of <application>Poppler</application> (usually
        comes with the <literal>poppler-utils</literal>
        package). Avoid the original one from 
        <application>Xpdf</application>.</para></listitem>
-        <listitem><para>Postscript files need <command>pstotext</command>. 
+        <listitem><para>MS Word documents need
        The original version has an issue with shell
        character in file names, which is corrected in recent
        packages. See &RCLAPPS; for more detail.</para>
        </listitem>
        <listitem><para>MS Word needs
        <command>antiword</command>. It is also useful to have
        <command>wvWare</command> installed as it may be 
        used as a fallback for some files which
        <command>antiword</command> does not handle.</para></listitem>
        <listitem><para>MS Excel and PowerPoint are processed by
        internal <command>Python</command> handlers.</para></listitem>
        <listitem><para>MS Open XML (docx) needs <command>
        xsltproc</command>.</para></listitem>
        <listitem><para>Wordperfect files need <command>wpd2html</command>
        from the <application>libwpd</application> (or
        <application>libwpd-tools</application> on Ubuntu)
        package.</para></listitem>
        <listitem><para>RTF files need <command>unrtf</command>,
        which, in its older versions, has much trouble with
        non-western character sets. Many Linux distributions carry
        outdated <command>unrtf</command> versions. Check
        &RCLAPPS; for details.</para></listitem>
        <listitem><para>TeX files need <command>untex</command> or
        <command>detex</command>. Check &RCLAPPS; for sources if it's not
        packaged for your distribution.</para></listitem>
        <listitem><para>dvi files need <command>dvips</command>.</para>
        </listitem>
        <listitem><para>djvu files need <command>djvutxt</command> and
        <command>djvused</command> from the
        <application>DjVuLibre</application> package.</para></listitem>
        <listitem><para>Audio files: &RCL; releases 1.14 and later use
        a single <application>Python</application> handler based 
        on <application>mutagen</application> for all audio file
        types.</para>
        </listitem>
        <listitem><para>Pictures: &RCL; uses the 
        <application>Exiftool</application>
        <application>Perl</application> package to extract tag
-        information. Most image file formats are supported. Note that
+        information. Most image file formats are
-        there may not be much interest in indexing the technical tags
+        supported.</para></listitem> 
        (image size, aperture, etc.). This is only of interest if you
        store personal tags or textual descriptions inside the image
        files.</para></listitem>
-        <listitem><para>chm: files in Microsoft help format need Python and
+        <listitem><para>Up to &RCL; 1.24, many XML-based formats need the
-        the <application>pychm</application> module (which needs 
+        <command>xsltproc</command> command, which usually comes with
-        <application>chmlib</application>).</para></listitem>
+        <application>libxslt</application>. These are: abiword, fb2
-
+        ebooks, kword, openoffice, opendocument, svg. &RCL; 1.25 and later
-        <listitem><para>ICS: up to &RCL; 1.13, iCalendar files need 
+        process them internally (using libxslt).</para>
        <application>Python</application>
        and the <application>icalendar</application>
        module. <application>icalendar</application> is not needed for newer
        versions,  which use internal code.</para></listitem> 
        <listitem><para>Zip archives need <application>Python</application>
        (and the standard zipfile module). </para></listitem>
        <listitem><para>Rar archives need
        <application>Python</application>, the
        <application>rarfile</application> Python module and the
        <command>unrar</command> utility.</para></listitem>
        <listitem><para>Midi karaoke files need
        <application>Python</application> and the 
        <ulink url="http://pypi.python.org/pypi/midi/0.2.1">
          <application>Midi module</application></ulink></para>
        </listitem>
        <listitem><para>Konqueror webarchive files need Python (the handler
        uses the tarfile module).</para></listitem>
        <listitem><para>Mimehtml web archive format (support based on
        the email handler, which introduces some mild weirdness, but
        still usable).</para></listitem>
      </itemizedlist>
      <para>Text, HTML, email folders, and Scribus files are
      processed internally. <application>Lyx</application> is used to
      index Lyx files. Many handlers need <command>iconv</command> and the
      standard <command>sed</command> and <command>awk</command>.
      </para>
    </sect1>
@ -6089,9 +5958,10 @@ for i in range(nres):
              terms. </para></listitem>
              <listitem><para><option>--with-fam</option> or
-              <option>--with-inotify</option> will enable the code for
+              <option>--with-inotify</option> will enable the code for real
-              real time indexing. Inotify support is enabled by default on
+              time indexing. Inotify support is enabled by default on Linux
-              recent Linux systems.</para></listitem>
+              systems.</para></listitem>
              <listitem><para><option>--with-qzeitgeist</option> will
              enable sending <application>Zeitgeist</application>
--- a/src/sampleconf/recoll.conf
+++ b/src/sampleconf/recoll.conf
@ -216,9 +216,9 @@ usesystemfilecommand = 1
 # <var name="systemfilecommand" type="string"><brief>Command used to guess
 # MIME types if the internal methods fail</brief><descr>This should be a
 # "file -i" workalike.  The file path will be added as a last parameter to
-# the command line. 'xdg-mime' works better than the traditional 'file'
+# the command line. "xdg-mime" works better than the traditional "file"
 # command, and is now the configured default (with a hard-coded fallback to
-# 'file')</descr></var>
+# "file")</descr></var>
 systemfilecommand = xdg-mime query filetype
 # <var name="processwebqueue" type="bool"><brief>Decide if we process the
@ -885,7 +885,7 @@ snippetMaxPosWalk = 1000000
 # include a translation to a Recoll field name, separated by a '|'
 # character. If the second element is absent, the tag name is used as the
 # Recoll field name. You will also need to add specifications to the
-# 'fields' file to direct processing of the extracted data.</descr></var>
+# "fields" file to direct processing of the extracted data.</descr></var>
 #pdfextrameta =  bibtex:location|location bibtex:booktitle bibtex:pages
 # <var name="pdfextrametafix" type="fn">