*** empty log message ***

2007-07-13 10:24:32 +00:00 · 2007-07-13 10:24:32 +00:00 · 6ed2673331
commit 6ed2673331
parent bf4a2ccf5d
10 changed files with 353 additions and 187 deletions
--- a/src/INSTALL
+++ b/src/INSTALL
@ -23,40 +23,35 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

   4.4. Configuration overview

+   4.5. Extending Recoll
+
                        4.1. Installing a prebuilt copy

-   Recoll binary installations are always linked statically to the xapian
-   libraries, and have no other dependencies. You will only have to check or
-   install supporting applications for the file types that you want to index
-   beyond text, HTML and mail files.
+   Recoll binary packages from the Recoll web site are always linked
+   statically to the Xapian libraries, and have no other dependencies. You
+   will only have to check or install supporting applications for the file
+   types that you want to index beyond text, HTML and mail files, and maybe
+   have a look at the configuration section (but this may not be necessary
+   for a quick test with default parameters).

 4.1.1. Installing through a package system

   If you use a BSD-type port system or a prebuilt package (RPM or other),
-   just follow the usual procedure, and maybe have a look at the
-   configuration section (but this may not be necessary for a quick test with
-   default parameters).
+   just follow the usual procedure for your system.

 4.1.2. Installing a prebuilt Recoll

-   The unpackaged binary versions are just compressed tar files of a build
-   tree, where only the useful parts were kept (executables and sample
-   configuration).
+   The unpackaged binary versions on the Recoll web site are just compressed
+   tar files of a build tree, where only the useful parts were kept
+   (executables and sample configuration).

   The executable binary files are built with a static link to libxapian and
-   libiconv, to make installation easier (no dependencies). However, this
-   also means that you cannot change the versions which are used.
+   libiconv, to make installation easier (no dependencies).

   After extracting the tar file, you can proceed with installation as if you
   had built the package from source (that is, just type make install). The
   binary trees are built for installation to /usr/local.

-   You may then need to install external applications to process some file
-   types that you want indexed (ie: acrobat, postscript ...). See next
-   section.
-
-   Finally, you may want to have a look at the configuration section.
-
   --------------------------------------------------------------------------

   Prev                                   Home                           Next 
@ -120,9 +115,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
 4.3.1. Prerequisites

   At the very least, you will need to download and install the xapian core
-   package (Recoll development currently uses version 0.9.5), and the qt
-   run-time and development packages (Recoll development currently uses
-   version 3.3.5, but any 3.3 version is probably OK).
+   package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
+   version will work too), and the qt run-time and development packages
+   (Recoll development currently uses version 3.3.5, but any 3.3 version is
+   probably OK).

   You will most probably be able to find a binary package for qt for your
   system. You may have to compile Xapian but this is not difficult (if you
@ -135,8 +131,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
 4.3.2. Building

   Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
-   3/4/5), FreeBSD and Solaris 8. If you build on another system, I would
-   very much welcome patches.
+   3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
+   system, and need to modify things, I would very much welcome patches.

   Depending on the qt configuration on your system, you may have to set the
   QTDIR and QMAKESPECS variables in your environment:
@ -190,9 +186,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   Link: HOME
   Link: UP
   Link: PREVIOUS
+   Link: NEXT

                               Recoll user manual
-   Prev                     Chapter 4. Installation                           
+   Prev                     Chapter 4. Installation                      Next 

   --------------------------------------------------------------------------

@ -334,20 +331,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
           value, and is the default. The daemversion is specific to the
           indexing monitor daemon.

-   filtersdir
-
-           A directory to search for the external filter scripts used to
-           index some types of files. The value should not be changed, except
-           if you want to modify one of the default scripts. The value can be
-           redefined for any sub-directory.
-
   indexstemminglanguages

           A list of languages for which the stem expansion databases will be
-           built. See recollindex(1) for possible values. You can add a stem
-           expansion database for a different language by using recollindex
-           -s, but it will be deleted during the next indexing. Only
-           languages listed in the configuration file are permanent.
+           built. See recollindex(1) or use the recollindex -l command for
+           possible values. You can add a stem expansion database for a
+           different language by using recollindex -s, but it will be deleted
+           during the next indexing. Only languages listed in the
+           configuration file are permanent.

   defaultcharset

@ -357,6 +348,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
           character set used is the one defined by the nls environment
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.

+   maxfsoccuppc
+
+           Maximum file system occupation before we stop indexing. The value
+           is a percentage, corresponding to what the "Capacity" df output
+           column shows. The default value is 0, meaning no checking.
+
+   idxflushmb
+
+           Threshold (megabytes of new text data) where we flush from memory
+           to disk index. Setting this can help control memory usage. A value
+           of 0 means no explicit flushing, letting Xapian use its own
+           default, which is flushing every 10000 documents (memory usage
+           depends on average document size). The default value is 10.
+
+   filtersdir
+
+           A directory to search for the external filter scripts used to
+           index some types of files. The value should not be changed, except
+           if you want to modify one of the default scripts. The value can be
+           redefined for any sub-directory.
+
+   iconsdir
+
+           The name of the directory where recoll result list icons are
+           stored. You can change this if you want different images.
+
   guesscharset

           Decide if we try to guess the character set of files if no
@ -389,11 +406,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
           section or just be the beginning of the text). The default value
           is 250.

-   iconsdir
-
-           The name of the directory where recoll result list icons are
-           stored. You can change this if you want different images.
-
   aspellLanguage

           Language definitions to use when creating the aspell dictionary.
@ -525,29 +537,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   argument and should output the text contents in html format on the
   standard output.

-   The html could be very minimal like the following example:
-
- <html><head>
- <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
- </head>
- <body>some text content</body></html>
-         
-
-   You should take care to escape some characters inside the text by
-   transforming them into appropriate entities. "&" should be transformed
-   into "&amp;", "<" should be transformed into "&lt;".
-
-   The character set needs to be specified in the header. It does not need to
-   be UTF-8 (Recoll will take care of translating it), but it must be
-   accurate for good results.
-
-   Recoll will also make use of other header fields if they are present:
-   title, description, keywords.
-
-   The easiest way to write a new filter is probably to start from an
-   existing one.
+   You can find more details about writing a Recoll filter in the section
+   about writing filters.

   --------------------------------------------------------------------------

-   Prev                               Home                                    
-   Building from source                Up                                     
+   Prev                               Home                               Next 
+   Building from source                Up                    Extending Recoll 
--- a/src/README
+++ b/src/README
@ -11,7 +11,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   Copyright (c) 2005 Jean-Francois Dockes

   This document introduces full text search notions and describes the
-   installation and use of the Recoll application.
+   installation and use of the Recoll application. It currently describes
+   Recoll 1.9.

   [ Split HTML / Single HTML ]

@ -105,6 +106,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

                             4.4.5. Examples of configuration adjustments

+                4.5. Extending Recoll
+
+                             4.5.1. Writing a document filter
+
     ----------------------------------------------------------------------

                            Chapter 1. Introduction
@ -370,9 +375,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   configuration files.

   The configuration is documented inside the installation chapter of this
-   document, or in the recoll.conf(5) man page. The most immediately useful
-   variable you may interested in is probably topdirs, which determines what
-   subtrees get indexed.
+   document, or in the recoll.conf(5) man page, but the most current
+   information will most likely be the comments inside the sample file. The
+   most immediately useful variable you may be interested in is probably
+   topdirs, which determines what subtrees get indexed.

   The applications needed to index file types other than text, HTML or email
   (ie: pdf, postscript, ms-word...) are described in the external packages
@ -660,23 +666,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   or lennon and either live or unplugged but not potatoes (in any part of
   the document).

-   The first element author:"john doe" is a phrase search limited to a
-   specific field. Phrase searches are specified as usual by enclosing the
-   words in double quotes. The field specification appears before the colon
-   (of course this is not limited to phrases, author:Balzac would be ok too).
-   Recoll currently manages the following fields:
-
-     * title, subject or caption are synonyms which specify data to be
-       searched for in the document title or subject.
-
-     * author or from for searching the documents originators.
-
-     * keyword for searching the document specified keywords (few documents
-       actually have any).
-
-   The query language is currently the only way to use the Recoll field
-   search capability.
-
   All elements in the search entry are normally combined with an implicit
   AND. It is possible to specify that elements be OR'ed instead, as in
   Beatles OR Lennon. The OR must be entered literally (capitals), and it has
@ -686,8 +675,40 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

   An entry preceded by a - specifies a term that should not appear.

+   The first element in the above example, author:"john doe", is a phrase
+   search limited to a specific field. Phrase searches are specified as usual
+   by enclosing the words in double quotes. The field specification appears
+   before the colon (of course this is not limited to phrases, author:Balzac
+   would be ok too). Recoll currently manages the following fields:
+
+     * title, subject or caption are synonyms which specify data to be
+       searched for in the document title or subject.
+
+     * author or from for searching the document's originators.
+
+     * keyword for searching the document-specified keywords (few documents
+       actually have any).
+
+   As of release 1.9, the filters have the possibility to create other fields
+   with arbitrary names. No standard filters use this possibility yet.
+
+   There are two other elements which may be specified through the field
+   syntax, but are somewhat special:
+
+     * ext for specifying the file name extension (Ex: ext:html)
+
+     * mime for specifying the mime type. This one is quite special because
+       you can specify several values which will be OR'ed (the normal default
+       for the language is AND). Ex: mime:text/plain mime:text/html.
+       Specifying an explicit boolean operator or negation (-) before a mime
+       specification is not supported and will produce strange results.
+
+   The query language is currently the only way to use the Recoll field
+   search capability.
+
   Words inside phrases and capitalized words are not stem-expanded.
-   Wildcards may be used anywhere.
+   Wildcards may be used anywhere inside a term. Specifying a wildcard on
+   the left of a term can produce a very slow search.

   You can use the show query link at the top of the result list to check the
   exact query which was finally executed by Xapian.
@ -873,8 +894,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
 3.9. Document history

   Documents that you actually view (with the internal preview or an external
-   tool) are entered into the document history, which is remembered. You can
-   display the history list by using the Tools/Doc History menu entry.
+   tool) are entered into the document history, which is remembered.
+
+   You can display the history list by using the Tools/Doc History menu
+   entry.
+
+   You can erase the document history by using the Erase document history
+   entry in the File menu.

     ----------------------------------------------------------------------

@ -891,6 +917,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   The sort parameters stay in effect until they are explicitly reset, or the
   program exits. An activated sort is indicated in the result list header.

+   Sort parameters are remembered between program invocations, but result
+   sorting is normally inactive when the program starts. It is
+   possible to keep the sorting activation state between program invocations
+   by checking the Remember sort activation state option in the preferences.
+
     ----------------------------------------------------------------------

 3.11. Search tips, shortcuts
@ -984,6 +1015,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

          * %D. Date

+          * %I. Icon image name
+
          * %K. Keywords (if any)

          * %L. Preview and Edit links
@ -1002,7 +1035,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

       The default value for the string is:

- %R %S %L &nbsp;&nbsp;<b>%T</b><br>
+ <img src="%I" align="left">%R %S %L &nbsp;&nbsp;<b>%T</b><br>
 %M&nbsp;%D&nbsp;&nbsp;&nbsp;<i>%U</i><br>
 %A %K
       
@ -1014,19 +1047,30 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
 %A<font color=#008000>%U - %S</font> - %L
       

+       Or the clean-looking:
+
+ <img src="%I" align="left">%L <font color="#900000">%R</font>
+   <b>%T</b><br>%S 
+ <font color="#808080"><i>%U</i></font>
+ <table bgcolor="#e0e0e0">
+ <tr><td><div>%A</div></td></tr>
+ </table>%K
+       
+
       The format of the Preview and Edit links is <a href="Pdocnum"> and <a
       href="Edocnum"> where docnum is what %N would print. This makes the
       title a preview link in the above format.

+       Please note that, due to the way the program handles right mouse
+       clicks in the result list, if the custom formatting results in
+       multiple paragraphs per result, right clicks will only work inside the
+       first one.
+
     * HTML help browser: this will let you choose your preferred browser
       which will be started from the Help menu to read the user manual. You
       can enter a simple name if the command is in your PATH, or browse for
       a full pathname.

-     * Show document type icons in result list: icons in the result list can
-       be turned off. They take quite a lot of space and convey relatively
-       little useful information.
-
     * Auto-start simple search on white space entry: if this is checked, a
       search will be executed each time you enter a space in the simple
       search input field. This lets you look at the result list as you enter
@ -1086,42 +1130,35 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

 4.1. Installing a prebuilt copy

-   Recoll binary installations are always linked statically to the xapian
-   libraries, and have no other dependencies. You will only have to check or
-   install supporting applications for the file types that you want to index
-   beyond text, HTML and mail files.
+   Recoll binary packages from the Recoll web site are always linked
+   statically to the Xapian libraries, and have no other dependencies. You
+   will only have to check or install supporting applications for the file
+   types that you want to index beyond text, HTML and mail files, and maybe
+   have a look at the configuration section (but this may not be necessary
+   for a quick test with default parameters).

     ----------------------------------------------------------------------

  4.1.1. Installing through a package system

   If you use a BSD-type port system or a prebuilt package (RPM or other),
-   just follow the usual procedure, and maybe have a look at the
-   configuration section (but this may not be necessary for a quick test with
-   default parameters).
+   just follow the usual procedure for your system.

     ----------------------------------------------------------------------

  4.1.2. Installing a prebuilt Recoll

-   The unpackaged binary versions are just compressed tar files of a build
-   tree, where only the useful parts were kept (executables and sample
-   configuration).
+   The unpackaged binary versions on the Recoll web site are just compressed
+   tar files of a build tree, where only the useful parts were kept
+   (executables and sample configuration).

   The executable binary files are built with a static link to libxapian and
-   libiconv, to make installation easier (no dependencies). However, this
-   also means that you cannot change the versions which are used.
+   libiconv, to make installation easier (no dependencies).

   After extracting the tar file, you can proceed with installation as if you
   had built the package from source (that is, just type make install). The
   binary trees are built for installation to /usr/local.

-   You may then need to install external applications to process some file
-   types that you want indexed (ie: acrobat, postscript ...). See next
-   section.
-
-   Finally, you may want to have a look at the configuration section.
-
     ----------------------------------------------------------------------

 4.2. Supporting packages
@ -1161,9 +1198,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
  4.3.1. Prerequisites

   At the very least, you will need to download and install the xapian core
-   package (Recoll development currently uses version 0.9.5), and the qt
-   run-time and development packages (Recoll development currently uses
-   version 3.3.5, but any 3.3 version is probably OK).
+   package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
+   version will work too), and the qt run-time and development packages
+   (Recoll development currently uses version 3.3.5, but any 3.3 version is
+   probably OK).

   You will most probably be able to find a binary package for qt for your
   system. You may have to compile Xapian but this is not difficult (if you
@ -1178,8 +1216,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
  4.3.2. Building

   Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
-   3/4/5), FreeBSD and Solaris 8. If you build on another system, I would
-   very much welcome patches.
+   3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
+   system, and need to modify things, I would very much welcome patches.

   Depending on the qt configuration on your system, you may have to set the
   QTDIR and QMAKESPECS variables in your environment:
@ -1370,20 +1408,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
           value, and is the default. The daemversion is specific to the
           indexing monitor daemon.

-   filtersdir
-
-           A directory to search for the external filter scripts used to
-           index some types of files. The value should not be changed, except
-           if you want to modify one of the default scripts. The value can be
-           redefined for any sub-directory.
-
   indexstemminglanguages

           A list of languages for which the stem expansion databases will be
-           built. See recollindex(1) for possible values. You can add a stem
-           expansion database for a different language by using recollindex
-           -s, but it will be deleted during the next indexing. Only
-           languages listed in the configuration file are permanent.
+           built. See recollindex(1) or use the recollindex -l command for
+           possible values. You can add a stem expansion database for a
+           different language by using recollindex -s, but it will be deleted
+           during the next indexing. Only languages listed in the
+           configuration file are permanent.

   defaultcharset

@ -1393,6 +1425,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
           character set used is the one defined by the nls environment
           (LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.

+   maxfsoccuppc
+
+           Maximum file system occupation before we stop indexing. The value
+           is a percentage, corresponding to what the "Capacity" df output
+           column shows. The default value is 0, meaning no checking.
+
+   idxflushmb
+
+           Threshold (megabytes of new text data) where we flush from memory
+           to disk index. Setting this can help control memory usage. A value
+           of 0 means no explicit flushing, letting Xapian use its own
+           default, which is flushing every 10000 documents (memory usage
+           depends on average document size). The default value is 10.
+
+   filtersdir
+
+           A directory to search for the external filter scripts used to
+           index some types of files. The value should not be changed, except
+           if you want to modify one of the default scripts. The value can be
+           redefined for any sub-directory.
+
+   iconsdir
+
+           The name of the directory where recoll result list icons are
+           stored. You can change this if you want different images.
+
   guesscharset

           Decide if we try to guess the character set of files if no
@ -1425,11 +1483,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
           section or just be the beginning of the text). The default value
           is 250.

-   iconsdir
-
-           The name of the directory where recoll result list icons are
-           stored. You can change this if you want different images.
-
   aspellLanguage

           Language definitions to use when creating the aspell dictionary.
@ -1571,7 +1624,34 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   argument and should output the text contents in html format on the
   standard output.

-   The html could be very minimal like the following example:
+   You can find more details about writing a Recoll filter in the section
+   about writing filters.
+
+     ----------------------------------------------------------------------
+
+4.5. Extending Recoll
+
+  4.5.1. Writing a document filter
+
+   Recoll filters are executable programs which translate from a specific
+   format (ie: openoffice, acrobat, etc.) to the Recoll indexing input
+   format, which was chosen to be HTML.
+
+   Recoll filters are usually shell scripts, but this is in no way necessary.
+   These programs are extremely simple and most of the difficulty lies in
+   extracting the text from the native format, not outputting what is
+   expected by Recoll. Happily enough, most document formats already have
+   translators or text extractors which handle the difficult part and can be
+   called from the filter.
+
+   Filters are called with a single argument which is the source file name.
+   They should output the result to stdout.
+
+   The RECOLL_FILTER_FORPREVIEW environment variable (values yes, no) tells
+   the filter if the operation is for indexing or previewing. Some filters
+   use this to output a slightly different format. This is not essential.
+
+   The output HTML could be very minimal like the following example:

 <html><head>
 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
@ -1590,6 +1670,16 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
   Recoll will also make use of other header fields if they are present:
   title, description, keywords.

+   As of Recoll release 1.9, filters also have the possibility to "invent"
+   field names. These should be output as meta tags:
+
+ <meta name="somefield" content="Some textual data" />
+
+   In this case, a correspondence between field name and Xapian prefix should
+   also be added to the mimeconf file. See the existing entries for
+   inspiration. The field can then be used inside the query language to
+   narrow searches.
+
   The easiest way to write a new filter is probably to start from an
   existing one.
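
As a rough illustration of the filter contract described in section 4.5.1 above (a single file name argument, HTML on stdout, escaped text, a declared character set, optional meta fields), here is a minimal Python sketch. It is not one of the filters shipped with Recoll. The function names to_recoll_html and run_filter and the field name somefield are assumptions made for the example, and the input is assumed to be plain UTF-8 text that needs no real format conversion.

```python
import html


def to_recoll_html(text, title="", fields=None):
    """Wrap already-extracted text in minimal HTML for indexing.

    All names here are hypothetical; only the output shape follows the
    manual text above.
    """
    head = ['<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">']
    if title:
        head.append("<title>" + html.escape(title, quote=False) + "</title>")
    # Extra fields go out as meta tags, as described for release 1.9.
    for name, value in sorted((fields or {}).items()):
        head.append('<meta name="%s" content="%s" />'
                    % (html.escape(name), html.escape(value)))
    # "&" and "<" (and ">") in the text become entities.
    body = html.escape(text, quote=False)
    return ("<html><head>\n" + "\n".join(head) + "\n</head>\n"
            "<body>" + body + "</body></html>\n")


def run_filter(path, out):
    """Read the source file named by the single argument, write HTML to out."""
    with open(path, encoding="utf-8", errors="replace") as source:
        page = to_recoll_html(source.read())
    out.write(page.encode("utf-8"))  # out is expected to be a binary stream


# Small demonstration on an in-memory string instead of a real file.
print(to_recoll_html("Tom & Jerry <draft>",
                     fields={"somefield": "Some textual data"}))
```

A real filter would first extract the text from the native format with an existing translator, then call something like run_filter(sys.argv[1], sys.stdout.buffer). A custom field such as somefield would still need its mimeconf entry before it could be used in queries.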

--- a/src/doc/man/recoll.conf.5
+++ b/src/doc/man/recoll.conf.5
@ -1,4 +1,4 @@
-.\" $Id: recoll.conf.5,v 1.4 2006-11-20 18:07:02 dockes Exp $ (C) 2005 J.F.Dockes\$
+.\" $Id: recoll.conf.5,v 1.5 2007-07-13 10:18:49 dockes Exp $ (C) 2005 J.F.Dockes\$
 .TH RECOLL.CONF 5 "8 January 2006"
 .SH NAME
 recoll.conf \- main personal configuration file for Recoll
@ -10,6 +10,11 @@ The system-wide configuration file is normally located inside
 /usr/[local]/share/recoll/examples. Any parameter set in the common file
 may be overridden by setting it in the personal configuration file, by default:
 .IR $HOME/.recoll/recoll.conf
+.LP
+Please note that while we try to keep this manual page reasonably up to date,
+it will frequently lag behind the current state of the software. The best
+source of information about the configuration is the set of comments in the
+configuration file.

 .LP
 A short extract of the file might look as follows:
@ -65,6 +70,12 @@ The list can be redefined for subdirectories, but is only actually changed
 for the top level ones in 
 .I topdirs
 .TP
+.BI "skippedPaths = " patterns
+A space-separated list of patterns for paths the indexer should not descend
+into. Together with topdirs, this allows pruning the indexed tree as
+needed. daemSkippedPaths can be used to define a specific value for the
+real time indexing monitor.
+.TP
 .BI "loglevel = " value
 Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
 debug/information messages. 3 lists only errors. 
@ -76,32 +87,46 @@ Where should the messages go. 'stderr' can be used as a special value.
 .B daemlogfilename
 can be used to specify a different value for the real-time indexing daemon.
 .TP
+.BI "dbdir = " directory
+The name of the Xapian database directory. It will be created if needed
+when the database is initialized. If this is not an absolute pathname, it
+will be taken relative to the configuration directory.
+.TP
+.BI "indexstemminglanguages = " languages
+A list of languages for which the stem expansion databases will be
+built. See recollindex(1) for possible values.
+.TP
+.BI "defaultcharset = " charset
+The name of the character set used for files that do not contain a
+character set definition (ie: plain text files). This can be redefined for
+any subdirectory.
+.TP
+.BI "maxfsoccuppc = " percentnumber
+Maximum file system occupation before we
+stop indexing. The value is a percentage, corresponding to
+what the "Capacity" df output column shows.  The default
+value is 0, meaning no checking.
+.TP
+.BI "idxflushmb = " megabytes
+Threshold (megabytes of new text data)
+where we flush from memory to disk index. Setting this can
+help control memory usage. A value of 0 means no explicit
+flushing, letting Xapian use its own default, which is
+flushing every 10000 documents (memory usage depends on
+average document size). The default value is 10.
+.TP
 .BI "filtersdir = " directory
 A directory to search for the external filter scripts used to index some
 types of files. The value should not be changed, except if you want to
 modify one of the default scripts. The value can be redefined for any
 subdirectory. 
 .TP
-.BI "indexstemminglanguages = " languages
-A list of languages for which the stem expansion databases will be
-built. See recollindex(1) for possible values.
-.TP
 .BI "iconsdir = " directory
 The name of the directory where 
 .B recoll
 result list icons are stored. You can change this if you want different
 images.
 .TP
-.BI "dbdir = " directory
-The name of the Xapian database directory. It will be created if needed
-when the database is initialized. If this is not an absolute pathname, it
-will be taken relative to the configuration directory.
-.TP
-.BI "defaultcharset = " charset
-The name of the character set used for files that do not contain a
-character set definition (ie: plain text files). This can be redefined for
-any subdirectory.
-.TP
 .BI "guesscharset = " boolean
 Try to guess the character set of files if no internal value is available
 (ie: for plain text files). This does not work well in general, and should
--- a/src/qtgui/i18n/recoll_fr.ts
+++ b/src/qtgui/i18n/recoll_fr.ts
@ -1308,7 +1308,7 @@ Peut ralentir l&apos;affichage si les documents sont gros.</translation>
    </message>
    <message>
        <source>Show document type icons in result list.</source>
-        <translation>Afficher les icônes de type de fichier dans la liste de résultats.</translation>
+        <translation type="obsolete">Afficher les icônes de type de fichier dans la liste de résultats.</translation>
    </message>
    <message>
        <source>Auto-start simple search on whitespace entry.</source>
@ -1434,7 +1434,7 @@ Peut ralentir l&apos;affichage si les documents sont gros.</translation>
    </message>
    <message>
        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
-        <translation>Définit the format pour chaque paragraphe de la liste de résultats. Utilise le format html qt et des remplacements à la printf:&lt;br&gt;%A Résumé&lt;br&gt; %D Date&lt;br&gt; %K Mots clefs (s&apos;il y en a)&lt;br&gt; %L Liens aperçu et édition&lt;br&gt; %M Type Mime&lt;br&gt; %N Numéro de résultat&lt;br&gt; %R Pertinence&lt;br&gt; %S Taille&lt;br&gt; %T Titre&lt;br&gt; %U Url&lt;br&gt;</translation>
+        <translation type="obsolete">Définit the format pour chaque paragraphe de la liste de résultats. Utilise le format html qt et des remplacements à la printf:&lt;br&gt;%A Résumé&lt;br&gt; %D Date&lt;br&gt; %K Mots clefs (s&apos;il y en a)&lt;br&gt; %L Liens aperçu et édition&lt;br&gt; %M Type Mime&lt;br&gt; %N Numéro de résultat&lt;br&gt; %R Pertinence&lt;br&gt; %S Taille&lt;br&gt; %T Titre&lt;br&gt; %U Url&lt;br&gt;</translation>
    </message>
    <message>
        <source>Automatically add phrase to simple searches</source>
@ -1488,7 +1488,15 @@ Ceci devrait donner une meilleure pertinence aux résultats où les termes reche
    </message>
    <message>
        <source>Remember sorting preference between invocations.</source>
-        <translation>Mémoriser l&apos;état des paramètres de tri.</translation>
+        <translation type="obsolete">Mémoriser l&apos;état des paramètres de tri.</translation>
+    </message>
+    <message>
+        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %I Icon image name&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
+        <translation>Définit le format des paragraphes de la liste de résultats. Utilise le format html qt et des directives de substitution de type printf:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %I Icon image name&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</translation>
+    </message>
+    <message>
+        <source>Remember sort activation state.</source>
+        <translation>Mémoriser l&apos;état d&apos;activation du tri.</translation>
    </message>
 </context>
 <context>
--- a/src/qtgui/i18n/recoll_it.ts
+++ b/src/qtgui/i18n/recoll_it.ts
@ -1309,7 +1309,7 @@ Peut ralentir l&apos;affichage si les documents sont gros.</translation>
    </message>
    <message>
        <source>Show document type icons in result list.</source>
-        <translation type="unfinished">Mostra le icone nella lsita dei risultati.</translation>
+        <translation type="obsolete">Mostra le icone nella lsita dei risultati.</translation>
    </message>
    <message>
        <source>Auto-start simple search on whitespace entry.</source>
@ -1435,7 +1435,7 @@ Può essere lento per grossi documenti..</translation>
    </message>
    <message>
        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
-        <translation>Definisci il formato per ogni risultato. Usa il formato qt-html:e simile a quello di printf:&lt;br&gt;%A Riassunto&lt;br&gt; %D Data&lt;br&gt; %K Keywords (se ci sono)&lt;br&gt; %L Links Preview e Edita&lt;br&gt; %M Tipo Mime&lt;br&gt; %N Numero di risultati&lt;br&gt; %R Rilevanza&lt;br&gt; %S Size&lt;br&gt; %T Titolo&lt;br&gt; %U Url&lt;br&gt;</translation>
+        <translation type="obsolete">Definisci il formato per ogni risultato. Usa il formato qt-html:e simile a quello di printf:&lt;br&gt;%A Riassunto&lt;br&gt; %D Data&lt;br&gt; %K Keywords (se ci sono)&lt;br&gt; %L Links Preview e Edita&lt;br&gt; %M Tipo Mime&lt;br&gt; %N Numero di risultati&lt;br&gt; %R Rilevanza&lt;br&gt; %S Size&lt;br&gt; %T Titolo&lt;br&gt; %U Url&lt;br&gt;</translation>
    </message>
    <message>
        <source>Automatically add phrase to simple searches</source>
@ -1489,7 +1489,11 @@ Questo dovrebbe dare la precedenza ai risultati che contengono i termini esattam
        <translation type="unfinished">Rimuovi dalla lista. Non ha effetto sull&apos;indice del disco</translation>
    </message>
    <message>
-        <source>Remember sorting preference between invocations.</source>
+        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %I Icon image name&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
+        <translation type="unfinished"></translation>
+    </message>
+    <message>
+        <source>Remember sort activation state.</source>
        <translation type="unfinished"></translation>
    </message>
 </context>
--- a/src/qtgui/i18n/recoll_ru.ts
+++ b/src/qtgui/i18n/recoll_ru.ts
@ -1374,7 +1374,7 @@ May be slow for big documents.</source>
    </message>
    <message>
        <source>Show document type icons in result list.</source>
-        <translation>Отображать типы документов в списке результатов.</translation>
+        <translation type="obsolete">Отображать типы документов в списке результатов.</translation>
    </message>
    <message>
        <source>Auto-start simple search on whitespace entry.</source>
@ -1498,10 +1498,6 @@ May be slow for big documents.</source>
        <source>Result paragraph&lt;br&gt;format string</source>
        <translation type="unfinished"></translation>
    </message>
-    <message>
-        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
-        <translation type="unfinished"></translation>
-    </message>
    <message>
        <source>Automatically add phrase to simple searches</source>
        <translation type="unfinished"></translation>
@ -1552,7 +1548,11 @@ This should give higher precedence to the results where the search terms appear
        <translation type="unfinished"></translation>
    </message>
    <message>
-        <source>Remember sorting preference between invocations.</source>
+        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %I Icon image name&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
+        <translation type="unfinished"></translation>
+    </message>
+    <message>
+        <source>Remember sort activation state.</source>
        <translation type="unfinished"></translation>
    </message>
 </context>
--- a/src/qtgui/i18n/recoll_uk.ts
+++ b/src/qtgui/i18n/recoll_uk.ts
@ -1205,7 +1205,7 @@ May be slow for big documents.</source>
    </message>
    <message>
        <source>Show document type icons in result list.</source>
-        <translation>Відображати типи документів у списку результатів.</translation>
+        <translation type="obsolete">Відображати типи документів у списку результатів.</translation>
    </message>
    <message>
        <source>Auto-start simple search on whitespace entry.</source>
@ -1329,10 +1329,6 @@ May be slow for big documents.</source>
        <source>Result paragraph&lt;br&gt;format string</source>
        <translation type="unfinished"></translation>
    </message>
-    <message>
-        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
-        <translation type="unfinished"></translation>
-    </message>
    <message>
        <source>Automatically add phrase to simple searches</source>
        <translation type="unfinished"></translation>
@ -1383,7 +1379,11 @@ This should give higher precedence to the results where the search terms appear
        <translation type="unfinished"></translation>
    </message>
    <message>
-        <source>Remember sorting preference between invocations.</source>
+        <source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:&lt;br&gt;%A Abstract&lt;br&gt; %D Date&lt;br&gt; %I Icon image name&lt;br&gt; %K Keywords (if any)&lt;br&gt; %L Preview and Edit links&lt;br&gt; %M Mime type&lt;br&gt; %N Result number&lt;br&gt; %R Relevance percentage&lt;br&gt; %S Size information&lt;br&gt; %T Title&lt;br&gt; %U Url&lt;br&gt;</source>
+        <translation type="unfinished"></translation>
+    </message>
+    <message>
+        <source>Remember sort activation state.</source>
        <translation type="unfinished"></translation>
    </message>
 </context>
--- a/website/BUGS.txt
+++ b/website/BUGS.txt
@ -8,9 +8,16 @@ Latest (1.8.2):
 - There are a few problems in the qt4 version of recoll: some accelerators
  (esc-spc, ctl-arrow) do not work, neither do copy/paste between the
  result list and preview windows and x11 applications.
+
+- The q3textedit find() method is extremely slow. Positioning to the first
+  search term in preview has been disabled in qt4, and the application will
+  sometimes appear to be looping when using the find feature in the
+  preview window (it's not looping, it's searching :( )
+
 - The dates shown for email attachments in a result list are the email
  folder modification date. This should be inherited from the parent
  message instead.
+
 - There are sometimes problems with document deletions: the index can
  get in a state where deleted or moved documents are not purged from the
  index (the log file says that the doc are deleted, but they aren't
@ -19,6 +26,15 @@ Latest (1.8.2):
  fixed in a future release. You can apply the following patch to xapian
  1.0.1 to fix it:
      http://www.lesbonscomptes.com/recoll/xapian/xapian-delete-document.patch 
+
+- Under ubuntu (at least), the default awk interpreter (mawk) is buggy,
+  and the recoll pdf input filter does not work (removes all space
+  characters). This can be solved by installing the gawk package.
+
+- If the user-chosen result list entry format results in several paragraphs
+  (in the qt textedit sense), right clicks will only work inside the first
+  one for each entry.
+
 - NEAR crashes: 1.6 has added NEAR searches. Unlike what recoll did
  with PHRASES, stemming expansion is performed on terms inside NEAR
  clauses (except if prevented by a capitalized entry of course). There is
--- a/website/CHANGES.txt
+++ b/website/CHANGES.txt
@ -1,30 +1,61 @@
 CHANGES 

 1.9.0
- Add option to remember sort tool state between program invocations (it is
-  reset to inactive by default)
- Improve qt4 build: no more need for --enable-qt4
- Fixed a number of qt4 glitches: selection and keyboard shortcuts.
+- Incompatible change: the icon image reference is now part of the result
+  list paragraph format string:
+  - If you had a standard config, you need do nothing.
+  - If you had a custom format string, you need to add
+     <img src="%I" align="left"> at its beginning to get the same result as
+     before.
+  - If you had unchecked the "show icons" option, you need to remove the
+    above string from the paragraph format to make the icons go away.
+  Changes to the format string are performed in the 
+  "Preferences->Query Configuration->User Interface" dialog tab.
+
+- New filters: abiword and kword, rcljpeg, rclflac, rclogg (contributed
+  filters). The jpeg and audio filters should be extended to make use of
+  the new field indexing/search capability (hint :) )
+
 - When searching for an empty string inside the preview window, position
-  the window to the next occurrence of the primary search terms.
- Have email attachments inherit date and author from their parent message
+  the window to the next occurrence of a primary search term.
+
+- Added ext: and mime: selectors to the query language.
+
 - Added an adjustable flush threshold during indexing: should help control
-  memory usage. See the idxflushmb configuration parameter.
+  memory usage. See the idxflushmb configuration variable.
+
 - Added a check for file system free space. Indexing will stop if the
  threshold is reached. See the maxfsoccuppc configuration parameter.
- Fix bus error on rclmon exit
- Better handle aspell errors inside rclmon
+
+- Add preference option to remember sort tool state between program
+  invocations (it is reset to inactive by default)
+
 - Added File menu entry to erase document history.
- Added ext: and mime: selectors to the query language.
+
+- Bound the space and backspace keys to PgUp/PgDown in preview.
+
+- (Hopefully) Improved abstract (keyword in context) generation
+
+- Improve qt4 build: no more need for --enable-qt4. Note: the qt4 build
+  still needs the qt3 support library.
+
 - Added support for arbitrary fields. Filters can now produce any number of
-  fields which will be selectively searchable through the query language.
- Added abiword and kword support. 
- Contributed filter: rcljpeg. This should be extended to use the new field
-  support.
+  fields which will be selectively searchable through the query
+  language. This could be useful, for example, for the mp3 and jpeg filters
+  (but is not currently used).
+
 - Changed the icon to an ugly one. The previous one was nicer but looked
  too much like Xapian's.
+
 - Added some kind of support for a stopword list.
- Bound space and backspace to PgUp/PgDown in preview.
+- Have email attachments inherit date and author from their parent
+  message.
+
+- Fix bus error on rclmon exit
+
+- Better handling of aspell errors inside rclmon
+
+- Fixed a number of qt4 glitches: selection and keyboard shortcuts.

 1.8.2 2007-05-19
 - Fixed method name for compatibility with xapian 1.0.0
@ -293,4 +324,4 @@ or keep only the modified parameters.
   identification for suffix-less or unknown files.
 - Typo had removed support for .Z compression
 - Use more appropriate conjonction operators when computing the advanced
-   search query (OP_AND_MAYBE, OP_FILTER instead of OP_AND)
+   search query (OP_AND_MAYBE, OP_FILTER instead of OP_AND)
--- a/website/download.html
+++ b/website/download.html
@ -56,11 +56,12 @@
      <p><i>For building from source</i>, you will need a xapian-core
 	installation. You will find source and binary packages on  the 
 	<a href="http://www.xapian.org/download.php">Xapian download
-	  page</a>. Recoll should build with any 0.9.x Xapian version
-	  (the current one is 0.9.10).</p>
+	  page</a>. Recoll 1.8.2 should build with any 0.9.x or 1.0.x
+	  Xapian version (the current one is 1.0.1).</p>

      <p>You need Qt 3.3 (or qt 4) in all cases (configure Recoll with
-      <em>configure --enable-qt4</em> to build with qt4).</p>
+      <em>configure --enable-qt4</em> to build with qt4, this needs
+      the qt3 support library to be present).</p>

      <p>Recoll relies on external packages for some
 	of its functionality (ie: for many of the non-text file
@ -124,9 +125,7 @@
 	of which is the new default index format. In order to take
 	advantage of the new format (which is not mandatory) Recoll
 	users updating from an older release need to delete their old
-	index. There are <a
-	href="usermanual/usermanual.html#RCL.INDEXING.STORAGE.FORMAT">more
-	details in the user manual</a>.</p>
+	index. <a href="xapUpg100.html">More details</a>.</p>

      <p>Older recoll releases:
 	<a href="recoll-1.8.1.tar.gz">1.8.1</a>