doc

2010-10-19 15:57:36 +02:00 · 2010-10-19 15:57:36 +02:00 · 9d89fc2061
commit 9d89fc2061
parent fe108af875
6 changed files with 342 additions and 356 deletions
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -1,7 +1,8 @@
 <!DOCTYPE BOOK PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [

 <!ENTITY RCL "<application>Recoll</application>">
-<!ENTITY RCLVERSION "1.12-1.13">
+<!ENTITY RCLAPPS "<ulink url='http://www.recoll.org/features.html'>Recoll helper applications page</ulink>">
+<!ENTITY RCLVERSION "1.14">
 <!ENTITY XAP "<application>Xapian</application>">
 ]>
 
@@ -2620,138 +2621,119 @@ while query.next >= 0 and query.next < nres:
        specific file type).</para>

      <para>After an indexing pass, the commands that were found
-      missing can be displayed from the <command>recoll</command>
-      <guilabel>File</guilabel> menu. The list is stored in the
-      <filename>missing</filename> text file inside the configuration
-      directory.</para>
+	missing can be displayed from the <command>recoll</command>
+	<guilabel>File</guilabel> menu. The list is stored in the
+	<filename>missing</filename> text file inside the configuration
+	directory.</para>

      <para>A list of common file types which need external
        commands follows. Many of the filters need the
        <command>iconv</command> command, which is not always listed as a
        dependancy.</para> 

-      <para>As of &RCL; release 1.14, a number of XML-based formats that
-        were handled by ad hoc filter code now use
-        <command>xsltproc</command>, which usually comes with  
-        <ulink
-        url="http://xmlsoft.org/XSLT/index.html">libxslt</ulink>. These
-        are: abiword, fb2 (ebooks), kword, openoffice, svg.</para>
+      <para>Please note that, due to the relatively dynamic nature of this
+	information, the most up to date version is now kept on the &RCLAPPS;
+	along with links to the home pages or best source/patches download
+	links. The list below is not updated often and may be quite
+	stale.</para>

+      <para>For many Linux distributions, most of the commands listed can
+        be installed from the package repositories. However, the packages
+        are sometimes outdated, or not the best version for &RCL;, so you
+        should take a look at the &RCLAPPS; if a file
+        type is important to you.</para>
+
+      <para>As of &RCL; release 1.14, a number of XML-based formats that
+        were handled by ad hoc filter code now use the
+        <command>xsltproc</command> command, which usually comes with
+	  <application>libxslt</application>. These are: abiword, fb2
+	  (ebooks), kword, openoffice, svg.</para> 
+
+      <para>Now for the list:</para>
      <itemizedlist>

-        <listitem><para>Openoffice: supported natively, but needs the
-        <command>unzip</command> command to be installed.</para>
+        <listitem><para>Openoffice files need <command>unzip</command> and
+        <command>xsltproc</command>.</para></listitem>
+
+        <listitem><para>PDF files need <command>pdftotext</command> which
+        is part of the <application>Xpdf</application> or
+        <application>Poppler</application> packages.</para></listitem>
+
+        <listitem><para>Postscript files need <command>pstotext</command>.
+            The original version has an issue with shell
+            characters in file names, which is corrected in recent
+            packages. See the &RCLAPPS; for more detail.</para>
          </listitem>

-        <listitem><para>PDF: pdftotext is part of the <ulink
-            url="http://www.foolabs.com/xpdf/">Xpdf</ulink> or <ulink
-            url="http://poppler.freedesktop.org/">Poppler</ulink> packages.</para>
-          </listitem>
+        <listitem><para>MS Word needs
+        <command>antiword</command>. It is also useful to have
+        <command>wvWare</command> installed as it may be
+        used as a fallback for some files which
+        <command>antiword</command> does not handle.</para></listitem>

-        <listitem><para>Postscript: <ulink
-          url="http://www.cs.wisc.edu/~ghost/doc/pstotext.htm">
-            pstotext</ulink>. The original version has an issue with shell
-            character in file names. Most recent package repositories /
-            ports system use a patched version (ie FreeBSD, Debian). If
-            compiling from source, it would be better to apply the patch
-            found 
-           <ulink url="http://www.recoll.org/files/pstotext-1.9_4-debian.patch">
-            here</ulink>.</para>
-          </listitem>
+        <listitem><para>MS Excel and PowerPoint need <command>
+            catdoc</command>.</para></listitem>

-        <listitem><para>MS Word: <ulink url="http://www.winfield.demon.nl"> 
-            antiword</ulink>.</para>
-          </listitem>
+        <listitem><para>MS Open XML (docx) needs <command>
+         xsltproc</command>.</para></listitem>

-        <listitem><para>MS Excel and PowerPoint: 
-           <ulink url="http://catdoc.klik.atekon.de/"> 
-            catdoc</ulink>.</para>
-          </listitem>
+        <listitem><para>Wordperfect files need <command>wpd2html</command>
+        from the <application>libwpd</application> package.</para></listitem>

-        <listitem><para>MS Open XML (docx): needs 
-                 <command>xsltproc</command>.</para>
-          </listitem>
+        <listitem><para>RTF files need <command>unrtf</command>, which, in
+        its standard version, has much trouble with non-western character
+        sets. Check the &RCLAPPS;.</para></listitem>

-        <listitem><para>Wordperfect files: 
-           <ulink url="http://libwpd.sourceforge.net/download.html"> 
-            libwpd</ulink>.</para>
-          </listitem>
+        <listitem><para>TeX files need <command>untex</command> or
+        <command>detex</command>. Check the &RCLAPPS; for sources if it's not
+        packaged for your distribution.</para></listitem>

-        <listitem>
-            <para>RTF: <ulink
-            url="http://www.gnu.org/software/unrtf/unrtf.html">unrtf</ulink>
-          </para>
+        <listitem><para>dvi files need <command>dvips</command>.</para>
        </listitem>

-        <listitem>
-          <para>TeX: &RCL; uses the <application>untex</application>
-          program. Your distribution may have a package for it. If it doesn't, 
-            <ulink url="http://www.recoll.org/untex/untex-1.3.jf.tar.gz">
-            there is a copy of the source on the &RCL; web site</ulink>,
-            because the program has no obvious home. The filter can
-            also work with 
-            <ulink url="http://www.cs.purdue.edu/homes/trinkle/detex/">
-             detex</ulink> and will use it if it is installed.</para>
-        </listitem>
-
-        <listitem>
-            <para>dvi: <ulink
-               url="http://www.radicaleye.com/dvips.html">dvips</ulink></para>
-        </listitem>
-
-        <listitem>
-            <para>djvu: 
-            <ulink
-               url="http://djvu.sourceforge.net">DjVuLibre
-            </ulink></para>
-        </listitem>
+        <listitem><para>djvu files need <command>djvutxt</command> and
+        <command>djvused</command> from the
+        <application>DjVuLibre</application> package.</para></listitem>
          
-        <listitem><para>mp3, flac, ogg vorbis: &RCL; releases before 1.13
-            use the <command>id3info</command> command from the <ulink
-          url="http://id3lib.sourceforge.net/">id3lib</ulink> package to
-          extract mp3 tag information. (Some gcc versions after 4.4 may have
-          trouble compiling <application>id3lib</application>. <ulink
-          url="http://www.recoll.org/id3lib.html">You can find a
-          workaround here</ulink>), metaflac (standard flac tools) for flac
-          files, and ogginfo (vorbis tools) for ogg files. Releases 1.14
-          and later use a single Python filter based on 
-          <ulink url="http://code.google.com/p/mutagen/">mutagen</ulink>
-	  for all audio file types.</para>
+        <listitem><para>Audio files: &RCL; releases before 1.13
+          used the <command>id3info</command> command from the <application>
+          id3lib</application> package to extract mp3 tag information,
+          <command>metaflac</command> (standard flac tools) for flac files,
+          and <command>ogginfo</command> (vorbis tools) for ogg
+          files. Releases 1.14 and later use a single
+          <application>Python</application> filter based 
+          on <application>mutagen</application> for all audio file
+          types.</para>
 	</listitem>

-        <listitem>
-        <para>Pictures: &RCL; uses the 
-        <ulink url="http://www.sno.phy.queensu.ca/~phil/exiftool/">
-         Exiftool</ulink> <application>Perl</application> package to
-         extract tag information. Most image file formats are
-         supported. Note that there may not be much interest in indexing
-         the technical tags (image size, aperture, etc.). This is only of
-         interest if you store personal tags or textual descriptions inside
-         the image files.</para> 
-          </listitem>
+        <listitem><para>Pictures: &RCL; uses the 
+         <application>Exiftool</application>
+         <application>Perl</application> package to extract tag
+         information. Most image file formats are supported. Note that
+         there may not be much interest in indexing the technical tags
+         (image size, aperture, etc.). This is only of interest if you
+         store personal tags or textual descriptions inside the image
+         files.</para></listitem>

 	<listitem><para>chm: files in microsoft help format need Python and
-          the <ulink
-          url="http://gnochm.sourceforge.net/pychm.html">pychm</ulink>
-          module (which needs <ulink
-          url="http://www.jedrea.com/chmlib/">chmlib</ulink>).</para>
-	</listitem>
+          the <application>pychm</application> module (which needs 
+          <application>chmlib</application>).</para></listitem>

-	<listitem><para>ics: up to &RCL; 1.13, iCalendar files need Python
-	and the <application>icalendar</application> module. For newer
-	versions, <application>icalendar</application> is not needed
-	</para></listitem> 
+	<listitem><para>ICS: up to &RCL; 1.13, iCalendar files need 
+        <application>Python</application>
+	and the <application>icalendar</application>
+	module. <application>icalendar</application> is not needed for newer
+	versions,  which use internal code.</para></listitem> 

-	<listitem><para>zip: Zip archives need Python (and the standard
- 	  zipfile module).</para>
-	</listitem>
+	<listitem><para>Zip archives need <application>Python</application>
+	(and the standard zipfile module).</para></listitem>
 	
        </itemizedlist>

-        <para>Text, HTML, mail folders, Openoffice and Scribus files
-        are processed internally. Lyx is used to index Lyx files. Many
-        filters need <command>iconv</command> and the standard
-        <command>sed</command> and <command>awk</command>. 
+        <para>Text, HTML, mail folders, and Scribus files are
+        processed internally. <application>Lyx</application> is used to
+        index Lyx files. Many filters need <command>iconv</command> and the
+        standard <command>sed</command> and <command>awk</command>.
        </para>

    </sect1>
--- a/website/doc.html
+++ b/website/doc.html
@@ -46,20 +46,13 @@
      <li><a href="perfs.html">Index size and indexing performance
 	  data.</a></li>

-      <li>Faqs and Howtos are now kept in the 
-	<a href="http://bitbucket.org/medoc/recoll/wiki/FaqsAndHowTos">
-	  Recoll Wiki</a> on 
-	<a href="http://bitbucket.org/medoc/recoll">bitbucket.org</a>.</li>
-
-	<p>Current list of HowTos:</p>
-        <ul>
-<li><a href="http://bitbucket.org/medoc/recoll/wiki/PreventIndexingDir">Prevent indexing of a directory</a></li>
-<li><a href="http://bitbucket.org/medoc/recoll/wiki/MultipleIndexes">Creating and using multiple indexes</a></li>
-<li><a href="http://bitbucket.org/medoc/recoll/wiki/SavingConfig.wiki">Recoll configuration backup</a></p>
-<li><a href="http://bitbucket.org/medoc/recoll/wiki/IndexMozillaCalendari">Indexing Mozilla Sunbird / Lightning calendar data</a></li>
-       </ul>
+      <li><a href="http://bitbucket.org/medoc/recoll/wiki/FaqsAndHowTos">
+          Faqs and Howtos</a> are now kept in the 
+	  <a href="http://bitbucket.org/medoc/recoll/wiki/">
+	    Recoll Wiki</a> on 
+	  <a href="http://bitbucket.org/medoc/recoll">bitbucket.org</a>.</li>
      </ul>
-      
+
    </div>
  </body>
 </html>
--- a/website/download.html
+++ b/website/download.html
@@ -384,7 +384,8 @@ sudo add-apt-repository ppa:recoll-backports/ppa

      <h2><a name="translations">Translations</a></h2>

-      <p>Most of the translations for 1.13 are incomplete. The source
+      <p>Most of the translations for 1.13 are incomplete (and I
+	forgot to update the message files for 1.14, ugh). The source
 	translation files are included in the source release. If your
 	language has some english messages left and you want to take a
 	shot at fixing the problem, you can send the results to
@@ -400,17 +401,17 @@ sudo add-apt-repository ppa:recoll-backports/ppa
      </p>

      <p><a href="translations/recoll_xx.ts">recoll_xx.ts</a> is a blank
-	Recoll 1.13 message file, handy to work on a new translation.</p>
+	Recoll 1.14 message file, handy to work on a new translation.</p>

-      <h3>Updated 1.13 translations that became available after the
+      <h3>Updated 1.13/1.14 translations that became available after the
 	release:</h3>

-    <p>None for now :(</p>
-<!--  
-      <p>German. 
-	<a href="translations/recoll_de.ts">recoll_de.ts</a>
-	<a href="translations/recoll_de.qm">recoll_de.qm</a>
+<!--    <p>None for now :(</p> -->
+      <p>Lithuanian. 
+	<a href="translations/recoll_lt.ts">recoll_lt.ts</a>
+	<a href="translations/recoll_lt.qm">recoll_lt.qm</a>
 	</p> 
+<!--  
      <p>Ukrainian. 
 	<a href="translations/recoll_uk.ts">recoll_uk.ts</a>
 	<a href="translations/recoll_uk.qm">recoll_uk.qm</a>
--- a/website/features.html
+++ b/website/features.html
@@ -9,7 +9,7 @@
    <meta name="Description" content=
    "recoll is a simple full-text search system for unix and linux based on the powerful and mature xapian engine">
    <meta name="Keywords" content=
-      "full text search,fulltext,desktop search,unix,linux,solaris,open source,free">
+    "full text search,fulltext,desktop search,unix,linux,solaris,open source,free">
    <meta http-equiv="Content-language" content="en">
    <meta http-equiv="content-type" content=
    "text/html; charset=iso-8859-1">
@@ -18,260 +18,268 @@
  </head>

  <body>
-
    <div class="rightlinks">
      <ul>
-	<li><a href="index.html">Home</a></li>
-	<li><a href="pics/index.html">Screenshots</a></li>
-	<li><a href="download.html">Downloads</a></li>
-	<li><a href="usermanual/index.html">User manual</a></li>
-	<li><a href="index.html#support">Support</a></li>
-	<li><a href="devel.html">Development</a></li>
+        <li><a href="index.html">Home</a></li>
+
+        <li><a href="pics/index.html">Screenshots</a></li>
+
+        <li><a href="download.html">Downloads</a></li>
+
+        <li><a href="usermanual/index.html">User manual</a></li>
+
+        <li><a href="index.html#support">Support</a></li>
+
+        <li><a href="devel.html">Development</a></li>
      </ul>
    </div>

    <div class="content">
-
      <h1 class="intro">Recoll features</h1>

-      <dl>
-	<dt><a name="systems">Supported systems</a></dt>
-	<dd><span class="application">Recoll</span> has been compiled and
-	  tested on FreeBSD, Linux, Darwin and Solaris (versions
-	  FreeBSD 5-7, Redhat 7/8/9, Fedora Core 5-13, Suse 10/11,
-	  Gentoo, Debian 3.1, Solaris 8/9/10. Other not too distant
-	  releases should be ok too).</dd>
+      <h2><a name="systems">Supported systems</a></h2>

-	<dd>Qt versions from 3.1 to 4.5</dd>
+      <p><span class="application">Recoll</span> has been compiled
+      and tested on FreeBSD, Linux, Darwin and Solaris (initial
+      versions FreeBSD 5, Redhat 7, Fedora Core 5, Suse 10, Gentoo,
+      Debian 3.1, Solaris 8). It should compile and run on all
+      subsequent releases of these systems and probably a few
+      others too.</p>

-        <dt><a name="doctypes">Document types</a></dt>
-	<dd>Recoll can index many document types (along with their
-          compressed versions). Some types are handled internally (no
-          external application needed). Other types need some application to
-          be installed to extract the text. Types that only need common
-          very common utilities (awk/sed/groff etc.) are listed in the
-          native section.</dd>
+      <p>Qt versions from 3.1 to 4.7</p>

-          <dl>
-            <dt>Natively</dt>
+      <h2><a name="doctypes">Document types</a></h2>

-            <dd>
-              <ul>
-                <li><span class="literal">text</span>.</li>
+      <p>Recoll can index many document types (along with their
+      compressed versions). Some types are handled internally (no
+      external application needed). Other types need a separate
+      application to be installed to extract the text. Types that
+      only need very common utilities (awk/sed/groff etc.) are
+      listed in the native section.</p>

-                <li><span class="literal">html</span>.</li>
+      <h4>File types indexed natively</h4>

-                <li><span class="literal">maildir</span> and <span
-		    class="literal">mailbox</span> (<span class=
-		    "literal">Mozilla</span>, <span class=
-		    "literal">Thunderbird</span> and <span class=
-		    "literal">Evolution</span> mail ok).</li>
+      <ul>
+        <li><span class="literal">text</span>.</li>

-                <li><span class="literal">OpenOffice</span>
-                files (needs <span class="command">unzip</span> command).</li>
+        <li><span class="literal">html</span>.</li>

-                <li><span class="literal">Abiword</span> files.</li>
+        <li><span class="literal">maildir</span> and <span class=
+        "literal">mailbox</span> (<span class=
+        "literal">Mozilla</span>, <span class=
+        "literal">Thunderbird</span> and <span class=
+        "literal">Evolution</span> mail ok).</li>

-                <li><span class="literal">Kword</span> files.</li>
+        <li><span class="literal">gaim</span> and <span class=
+        "literal">purple</span> log files.</li>

-                <li><span class="literal">gaim</span> and <span
-                    class="literal">purple</span> log files.</li> 
+        <li><span class="literal">Lyx</span> files (needs <span
+        class="literal">Lyx</span> to be installed).</li>

-                <li><span class="literal">Lyx</span> files (needs
-		  <span class="literal">Lyx</span> to be installed).</li>
+        <li><span class="literal">Scribus</span> files.</li>

-                <li><span class="literal">Scribus</span> files.</li>
-
-                <li><span class="literal">Man pages</span> (need <span
-                class="command">groff</span>).</li> 
-
-              </ul>
-            </dd>
-
-            <dt>With external helpers</dt>
-
-            <dd>
-            <para>In addition to the applications listed below, many
-            document types need the <span
-            class="command">iconv</span> command.</para>
-
-              <ul>
-                <li><span class="literal">Microsoft Office Open XML</span>
-                files with the <span class="command">unzip</span>
-                and <span class="command">xsltproc</span> commands.</li>
-
-                <li><span class="literal">pdf</span> with the <span
-                class="command">pdftotext</span> command, which can be
-                installed as part of <a href=
-                "http://www.foolabs.com/xpdf/">xpdf</a> or <a
-                href="http://poppler.freedesktop.org/">poppler</a>,
-                depending on your distribution.</li>
-
-                <li><span class="literal">msword</span> with <a href=
-                "http://www.winfield.demon.nl/">antiword</a>.</li>
-
-                <li><span class="literal">Powerpoint</span> and 
-		  <span class="literal">Excel</span> with the
-		  <a href="http://catdoc.klik.atekon.de">
-		    catdoc</a> utilities.</li>
-
-                <li><span class="literal">CHM (Microsoft help)</span>
-                  files (needs <span class="command">Python, pychm or
-                    chmlib</span>).</li> 
-
-                <li><span class="literal">Zip</span>
-                  archives (needs <span class="command">Python</span>).</li>
-
-                <li><span class="literal">iCalendar</span>(.ics) files
-                  (needs <span class="command">Python, 
-<a href="http://pypi.python.org/pypi/icalendar/2.1">icalendar</a></span>).</li> 
-
-                <li><span class="literal">Mozilla calendar data</span>
-                    See <a href="http://bitbucket.org/medoc/recoll/wiki/IndexMozillaCalendari">
-                    the wiki</a> about this.</li>
-
-                <li><span class="literal">Wordperfect</span> with <a href=
-                "http://libwpd.sourceforge.net">libwpd</a>.</li>
-
-                <li><span class="literal">postscript</span> with 
-	          <a href="http://www.gnu.org/software/ghostscript/ghostscript.html">
-		    ghostscript</a> and 
-		  <a href="http://www.cs.wisc.edu/~ghost/doc/pstotext.htm">
-		    pstotext</a>.
-		  Actually the pstotext 1.9 found at the latter link
-		  has a problem with file names using special shell
-		  characters, and you should either use the version
-		  packaged for your system which is probably patched,
-		  or apply the Debian patch which is
-		  stored <a href="files/pstotext-1.9_4-debian.patch">here</a>
-		  for convenience. See
-		  http://packages.debian.org/squeeze/pstotext and
-		  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=356988
-		  for references/explanations.</li>
-
-                <li><span class="literal">rtf</span> with <a href=
-                "http://www.gnu.org/software/unrtf/unrtf.html">unrtf</a>.</li>
-
-		<li><span class="literal">TeX</span> with
-		  <span class="command">untex</span>. If there is no untex
-		  package for your distribution, 
-		  <a href="untex/untex-1.3.jf.tar.gz">a source package is
-		    stored on this site</a> (as untex has no obvious
-		    home).
-		  Will also work
-		  with <a
-		  href="http://www.cs.purdue.edu/homes/trinkle/detex/">detex</a>
-		  if this is installed.
-		</li>
-
-		<li><span class="literal">dvi</span> with 
-		  <a href="http://www.radicaleye.com/dvips.html">dvips</a>.
-		</li>
-
-		<li><span class="literal">djvu</span> with 
-		  <a href="http://djvu.sourceforge.net">DjVuLibre</a>. 
-		</li>
-		<li><span class="literal">mp3/flac/ogg vorbis</span>
-		  tags support with  
-		  <a href="http://id3lib.sourceforge.net/">id3info (id3lib)
-		  </a> (compiling id3lib on recent systems may need
-		  a small patch, see <a href="id3lib.html">here.</a>) or
-		  the ogg and flac tools. Release 1.14 and later use a
-		  python filter based on 
-		  <a href="http://code.google.com/p/mutagen/">mutagen</a>
-		  for all audio tags. 
-		</li>
-		<li>Image file tags support with 
-		  <a href="http://www.sno.phy.queensu.ca/~phil/exiftool/">
-		    exiftool</a>. This is a perl program, so you also
-		    need perl on the system. This works with about any
-		  possible image file and tag format (jpg, png, tiff,
-		  gif etc.).
-		</li>
-
-              </ul>
-            </dd>
-          </dl>
-	</dd>
-
-	<dt>Other features</dt>
-	<dd>
-	  <ul>
-	    <li>Can use <b>Beagle</b> browser plug-ins to index web
-	       history. See the 
-               <a href="http://bitbucket.org/medoc/recoll/wiki/IndexBeagleWeb">
-               the Wiki</a> for more detail.</li>
-
-	    <li>Processes all email attachments.</li>
-
-	    <li>Multiple selectable databases.</li>
-
-	    <li>Powerful query facilities, with boolean searches,
-	      phrases, filter on file types and directory tree.</li>
-
-	    <li>Xesam-compatible query language.</li>
-
-	    <li>Wildcard searches (with a specific and faster function for
-	      file names).</li>
-
-	    <li>Support for multiple charsets. Internal processing and
-	      storage uses Unicode UTF-8.</li>
-
-	    <li><a href="#Stemming">Stemming</a> performed at query
-	      time (can switch stemming language after indexing).</li>
-
-	    <li>Easy installation. No database daemon, web server or
-	      exotic language necessary.</li>
-
-	    <li>An indexer which runs either as a thread inside the GUI,
-	      as an external, batch, cron'able program, or as a
-	      real-time indexing daemon.</li>
-	  </ul>
-	</dd>
+        <li><span class="literal">Man pages</span> (need <span
+        class="command">groff</span>).</li>
      </ul>

+      <h4>File types indexed with external helpers</h4>
+
+      <p>Many document types need the <span class="command">iconv</span>
+      command in addition to the applications specifically listed.</p>
+
+      <p>The following types need <span class=
+      "command">xsltproc</span> from the <b>libxslt</b> package.
+      Quite a few also need <span class="command">unzip</span>:</p>
+
+      <ul>
+        <li><span class="literal">Abiword</span> files.</li>
+
+        <li><span class="literal">Fb2</span> ebooks.</li>
+
+        <li><span class="literal">Kword</span> files.</li>
+
+        <li><span class="literal">Microsoft Office Open XML</span>
+        files.</li>
+
+        <li><span class="literal">OpenOffice</span> files.</li>
+
+        <li><span class="literal">SVG</span> files.</li>
+      </ul>
+
+      <p>Others:</p>
+
+      <ul>
+        <li><span class="literal">pdf</span> with the <span class=
+        "command">pdftotext</span> command, which can be installed
+        as part of <a href="http://www.foolabs.com/xpdf/">xpdf</a>
+        or <a href="http://poppler.freedesktop.org/">poppler</a>,
+        depending on your distribution.</li>
+
+        <li><span class="literal">msword</span> with <a href=
+        "http://www.winfield.demon.nl/">antiword</a>.</li>
+
+        <li><span class="literal">Powerpoint</span> and <span
+        class="literal">Excel</span> with the <a href=
+        "http://catdoc.klik.atekon.de">catdoc</a> utilities.</li>
+
+        <li><span class="literal">CHM (Microsoft help)</span> files
+        (needs <span class="command">Python, pychm or
+        chmlib</span>).</li>
+
+        <li><span class="literal">Zip</span> archives (needs <span
+        class="command">Python</span>).</li>
+
+        <li><span class="literal">iCalendar</span>(.ics) files
+        (needs <span class="command">Python, <a href=
+        "http://pypi.python.org/pypi/icalendar/2.1">icalendar</a></span>).</li>
+
+        <li><span class="literal">Mozilla calendar data</span> See
+        <a href=
+        "http://bitbucket.org/medoc/recoll/wiki/IndexMozillaCalendari">
+        the wiki</a> about this.</li>
+
+        <li><span class="literal">Wordperfect</span> with <a href=
+        "http://libwpd.sourceforge.net">libwpd</a>.</li>
+
+        <li><span class="literal">postscript</span> with <a href=
+        "http://www.gnu.org/software/ghostscript/ghostscript.html">ghostscript</a>
+        and <a href=
+        "http://www.cs.wisc.edu/~ghost/doc/pstotext.htm">pstotext</a>.
+        Actually the pstotext 1.9 found at the latter link has a
+        problem with file names using special shell characters, and
+        you should either use the version packaged for your system
+        which is probably patched, or apply the Debian patch which
+        is stored <a href=
+        "files/pstotext-1.9_4-debian.patch">here</a> for
+        convenience. See
+        http://packages.debian.org/squeeze/pstotext and
+        http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=356988 for
+        references/explanations.</li>
+
+        <li><span class="literal">RTF</span> files with <a href=
+        "http://www.gnu.org/software/unrtf/unrtf.html">unrtf</a>. Please
+        note that up to version
+        0.21, <span class="command">unrtf</span> mostly does not work
+        with non-Western-European character sets. If you need to
+        index, for example, Russian or Chinese RTF files, I have
+        produced a modified version which works much better (as
+        indicated by my tests and a few external ones). You can
+        download the <a href="unrtf/unrtf-0.22.0beta.tar.gz">source
+        here</a>. The development is hosted
+        on <a href="http://www.bitbucket.org/medoc/unrtf-int">
+         bitbucket.org</a>.</li>  
+
+        <li><span class="literal">TeX</span> with <span class=
+        "command">untex</span>. If there is no untex package for
+        your distribution, <a href="untex/untex-1.3.jf.tar.gz">a
+        source package is stored on this site</a> (as untex has no
+        obvious home). Will also work with <a href=
+        "http://www.cs.purdue.edu/homes/trinkle/detex/">detex</a>
+        if this is installed.</li>
+
+        <li><span class="literal">dvi</span> with <a href=
+        "http://www.radicaleye.com/dvips.html">dvips</a>.</li>
+
+        <li><span class="literal">djvu</span> with <a href=
+        "http://djvu.sourceforge.net">DjVuLibre</a>.</li>
+
+        <li>Audio file tags: Recoll releases 1.13 and older use <a
+        href="http://id3lib.sourceforge.net/">id3info (id3lib)</a>
+        (compiling id3lib on recent systems may need a small patch,
+        see <a href="id3lib.html">here.</a>) or the ogg and flac
+        tools.<br>
+         Recoll releases 1.14 and later use a Python filter based
+        on <a href="http://code.google.com/p/mutagen/">mutagen</a>
+        for all audio types.</li>
+
+        <li>Image file tags support with <a href=
+        "http://www.sno.phy.queensu.ca/~phil/exiftool/">exiftool</a>.
+        This is a perl program, so you also need perl on the
+        system. This works with about any possible image file and
+        tag format (jpg, png, tiff, gif etc.).</li>
+      </ul>
+
+      <h2>Other features</h2>
+
+      <ul>
+        <li>Can use <b>Beagle</b> browser plug-ins to index web
+        history. See <a href=
+        "http://bitbucket.org/medoc/recoll/wiki/IndexBeagleWeb">the
+        Wiki</a> for more detail.</li>
+
+        <li>Processes all email attachments.</li>
+
+        <li>Multiple selectable databases.</li>
+
+        <li>Powerful query facilities, with boolean searches,
+        phrases, filter on file types and directory tree.</li>
+
+        <li>Xesam-compatible query language.</li>
+
+        <li>Wildcard searches (with a specific and faster function
+        for file names).</li>
+
+        <li>Support for multiple charsets. Internal processing and
+        storage uses Unicode UTF-8.</li>
+
+        <li><a href="#Stemming">Stemming</a> performed at query
+        time (can switch stemming language after indexing).</li>
+
+        <li>Easy installation. No database daemon, web server or
+        exotic language necessary.</li>
+
+        <li>An indexer which runs either as a thread inside the
+        GUI, as an external, batch, cron'able program, or as a
+        real-time indexing daemon.</li>
+      </ul>

      <h2><a name="#stemming"></a>Stemming</h2>

-      <p>Stemming is a process which transforms inflected words into
-      their most basic form. For example, <i>flooring</i>,
-      <i>floors</i>, <i>floored</i> would probably all be transformed
-      to <i>floor</i> by a stemmer for the English language.</p>
+      <p>Stemming is a process which transforms inflected words
+      into their most basic form. For example, <i>flooring</i>,
+      <i>floors</i>, <i>floored</i> would probably all be
+      transformed to <i>floor</i> by a stemmer for the English
+      language.</p>

      <p>In many search engines, the stemming process occurs during
-      indexing. The index will only contain the stemmed form of words,
-      with exceptions for terms which are detected as being probably
-      proper nouns (ie: capitalized). At query time, the terms entered
-      by the user are stemmed, then matched against the index.</p>
+      indexing. The index will only contain the stemmed form of
+      words, with exceptions for terms which are detected as being
+      probably proper nouns (i.e. capitalized). At query time, the
+      terms entered by the user are stemmed, then matched against
+      the index.</p>

      <p>This process results in a smaller index, but it has the
-	grave inconvenient of irrevocably losing information during
-	indexing.</p>
+      serious drawback of irrevocably losing information during
+      indexing.</p>

-      <p>Recoll works in a different way. No stemming is performed at
-	query time, so that all information gets into the index. The
-	resulting index is bigger, but most people probably don't care
-	much about this nowadays, because they have a 100Gb disk 95%
-	full of binary data <em>which does not get indexed</em>.</p>
-      <p>At the end of an indexing pass, Recoll builds one or several
-	stemming dictionaries, where all word stems are listed in
-	correspondence to the list of their derivatives.</p>
+      <p>Recoll works in a different way. No stemming is performed
+      at indexing time, so that all information gets into the index.
+      The resulting index is bigger, but most people probably don't
+      care much about this nowadays, because they have a 100 GB disk
+      95% full of binary data <em>which does not get
+      indexed</em>.</p>
+
+      <p>At the end of an indexing pass, Recoll builds one or
+      several stemming dictionaries, where all word stems are
+      listed in correspondence to the list of their
+      derivatives.</p>

      <p>At query time, by default, user-entered terms are stemmed,
-	then matched against the stem database, and the query is
-	expanded to include all derivatives. This will yield search
-	results analogous to those obtained by a classical engine.
-	The benefits of this approach is that stem expansion can be
-	controlled instantly at query time in several ways:
-	<ul>
-	<li>It can be selectively turned-off for any query term by
-	  capitalizing it (<i>Floor</i>).</li>
-	<li>The stemming language (ie: english, french...) can be
-	  selected (this supposes that several stemming databases have
-	  been built, which can be configured as part of the indexing,
-	  or done later, in a reasonably fast way).</li>
+      then matched against the stem database, and the query is
+      expanded to include all derivatives. This will yield search
+      results analogous to those obtained by a classical engine.
+      The benefit of this approach is that stem expansion can be
+      controlled instantly at query time in several ways:</p>
+
+      <ul>
+        <li>It can be selectively turned off for any query term by
+        capitalizing it (<i>Floor</i>).</li>
+
+        <li>The stemming language (e.g. English, French...) can be
+        selected (this assumes that several stemming databases
+        have been built, which can be configured as part of the
+        indexing, or done later, in a reasonably fast way).</li>
      </ul>
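The query-time expansion described above can be sketched in a few lines of Python. This is only an illustration, not Recoll's code: `toy_stem`, `build_stem_dictionary` and `expand_query_term` are hypothetical names, the suffix-stripping function stands in for a real English stemmer, and the sketch assumes that index terms are stored in lower case.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stem_dictionary(index_terms):
    """Map each stem to the set of unstemmed index terms that reduce to it.

    This mirrors the dictionary built at the end of an indexing pass.
    """
    stems = defaultdict(set)
    for term in index_terms:
        stems[toy_stem(term)].add(term)
    return stems


def expand_query_term(term, stem_dictionary):
    """Return the set of index terms to search for one user-entered term."""
    if term[:1].isupper():
        # Capitalized term: stem expansion is turned off for this term.
        return {term.lower()}
    derivatives = stem_dictionary.get(toy_stem(term), set())
    # Always keep the term itself, even if the dictionary has no entry.
    return derivatives | {term.lower()}


# The index keeps every word form, because nothing is stemmed at indexing time.
index_terms = ["floor", "floors", "flooring", "floored", "door"]
stem_dictionary = build_stem_dictionary(index_terms)

print(sorted(expand_query_term("floors", stem_dictionary)))
print(sorted(expand_query_term("Floor", stem_dictionary)))
```

Because the index itself is never stemmed, switching stemming language in this sketch would only mean building another dictionary from the same index terms with a different stemming function.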
-	
    </div>
  </body>
 </html>
--- a/website/index.html.en
+++ b/website/index.html.en
@ -104,16 +104,14 @@
 	  </ul>
 	</li>

-	<li>2010-04-14 :
-	  Recoll <a href="download.html#source">1.13.04</a> is out. It
-	  fixes a nasty bug (broken stemming) in 1.13.02.</li>
-
        <li>2010-01-29 : the full Recoll source repository is now
-          hosted on
-          <a href="http://bitbucket.org/medoc/recoll">Bitbucket</a>, along
-          with a Wiki and an 
-          <a href="http://bitbucket.org/medoc/recoll/issues">issues tracking
-            system</a>. Hopefully, this
+          hosted on 
+	  <a href="http://bitbucket.org/medoc/recoll">Bitbucket</a>,
+          along with a Wiki
+          (<a href="http://bitbucket.org/medoc/recoll/wiki/FaqsAndHowTos">
+          Faqs and Howtos</a>) and an
+          <a href="http://bitbucket.org/medoc/recoll/issues">
+	    issues tracking system</a>. Hopefully, this
          new channel for reporting bugs and making suggestions will
          increase the feedback rate...</li>

--- a/website/index.html.fr
+++ b/website/index.html.fr
@ -135,6 +135,10 @@
 	contributions of code or suggestions, see the
 	<a class="important" href="credits.html">Credits</a> page.</p>

+      <h2>Other</h2>
+      <p>I rent out a
+	<a href="http://www.metairie-enbor.com/index.html">
+	  nice big house in the Aude</a> :)</p>

    </div>
  </body>