diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml
index b8f278d6..8c7cc7a2 100644
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -1,7 +1,8 @@
Recoll">
-
+Recoll helper applications page">
+
Xapian">
]>
@@ -2620,138 +2621,119 @@ while query.next >= 0 and query.next < nres:
specific file type).
After an indexing pass, the commands that were found
- missing can be displayed from the recoll
- File menu. The list is stored in the
- missing text file inside the configuration
- directory.
+ missing can be displayed from the recoll
+ File menu. The list is stored in the
+ missing text file inside the configuration
+ directory.
A list of common file types which need external
commands follows. Many of the filters need the
iconv command, which is not always listed as a
dependancy.
- As of &RCL; release 1.14, a number of XML-based formats that
- were handled by ad hoc filter code now use
- xsltproc, which usually comes with
- libxslt. These
- are: abiword, fb2 (ebooks), kword, openoffice, svg.
+ Please note that, due to the relatively dynamic nature of this
+ information, the most up-to-date version is now kept on the &RCLAPPS;
+ along with links to the home pages or best source/patches download
+ links. The list below is not updated often and may be quite
+ stale.
+ For many Linux distributions, most of the commands listed can
+ be installed from the package repositories. However, the packages
+ are sometimes outdated, or not the best version for &RCL;, so you
+ should take a look at the &RCLAPPS; if a file
+ type is important to you.
+
+ As of &RCL; release 1.14, a number of XML-based formats that
+ were handled by ad hoc filter code now use
+ xsltproc, which usually comes with
+ libxslt. These are: abiword, fb2
+ (ebooks), kword, openoffice, svg.
+
+ Now for the list:
- Openoffice: supported natively, but needs the
- unzip command to be installed.
+ Openoffice files need unzip and
+ xsltproc.
+
+ PDF files need pdftotext which
+ is part of the Xpdf or
+ Poppler packages.
+
+ Postscript files need pstotext.
+ The original version has an issue with shell
+ characters in file names, which is corrected in recent
+ packages. See the &RCLAPPS; for more detail.
- PDF: pdftotext is part of the Xpdf or Poppler packages.
-
+ MS Word needs
+ antiword. It is also useful to have
+ wvWare installed as it may
+ be used as a fallback for some files which
+ antiword does not handle.
- Postscript:
- pstotext. The original version has an issue with shell
- character in file names. Most recent package repositories /
- ports system use a patched version (ie FreeBSD, Debian). If
- compiling from source, it would be better to apply the patch
- found
-
- here.
-
+ MS Excel and PowerPoint need
+ catdoc.
- MS Word:
- antiword.
-
+ MS Open XML (docx) needs
+ xsltproc.
- MS Excel and PowerPoint:
-
- catdoc.
-
+ Wordperfect files need wpd2html
+ from the libwpd package.
- MS Open XML (docx): needs
- xsltproc.
-
+ RTF files need unrtf, which, in
+ its standard version, has much trouble with non-western character
+ sets. Check the &RCLAPPS;.
- Wordperfect files:
-
- libwpd.
-
+ TeX files need untex or
+ detex. Check the &RCLAPPS; for sources if it's not
+ packaged for your distribution.
-
- RTF: unrtf
-
+ dvi files need dvips.
-
- TeX: &RCL; uses the untex
- program. Your distribution may have a package for it. If it doesn't,
-
- there is a copy of the source on the &RCL; web site,
- because the program has no obvious home. The filter can
- also work with
-
- detex and will use it if it is installed.
-
-
-
- dvi: dvips
-
-
-
- djvu:
- DjVuLibre
-
-
+ djvu files need djvutxt and
+ djvused from the
+ DjVuLibre package.
- mp3, flac, ogg vorbis: &RCL; releases before 1.13
- use the id3info command from the id3lib package to
- extract mp3 tag information. (Some gcc versions after 4.4 may have
- trouble compiling id3lib. You can find a
- workaround here), metaflac (standard flac tools) for flac
- files, and ogginfo (vorbis tools) for ogg files. Releases 1.14
- and later use a single Python filter based on
- mutagen
- for all audio file types.
+ Audio files: &RCL; releases before 1.13
+ used the id3info command from the
+ id3lib package to extract mp3 tag information,
+ metaflac (standard flac tools) for flac files,
+ and ogginfo (vorbis tools) for ogg
+ files. Releases 1.14 and later use a single
+ Python filter based
+ on mutagen for all audio file
+ types.
-
- Pictures: &RCL; uses the
-
- Exiftool Perl package to
- extract tag information. Most image file formats are
- supported. Note that there may not be much interest in indexing
- the technical tags (image size, aperture, etc.). This is only of
- interest if you store personal tags or textual descriptions inside
- the image files.
-
+ Pictures: &RCL; uses the
+ Exiftool
+ Perl package to extract tag
+ information. Most image file formats are supported. Note that
+ there may not be much interest in indexing the technical tags
+ (image size, aperture, etc.). This is only of interest if you
+ store personal tags or textual descriptions inside the image
+ files.
chm: files in microsoft help format need Python and
- the pychm
- module (which needs chmlib).
-
+ the pychm module (which needs
+ chmlib).
- ics: up to &RCL; 1.13, iCalendar files need Python
- and the icalendar module. For newer
- versions, icalendar is not needed
-
+ ICS: up to &RCL; 1.13, iCalendar files need
+ Python
+ and the icalendar
+ module. icalendar is not needed for newer
+ versions, which use internal code.
- zip: Zip archives need Python (and the standard
- zipfile module).
-
+ Zip archives need Python
+ (and the standard zipfile module).
- Text, HTML, mail folders, Openoffice and Scribus files
- are processed internally. Lyx is used to index Lyx files. Many
- filters need iconv and the standard
- sed and awk.
+ Text, HTML, mail folders, and Scribus files are
+ processed internally. Lyx is used to
+ index Lyx files. Many filters need iconv and the
+ standard sed and awk.
diff --git a/website/doc.html b/website/doc.html
index 34fdce2e..53e5a3d6 100644
--- a/website/doc.html
+++ b/website/doc.html
@@ -46,20 +46,13 @@
Index size and indexing performance
data.
- Faqs and Howtos are now kept in the
-
- Recoll Wiki on
- bitbucket.org.
-
- Current list of HowTos:
-
-Indexing Mozilla Sunbird / Lightning calendar data
-
+
+ Faqs and Howtos are now kept in the
+
+ Recoll Wiki on
+ bitbucket.org.
-
+