Updated filters for Recoll

The following sections describe new and updated filters. They will be part of the next release, but can be installed on the current release if you need them.

For updated filters, you just need to copy the script to the filters directory, which is typically either /usr/share/recoll/filters or /usr/local/share/recoll/filters.
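As a rough illustration of the copy step, the following Python sketch picks the first of the two usual filter directories that exists and copies a script into it. The function names are made up for this example, and setting the execute bit is an assumption (a downloaded copy may have lost it); a plain cp run with sufficient permissions does the same job.

```python
import os
import shutil
import stat

# The two usual locations named above; adjust if your install differs.
FILTER_DIRS = ("/usr/share/recoll/filters", "/usr/local/share/recoll/filters")


def find_filters_dir(candidates=FILTER_DIRS):
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


def install_filter(script, filters_dir):
    """Copy a filter script into filters_dir and mark it executable (hypothetical helper)."""
    dest = os.path.join(filters_dir, os.path.basename(script))
    shutil.copyfile(script, dest)
    # Assumption: the script must be executable for recoll to run it.
    mode = os.stat(dest).st_mode
    os.chmod(dest, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return dest
```

For instance, install_filter("rclsvg", find_filters_dir()) would copy a downloaded rclsvg script from the current directory, provided one of the two directories exists and is writable.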

For new filters, you'll need to copy the script file as above, possibly install the supporting application, and usually edit the mimemap, mimeview and mimeconf files, either in the shared directory (/usr[/local]/share/recoll/examples) or in your personal configuration directory ($HOME/.recoll or $RECOLL_CONFDIR).
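If you are unsure which personal configuration directory applies, the sketch below shows one way to work it out. It assumes that $RECOLL_CONFDIR, when set, takes precedence over $HOME/.recoll; the function names are invented for the example and are not part of Recoll.

```python
import os


def recoll_confdir(environ=None):
    """Return $RECOLL_CONFDIR if set and non-empty, else $HOME/.recoll."""
    env = os.environ if environ is None else environ
    confdir = env.get("RECOLL_CONFDIR")
    if confdir:
        # Assumption: the environment variable overrides the default location.
        return confdir
    home = env.get("HOME", os.path.expanduser("~"))
    return os.path.join(home, ".recoll")


def personal_config_files(environ=None):
    """List the three files a new filter usually requires editing."""
    base = recoll_confdir(environ)
    return [os.path.join(base, name) for name in ("mimemap", "mimeconf", "mimeview")]
```

Calling personal_config_files() with no argument uses the current environment and returns the paths to check or create.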

Alternatively, you can replace your 1.[8,9,10] system files with these updated and complete versions: mimemap, mimeconf, mimeview.

Notes:

Recoll 1.10.5: all filters are up to date in the release.

Recoll 1.10.0: only rclsvg, for Scalable Vector Graphics files, is missing.

Recoll 1.9: all filters are up to date in the release, except the rclimg image filter and the rcltex TeX filter.

Recoll 1.8: the image, kword, abiword and wordperfect filters can be installed in addition.

Open XML Office formats

Filter: rclopxml.

This needs xsltproc to be installed (if you run a reasonably recent Linux, it is probably on your system already).

The filters are certainly not perfect, but extract a good part of the text, which is probably better than nothing.

There are quite a few added lines in the configuration files, so the simplest approach is to fetch the new ones: mimemap, mimeconf, mimeview.

Scalable Vector Graphics filter

A new filter for SVG files: rclsvg. You'll have to add the following lines in the configuration files:

In mimemap:

.svg = image/svg+xml

In mimeconf, [index] section:

image/svg+xml = exec rclsvg

mimeconf, [icons] section:

image/svg+xml = drawing

mimeconf, [categories] section, also add image/svg+xml to the other list.

The filter is based on sed, so you don't need to install any external application.

In mimeview, or the [view] section of mimeconf for older recoll versions:

    image/svg+xml = inkview %f

(Or substitute your favorite editor).

TeX filter

A new filter for TeX files: rcltex. You'll have to add the following lines in the configuration files:

In mimemap:

.tex = application/x-tex

In mimeconf, [index] section:

    application/x-tex = exec rcltex

mimeconf, [icons] section:

application/x-tex = wordprocessing

mimeconf, [categories] section, also add application/x-tex to the texts list.

This filter uses either untex or detex, depending on which command is available. A copy of the source code for untex is stored here.
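The choice between the two commands can be pictured with the following Python sketch. It is not the filter's actual code; the function name and the order of preference (untex first) are assumptions made for illustration.

```python
import shutil


def pick_tex_stripper(candidates=("untex", "detex"), which=shutil.which):
    """Return the first candidate command found on PATH, or None if neither is installed."""
    for name in candidates:
        # shutil.which returns the full path of the command, or None if it is not found.
        if which(name):
            return name
    return None
```

A None result means neither command is installed, in which case one of them has to be installed before the filter can do anything useful.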

In mimeview, or the [view] section of mimeconf for older recoll versions:

    application/x-tex = gnuclient -q %f

(Or substitute your favorite editor).

A filter for image tags

A new filter for extracting tags from image and picture files: rclimg, by Cedric Scott. It is based on the ExifTool Perl application and library. You'll have to add the following lines in the configuration files:

In mimemap:

.jpeg = image/jpeg
.gif = image/gif
.tiff = image/tiff
.tif  = image/tiff

In mimeconf, [index] section:

image/gif = exec rclimg
image/jpeg = exec rclimg
image/png = exec rclimg
image/tiff = exec rclimg

Also remove the image/jpeg = exec rcljpeg line.

ExifTool supports many other image formats. To index them, add entries for the additional formats in the same way as above.

Wordperfect filter

A new filter for Wordperfect files: rclwpd. You'll have to add the following lines in the configuration files:

In mimemap:

.wpd = application/vnd.wordperfect

In mimeconf, [index] section:

    application/vnd.wordperfect = exec rclwpd

mimeconf, [icons] section:

application/vnd.wordperfect = wordprocessing

mimeconf, [categories] section, also add application/vnd.wordperfect to the texts list.

In mimeview, or the [view] section of mimeconf for older recoll versions:

    application/vnd.wordperfect = openoffice %f

Abiword filter

A new filter for abiword files: rclabw. You'll have to add the following lines in the configuration files:

In mimemap:

    .abw = application/x-abiword

In mimeconf:

    application/x-abiword = exec rclabw

In mimeview, or the [view] section of mimeconf for older recoll versions:

    application/x-abiword = abiword %f

Kword filter

A new filter for kword files: rclkwd. You'll have to add the following lines in the configuration files:

In mimemap:

    .kwd = application/x-kword

In mimeconf:

    application/x-kword = exec rclkwd

In mimeview, or the [view] section of mimeconf for older recoll versions:

    application/x-kword = kword %f

Openoffice filter

The filter script for all releases up to and including 1.7.5 had a bug on Debian and Ubuntu systems. You can download the corrected script.

Scribus filter

A new filter for Scribus files: rclscribus. This is only for the newer .sla files. I am willing to add support for the older .scd format if someone sends me a sample... You'll have to add the following lines in the configuration files:

In mimemap:

      .sla = application/x-scribus

In mimeconf:

      application/x-scribus = exec rclscribus

In mimeview, or the [view] section of mimeconf for older recoll versions:

      application/x-scribus = scribus %f

Do *not* add entries for .sla.gz: the normal recoll decompression process should handle them (hopefully...).

LyX filter

A new filter for LyX files: rcllyx. This probably has quite a few issues with character encoding, but it is also probably better than handling LyX documents as text files. You'll have to add the following lines in the configuration files:

In mimemap:

      .lyx = application/x-lyx

In mimeconf:

      application/x-lyx = exec rcllyx

In mimeview, or the [view] section of mimeconf for older recoll versions:

      application/x-lyx = lyx %f