recoll/website/filters/filters.html
2010-05-04 09:06:52 +02:00


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Recoll updated filters</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta name="Author" content="Jean-Francois Dockes">
<meta name="Description" content=
"recoll is a simple full-text search system for unix and linux
based on the powerful and mature xapian engine">
<meta name="Keywords" content=
"full text search, desktop search, unix, linux">
<meta http-equiv="Content-language" content="en">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="All,Index,Follow">
<link type="text/css" rel="stylesheet" href="../styles/style.css">
</head>
<body>
<div class="rightlinks">
<ul>
<li><a href="../index.html">Home</a></li>
<li><a href="../download.html">Downloads</a></li>
<li><a href="../usermanual/index.html">User manual</a></li>
<li><a href="../usermanual/rcl.install.html">Installation</a></li>
<li><a href="../index.html#support">Support</a></li>
</ul>
</div>
<div class="content">
<h1>Updated filters for Recoll</h1>
<p>The following sections describe new and updated filters, which will be
part of the next release but can be installed on the current
release if you need them.</p>
<p>For updated filters, you just need to copy the script to the
filters directory, which is typically either <span
class="filename">/usr/share/recoll/filters</span> or <span
class="filename">/usr/local/share/recoll/filters</span>.</p>
<p>For new filters, you'll need to copy the script file as
above, possibly install the supporting application, and usually
edit the
<span class="filename">mimemap</span>,
<span class="filename">mimeview</span> and
<span class="filename">mimeconf</span> files, either in the
shared directory
(<span class="filename">
/usr[/local]/share/recoll/examples</span>), or
in your personal configuration directory
(<span class="filename">$HOME/.recoll</span> or
<span class="filename">$RECOLL_CONFDIR</span>).</p>
<p>Alternatively, you can replace your 1.[8,9,10] system files with
these updated and complete versions:
<a href="mimemap">mimemap</a>
<a href="mimeconf">mimeconf</a>
<a href="mimeview">mimeview</a> </p>
<p>Notes:</p>
<blockquote>
<p>All filters are up to date in Recoll 1.10.5</p>
<p>Recoll 1.10.0: only <span class="filename">rclsvg</span> for
Scalable Vector Graphics files is missing.</p>
<p>Recoll 1.9: all filters are up to date in the release,
except the <span class="filename">rclimg</span> image
filter and the <span class="filename">rcltex</span> TeX filter.</p>
<p>Recoll 1.8: the image, <b>kword</b>,
<b>abiword</b> and <b>wordperfect</b> filters can be installed in
addition.</p>
</blockquote>
<h2>Open XML Office formats</h2>
<p>Filter: <a href="rclopxml">rclopxml</a>. </p>
<p>This needs <span class="command">xsltproc</span> to be
installed (if you run a reasonably recent Linux distribution, it is
probably on your system already). </p>
<p>The filters are certainly not perfect, but extract a good
part of the text, which is probably better than nothing.</p>
<p>There are quite a few added lines in the configuration
files, so just fetch the new ones:
<a href="mimemap">mimemap</a>
<a href="mimeconf">mimeconf</a>
<a href="mimeview">mimeview</a> </p>
<h2>Scalable Vector Graphics filter</h2>
<p>A new filter for <b>SVG</b> files:
<a href="rclsvg">rclsvg</a>.
You'll have to add the following lines in the configuration
files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre>.svg = image/svg+xml
</pre>
<p>In <span class="filename">mimeconf</span>, [index] section: </p>
<pre>image/svg+xml = exec rclsvg</pre>
<p><span class="filename">mimeconf</span>, [icons] section:</p>
<pre>image/svg+xml = drawing</pre>
<p><span class="filename">mimeconf</span>, [categories] section, also add
<tt>image/svg+xml</tt> to the <tt>other</tt> list.</p>
<p>The filter is based on <span class="command">sed</span>, so
you don't need to install any external application.</p>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> image/svg+xml = inkview %f</pre>
<p>(Or substitute your favorite editor).</p>
<h2>TeX filter</h2>
<p>A new filter for <b>TeX</b> files:
<a href="rcltex">rcltex</a>.
You'll have to add the following lines in the configuration
files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre>.tex = application/x-tex
</pre>
<p>In <span class="filename">mimeconf</span>, [index] section: </p>
<pre> application/x-tex = exec rcltex</pre>
<p>mimeconf, [icons] section:</p>
<pre>application/x-tex = wordprocessing</pre>
<p>mimeconf, [categories] section, also add
application/x-tex to the <tt>texts</tt> list.</p>
<p>This filter uses either <span class="command">untex</span>
or <a
href="http://www.cs.purdue.edu/homes/trinkle/detex/">detex</a>,
if the command is available. A copy of the
source code for untex is stored <a href="../untex/untex-1.3.jf.tar.gz">
here</a>.</p>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> application/x-tex = gnuclient -q %f</pre>
<p>(Or substitute your favorite editor).</p>
<h2>A filter for image tags</h2>
<p>A new filter for extracting tags from image and picture files:
<a href="rclimg">rclimg</a>, by Cedric Scott. It is based on
the <b>Exiftool</b> Perl application and library.
You'll have to add the following lines in the configuration
files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre>.jpeg = image/jpeg
.gif = image/gif
.tiff = image/tiff
.tif = image/tiff
</pre>
<p>In <span class="filename">mimeconf</span>, [index] section: </p>
<pre>image/gif = exec rclimg
image/jpeg = exec rclimg
image/png = exec rclimg
image/tiff = exec rclimg
</pre>
<p>And remove the <tt>image/jpeg = exec rcljpeg</tt> line.</p>
<p>Exiftool supports many other image formats; just add entries for any
additional ones in the same way as above.</p>
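<p>As an illustration only, here is a sketch of the entries for one more
format. It assumes that you want BMP files handled and that the
<tt>image/bmp</tt> MIME type name suits your system; both the choice of
format and the type name are assumptions, not part of the shipped
configuration. In <span class="filename">mimemap</span>: </p>
<pre>.bmp = image/bmp
</pre>
<p>In <span class="filename">mimeconf</span>, [index] section: </p>
<pre>image/bmp = exec rclimg
</pre>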
<h2>Wordperfect filter</h2>
<p>A new filter for <b>Wordperfect</b> files:
<a href="rclwpd">rclwpd</a>.
You'll have to add the following lines in the configuration
files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre>.wpd = application/vnd.wordperfect
</pre>
<p>In <span class="filename">mimeconf</span>, [index] section: </p>
<pre> application/vnd.wordperfect = exec rclwpd</pre>
<p>mimeconf, [icons] section:</p>
<pre>application/vnd.wordperfect = wordprocessing</pre>
<p>mimeconf, [categories] section, also add
application/vnd.wordperfect to the <tt>texts</tt> list.</p>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> application/vnd.wordperfect = openoffice %f</pre>
<h2>Abiword filter</h2>
<p>A new filter for <a href="http://www.abisource.com/">
abiword</a> files: <a href="rclabw">
rclabw</a>.
You'll have to add the following lines in the configuration
files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre> .abw = application/x-abiword</pre>
<p>In <span class="filename">mimeconf</span>: </p>
<pre> application/x-abiword = exec rclabw</pre>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> application/x-abiword = abiword %f</pre>
<h2>Kword filter</h2>
<p>A new filter for <a href="http://www.kde.org/whatiskde/koffice.php/">
kword</a> files: <a href="rclkwd">
rclkwd</a>.
You'll have to add the following lines in the configuration
files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre> .kwd = application/x-kword</pre>
<p>In <span class="filename">mimeconf</span>: </p>
<pre> application/x-kword = exec rclkwd</pre>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> application/x-kword = kword %f</pre>
<h2>Openoffice filter</h2>
<p>The filter script for all releases up to and including 1.7.5 had
a bug on Debian and Ubuntu systems. You can download the <a
href="rclsoff">corrected script</a>.</p>
<h2>Scribus filter</h2>
<p>A new filter for <a href="http://www.scribus.net/">
Scribus</a> files: <a href="rclscribus">
rclscribus</a>. This is only for the newer
<em>.sla</em> files. I am willing to add support for the older
<em>.scd</em> format if someone sends me a sample... You'll
have to add the following lines in the configuration files:</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre> .sla = application/x-scribus</pre>
<p>In <span class="filename">mimeconf</span>: </p>
<pre> application/x-scribus = exec rclscribus</pre>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> application/x-scribus = scribus %f</pre>
<p>Do <b>not</b> add entries for <em>.sla.gz</em>: the normal recoll
decompression process will handle them (hopefully...).</p>
<h2>Lyx filter</h2>
<p>A new filter for <a href="http://www.lyx.org/">
Lyx</a> files: <a href="rcllyx">rcllyx</a>.
This probably has quite a few issues with character encoding,
but it's also probably better than handling lyx documents as
text files.</p>
<p>In <span class="filename">mimemap</span>: </p>
<pre> .lyx = application/x-lyx</pre>
<p>In <span class="filename">mimeconf</span>: </p>
<pre> application/x-lyx = exec rcllyx</pre>
<p>In
<span class="filename">mimeview</span>, or the <em>[view]</em>
section of
<span class="filename">mimeconf</span> for older recoll versions: </p>
<pre> application/x-lyx = lyx %f</pre>
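<p>Put together, a personal configuration in
<span class="filename">$HOME/.recoll</span> might look like the sketch
below. The values are the ones given above; placing the
<tt>exec</tt> line in the [index] section (as for the other filters on
this page) and the viewer line under a [view] header in
<span class="filename">mimeview</span> are assumptions that you should
check against your installed files. The comment lines only indicate
which file each fragment belongs to.</p>
<pre># In $HOME/.recoll/mimemap
.lyx = application/x-lyx

# In $HOME/.recoll/mimeconf (assumed section: [index])
[index]
application/x-lyx = exec rcllyx

# In $HOME/.recoll/mimeview (assumed section: [view])
[view]
application/x-lyx = lyx %f
</pre>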
</div>
</body>
</html>