The RECOLL_FILTER_FORPREVIEW environment
@@ -6196,13 +6195,32 @@ application/x-chm = execm rclchm
-The output HTML could be very minimal like the
-following example:
+Both the simple and persistent input handlers can
+ return any MIME type to Recoll, which will further
+ process the data according to the MIME configuration.
+
+Most input filters produce either text/plain or text/html data. There are exceptions:
+ for example, filters which process archive files
+ (zip, tar, etc.) will usually return the
+ documents as they are found, without processing them
+ further.
There is nothing to say about text/plain output, except that its
+ character encoding should be consistent with what is
+ specified in the mimeconf
+ file.
+For filters producing HTML, the output could be very
+ minimal like the following example:
<html>
<head>
@@ -6222,9 +6240,9 @@ application/x-chm = execm rclchm
"&amp;", "<" should be transformed into
"&lt;". This is not
- always properly done by translating programs which output
- HTML, and of course never by those which output plain
- text.
+ always properly done by external helper programs which
+ output HTML, and of course never by those which output
+ plain text.
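
As an illustration of the quoting rule above, here is a minimal Python sketch of how a handler that wraps plain text in HTML might quote it. The function name `quote_for_html` and the sample string are hypothetical, chosen for this example only; `html.escape` is from the Python standard library, not from Recoll.

```python
import html


def quote_for_html(text):
    """Quote the characters that are special in HTML element content."""
    # html.escape replaces "&" first, then "<" and ">", so the "&" of an
    # entity it has just produced is never escaped a second time.
    # quote=False leaves single and double quotes alone, which is enough
    # for text placed in the body rather than in an attribute.
    return html.escape(text, quote=False)


sample = "AT&T <support@example.com>"
print(quote_for_html(sample))
```

Doing the quoting in one place, just before the text is written into the HTML body, avoids both unquoted and double-quoted output.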
When encapsulating plain text in an HTML body, the
display of a preview may be improved by enclosing the
@@ -6293,6 +6311,17 @@ or
described in a further
section.
+
+ Persistent filters can use another, probably simpler,
+ method to produce metadata, by calling the setfield() helper method. This avoids
+ the need to produce HTML, and any issue with HTML
+ quoting. See, for example, rclaudio in Recoll 1.23 and later,
+ a handler which outputs text/plain and uses setfield() to produce metadata.
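
The idea can be sketched in Python as follows. This is not Recoll code: `StandInExecM` is a hypothetical stand-in that only records calls, `TagTextExtractor` is an invented handler, and the two-argument shape `setfield(name, value)` is an assumption made for this sketch. Refer to rclaudio for the actual usage.

```python
class StandInExecM:
    """Hypothetical stand-in for the object a persistent handler talks to.

    It is NOT Recoll's class: it only records setfield() calls so that
    the sketch can run on its own.
    """

    def __init__(self):
        self.fields = {}

    def setfield(self, name, value):
        # Assumed signature: a field name and its value.
        self.fields[name] = value


class TagTextExtractor:
    """Hypothetical handler: returns text/plain, passes metadata as fields."""

    def __init__(self, em):
        self.em = em

    def extract(self, tags):
        # Metadata goes through setfield(), so no HTML quoting is needed.
        for name in ("title", "author"):
            if name in tags:
                self.em.setfield(name, tags[name])
        # The document body is plain text, one "name: value" pair per line.
        return "\n".join(f"{k}: {v}" for k, v in sorted(tags.items()))


em = StandInExecM()
text = TagTextExtractor(em).extract(
    {"title": "Rock & Roll <live>", "author": "A. Band"}
)
print(em.fields)
print(text)
```

Note that the title contains `&` and `<` and is passed through unchanged, which is the point of avoiding HTML output for metadata.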
skippedNames-
List of name endings to remove from the default
+ skippedNames list.
+skippedNames+
List of name endings to add to the default
+ skippedNames list.
+noContentSuffixes
noContentSuffixes-
List of name endings to remove from the default
+ noContentSuffixes list.
+noContentSuffixes+
List of name endings to add to the default
+ noContentSuffixes list.
+skippedPaths
nomd5mimetypes
Don't compute md5 for these types. md5 checksums
+ are used only for deduplicating results, and can be
+ very expensive to compute on multimedia or other
+ big files. This list lets you turn off md5
+ computation for selected types. It is global (no
+ redefinition for subtrees). At the moment, it only
+ has an effect for external handlers (exec and
+ execm). The file types can be specified by listing
+ either MIME types (e.g. audio/mpeg) or handler
+ names (e.g. rclaudio).
+All extension values in mimemap must be entered in lower case.
+ File name extensions are lower-cased for comparison
+ during indexing, meaning that an upper-case mimemap entry will never be
+ matched.
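
The effect of the lower-casing rule can be illustrated with a small Python sketch. It models mimemap as a plain dictionary, which is a simplification for illustration and not Recoll's implementation; `lookup_mime` and the sample entries are hypothetical.

```python
import os


def lookup_mime(filename, mimemap):
    """Look up a MIME type by extension, lower-casing the extension first."""
    # Only the file name's extension is lower-cased, not the map keys,
    # so a key written in upper case can never be reached.
    ext = os.path.splitext(filename)[1].lower()
    return mimemap.get(ext)


# ".DOC" is deliberately entered in upper case to show the problem.
mimemap = {".pdf": "application/pdf", ".DOC": "application/msword"}

print(lookup_mime("REPORT.PDF", mimemap))  # application/pdf
print(lookup_mime("letter.DOC", mimemap))  # None
```

The second lookup fails even though the file name matches the entry exactly, because the comparison is made against the lower-cased extension `.doc`.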
The mappings can be specified on a per-subtree basis,
which may be useful in some cases. Example: okular notes have a