diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml
index 57a99f55..5acc4386 100644
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -3356,36 +3356,71 @@ application/x-chm = execm rclchm
 Filter HTML output
 The output HTML could be very minimal like the following
- example:
-
- <html><head>
-<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
-</head>
-<body>some text content</body></html>
+ example:
+
+<html>
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ </head>
+ <body>
+ Some text content
+ </body>
+</html>
+
 You should take care to escape some
- characters inside
- the text by transforming them into appropriate
- entities. "&" should be transformed into
+ characters inside the text by transforming them into
+ appropriate entities. At the very minimum,
+ "&" should be transformed into
 "&amp;", "<" should be transformed into
 "&lt;". This is not always properly done by translating
 programs which output HTML, and of
- course never by those which output plain text.
+ course never by those which output plain text.
+
+ When encapsulating plain text in an HTML body,
+ the display of a preview may be improved by enclosing the
+ text inside <pre> tags.
 The character set needs to be specified in the
 header. It does not need to be UTF-8 (&RCL; will take care
 of translating it), but it must be accurate for good
 results.
- &RCL; will also make use of other header fields if
- they are present: title,
- description,
- keywords.
+ &RCL; will process meta tags inside
+ the header as possible document fields candidates. Documents
+ fields can be processed by the indexer in different ways,
+ for searching or displaying inside query results. This is
+ described in a following
+ section.
+
+
+ By default, the indexer will process the standard header
+ fields if they are present: title,
+ meta/description,
+ and meta/keywords are both indexed and stored
+ for query-time display.
+
+ A predefined non-standard meta tag
+ will also be processed by &RCL; without further
+ configuration: if a date tag is present
+ and has the right format, it will be used as the document
+ date (for display and sorting), in preference to the file
+ modification date. The date format should be as follows:
+
+<meta name="date" content="YYYY-mm-dd HH:MM:SS">
+or
+<meta name="date" content="YYYY-mm-ddTHH:MM:SS">
+
+ Example:
+
+<meta name="date" content="2013-02-24 17:50:00">
+
+
 Filters also have the possibility to "invent" field
- names. This should be output as meta tags:
+ names. This should also be output as meta tags:
 <meta name="somefield" content="Some textual data" />
@@ -3401,8 +3436,9 @@ application/x-chm = execm rclchm
 <meta name="somefield" markup="html" content="Some <i>textual</i> data" />
- See the following section for details about configuring
- how field data is processed by the indexer.
+ As written above, the processing of fields is described
+ in a further
+ section.
diff --git a/src/sampleconf/mimeview b/src/sampleconf/mimeview
index 5f2ad8c2..e71eb645 100644
--- a/src/sampleconf/mimeview
+++ b/src/sampleconf/mimeview
@@ -17,7 +17,7 @@
 # - For pages of CHM and EPUB documents where we can choose to open the
 # parent document instead of a temporary html file.
 xallexcepts = application/pdf application/postscript application/x-dvi \
- text/html|gnuinfo text/html|chm text/html|epub
+ text/html|gnuinfo text/html|chm text/html|epub
 [view]
 # Pseudo entry used if the 'use desktop' preference is set in the GUI
diff --git a/website/doc.html b/website/doc.html
index 9b801e6f..7eb273f9 100644
--- a/website/doc.html
+++ b/website/doc.html
@@ -46,6 +46,10 @@

Other documentation
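
Reviewer note on the usermanual.sgml hunk: the filter output rules it documents (escape "&" and "<", wrap plain text in <pre>, declare the character set, optionally emit a date meta tag and custom field meta tags) could be sketched roughly as below. This is only an illustration in Python, not code from the Recoll tree. The helper name `wrap_plain_text` and its parameters are hypothetical; only the HTML layout and the date format come from the manual text in the diff.

```python
import html
from datetime import datetime


def wrap_plain_text(text, title="", date=None, fields=None):
    """Wrap plain text in a minimal HTML document (hypothetical helper).

    Assumption: the filter emits UTF-8, so the charset declared in the
    header is accurate, as the manual requires.
    """
    head = ['<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">']
    if title:
        head.append("<title>%s</title>" % html.escape(title))
    if date is not None:
        # Date format documented in the diff: YYYY-mm-dd HH:MM:SS
        stamp = date.strftime("%Y-%m-%d %H:%M:%S")
        head.append('<meta name="date" content="%s">' % stamp)
    # "Invented" fields are output as additional meta tags.
    for name, value in (fields or {}).items():
        head.append('<meta name="%s" content="%s" />'
                    % (html.escape(name), html.escape(value)))
    # html.escape turns "&" into "&amp;" and "<" into "&lt;" (and ">" into "&gt;").
    body = "<pre>%s</pre>" % html.escape(text, quote=False)
    return "<html><head>\n%s\n</head>\n<body>%s</body></html>" % (
        "\n".join(head), body)


if __name__ == "__main__":
    print(wrap_plain_text("a < b & c", title="Example",
                          date=datetime(2013, 2, 24, 17, 50, 0),
                          fields={"somefield": "Some textual data"}))
```

Attribute values go through `html.escape` with its default `quote=True`, so a double quote inside a field value cannot end the `content` attribute early.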