*** empty log message ***
parent bf4a2ccf5d
commit 6ed2673331
119
src/INSTALL
@ -23,40 +23,35 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
4.4. Configuration overview
|
||||
|
||||
4.5. Extending Recoll
|
||||
|
||||
4.1. Installing a prebuilt copy
|
||||
|
||||
Recoll binary installations are always linked statically to the xapian
|
||||
libraries, and have no other dependencies. You will only have to check or
|
||||
install supporting applications for the file types that you want to index
|
||||
beyond text, HTML and mail files.
|
||||
Recoll binary packages from the Recoll web site are always linked
|
||||
statically to the Xapian libraries, and have no other dependencies. You
|
||||
will only have to check or install supporting applications for the file
|
||||
types that you want to index beyond text, HTML and mail files, and maybe
|
||||
have a look at the configuration section (but this may not be necessary
|
||||
for a quick test with default parameters).
|
||||
|
||||
4.1.1. Installing through a package system
|
||||
|
||||
If you use a BSD-type port system or a prebuilt package (RPM or other),
|
||||
just follow the usual procedure, and maybe have a look at the
|
||||
configuration section (but this may not be necessary for a quick test with
|
||||
default parameters).
|
||||
just follow the usual procedure for your system.
|
||||
|
||||
4.1.2. Installing a prebuilt Recoll
|
||||
|
||||
The unpackaged binary versions are just compressed tar files of a build
|
||||
tree, where only the useful parts were kept (executables and sample
|
||||
configuration).
|
||||
The unpackaged binary versions on the Recoll web site are just compressed
|
||||
tar files of a build tree, where only the useful parts were kept
|
||||
(executables and sample configuration).
|
||||
|
||||
The executable binary files are built with a static link to libxapian and
|
||||
libiconv, to make installation easier (no dependencies). However, this
|
||||
also means that you cannot change the versions which are used.
|
||||
libiconv, to make installation easier (no dependencies).
|
||||
|
||||
After extracting the tar file, you can proceed with installation as if you
|
||||
had built the package from source (that is, just type make install). The
|
||||
binary trees are built for installation to /usr/local.
|
||||
|
||||
You may then need to install external applications to process some file
|
||||
types that you want indexed (ie: acrobat, postscript ...). See next
|
||||
section.
|
||||
|
||||
Finally, you may want to have a look at the configuration section.
|
||||
|
||||
--------------------------------------------------------------------------
|
||||
|
||||
Prev Home Next
|
||||
@ -120,9 +115,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
4.3.1. Prerequisites
|
||||
|
||||
At the very least, you will need to download and install the xapian core
|
||||
package (Recoll development currently uses version 0.9.5), and the qt
|
||||
run-time and development packages (Recoll development currently uses
|
||||
version 3.3.5, but any 3.3 version is probably OK).
|
||||
package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
|
||||
version will work too), and the qt run-time and development packages
|
||||
(Recoll development currently uses version 3.3.5, but any 3.3 version is
|
||||
probably OK).
|
||||
|
||||
You will most probably be able to find a binary package for qt for your
|
||||
system. You may have to compile Xapian but this is not difficult (if you
|
||||
@ -135,8 +131,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
4.3.2. Building
|
||||
|
||||
Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
|
||||
3/4/5), FreeBSD and Solaris 8. If you build on another system, I would
|
||||
very much welcome patches.
|
||||
3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
|
||||
system, and need to modify things, I would very much welcome patches.
|
||||
|
||||
Depending on the qt configuration on your system, you may have to set the
|
||||
QTDIR and QMAKESPECS variables in your environment:
|
||||
@ -190,9 +186,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
Link: HOME
|
||||
Link: UP
|
||||
Link: PREVIOUS
|
||||
Link: NEXT
|
||||
|
||||
Recoll user manual
|
||||
Prev Chapter 4. Installation
|
||||
Prev Chapter 4. Installation Next
|
||||
|
||||
--------------------------------------------------------------------------
|
||||
|
||||
@ -334,20 +331,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
value, and is the default. The daemversion is specific to the
|
||||
indexing monitor daemon.
|
||||
|
||||
filtersdir
|
||||
|
||||
A directory to search for the external filter scripts used to
|
||||
index some types of files. The value should not be changed, except
|
||||
if you want to modify one of the default scripts. The value can be
|
||||
redefined for any sub-directory.
|
||||
|
||||
indexstemminglanguages
|
||||
|
||||
A list of languages for which the stem expansion databases will be
|
||||
built. See recollindex(1) for possible values. You can add a stem
|
||||
expansion database for a different language by using recollindex
|
||||
-s, but it will be deleted during the next indexing. Only
|
||||
languages listed in the configuration file are permanent.
|
||||
built. See recollindex(1) or use the recollindex -l command for
|
||||
possible values. You can add a stem expansion database for a
|
||||
different language by using recollindex -s, but it will be deleted
|
||||
during the next indexing. Only languages listed in the
|
||||
configuration file are permanent.
|
||||
|
||||
defaultcharset
|
||||
|
||||
@ -357,6 +348,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
character set used is the one defined by the nls environment
|
||||
(LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
|
||||
|
||||
maxfsoccuppc
|
||||
|
||||
Maximum file system occupation before we stop indexing. The value
|
||||
is a percentage, corresponding to what the "Capacity" df output
|
||||
column shows. The default value is 0, meaning no checking.
|
||||
|
||||
idxflushmb
|
||||
|
||||
Threshold (megabytes of new text data) where we flush from memory
|
||||
to disk index. Setting this can help control memory usage. A value
|
||||
of 0 means no explicit flushing, letting Xapian use its own
|
||||
default, which is flushing every 10000 documents (memory usage
|
||||
depends on average document size). The default value is 10.
|
||||
|
||||
filtersdir
|
||||
|
||||
A directory to search for the external filter scripts used to
|
||||
index some types of files. The value should not be changed, except
|
||||
if you want to modify one of the default scripts. The value can be
|
||||
redefined for any sub-directory.
|
||||
|
||||
iconsdir
|
||||
|
||||
The name of the directory where recoll result list icons are
|
||||
stored. You can change this if you want different images.
|
||||
|
||||
guesscharset
|
||||
|
||||
Decide if we try to guess the character set of files if no
|
||||
@ -389,11 +406,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
section or just be the beginning of the text). The default value
|
||||
is 250.
|
||||
|
||||
iconsdir
|
||||
|
||||
The name of the directory where recoll result list icons are
|
||||
stored. You can change this if you want different images.
|
||||
|
||||
aspellLanguage
|
||||
|
||||
Language definitions to use when creating the aspell dictionary.
|
||||
@ -525,29 +537,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
argument and should output the text contents in html format on the
|
||||
standard output.
|
||||
|
||||
The html could be very minimal like the following example:
|
||||
|
||||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
|
||||
</head>
|
||||
<body>some text content</body></html>
|
||||
|
||||
|
||||
You should take care to escape some characters inside the text by
|
||||
transforming them into appropriate entities. "&" should be transformed
|
||||
into "&amp;", "<" should be transformed into "&lt;".
|
||||
|
||||
The character set needs to be specified in the header. It does not need to
|
||||
be UTF-8 (Recoll will take care of translating it), but it must be
|
||||
accurate for good results.
|
||||
|
||||
Recoll will also make use of other header fields if they are present:
|
||||
title, description, keywords.
|
||||
|
||||
The easiest way to write a new filter is probably to start from an
|
||||
existing one.
|
||||
You can find more details about writing a Recoll filter in the section
|
||||
about writing filters
|
||||
|
||||
--------------------------------------------------------------------------
|
||||
|
||||
Prev Home
|
||||
Building from source Up
|
||||
Prev Home Next
|
||||
Building from source Up Extending Recoll
|
||||
|
||||
228
src/README
@ -11,7 +11,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
Copyright (c) 2005 Jean-Francois Dockes
|
||||
|
||||
This document introduces full text search notions and describes the
|
||||
installation and use of the Recoll application.
|
||||
installation and use of the Recoll application. It currently describes
|
||||
Recoll 1.9.
|
||||
|
||||
[ Split HTML / Single HTML ]
|
||||
|
||||
@ -105,6 +106,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
4.4.5. Examples of configuration adjustments
|
||||
|
||||
4.5. Extending Recoll
|
||||
|
||||
4.5.1. Writing a document filter
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Chapter 1. Introduction
|
||||
@ -370,9 +375,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
configuration files.
|
||||
|
||||
The configuration is documented inside the installation chapter of this
|
||||
document, or in the recoll.conf(5) man page. The most immediately useful
|
||||
variable you may be interested in is probably topdirs, which determines what
|
||||
subtrees get indexed.
|
||||
document, or in the recoll.conf(5) man page, but the most current
|
||||
information will most likely be the comments inside the sample file. The
|
||||
most immediately useful variable you may be interested in is probably
|
||||
topdirs, which determines what subtrees get indexed.
|
||||
|
||||
The applications needed to index file types other than text, HTML or email
|
||||
(ie: pdf, postscript, ms-word...) are described in the external packages
|
||||
@ -660,23 +666,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
or lennon and either live or unplugged but not potatoes (in any part of
|
||||
the document).
|
||||
|
||||
The first element author:"john doe" is a phrase search limited to a
|
||||
specific field. Phrase searches are specified as usual by enclosing the
|
||||
words in double quotes. The field specification appears before the colon
|
||||
(of course this is not limited to phrases, author:Balzac would be ok too).
|
||||
Recoll currently manages the following fields:
|
||||
|
||||
* title, subject or caption are synonyms which specify data to be
|
||||
searched for in the document title or subject.
|
||||
|
||||
* author or from for searching the document's originators.
|
||||
|
||||
* keyword for searching the document specified keywords (few documents
|
||||
actually have any).
|
||||
|
||||
The query language is currently the only way to use the Recoll field
|
||||
search capability.
|
||||
|
||||
All elements in the search entry are normally combined with an implicit
|
||||
AND. It is possible to specify that elements be OR'ed instead, as in
|
||||
Beatles OR Lennon. The OR must be entered literally (capitals), and it has
|
||||
@ -686,8 +675,40 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
An entry preceded by a - specifies a term that should not appear.
|
||||
|
||||
The first element in the above example, author:"john doe", is a phrase
|
||||
search limited to a specific field. Phrase searches are specified as usual
|
||||
by enclosing the words in double quotes. The field specification appears
|
||||
before the colon (of course this is not limited to phrases, author:Balzac
|
||||
would be ok too). Recoll currently manages the following fields:
|
||||
|
||||
* title, subject or caption are synonyms which specify data to be
|
||||
searched for in the document title or subject.
|
||||
|
||||
* author or from for searching the document's originators.
|
||||
|
||||
* keyword for searching the document specified keywords (few documents
|
||||
actually have any).
|
||||
|
||||
As of release 1.9, the filters have the possibility to create other fields
|
||||
with arbitrary names. No standard filters use this possibility yet.
|
||||
|
||||
There are two other elements which may be specified through the field
|
||||
syntax, but are somewhat special:
|
||||
|
||||
* ext for specifying the file name extension (Ex: ext:html)
|
||||
|
||||
* mime for specifying the mime type. This one is quite special because
|
||||
you can specify several values which will be OR'ed (the normal default
|
||||
for the language is AND). Ex: mime:text/plain mime:text/html.
|
||||
Specifying an explicit boolean operator or negation (-) before a mime
|
||||
specification is not supported and will produce strange results.
|
||||
|
||||
The query language is currently the only way to use the Recoll field
|
||||
search capability.
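As an illustration of the field syntax described above, here is a small Python sketch that splits a query string into elements. It is not Recoll's actual parser: the function name split_query, the regular expression and the returned dictionary layout are assumptions made for this example, and it only recognizes the field:value, quoted phrase and leading - forms.

```python
import re

# Hypothetical tokenizer: optional "-", optional "field:", then a quoted
# phrase or a run of non-blank characters.
TOKEN = re.compile(r'(-?)(?:(\w+):)?("[^"]*"|\S+)')


def split_query(query):
    """Return one dict per query element (illustration only)."""
    elements = []
    for neg, field, value in TOKEN.findall(query):
        elements.append({
            "negated": neg == "-",           # entry preceded by a -
            "field": field or None,          # text before the colon, if any
            "phrase": value.startswith('"'),  # double-quoted phrase search
            "value": value.strip('"'),
        })
    return elements


def mime_values(elements):
    """Collect the mime: values, which the text says are OR'ed together."""
    return [e["value"] for e in elements if e["field"] == "mime"]
```

For example, split_query('author:"john doe" ext:html mime:text/plain mime:text/html') yields four elements, and mime_values() on that result returns the two MIME types that would be OR'ed.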
|
||||
|
||||
Words inside phrases and capitalized words are not stem-expanded.
|
||||
Wildcards may be used anywhere.
|
||||
Wildcards may be used anywhere inside a term. Specifying a wild-card on
|
||||
the left of a term can produce a very slow search.
|
||||
|
||||
You can use the show query link at the top of the result list to check the
|
||||
exact query which was finally executed by Xapian.
|
||||
@ -873,8 +894,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
3.9. Document history
|
||||
|
||||
Documents that you actually view (with the internal preview or an external
|
||||
tool) are entered into the document history, which is remembered. You can
|
||||
display the history list by using the Tools/Doc History menu entry.
|
||||
tool) are entered into the document history, which is remembered.
|
||||
|
||||
You can display the history list by using the Tools/Doc History menu
|
||||
entry.
|
||||
|
||||
You can erase the document history by using the Erase document history
|
||||
entry in the File menu.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
@ -891,6 +917,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
The sort parameters stay in effect until they are explicitly reset, or the
|
||||
program exits. An activated sort is indicated in the result list header.
|
||||
|
||||
Sort parameters are remembered between program invocations, but result
|
||||
sorting is normally always inactive when the program starts. It is
|
||||
possible to keep the sorting activation state between program invocations
|
||||
by checking the Remember sort activation state option in the preferences.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
3.11. Search tips, shortcuts
|
||||
@ -984,6 +1015,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
* %D. Date
|
||||
|
||||
* %I. Icon image name
|
||||
|
||||
* %K. Keywords (if any)
|
||||
|
||||
* %L. Preview and Edit links
|
||||
@ -1002,7 +1035,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
The default value for the string is:
|
||||
|
||||
%R %S %L <b>%T</b><br>
|
||||
<img src="%I" align="left">%R %S %L <b>%T</b><br>
|
||||
%M %D <i>%U</i><br>
|
||||
%A %K
|
||||
|
||||
@ -1014,19 +1047,30 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
%A<font color=#008000>%U - %S</font> - %L
|
||||
|
||||
|
||||
Or the clean looking:
|
||||
|
||||
<img src="%I" align="left">%L <font color="#900000">%R</font>
|
||||
<b>%T</b><br>%S
|
||||
<font color="#808080"><i>%U</i></font>
|
||||
<table bgcolor="#e0e0e0">
|
||||
<tr><td><div>%A</div></td></tr>
|
||||
</table>%K
|
||||
|
||||
|
||||
The format of the Preview and Edit links is <a href="Pdocnum"> and <a
|
||||
href="Edocnum"> where docnum is what %N would print. This makes the
|
||||
title a preview link in the above format.
|
||||
|
||||
Please note that, due to the way the program handles right mouse
|
||||
clicks in the result list, if the custom formatting results in
|
||||
multiple paragraphs per result, right clicks will only work inside the
|
||||
first one.
|
||||
|
||||
* HTML help browser: this will let you choose your preferred browser
|
||||
which will be started from the Help menu to read the user manual. You
|
||||
can enter a simple name if the command is in your PATH, or browse for
|
||||
a full pathname.
|
||||
|
||||
* Show document type icons in result list: icons in the result list can
|
||||
be turned off. They take quite a lot of space and convey relatively
|
||||
little useful information.
|
||||
|
||||
* Auto-start simple search on white space entry: if this is checked, a
|
||||
search will be executed each time you enter a space in the simple
|
||||
search input field. This lets you look at the result list as you enter
|
||||
@ -1086,42 +1130,35 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
4.1. Installing a prebuilt copy
|
||||
|
||||
Recoll binary installations are always linked statically to the xapian
|
||||
libraries, and have no other dependencies. You will only have to check or
|
||||
install supporting applications for the file types that you want to index
|
||||
beyond text, HTML and mail files.
|
||||
Recoll binary packages from the Recoll web site are always linked
|
||||
statically to the Xapian libraries, and have no other dependencies. You
|
||||
will only have to check or install supporting applications for the file
|
||||
types that you want to index beyond text, HTML and mail files, and maybe
|
||||
have a look at the configuration section (but this may not be necessary
|
||||
for a quick test with default parameters).
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.1.1. Installing through a package system
|
||||
|
||||
If you use a BSD-type port system or a prebuilt package (RPM or other),
|
||||
just follow the usual procedure, and maybe have a look at the
|
||||
configuration section (but this may not be necessary for a quick test with
|
||||
default parameters).
|
||||
just follow the usual procedure for your system.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.1.2. Installing a prebuilt Recoll
|
||||
|
||||
The unpackaged binary versions are just compressed tar files of a build
|
||||
tree, where only the useful parts were kept (executables and sample
|
||||
configuration).
|
||||
The unpackaged binary versions on the Recoll web site are just compressed
|
||||
tar files of a build tree, where only the useful parts were kept
|
||||
(executables and sample configuration).
|
||||
|
||||
The executable binary files are built with a static link to libxapian and
|
||||
libiconv, to make installation easier (no dependencies). However, this
|
||||
also means that you cannot change the versions which are used.
|
||||
libiconv, to make installation easier (no dependencies).
|
||||
|
||||
After extracting the tar file, you can proceed with installation as if you
|
||||
had built the package from source (that is, just type make install). The
|
||||
binary trees are built for installation to /usr/local.
|
||||
|
||||
You may then need to install external applications to process some file
|
||||
types that you want indexed (ie: acrobat, postscript ...). See next
|
||||
section.
|
||||
|
||||
Finally, you may want to have a look at the configuration section.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.2. Supporting packages
|
||||
@ -1161,9 +1198,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
4.3.1. Prerequisites
|
||||
|
||||
At the very least, you will need to download and install the xapian core
|
||||
package (Recoll development currently uses version 0.9.5), and the qt
|
||||
run-time and development packages (Recoll development currently uses
|
||||
version 3.3.5, but any 3.3 version is probably OK).
|
||||
package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
|
||||
version will work too), and the qt run-time and development packages
|
||||
(Recoll development currently uses version 3.3.5, but any 3.3 version is
|
||||
probably OK).
|
||||
|
||||
You will most probably be able to find a binary package for qt for your
|
||||
system. You may have to compile Xapian but this is not difficult (if you
|
||||
@ -1178,8 +1216,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
4.3.2. Building
|
||||
|
||||
Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
|
||||
3/4/5), FreeBSD and Solaris 8. If you build on another system, I would
|
||||
very much welcome patches.
|
||||
3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
|
||||
system, and need to modify things, I would very much welcome patches.
|
||||
|
||||
Depending on the qt configuration on your system, you may have to set the
|
||||
QTDIR and QMAKESPECS variables in your environment:
|
||||
@ -1370,20 +1408,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
value, and is the default. The daemversion is specific to the
|
||||
indexing monitor daemon.
|
||||
|
||||
filtersdir
|
||||
|
||||
A directory to search for the external filter scripts used to
|
||||
index some types of files. The value should not be changed, except
|
||||
if you want to modify one of the default scripts. The value can be
|
||||
redefined for any sub-directory.
|
||||
|
||||
indexstemminglanguages
|
||||
|
||||
A list of languages for which the stem expansion databases will be
|
||||
built. See recollindex(1) for possible values. You can add a stem
|
||||
expansion database for a different language by using recollindex
|
||||
-s, but it will be deleted during the next indexing. Only
|
||||
languages listed in the configuration file are permanent.
|
||||
built. See recollindex(1) or use the recollindex -l command for
|
||||
possible values. You can add a stem expansion database for a
|
||||
different language by using recollindex -s, but it will be deleted
|
||||
during the next indexing. Only languages listed in the
|
||||
configuration file are permanent.
|
||||
|
||||
defaultcharset
|
||||
|
||||
@ -1393,6 +1425,32 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
character set used is the one defined by the nls environment
|
||||
(LC_ALL, LC_CTYPE, LANG), or iso8859-1 if nothing is set.
|
||||
|
||||
maxfsoccuppc
|
||||
|
||||
Maximum file system occupation before we stop indexing. The value
|
||||
is a percentage, corresponding to what the "Capacity" df output
|
||||
column shows. The default value is 0, meaning no checking.
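To make the meaning of the percentage concrete, the following Python sketch computes a value comparable to the df "Capacity" column and compares it to a threshold. It is only an illustration under stated assumptions: the function names are hypothetical, the used / (used + available) formula approximates what df reports, and whether Recoll stops at or strictly above the threshold is not stated in the text (this sketch stops at the threshold).

```python
import shutil


def fs_occupation_pc(path):
    """Approximate the df "Capacity" percentage for the file system holding path."""
    usage = shutil.disk_usage(path)
    denom = usage.used + usage.free  # free = space available to the caller
    if denom == 0:
        return 0.0
    return 100.0 * usage.used / denom


def should_stop_indexing(path, maxfsoccuppc):
    """Hypothetical check: a value of 0 (the default) means no checking."""
    if maxfsoccuppc <= 0:
        return False
    return fs_occupation_pc(path) >= maxfsoccuppc
```

With maxfsoccuppc set to 90, should_stop_indexing("/home", 90) would return True once the file system holding /home is about 90% full.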
|
||||
|
||||
idxflushmb
|
||||
|
||||
Threshold (megabytes of new text data) where we flush from memory
|
||||
to disk index. Setting this can help control memory usage. A value
|
||||
of 0 means no explicit flushing, letting Xapian use its own
|
||||
default, which is flushing every 10000 documents (memory usage
|
||||
depends on average document size). The default value is 10.
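The threshold logic can be sketched in Python as a simple byte counter. This is not the indexer's code: the class name FlushCounter, the add_document method and the choice of 1024 * 1024 bytes per megabyte are assumptions made for illustration.

```python
class FlushCounter:
    """Hypothetical model of the idxflushmb threshold."""

    def __init__(self, idxflushmb=10):
        # Assumption: one megabyte is counted as 1024 * 1024 bytes.
        self.threshold_bytes = idxflushmb * 1024 * 1024
        self.pending_bytes = 0
        self.flushes = 0

    def add_document(self, text_bytes):
        """Account for new text data; return True when a flush would be requested."""
        self.pending_bytes += text_bytes
        # A threshold of 0 means no explicit flushing (Xapian's default applies).
        if self.threshold_bytes > 0 and self.pending_bytes >= self.threshold_bytes:
            self.pending_bytes = 0
            self.flushes += 1
            return True
        return False
```

With the default value of 10, indexing many small documents triggers a flush roughly every 10 megabytes of extracted text, whatever the number of documents.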
|
||||
|
||||
filtersdir
|
||||
|
||||
A directory to search for the external filter scripts used to
|
||||
index some types of files. The value should not be changed, except
|
||||
if you want to modify one of the default scripts. The value can be
|
||||
redefined for any sub-directory.
|
||||
|
||||
iconsdir
|
||||
|
||||
The name of the directory where recoll result list icons are
|
||||
stored. You can change this if you want different images.
|
||||
|
||||
guesscharset
|
||||
|
||||
Decide if we try to guess the character set of files if no
|
||||
@ -1425,11 +1483,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
section or just be the beginning of the text). The default value
|
||||
is 250.
|
||||
|
||||
iconsdir
|
||||
|
||||
The name of the directory where recoll result list icons are
|
||||
stored. You can change this if you want different images.
|
||||
|
||||
aspellLanguage
|
||||
|
||||
Language definitions to use when creating the aspell dictionary.
|
||||
@ -1571,7 +1624,34 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
argument and should output the text contents in html format on the
|
||||
standard output.
|
||||
|
||||
The html could be very minimal like the following example:
|
||||
You can find more details about writing a Recoll filter in the section
|
||||
about writing filters
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.5. Extending Recoll
|
||||
|
||||
4.5.1. Writing a document filter
|
||||
|
||||
Recoll filters are executable programs which translate from a specific
|
||||
format (ie: openoffice, acrobat, etc.) to the Recoll indexing input
|
||||
format, which was chosen to be HTML.
|
||||
|
||||
Recoll filters are usually shell-scripts, but this is in no way necessary.
|
||||
These programs are extremely simple and most of the difficulty lies in
|
||||
extracting the text from the native format, not outputting what is
|
||||
expected by Recoll. Happily enough, most document formats already have
|
||||
translators or text extractors which handle the difficult part and can be
|
||||
called from the filter.
|
||||
|
||||
Filters are called with a single argument which is the source file name.
|
||||
They should output the result to stdout.
|
||||
|
||||
The RECOLL_FILTER_FORPREVIEW environment variable (values yes, no) tells
|
||||
the filter if the operation is for indexing or previewing. Some filters
|
||||
use this to output a slightly different format. This is not essential.
|
||||
|
||||
The output HTML could be very minimal like the following example:
|
||||
|
||||
<html><head>
|
||||
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
|
||||
@ -1590,6 +1670,16 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
Recoll will also make use of other header fields if they are present:
|
||||
title, description, keywords.
|
||||
|
||||
As of Recoll release 1.9, filters also have the possibility to "invent"
|
||||
field names. This should be output as meta tags:
|
||||
|
||||
<meta name="somefield" content="Some textual data" />
|
||||
|
||||
In this case, a correspondence between field name and Xapian prefix should
|
||||
also be added to the mimeconf file. See the existing entries for
|
||||
inspiration. The field can then be used inside the query language to
|
||||
narrow searches.
|
||||
|
||||
The easiest way to write a new filter is probably to start from an
|
||||
existing one.
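As a starting point, here is a minimal filter sketch in Python (filters are usually shell scripts, but the text notes this is not required). It assumes the input is plain text encoded as UTF-8 and that standard output is written as UTF-8; a real filter would call a format-specific extractor instead of reading the file directly. The function names to_recoll_html and run_filter are hypothetical.

```python
import html
import sys


def to_recoll_html(text, charset="UTF-8", fields=None):
    """Wrap extracted text in the minimal HTML document shown above."""
    head = ['<meta http-equiv="Content-Type" content="text/html;charset=%s">'
            % charset]
    # Optional extra fields, output as meta tags (possible as of Recoll 1.9).
    for name, value in (fields or {}).items():
        head.append('<meta name="%s" content="%s" />'
                    % (html.escape(name, quote=True),
                       html.escape(value, quote=True)))
    # Escape "&" and "<" in the body text (html.escape also escapes ">").
    body = html.escape(text, quote=False)
    return ("<html><head>\n%s\n</head>\n<body>%s</body></html>\n"
            % ("\n".join(head), body))


def run_filter(path, out=sys.stdout):
    """Hypothetical entry point: read the named file, write HTML to stdout."""
    with open(path, encoding="utf-8", errors="replace") as f:
        out.write(to_recoll_html(f.read()))
```

A complete script would end with a call such as run_filter(sys.argv[1]), since filters receive the source file name as their single argument.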
|
||||
|
||||
|
||||
@ -1,4 +1,4 @@
|
||||
.\" $Id: recoll.conf.5,v 1.4 2006-11-20 18:07:02 dockes Exp $ (C) 2005 J.F.Dockes\$
|
||||
.\" $Id: recoll.conf.5,v 1.5 2007-07-13 10:18:49 dockes Exp $ (C) 2005 J.F.Dockes\$
|
||||
.TH RECOLL.CONF 5 "8 January 2006"
|
||||
.SH NAME
|
||||
recoll.conf \- main personal configuration file for Recoll
|
||||
@ -10,6 +10,11 @@ The system-wide configuration file is normally located inside
|
||||
/usr/[local]/share/recoll/examples. Any parameter set in the common file
|
||||
may be overridden by setting it in the personal configuration file, by default:
|
||||
.IR $HOME/.recoll/recoll.conf
|
||||
.LP
|
||||
Please note that while we try to keep this manual page reasonably up to date, it
|
||||
will frequently lag the current state of the software. The best source of
|
||||
information about the configuration is the comments in the configuration
|
||||
file.
|
||||
|
||||
.LP
|
||||
A short extract of the file might look as follows:
|
||||
@ -65,6 +70,12 @@ The list can be redefined for subdirectories, but is only actually changed
|
||||
for the top level ones in
|
||||
.I topdirs
|
||||
.TP
|
||||
.BI "skippedPaths = " patterns
|
||||
A space-separated list of patterns for paths the indexer should not descend
|
||||
into. Together with topdirs, this allows pruning the indexed tree to one's
|
||||
heart's content. daemSkippedPaths can be used to define a specific value for the
|
||||
real time indexing monitor.
|
||||
.TP
|
||||
.BI "loglevel = " value
|
||||
Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
|
||||
debug/information messages. 3 lists only errors.
|
||||
@ -76,32 +87,46 @@ Where should the messages go. 'stderr' can be used as a special value.
|
||||
.B daemlogfilename
|
||||
can be used to specify a different value for the real-time indexing daemon.
|
||||
.TP
|
||||
.BI "dbdir = " directory
|
||||
The name of the Xapian database directory. It will be created if needed
|
||||
when the database is initialized. If this is not an absolute pathname, it
|
||||
will be taken relative to the configuration directory.
|
||||
.TP
|
||||
.BI "indexstemminglanguages = " languages
|
||||
A list of languages for which the stem expansion databases will be
|
||||
built. See recollindex(1) for possible values.
|
||||
.TP
|
||||
.BI "defaultcharset = " charset
|
||||
The name of the character set used for files that do not contain a
|
||||
character set definition (ie: plain text files). This can be redefined for
|
||||
any subdirectory.
|
||||
.TP
|
||||
.BI "maxfsoccuppc = " percentnumber
|
||||
Maximum file system occupation before we
|
||||
stop indexing. The value is a percentage, corresponding to
|
||||
what the "Capacity" df output column shows. The default
|
||||
value is 0, meaning no checking.
|
||||
.TP
|
||||
.BI "idxflushmb = " megabytes
|
||||
Threshold (megabytes of new text data)
|
||||
where we flush from memory to disk index. Setting this can
|
||||
help control memory usage. A value of 0 means no explicit
|
||||
flushing, letting Xapian use its own default, which is
|
||||
flushing every 10000 documents (memory usage depends on
|
||||
average document size). The default value is 10.
|
||||
.TP
|
||||
.BI "filtersdir = " directory
|
||||
A directory to search for the external filter scripts used to index some
|
||||
types of files. The value should not be changed, except if you want to
|
||||
modify one of the default scripts. The value can be redefined for any
|
||||
subdirectory.
|
||||
.TP
|
||||
.BI "indexstemminglanguages = " languages
|
||||
A list of languages for which the stem expansion databases will be
|
||||
built. See recollindex(1) for possible values.
|
||||
.TP
|
||||
.BI "iconsdir = " directory
|
||||
The name of the directory where
|
||||
.B recoll
|
||||
result list icons are stored. You can change this if you want different
|
||||
images.
|
||||
.TP
|
||||
.BI "dbdir = " directory
|
||||
The name of the Xapian database directory. It will be created if needed
|
||||
when the database is initialized. If this is not an absolute pathname, it
|
||||
will be taken relative to the configuration directory.
|
||||
.TP
|
||||
.BI "defaultcharset = " charset
|
||||
The name of the character set used for files that do not contain a
|
||||
character set definition (ie: plain text files). This can be redefined for
|
||||
any subdirectory.
|
||||
.TP
|
||||
.BI "guesscharset = " boolean
|
||||
Try to guess the character set of files if no internal value is available
|
||||
(ie: for plain text files). This does not work well in general, and should
|
||||
|
||||
@ -1308,7 +1308,7 @@ Peut ralentir l'affichage si les documents sont gros.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Show document type icons in result list.</source>
|
||||
<translation>Afficher les icônes de type de fichier dans la liste de résultats.</translation>
|
||||
<translation type="obsolete">Afficher les icônes de type de fichier dans la liste de résultats.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Auto-start simple search on whitespace entry.</source>
|
||||
@ -1434,7 +1434,7 @@ Peut ralentir l'affichage si les documents sont gros.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation>Définit the format pour chaque paragraphe de la liste de résultats. Utilise le format html qt et des remplacements à la printf:<br>%A Résumé<br> %D Date<br> %K Mots clefs (s'il y en a)<br> %L Liens aperçu et édition<br> %M Type Mime<br> %N Numéro de résultat<br> %R Pertinence<br> %S Taille<br> %T Titre<br> %U Url<br></translation>
|
||||
<translation type="obsolete">Définit the format pour chaque paragraphe de la liste de résultats. Utilise le format html qt et des remplacements à la printf:<br>%A Résumé<br> %D Date<br> %K Mots clefs (s'il y en a)<br> %L Liens aperçu et édition<br> %M Type Mime<br> %N Numéro de résultat<br> %R Pertinence<br> %S Taille<br> %T Titre<br> %U Url<br></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Automatically add phrase to simple searches</source>
|
||||
@ -1488,7 +1488,15 @@ Ceci devrait donner une meilleure pertinence aux résultats où les termes reche
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sorting preference between invocations.</source>
|
||||
<translation>Mémoriser l'état des paramètres de tri.</translation>
|
||||
<translation type="obsolete">Mémoriser l'état des paramètres de tri.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %I Icon image name<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation>Definit le format des paragraphes de la liste de resultats. Utilise le format html qt et des directives de substitution de type printf:<br>%A Abstract<br> %D Date<br> %I Icon image name<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sort activation state.</source>
|
||||
<translation>Memoriser l'etat d'activation du tri.</translation>
|
||||
</message>
|
||||
</context>
|
||||
<context>
|
||||
|
||||
@ -1309,7 +1309,7 @@ Peut ralentir l'affichage si les documents sont gros.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Show document type icons in result list.</source>
|
||||
<translation type="unfinished">Mostra le icone nella lsita dei risultati.</translation>
|
||||
<translation type="obsolete">Mostra le icone nella lsita dei risultati.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Auto-start simple search on whitespace entry.</source>
|
||||
@ -1435,7 +1435,7 @@ Può essere lento per grossi documenti..</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation>Definisci il formato per ogni risultato. Usa il formato qt-html:e simile a quello di printf:<br>%A Riassunto<br> %D Data<br> %K Keywords (se ci sono)<br> %L Links Preview e Edita<br> %M Tipo Mime<br> %N Numero di risultati<br> %R Rilevanza<br> %S Size<br> %T Titolo<br> %U Url<br></translation>
|
||||
<translation type="obsolete">Definisci il formato per ogni risultato. Usa il formato qt-html:e simile a quello di printf:<br>%A Riassunto<br> %D Data<br> %K Keywords (se ci sono)<br> %L Links Preview e Edita<br> %M Tipo Mime<br> %N Numero di risultati<br> %R Rilevanza<br> %S Size<br> %T Titolo<br> %U Url<br></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Automatically add phrase to simple searches</source>
|
||||
@ -1489,7 +1489,11 @@ Questo dovrebbe dare la precedenza ai risultati che contengono i termini esattam
|
||||
<translation type="unfinished">Rimuovi dalla lista. Non ha effetto sull'indice del disco</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sorting preference between invocations.</source>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %I Icon image name<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sort activation state.</source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
</context>
|
||||
|
||||
@ -1374,7 +1374,7 @@ May be slow for big documents.</source>
|
||||
</message>
|
||||
<message>
|
||||
<source>Show document type icons in result list.</source>
|
||||
<translation>Отображать типы документов в списке результатов.</translation>
|
||||
<translation type="obsolete">Отображать типы документов в списке результатов.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Auto-start simple search on whitespace entry.</source>
|
||||
@ -1498,10 +1498,6 @@ May be slow for big documents.</source>
|
||||
<source>Result paragraph<br>format string</source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Automatically add phrase to simple searches</source>
|
||||
<translation type="unfinished"></translation>
|
||||
@ -1552,7 +1548,11 @@ This should give higher precedence to the results where the search terms appear
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sorting preference between invocations.</source>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %I Icon image name<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sort activation state.</source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
</context>
|
||||
|
||||
@ -1205,7 +1205,7 @@ May be slow for big documents.</source>
|
||||
</message>
|
||||
<message>
|
||||
<source>Show document type icons in result list.</source>
|
||||
<translation>Відображати типи документів у списку результатів.</translation>
|
||||
<translation type="obsolete">Відображати типи документів у списку результатів.</translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Auto-start simple search on whitespace entry.</source>
|
||||
@ -1329,10 +1329,6 @@ May be slow for big documents.</source>
|
||||
<source>Result paragraph<br>format string</source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Automatically add phrase to simple searches</source>
|
||||
<translation type="unfinished"></translation>
|
||||
@ -1383,7 +1379,11 @@ This should give higher precedence to the results where the search terms appear
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sorting preference between invocations.</source>
|
||||
<source>Defines the format for each result list paragraph. Use qt html format and printf-like replacements:<br>%A Abstract<br> %D Date<br> %I Icon image name<br> %K Keywords (if any)<br> %L Preview and Edit links<br> %M Mime type<br> %N Result number<br> %R Relevance percentage<br> %S Size information<br> %T Title<br> %U Url<br></source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
<message>
|
||||
<source>Remember sort activation state.</source>
|
||||
<translation type="unfinished"></translation>
|
||||
</message>
|
||||
</context>
|
||||
|
||||
@ -8,9 +8,16 @@ Latest (1.8.2):
|
||||
- There are a few problems in the qt4 version of recoll: some accelerators
|
||||
(esc-spc, ctl-arrow) do not work, neither do copy/paste between the
|
||||
result list and preview windows and x11 applications.
|
||||
|
||||
- The q3textedit find() method is extremely slow. Positioning to first
|
||||
search term in preview has been disabled in qt4, and the application will
|
||||
sometimes appear to be looping when using the find feature in the
|
||||
preview window (it's not looping, it's searching :( )
|
||||
|
||||
- The dates shown for email attachments in a result list are the email
|
||||
folder modification date. This should be inherited from the parent
|
||||
message instead.
|
||||
|
||||
- There are sometimes problems with document deletions: the index can
|
||||
get in a state where deleted or moved documents are not purged from the
|
||||
index (the log file says that the docs are deleted, but they aren't
|
||||
@ -19,6 +26,15 @@ Latest (1.8.2):
|
||||
fixed in a future release. You can apply the following patch to xapian
|
||||
1.0.1 to fix it:
|
||||
http://www.lesbonscomptes.com/recoll/xapian/xapian-delete-document.patch
|
||||
|
||||
- Under ubuntu (at least), the default awk interpreter (mawk) is buggy,
|
||||
and the recoll pdf input filter does not work (removes all space
|
||||
characters). This can be solved by installing the gawk package.
|
||||
|
||||
- If the user-chosen result list entry format results in several paragraphs
|
||||
(in the qt textedit sense), right clicks will only work inside the first
|
||||
one for each entry.
|
||||
|
||||
- NEAR crashes: 1.6 has added NEAR searches. Unlike what recoll did
|
||||
with PHRASES, stemming expansion is performed on terms inside NEAR
|
||||
clauses (except if prevented by a capitalized entry of course). There is
|
||||
|
||||
@ -1,30 +1,61 @@
|
||||
CHANGES
|
||||
|
||||
1.9.0
|
||||
- Add option to remember sort tool state between program invocations (it is
|
||||
reset to inactive by default)
|
||||
- Improve qt4 build: no more need for --enable-qt4
|
||||
- Fixed a number of qt4 glitches: selection and keyboard shortcuts.
|
||||
- Incompatible change: the icon image reference is now part of the result
|
||||
list paragraph format string:
|
||||
- If you had a standard config, you need do nothing.
|
||||
- If you had a custom format string, you need to add
|
||||
<img src="%I" align="left"> at its beginning to get the same result as
|
||||
before.
|
||||
- If you had unchecked the "show icons" option, you need to remove the
|
||||
above string from the paragraph format to make the icons go away.
|
||||
Changes to the format string are performed in the
|
||||
"Preferences->Query Configuration->User Interface" dialog tab.
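
The format string change above can be sketched in a few lines of Python. This is only an illustration: migrate_format() and render() are hypothetical helper names, and leaving unknown directives untouched is an assumption, not documented Recoll behaviour. The icon tag and the directive letters are the ones listed in this commit.

```python
import re

# Image reference that 1.9.0 expects at the start of a custom format string.
ICON_TAG = '<img src="%I" align="left">'


def migrate_format(fmt):
    """Prepend the icon tag to an older custom format that lacks %I."""
    if "%I" in fmt:
        return fmt
    return ICON_TAG + fmt


def render(fmt, fields):
    """Substitute printf-like %X directives with values from `fields`.

    `fields` maps a directive letter (A, D, I, K, L, M, N, R, S, T, U)
    to its text. Directives without a value are left as they are,
    which is an assumption of this sketch.
    """
    return re.sub(r"%([A-Z])", lambda m: fields.get(m.group(1), m.group(0)), fmt)
```

For example, migrate_format("%T<br>%A") puts the image tag in front of the old string, and applying it a second time changes nothing because %I is then already present.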
|
||||
|
||||
- New filters: abiword and kword, rcljpeg, rclflac, rclogg (contributed
|
||||
filters). The jpeg and audio filters should be extended to make use of
|
||||
the new field indexing/search capability (hint :) )
|
||||
|
||||
- When searching for an empty string inside the preview window, position
|
||||
the window to the next occurrence of the primary search terms.
|
||||
- Have email attachments inherit date and author from their parent message
|
||||
the window to the next occurrence of a primary search term.
|
||||
|
||||
- Added ext: and mime: selectors to the query language.
|
||||
|
||||
- Added an adjustable flush threshold during indexing: should help control
|
||||
memory usage. See the idxflushmb configuration parameter.
|
||||
memory usage. See the idxflushmb configuration variable.
|
||||
|
||||
- Added a check for file system free space. Indexing will stop if the
|
||||
threshold is reached. See the maxfsoccuppc configuration parameter.
|
||||
- Fix bus error on rclmon exit
|
||||
- Better handle aspell errors inside rclmon
|
||||
|
||||
- Add preference option to remember sort tool state between program
|
||||
invocations (it is reset to inactive by default)
|
||||
|
||||
- Added File menu entry to erase document history.
|
||||
- Added ext: and mime: selectors to the query language.
|
||||
|
||||
- Bound the space and backspace keys to PgUp/PgDown in preview.
|
||||
|
||||
- (Hopefully) Improved abstract (keyword in context) generation
|
||||
|
||||
- Improve qt4 build: no more need for --enable-qt4. Note: the qt4 build
|
||||
still needs the qt3 support library.
|
||||
|
||||
- Added support for arbitrary fields. Filters can now produce any number of
|
||||
fields which will be selectively searchable through the query language.
|
||||
- Added abiword and kword support.
|
||||
- Contributed filter: rcljpeg. This should be extended to use the new field
|
||||
support.
|
||||
fields which will be selectively searchable through the query
|
||||
language. This could be useful, for example, for the mp3 and jpeg filters
|
||||
(but is not currently used).
|
||||
|
||||
- Changed the icon to an ugly one. The previous one was nicer but looked
|
||||
too much like Xapian's.
|
||||
|
||||
- Added some kind of support for a stopword list.
|
||||
- Bound space and backspace to PgUp/PgDown in preview.
|
||||
- Have email attachments inherit date and author from their parent message
|
||||
memory usage. See the idxflushmb configuration parameter.
|
||||
|
||||
- Fix bus error on rclmon exit
|
||||
|
||||
- Better handling of aspell errors inside rclmon
|
||||
|
||||
- Fixed a number of qt4 glitches: selection and keyboard shortcuts.
|
||||
|
||||
1.8.2 2007-05-19
|
||||
- Fixed method name for compatibility with xapian 1.0.0
|
||||
@ -293,4 +324,4 @@ or keep only the modified parameters.
|
||||
identification for suffix-less or unknown files.
|
||||
- Typo had removed support for .Z compression
|
||||
- Use more appropriate conjunction operators when computing the advanced
|
||||
search query (OP_AND_MAYBE, OP_FILTER instead of OP_AND)
|
||||
search query (OP_AND_MAYBE, OP_FILTER instead of OP_AND)
|
||||
|
||||
@ -56,11 +56,12 @@
|
||||
<p><i>For building from source</i>, you will need a xapian-core
|
||||
installation. You will find source and binary packages on the
|
||||
<a href="http://www.xapian.org/download.php">Xapian download
|
||||
page</a>. Recoll should build with any 0.9.x Xapian version
|
||||
(the current one is 0.9.10).</p>
|
||||
page</a>. Recoll 1.8.2 should build with any 0.9.x or 1.0.x
|
||||
Xapian version (the current one is 1.0.1).</p>
|
||||
|
||||
<p>You need Qt 3.3 (or qt 4) in all cases (configure Recoll with
|
||||
<em>configure --enable-qt4</em> to build with qt4).</p>
|
||||
<em>configure --enable-qt4</em> to build with qt4, this needs
|
||||
the qt3 support library to be present).</p>
|
||||
|
||||
<p>Recoll relies on external packages for some
|
||||
of its functionality (ie: for many of the non-text file
|
||||
@ -124,9 +125,7 @@
|
||||
of which is the new default index format. In order to take
|
||||
advantage of the new format (which is not mandatory) Recoll
|
||||
users updating from an older release need to delete their old
|
||||
index. There are <a
|
||||
href="usermanual/usermanual.html#RCL.INDEXING.STORAGE.FORMAT">more
|
||||
details in the user manual</a>.</p>
|
||||
index. <a href="xapUpg100.html">More details</a>.</p>
|
||||
|
||||
<p>Older recoll releases:
|
||||
<a href="recoll-1.8.1.tar.gz">1.8.1</a>
|
||||
|
||||