parent c85e74db66
commit d2fa1befc1
93 src/INSTALL
@@ -11,21 +11,21 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

--------------------------------------------------------------------------

Chapter 5. Installation
Chapter 7. Installation

Table of Contents

5.1. Installing a prebuilt copy
7.1. Installing a prebuilt copy

5.2. Supporting packages
7.2. Supporting packages

5.3. Building from source
7.3. Building from source

5.4. Configuration overview
7.4. Configuration overview

5.5. The KDE Kicker Recoll applet
7.5. The KDE Kicker Recoll applet

5.1. Installing a prebuilt copy
7.1. Installing a prebuilt copy

Recoll binary packages from the Recoll web site are always linked
statically to the Xapian libraries, and have no other dependencies. You
@@ -34,12 +34,12 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
have a look at the configuration section (but this may not be necessary
for a quick test with default parameters).

5.1.1. Installing through a package system
7.1.1. Installing through a package system

If you use a BSD-type port system or a prebuilt package (RPM or other),
just follow the usual procedure for your system.

5.1.2. Installing a prebuilt Recoll
7.1.2. Installing a prebuilt Recoll

The unpackaged binary versions on the Recoll web site are just compressed
tar files of a build tree, where only the useful parts were kept
@@ -62,11 +62,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
Link: NEXT

Recoll user manual
Prev Chapter 5. Installation Next
Prev Chapter 7. Installation Next

--------------------------------------------------------------------------

5.2. Supporting packages
7.2. Supporting packages

Recoll uses external applications to index some file types. You need to
install them for the file types that you wish to have indexed (these are
@@ -122,13 +122,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
Link: NEXT

Recoll user manual
Prev Chapter 5. Installation Next
Prev Chapter 7. Installation Next

--------------------------------------------------------------------------

5.3. Building from source
7.3. Building from source

5.3.1. Prerequisites
7.3.1. Prerequisites

At the very least, you will need to download and install the xapian core
package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
@@ -144,7 +144,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
not be critical). On Linux systems, the iconv interface is part of libc
and you should not need to do anything special.

5.3.2. Building
7.3.2. Building

Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
@@ -182,7 +182,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
manually copy and modify one of the existing files (the new file name
should be the output of uname -s).

5.3.3. Installation
7.3.3. Installation

Either type make install or execute recollinstall prefix, in the root of
the source tree. This will copy the commands to prefix/bin and the sample
@@ -205,28 +205,41 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
Link: NEXT

Recoll user manual
Prev Chapter 5. Installation Next
Prev Chapter 7. Installation Next

--------------------------------------------------------------------------

5.4. Configuration overview
7.4. Configuration overview

Most of the parameters specific to the recoll GUI are set through the
Preferences menu and stored in the standard QT place ($HOME/.qt/recollrc).
You probably do not want to edit this by hand.

For other options, Recoll uses text configuration files. You will have to
edit them by hand for now (there is still some hope for a GUI
configuration tool in the future). The most accurate documentation for the
configuration parameters is given by comments inside the default files,
and we will just give a general overview here.
Recoll indexing options are set inside text configuration files located in
a configuration directory. There can be several such directories, each of
which defines the parameters for one index.

There are two sets of configuration files. The system-wide files are kept
in a directory named like /usr/[local/]share/recoll/examples, they define
default values for the system. A parallel set of files exists by default
in the .recoll directory in your home. This directory can be changed with
the RECOLL_CONFDIR environment variable or the -c option parameter to
recoll and recollindex.
The configuration files can be edited by hand or through the Indexing
configuration dialog (Preferences menu). The GUI tool will try to respect
your formatting and comments as much as possible, so it is quite possible
to use both ways.

The most accurate documentation for the configuration parameters is given
by comments inside the default files, and we will just give a general
overview here.

For each index, there are two sets of configuration files. System-wide
configuration files are kept in a directory named like
/usr/[local/]share/recoll/examples, and define default values, shared by
all indexes. For each index, a parallel set of files defines the
customized parameters.

The default location of the configuration is the .recoll directory in your
home. Most people will only use this directory.

This location can be changed, or others can be added with the
RECOLL_CONFDIR environment variable or the -c option parameter to recoll
and recollindex.

If the .recoll directory does not exist when recoll or recollindex are
started, it will be created with a set of empty configuration files.
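
The lookup of the configuration directory described in this hunk can be pictured with a minimal sketch. The function name resolve_confdir and its parameters are hypothetical, and the precedence shown (the -c value first, then RECOLL_CONFDIR, then the .recoll default) is an assumption: the text only says that either mechanism can change the location.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_confdir(cli_confdir: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None,
                    home: Optional[Path] = None) -> Path:
    """Pick the configuration directory for one index (illustrative only)."""
    env = os.environ if env is None else env
    if cli_confdir:
        # Value that would have been passed with the -c option.
        # Assumption: it wins over the environment variable.
        return Path(cli_confdir)
    if env.get("RECOLL_CONFDIR"):
        # Environment override named in the text.
        return Path(env["RECOLL_CONFDIR"])
    # Documented default: the .recoll directory in the user's home.
    return (Path.home() if home is None else home) / ".recoll"
```

Calling resolve_confdir() with no arguments reads the real environment; passing env and home explicitly keeps the function easy to check in isolation.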
@@ -267,7 +280,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
White space is used for separation inside lists. List elements with
embedded spaces can be quoted using double-quotes.

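As a rough illustration of the list syntax just described, the sketch below splits a value on white space while keeping double-quoted elements whole. It uses Python's shlex module as an approximation: shlex also interprets single quotes and backslashes, which may not match how Recoll itself reads these files. The function name and the sample paths are hypothetical.

```python
import shlex
from typing import List


def split_config_list(value: str) -> List[str]:
    """Split a white-space separated list, honouring double quotes.

    Approximation only: shlex applies POSIX shell quoting rules, which
    are broader than the rule stated in the documentation.
    """
    return shlex.split(value)


# Hypothetical value with one element containing an embedded space.
example = '/home/me/docs "/home/me/my papers" /srv/share'
elements = split_config_list(example)
```
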
5.4.1. Main configuration file
7.4.1. Main configuration file

recoll.conf is the main configuration file. It defines things like what to
index (top directories and things to ignore), and the default character
@@ -424,6 +437,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
If the variable is unspecified or the list empty (the default),
all supported types are processed.

compressedfilemaxkbs

Size limit for compressed (.gz or .bz2) files. These need to be
decompressed in a temporary directory for identification, which
can be very wasteful if 'uninteresting' big compressed files are
present. Negative means no limit, 0 means no processing of any
compressed file. Defaults to -1.

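The three cases of compressedfilemaxkbs (negative, zero, positive) can be summarised in a few lines. This is a sketch of the stated semantics, not the indexer's code: the function name is hypothetical, and both the 1024-byte kilobyte and the inclusive comparison at the limit are assumptions.

```python
def should_process_compressed(size_bytes: int,
                              compressedfilemaxkbs: int = -1) -> bool:
    """Decide whether a compressed file is worth decompressing."""
    if compressedfilemaxkbs < 0:
        # Negative (the default, -1): no limit.
        return True
    if compressedfilemaxkbs == 0:
        # Zero: never process compressed files.
        return False
    # Positive: size limit in kilobytes (assumed 1024 bytes each,
    # with a file exactly at the limit still accepted).
    return size_bytes <= compressedfilemaxkbs * 1024
```
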
indexallfilenames

Recoll indexes file names in a special section of the database to
@@ -475,7 +496,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
cases. A value of 3 would allow more precision and efficiency on
longer words, but the index will be approximately twice as large.

5.4.2. The mimemap file
7.4.2. The mimemap file

mimemap specifies the file name extension to mime type mappings.

@@ -499,7 +520,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
given Recoll version. Having it there avoids cluttering the more
user-oriented and locally customized skippedNames.

5.4.3. The mimeconf file
7.4.3. The mimeconf file

mimeconf specifies how the different mime types are handled for indexing,
and which icons are displayed in the recoll result lists.
@@ -511,7 +532,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
recoll in the result lists (the values are the basenames of the png images
inside the iconsdir directory (specified in recoll.conf).

5.4.4. The mimeview file
7.4.4. The mimeview file

mimeview specifies which programs are started when you click on an Edit
link in a result list. Ie: HTML is normally displayed using firefox, but
@@ -532,9 +553,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
user preferences, all mimeview entries will be ignored except the one
labelled application/x-all (which is set to use xdg-open by default).

5.4.5. Examples of configuration adjustments
7.4.5. Examples of configuration adjustments

5.4.5.1. Adding an external viewer for a non-indexed type
7.4.5.1. Adding an external viewer for a non-indexed type

Imagine that you have some kind of file which does not have indexable
content, but for which you would like to have a functional Edit link in
@@ -565,7 +586,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
The entries you add in your personal file override those in the central
configuration, which you do not need to alter

5.4.5.2. Adding indexing support for a new file type
7.4.5.2. Adding indexing support for a new file type

Let us now imagine that the above .blob files actually contain indexable
text and that you know how to extract it with a command line program.

420 src/README
@@ -12,9 +12,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

This document introduces full text search notions and describes the
installation and use of the Recoll application. It currently describes
Recoll 1.9.

[ Split HTML / Single HTML ]
Recoll 1.12.

----------------------------------------------------------------------

@@ -50,7 +48,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

2.5. Real time indexing

3. Searching
3. Searching with the Qt graphical user interface

3.1. Simple search

@@ -72,7 +70,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

3.9. Document history

3.10. Sorting search results
3.10. Sorting search results and collapsing duplicates

3.11. Search tips, shortcuts

@@ -84,51 +82,59 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

3.12. Customizing the search interface

4. Programming interface
4. Searching with the KDE KIO slave

4.1. Writing a document filter
4.1. What's this

4.1.1. Filter HTML output
4.2. Searchable documents

4.2. Field data processing configuration
5. Searching on the command line

4.3. API
6. Programming interface

4.3.1. Interface elements
6.1. Writing a document filter

4.3.2. Python interface
6.1.1. Filter HTML output

5. Installation
6.2. Field data processing configuration

5.1. Installing a prebuilt copy
6.3. API

5.1.1. Installing through a package system
6.3.1. Interface elements

5.1.2. Installing a prebuilt Recoll
6.3.2. Python interface

5.2. Supporting packages
7. Installation

5.3. Building from source
7.1. Installing a prebuilt copy

5.3.1. Prerequisites
7.1.1. Installing through a package system

5.3.2. Building
7.1.2. Installing a prebuilt Recoll

5.3.3. Installation
7.2. Supporting packages

5.4. Configuration overview
7.3. Building from source

5.4.1. Main configuration file
7.3.1. Prerequisites

5.4.2. The mimemap file
7.3.2. Building

5.4.3. The mimeconf file
7.3.3. Installation

5.4.4. The mimeview file
7.4. Configuration overview

5.4.5. Examples of configuration adjustments
7.4.1. Main configuration file

5.5. The KDE Kicker Recoll applet
7.4.2. The mimemap file

7.4.3. The mimeconf file

7.4.4. The mimeview file

7.4.5. Examples of configuration adjustments

7.5. The KDE Kicker Recoll applet

----------------------------------------------------------------------

@@ -143,7 +149,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

Do not do this if your home directory contains a huge number of documents
and you do not want to wait or are very short on disk space. In this case,
you may want to edit the configuration file first to restrict the indexed
you may first want to customize the configuration to restrict the indexed
area.

Also be aware that you may need to install the appropriate supporting
@@ -216,15 +222,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
currently makes no attempt at automatic language recognition.

Recoll has many parameters which define exactly what to index, and how to
classify and decode the source documents. These are kept in a
configuration file. A default configuration is copied into a standard
location (usually something like /usr/[local/]share/recoll/examples)
during installation. The default parameters from this file may be
overridden by values that you set inside your personal configuration,
found by default in the .recoll sub-directory of your home directory. The
default configuration will index your home directory with default
parameters and should be sufficient for giving Recoll a try, but you may
want to adjust it later.
classify and decode the source documents. These are kept in configuration
files. A default configuration is copied into a standard location (usually
something like /usr/[local/]share/recoll/examples) during installation.
The default parameters from this file may be overridden by values that you
set inside your personal configuration, found by default in the .recoll
sub-directory of your home directory. The default configuration will index
your home directory with default parameters and should be sufficient for
giving Recoll a try, but you may want to adjust it later.

Indexing is started automatically the first time you execute the recoll
search graphical user interface, or by executing the recollindex command.
@@ -419,9 +424,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

2.3.1. The indexing configuration GUI

As of Recoll 1.10, most parameters for a given indexing configuration can
be set from a recoll GUI running on this configuration (either as default,
or by setting RECOLL_CONFDIR or the -c option.)
Most parameters for a given indexing configuration can be set from a
recoll GUI running on this configuration (either as default, or by setting
RECOLL_CONFDIR or the -c option.)

The interface is started from the Preferences menu. It has two main
panels. The first panel allows setting global variables, like the list of
@@ -533,10 +538,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

----------------------------------------------------------------------

Chapter 3. Searching
Chapter 3. Searching with the Qt graphical user interface

The recoll program provides the user interface for searching. It is based
on the QT library.
The recoll program provides the main user interface for searching. It is
based on the QT library.

recoll has two search modes:

@@ -554,10 +559,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
from another text window, punctuation and all.

The main case where you should enter text differently from how it is
printed is for east-oriental languages written with Chinese characters.
Words composed of single or multiple characters should be entered
separated by white space in this case (they would typically be printed
without white space).
printed is for east-asian languages (Chinese, Japanese, Korean). Words
composed of single or multiple characters should be entered separated by
white space in this case (they would typically be printed without white
space).

----------------------------------------------------------------------

@@ -565,7 +570,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

1. Start the recoll program.

2. Possibly choose a search mode: Any term or All terms or File name.
2. Possibly choose a search mode: Any term, All terms, File name or Query
language.

3. Enter search term(s) in the text field at the top of the window.

@@ -579,7 +585,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
File name will specifically look for file names. The entry will be split
at white space characters, and each pattern will be separately expanded.
If you want to search for a pattern including white space, you need to use
double quotes.
double quotes. The point of having a separate file name search is that
wild card expansion can be performed more efficiently on a relatively
small subset of the index.

The fourth entry (Query Language) is described in its own section.

@@ -593,8 +601,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
Character case has no influence on search, except that you can disable
stem expansion for any term by capitalizing it. Ie: a search for floor
will also normally look for flooring, floored, etc., but a search for
Floor will only look for floor, in any character case (stemming can also
be disabled globally in the preferences).
Floor will only look for floor, in any character case. Stemming can also
be disabled globally in the preferences.

Recoll remembers the last few searches that you performed. You can use the
simple search text entry widget (a combobox) to recall them (click on the
@@ -634,17 +642,20 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
documents side by side. (You can also browse successive results in a
single preview window by typing Shift+ArrowUp/Down in the window).

Clicking the Edit link will attempt to start an external viewer. The
viewers can be configured through the user preferences dialog, or by
Clicking the Edit link will attempt to start an external editor. The
editors can be configured through the user preferences dialog, or by
editing the mimeview configuration file.

The Preview and Edit links may not be present for all entries,
meaning that Recoll has no configured way to preview a given file type
(which was indexed by name only), or no configured external viewer for the
(which was indexed by name only), or no configured external editor for the
file type. This can sometimes be adjusted simply by tweaking the mimemap
and mimeview configuration files (the latter can be modified with the user
preferences dialog).

The format of the result list entries is entirely configurable by using
the preference dialog to edit an HTML fragment.

You can click on the Query details link at the top of the results page to
see the query actually performed, after stem expansion and other
processing.
@@ -672,7 +683,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

* Copy Url

* Find similar
* Save to File

* Find similar

@@ -683,6 +694,12 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
The Copy File Name and Copy Url copy the relevant data to the clipboard,
for later pasting.

Save to File allows saving the contents of a result document to a chosen
file. This entry will only appear if the document does not correspond to
an existing file, but is a subdocument inside such a file (ie: an email
attachment). It is especially useful to extract attachments with no
associated editor.

The Find similar entry will select a number of relevant terms from the
current document and enter them into the simple search field. You can then
start a simple search, with a good chance of finding documents related to
@@ -732,6 +749,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
string is found, the cursor will be positioned at the first occurrence of
the search string.

A right-click menu in the text area allows switching between displaying
the main text or the contents of fields associated to the document (ie:
author, abstract, etc.). This is especially useful in cases where the term
match did not occur in the main text but in one of the fields.

----------------------------------------------------------------------

3.4. The query language
@@ -833,39 +855,60 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

3.5. Complex/advanced search

The advanced search dialog has a number of fields that will allow a more
refined search. Each entry field is configurable for the following modes:
The advanced search dialog helps you build more complex queries. It can be
opened through the Tools menu or through the main toolbar.

* All terms.
The dialog has three parts:

* Any term.
* The top part allows constructing a query by combining multiple clauses
of different types. Each entry field is configurable for the following
modes:

* None of the terms.
* All terms.

* Phrase (exact terms in order within an adjustable window).
* Any term.

* Proximity (terms in any order within an adjustable window).
* None of the terms.

* Filename search with wildcards.
* Phrase (exact terms in order within an adjustable window).

Additional entry fields can be created by clicking the Add clause button.
* Proximity (terms in any order within an adjustable window).

You can choose that all relevant fields will be combined by either an AND
or an OR conjunction. All types of clauses except "phrase" and "near" can
accept a mix of single words and phrases enclosed in double quotes.
Stemming expansion will be performed for all terms not beginning with a
capital letter, except for terms inside "phrase" clauses. Wildcards will
be processed everywhere.
* Filename search.

Advanced search will also let you search for documents of specific mime
types (ie: only text/plain, or text/HTML or application/pdf etc...). The
state of the file type selection can be saved as the default (the file
type filter will not be activated at program start-up, but the lists will
be in the restored state).
Additional entry fields can be created by clicking the Add clause
button.

You can also restrict the search results to a sub-tree of the indexed
area. If you need to do this often, you may think of setting up multiple
indexes instead, as the performance will be much better.
When searching, the non-empty clauses will be combined either with an
AND or an OR conjunction, depending on the choice made on the left
(All clauses or Any clause).

Entries of all types except "Phrase" and "Near" accept a mix of single
words and phrases enclosed in double quotes. Stemming and wildcard
expansion will be performed as for simple search.

* The next part allows filtering the results by their mime types.

The state of the file type selection can be saved as the default (the
file type filter will not be activated at program start-up, but the
lists will be in the restored state).

* The bottom part allows restricting the search results to a sub-tree of
the indexed area. If you need to do this often, you may think of
setting up multiple indexes instead, as the performance will be much
better.

Phrases and Proximity searches. These two clauses work in similar ways,
with the difference that proximity searches do not impose an order on the
words. In both cases, an adjustable number (slack) of non-matched words
may be accepted between the searched ones (use the counter on the left to
adjust this count). For phrases, the default count is zero (exact match).
For proximity it is ten (meaning that two search terms would be matched
if found within a window of twelve words). Examples: a phrase search for
quick fox with a slack of 0 will match quick fox but not quick brown fox.
With a slack of 1 it will match the latter, but not fox quick. A proximity
search for quick fox with the default slack will match the latter, and
also a fox is a cunning and quick animal.

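The slack rule for phrase and proximity clauses can be checked against the examples in the text with a small brute-force sketch. This is not how the search engine evaluates such clauses; it only encodes the stated rule that the matched terms must fit in a window of (number of terms + slack) words, in order for a phrase and in any order for proximity. Function names are hypothetical, and tokenisation is a plain lowercase split on white space.

```python
from itertools import product
from typing import List, Sequence


def _positions(words: List[str], terms: Sequence[str]) -> List[List[int]]:
    """For each term, list the word positions where it occurs."""
    return [[i for i, w in enumerate(words) if w == t.lower()] for t in terms]


def phrase_match(text: str, terms: Sequence[str], slack: int = 0) -> bool:
    """Terms in the given order, within a window of len(terms) + slack."""
    if not terms:
        return False
    words = text.lower().split()
    window = len(terms) + slack
    for combo in product(*_positions(words, terms)):
        in_order = all(a < b for a, b in zip(combo, combo[1:]))
        if in_order and combo[-1] - combo[0] + 1 <= window:
            return True
    return False


def proximity_match(text: str, terms: Sequence[str], slack: int = 10) -> bool:
    """Terms in any order, within a window of len(terms) + slack."""
    if not terms:
        return False
    words = text.lower().split()
    window = len(terms) + slack
    for combo in product(*_positions(words, terms)):
        distinct = len(set(combo)) == len(combo)
        if distinct and max(combo) - min(combo) + 1 <= window:
            return True
    return False
```

With terms ["quick", "fox"], the phrase check accepts "quick brown fox" only once the slack is raised to 1, and the proximity check with the default slack of ten accepts both "fox quick" and the longer example sentence.
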
Click on the Start Search button in the advanced search dialog, or type
Enter in any text field to start the search. The button in the main window
@@ -1020,7 +1063,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

----------------------------------------------------------------------

3.10. Sorting search results
3.10. Sorting search results and collapsing duplicates

The documents in a result list are normally sorted in order of relevance.
It is possible to specify different sort parameters by using the Sort
@@ -1038,6 +1081,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
possible to keep the sorting activation state between program invocations
by checking the Remember sort activation state option in the preferences.

It is also possible to hide duplicate entries inside the result list
(documents with the exact same contents as the displayed one). The test of
identity is based on an MD5 hash of the document container, not only of
the text contents (so that ie, a text document with an image added will
not be a duplicate of the text only). Duplicates hiding is controlled by
an entry in the Query configuration dialog, and is off by default.

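The idea of hiding duplicates by hashing the whole container can be sketched as follows. The function name and the (name, bytes) input shape are hypothetical; the only point taken from the text is that identity is decided by an MD5 digest of the complete container, so a file with the same text plus an added image is not a duplicate.

```python
import hashlib
from typing import Iterable, List, Tuple


def collapse_duplicates(results: Iterable[Tuple[str, bytes]]) -> List[str]:
    """Keep the first result for each distinct container digest."""
    seen = set()
    kept = []
    for name, container in results:
        # Hash the full container bytes, not just the extracted text.
        digest = hashlib.md5(container).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


# Hypothetical result list: two identical files and one that differs.
sample = [
    ("a.txt", b"hello"),
    ("copy/a.txt", b"hello"),
    ("a-with-image.odt", b"hello\x89PNG"),
]
visible = collapse_duplicates(sample)
```
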
----------------------------------------------------------------------

3.11. Search tips, shortcuts
@@ -1081,10 +1131,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

Phrases and Proximity searches. A phrase can be looked for by enclosing it
in double quotes. Example: "user manual" will look only for occurrences of
user immediately followed by manual. You can use the This exact phrase
field of the advanced search dialog to the same effect. Phrases can be
entered along simple terms in all simple or advanced search entry fields
(except This exact phrase).
user immediately followed by manual. You can use the This phrase field of
the advanced search dialog to the same effect. Phrases can be entered
along simple terms in all simple or advanced search entry fields (except
This exact phrase).

AutoPhrases. This option can be set in the preferences dialog. If it is
set, a phrase will be automatically built and added to simple searches
@@ -1136,6 +1186,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or

* Number of results in a result page:

* Hide duplicate results: decides if result list entries are shown for
identical documents found in different places.

* Highlight color for query terms: Terms from the user query are
highlighted in the result list samples and the preview window. The
color can be chosen here. Any QT color string should work (ie red,
@ -1267,7 +1320,107 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Chapter 4. Programming interface
|
||||
Chapter 4. Searching with the KDE KIO slave
|
||||
|
||||
4.1. What's this
|
||||
|
||||
The Recoll KIO slave allows performing a Recoll search by entering an
|
||||
appropriate URL in a KDE open dialog, or with an HTML-based interface
|
||||
displayed in Konqueror.
|
||||
|
||||
The HTML-based interface is similar to the QT-based interface, but
|
||||
slightly less powerful for now. Its advantage is that you can perform your
|
||||
search while staying fully within the KDE framework: drag and drop from
|
||||
the result list works normally and you have your normal choice of
|
||||
applications for opening files.
|
||||
|
||||
The alternative interface uses a directory view of search results. Due to
|
||||
limitations in the current KIO slave interface, it is currently not
|
||||
obviously useful (to me).
|
||||
|
||||
The interface is described in more detail inside a help file which you can
|
||||
access by entering recoll:/ inside the konqueror URL line (this works only
|
||||
if the recoll KIO slave has been previously installed).
|
||||
|
||||
The instructions for building this module are located in the source tree.
|
||||
See: kde/kio/recoll/00README.txt
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.2. Searchable documents
|
||||
|
||||
As a sample application, the Recoll KIO slave could allow preparing a set
|
||||
of HTML documents (for example a manual) so that they become their own
|
||||
search interface inside konqueror.
|
||||
|
||||
This can be done by either explicitly inserting <a href="recoll:/...">
|
||||
links around some document areas, or automatically by adding a very small
|
||||
JavaScript program to the documents, like the following example, which
|
||||
would initiate a search by double-clicking any term:
|
||||
|
||||
<script language="JavaScript">
|
||||
function recollsearch() {
|
||||
var t = document.getSelection();
|
||||
window.location.href = 'recoll://search/query?qtp=a&p=0&q=' +
|
||||
encodeURIComponent(t);
|
||||
}
|
||||
</script>
|
||||
....
|
||||
<body ondblclick="recollsearch()">
|
||||
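For documents prepared ahead of time rather than searched by double-click,
the same kind of link could be generated by a small script. The following
Python sketch reuses the URL shape from the JavaScript example above. The
qtp=a and p=0 parameters are copied verbatim from that example, since their
meaning is not described in this section, and the function names are
hypothetical:

```python
import html
from urllib.parse import quote

# URL prefix copied from the JavaScript example; qtp and p are kept as-is.
SEARCH_PREFIX = "recoll://search/query?qtp=a&p=0&q="


def recoll_search_url(term):
    """Build a recoll:// search URL for one term or phrase."""
    # safe="!*'()" approximates JavaScript's encodeURIComponent, which
    # leaves those characters (plus letters, digits and -_.~) unescaped.
    return SEARCH_PREFIX + quote(term, safe="!*'()")


def recoll_search_link(term, text=None):
    """Wrap a term in an HTML anchor pointing at the search URL."""
    label = term if text is None else text
    href = html.escape(recoll_search_url(term), quote=True)
    return '<a href="%s">%s</a>' % (href, html.escape(label))
```

Such links only do something useful in a browser where the Recoll KIO slave
is installed, as noted for the recoll:/ help page.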
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Chapter 5. Searching on the command line
|
||||
|
||||
There are several ways to obtain search results as a text stream, without
|
||||
a graphical interface:
|
||||
|
||||
* By passing option -t to the recoll program.
|
||||
|
||||
* By using the recollq program.
|
||||
|
||||
* By writing a custom Python program, using the Recoll Python API.
|
||||
|
||||
The first two methods work in the same way and accept/need the same
|
||||
arguments (except for the additional -t to recoll). The query to be
|
||||
executed is specified as command line arguments.
|
||||
|
||||
recollq is not built by default. You can use the Makefile in the query
|
||||
directory to build it. This is a very simple program, and it will often be
|
||||
useful to tailor its output format to your needs.
|
||||
|
||||
recollq has a man page (not installed by default, look in the doc/man
|
||||
directory). The Usage string is as follows:
|
||||
|
||||
recollq [-o|-a|-f] <query string>
|
||||
Runs a recoll query and displays result lines.
|
||||
Default: will interpret the argument(s) as a query language string
|
||||
-o Emulate the gui simple search in ANY TERM mode
|
||||
-a Emulate the gui simple search in ALL TERMS mode
|
||||
-f Emulate the gui simple search in filename mode
|
||||
Common options:
|
||||
-c <configdir> : specify config directory, overriding $RECOLL_CONFDIR
|
||||
-d also dump file contents
|
||||
-n <cnt> limit the maximum number of results (0->no limit, default 2000)
|
||||
-b : basic. Just output urls, no mime types or titles
|
||||
-m : dump the whole document meta[] array
|
||||
-S fld : sort by field name
|
||||
-D : sort descending
|
||||
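When recollq is driven from a script, it can help to assemble its argument
list in one place. The following Python sketch uses only the flags listed in
the usage string above. The function name and the mode labels ("any", "all",
"filename") are hypothetical names chosen for the illustration:

```python
def build_recollq_command(query, mode=None, confdir=None, max_results=None,
                          urls_only=False, sort_field=None, descending=False):
    """Return an argv list for recollq, built from the documented flags."""
    # Hypothetical labels for the three simple-search emulation modes.
    mode_flags = {"any": "-o", "all": "-a", "filename": "-f"}
    argv = ["recollq"]
    if mode is not None:
        argv.append(mode_flags[mode])      # default: query language string
    if confdir is not None:
        argv += ["-c", confdir]            # overrides $RECOLL_CONFDIR
    if max_results is not None:
        argv += ["-n", str(max_results)]   # 0 means no limit
    if urls_only:
        argv.append("-b")                  # urls only, no mime types or titles
    if sort_field is not None:
        argv += ["-S", sort_field]
        if descending:
            argv.append("-D")
    argv.append(query)                     # the query goes last, as one argument
    return argv
```

The resulting list can be handed to subprocess.run() once recollq has been
built and is on the PATH. Passing a list rather than a shell string avoids
quoting problems with queries that contain spaces.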
|
||||
Sample execution:
|
||||
|
||||
recollq 'ilur -nautique mime:text/html'
|
||||
Recoll query: ((((ilur:(wqf=11) OR ilurs) AND_NOT (nautique:(wqf=11)
|
||||
OR nautiques OR nautiqu OR nautiquement)) FILTER Ttext/html))
|
||||
4 results
|
||||
text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/comptes.html] [comptes.html] 18593 bytes
|
||||
text/html [file:///Users/uncrypted-dockes/projets/nautique/webnautique/articles/ilur1/index.html] [Constructio...
|
||||
text/html [file:///Users/uncrypted-dockes/projets/pagepers/index.html] [psxtcl/writemime/recoll]...
|
||||
text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/recu-chasse-maree....
|
||||
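A script that consumes this output has to split each result line into its
parts. The following Python sketch assumes the line layout seen in the sample
above (mime type, bracketed url, bracketed title, size in bytes). This layout
is inferred from the sample only, not from a format specification, and the
function name is hypothetical:

```python
import re

# Layout inferred from the sample output: mime [url] [title] NNN bytes
RESULT_LINE = re.compile(
    r"^(?P<mime>\S+)\s+\[(?P<url>.*?)\]\s+\[(?P<title>.*)\]"
    r"\s+(?P<size>\d+)\s+bytes\s*$"
)


def parse_result_line(line):
    """Return a dict for a result line, or None for any other line."""
    match = RESULT_LINE.match(line)
    if match is None:
        # Header lines such as "Recoll query: ..." or "4 results" end up here.
        return None
    return {
        "mime": match.group("mime"),
        "url": match.group("url"),
        "title": match.group("title"),
        "size": int(match.group("size")),
    }
```

When only the urls are needed, running recollq with -b is likely simpler,
since no parsing of titles is then required.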
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Chapter 6. Programming interface
|
||||
|
||||
Recoll has an application programming interface (API), usable both for indexing
|
||||
and searching, currently accessible from the Python language.
|
||||
@ -1280,7 +1433,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.1. Writing a document filter
|
||||
6.1. Writing a document filter
|
||||
|
||||
Recoll filters are executable programs which translate from a specific
|
||||
format (e.g. OpenOffice, Acrobat, etc.) to the Recoll indexing input
|
||||
@ -1334,7 +1487,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.1.1. Filter HTML output
|
||||
6.1.1. Filter HTML output
|
||||
|
||||
The output HTML could be very minimal like the following example:
|
||||
|
||||
@ -1367,7 +1520,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.2. Field data processing configuration
|
||||
6.2. Field data processing configuration
|
||||
|
||||
Fields are named pieces of information in or about documents, like title,
|
||||
author, abstract.
|
||||
@ -1402,9 +1555,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.3. API
|
||||
6.3. API
|
||||
|
||||
4.3.1. Interface elements
|
||||
6.3.1. Interface elements
|
||||
|
||||
A few elements in the interface are specific and need an explanation.
|
||||
|
||||
@ -1445,9 +1598,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.3.2. Python interface
|
||||
6.3.2. Python interface
|
||||
|
||||
4.3.2.1. Introduction
|
||||
6.3.2.1. Introduction
|
||||
|
||||
Recoll versions after 1.11 define a Python programming interface, both for
|
||||
searching and indexing.
|
||||
@ -1463,7 +1616,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.3.2.2. Interface manual
|
||||
6.3.2.2. Interface manual
|
||||
|
||||
NAME
|
||||
recoll - This is an interface to the Recoll full text indexer.
|
||||
@ -1653,7 +1806,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
4.3.2.3. Example code
|
||||
6.3.2.3. Example code
|
||||
|
||||
The following sample would query the index with a user language string.
|
||||
See the python/samples directory inside the Recoll source for other
|
||||
@ -1684,9 +1837,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Chapter 5. Installation
|
||||
Chapter 7. Installation
|
||||
|
||||
5.1. Installing a prebuilt copy
|
||||
7.1. Installing a prebuilt copy
|
||||
|
||||
Recoll binary packages from the Recoll web site are always linked
|
||||
statically to the Xapian libraries, and have no other dependencies. You
|
||||
@ -1697,14 +1850,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.1.1. Installing through a package system
|
||||
7.1.1. Installing through a package system
|
||||
|
||||
If you use a BSD-type port system or a prebuilt package (RPM or other),
|
||||
just follow the usual procedure for your system.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.1.2. Installing a prebuilt Recoll
|
||||
7.1.2. Installing a prebuilt Recoll
|
||||
|
||||
The unpackaged binary versions on the Recoll web site are just compressed
|
||||
tar files of a build tree, where only the useful parts were kept
|
||||
@ -1719,7 +1872,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.2. Supporting packages
|
||||
7.2. Supporting packages
|
||||
|
||||
Recoll uses external applications to index some file types. You need to
|
||||
install them for the file types that you wish to have indexed (these are
|
||||
@ -1767,9 +1920,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.3. Building from source
|
||||
7.3. Building from source
|
||||
|
||||
5.3.1. Prerequisites
|
||||
7.3.1. Prerequisites
|
||||
|
||||
At the very least, you will need to download and install the xapian core
|
||||
package (Recoll 1.9 normally uses version 1.0.2, but any 0.9 or 1.0.x
|
||||
@ -1787,7 +1940,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.3.2. Building
|
||||
7.3.2. Building
|
||||
|
||||
Recoll has been built on Linux (redhat7.3, mandriva 2005/6, Fedora Core
|
||||
3/4/5/6), FreeBSD 5/6, macosx, and Solaris 8. If you build on another
|
||||
@ -1827,7 +1980,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.3.3. Installation
|
||||
7.3.3. Installation
|
||||
|
||||
Either type make install or execute recollinstall prefix, in the root of
|
||||
the source tree. This will copy the commands to prefix/bin and the sample
|
||||
@ -1842,24 +1995,37 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4. Configuration overview
|
||||
7.4. Configuration overview
|
||||
|
||||
Most of the parameters specific to the recoll GUI are set through the
|
||||
Preferences menu and stored in the standard Qt location ($HOME/.qt/recollrc).
|
||||
You probably do not want to edit this by hand.
|
||||
|
||||
For other options, Recoll uses text configuration files. You will have to
|
||||
edit them by hand for now (there is still some hope for a GUI
|
||||
configuration tool in the future). The most accurate documentation for the
|
||||
configuration parameters is given by comments inside the default files,
|
||||
and we will just give a general overview here.
|
||||
Recoll indexing options are set inside text configuration files located in
|
||||
a configuration directory. There can be several such directories, each of
|
||||
which defines the parameters for one index.
|
||||
|
||||
There are two sets of configuration files. The system-wide files are kept
|
||||
in a directory named like /usr/[local/]share/recoll/examples, they define
|
||||
default values for the system. A parallel set of files exists by default
|
||||
in the .recoll directory in your home. This directory can be changed with
|
||||
the RECOLL_CONFDIR environment variable or the -c option parameter to
|
||||
recoll and recollindex.
|
||||
The configuration files can be edited by hand or through the Indexing
|
||||
configuration dialog (Preferences menu). The GUI tool will try to respect
|
||||
your formatting and comments as much as possible, so the two methods can
|
||||
be used together.
|
||||
|
||||
The most accurate documentation for the configuration parameters is given
|
||||
by comments inside the default files, and we will just give a general
|
||||
overview here.
|
||||
|
||||
For each index, there are two sets of configuration files. System-wide
|
||||
configuration files are kept in a directory named like
|
||||
/usr/[local/]share/recoll/examples, and define default values, shared by
|
||||
all indexes. For each index, a parallel set of files defines the
|
||||
customized parameters.
|
||||
|
||||
The default location of the configuration is the .recoll directory in your
|
||||
home. Most people will only use this directory.
|
||||
|
||||
This location can be changed, or others can be added with the
|
||||
RECOLL_CONFDIR environment variable or the -c option parameter to recoll
|
||||
and recollindex.
|
||||
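The lookup order described above can be sketched in Python. This assumes
that a -c value takes precedence over RECOLL_CONFDIR, as the recollq usage
string states, and that the same holds for recoll and recollindex. The
function name is hypothetical, and treating an empty value as unset is an
assumption of the sketch:

```python
import os


def resolve_confdir(cli_confdir=None, environ=None):
    """Pick the configuration directory: -c value, then env, then ~/.recoll."""
    if environ is None:
        environ = os.environ
    if cli_confdir:
        # Value given with -c on the command line.
        return cli_confdir
    env_dir = environ.get("RECOLL_CONFDIR")
    if env_dir:
        return env_dir
    # Default location: the .recoll directory in the user's home.
    return os.path.join(os.path.expanduser("~"), ".recoll")
```

Keeping one directory per index and selecting it through the environment or
the -c option is what allows several indexes to coexist.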
|
||||
If the .recoll directory does not exist when recoll or recollindex is
|
||||
started, it will be created with a set of empty configuration files.
|
||||
@ -1902,7 +2068,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.1. Main configuration file
|
||||
7.4.1. Main configuration file
|
||||
|
||||
recoll.conf is the main configuration file. It defines things like what to
|
||||
index (top directories and things to ignore), and the default character
|
||||
@ -2059,6 +2225,14 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
If the variable is unspecified or the list empty (the default),
|
||||
all supported types are processed.
|
||||
|
||||
compressedfilemaxkbs
|
||||
|
||||
Size limit for compressed (.gz or .bz2) files. These need to be
|
||||
decompressed in a temporary directory for identification, which
|
||||
can be very wasteful if 'uninteresting' big compressed files are
|
||||
present. Negative means no limit, 0 means no processing of any
|
||||
compressed file. Defaults to -1.
|
||||
|
||||
indexallfilenames
|
||||
|
||||
Recoll indexes file names in a special section of the database to
|
||||
@ -2112,7 +2286,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.2. The mimemap file
|
||||
7.4.2. The mimemap file
|
||||
|
||||
mimemap specifies the file name extension to mime type mappings.
|
||||
|
||||
@ -2138,7 +2312,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.3. The mimeconf file
|
||||
7.4.3. The mimeconf file
|
||||
|
||||
mimeconf specifies how the different mime types are handled for indexing,
|
||||
and which icons are displayed in the recoll result lists.
|
||||
@ -2152,7 +2326,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.4. The mimeview file
|
||||
7.4.4. The mimeview file
|
||||
|
||||
mimeview specifies which programs are started when you click on an Edit
|
||||
link in a result list. E.g. HTML is normally displayed using Firefox, but
|
||||
@ -2175,9 +2349,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.5. Examples of configuration adjustments
|
||||
7.4.5. Examples of configuration adjustments
|
||||
|
||||
5.4.5.1. Adding an external viewer for a non-indexed type
|
||||
7.4.5.1. Adding an external viewer for a non-indexed type
|
||||
|
||||
Imagine that you have some kind of file which does not have indexable
|
||||
content, but for which you would like to have a functional Edit link in
|
||||
@ -2210,7 +2384,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.4.5.2. Adding indexing support for a new file type
|
||||
7.4.5.2. Adding indexing support for a new file type
|
||||
|
||||
Let us now imagine that the above .blob files actually contain indexable
|
||||
text and that you know how to extract it with a command line program.
|
||||
@ -2241,7 +2415,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
5.5. The KDE Kicker Recoll applet
|
||||
7.5. The KDE Kicker Recoll applet
|
||||
|
||||
The Recoll source tree contains the source code to the recoll_applet, a
|
||||
small application derived from the find_applet. This can be used to add a
|
||||
|
||||