release 3928
parent
64b0c9ca32
commit
37f44f1b07
165
src/INSTALL
@ -16,45 +16,29 @@ Chapter 5. Installation and configuration
|
||||
|
||||
5.1. Installing a binary copy
|
||||
|
||||
There are three types of binary Recoll installations:
|
||||
Recoll binary copies are always distributed as regular packages for your
|
||||
system. They can be obtained either through the system's normal software
|
||||
distribution framework (e.g. Debian/Ubuntu apt, FreeBSD ports, etc.), or
|
||||
from some type of "backports" repository providing versions newer than the
|
||||
standard ones, or, in some cases, from the Recoll web site.
|
||||
|
||||
o Through your system normal software distribution framework (ie,
|
||||
Debian/Ubuntu apt, FreeBSD ports, etc.).
|
||||
There used to exist another form of binary install, as pre-compiled source
|
||||
trees, but these are just less convenient than the packages and don't
|
||||
exist any more.
|
||||
|
||||
o From a package downloaded from the Recoll web site.
|
||||
The package management tools will usually automatically deal with hard
|
||||
dependencies for packages obtained from a proper package repository. You
|
||||
will have to deal with them by hand for downloaded packages (for example,
|
||||
when dpkg complains about missing dependencies).
|
||||
|
||||
o From a prebuilt tree downloaded from the Recoll web site.
|
||||
|
||||
In all cases, the strict software dependancies (ie on Xapian or iconv)
|
||||
will be automatically satisfied, you should not have to worry about them.
|
||||
|
||||
You will only have to check or install supporting applications for the
|
||||
file types that you want to index beyond those that are natively processed
|
||||
by Recoll (text, HTML, email files, and a few others).
|
||||
In all cases, you will have to check or install supporting applications
|
||||
for the file types that you want to index beyond those that are natively
|
||||
processed by Recoll (text, HTML, email files, and a few others).
|
||||
|
||||
You should also maybe have a look at the configuration section (but this
|
||||
may not be necessary for a quick test with default parameters). Most
|
||||
parameters can be more conveniently set from the GUI interface.
|
||||
|
||||
5.1.1. Installing through a package system
|
||||
|
||||
If you use a BSD-type port system or a prebuilt package (DEB, RPM,
|
||||
manually or through the system software configuration utility), just
|
||||
follow the usual procedure for your system.
|
||||
|
||||
5.1.2. Installing a prebuilt Recoll
|
||||
|
||||
The unpackaged binary versions on the Recoll web site are just compressed
|
||||
tar files of a build tree, where only the useful parts were kept
|
||||
(executables and sample configuration).
|
||||
|
||||
The executable binary files are built with a static link to libxapian and
|
||||
libiconv, to make installation easier (no dependencies).
|
||||
|
||||
After extracting the tar file, you can proceed with installation as if you
|
||||
had built the package from source (that is, just type make install). The
|
||||
binary trees are built for installation to /usr/local.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Prev Next
|
||||
@ -282,7 +266,7 @@ Chapter 5. Installation and configuration
|
||||
Normal procedure:
|
||||
|
||||
cd recoll-xxx
|
||||
configure
|
||||
./configure
|
||||
make
|
||||
(practices usual hardship-repelling invocations)
|
||||
|
||||
@ -432,7 +416,51 @@ Chapter 5. Installation and configuration
|
||||
text files with appropriate encodings, and concatenate them to create
|
||||
the complete configuration.
|
||||
|
||||
5.4.1. The main configuration file, recoll.conf
|
||||
5.4.1. Environment variables
|
||||
|
||||
RECOLL_CONFDIR
|
||||
|
||||
Defines the main configuration directory.
|
||||
|
||||
RECOLL_TMPDIR, TMPDIR
|
||||
|
||||
Locations for temporary files, in this order of priority. The
|
||||
default if none of these is set is to use /tmp. Big temporary
|
||||
files may be created during indexing, mostly for decompressing,
|
||||
and also for processing, e.g. email attachments.
|
||||
|
||||
RECOLL_CONFTOP, RECOLL_CONFMID
|
||||
|
||||
Allow adding configuration directories with priorities below and
|
||||
above the user directory (see the Configuration overview
|
||||
section for details).
|
||||
|
||||
RECOLL_EXTRA_DBS, RECOLL_ACTIVE_EXTRA_DBS
|
||||
|
||||
Help for setting up external indexes. See this paragraph for
|
||||
explanations.
|
||||
|
||||
RECOLL_DATADIR
|
||||
|
||||
Defines a replacement for the default location of Recoll data files,
|
||||
normally found in, e.g., /usr/share/recoll.
|
||||
|
||||
RECOLL_FILTERSDIR
|
||||
|
||||
Defines a replacement for the default location of Recoll filters,
|
||||
normally found in, e.g., /usr/share/recoll/filters.
|
||||
|
||||
ASPELL_PROG
|
||||
|
||||
aspell program to use for creating the spelling dictionary. The
|
||||
result has to be compatible with the libaspell which Recoll is
|
||||
using.
|
||||
|
||||
VARNAME
|
||||
|
||||
Blabla
|
||||
|
||||
5.4.2. The main configuration file, recoll.conf
|
||||
|
||||
recoll.conf is the main configuration file. It defines things like what to
|
||||
index (top directories and things to ignore), and the default character
|
||||
@ -447,7 +475,7 @@ Chapter 5. Installation and configuration
|
||||
Configuration menu in the recoll interface. Some can only be set by
|
||||
editing the configuration file.
|
||||
|
||||
5.4.1.1. Parameters affecting what documents we index:
|
||||
5.4.2.1. Parameters affecting what documents we index:
|
||||
|
||||
topdirs
|
||||
|
||||
@ -481,8 +509,23 @@ Chapter 5. Installation and configuration
|
||||
like ~/.thunderbird or ~/.evolution in topdirs.
|
||||
|
||||
Not even the file names are indexed for patterns in this list. See
|
||||
the recoll_noindex variable in mimemap for an alternative approach
|
||||
which indexes the file names.
|
||||
the noContentSuffixes variable for an alternative approach which
|
||||
indexes the file names.
|
||||
|
||||
noContentSuffixes
|
||||
|
||||
This is a list of file name endings (not wildcard expressions, nor
|
||||
dot-delimited suffixes). Only the names of matching files will be
|
||||
indexed (no attempt at MIME type identification, no decompression,
|
||||
no content indexing). This can be redefined for subdirectories,
|
||||
and edited from the GUI. The default value is:
|
||||
|
||||
noContentSuffixes = .md5 .map \
|
||||
.o .lib .dll .a .sys .exe .com \
|
||||
.mpp .mpt .vsd \
|
||||
.img .img.gz .img.bz2 .img.xz .image .image.gz .image.bz2 .image.xz \
|
||||
.dat .bak .rdf .log.gz .log .db .msf .pid \
|
||||
,v ~ #
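The matching rule described above (plain file name endings, not wildcard expressions) can be sketched in shell as follows. This is only an illustration of the documented semantics, not code taken from Recoll: the function name name_only is hypothetical, and only a handful of entries from the default list are used.

```sh
#!/bin/sh
# Hypothetical check mimicking the documented rule: a file matches when its
# name ends with one of the listed strings (plain endings, not wildcards).
# Returns 0 when only the file name would be indexed, 1 otherwise.
# Assumption: only a few entries of the default list are reproduced here.
name_only() {
    for suffix in .md5 .o .img.gz .log ,v '~' '#'; do
        case "$1" in
            # The quoted part of the pattern is matched literally.
            *"$suffix") return 0 ;;
        esac
    done
    return 1
}

for f in 'disk.img.gz' 'main.c,v' 'notes.txt~' 'report.pdf'; do
    if name_only "$f"; then
        echo "$f: file name only"
    else
        echo "$f: regular indexing"
    fi
done
```

Note that endings such as ,v or ~ match even though they are not dot-delimited suffixes, which is why the parameter is described as a list of name endings.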
|
||||
|
||||
skippedPaths and daemSkippedPaths
|
||||
|
||||
@ -602,7 +645,7 @@ Chapter 5. Installation and configuration
|
||||
Firefox plugin as ~/.recollweb/ToIndex so there should be no need
|
||||
to change it.
|
||||
|
||||
5.4.1.2. Parameters affecting how we generate terms:
|
||||
5.4.2.2. Parameters affecting how we generate terms:
|
||||
|
||||
Changing some of these parameters will imply a full reindex. Also, when
|
||||
using multiple indexes, it may not make sense to search indexes that don't
|
||||
@ -777,7 +820,7 @@ Chapter 5. Installation and configuration
|
||||
|
||||
field1 and field2 will be set inside the document metadata.
|
||||
|
||||
5.4.1.3. Parameters affecting where and how we store things:
|
||||
5.4.2.3. Parameters affecting where and how we store things:
|
||||
|
||||
dbdir
|
||||
|
||||
@ -836,7 +879,7 @@ Chapter 5. Installation and configuration
|
||||
memory, you can try higher values between 20 and 80. In my
|
||||
experience, values beyond 100 are always counterproductive.
|
||||
|
||||
5.4.1.4. Parameters affecting multithread processing
|
||||
5.4.2.4. Parameters affecting multithread processing
|
||||
|
||||
The Recoll indexing process recollindex can use multiple threads to speed
|
||||
up indexing on multiprocessor systems. The work done to index files is
|
||||
@ -899,7 +942,7 @@ Chapter 5. Installation and configuration
|
||||
|
||||
thrQSizes = -1 -1 -1
|
||||
|
||||
5.4.1.5. Miscellaneous parameters:
|
||||
5.4.2.5. Miscellaneous parameters:
|
||||
|
||||
autodiacsens
|
||||
|
||||
@ -929,6 +972,16 @@ Chapter 5. Installation and configuration
|
||||
value, and is the default. The daemversion is specific to the
|
||||
indexing monitor daemon.
|
||||
|
||||
checkneedretryindexscript
|
||||
|
||||
This defines the name for a command executed by recollindex when
|
||||
starting indexing. If the exit status of the command is 0,
|
||||
recollindex retries indexing all files which previously could not
|
||||
be indexed because of data extraction errors. The default value is
|
||||
a script which checks if any of the common bin directories have
|
||||
changed (indicating that a helper program may have been
|
||||
installed).
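As a rough sketch of what such a check could look like, the function below reports success (exit status 0, meaning "retry") when any of the given directories changed after a stamp file was last touched. It is not the script distributed with Recoll: the function name dirs_changed, the stamp file and the directory list are all assumptions made for illustration.

```sh
#!/bin/sh
# Hypothetical stand-in for a retry check; NOT the script shipped with Recoll.
# Usage: dirs_changed STAMP DIR...
# Returns 0 (retry wanted) if STAMP does not exist or if any DIR was modified
# after STAMP, and 1 otherwise.
dirs_changed() {
    stamp=$1
    shift
    [ -e "$stamp" ] || return 0
    for dir in "$@"; do
        # A directory's modification time changes when entries are added or
        # removed, for example when a helper program is installed.
        if [ -d "$dir" ] && [ -n "$(find "$dir" -prune -newer "$stamp")" ]; then
            return 0
        fi
    done
    return 1
}

# Possible use as a standalone script (paths are examples only):
#   dirs_changed "$HOME/.recoll/retry.stamp" /usr/bin /usr/local/bin
#   status=$?
#   touch "$HOME/.recoll/retry.stamp"
#   exit "$status"
```

Checking only directory modification times is cheap, but it cannot tell whether the change is actually relevant to a previously failed file.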
|
||||
|
||||
mondelaypatterns
|
||||
|
||||
This allows specifying wildcard path patterns (processed with
|
||||
@ -1019,7 +1072,7 @@ Chapter 5. Installation and configuration
|
||||
be set for directories which hold Thunderbird data, as their
|
||||
folder format is weird.
|
||||
|
||||
5.4.2. The fields file
|
||||
5.4.3. The fields file
|
||||
|
||||
This file contains information about dynamic fields handling in Recoll.
|
||||
Some very basic fields have hard-wired behaviour, and, mostly, you should
|
||||
@ -1090,7 +1143,7 @@ Chapter 5. Installation and configuration
|
||||
# mailmytag field name
|
||||
x-my-tag = mailmytag
|
||||
|
||||
5.4.2.1. Extended attributes in the fields file
|
||||
5.4.3.1. Extended attributes in the fields file
|
||||
|
||||
Recoll versions 1.19 and later process user extended file attributes as
|
||||
document fields by default.
|
||||
@ -1102,7 +1155,7 @@ Chapter 5. Installation and configuration
|
||||
translations from extended attributes names to Recoll field names. An
|
||||
empty translation disables use of the corresponding attribute data.
|
||||
|
||||
5.4.3. The mimemap file
|
||||
5.4.4. The mimemap file
|
||||
|
||||
mimemap specifies the file name extension to MIME type mappings.
|
||||
|
||||
@ -1115,18 +1168,12 @@ Chapter 5. Installation and configuration
|
||||
handled specially, which is possible because they are usually all located
|
||||
in one place.
|
||||
|
||||
mimemap also has a recoll_noindex variable which is a list of suffixes.
|
||||
Matching files will be skipped (which avoids unnecessary decompressions or
|
||||
file executions). This is partially redundant with skippedNames in the
|
||||
main configuration file, with a few differences: it will not affect
|
||||
directories, it cannot be made dependant on the file-system location (it
|
||||
is a configuration-wide parameter), and the file names will still be
|
||||
indexed (not even the file names are indexed for patterns in skippedNames.
|
||||
recoll_noindex is used mostly for things known to be unindexable by a
|
||||
given Recoll version. Having it there avoids cluttering the more
|
||||
user-oriented and locally customized skippedNames.
|
||||
The recoll_noindex mimemap variable has been moved to recoll.conf and
|
||||
renamed to noContentSuffixes, while keeping the same function, as of
|
||||
Recoll version 1.21. For older Recoll versions, see the documentation for
|
||||
noContentSuffixes but use recoll_noindex in mimemap.
|
||||
|
||||
5.4.4. The mimeconf file
|
||||
5.4.5. The mimeconf file
|
||||
|
||||
mimeconf specifies how the different MIME types are handled for indexing,
|
||||
and which icons are displayed in the recoll result lists.
|
||||
@ -1138,7 +1185,7 @@ Chapter 5. Installation and configuration
|
||||
recoll in the result lists (the values are the basenames of the png images
|
||||
inside the iconsdir directory (specified in recoll.conf).
|
||||
|
||||
5.4.5. The mimeview file
|
||||
5.4.6. The mimeview file
|
||||
|
||||
mimeview specifies which programs are started when you click on an Open
|
||||
link in a result list. For example, HTML is normally displayed using firefox, but
|
||||
@ -1207,7 +1254,7 @@ Chapter 5. Installation and configuration
|
||||
document. This could be used in combination with field customisation to
|
||||
help with opening the document.
|
||||
|
||||
5.4.6. The ptrans file
|
||||
5.4.7. The ptrans file
|
||||
|
||||
ptrans specifies query-time path translations. These can be useful in
|
||||
multiple cases.
|
||||
@ -1226,9 +1273,9 @@ Chapter 5. Installation and configuration
|
||||
/server/volume2/docdir = /net/server/volume2/docdir
|
||||
|
||||
|
||||
5.4.7. Examples of configuration adjustments
|
||||
5.4.8. Examples of configuration adjustments
|
||||
|
||||
5.4.7.1. Adding an external viewer for an non-indexed type
|
||||
5.4.8.1. Adding an external viewer for a non-indexed type
|
||||
|
||||
Imagine that you have some kind of file which does not have indexable
|
||||
content, but for which you would like to have a functional Open link in
|
||||
@ -1258,7 +1305,7 @@ Chapter 5. Installation and configuration
|
||||
configuration, which you do not need to alter. mimeview can also be
|
||||
modified from the GUI.
|
||||
|
||||
5.4.7.2. Adding indexing support for a new file type
|
||||
5.4.8.2. Adding indexing support for a new file type
|
||||
|
||||
Let us now imagine that the above .blob files actually contain indexable
|
||||
text and that you know how to extract it with a command line program.
|
||||
|
||||
343
src/README
@ -8,7 +8,7 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
<jfd@recoll.org>
|
||||
|
||||
Copyright (c) 2005-2014 Jean-Francois Dockes
|
||||
Copyright (c) 2005-2015 Jean-Francois Dockes
|
||||
|
||||
Permission is granted to copy, distribute and/or modify this document
|
||||
under the terms of the GNU Free Documentation License, Version 1.3 or any
|
||||
@ -17,8 +17,8 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
license can be found at the following location: GNU web site.
|
||||
|
||||
This document introduces full text search notions and describes the
|
||||
installation and use of the Recoll application. It currently describes
|
||||
Recoll 1.20.
|
||||
installation and use of the Recoll application. This version describes
|
||||
Recoll 1.21.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
|
||||
@ -42,7 +42,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
2.1.3. Document types
|
||||
|
||||
2.1.4. Recovery
|
||||
2.1.4. Indexing failures
|
||||
|
||||
2.1.5. Recovery
|
||||
|
||||
2.2. Index storage
|
||||
|
||||
@ -107,7 +109,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
3.1.13. Search tips, shortcuts
|
||||
|
||||
3.1.14. Customizing the search interface
|
||||
3.1.14. Saving and restoring queries (1.21 and
|
||||
later)
|
||||
|
||||
3.1.15. Customizing the search interface
|
||||
|
||||
3.2. Searching with the KDE KIO slave
|
||||
|
||||
@ -163,10 +168,6 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
5.1. Installing a binary copy
|
||||
|
||||
5.1.1. Installing through a package system
|
||||
|
||||
5.1.2. Installing a prebuilt Recoll
|
||||
|
||||
5.2. Supporting packages
|
||||
|
||||
5.3. Building from source
|
||||
@ -179,19 +180,21 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or
|
||||
|
||||
5.4. Configuration overview
|
||||
|
||||
5.4.1. The main configuration file, recoll.conf
|
||||
5.4.1. Environment variables
|
||||
|
||||
5.4.2. The fields file
|
||||
5.4.2. The main configuration file, recoll.conf
|
||||
|
||||
5.4.3. The mimemap file
|
||||
5.4.3. The fields file
|
||||
|
||||
5.4.4. The mimeconf file
|
||||
5.4.4. The mimemap file
|
||||
|
||||
5.4.5. The mimeview file
|
||||
5.4.5. The mimeconf file
|
||||
|
||||
5.4.6. The ptrans file
|
||||
5.4.6. The mimeview file
|
||||
|
||||
5.4.7. Examples of configuration adjustments
|
||||
5.4.7. The ptrans file
|
||||
|
||||
5.4.8. Examples of configuration adjustments
|
||||
|
||||
Chapter 1. Introduction
|
||||
|
||||
@ -352,9 +355,20 @@ Chapter 2. Indexing
|
||||
index build can be forced later by specifying an option to the indexing
|
||||
command (recollindex -z or -Z).
|
||||
|
||||
recollindex skips files which caused an error during a previous pass. This
|
||||
is a performance optimization, and a new behaviour in version 1.21 (failed
|
||||
files were always retried by previous versions). The command line option
|
||||
-k can be set to retry failed files, for example after updating a filter.
|
||||
|
||||
The following sections give an overview of different aspects of the
|
||||
indexing processes and configuration, with links to detailed sections.
|
||||
|
||||
Depending on your data, temporary files may be needed during indexing,
|
||||
some of them possibly quite big. You can use the RECOLL_TMPDIR or TMPDIR
|
||||
environment variables to determine where they are created (the default is
|
||||
to use /tmp). Using TMPDIR has the nice property that it may also be taken
|
||||
into account by auxiliary commands executed by recollindex.
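The lookup order for the temporary directory (RECOLL_TMPDIR, then TMPDIR, then /tmp, as listed in the environment variables section) can be mimicked with a small shell function. This is a sketch of the documented priority only: the name pick_tmpdir is hypothetical, and treating an empty variable like an unset one is an assumption.

```sh
#!/bin/sh
# Hypothetical helper: prints the directory where temporary files would go,
# following the documented priority RECOLL_TMPDIR, then TMPDIR, then /tmp.
# Assumption: an empty value is treated like an unset variable.
pick_tmpdir() {
    printf '%s\n' "${RECOLL_TMPDIR:-${TMPDIR:-/tmp}}"
}

# Example invocation pointing recollindex and its auxiliary commands at a
# larger volume (/var/tmp/recoll is an arbitrary example path):
#   TMPDIR=/var/tmp/recoll recollindex
```

Setting TMPDIR rather than RECOLL_TMPDIR is the choice to prefer when the helper commands should write to the same place.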
|
||||
|
||||
2.1.1. Indexing modes
|
||||
|
||||
Recoll indexing can be performed in two different modes:
|
||||
@ -462,7 +476,28 @@ Chapter 2. Indexing
|
||||
main configuration file (recoll.conf), or from the GUI index configuration
|
||||
tool.
|
||||
|
||||
2.1.4. Recovery
|
||||
2.1.4. Indexing failures
|
||||
|
||||
Indexing may fail for some documents, for a number of reasons: a helper
|
||||
program may be missing, the document may be corrupt, we may fail to
|
||||
uncompress a file because no file system space is available, etc.
|
||||
|
||||
Recoll versions prior to 1.21 always retried indexing files which had
|
||||
previously caused an error. This guaranteed that anything that may have
|
||||
become indexable (for example because a helper had been installed) would
|
||||
be indexed. However this was bad for performance because some indexing
|
||||
failures may be quite costly (for example failing to uncompress a big file
|
||||
because of insufficient disk space).
|
||||
|
||||
The indexer in Recoll versions 1.21 and later does not retry failed files by
|
||||
default. Retrying will only occur if an explicit option (-k) is set on the
|
||||
recollindex command line, or if a script executed when recollindex starts
|
||||
up says so. The script is defined by a configuration variable
|
||||
(checkneedretryindexscript), and makes a rather lame attempt at deciding
|
||||
if a helper command may have been installed, by checking if any of the
|
||||
common bin directories have changed.
|
||||
|
||||
2.1.5. Recovery
|
||||
|
||||
In the rare case where the index becomes corrupted (which can signal
|
||||
itself by weird search results or crashes), the index files need to be
|
||||
@ -785,6 +820,9 @@ Chapter 2. Indexing
|
||||
rebuilt, which can be a significant advantage if it is very big (some
|
||||
installations need days for a full index rebuild).
|
||||
|
||||
Option -k will force retrying files which previously failed to be indexed,
|
||||
for example because of a missing helper program.
|
||||
|
||||
Of special interest also, maybe, are the -i and -f options. -i allows
|
||||
indexing an explicit list of files (given as command line parameters or
|
||||
read on stdin). -f tells recollindex to ignore file selection parameters
|
||||
@ -867,11 +905,12 @@ Chapter 2. Indexing
|
||||
option -x to disable X11 session monitoring (else the daemon will not
|
||||
start).
|
||||
|
||||
By default, the messages from the indexing daemon will be discarded. You
|
||||
may want to change this by setting the daemlogfilename and daemloglevel
|
||||
configuration parameters. Also the log file will only be truncated when
|
||||
the daemon starts. If the daemon runs permanently, the log file may grow
|
||||
quite big, depending on the log level.
|
||||
By default, the messages from the indexing daemon will be sent to the same
|
||||
file as those from the interactive commands (logfilename). You may want to
|
||||
change this by setting the daemlogfilename and daemloglevel configuration
|
||||
parameters. Also the log file will only be truncated when the daemon
|
||||
starts. If the daemon runs permanently, the log file may grow quite big,
|
||||
depending on the log level.
|
||||
|
||||
When building Recoll, the real time indexing support can be customised
|
||||
during package configuration with the --with[out]-fam or
|
||||
@ -946,6 +985,10 @@ Chapter 3. Searching
|
||||
white space in this case (they would typically be printed without white
|
||||
space).
|
||||
|
||||
Some searches can be quite complex, and you may want to re-use them later,
|
||||
perhaps with some tweaking. Recoll versions 1.21 and later can save and
|
||||
restore searches, using XML files. See Saving and restoring queries.
|
||||
|
||||
3.1.1. Simple search
|
||||
|
||||
1. Start the recoll program.
|
||||
@ -1373,6 +1416,8 @@ Chapter 3. Searching
|
||||
memorizing the search language constructs. It can be opened through the
|
||||
Tools menu or through the main toolbar.
|
||||
|
||||
Recoll keeps a history of searches. See Advanced search history.
|
||||
|
||||
The dialog has two tabs:
|
||||
|
||||
1. The first tab lets you specify terms to search for, and permits
|
||||
@ -1745,7 +1790,24 @@ Chapter 3. Searching
|
||||
|
||||
Quitting. Entering Ctrl-Q almost anywhere will close the application.
|
||||
|
||||
3.1.14. Customizing the search interface
|
||||
3.1.14. Saving and restoring queries (1.21 and later)
|
||||
|
||||
Both simple and advanced query dialogs save recent history, but the amount
|
||||
is limited: old queries will eventually be forgotten. Also, important
|
||||
queries may be difficult to find among others. This is why both types of
|
||||
queries can also be explicitly saved to files, from the GUI menus: File
|
||||
-> Save last query / Load last query
|
||||
|
||||
The default location for saved queries is a subdirectory of the current
|
||||
configuration directory, but saved queries are ordinary files and can be
|
||||
written or moved anywhere.
|
||||
|
||||
Some of the saved query parameters are part of the preferences (e.g.
|
||||
autophrase or the active external indexes), and may differ when the query
|
||||
is loaded from the time it was saved. In this case, Recoll will warn of
|
||||
the differences, but will not change the user preferences.
|
||||
|
||||
3.1.15. Customizing the search interface
|
||||
|
||||
You can customize some aspects of the search interface by using the GUI
|
||||
configuration entry in the Preferences menu.
|
||||
@ -1912,29 +1974,33 @@ Chapter 3. Searching
|
||||
alternative indexer may also need to implement a way of purging the index
|
||||
from stale data.
|
||||
|
||||
3.1.14.1. The result list format
|
||||
3.1.15.1. The result list format
|
||||
|
||||
Newer versions of Recoll (from 1.17) normally use WebKit HTML widgets for
|
||||
the result list and the snippets window (this may be disabled at build
|
||||
time). Total customisation is possible with full support for CSS and
|
||||
Javascript. Conversely, there are limits to what you can do with the older
|
||||
Qt QTextBrowser, but still, it is possible to decide what data each result
|
||||
will contain, and how it will be displayed.
|
||||
|
||||
The result list presentation can be exhaustively customized by adjusting
|
||||
two elements:
|
||||
|
||||
o The paragraph format
|
||||
|
||||
o HTML code inside the header section
|
||||
o HTML code inside the header section. For versions 1.21 and later, this
|
||||
is also used for the snippets window
|
||||
|
||||
These can be edited from the Result list tab of the GUI configuration.
|
||||
The paragraph format and the header fragment can be edited from the Result
|
||||
list tab of the GUI configuration.
|
||||
|
||||
Newer versions of Recoll (from 1.17) use a WebKit HTML object by default
|
||||
(this may be disabled at build time), and total customisation is possible
|
||||
with full support for CSS and Javascript. Conversely, there are limits to
|
||||
what you can do with the older Qt QTextBrowser, but still, it is possible
|
||||
to decide what data each result will contain, and how it will be
|
||||
displayed.
|
||||
The header fragment is used both for the result list and the snippets
|
||||
window. The snippets list is a table and has a snippets class attribute.
|
||||
Each paragraph in the result list is a table, with class respar, but this
|
||||
can be changed by editing the paragraph format.
|
||||
|
||||
No more detail will be given about the header part (only useful with the
|
||||
WebKit build), if there are restrictions to what you can do, they are
|
||||
beyond this author's HTML/CSS/Javascript abilities... There are a few
|
||||
examples on the page about customising the result list on the Recoll web
|
||||
site.
|
||||
There are a few examples on the page about customising the result list on
|
||||
the Recoll web site.
|
||||
|
||||
The paragraph format
|
||||
|
||||
@ -1997,9 +2063,13 @@ Chapter 3. Searching
|
||||
|
||||
The default value for the paragraph format string is:
|
||||
|
||||
<img src="%I" align="left">%R %S %L <b>%T</b><br>
|
||||
%M %D <i>%U</i> %i<br>
|
||||
%A %K
|
||||
"<table class=\"respar\">\n"
|
||||
"<tr>\n"
|
||||
"<td><a href='%U'><img src='%I' width='64'></a></td>\n"
|
||||
"<td>%L <i>%S</i> <b>%T</b><br>\n"
|
||||
"<span style='white-space:nowrap'><i>%M</i> %D</span> <i>%U</i> %i<br>\n"
|
||||
"%A %K</td>\n"
|
||||
"</tr></table>\n"
|
||||
|
||||
You may, for example, try the following for a more web-like experience:
|
||||
|
||||
@ -2205,7 +2275,8 @@ Chapter 3. Searching
|
||||
|
||||
An element is composed of an optional field specification, and a value,
|
||||
separated by a colon (the field separator is the last colon in the
|
||||
element). Example: Eugenie, author:balzac, dc:title:grandet
|
||||
element). Examples: Eugenie, author:balzac, dc:title:grandet
|
||||
dc:title:"eugenie grandet"
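The "last colon" rule can be illustrated with shell parameter expansion. The helper below is a sketch under stated assumptions, not Recoll's parser: the name split_element is hypothetical, and values which themselves contain a colon (for example inside double quotes) are not handled.

```sh
#!/bin/sh
# Hypothetical helper illustrating the rule "the field separator is the last
# colon in the element". Prints "field=<field> value=<value>".
# Assumption: the value itself contains no colon.
split_element() {
    case "$1" in
        # ${1%:*} drops the last colon and what follows; ${1##*:} keeps
        # only what follows the last colon.
        *:*) printf 'field=%s value=%s\n' "${1%:*}" "${1##*:}" ;;
        *)   printf 'field= value=%s\n' "$1" ;;
    esac
}

split_element Eugenie
split_element author:balzac
split_element dc:title:grandet
```

The three calls print "field= value=Eugenie", "field=author value=balzac" and "field=dc:title value=grandet", showing that dc:title is taken as a whole as the field name.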
|
||||
|
||||
The colon, if present, means "contains". Xesam defines other relations,
|
||||
which are mostly unsupported for now (except in special cases, described
|
||||
@ -2218,13 +2289,22 @@ Chapter 3. Searching
|
||||
(word2 OR word3) not (word1 AND word2) OR word3. Explicit parentheses are
|
||||
not supported.
|
||||
|
||||
An element preceded by a - specifies a term that should not appear. Pure
|
||||
negative queries are forbidden.
|
||||
As of Recoll 1.21, you can use parentheses to group elements, which will
|
||||
sometimes make things clearer, and may allow expressing combinations which
|
||||
would have been difficult otherwise.
|
||||
|
||||
An element preceded by a - specifies a term that should not appear.
|
||||
|
||||
As usual, words inside quotes define a phrase (the order of words is
|
||||
significant), so that title:"prejudice pride" is not the same as
|
||||
title:prejudice title:pride, and is unlikely to find a result.
|
||||
|
||||
Words inside phrases and capitalized words are not stem-expanded.
|
||||
Wildcards may be used anywhere inside a term. Specifying a wildcard on
|
||||
the left of a term can produce a very slow search (or even an incorrect
|
||||
one if the expansion is truncated because of excessive size). Also see
|
||||
More about wildcards.
|
||||
|
||||
To save you some typing, recent Recoll versions (1.20 and later) interpret
|
||||
a comma-separated list of terms as an AND list inside the field. Use slash
|
||||
characters ('/') for an OR list. No white space is allowed. So
|
||||
@ -2238,8 +2318,10 @@ Chapter 3. Searching
|
||||
|
||||
would search for john or ringo.
|
||||
|
||||
Modifiers can be set on a phrase clause, for example to specify a
|
||||
proximity search (unordered). See the modifier section.
|
||||
Modifiers can be set on a double-quote value, for example to specify a
|
||||
proximity search (unordered). See the modifier section. No space must
|
||||
separate the final double-quote and the modifiers value, e.g. "two
|
||||
one"po10
|
||||
|
||||
Recoll currently manages the following default fields:
|
||||
|
||||
@ -2356,12 +2438,6 @@ Chapter 3. Searching
|
||||
permit filtering results in the main GUI screen. Categories are OR'ed
|
||||
like MIME types above. This can't be negated with - either.
|
||||
|
||||
Words inside phrases and capitalized words are not stem-expanded.
|
||||
Wildcards may be used anywhere inside a term. Specifying a wild-card on
|
||||
the left of a term can produce a very slow search (or even an incorrect
|
||||
one if the expansion is truncated because of excessive size). Also see
|
||||
More about wildcards.
|
||||
|
||||
The document input handlers used while indexing have the possibility to
|
||||
create other fields with arbitrary names, and aliases may be defined in
|
||||
the configuration, so that the exact field search possibilities may be
|
||||
@ -3249,45 +3325,29 @@ Chapter 5. Installation and configuration
|
||||
|
||||
5.1. Installing a binary copy
|
||||
|
||||
There are three types of binary Recoll installations:
|
||||
Recoll binary copies are always distributed as regular packages for your
|
||||
system. They can be obtained either through the system's normal software
|
||||
distribution framework (e.g. Debian/Ubuntu apt, FreeBSD ports, etc.), or
|
||||
from some type of "backports" repository providing versions newer than the
|
||||
standard ones, or, in some cases, from the Recoll web site.
|
||||
|
||||
o Through your system normal software distribution framework (ie,
|
||||
Debian/Ubuntu apt, FreeBSD ports, etc.).
|
||||
There used to exist another form of binary install, as pre-compiled source
|
||||
trees, but these are just less convenient than the packages and don't
|
||||
exist any more.
|
||||
|
||||
o From a package downloaded from the Recoll web site.
|
||||
The package management tools will usually automatically deal with hard
|
||||
dependencies for packages obtained from a proper package repository. You
|
||||
will have to deal with them by hand for downloaded packages (for example,
|
||||
when dpkg complains about missing dependencies).
|
||||
|
||||
o From a prebuilt tree downloaded from the Recoll web site.
|
||||
|
||||
In all cases, the strict software dependancies (ie on Xapian or iconv)
|
||||
will be automatically satisfied, you should not have to worry about them.
|
||||
|
||||
You will only have to check or install supporting applications for the
|
||||
file types that you want to index beyond those that are natively processed
|
||||
by Recoll (text, HTML, email files, and a few others).
|
||||
In all cases, you will have to check or install supporting applications
|
||||
for the file types that you want to index beyond those that are natively
|
||||
processed by Recoll (text, HTML, email files, and a few others).
|
||||
|
||||
You should also maybe have a look at the configuration section (but this
|
||||
may not be necessary for a quick test with default parameters). Most
|
||||
parameters can be more conveniently set from the GUI interface.
|
||||
|
||||
5.1.1. Installing through a package system
|
||||
|
||||
If you use a BSD-type port system or a prebuilt package (DEB, RPM,
|
||||
manually or through the system software configuration utility), just
|
||||
follow the usual procedure for your system.
|
||||
|
||||
5.1.2. Installing a prebuilt Recoll
|
||||
|
||||
The unpackaged binary versions on the Recoll web site are just compressed
|
||||
tar files of a build tree, where only the useful parts were kept
|
||||
(executables and sample configuration).
|
||||
|
||||
The executable binary files are built with a static link to libxapian and
|
||||
libiconv, to make installation easier (no dependencies).
|
||||
|
||||
After extracting the tar file, you can proceed with installation as if you
|
||||
had built the package from source (that is, just type make install). The
|
||||
binary trees are built for installation to /usr/local.
|
||||
|
||||
5.2. Supporting packages
|
||||
|
||||
Recoll uses external applications to index some file types. You need to
|
||||
@ -3487,7 +3547,7 @@ Chapter 5. Installation and configuration
|
||||
Normal procedure:
|
||||
|
||||
cd recoll-xxx
|
||||
configure
|
||||
./configure
|
||||
make
|
||||
(practices usual hardship-repelling invocations)
|
||||
|
||||
@ -3624,7 +3684,51 @@ Chapter 5. Installation and configuration
|
||||
text files with appropriate encodings, and concatenate them to create
|
||||
the complete configuration.
|
||||
|
||||
5.4.1. The main configuration file, recoll.conf
|
||||
5.4.1. Environment variables
|
||||
|
||||
RECOLL_CONFDIR
|
||||
|
||||
Defines the main configuration directory.
|
||||
|
||||
RECOLL_TMPDIR, TMPDIR
|
||||
|
||||
Locations for temporary files, in this order of priority. The
|
||||
default if none of these is set is to use /tmp. Big temporary
|
||||
files may be created during indexing, mostly for decompressing,
|
||||
and also for processing, e.g. email attachments.
|
||||
|
||||
RECOLL_CONFTOP, RECOLL_CONFMID
|
||||
|
||||
Allow adding configuration directories with priorities below and
above the user directory (see the Configuration overview section
above for details).
|
||||
|
||||
RECOLL_EXTRA_DBS, RECOLL_ACTIVE_EXTRA_DBS
|
||||
|
||||
Help for setting up external indexes. See this paragraph for
|
||||
explanations.
|
||||
|
||||
RECOLL_DATADIR
|
||||
|
||||
Defines a replacement for the default location of Recoll data files,
normally found in, e.g., /usr/share/recoll.
|
||||
|
||||
RECOLL_FILTERSDIR
|
||||
|
||||
Defines a replacement for the default location of Recoll filters,
normally found in, e.g., /usr/share/recoll/filters.
|
||||
|
||||
ASPELL_PROG
|
||||
|
||||
aspell program to use for creating the spelling dictionary. The
|
||||
result has to be compatible with the libaspell which Recoll is
|
||||
using.
|
||||
|
||||
VARNAME
|
||||
|
||||
Blabla
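
As an illustration of the priority order described for RECOLL_TMPDIR and
TMPDIR, the lookup could be sketched in shell as below. The function name
recoll_tmpdir is hypothetical and not part of Recoll, the example path is
arbitrary, and the sketch treats an empty variable like an unset one, which
is an assumption.

```shell
# Hypothetical helper, not part of Recoll: print the temporary directory
# that would be chosen under the documented priority order
# (RECOLL_TMPDIR first, then TMPDIR, then /tmp).
# Assumption: an empty variable is treated the same as an unset one.
recoll_tmpdir() {
    printf '%s\n' "${RECOLL_TMPDIR:-${TMPDIR:-/tmp}}"
}

# Example: send large temporary files to a roomier location for this
# session. The path is only an example.
RECOLL_TMPDIR="$HOME/recoll-tmp"
export RECOLL_TMPDIR
recoll_tmpdir
```

Exporting the variable before starting the indexer is what matters in
practice; the function only makes the fallback order visible.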
|
||||
|
||||
5.4.2. The main configuration file, recoll.conf
|
||||
|
||||
recoll.conf is the main configuration file. It defines things like what to
|
||||
index (top directories and things to ignore), and the default character
|
||||
@ -3639,7 +3743,7 @@ Chapter 5. Installation and configuration
|
||||
Configuration menu in the recoll interface. Some can only be set by
|
||||
editing the configuration file.
|
||||
|
||||
5.4.1.1. Parameters affecting what documents we index:
|
||||
5.4.2.1. Parameters affecting what documents we index:
|
||||
|
||||
topdirs
|
||||
|
||||
@ -3673,8 +3777,23 @@ Chapter 5. Installation and configuration
|
||||
like ~/.thunderbird or ~/.evolution in topdirs.
|
||||
|
||||
Not even the file names are indexed for patterns in this list. See
|
||||
the recoll_noindex variable in mimemap for an alternative approach
|
||||
which indexes the file names.
|
||||
the noContentSuffixes variable for an alternative approach which
|
||||
indexes the file names.
|
||||
|
||||
noContentSuffixes
|
||||
|
||||
This is a list of file name endings (not wildcard expressions, nor
|
||||
dot-delimited suffixes). Only the names of matching files will be
|
||||
indexed (no attempt at MIME type identification, no decompression,
|
||||
no content indexing). This can be redefined for subdirectories,
|
||||
and edited from the GUI. The default value is:
|
||||
|
||||
noContentSuffixes = .md5 .map \
|
||||
.o .lib .dll .a .sys .exe .com \
|
||||
.mpp .mpt .vsd \
|
||||
.img .img.gz .img.bz2 .img.xz .image .image.gz .image.bz2 .image.xz \
|
||||
.dat .bak .rdf .log.gz .log .db .msf .pid \
|
||||
,v ~ #
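
The matching rule described above (plain file name endings, neither
wildcard expressions nor dot-delimited suffixes) can be pictured with the
shell sketch below. The function name is hypothetical, and the code only
illustrates the documented semantics; it is not how Recoll implements the
check.

```shell
# Hypothetical illustration, not Recoll code: succeed if the file name in
# $1 ends with any of the endings given as the remaining arguments.
# The endings are compared as literal strings, so ',v', '~' and '#' work
# even though they are not dot-delimited extensions.
matches_no_content_suffix() {
    name=$1
    shift
    for suffix in "$@"; do
        case $name in
            # The quotes make the ending literal: no wildcard interpretation.
            *"$suffix") return 0 ;;
        esac
    done
    return 1
}

# Usage: an RCS file matches the ',v' ending.
if matches_no_content_suffix 'main.c,v' .md5 .log ',v' '~' '#'; then
    echo "only the file name would be indexed"
fi
```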
|
||||
|
||||
skippedPaths and daemSkippedPaths
|
||||
|
||||
@ -3794,7 +3913,7 @@ Chapter 5. Installation and configuration
|
||||
Firefox plugin as ~/.recollweb/ToIndex so there should be no need
|
||||
to change it.
|
||||
|
||||
5.4.1.2. Parameters affecting how we generate terms:
|
||||
5.4.2.2. Parameters affecting how we generate terms:
|
||||
|
||||
Changing some of these parameters will imply a full reindex. Also, when
|
||||
using multiple indexes, it may not make sense to search indexes that don't
|
||||
@ -3969,7 +4088,7 @@ Chapter 5. Installation and configuration
|
||||
|
||||
field1 and field2 will be set inside the document metadata.
|
||||
|
||||
5.4.1.3. Parameters affecting where and how we store things:
|
||||
5.4.2.3. Parameters affecting where and how we store things:
|
||||
|
||||
dbdir
|
||||
|
||||
@ -4028,7 +4147,7 @@ Chapter 5. Installation and configuration
|
||||
memory, you can try higher values between 20 and 80. In my
|
||||
experience, values beyond 100 are always counterproductive.
|
||||
|
||||
5.4.1.4. Parameters affecting multithread processing
|
||||
5.4.2.4. Parameters affecting multithread processing
|
||||
|
||||
The Recoll indexing process recollindex can use multiple threads to speed
|
||||
up indexing on multiprocessor systems. The work done to index files is
|
||||
@ -4091,7 +4210,7 @@ Chapter 5. Installation and configuration
|
||||
|
||||
thrQSizes = -1 -1 -1
|
||||
|
||||
5.4.1.5. Miscellaneous parameters:
|
||||
5.4.2.5. Miscellaneous parameters:
|
||||
|
||||
autodiacsens
|
||||
|
||||
@ -4121,6 +4240,16 @@ Chapter 5. Installation and configuration
|
||||
value, and is the default. The daemversion is specific to the
|
||||
indexing monitor daemon.
|
||||
|
||||
checkneedretryindexscript
|
||||
|
||||
This defines the name for a command executed by recollindex when
|
||||
starting indexing. If the exit status of the command is 0,
recollindex will retry indexing all files which previously could not
|
||||
be indexed because of data extraction errors. The default value is
|
||||
a script which checks if any of the common bin directories have
|
||||
changed (indicating that a helper program may have been
|
||||
installed).
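
To make the idea concrete, here is a minimal sketch of such a check,
assuming a stamp file that records the time of the previous check. The
function name, the stamp file and the directory list are all hypothetical;
this is not the script shipped with Recoll. It relies on the -nt test
operator, which most shells provide but which POSIX does not require.

```shell
# Hypothetical sketch, not the script shipped with Recoll: succeed (status 0)
# if any of the given directories was modified after the stamp file, or if
# no stamp file exists yet. A directory's modification time changes when
# entries are added to or removed from it.
bin_dirs_changed() {
    stamp=$1
    shift
    # No previous check recorded: report a change so that a retry happens.
    [ -e "$stamp" ] || return 0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ "$dir" -nt "$stamp" ]; then
            return 0
        fi
    done
    return 1
}
```

A wrapper script could call bin_dirs_changed with directories such as
/usr/bin and /usr/local/bin, touch the stamp file afterwards, and exit with
the function's status.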
|
||||
|
||||
mondelaypatterns
|
||||
|
||||
This allows specifying wildcard path patterns (processed with
|
||||
@ -4211,7 +4340,7 @@ Chapter 5. Installation and configuration
|
||||
be set for directories which hold Thunderbird data, as their
|
||||
folder format is weird.
|
||||
|
||||
5.4.2. The fields file
|
||||
5.4.3. The fields file
|
||||
|
||||
This file contains information about dynamic fields handling in Recoll.
|
||||
Some very basic fields have hard-wired behaviour, and, mostly, you should
|
||||
@ -4282,7 +4411,7 @@ Chapter 5. Installation and configuration
|
||||
# mailmytag field name
|
||||
x-my-tag = mailmytag
|
||||
|
||||
5.4.2.1. Extended attributes in the fields file
|
||||
5.4.3.1. Extended attributes in the fields file
|
||||
|
||||
Recoll versions 1.19 and later process user extended file attributes as
|
||||
documents fields by default.
|
||||
@ -4294,7 +4423,7 @@ Chapter 5. Installation and configuration
|
||||
translations from extended attributes names to Recoll field names. An
|
||||
empty translation disables use of the corresponding attribute data.
|
||||
|
||||
5.4.3. The mimemap file
|
||||
5.4.4. The mimemap file
|
||||
|
||||
mimemap specifies the file name extension to MIME type mappings.
|
||||
|
||||
@ -4307,18 +4436,12 @@ Chapter 5. Installation and configuration
|
||||
handled specially, which is possible because they are usually all located
|
||||
in one place.
|
||||
|
||||
mimemap also has a recoll_noindex variable which is a list of suffixes.
|
||||
Matching files will be skipped (which avoids unnecessary decompressions or
|
||||
file executions). This is partially redundant with skippedNames in the
|
||||
main configuration file, with a few differences: it will not affect
|
||||
directories, it cannot be made dependent on the file-system location (it
is a configuration-wide parameter), and the file names will still be
indexed (not even the file names are indexed for patterns in skippedNames).
|
||||
recoll_noindex is used mostly for things known to be unindexable by a
|
||||
given Recoll version. Having it there avoids cluttering the more
|
||||
user-oriented and locally customized skippedNames.
|
||||
The recoll_noindex mimemap variable has been moved to recoll.conf and
|
||||
renamed to noContentSuffixes, while keeping the same function, as of
|
||||
Recoll version 1.21. For older Recoll versions, see the documentation for
|
||||
noContentSuffixes but use recoll_noindex in mimemap.
|
||||
|
||||
5.4.4. The mimeconf file
|
||||
5.4.5. The mimeconf file
|
||||
|
||||
mimeconf specifies how the different MIME types are handled for indexing,
|
||||
and which icons are displayed in the recoll result lists.
|
||||
@ -4330,7 +4453,7 @@ Chapter 5. Installation and configuration
|
||||
recoll in the result lists (the values are the basenames of the png images
|
||||
inside the iconsdir directory (specified in recoll.conf).
|
||||
|
||||
5.4.5. The mimeview file
|
||||
5.4.6. The mimeview file
|
||||
|
||||
mimeview specifies which programs are started when you click on an Open
|
||||
link in a result list. For example, HTML is normally displayed using firefox, but
|
||||
@ -4399,7 +4522,7 @@ Chapter 5. Installation and configuration
|
||||
document. This could be used in combination with field customisation to
|
||||
help with opening the document.
|
||||
|
||||
5.4.6. The ptrans file
|
||||
5.4.7. The ptrans file
|
||||
|
||||
ptrans specifies query-time path translations. These can be useful in
|
||||
multiple cases.
|
||||
@ -4418,9 +4541,9 @@ Chapter 5. Installation and configuration
|
||||
/server/volume2/docdir = /net/server/volume2/docdir
|
||||
|
||||
|
||||
5.4.7. Examples of configuration adjustments
|
||||
5.4.8. Examples of configuration adjustments
|
||||
|
||||
5.4.7.1. Adding an external viewer for a non-indexed type
5.4.8.1. Adding an external viewer for a non-indexed type
|
||||
|
||||
Imagine that you have some kind of file which does not have indexable
|
||||
content, but for which you would like to have a functional Open link in
|
||||
@ -4450,7 +4573,7 @@ Chapter 5. Installation and configuration
|
||||
configuration, which you do not need to alter. mimeview can also be
|
||||
modified from the GUI.
|
||||
|
||||
5.4.7.2. Adding indexing support for a new file type
|
||||
5.4.8.2. Adding indexing support for a new file type
|
||||
|
||||
Let us now imagine that the above .blob files actually contain indexable
|
||||
text and that you know how to extract it with a command line program.
|
||||
|
||||