diff --git a/src/doc/user/usermanual.html b/src/doc/user/usermanual.html index 9cb3df43..1a9cf345 100644 --- a/src/doc/user/usermanual.html +++ b/src/doc/user/usermanual.html @@ -92,11 +92,11 @@ alink="#0000FF"> "#RCL.INDEXING.INTRODUCTION.CONFIG">Configurations, multiple indexes
List of Tables
C:/Program Files
- (x86)/Recoll, esp. anything referenced in /usr/share in this document will be found
- int the Share subdirectory).
- The user configuration is stored by default under
- AppData/Local/Recoll inside the
- user directory, along with the index itself.
+ (x86)/Recoll). In particular, anything referenced inside
+ /usr/share in this document
+ will be found in the Share
+ subdirectory of the installation. The user configuration is
+ stored by default under AppData/Local/Recoll inside the user
+ directory, along with the index itself.
The indexing process is started automatically (after asking permission), the first time you @@ -684,8 +686,8 @@ alink="#0000FF"> "command">recollindex command. Recoll indexing is multithreaded by default when appropriate hardware - resources are available, and can perform in parallel - multiple tasks for text extraction, segmentation and index + resources are available, and can perform multiple tasks in + parallel for text extraction, segmentation and index updates.
Searches are usually @@ -701,6 +703,16 @@ alink="#0000FF"> "3.4. Searching on the command line">command line interface.
+A Gnome Shell Search Provider .
+A "https://www.lesbonscomptes.com/recoll/pages/download.html" target="_top">Scope module.
A Gnome Shell Search Provider.
-Real - time indexing . (Only available on - Unix-like - systems). Real time + indexing . recollindex runs permanently as a daemon and uses a file system alteration monitor (e.g. inotify) to detect file - changes. New or updated files are indexed at once. - Monitoring a big file system tree can consume + "application">inotify on Unix-like systems) to detect + file changes. New or updated files are indexed at + once. Monitoring a big file system tree can consume significant system resources.
The choice between the two methods is mostly a matter of preference, and they can be combined by - setting up multiple indexes (ie: use periodic indexing - on a big documentation directory, and real time - indexing on a small home directory), or, with - Recoll 1.24 and newer, - by configuring - the index so that only a subset of the tree will be + setting up multiple indexes (e.g.: use periodic + indexing on a big documentation directory, and real + time indexing on a small home directory), or by + configuring the index + so that only a subset of the tree will be monitored.
The choice of method and the parameters used can be
configured from the
is located in $HOME/.recoll/ for Unix-like systems and %LOCALAPPDATA%\Recoll on %LOCALAPPDATA%/Recoll on Windows (typically C:\Users\[me]\Appdata\Local\Recoll).
All configuration parameters have defaults, defined in system-wide files. Without further customisation, the default configuration will process your complete home @@ -936,11 +933,11 @@ alink="#0000FF"> many cases where you know the subset of files that should be searched, and where narrowing the search can improve the results. You can achieve approximately the same - effect with the directory filter in advanced search, but - multiple indexes may have better performance and may be - worth the trouble in some cases.
+ effect by using a directory filter clause in a search, + but multiple indexes may have better performance and may + be worth the trouble in some cases.A more advanced use case would be to use multiple - index to improve indexing performance, by updating + indexes to improve indexing performance, by updating several indexes in parallel (using multiple CPU cores and disks, or possibly several machines), and then merging them, or querying them in parallel.
@@ -953,8 +950,8 @@ alink="#0000FF"> @@ -977,7 +974,7 @@ alink="#0000FF"> "command">recollindex processes plain text, HTML, OpenDocument (Open/LibreOffice), email formats, and a few others internally. -Other file types (ie: postscript, pdf, ms-word, rtf +
Other file types (e.g.: postscript, pdf, ms-word, rtf ...) need external applications for preprocessing. The list is in the menu. Excluding by type can be done by setting the excludedmimetypes - list in the configuration file (1.20 and later). This can - be redefined for subdirectories.
+ list in the configuration file. This can be redefined for + subdirectories.You can also define an exclusive list of MIME types to
be indexed (no others will be indexed), by setting the
@@ -1098,8 +1095,8 @@ alink="#0000FF">
@@ -1135,7 +1132,10 @@ alink="#0000FF">
xapiandb subdirectory of the
Recoll configuration
directory, typically $HOME/.recoll/xapiandb/. This can be
+ "filename">$HOME/.recoll/xapiandb/ on Unix-like systems or C:/Users/[me]/Appdata/Local/Recoll/xapiandb
+ on Windows. This can be
changed via two different methods (with different
purposes):
The first time you start "varname">topdirs, which lists the subtrees and files to be indexed.
The applications needed to index file types other than - text, HTML or email (ie: pdf, postscript, ms-word...) are + text, HTML or email (e.g.: pdf, postscript, ms-word...) are described in the external packages @@ -1418,7 +1418,7 @@ alink="#0000FF"> using a text editor on the files, or, for most parameters, by using the recoll index configuration GUI. In the latter case, the configuration directory for which parameters are modified @@ -1468,7 +1468,7 @@ mkdir -c option (you could create a desktop file to do it for you), and then using the GUI + title="2.3.3. The index configuration GUI">GUI index configuration tool to set up the index.
recoll -c
Most parameters for a given index configuration can be
+ set from a recoll GUI running on
+ this configuration (either as default, or by setting
+ RECOLL_CONFDIR or the
+ -c option.)
The interface is started from the Preferences → Index Configuration menu entry. It + is divided in four tabs, Global + parameters, Local + parameters, Web + history (which is explained in the next section) + and Search parameters.
+The Global parameters + tab allows setting global variables, like the lists of + top directories, skipped paths, or stemming + languages.
+The Local parameters tab + allows setting variables that can be redefined for + subdirectories. This second tab has an initially empty + list of customisation directories, to which you can add. + The variables are then set for the currently selected + directory (or at the top level if the empty line is + selected).
+The Search parameters + section defines parameters which are used at query time, + but are global to an index and affect all search tools, + not only the GUI.
+The meaning for most entries in the interface is
+ self-evident and documented by a ToolTip popup on the text label. For
+ more detail, you will need to refer to the configuration
+ section of this guide.
The configuration tool normally respects the comments + and most of the formatting inside the configuration file, + so that it is quite possible to use it on hand-edited + files, which you might nevertheless want to backup + first...
+Note: you probably don't need to read this. The
+ default automatic configuration is fine in most cases.
+ Only the part about disabling multithreading may be more
+ commonly useful, so I'll prepend it here. In recoll.conf:
+ thrQSizes = -1 -1 -1 +
The Recoll indexing process recollindex can use @@ -1678,58 +1738,6 @@ recoll -c
Most parameters for a given index configuration can be
- set from a recoll GUI running on
- this configuration (either as default, or by setting
- RECOLL_CONFDIR or the
- -c option.)
The interface is started from the Preferences → Index Configuration menu entry. It - is divided in four tabs, Global - parameters, Local - parameters, Web - history (which is explained in the next section) - and Search parameters.
-The Global parameters - tab allows setting global variables, like the lists of - top directories, skipped paths, or stemming - languages.
-The Local parameters tab - allows setting variables that can be redefined for - subdirectories. This second tab has an initially empty - list of customisation directories, to which you can add. - The variables are then set for the currently selected - directory (or at the top level if the empty line is - selected).
-The Search parameters - section defines parameters which are used at query time, - but are global to an index and affect all search tools, - not only the GUI.
-The meaning for most entries in the interface is
- self-evident and documented by a ToolTip popup on the text label. For
- more detail, you will need to refer to the configuration
- section of this guide.
The configuration tool normally respects the comments - and most of the formatting inside the configuration file, - so that it is quite possible to use it on hand-edited - files, which you might nevertheless want to backup - first...
--m command. With this option, recollindex will detach
- from the terminal and become a daemon, permanently
- monitoring file changes and updating the index.
- In this situation, the recollindex will + permanently monitor file changes and update the index.
+On Windows systems, the + monitoring process is started from the recoll GUI File menu. On Unix-like systems, there are other + possibilities, see the following sections.
+When this is in use, the recoll GUI menu makes two operations available:
and
Automatic
- daemon start with systemd
+ "RCL.INDEXING.MONITOR.START.SYSTEMD">Unix-like
+ systems: automatic daemon start with systemd
Automatic daemon
- start from the desktop session
+ "RCL.INDEXING.MONITOR.START">Unix-like systems: automatic
+ daemon start from the desktop session
@@ -2708,7 +2721,8 @@ fvwm
Also the log file will only be truncated when the daemon
starts. If the daemon runs permanently, the log file may
grow quite big, depending on the log level.
- Increasing resources for inotify. On Linux
+
Unix-like systems:
+ increasing resources for inotify. On Linux
systems, monitoring a big tree may need increasing the
resources available to inotify, which are normally
defined in Slowing down the reindexing rate for fast changing
files. When using the real time monitor, it may
happen that some files need to be indexed, but change so
- often that they impose an excessive load for the system.
- Recoll provides a
+ often that they impose an excessive load for the
+ system.Recoll provides a
configuration option to specify the minimum time before
which a file, specified by a wildcard pattern, cannot be
reindexed. See the
Simple search (the default, on the main screen)
has a single entry field where you can enter multiple
- words.
+ words or a query language query.
Advanced search (a panel accessed through the
@@ -2855,7 +2869,13 @@ fs.inotify.max_user_watches=32768
- In most cases, you can enter the terms as you think
+
The Advanced Search tool
+ is easier to use, but not actually more powerful, than the
+ Simple Search in query
+ language mode. Its name is historical, but Assisted Search would probably have been
+ a better designation.
+ In most text areas, you can enter the terms as you think
them, even if they contain embedded punctuation or other
non-textual characters (e.g. Recoll can handle things like email
@@ -2870,8 +2890,8 @@ fs.inotify.max_user_watches=32768
re-use them later, perhaps with some tweaking. Recoll can save and restore searches.
See
- Saving and restoring queries.
+ "3.2.16. Saving and restoring queries">Saving and
+ restoring queries.
@@ -2915,9 +2935,10 @@ fs.inotify.max_user_watches=32768
directives, this will look for documents containing all
of the search terms (the ones with more terms will get
better scores), just like the All
- terms mode. Any term
- will search for documents where at least one of the terms
- appear. File name will
+ Terms mode.
+ Any term will search for
+ documents where at least one of the terms appears.
+ File name will
exclusively look for file names, not contents
All search modes allow terms to be expanded with
wildcards characters (*,
@@ -2938,8 +2959,8 @@ fs.inotify.max_user_watches=32768
When using a stripped index (the default), character
case has no influence on search, except that you can
disable stem expansion for any term by capitalizing it.
- Ie: a search for floor will
- also normally look for floor
+ will also normally look for flooring, floored, etc., but a search for
Floor will only look for
@@ -2974,12 +2995,6 @@ fs.inotify.max_user_watches=32768
mode from the Query
Language mode, where you have to care about the
syntax.
- You can use the Tools → Advanced search dialog for more
- complex searches.
The File name search
mode will specifically look for file names. The point of
having a separate file name search is that wildcard
@@ -3001,11 +3016,12 @@ fs.inotify.max_user_watches=32768
An entry without any wildcard character and not
capitalized will be prepended and appended with '*'
- (ie: etc
+ (e.g.: etc ->
+ *etc*,
+ but Etc
-> *etc*, but
- Etc ->
- etc).
+ "replaceable">etc).
If you have a big index (many files),
@@ -3242,7 +3258,7 @@ fs.inotify.max_user_watches=32768
file. It will only appear for results which are
top-level files. See
+ "3.2.5. Unix-like systems: running arbitrary commands on result files">
further for a more detailed description.
The Copy File Name and
Copy Url copy the
@@ -3251,7 +3267,7 @@ fs.inotify.max_user_watches=32768
saving the contents of a result document to a chosen
file. This entry will only appear if the document does
not correspond to an existing file, but is a
- subdocument inside such a file (ie: an email
+ subdocument inside such a file (e.g.: an email
attachment). It is especially useful to extract
attachments with no associated editor.
The Open/Preview Parent
@@ -3325,6 +3341,49 @@ fs.inotify.max_user_watches=32768
application, and an equivalent right-click menu. Typing
Esc (the
Escape key) will unfreeze the display.
+ Using Shift-click on a row will display the document
+ extracted text (somewhat like a preview) instead of the
+ document details. The functions of Click and Shift-Click
+ can be reversed in the GUI preferences.
+
+
+
+ By default, the GUI displays the filters panel on the
+ left of the results area. This is new in version 1.32.
+ You can adjust the width of the panel, and hide it by
+ squeezing it completely. The width will be memorized for
+ the next session.
+ The panel currently has two areas, for filtering the
+ results by dates, or by filesystem location.
+ The panel is only active in Query Language search mode, and its
+ effect is to add date: and
+ dir: clauses to the actual
+ search.
+ The dates filter can be activated by clicking the
+ checkbox. It has two assisted date entry widgets, for the
+ minimum and maximum dates of the search period.
+ The directory filter displays a subset of the
+ filesystem directories, reduced to the indexed area, as
+ defined by the topdirs list
+ and the name exclusion parameters. You can independently
+ select and deselect directories by clicking them. Note
+ that selecting a directory will activate the whole
+ subtree for searching: there is no need to select the
+ subdirectories, and no way to exclude some of them (use
+ Query language dir: clauses if this is needed).
@@ -3332,7 +3391,7 @@ fs.inotify.max_user_watches=32768
@@ -3385,7 +3444,7 @@ fs.inotify.max_user_watches=32768
@@ -3417,7 +3476,7 @@ fs.inotify.max_user_watches=32768
@@ -3451,7 +3510,7 @@ fs.inotify.max_user_watches=32768
keys).
A right-click menu in the text area allows switching
between displaying the main text or the contents of
- fields associated to the document (ie: author, abtract,
+ fields associated to the document (e.g.: author, abstract,
etc.). This is especially useful in cases where the term
match did not occur in the main text but in one of the
fields. In the case of images, you can switch between
@@ -3465,7 +3524,7 @@ fs.inotify.max_user_watches=32768
"keycap">Ctrl + P) in the window
text.
-
+
@@ -3537,7 +3596,7 @@ fs.inotify.max_user_watches=32768
@@ -3655,8 +3714,8 @@ fs.inotify.max_user_watches=32768
@@ -3679,9 +3738,9 @@ fs.inotify.max_user_watches=32768
are combined to build the search.
- The second tab lets filter the results according
- to file size, date of modification, MIME type, or
- location.
+ The second tab allows filtering the results
+ according to file size, date of modification, MIME
+ type, or location.
@@ -3704,7 +3763,7 @@ fs.inotify.max_user_watches=32768
- This part of the dialog lets you constructc a query
+
This part of the dialog lets you construct a query
by combining multiple clauses of different types. Each
entry field is configurable for the following
modes:
@@ -3797,12 +3856,12 @@ fs.inotify.max_user_watches=32768
"literal">k/K, m/M, g/G, t/T for 1E3, 1E6, 1E9, 1E12
+ "literal">t/T for 1E3, 1E6, 1E9, 1E12
respectively.
The next section allows filtering the results
- by their MIME types, or MIME categories (ie:
+ by their MIME types, or MIME categories (e.g.:
media/text/message/etc.).
You can transfer the types between two boxes,
to define which will be included or excluded by
@@ -3823,7 +3882,7 @@ fs.inotify.max_user_watches=32768
multiple indexes instead, as the performance may
be better.
You can use relative/partial paths for
- filtering. Ie, entering dirA/dirB would match either
/dir1/dirA/dirB/myfile1 or
@@ -3861,16 +3920,16 @@ fs.inotify.max_user_watches=32768
Recoll automatically
manages the expansion of search terms to their
- derivatives (ie: plural/singular, verb inflections). But
- there are other cases where the exact search term is not
- known. For example, you may not remember the exact
+ derivatives (e.g.: plural/singular, verb inflections).
+ But there are other cases where the exact search term is
+ not known. For example, you may not remember the exact
spelling, or only know the beginning of the name.
The search will only propose replacement terms with
spelling variations when no matching document were found.
@@ -3880,20 +3939,22 @@ fs.inotify.max_user_watches=32768
The term explorer tool (started from the toolbar icon
or from the Term explorer
entry of the Tools menu)
- can be used to search the full index terms list. It has
- three modes of operations:
+ can be used to search the full index terms list, or
+ (later addition), display some statistics or other index
+ information. It has several modes of operation:
- Wildcard
-
In this mode of operation, you can enter a
search string with shell-like wildcards (*, ?, []).
- ie: xapi*
- would display all index terms beginning with
- xapi.
- (More about wildcards xapi* would display
+ all index terms beginning with xapi. (More about
+ wildcards here ).
+ "3.6.1. Wildcards">here).
- Regular expression
-
@@ -3901,10 +3962,10 @@ fs.inotify.max_user_watches=32768
input. Example:
word[0-9]+. The
expression is implicitly anchored at the beginning.
- Ie: press
- will match pression but not
- press will match
+ pression
+ but not expression. You can
use .*press to match
@@ -3955,7 +4016,7 @@ fs.inotify.max_user_watches=32768
Note that in cases where Recoll does not know the beginning
- of the string to search for (ie a wildcard expression
+ of the string to search for (e.g. a wildcard expression
like *coll),
the expansion can take quite a long time because the full
index term list will have to be processed. The expansion
@@ -3973,7 +4034,7 @@ fs.inotify.max_user_watches=32768
@@ -3985,13 +4046,14 @@ fs.inotify.max_user_watches=32768
the recoll
GUI are described here.
A recoll
- program instance is always associated with a specific
- index, which is the one to be updated when requested from
- the menu, but it can
- use any number of Recoll
- indexes for searching. The external indexes can be
- selected through the external
- indexes tab in the preferences dialog.
+ program instance is always associated with a main index,
+ which is the one to be updated when requested from the
+ menu, but it can use
+ any number of external Recoll indexes for searching. The
+ external indexes can be selected through the external indexes tab in the preferences
+ dialog.
Index selection is performed in two phases. A set of
all usable indexes must first be defined, and then the
subset of indexes to be used for searching. These
@@ -4018,7 +4080,7 @@ fs.inotify.max_user_watches=32768
might typically be set up by a system administrator so
that every user does not have to do it. The variable
should define a colon-separated list of index
- directories, ie:
+ directories, e.g.:
export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
Another environment variable,
3.2.11. Document
+ "RCL.SEARCH.GUI.HISTORY">3.2.12. Document
history
@@ -4062,7 +4124,7 @@ fs.inotify.max_user_watches=32768
@@ -4086,8 +4148,8 @@ fs.inotify.max_user_watches=32768
the result list (documents with the exact same contents
as the displayed one). The test of identity is based on
an MD5 hash of the document container, not only of the
- text contents (so that ie, a text document with an image
- added will not be a duplicate of the text only).
+ text contents (so that e.g., a text document with an
+ image added will not be a duplicate of the text only).
Duplicates hiding is controlled by an entry in the
GUI configuration dialog,
and is off by default.
@@ -4103,7 +4165,7 @@ fs.inotify.max_user_watches=32768
@@ -4124,7 +4186,7 @@ fs.inotify.max_user_watches=32768
Shortcut column, and type
the desired sequence.
-
+
Table 3.1. Keyboard
shortcuts
@@ -4328,7 +4390,7 @@ fs.inotify.max_user_watches=32768
@@ -4434,7 +4496,7 @@ fs.inotify.max_user_watches=32768
looking for Any terms.
This will not change radically the results, but will
give a relevance boost to the results where the search
- terms appear as a phrase. Ie: searching for
+ terms appear as a phrase. E.g.: searching for
virtual reality will still
find all documents where either virtual or I.B.M.
are also automatically indexed as a word without the
dots: IBM. Searching for
- the word inside a phrase (ie: "the IBM company") will only match the
dotted abrreviation if you increase the phrase slack
(using the advanced search panel control, or the
@@ -4514,8 +4576,8 @@ fs.inotify.max_user_watches=32768
@@ -4545,7 +4607,7 @@ fs.inotify.max_user_watches=32768
@@ -4568,7 +4630,7 @@ fs.inotify.max_user_watches=32768
terms: Terms from the user query are
highlighted in the result list samples and the
preview window. The color can be chosen here. Any
- Qt color string should work (ie red, #ff0000). The default is
blue.
@@ -4839,7 +4901,7 @@ fs.inotify.max_user_watches=32768
"RCL.SEARCH.GUI.CUSTOM.EXTRADB">External
indexes: This panel will let you browse for
additional indexes that you may want to search. External
- indexes are designated by their database directory (ie:
+ indexes are designated by their database directory (e.g.:
/home/someothergui/.recoll/xapiandb,
+ "3.2.5. Unix-like systems: running arbitrary commands on result files">
section about defining scripts.
In addition to the predefined values above, all
strings like
- By using the By using the actual recollq
program.
@@ -5315,7 +5377,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
This would search for all documents with John Doe appearing as a
phrase in the author field (exactly what this is would
- depend on the document type, ie: the From: header, for an email message), and
containing either beatles or You can also use OR
conjunctions with dir:
clauses.
- A special aspect of On Unix-like
+ systems, a special aspect of dir clauses is that the values in
the index are not transcoded to UTF-8, and never
lower-cased or unaccented, but stored as binary.
@@ -5651,7 +5714,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
Periods can also be specified with small letters
- (ie: p2y).
+ (e.g.: p2y).
mime or
@@ -5899,7 +5962,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
would think, and strange search results. You can
use the term
+ "3.2.10. The term explorer tool">term
explorer tool to check what completions exist
for a given term. You can also see exactly what
search was performed by clicking on the link at the
@@ -5912,7 +5975,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
-
+
@@ -5984,10 +6047,10 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
"^my term"o5.
Anchored searches can be very useful for searches
inside somewhat structured documents like scientific
- articles, in case explicit metadata has not been supplied
- (a most frequent case), for example for looking for
- matches inside the abstract or the list of authors (which
- occur at the top of the document).
+ articles, in case explicit metadata has not been
+ supplied, for example for looking for matches inside the
+ abstract or the list of authors (which occur at the top
+ of the document).
@@ -6542,7 +6605,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
no) tells the handler if the
operation is for indexing or previewing. Some handlers
use this to output a slightly different format, for
- example stripping uninteresting repeated keywords (ie:
+ example stripping uninteresting repeated keywords (e.g.:
Subject: for email) when
indexing. This is not essential.
You should look at one of the simple handlers, for
@@ -8465,7 +8528,7 @@ hasextract = False
"literal">"mysql manual" (only inside phrase
searches).
--with-file-command
- Specify the version of the 'file' command to use (ie:
+ Specify the version of the 'file' command to use (e.g.:
--with-file-command=/usr/local/bin/file). Can be useful
to enable the gnu version on systems where the native
one is bad.
@@ -8845,7 +8908,7 @@ hasextract = False
Help for setting up external indexes. See
this
+ title="3.2.11. Multiple indexes">this
paragraph for explanations.
mimeview specifies which
programs are started when you click on an Open link in a result list. Ie: HTML is
- normally displayed using Open link in a result list. E.g.: HTML
+ is normally displayed using firefox, but you may prefer
Konqueror, your
openoffice.org program
@@ -10728,7 +10791,7 @@ other = rclcat:other
entry, (placed at the top level, outside of the
[view] section), holds a
list of MIME types that should not be uncompressed before
- starting the viewer (if they are found compressed, ie:
+ starting the viewer (if they are found compressed, e.g.:
mydoc.doc.gz).
The right side of each assignment holds a command to
@@ -10743,7 +10806,7 @@ other = rclcat:other
%f. File name. This may be the name
of a temporary file if it was necessary to create
- one (ie: to extract a subdocument from a
+ one (e.g.: to extract a subdocument from a
container).
diff --git a/src/doc/user/usermanual.xml b/src/doc/user/usermanual.xml
index 8241da85..fd5a743a 100644
--- a/src/doc/user/usermanual.xml
+++ b/src/doc/user/usermanual.xml
@@ -62,18 +62,16 @@
application. It is updated for &RCL; &RCLVERSION;.
&RCL; was for a long time dedicated to Unix-like systems. It
- was only lately (2015) ported to
- MS-Windows . Many references in this
- manual, especially file locations, are specific to Unix, and not
- valid on &WIN;, where some described features are also not available.
- The manual will be progressively updated. Until this happens, on
- &WIN;, most references to shared files can be translated by looking
- under the Recoll installation directory (Typically C:/Program
- Files (x86)/Recoll , esp. anything referenced
- in /usr/share in this document will be found int
- the Share subdirectory). The user configuration is
- stored by default under AppData/Local/Recoll
- inside the user directory, along with the index itself.
+ was only lately (2015) ported to MS-Windows . Many references in
+ this manual, especially file locations, are specific to Unix, and not valid on &WIN;, where
+ some described features are also not available. The manual will be progressively
+ updated. Until this happens, on &WIN;, most references to shared files can be translated by
+ looking under the Recoll installation directory (Typically C:/Program
+ Files (x86)/Recoll ). In particular, anything referenced
+ inside /usr/share in this document will be found in
+ the Share subdirectory of the installation. The user configuration is
+ stored by default under AppData/Local/Recoll inside the user directory,
+ along with the index itself.
Giving it a try
@@ -86,26 +84,20 @@
directory, allowing you to search immediately after
indexing completes.
- Do not do this if your home directory contains a huge
- number of documents and you do not want to wait or are very
- short on disk space. In this case, you may first want to customize
- the configuration
- to restrict the indexed area (shortcut: from the
- recoll GUI go to:
-
+ Do not do this if your home directory contains a huge number of documents and you do not
+ want to wait or are very short on disk space. In this case, you may first want to customize
+ the configuration to restrict the indexed area
+ (shortcut: from the recoll GUI go to:
Preferences
Indexing configuration
, then adjust the Top
directories section).
On &LIN;, you may need to install the
- appropriate
- supporting applications
- for document types that need them (for
- example antiword for
- Microsoft Word files).
- The &RCL; for &WIN; package is self-contained and includes
- most useful auxiliary programs.
+ appropriate supporting applications for document
+ types that need them (for example antiword
+ for Microsoft Word files). The &WIN; package is self-contained and
+ includes most useful auxiliary programs.
@@ -250,17 +242,14 @@
index your home directory with generic parameters. Most common
parameters can be set by using
configuration menus in the recoll GUI. Some less
- common parameters can only be set by editing the text files (the
- new values will be preserved by the GUI).
+ common parameters can only be set by editing the text files.
- The indexing process
- is started automatically (after asking permission), the
- first time you execute the recoll GUI. Indexing
- can also be performed by executing the recollindex
- command. &RCL; indexing is multithreaded by default when appropriate
- hardware resources are available, and can perform in parallel
- multiple tasks for text extraction, segmentation and index
- updates.
+ The indexing process is started
+ automatically (after asking permission), the first time you execute
+ the recoll GUI. Indexing can also be performed by executing
+ the recollindex command. &RCL; indexing is multithreaded by default when
+ appropriate hardware resources are available, and can perform multiple tasks in parallel for
+ text extraction, segmentation and index updates.
Searches are usually
performed inside the recoll GUI, which has many
@@ -268,22 +257,21 @@
are other ways to query the index:
A
- command line interface.
-
+ command line interface.
A
- Python programming interface
+ Web interface .
- A KDE KIO slave module.
-
- A Ubuntu Unity
- Scope
- module.
A Gnome Shell
- Search Provider .
-
+ Search Provider
+ .
A
- Web interface .
-
+ Python programming
+ interface
+ A KDE KIO slave
+ module.
+ A Ubuntu Unity
+ Scope
+ module.
@@ -346,26 +334,25 @@
Real time indexing
- (Only available on &LIN;). recollindex runs
- permanently as a daemon and uses a file system alteration monitor
- (e.g. inotify ) to detect file
- changes. New or updated files are indexed at once. Monitoring a
- big file system tree can consume
- significant system resources.
+ recollindex runs permanently as a daemon and uses a file system
+ alteration monitor (e.g. inotify on &LIN;) to detect file
+ changes. New or updated files are indexed at once. Monitoring a big file system tree
+ can consume significant system resources.
- &LIN;: choosing an indexing mode
+ Choosing an indexing mode
The choice between the two methods is mostly a matter of
- preference, and they can be combined by setting up multiple
- indexes (ie: use periodic indexing on a big documentation
- directory, and real time indexing on a small home
- directory), or, with &RCL; 1.24 and newer, by
- configuring the index so that only a subset of the tree will be monitored.
+ preference, and they can be combined by setting up multiple
+ indexes (e.g.: use periodic indexing on a big documentation
+ directory, and real time indexing on a small home
+ directory), or by
+ configuring the index so that only a subset of the
+ tree will be monitored.
The choice of method and the parameters used can be
- configured from the recoll GUI:
+ configured from the recoll GUI:
Preferences
Indexing schedule
@@ -388,9 +375,8 @@
default configuration directory. This configuration is the one used
for indexing and querying when no specific configuration is
specified. It is located in $HOME/.recoll/ for
- &LIN; and %LOCALAPPDATA%\Recoll on &WIN;
- (typically
- C:\Users\[me]\Appdata\Local\Recoll ).
+ &LIN; and %LOCALAPPDATA%/Recoll on &WIN;
+ (typically C:/Users/[me]/Appdata/Local/Recoll ).
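The default locations above can be sketched as a small lookup. This is only an illustration, not Recoll's own code: the function name `default_recoll_confdir` is hypothetical, and the assumption that `RECOLL_CONFDIR` takes precedence over the platform default is inferred from its mention elsewhere in this manual.

```python
import os
import sys


def default_recoll_confdir(env=None, platform=None):
    """Return the configuration directory for the given environment.

    Hypothetical helper: mirrors the locations given in the text, it is
    not part of Recoll.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    # Assumption: RECOLL_CONFDIR, when set, overrides the default location.
    override = env.get("RECOLL_CONFDIR")
    if override:
        return override
    if platform.startswith("win"):
        # %LOCALAPPDATA%/Recoll on Windows
        return env["LOCALAPPDATA"] + "/Recoll"
    # $HOME/.recoll on Unix-like systems
    return env["HOME"] + "/.recoll"
```

Passing `env` and `platform` explicitly makes it easy to check both branches from a single machine.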
All configuration parameters have defaults, defined in
system-wide files. Without further customisation, the default
@@ -427,18 +413,15 @@
recoll GUI, or some other functions in the
command line and programming tools.
- A plausible usage scenario for the multiple index feature
- would be for a system administrator to set up a central index for
- shared data, that you choose to search or not in addition to your
- personal data. Of course, there are other possibilities. for
- example, there are many cases where you know the subset of files
- that should be searched, and where narrowing the search can improve
- the results. You can achieve approximately the same effect with the
- directory filter in advanced search, but multiple indexes may have
- better performance and may be worth the trouble in some
- cases.
+ A plausible usage scenario for the multiple index feature would be for a system
+ administrator to set up a central index for shared data, that you choose to search or not in
+ addition to your personal data. Of course, there are other possibilities. For example, there
+ are many cases where you know the subset of files that should be searched, and where
+ narrowing the search can improve the results. You can achieve approximately the same effect
+ by using a directory filter clause in a search, but multiple indexes may have better
+ performance and may be worth the trouble in some cases.
- A more advanced use case would be to use multiple index to
+ A more advanced use case would be to use multiple indexes to
improve indexing performance, by updating several indexes in
parallel (using multiple CPU cores and disks, or possibly several
machines), and then merging them, or querying them in
@@ -472,7 +455,7 @@
OpenDocument (Open/LibreOffice), email formats, and a few others
internally.
- Other file types (ie: postscript, pdf, ms-word, rtf ...)
+ Other file types (e.g.: postscript, pdf, ms-word, rtf ...)
need external applications for preprocessing. The list is in the
installation
section. After every indexing operation, &RCL; updates a list of
@@ -515,8 +498,7 @@
menu. Excluding by type can be done by setting the
excludedmimetypes
- list in the configuration file (1.20 and later). This can be
- redefined for subdirectories.
+ list in the configuration file. This can be redefined for subdirectories.
You can also define an exclusive list of MIME types to be
indexed (no others will be indexed), by setting
@@ -599,10 +581,10 @@
Index storage
- The default location for the index data is the
- xapiandb subdirectory of the &RCL;
- configuration directory, typically
- $HOME/.recoll/xapiandb/ . This can be
+ The default location for the index data is the xapiandb
+ subdirectory of the &RCL; configuration directory,
+ typically $HOME/.recoll/xapiandb/ on &LIN;
+ or C:/Users/[me]/Appdata/Local/Recoll/xapiandb on &WIN;. This can be
changed via two different methods (with different purposes):
@@ -773,7 +755,7 @@
which lists the subtrees and files to be indexed.
The applications needed to index file types other than
- text, HTML or email (ie: pdf, postscript, ms-word...) are
+ text, HTML or email (e.g.: pdf, postscript, ms-word...) are
described in the external packages section.
@@ -944,10 +926,67 @@ recoll -c /path/to/my/new/config
+
+ The index configuration GUI
+
+ Most parameters for a given index configuration can
+ be set from a recoll GUI running on this
+ configuration (either as default, or by setting
+ RECOLL_CONFDIR or the
+ option.)
+
+ The interface is started from the
+
+ Preferences
+ Index Configuration
+
+ menu entry. It is divided into four tabs,
+ Global parameters , Local
+ parameters , Web history
+ (which is explained in the next section) and Search
+ parameters .
+
+ The Global parameters tab allows setting
+ global variables, like the lists of top directories, skipped paths,
+ or stemming languages.
+
+ The Local parameters tab allows setting
+ variables that can be redefined for subdirectories. This second tab
+ has an initially empty list of customisation directories, to which
+ you can add. The variables are then set for the currently selected
+ directory (or at the top level if the empty line is
+ selected).
+
+ The Search parameters section defines
+ parameters which are used at query time, but are global to an
+ index and affect all search tools, not only the GUI.
+
+ The meaning for most entries in the interface is
+ self-evident and documented by a ToolTip
+ popup on the text label. For more detail, you will need to
+ refer to the
+ configuration section
+ of this guide.
+
+ The configuration tool normally respects the comments
+ and most of the formatting inside the configuration file, so
+ that it is quite possible to use it on hand-edited files,
+ which you might nevertheless want to back up first...
+
+
+
Indexing threads configuration (&LIN;)
+ Note: you probably don't need to read this. The default automatic configuration
+ is fine in most cases. Only the part about disabling multithreading may be more commonly
+ useful, so I'll prepend it here. In recoll.conf :
+
+ thrQSizes = -1 -1 -1
+
+
+
The &RCL; indexing process
recollindex can use multiple threads to
speed up indexing on multiprocessor systems. The work done
@@ -1042,55 +1081,6 @@ recoll -c /path/to/my/new/config
-
-
- The index configuration GUI
-
- Most parameters for a given index configuration can
- be set from a recoll GUI running on this
- configuration (either as default, or by setting
- RECOLL_CONFDIR or the
- option.)
-
- The interface is started from the
-
- Preferences
- Index Configuration
-
- menu entry. It is divided in four tabs,
- Global parameters , Local
- parameters , Web history
- (which is explained in the next section) and Search
- parameters .
-
- The Global parameters tab allows setting
- global variables, like the lists of top directories, skipped paths,
- or stemming languages.
-
- The Local parameters tab allows setting
- variables that can be redefined for subdirectories. This second tab
- has an initially empty list of customisation directories, to which
- you can add. The variables are then set for the currently selected
- directory (or at the top level if the empty line is
- selected).
-
- The Search parameters section defines
- parameters which are used at query time, but are global to an
- index and affect all search tools, not only the GUI.
-
- The meaning for most entries in the interface is
- self-evident and documented by a ToolTip
- popup on the text label. For more detail, you will need to
- refer to the
- configuration section
- of this guide.
-
- The configuration tool normally respects the comments
- and most of the formatting inside the configuration file, so
- that it is quite possible to use it on hand-edited files,
- which you might nevertheless want to backup first...
-
-
@@ -1725,35 +1715,34 @@ metadatacmds = ; tags = tmsu tags %f
- &LIN;: real time indexing
+ Real time indexing
- Real time monitoring/indexing is performed by starting the
- recollindex command.
- With this option, recollindex will detach
- from the terminal and become a daemon, permanently monitoring
- file changes and updating the index.
+ Real time monitoring/indexing is performed by starting
+ the recollindex command. With this
+ option, recollindex will permanently monitor file changes and update the
+ index.
- In this situation, the recoll
- GUI File menu makes two
- operations available: Stop
- and Trigger incremental pass .
-
+ On &WIN; systems, the monitoring process is started from the recoll
+ GUI File menu. On &LIN;, there are other possibilities; see
+ the following sections.
+
+ When this is in use, the recoll
+ GUI File menu makes two operations
+ available: Stop and Trigger incremental
+ pass .
Trigger incremental pass has the
- same effect as restarting the indexer, and will cause a complete
- walk of the indexed area, processing the changed files, then switch
- to monitoring. This is only marginally useful, maybe in cases where
- the indexer is configured to delay updates, or to force an
- immediate rebuild of the stemming and phonetic data, which are only
- processed at intervals by the real time indexer.
+ same effect as restarting the indexer, and will cause a complete walk of the indexed area,
+ processing the changed files, then switch to monitoring. This is only marginally useful,
+ maybe in cases where the indexer is configured to delay updates, or to force an immediate
+ rebuild of the stemming and phonetic data, which are only processed at intervals by the real
+ time indexer.
While it is convenient that data is indexed in real time,
- repeated indexing can generate a significant load on the
- system when files such as email folders change. Also,
- monitoring large file trees by itself significantly taxes
- system resources. You probably do not want to enable it if
- your system is short on resources. Periodic indexing is
- adequate in most cases.
+ repeated indexing can generate a significant load on the system when files such as email
+ folders change. Also, monitoring large file trees by itself significantly taxes system
+ resources. You probably do not want to enable it if your system is short on
+ resources. Periodic indexing is adequate in most cases.
As of &RCL; 1.24, you can set the
monitordirs
@@ -1765,8 +1754,9 @@ metadatacmds = ; tags = tmsu tags %f
process. The recoll GUI also has a menu entry for
this.
+
- Automatic daemon start with systemd
+ &LIN;: automatic daemon start with systemd
The installation contains two example files
(in share/recoll/examples ) for starting the indexing daemon with
@@ -1796,7 +1786,7 @@ systemctl enable --now recollindex@username .service
- Automatic daemon start from the desktop session
+ &LIN;: automatic daemon start from the desktop session
Under KDE ,
Gnome and some other desktop
@@ -1852,11 +1842,12 @@ fvwm
the daemon runs permanently, the log file may grow quite big,
depending on the log level.
- Increasing resources for inotify
- On Linux systems, monitoring a big tree may need
- increasing the resources available to inotify, which are
- normally defined in /etc/sysctl.conf .
-
+
+ &LIN;: increasing resources for inotify
+ On Linux systems, monitoring a big tree may require
+ increasing the resources available to inotify, which are
+ normally defined in /etc/sysctl.conf .
+
### inotify
#
# cat /proc/sys/fs/inotify/max_queued_events - 16384
@@ -1870,25 +1861,23 @@ fs.inotify.max_user_instances=256
fs.inotify.max_user_watches=32768
- Especially, you will need to trim your tree or adjust
- the max_user_watches value if indexing exits with
- a message about errno ENOSPC (28) from
- inotify_add_watch .
-
+ Especially, you will need to trim your tree or adjust
+ the max_user_watches value if indexing exits with
+ a message about errno ENOSPC (28) from
+ inotify_add_watch .
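Before editing /etc/sysctl.conf, it can help to check the limits currently in force. The following is a minimal sketch, assuming the standard Linux /proc/sys/fs/inotify files; the function names are hypothetical, and the one-watch-per-directory estimate is an approximation, not a statement about Recoll internals.

```python
from pathlib import Path


def read_inotify_limits(base="/proc/sys/fs/inotify"):
    """Read the inotify limits found under `base` into a dict of ints."""
    names = ("max_queued_events", "max_user_instances", "max_user_watches")
    limits = {}
    for name in names:
        path = Path(base) / name
        if path.is_file():
            limits[name] = int(path.read_text().strip())
    return limits


def watches_sufficient(directory_count, limits):
    """Rough check: assume one inotify watch per monitored directory."""
    return directory_count <= limits.get("max_user_watches", 0)
```

If `watches_sufficient` returns False for the number of directories under your indexed tree, an ENOSPC error from `inotify_add_watch` is likely.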
- Slowing down the reindexing rate for fast changing
- files
- When using the real time monitor, it may happen that some
- files need to be indexed, but change so often that they impose an
- excessive load for the system.
+
+ Slowing down the reindexing rate for fast changing files
+ When using the real time monitor, it may happen that some
+ files need to be indexed, but change so often that they impose an
+ excessive load on the system.
- &RCL; provides a configuration option to specify the minimum
- time before which a file, specified by a wildcard pattern, cannot be
- reindexed. See the mondelaypatterns parameter in
- the configuration section.
-
+ &RCL; provides a configuration option to specify the minimum
+ time before which a file, specified by a wildcard pattern, cannot be reindexed. See
+ the mondelaypatterns parameter in the
+ configuration section.
@@ -1897,6 +1886,8 @@ fs.inotify.max_user_watches=32768
+
+
Searching
@@ -1949,38 +1940,38 @@ fs.inotify.max_user_watches=32768
Searching with the Qt graphical user interface
The recoll program provides the main user
- interface for searching. It is based on the
- Qt library.
+ interface for searching. It is based on the Qt library.
recoll has two search interfaces:
Simple search (the default, on the main screen) has
- a single entry field where you can enter multiple words.
+ a single entry field where you can enter multiple words or a query language
+ query.
Advanced search (a panel accessed through the
- Tools menu or the toolbox bar icon) has
- multiple entry fields, which you may use to build a logical
- condition, with additional filtering on file type, location
- in the file system, modification date, and size.
+ Tools menu or the toolbox bar icon) has multiple entry fields,
+ which you may use to build a logical condition, with additional filtering on file type,
+ location in the file system, modification date, and size.
- In most cases, you can enter the terms as you think them, even
- if they contain embedded punctuation or other non-textual characters
- (e.g. &RCL; can handle things like email addresses).
+ The Advanced Search tool is easier to use, but not actually more
+ powerful, than the Simple Search in query language mode. Its name
+ is historical, but Assisted Search would probably have been a better
+ designation.
- The main case where you should enter text differently from
- how it is printed is for east-asian languages (Chinese,
- Japanese, Korean). Words composed of single or multiple
- characters should be entered separated by white space in this
- case (they would typically be printed without white
- space).
+ In most text areas, you can enter the terms as you think of them, even if they contain
+ embedded punctuation or other non-textual characters (e.g. &RCL; can handle things like
+ email addresses).
- Some searches can be quite complex, and you may want to re-use
- them later, perhaps with some tweaking. &RCL; can save and restore
- searches. See Saving and restoring
- queries.
-
+ The main case where you should enter text differently from how it is printed is for
+ east-asian languages (Chinese, Japanese, Korean). Words composed of single or multiple
+ characters should be entered separated by white space in this case (they would typically be
+ printed without white space).
+
+ Some searches can be quite complex, and you may want to re-use them later, perhaps with
+ some tweaking. &RCL; can save and restore searches.
+ See Saving and restoring queries.
Simple search
@@ -2003,12 +1994,14 @@ fs.inotify.max_user_watches=32768
The initial default search mode is Query
language . Without special directives, this will look for
- documents containing all of the search terms (the ones with more
- terms will get better scores), just like the All
- terms mode. Any term will search
- for documents where at least one of the terms
- appear. File name will exclusively look for
- file names, not contents
+ documents containing all of the search terms (the ones with more terms will get better
+ scores), just like the All Terms mode.
+
+ Any term will search for documents where at least one of the
+ terms appears.
+
+ File name will exclusively look for file names, not
+ contents.
All search modes allow terms to be expanded with wildcards
characters (* , ? ,
@@ -2026,14 +2019,13 @@ fs.inotify.max_user_watches=32768
When using a stripped index (the default), character case has
- no influence on search, except that you can disable stem expansion
- for any term by capitalizing it. Ie: a search for
- floor will also normally look for
- flooring , floored , etc., but
- a search for Floor will only look for
- floor , in any character case. Stemming can also
- be disabled globally in the preferences. When using a raw index,
- the rules are a bit more complicated.
+ no influence on search, except that you can disable stem expansion for any term by
+ capitalizing it. E.g.: a search for floor will also normally look
+ for flooring , floored , etc., but a search
+ for Floor will only look for floor , in any character
+ case. Stemming can also be disabled globally in the preferences. When using a raw
+ index, the rules are a bit more
+ complicated.
&RCL; remembers the last few searches that you performed. You
can directly access the search history by clicking the clock button
@@ -2058,33 +2050,30 @@ fs.inotify.max_user_watches=32768
this mode from the Query Language mode, where
you have to care about the syntax.
- You can use the Tools Advanced search
- dialog for more complex searches.
-
The File name search mode will
- specifically look for file names. The point of having a separate
- file name search is that wildcard expansion can be performed more
- efficiently on a small subset of the index (allowing wildcards on
- the left of terms without excessive cost). Things to know:
-
- White space in the entry should match white
- space in the file name, and is not treated specially.
-
- The search is insensitive to character case and
- accents, independently of the type of index.
-
- An entry without any wildcard
- character and not capitalized will be prepended and appended
- with '*' (ie: etc ->
- *etc* , but
- Etc ->
- etc ).
-
- If you have a big index (many files),
- excessively generic fragments may result in inefficient
- searches.
-
-
+ specifically look for file names. The point of having a separate
+ file name search is that wildcard expansion can be performed more
+ efficiently on a small subset of the index (allowing wildcards on
+ the left of terms without excessive cost). Things to know:
+
+ White space in the entry should match white
+ space in the file name, and is not treated specially.
+
+ The search is insensitive to character case and
+ accents, independently of the type of index.
+
+ An entry without any wildcard
+ character and not capitalized will be prepended and appended
+ with '*' (e.g.: etc ->
+ *etc* , but
+ Etc ->
+ etc ).
+
+ If you have a big index (many files),
+ excessively generic fragments may result in inefficient
+ searches.
+
+
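The implicit wildcard rule for file name entries can be illustrated with a short sketch. This is not Recoll's implementation: the function name `expand_file_name_entry` is hypothetical, and the lowercasing of capitalized entries only mirrors the `Etc` -> `etc` example, since the search itself is case-insensitive.

```python
def expand_file_name_entry(entry):
    """Apply the implicit '*' rule described for File name searches."""
    # An entry that already contains a wildcard character is left alone.
    if any(ch in entry for ch in "*?["):
        return entry
    # A capitalized entry is searched as-is, without added wildcards.
    if entry[:1].isupper():
        return entry.lower()
    # Otherwise '*' is prepended and appended.
    return "*" + entry + "*"
```

For example, `expand_file_name_entry("etc")` gives `*etc*`, while `expand_file_name_entry("Etc")` gives `etc`.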
@@ -2265,7 +2254,7 @@ fs.inotify.max_user_watches=32768
Save to File allows saving the
contents of a result document to a chosen file. This entry
will only appear if the document does not correspond to an
- existing file, but is a subdocument inside such a file (ie: an
+ existing file, but is a subdocument inside such a file (e.g.: an
email attachment). It is especially useful to extract attachments
with no associated editor.
@@ -2303,6 +2292,7 @@ fs.inotify.max_user_watches=32768
+
The result table
@@ -2335,8 +2325,43 @@ fs.inotify.max_user_watches=32768
Esc (the Escape key) will unfreeze the
display.
+ Using Shift-click on a row will display the document extracted text (somewhat like a
+ preview) instead of the document details. The functions of Click and Shift-Click can be
+ reversed in the GUI preferences.
+
+
+
+ The filters panel
+
+ By default, the GUI displays the filters panel on the left of the results area. This
+ is new in version 1.32. You can adjust the width of the panel, and hide it
+ by squeezing it completely. The width will be memorized for the next session.
+
+ The panel currently has two areas, for filtering the results by dates, or by
+ filesystem location.
+
+ The panel is only active in Query Language search mode, and its
+ effect is to add date: and dir: clauses to the
+ actual search.
+
+ The dates filter can be activated by clicking the checkbox. It has two assisted date
+ entry widgets, for the minimum and maximum dates of the search period.
+
+ The directory filter displays a subset of the filesystem directories, reduced to the
+ indexed area, as defined by the topdirs list and the name exclusion
+ parameters. You can independently select and deselect directories by clicking them. Note
+ that selecting a directory will activate the whole subtree for searching: there is no need
+ to select the subdirectories, and no way to exclude some of them (use
+ Query
+ language dir: clauses if this
+ is needed).
+
+
+
+
+
&LIN;: running arbitrary commands on result files
@@ -2382,6 +2407,7 @@ fs.inotify.max_user_watches=32768
+
&LIN;: displaying thumbnails
@@ -2407,6 +2433,7 @@ fs.inotify.max_user_watches=32768
+
The preview window
@@ -2435,7 +2462,7 @@ fs.inotify.max_user_watches=32768
A right-click menu in the text area allows switching
between displaying the main text or the contents of fields
- associated to the document (ie: author, abtract, etc.). This is
+ associated to the document (e.g.: author, abstract, etc.). This is
especially useful in cases where the term match did not occur in
the main text but in one of the fields. In the case of
images, you can switch between three displays: the image
@@ -2449,7 +2476,7 @@ fs.inotify.max_user_watches=32768
P ) in the window text.
-
+
Searching inside the preview
The preview window has an internal search capability,
@@ -2499,7 +2526,7 @@ fs.inotify.max_user_watches=32768
-
+
@@ -2601,12 +2628,12 @@ fs.inotify.max_user_watches=32768
- Complex/advanced search
+ Assisted Complex Search (A.K.A. "Advanced Search")
The advanced search dialog helps you build more complex queries
- without memorizing the search language constructs. It can be opened
- through the Tools menu or through the main
- toolbar.
+ without memorizing the search language constructs. It can be opened
+ through the Tools menu or through the main
+ toolbar.
&RCL; keeps a history of searches. See
Advanced search history.
@@ -2615,17 +2642,10 @@ fs.inotify.max_user_watches=32768
The dialog has two tabs:
-
- The first tab lets you specify terms to search
- for, and permits specifying multiple clauses which are combined
- to build the search.
-
-
- The second tab lets filter the results according
- to file size, date of modification, MIME type, or
- location.
-
-
+ The first tab lets you specify terms to search for, and permits specifying
+ multiple clauses which are combined to build the search.
+ The second tab allows filtering the results according to file size, date
+ of modification, MIME type, or location.
Click on the Start Search button in
@@ -2639,9 +2659,8 @@ fs.inotify.max_user_watches=32768
Advanced search: the "find" tab
- This part of the dialog lets you constructc a query by
- combining multiple clauses of different types. Each entry
- field is configurable for the following modes:
+ This part of the dialog lets you construct a query by combining multiple clauses of
+ different types. Each entry field is configurable for the following modes:
All terms.
@@ -2702,48 +2721,29 @@ fs.inotify.max_user_watches=32768
criteria
-
-
- The first section allows filtering by dates of last
- modification. You can specify both a minimum and a maximum
- date. The initial values are set according to the oldest and
- newest documents found in the index.
-
-
-
- The next section allows filtering the results by
- file size. There are two entries for minimum and maximum
- size. Enter decimal numbers. You can use suffix multipliers:
- k/K , m/M ,
- g/G , t/T for 1E3, 1E6,
- 1E9, 1E12 respectively.
-
-
-
- The next section allows filtering the results by their MIME
- types, or MIME categories (ie: media/text/message/etc.).
- You can transfer the types between two boxes, to define
- which will be included or excluded by the search.
- The state of the file type selection can be saved as
- the default (the file type filter will not be activated at
- program start-up, but the lists will be in the restored
- state).
-
-
-
- The bottom section allows restricting the search results to a
- sub-tree of the indexed area. You can use the
- Invert checkbox to search for files not in
- the sub-tree instead. If you use directory filtering often and on
- big subsets of the file system, you may think of setting up
- multiple indexes instead, as the performance may be
- better.
- You can use relative/partial paths for filtering. Ie,
- entering dirA/dirB would match either
- /dir1/dirA/dirB/myfile1 or
- /dir2/dirA/dirB/someother/myfile2 .
-
-
+ The first section allows filtering by dates of last
+ modification. You can specify both a minimum and a maximum date. The initial values
+ are set according to the oldest and newest documents found in the
+ index.
+ The next section allows filtering the results by file size. There are
+ two entries for minimum and maximum size. Enter decimal numbers. You can use suffix
+ multipliers: k/K , m/M ,
+ g/G , t/T for 1E3, 1E6, 1E9, 1E12
+ respectively.
+ The next section allows filtering the results by their MIME
+ types, or MIME categories (e.g.: media/text/message/etc.). You can
+ transfer the types between two boxes, to define which will be included or excluded by
+ the search. The state of the file type selection can be saved as the
+ default (the file type filter will not be activated at program start-up, but the lists
+ will be in the restored state).
+ The bottom section allows restricting the search results to a
+ sub-tree of the indexed area. You can use the Invert checkbox to
+ search for files not in the sub-tree instead. If you use directory filtering often and
+ on big subsets of the file system, you may think of setting up multiple indexes
+ instead, as the performance may be better. You can use relative/partial
+ paths for filtering. E.g., entering dirA/dirB would match
+ either /dir1/dirA/dirB/myfile1
+ or /dir2/dirA/dirB/someother/myfile2 .
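As an illustration of the size suffixes, here is a minimal sketch of how such an entry could be converted to bytes. It assumes decimal multipliers (k = 1000, m = 1000000, and so on) and whole-number input; the function name `parse_size` is hypothetical and this is not the GUI's own parsing code.

```python
# Assumption: decimal (power of ten) multipliers, upper or lower case.
_MULTIPLIERS = {"k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}


def parse_size(text):
    """Convert an entry such as '10k' or '2G' to a number of bytes."""
    text = text.strip()
    suffix = text[-1:].lower()
    if suffix in _MULTIPLIERS:
        return int(text[:-1]) * _MULTIPLIERS[suffix]
    # No suffix: the value is already in bytes.
    return int(text)
```

With these assumptions, `parse_size("10k")` is 10000 and `parse_size("2G")` is 2000000000.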
@@ -2769,32 +2769,28 @@ fs.inotify.max_user_watches=32768
The term explorer tool
- &RCL; automatically manages the expansion of search terms
- to their derivatives (ie: plural/singular, verb
- inflections). But there are other cases where the exact search
- term is not known. For example, you may not remember the exact
- spelling, or only know the beginning of the name.
+ &RCL; automatically manages the expansion of search terms to their derivatives (e.g.:
+ plural/singular, verb inflections). But there are other cases where the exact search term
+ is not known. For example, you may not remember the exact spelling, or only know the
+ beginning of the name.
The search will only propose replacement terms with
- spelling variations when no matching document were found. In some
- cases, both proper spellings and mispellings are present in the
- index, and it may be interesting to look for them explicitly.
+ spelling variations when no matching documents were found. In some cases, both proper
+ spellings and misspellings are present in the index, and it may be interesting to look for
+ them explicitly.
The term explorer tool (started from the toolbar icon or
- from the Term explorer entry of the
- Tools menu) can be used to search the full index
- terms list. It has three modes of operations:
-
+ from the Term explorer entry of the Tools menu)
+ can be used to search the full index terms list, or (later addition), display some
+ statistics or other index information. It has several modes of
+ operation:
+
- Wildcard
- In this mode of operation, you can enter a
- search string with shell-like wildcards (*, ?, []). ie:
- xapi* would display all index terms
- beginning with xapi . (More
- about wildcards
- here
- ).
+ Wildcard In this mode of operation, you can enter a search
+ string with shell-like wildcards (*, ?, []). e.g.: xapi*
+ would display all index terms beginning with xapi . (More
+ about wildcards here).
@@ -2802,7 +2798,7 @@ fs.inotify.max_user_watches=32768
This mode will accept a regular expression
as input. Example:
word[0-9]+ . The expression is
- implicitly anchored at the beginning. Ie:
+ implicitly anchored at the beginning. E.g.:
press will match
pression but not
expression . You can use
@@ -2848,7 +2844,7 @@ fs.inotify.max_user_watches=32768
Note that in cases where &RCL; does not know the beginning
- of the string to search for (ie a wildcard expression like
+ of the string to search for (e.g. a wildcard expression like
*coll ), the expansion can take quite
a long time because the full index term list will have to be
processed. The expansion is currently limited at 10000 results for
@@ -2870,12 +2866,11 @@ fs.inotify.max_user_watches=32768
generalities. Only the aspects concerning the
recoll GUI are described here.
- A recoll program instance is always
- associated with a specific index, which is the one to be updated
- when requested from the File menu, but it can
- use any number of &RCL; indexes for searching. The external
- indexes can be selected through the external
- indexes tab in the preferences dialog.
+ A recoll program instance is always associated with a main index,
+ which is the one to be updated when requested from the File menu, but it
+ can use any number of external &RCL; indexes for searching. The external indexes can be
+ selected through the external indexes tab in the preferences
+ dialog.
Index selection is performed in two phases. A set of all usable
indexes must first be defined, and then the subset of indexes to be
@@ -2900,7 +2895,7 @@ fs.inotify.max_user_watches=32768
variable to provide an initial set. This might typically be
set up by a system administrator so that every user does not
have to do it. The variable should define a colon-separated list
- of index directories, ie:
+ of index directories, e.g.:
export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
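A script that wants to reuse the same list could split the variable as follows. This is a small sketch under the colon-separated convention shown above; the function name `extra_index_dirs` is hypothetical.

```python
import os


def extra_index_dirs(env=None):
    """Return the index directories listed in RECOLL_EXTRA_DBS."""
    env = os.environ if env is None else env
    value = env.get("RECOLL_EXTRA_DBS", "")
    # Colon-separated list; empty elements are ignored.
    return [path for path in value.split(":") if path]
```

An unset or empty variable simply yields an empty list.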
@@ -2951,14 +2946,11 @@ fs.inotify.max_user_watches=32768
Remember sort activation state option in
the preferences.
- It is also possible to hide duplicate entries inside
- the result list (documents with the exact same contents as the
- displayed one). The test of identity is based on an MD5 hash
- of the document container, not only of the text contents (so
- that ie, a text document with an image added will not be a
- duplicate of the text only). Duplicates hiding is controlled
- by an entry in the GUI configuration
- dialog, and is off by default.
+ It is also possible to hide duplicate entries inside the result list (documents with
+ the exact same contents as the displayed one). The test of identity is based on an MD5 hash
+ of the document container, not only of the text contents (so that e.g., a text document with
+ an image added will not be a duplicate of the text only). Duplicates hiding is controlled by
+ an entry in the GUI configuration dialog, and is off by default.
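The idea of comparing whole document containers by MD5 hash can be sketched as follows. This only illustrates the principle on plain files; the function name `group_duplicates` is hypothetical and the code does not reflect how Recoll stores or compares its hashes.

```python
import hashlib
from collections import defaultdict


def group_duplicates(paths):
    """Group files whose full contents have the same MD5 digest."""
    groups = defaultdict(list)
    for path in paths:
        # Hash the whole container, not only extracted text.
        with open(path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        groups[digest].append(path)
    # Only digests shared by more than one file are duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

Because the hash covers the whole file, two files with the same text but a different embedded image end up in different groups.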
When a result document does have undisplayed duplicates,
a Dups link will be shown with the result list
@@ -3242,7 +3234,7 @@ fs.inotify.max_user_watches=32768
searches when looking for Any terms . This
will not change radically the results, but will give a relevance
boost to the results where the search terms appear as a
- phrase. Ie: searching for virtual reality
+ phrase. E.g.: searching for virtual reality
will still find all documents where either
virtual or reality or
both appear, but those which contain
@@ -3260,7 +3252,7 @@ fs.inotify.max_user_watches=32768
Dotted abbreviations like
I.B.M. are also automatically indexed as a
word without the dots: IBM . Searching for
- the word inside a phrase (ie: "the IBM
+ the word inside a phrase (e.g.: "the IBM
company" ) will only match the dotted abbreviation
if you increase the phrase slack (using the advanced search
panel control, or the o query language
@@ -3326,7 +3318,7 @@ fs.inotify.max_user_watches=32768
- Saving and restoring queries (1.21 and later)
+ Saving and restoring queries
Both simple and advanced query dialogs save recent
history, but the amount is limited: old queries will eventually
@@ -3372,7 +3364,7 @@ fs.inotify.max_user_watches=32768
Highlight color for query
terms : Terms from the user query are highlighted in
the result list samples and the preview window. The color can
- be chosen here. Any Qt color string should work (ie
+ be chosen here. Any Qt color string should work (e.g.
red , #ff0000 ). The
default is blue .
@@ -3607,7 +3599,7 @@ fs.inotify.max_user_watches=32768
External indexes:
This panel will let you browse for additional indexes
that you may want to search. External indexes are designated by
- their database directory (ie:
+ their database directory (e.g.:
/home/someothergui/.recoll/xapiandb ,
/usr/local/recollglobal/xapiandb ).
@@ -3878,21 +3870,16 @@ fs.inotify.max_user_watches=32768
stream, without a graphical interface:
By passing option to the
- recoll program, or by calling it as
- recollq (through a link).
-
- By using the recollq program.
-
- By writing a custom
- Python program, using the
- Recoll Python API.
-
+ recoll program, or by calling it as
+ recollq (through a link).
+ By using the actual recollq program.
+ By writing a custom Python program, using the
+ Recoll Python API.
- The first two methods work in the same way and accept/need the same
- arguments (except for the additional to
- recoll ). The query to be executed is specified
- as command line arguments.
+ The first two methods work in the same way and accept/need the same arguments (except
+ for the additional to recoll ). The query to be
+ executed is specified as command line arguments.
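For scripting, the second method could be wrapped as in the sketch below. It only relies on what is stated above, namely that the query is passed as command line arguments. It assumes that a recollq executable is on the PATH, passes no options, and uses hypothetical helper names (build_recollq_command, run_query).

```python
import subprocess


def build_recollq_command(query_words, program="recollq"):
    """Build the argument list: the program name followed by the query words."""
    return [program] + list(query_words)


def run_query(query_words):
    """Run the query and return the program's standard output as text.

    Raises FileNotFoundError if recollq is not installed, and
    subprocess.CalledProcessError if it exits with an error status.
    """
    result = subprocess.run(
        build_recollq_command(query_words),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting issues when a query word contains spaces or quotes.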
recollq is not always built by default. You
can use the Makefile in the
@@ -3991,7 +3978,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
This would search for all documents with
John Doe
appearing as a phrase in the author field (exactly what this is
- would depend on the document type, ie: the
+ would depend on the document type, e.g.: the
From: header, for an email message),
and containing either beatles or
lennon and either
@@ -4178,7 +4165,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
You can also use OR conjunctions
with dir: clauses.
- A special aspect of dir clauses is
+ On &LIN;, a special aspect of dir clauses is
that the values in the index are not transcoded to UTF-8, and never lower-cased or
unaccented, but stored as binary. This means that you need to enter the values in the
exact lower or upper case, and that searches for names with diacritics may sometimes be
@@ -4239,8 +4226,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
2003 or older.
- Periods can also be specified with small letters (ie:
- p2y).
+ Periods can also be specified with small letters (e.g.: p2y).
mime or format for specifying the MIME
@@ -4426,7 +4412,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
term).
-
+
Wildcards and path filtering
Due to the way that &RCL; processes wildcards
@@ -4449,7 +4435,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
the best we can do (and it may be actually more useful in
some cases).
-
+
@@ -4479,15 +4465,14 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
beginning of the text would be a match for
"^my term"o5 .
- Anchored searches can be very useful for searches inside
- somewhat structured documents like scientific articles, in case
- explicit metadata has not been supplied (a most frequent case), for
- example for looking for matches inside the abstract or the list of
- authors (which occur at the top of the document).
-
+ Anchored searches can be very useful for searches inside somewhat structured documents
+ like scientific articles, in case explicit metadata has not been supplied, for example for
+ looking for matches inside the abstract or the list of authors (which occur at the top of
+ the document).
+
@@ -4930,7 +4915,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
variable (values yes , no )
tells the handler if the operation is for indexing or
previewing. Some handlers use this to output a slightly different
- format, for example stripping uninteresting repeated keywords (ie:
+ format, for example stripping uninteresting repeated keywords (e.g.:
Subject: for email) when indexing. This is not
essential.
@@ -6429,7 +6414,7 @@ hasextract = False
searches).
Specify
- the version of the 'file' command to use (ie:
+ the version of the 'file' command to use (e.g.:
--with-file-command=/usr/local/bin/file). Can be useful to
enable the GNU version on systems where the native one is
bad.
@@ -7045,7 +7030,7 @@ other = rclcat:other
mimeview specifies which programs
are started when you click on an Open link
- in a result list. Ie: HTML is normally displayed using
+ in a result list. E.g.: HTML is normally displayed using
firefox , but you may prefer
Konqueror , your
openoffice.org
@@ -7087,7 +7072,7 @@ other = rclcat:other
The nouncompforviewmts entry (placed at
the top level, outside of the [view] section),
holds a list of MIME types that should not be uncompressed before
- starting the viewer (if they are found compressed, ie:
+ starting the viewer (if they are found compressed, e.g.:
mydoc.doc.gz ).
The right side of each assignment holds a command to be
@@ -7102,7 +7087,7 @@ other = rclcat:other
%f
File name. This may be the name of a temporary file if
- it was necessary to create one (ie: to extract a subdocument
+ it was necessary to create one (e.g.: to extract a subdocument
from a container).
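The %f substitution can be pictured with the following sketch. It is an illustration under stated assumptions rather than the way Recoll itself expands commands: the function name expand_viewer_command is hypothetical, and only %f is handled.

```python
import shlex


def expand_viewer_command(template, file_name):
    """Split a viewer command template and replace %f with the file name.

    Hypothetical helper: splitting first, then substituting, keeps a file
    name that contains spaces as a single argument.
    """
    args = shlex.split(template)
    return [arg.replace("%f", file_name) for arg in args]


if __name__ == "__main__":
    # The file name may be a temporary file, for example for a subdocument
    # extracted from a container.
    print(expand_viewer_command("firefox %f", "/tmp/extracted doc.html"))
```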