diff --git a/src/doc/user/usermanual.html b/src/doc/user/usermanual.html index 45bb1dd2..7e5bd4c2 100644 --- a/src/doc/user/usermanual.html +++ b/src/doc/user/usermanual.html @@ -92,11 +92,11 @@ alink="#0000FF"> "#RCL.INDEXING.INTRODUCTION.CONFIG">Configurations, multiple indexes
Recoll supports - defining multiple indexes.
-Each index is defined by its own configuration directory, in which several configuration files describe what should be indexed and how.
@@ -904,46 +903,66 @@ alink="#0000FF"> changed to process a different area of the file system, select files in different ways, and many other things. -In some cases, it may be interesting, for example, to - index different areas of the file system into separate - indexes, or use different options. You can do this by - creating additional configuration directories.
-Examples of usage would be to separate personal and - shared indexes, or to take advantage of the organization - of your data to improve search precision.
+In some cases, it may be useful to create additional + configuration directories, for example, to separate + personal and shared indexes, or to take advantage of the + organization of your data to improve search + precision.
+A plausible usage scenario for the multiple index + feature would be for a system administrator to set up a + central index for shared data, which you can choose to search + or not in addition to your personal data. Of course, + there are other possibilities. For example, there are + many cases where you know the subset of files that should + be searched, and where narrowing the search can improve + the results. You can achieve approximately the same + effect with the directory filter in advanced search, but + multiple indexes may have better performance and may be + worth the trouble in some cases.
+A more advanced use case would be to use multiple + indexes to improve indexing performance, by updating + several indexes in parallel (using multiple CPU cores and + disks, or possibly several machines), and then merging + them, or querying them in parallel.
A specific configuration can be selected by setting
the RECOLL_CONFDIR environment
variable, or giving the -c
option to any of the Recoll commands.
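As a minimal sketch of the two selection methods, the commands below show how RECOLL_CONFDIR reaches a command through the environment. The path is a placeholder, and the `sh -c` probe is a stand-in for a Recoll command (an assumption made so that the example runs without Recoll installed): it only prints the value that the command would inherit.

```shell
# Placeholder path for an additional configuration directory.
confdir=/path/to/my/new/config

# Stand-in for a Recoll command: prints the RECOLL_CONFDIR it inherits.
probe='echo "config: ${RECOLL_CONFDIR:-default}"'

unset RECOLL_CONFDIR
sh -c "$probe"                            # no selection: default configuration

# Set the variable for a single command only.
RECOLL_CONFDIR="$confdir" sh -c "$probe"

# Or export it, so that every later command in this shell inherits it.
export RECOLL_CONFDIR="$confdir"
sh -c "$probe"

# With the real commands, the per-command form can also be written as:
#   recollindex -c "$confdir"
```

The exported form suits a session spent entirely on one additional index, while the per-command forms leave the default configuration in effect for everything else.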
When generating indexes, the different configurations - are entirely independant (no parameters are ever shared - between configurations when indexing).
-Multiple indexes can be queryied concurrently, either - from the GUI or the command line. When doing this, there - is always a main configuration, from which both - configuration and index data are used. Only the index - data from the additional indexes is used (their - configuration parameters are ignored).
-This is important and sometimes confusing, so it will
- be rephrased here: for index generation, multiple
- configurations are totally independant from each other.
- When querying, configuration and data are used from the
- main index (the one designated by -c or When creating or updating indexes, the different
+ configurations are entirely independent (no parameters
+ are ever shared between configurations when indexing).
+ The recollindex program
+ always works on a single index.
When querying, multiple indexes can be accessed + concurrently, either from the GUI or the command line. + When doing this, there is always one main configuration, + from which both configuration and index data are used. + Only the index data from the additional indexes is used + (their configuration parameters are ignored).
+The behaviour of index update and query regarding
+ multiple configurations is important and sometimes
+ confusing, so it will be rephrased here: for index
+ generation, multiple configurations are totally
+ independent from each other. When querying, configuration
+ and data are used from the main index (the one designated
+ by -c or RECOLL_CONFDIR), and only the data from
- the additional indexes is used. This also implies that
- some parameters
- should be consistent among the configurations for
- indexes which are to be used together.
See the section about configuring multiple + indexes for more detail
When working with the Index configuration parameters can be set either by
+ using a text editor on the files, or, for most
+ parameters, by using the recoll index
- configuration GUI, the configuration directory for which
- parameters are modified is the one which was selected by
- RECOLL_CONFDIR or the
- -c parameter, and there is no
- way to switch configurations within the GUI.
Additional configuration directories (beyond
- ~/.recoll) must be created
- by hand (mkdir or such), the GUI
- will not do it. This is to avoid mistakenly creating
- additional directories when an argument is mistyped.
A typical usage scenario for the multiple index - feature would be for a system administrator to set up a - central index for shared data, that you choose to search - or not in addition to your personal data. Of course, - there are other possibilities. There are many cases where - you know the subset of files that should be searched, and - where narrowing the search can improve the results. You - can achieve approximately the same effect with the - directory filter in advanced search, but multiple indexes - will have better performance and may be worth the - trouble.
-A RECOLL_CONFDIR or the -c parameter, and there is no way to
+ switch configurations within the GUI.
As a reminder from a previous section, a recollindex program instance can only update one specific index, and it will only use parameters from a single configuration (no parameters are ever shared between configurations when - indexing).
-Multiple indexes can be queryied concurrently, either - from the GUI or the command line. When doing this, there - is always a main configuration, from which both - configuration and index data are used. Only the index - data from the additional indexes is used (their - configuration parameters are ignored).
+ indexing). All the query methods (recoll, recollq, the Python + API, etc.) operate with a main configuration, from which + both configuration and index data are used, but can also + query data from multiple additional indexes. Only the + index data from the latter is used; their configuration + parameters are ignored. When searching, the current main index (defined by
RECOLL_CONFDIR or -c) is always active. If this is
@@ -1428,6 +1434,60 @@ alink="#0000FF">
The different search interfaces (GUI, command line, ...) have different methods to define the set of indexes to be used, see the appropriate section.
+At the moment, using multiple configurations requires a
+ small amount of command line usage. Additional
+ configuration directories (beyond ~/.recoll) must be created by hand
+ (mkdir or
+ such), the GUI will not do it. This is to avoid
+ mistakenly creating additional directories when an
+ argument is mistyped. Also, the GUI or the indexer must
+ be launched with a specific option or environment to work
+ on the right configuration.
To be more practical, here follow a few examples of + the commands needed to create, configure, update, and query + an additional index.
+Initially creating the configuration and index:
+
+mkdir /path/to/my/new/config
+ Configuring the new index can be done from the
+ recoll GUI,
+ launched from the command line to pass the -c option (you could create a desktop
+ file to do it for you), and then using the GUI index
+ configuration tool to set up the index.
+recoll -c /path/to/my/new/config
+ Alternatively, you can just start a text editor on the
+ main configuration file
+ recoll.conf.
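For the text editor route, a minimal recoll.conf could look like the sketch below. Both directory paths are hypothetical, and the file sets only topdirs, the parameter listing the directory trees to index. Every other parameter keeps its default value.

```shell
# Hypothetical locations, adjust both to your setup.
confdir="${TMPDIR:-/tmp}/recoll-example-config"
datadir="${TMPDIR:-/tmp}/recoll-example-data"

mkdir -p "$confdir" "$datadir"

# Write a one-line main configuration file: index only $datadir.
cat > "$confdir/recoll.conf" <<EOF
topdirs = $datadir
EOF

# Show the result.
cat "$confdir/recoll.conf"
```

Restricting topdirs before the first indexing run keeps the new index limited to the intended data instead of the default area.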
Creating and updating the index can be done from the + command line:
+recollindex -c /path/to/my/new/config
+
+ or from the File menu of a GUI launched with the same + option (recoll, see above).
+The same GUI would also let you set up batch indexing
+ for the new index. Real time indexing can only be set up
+ from the GUI for the default index (the menu entry will
+ be inactive if the GUI was started with a non-default
+ -c option).
The new index can be queried alone with
+
+recoll -c /path/to/my/new/config
+ Or, in parallel with the default index, by starting
+ recoll
+ without a -c option, and
+ using the →
+
+ menu.