From 7e3acf2d0aaa37413e9cc1d0eb32e7c104abc430 Mon Sep 17 00:00:00 2001 From: Jean-Francois Dockes Date: Sun, 31 Mar 2019 16:48:53 +0200 Subject: [PATCH] doc --- src/doc/user/usermanual.html | 200 +++++++++++++++++++++++------------ src/doc/user/usermanual.xml | 170 +++++++++++++++++++---------- 2 files changed, 244 insertions(+), 126 deletions(-) diff --git a/src/doc/user/usermanual.html b/src/doc/user/usermanual.html index 45bb1dd2..7e5bd4c2 100644 --- a/src/doc/user/usermanual.html +++ b/src/doc/user/usermanual.html @@ -92,11 +92,11 @@ alink="#0000FF"> "#RCL.INDEXING.INTRODUCTION.CONFIG">Configurations, multiple indexes
2.1.3. Document types
+ "#idm229">Document types
2.1.4. Indexing failures
+ "#idm270">Indexing failures
2.1.5. Recovery
+ "#idm282">Recovery
2.2.

Recoll supports - defining multiple indexes.

-

Each index is defined by its own configuration directory, in which several configuration files describe what should be indexed and how.

@@ -904,46 +903,66 @@ alink="#0000FF"> changed to process a different area of the file system, select files in different ways, and many other things.

-

In some cases, it may be interesting, for example, to - index different areas of the file system into separate - indexes, or use different options. You can do this by - creating additional configuration directories.

-

Examples of usage would be to separate personal and - shared indexes, or to take advantage of the organization - of your data to improve search precision.

+

In some cases, it may be useful to create additional + configuration directories, for example, to separate + personal and shared indexes, or to take advantage of the + organization of your data to improve search + precision.

+

A plausible usage scenario for the multiple index + feature would be for a system administrator to set up a + central index for shared data, that you choose to search + or not in addition to your personal data. Of course, + there are other possibilities. For example, there are + many cases where you know the subset of files that should + be searched, and where narrowing the search can improve + the results. You can achieve approximately the same + effect with the directory filter in advanced search, but + multiple indexes may have better performance and may be + worth the trouble in some cases.

+

A more advanced use case would be to use multiple + indexes to improve indexing performance, by updating + several indexes in parallel (using multiple CPU cores and + disks, or possibly several machines), and then merging + them, or querying them in parallel.

A specific configuration can be selected by setting the RECOLL_CONFDIR environment variable, or giving the -c option to any of the Recoll commands.

-

When generating indexes, the different configurations - are entirely independant (no parameters are ever shared - between configurations when indexing).

-

Multiple indexes can be queryied concurrently, either - from the GUI or the command line. When doing this, there - is always a main configuration, from which both - configuration and index data are used. Only the index - data from the additional indexes is used (their - configuration parameters are ignored).

-

This is important and sometimes confusing, so it will - be rephrased here: for index generation, multiple - configurations are totally independant from each other. - When querying, configuration and data are used from the - main index (the one designated by -c or When creating or updating indexes, the different + configurations are entirely independent (no parameters + are ever shared between configurations when indexing). + The recollindex program + always works on a single index.

+

When querying, multiple indexes can be accessed + concurrently, either from the GUI or the command line. + When doing this, there is always one main configuration, + from which both configuration and index data are used. + Only the index data from the additional indexes is used + (their configuration parameters are ignored).

+

The behaviour of index update and query regarding + multiple configurations is important and sometimes + confusing, so it will be rephrased here: for index + generation, multiple configurations are totally + independent from each other. When querying, configuration + and data are used from the main index (the one designated + by -c or RECOLL_CONFDIR), and only the data from - the additional indexes is used. This also implies that - some parameters - should be consistent among the configurations for - indexes which are to be used together.

+ the additional indexes is used. This implies that some + parameters should be consistent among the configurations + for indexes which are to be used together.

+

See the section about configuring multiple + indexes for more detail.

-

2.1.3. Document types

+

2.1.3. Document types

@@ -1040,8 +1059,8 @@ alink="#0000FF">
-

2.1.4. Indexing failures

+

2.1.4. Indexing failures

@@ -1076,8 +1095,8 @@ alink="#0000FF">
-

2.1.5. Recovery

+

2.1.5. Recovery

@@ -1368,42 +1387,29 @@ alink="#0000FF"> recoll and recollindex.

-

When working with the Index configuration parameters can be set either by + using a text editor on the files, or, for most + parameters, by using the recoll index - configuration GUI, the configuration directory for which - parameters are modified is the one which was selected by - RECOLL_CONFDIR or the - -c parameter, and there is no - way to switch configurations within the GUI.

-

Additional configuration directories (beyond - ~/.recoll) must be created - by hand (mkdir or such), the GUI - will not do it. This is to avoid mistakenly creating - additional directories when an argument is mistyped.

-

A typical usage scenario for the multiple index - feature would be for a system administrator to set up a - central index for shared data, that you choose to search - or not in addition to your personal data. Of course, - there are other possibilities. There are many cases where - you know the subset of files that should be searched, and - where narrowing the search can improve the results. You - can achieve approximately the same effect with the - directory filter in advanced search, but multiple indexes - will have better performance and may be worth the - trouble.

-

A RECOLL_CONFDIR or the -c parameter, and there is no way to + switch configurations within the GUI.

+

As a reminder from a previous section, a recollindex program instance can only update one specific index, and it will only use parameters from a single configuration (no parameters are ever shared between configurations when - indexing).

-

Multiple indexes can be queryied concurrently, either - from the GUI or the command line. When doing this, there - is always a main configuration, from which both - configuration and index data are used. Only the index - data from the additional indexes is used (their - configuration parameters are ignored).

+ indexing). All the query methods (recoll, recollq, the Python + API, etc.) operate with a main configuration, from which + both configuration and index data are used, but can also + query data from multiple additional indexes. Only the + index data from the latter is used; their configuration + parameters are ignored.

When searching, the current main index (defined by RECOLL_CONFDIR or -c) is always active. If this is @@ -1428,6 +1434,60 @@ alink="#0000FF">

The different search interfaces (GUI, command line, ...) have different methods to define the set of indexes to be used, see the appropriate section.

+

At the moment, using multiple configurations implies a + small level of command line usage. Additional + configuration directories (beyond ~/.recoll) must be created by hand + (mkdir or + such), the GUI will not do it. This is to avoid + mistakenly creating additional directories when an + argument is mistyped. Also, the GUI or the indexer must + be launched with a specific option or environment to work + on the right configuration.

+

To be more practical, here follow a few examples of + the commands needed to create, configure, update, and query + an additional index.

+

Initially creating the configuration and index:

+
+mkdir /path/to/my/new/config
+

Configuring the new index can be done from the + recoll GUI, + launched from the command line to pass the -c option (you could create a desktop + file to do it for you), and then using the GUI index + configuration tool to set up the index.

+
+recoll -c /path/to/my/new/config
+

Alternatively, you can just start a text editor on the + main configuration file + recoll.conf.

+

Creating and updating the index can be done from the + command line:

+
recollindex -c /path/to/my/new/config
+
+

or from the File menu of a GUI launched with the same + option (recoll, see above).

+

The same GUI would also let you set up batch indexing + for the new index. Real time indexing can only be set up + from the GUI for the default index (the menu entry will + be inactive if the GUI was started with a non-default + -c option).

+

The new index can be queried alone with

+
+recoll -c /path/to/my/new/config
+

Or, in parallel with the default index, by starting + recoll + without a -c option, and + using the Preferences → + External Index Dialog + menu.
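As an illustrative sketch only: the manual text above states that a configuration can be selected either with the -c option or through the RECOLL_CONFDIR environment variable. The script below shows the environment-variable form for the indexing step. The directory location and the RUN_INDEXER switch are assumptions made for this illustration, and the indexer is only started when it is installed and explicitly requested.

```shell
#!/bin/sh
# Hypothetical location for the additional configuration (an assumption
# for this sketch); replace it with your own path.
confdir="${TMPDIR:-/tmp}/recoll-example-config"

# The configuration directory has to be created by hand.
mkdir -p "$confdir"

# Select the configuration through the environment instead of -c.
export RECOLL_CONFDIR="$confdir"

# RUN_INDEXER is a switch invented for this sketch: the indexer only runs
# when it is installed and RUN_INDEXER=yes was set by the caller.
if command -v recollindex >/dev/null 2>&1 && [ "${RUN_INDEXER:-no}" = yes ]; then
    # Intended to have the same effect as: recollindex -c "$confdir"
    recollindex
else
    echo "would run: RECOLL_CONFDIR=$RECOLL_CONFDIR recollindex"
fi
```

Exporting the variable once can be convenient when several commands in the same shell session should all work on the same configuration, whereas -c keeps the choice visible on each command line.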

diff --git a/src/doc/user/usermanual.xml b/src/doc/user/usermanual.xml index b5cf259e..e82442ec 100644 --- a/src/doc/user/usermanual.xml +++ b/src/doc/user/usermanual.xml @@ -395,12 +395,10 @@ Configurations, multiple indexes - &RCL; supports defining multiple indexes. - - Each index is defined by its own configuration directory, in - which several configuration files describe what should be indexed - and how. + &RCL; supports defining multiple indexes, each defined by its + own configuration + directory, in which several configuration files describe + what should be indexed and how. A default personal configuration directory ($HOME/.recoll/) is created @@ -415,38 +413,58 @@ different area of the file system, select files in different ways, and many other things. - In some cases, it may be interesting, for example, to index - different areas of the file system into separate indexes, or use - different options. You can do this by creating additional - configuration directories. + In some cases, it may be useful to create additional + configuration directories, for example, to separate personal and + shared indexes, or to take advantage of the organization of your + data to improve search precision. - Examples of usage would be to separate personal and shared - indexes, or to take advantage of the organization of your data - to improve search precision. + A plausible usage scenario for the multiple index feature + would be for a system administrator to set up a central index for + shared data, that you choose to search or not in addition to your + personal data. Of course, there are other possibilities. For + example, there are many cases where you know the subset of files + that should be searched, and where narrowing the search can improve + the results. You can achieve approximately the same effect with the + directory filter in advanced search, but multiple indexes may have + better performance and may be worth the trouble in some + cases.
+ + A more advanced use case would be to use multiple indexes to + improve indexing performance, by updating several indexes in + parallel (using multiple CPU cores and disks, or possibly several + machines), and then merging them, or querying them in + parallel. A specific configuration can be selected by setting the RECOLL_CONFDIR environment variable, or giving the option to any of the &RCL; commands. - When generating indexes, the different configurations are - entirely independant (no parameters are ever shared between - configurations when indexing). + When creating or updating indexes, the different + configurations are entirely independent (no parameters are ever + shared between configurations when indexing). The + recollindex program always works on a single + index. - Multiple indexes can be queryied concurrently, either from - the GUI or the command line. When doing this, there is always a - main configuration, from which both configuration and index data - are used. Only the index data from the additional indexes is used - (their configuration parameters are ignored). + When querying, multiple indexes can be accessed concurrently, + either from the GUI or the command line. When doing this, there is + always one main configuration, from which both configuration and + index data are used. Only the index data from the additional + indexes is used (their configuration parameters are + ignored). - This is important and sometimes confusing, so it will be + The behaviour of index update and query regarding multiple + configurations is important and sometimes confusing, so it will be rephrased here: for index generation, multiple configurations are totally independant from each other. When querying, configuration and data are used from the main index (the one designated by -c or RECOLL_CONFDIR), and only - the data from the additional indexes is used.
This also implies - that some parameters - should be consistent among the configurations for indexes - which are to be used together. + the data from the additional indexes is used. This implies + that some parameters should be consistent among the configurations + for indexes which are to be used together. + + See the section about configuring multiple + indexes for more detail. @@ -784,38 +802,24 @@ option to recoll and recollindex. - When working with the recoll index - configuration GUI, the configuration directory for which parameters - are modified is the one which was selected by - RECOLL_CONFDIR or the parameter, - and there is no way to switch configurations within the GUI. + Index configuration parameters can be set either by using a + text editor on the files, or, for most parameters, by using the + recoll index configuration GUI. In the latter + case, the configuration directory for which parameters are modified + is the one which was selected by RECOLL_CONFDIR or + the parameter, and there is no way to switch + configurations within the GUI. - Additional configuration directories (beyond - ~/.recoll) must be created by hand - (mkdir or such), the GUI will not do it. This is - to avoid mistakenly creating additional directories when an - argument is mistyped. - - A typical usage scenario for the multiple index feature would - be for a system administrator to set up a central index for shared - data, that you choose to search or not in addition to your personal - data. Of course, there are other possibilities. There are many - cases where you know the subset of files that should be searched, - and where narrowing the search can improve the results. You can - achieve approximately the same effect with the directory filter in - advanced search, but multiple indexes will have better performance - and may be worth the trouble.
- - A recollindex program instance can only - update one specific index, and it will only use parameters from a - single configuration (no parameters are ever shared between - configurations when indexing). - - Multiple indexes can be queryied concurrently, either from - the GUI or the command line. When doing this, there is always a + As a reminder from a previous section, a + recollindex program instance can only update one + specific index, and it will only use parameters from a single + configuration (no parameters are ever shared between configurations + when indexing). All the query methods (recoll, + recollq, the Python API, etc.) operate with a main configuration, from which both configuration and index data - are used. Only the index data from the additional indexes is used - (their configuration parameters are ignored). + are used, but can also query data from multiple additional + indexes. Only the index data from the latter is used; their + configuration parameters are ignored. When searching, the current main index (defined by RECOLL_CONFDIR or ) is always @@ -841,6 +845,60 @@ have different methods to define the set of indexes to be used, see the appropriate section. + At the moment, using multiple configurations implies a small + level of command line usage. Additional configuration directories + (beyond ~/.recoll) must be created by hand + (mkdir or such), the GUI will not do it. This is + to avoid mistakenly creating additional directories when an + argument is mistyped. Also, the GUI or the indexer must be launched + with a specific option or environment to work on the right + configuration. + + To be more practical, here follow a few examples of the + commands needed to create, configure, update, and query an additional + index.
+ + Initially creating the configuration and index: +mkdir /path/to/my/new/config + + Configuring the new index can be done from the + recoll GUI, launched from the + command line to pass the -c option + (you could create a desktop file to do it for you), and then using the + GUI index configuration tool to set up the index. + +recoll -c /path/to/my/new/config + + + + Alternatively, you can just start a text editor on the main + configuration file recoll.conf + . + + +Creating and updating the index can be done from the command line: + +recollindex -c /path/to/my/new/config + +or from the File menu of a GUI launched with the same option +(recoll, see above). + + The same GUI would also let you set up batch indexing for + the new index. Real time indexing can only be set up from the GUI + for the default index (the menu entry will be inactive if the GUI + was started with a non-default -c + option). + + The new index can be queried alone with +recoll -c /path/to/my/new/config + Or, in parallel with the default index, by starting + recoll without a -c option, + and using the + + Preferences + External Index Dialog + menu.