release 3304
parent b825ccbcfa
commit 1345b2e91f
60
src/INSTALL
@@ -737,7 +737,65 @@ Chapter 5. Installation and configuration
 memory, you can try higher values between 20 and 80. In my
 experience, values beyond 100 are always counterproductive.
 
-5.4.1.4. Miscellaneous parameters:
+5.4.1.4. Indexing parallelism configuration
+
+The Recoll indexing process recollindex can use multiple threads to speed
+up indexing on multiprocessor systems. The work done to index files is
+divided into several stages and some of the stages can be executed by
+multiple threads. The stages are:
+
+1. File system walking: this is always performed by the main thread.
+2. File conversion and data extraction.
+3. Text processing (splitting, stemming, etc.)
+4. Xapian index update.
+
+You can also read a longer document about the transformation of Recoll
+indexing to multithreading.
+
+The thread configuration is controlled by two configuration file
+parameters.
+
+thrQSizes
+
+This variable defines the job input queue configuration. There
+are three possible queues for stages 2, 3 and 4, and this
+parameter should give the queue depth for each stage (three
+integer values). If a value of -1 is used for a given stage, no
+queue is used, and the thread will go on performing the next
+stage. In practice, deep queues have not been shown to increase
+performance. A value of 0 for the first queue tells Recoll to
+perform autoconfiguration (no need for the two other values in
+this case); this is the default configuration.
+
+thrTCounts
+
+This defines the number of threads used for each stage. If a value
+of -1 is used for one of the queue depths, the corresponding
+thread count is ignored. It makes no sense to use a value other
+than 1 for the last stage because updating the Xapian index is
+necessarily single-threaded (and protected by a mutex).
+
+The following example would use three queues (of depth 2), and 4 threads
+for converting source documents, 2 for processing their text, and one to
+update the index. This was tested to be the best configuration on the test
+system (quad-processor with multiple disks).
+
+thrQSizes = 2 2 2
+thrTCounts = 4 2 1
+
+The following example would use a single queue, and the complete
+processing for each document would be performed by a single thread
+(several documents will still be processed in parallel in most cases). The
+threads will use mutual exclusion when entering the index update stage. In
+practice the performance would be close to the preceding case in general,
+but worse in certain cases (e.g. a Zip archive would be processed purely
+sequentially), so the previous approach is preferred. YMMV... The last 2
+values for thrTCounts are ignored.
+
+thrQSizes = 2 -1 -1
+thrTCounts = 6 1 1
+
+5.4.1.5. Miscellaneous parameters:
 
 autodiacsens
 
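As a reading aid for the staged design described in the added section, here is a minimal Python sketch of a four-stage pipeline with bounded queues between stages. It is not Recoll's implementation: the function names (run_stage, index_files), the placeholder conversion and splitting steps, and the dictionary standing in for the Xapian index are all assumptions made for illustration. The defaults mirror the "thrQSizes = 2 2 2" / "thrTCounts = 4 2 1" example.

```python
import queue
import threading

_DONE = object()  # sentinel telling a worker thread to stop


def run_stage(work, inbox, outbox):
    """Apply `work` to items taken from `inbox` until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is _DONE:
            break
        result = work(item)
        if outbox is not None:
            outbox.put(result)


def index_files(paths, depth=2, convert_threads=4, split_threads=2):
    """Hypothetical pipeline: walk -> convert -> split -> update."""
    index = {}  # stand-in for the real index; only one thread writes to it
    q_convert = queue.Queue(maxsize=depth)  # input queue for stage 2
    q_split = queue.Queue(maxsize=depth)    # input queue for stage 3
    q_update = queue.Queue(maxsize=depth)   # input queue for stage 4

    def convert(path):  # stage 2 placeholder: pretend to extract text
        return path, f"text extracted from {path}"

    def split(doc):     # stage 3 placeholder: whitespace splitting only
        path, text = doc
        return path, text.split()

    def update(doc):    # stage 4: single-threaded index update
        path, terms = doc
        index[path] = terms

    converters = [threading.Thread(target=run_stage,
                                   args=(convert, q_convert, q_split))
                  for _ in range(convert_threads)]
    splitters = [threading.Thread(target=run_stage,
                                  args=(split, q_split, q_update))
                 for _ in range(split_threads)]
    updater = threading.Thread(target=run_stage, args=(update, q_update, None))
    for t in converters + splitters + [updater]:
        t.start()

    # Stage 1: the main thread walks the file list and feeds stage 2.
    for path in paths:
        q_convert.put(path)

    # Shut the stages down in order, one sentinel per worker thread.
    for _ in converters:
        q_convert.put(_DONE)
    for t in converters:
        t.join()
    for _ in splitters:
        q_split.put(_DONE)
    for t in splitters:
        t.join()
    q_update.put(_DONE)
    updater.join()
    return index
```

Calling index_files(["a.txt", "b.pdf"]) returns a dictionary keyed by path. Because each queue is bounded, a fast stage blocks once the next stage's queue is full, which is one plausible reason shallow queues are sufficient.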
60
src/README
@@ -3642,7 +3642,65 @@ Chapter 5. Installation and configuration
 memory, you can try higher values between 20 and 80. In my
 experience, values beyond 100 are always counterproductive.
 
-5.4.1.4. Miscellaneous parameters:
+5.4.1.4. Indexing parallelism configuration
+
+The Recoll indexing process recollindex can use multiple threads to speed
+up indexing on multiprocessor systems. The work done to index files is
+divided into several stages and some of the stages can be executed by
+multiple threads. The stages are:
+
+1. File system walking: this is always performed by the main thread.
+2. File conversion and data extraction.
+3. Text processing (splitting, stemming, etc.)
+4. Xapian index update.
+
+You can also read a longer document about the transformation of Recoll
+indexing to multithreading.
+
+The thread configuration is controlled by two configuration file
+parameters.
+
+thrQSizes
+
+This variable defines the job input queue configuration. There
+are three possible queues for stages 2, 3 and 4, and this
+parameter should give the queue depth for each stage (three
+integer values). If a value of -1 is used for a given stage, no
+queue is used, and the thread will go on performing the next
+stage. In practice, deep queues have not been shown to increase
+performance. A value of 0 for the first queue tells Recoll to
+perform autoconfiguration (no need for the two other values in
+this case); this is the default configuration.
+
+thrTCounts
+
+This defines the number of threads used for each stage. If a value
+of -1 is used for one of the queue depths, the corresponding
+thread count is ignored. It makes no sense to use a value other
+than 1 for the last stage because updating the Xapian index is
+necessarily single-threaded (and protected by a mutex).
+
+The following example would use three queues (of depth 2), and 4 threads
+for converting source documents, 2 for processing their text, and one to
+update the index. This was tested to be the best configuration on the test
+system (quad-processor with multiple disks).
+
+thrQSizes = 2 2 2
+thrTCounts = 4 2 1
+
+The following example would use a single queue, and the complete
+processing for each document would be performed by a single thread
+(several documents will still be processed in parallel in most cases). The
+threads will use mutual exclusion when entering the index update stage. In
+practice the performance would be close to the preceding case in general,
+but worse in certain cases (e.g. a Zip archive would be processed purely
+sequentially), so the previous approach is preferred. YMMV... The last 2
+values for thrTCounts are ignored.
+
+thrQSizes = 2 -1 -1
+thrTCounts = 6 1 1
+
+5.4.1.5. Miscellaneous parameters:
 
 autodiacsens
 
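The pairing rules for thrQSizes and thrTCounts stated in the added section can be checked with a small Python sketch. It only encodes what the text says (three values for stages 2 to 4, -1 meaning "no queue, thread count ignored", and 0 in the first position meaning autoconfiguration). The names plan_threads and StagePlan are hypothetical, and this is not Recoll's own configuration parser.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stages 2, 3 and 4 from the documentation; stage 1 always runs in the main thread.
STAGES = ["conversion", "text processing", "index update"]


@dataclass
class StagePlan:
    name: str
    queue_depth: Optional[int]  # None: no queue, the previous thread carries on
    threads: Optional[int]      # None: the configured thread count is ignored


def plan_threads(thr_q_sizes: str, thr_t_counts: str = "") -> Optional[List[StagePlan]]:
    """Return a per-stage plan, or None when autoconfiguration is requested."""
    sizes = [int(v) for v in thr_q_sizes.split()]
    if sizes and sizes[0] == 0:
        # 0 for the first queue: autoconfiguration, other values are not needed.
        return None
    counts = [int(v) for v in thr_t_counts.split()]
    if len(sizes) != 3 or len(counts) != 3:
        raise ValueError("expected three queue depths and three thread counts")
    plan = []
    for name, depth, count in zip(STAGES, sizes, counts):
        if depth == -1:
            plan.append(StagePlan(name, None, None))
        else:
            plan.append(StagePlan(name, depth, count))
    return plan
```

With the two examples from the text, plan_threads("2 2 2", "4 2 1") yields three queued stages with 4, 2 and 1 threads, while plan_threads("2 -1 -1", "6 1 1") yields one queued stage with 6 threads and two stages whose thread counts are ignored.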