== Using the log file to investigate indexing issues

All *Recoll* processes print trace messages. By default these go to
standard error, so you may never see them (for example, when the *recoll*
GUI is started from the desktop interface).

There are a number of potential issues with indexing that may need
investigation, such as:

- A file can't be found by searching even though it appears that it should
have been indexed (this could happen because the file is not selected at
all, or because a filter program crashes).
- The indexing process gets stuck and never finishes.
- The indexing process ends with an error.
- The indexing process seems to be using too many system resources.

The right way to approach these problems is to use the *recollindex*
command line tool (instead of the *recoll* GUI), and to set up the
trace log to provide information about what indexing is actually doing.

Trace log parameters can be set either from the GUI _Preferences->Indexing
Configuration->Global Parameters_ panel, or by editing the configuration
file '~/.recoll/recoll.conf'. You should set the following parameters:

----
loglevel = 6
logfilename = stderr
thrQSizes = -1 -1 -1
----

We use _stderr_ instead of an actual file in order to capture direct filter
messages (such as a *python* stack trace) along with normal
*recollindex* messages.

The last line sets *recollindex* to single-threaded operation, which makes
the log much more readable.

You should then check that no *recoll* or *recollindex* process is
currently running, and kill any you find.

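One possible way to do this check from a terminal is sketched below. The
function names are made up for this example, and the match is on the exact
command names +recoll+ and +recollindex+ only, which is an assumption about
how the programs appear in the process list on your system.

```shell
# Read "PID COMMAND" lines on stdin and print those whose command name
# (with any leading directory removed) is exactly recoll or recollindex.
filter_recoll_procs() {
    awk '{
        name = $2
        sub(/.*\//, "", name)
        if (name == "recoll" || name == "recollindex") print $1, name
    }'
}

# List matching processes for all users; prints nothing if none is running.
list_recoll_procs() {
    ps -e -o pid= -o comm= | filter_recoll_procs
}
```

Running +list_recoll_procs+ prints one "PID name" line per process found;
each can then be stopped with +kill PID+.
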
Then, if the issue concerns a specific, identified file, try indexing only
that file:

----
recollindex -i myunfindablefile.xxx > /tmp/myindexlog 2>&1
----

If this is a general issue with indexing (for example, the process not
finishing properly), just run a full indexing pass:

----
recollindex > /tmp/myindexlog 2>&1
----

Usually, a look at the trace will show what is wrong (e.g. a configuration
issue or a missing filter) and let you solve the problem.

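With +loglevel = 6+ the log can be long, so it may help to look at the most
severe messages first. The sketch below assumes that every trace line starts
with +:LEVEL:+, as the sample message in the path gotcha note does, and that
lower numbers are more severe. The function name is made up for this
example.

```shell
# Print the lines of log file $2 whose level is at most $1.
# Assumes trace lines start with ":LEVEL:"; any other line is dropped.
log_lines_up_to_level() {
    awk -F: -v max="$1" '$1 == "" && $2 ~ /^[0-9]+$/ && $2 + 0 <= max + 0' "$2"
}
```

For example, +log_lines_up_to_level 2 /tmp/myindexlog+ would show only
error-level lines. Direct filter output, such as a Python stack trace, has
no such prefix and is dropped by this filter, so read the full log as well.
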
In case of indexer misbehaviour (e.g. using too much memory), you should
run _tail -f_ on the log to see what is going on.

If this is not enough, please
link:http://bitbucket.org/medoc/recoll/issues/new[open a tracker issue] and
attach or link to the log data, or just email me (jfd at recoll.org).

*recollindex* and *recollindex -i* usually apply the same criteria when
deciding whether to include a file (but see the _Path gotcha_ note below).
They may occasionally behave differently, so it can be useful to run a
full *recollindex* even when investigating a specific file, but this will
produce a big log file.

When you are done, reset the log verbosity to a more reasonable level
(e.g. +2+: errors only, +4+: basic traces).

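If you prefer the command line to the GUI for this step, the following is a
minimal sketch. It assumes that 'recoll.conf' already contains a single
top-level +loglevel = N+ line; the function name is made up for this
example, and a backup copy is kept next to the file.

```shell
# Set loglevel to $2 in configuration file $1, keeping the previous
# version of the file as "$1.bak".
set_loglevel() {
    conf=$1
    level=$2
    cp "$conf" "$conf.bak" &&
    sed "s/^loglevel[[:space:]]*=.*/loglevel = $level/" "$conf.bak" > "$conf"
}
```

For example: +set_loglevel ~/.recoll/recoll.conf 2+.
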
=== Note: the path gotcha

*recollindex -i* will only index files under the directories defined by the
+topdirs+ configuration variable (your home directory by default).
Unfortunately, the test compares the text of the file path and does not
resolve symbolic links. If you give a simple file name as a parameter to
*recollindex -i* and there are symbolic links inside the +topdirs+
entries, the comparison may fail. For example, if your home directory is
'/home/me/' and '/home/' is a link to '/usr/home/',
*recollindex -i somefile* will actually try to index
'/usr/home/me/somefile', and fail (because '/usr/home/me/' is not a
subdirectory of '/home/me/'). This shows up in the log as a message like
the following:

----
:4:../index/fsindexer.cpp:149:FsIndexer::indexFiles: skipping [/usr/home/me/somefile] (ntd)
----

If this happens, give a full path consistent with what is found in the
configuration file (e.g.: _recollindex -i /home/me/somefile_).

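The effect of a purely textual comparison can be illustrated with a small
shell function. This only illustrates the behaviour described in this note;
it is not the code *recollindex* runs, and the function name is made up for
this example.

```shell
# Succeed if the text of path $2 lies under top directory $1.
# The comparison is purely textual: symbolic links are not resolved.
is_under_topdir() {
    topdir=${1%/}
    case $2 in
        "$topdir"/*) return 0 ;;
        *) return 1 ;;
    esac
}

is_under_topdir /home/me /home/me/somefile && echo "indexed"
is_under_topdir /home/me /usr/home/me/somefile || echo "skipped"
```

This prints "indexed" and then "skipped": the second path names the same
file through the link target, but its text does not start with '/home/me/'.
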
=== File system occupation

One possible reason for failed indexing is a +maxfsoccup+ parameter set
too low. This is the level of file system occupation (used space, not free
space) at which indexing stops. It is set from the GUI indexing
configuration or by editing 'recoll.conf'. A value of 0 disables the check,
but a very low non-zero value will simply prevent indexing.

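To judge whether a given +maxfsoccup+ value is plausible, you can compare
it with the current occupation of the file system. The sketch below assumes
that the value is a percentage comparable to the "Capacity" column of
+df+ output, and that indexing stops when occupation is strictly above it;
both points are assumptions, and the function names are made up for this
example.

```shell
# Print the occupation (used space, in percent) of the file system
# holding path $1, taken from the Capacity column of POSIX df output.
fs_occupation() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if indexing would stop: threshold $2 is non-zero and the
# occupation $1 is above it. A threshold of 0 means no checking.
occupation_blocks_indexing() {
    [ "$2" -ne 0 ] && [ "$1" -gt "$2" ]
}
```

For example, +occupation_blocks_indexing "$(fs_occupation ~/.recoll)" 5+
succeeds on almost any file system in use, which matches the case of a very
low non-zero value preventing indexing. Which file system *recollindex*
actually checks is not stated here, so the path is only a placeholder.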