topdirs
Specifies the list of directories or files to index (recursively for directories). You can use symbolic links as elements of this list. See the followLinks option about following symbolic links found under the top elements (not followed by default).

skippedNames
A space-separated list of wildcard patterns for names of files or directories that should be completely ignored. The list defined in the default file is:
skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \
    *~ .beagle .git .hg .bzr loop.ps .xsession-errors \
    .recoll* xapiandb recollrc recoll.conf
The list can be redefined at any sub-directory in the indexed area.
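As a sketch of such a redefinition, the fragment below sets a different list for one sub-directory. The path /home/me/projects and the patterns are hypothetical, chosen only for illustration. The sketch assumes that a bracketed section header applies to the named directory and everything below it, and that a value set there replaces the top-level list.

```ini
# Top-level value (shortened here for readability)
skippedNames = #* CVS tmp .svn *~ .git

# Hypothetical sub-directory: the value set in this section is the one
# used for files and directories under /home/me/projects
[/home/me/projects]
skippedNames = #* *~ .git build node_modules
```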
The top-level directories are not affected by this list (that is, a directory in topdirs might match and would still be indexed).

The list in the default configuration does not exclude hidden directories (names beginning with a dot), which means that it may index quite a few things that you do not want. On the other hand, email user agents like Thunderbird usually store messages in hidden directories, and you probably want this indexed. One possible solution is to have .* in skippedNames, and add things like ~/.thunderbird or ~/.evolution in topdirs.

Not even the file names are indexed for patterns in this list. See the noContentSuffixes variable for an alternative approach which indexes the file names.

noContentSuffixes
This is a list of file name endings (not wildcard expressions, nor dot-delimited suffixes). Only the names of matching files will be indexed (no attempt at MIME type identification, no decompression, no content indexing). This can be redefined for subdirectories, and edited from the GUI. The default value is:
noContentSuffixes = .md5 .map \
    .o .lib .dll .a .sys .exe .com \
    .mpp .mpt .vsd \
    .img .img.gz .img.bz2 .img.xz .image .image.gz .image.bz2 .image.xz \
    .dat .bak .rdf .log.gz .log .db .msf .pid \
    ,v ~ #

skippedPaths and daemSkippedPaths
A space-separated list of patterns for paths of files or directories that should be skipped. There is no default in the sample configuration file, but the code always adds the configuration and database directories in there.

skippedPaths is used both by batch and real time indexing. daemSkippedPaths can be used to specify things that should be indexed at startup, but not monitored.

Example of use for skipping text files only in a specific directory:
skippedPaths = ~/somedir/*.txt

skippedPathsFnmPathname
The values in the *skippedPaths variables are matched by default with fnmatch(3), with the FNM_PATHNAME flag. This means that '/' characters must be matched explicitly. You can set skippedPathsFnmPathname to 0 to disable the use of FNM_PATHNAME (meaning that /*/dir3 will match /dir1/dir2/dir3).

zipSkippedNames
A space-separated list of patterns for names of files or directories that should be ignored inside zip archives. This is used directly by the zip handler, and has a function similar to skippedNames, but works independently. Can be redefined for filesystem subdirectories. For versions up to 1.19, you will need to update the Zip handler and install a supplementary Python module. The details are described on the Recoll wiki.
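The effect of skippedPathsFnmPathname, described above, may be easier to see on a concrete fragment. This is only a sketch: the directory names are the placeholder ones used in the skippedPathsFnmPathname entry, and the comments describe standard fnmatch(3) behaviour rather than anything verified against a particular Recoll version.

```ini
# With the default setting (FNM_PATHNAME in effect), a '*' never matches
# a '/' character, so this pattern matches /dir1/dir3 but not
# /dir1/dir2/dir3
skippedPaths = /*/dir3

# With the flag disabled, '*' may also match '/' characters, so the same
# pattern additionally matches /dir1/dir2/dir3
skippedPathsFnmPathname = 0
```

Leaving the flag at its default is the more conservative choice, since a pattern then only skips paths at the depth it spells out.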
followLinks
Specifies if the indexer should follow symbolic links while walking the file tree. The default is to ignore symbolic links to avoid multiple indexing of linked files. No effort is made to avoid duplication when this option is set to true. This option can be set individually for each of the topdirs members by using sections. It cannot be changed below the topdirs level.

indexedmimetypes
Recoll normally indexes any file which it knows how to read. This list lets you restrict the indexed MIME types to what you specify. If the variable is unspecified or the list empty (the default), all supported types are processed. Can be redefined for subdirectories.
excludedmimetypes
This list lets you exclude some MIME types from indexing. Can be redefined for subdirectories.
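The following fragment sketches how followLinks, indexedmimetypes and excludedmimetypes could be set through sections. All paths and the MIME type choices are hypothetical. The sketch assumes that boolean values are written as 0 or 1, as for skippedPathsFnmPathname, and that a bracketed section header names the directory it applies to.

```ini
# Hypothetical top-level elements
topdirs = /home/me/docs /home/me/shared

# Follow symbolic links only under this topdirs member
[/home/me/shared]
followLinks = 1

# Hypothetical sub-directory: index only these MIME types here
[/home/me/docs/reports]
indexedmimetypes = application/pdf text/html

# Hypothetical sub-directory: index all supported types except these
[/home/me/docs/photos]
excludedmimetypes = image/jpeg image/png
```

Note that the followLinks section names a topdirs member itself, since the option cannot be changed below that level, whereas the two MIME type lists are set on sub-directories.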
compressedfilemaxkbs
Size limit, in kilobytes, for compressed (.gz or .bz2) files. These need to be decompressed in a temporary directory for identification, which can be very wasteful if 'uninteresting' big compressed files are present. A negative value means no limit, 0 means no processing of any compressed file. Defaults to -1.
textfilemaxmbs
Maximum size for text files, in megabytes. Very big text files are often uninteresting logs. Set to -1 to disable the limit. The default is 20 MB.
textfilepagekbs
If set to a value other than -1, text files will be indexed as multiple documents of the given page size, in kilobytes. This may be useful if you do want to index very big text files, as it will both reduce memory usage at index time and help with loading data into the preview window. A size of a few megabytes would seem reasonable (default: 1 MB).
membermaxkbs
This defines the maximum size in kilobytes for an archive member (zip, tar or rar at the moment). Bigger entries will be skipped.
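The four size limits above could be combined as in the sketch below. Every number is a hypothetical value picked for illustration, not a recommendation or a default.

```ini
# Do not decompress .gz or .bz2 files bigger than about 50 MB
# (-1 would mean no limit, 0 would disable compressed file processing)
compressedfilemaxkbs = 50000

# Skip text files bigger than 100 MB (-1 would disable the limit)
textfilemaxmbs = 100

# Index big text files as a series of documents of about 2 MB each
textfilepagekbs = 2000

# Skip archive members (zip, tar or rar) bigger than about 20 MB
membermaxkbs = 20000
```

Note the differing units: textfilemaxmbs is expressed in megabytes, the other three in kilobytes.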
indexallfilenames
Recoll indexes file names in a special section of the database to allow specific file name searches using wildcards. This parameter decides if file name indexing is performed only for files with MIME types that would qualify them for full text indexing, or for all files inside the selected subtrees, independently of MIME type.
usesystemfilecommand
Decides if a system command (file -i by default) is executed as a final step for determining the MIME type of a file (the main procedure uses suffix associations as defined in the mimemap file). This can be useful for files with suffix-less names, but it will also cause the indexing of many bogus "text" files.

systemfilecommand
Command to use for MIME type determination if usesystemfilecommand is set. Recent versions of xdg-mime sometimes work better than file.

processwebqueue
If this is set, process the directory where Web browser plugins copy visited pages for indexing.
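A possible way of switching MIME type detection to xdg-mime, as suggested in the systemfilecommand entry, is sketched below. The exact command line is an assumption: xdg-mime query filetype is the standard xdg-mime invocation for printing a file's type, and the sketch supposes that the indexer appends the file name to the configured command, as it would for the default file -i.

```ini
# Run an external command when suffix associations give no answer
usesystemfilecommand = 1

# Assumed invocation; the file name is expected to be appended by the indexer
systemfilecommand = xdg-mime query filetype
```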
webqueuedir
The path to the web indexing queue. This is hard-coded in the Firefox plugin as ~/.recollweb/ToIndex so there should be no need to change it.
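A minimal sketch of enabling the web queue follows. It assumes that boolean values are written as 0 or 1. The webqueuedir line only restates the location used by the Firefox plugin and could be left out.

```ini
# Process the pages queued by the browser plugin
processwebqueue = 1

# Shown for completeness: the location hard-coded in the Firefox plugin
webqueuedir = ~/.recollweb/ToIndex
```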

