From 5afe1aa631094581529e776c1c4f03b5f2d496f0 Mon Sep 17 00:00:00 2001
From: Jean-Francois Dockes
With the help of a Firefox extension, Recoll can index the Internet pages
- that you visit. The extension was initially designed for
- the Beagle indexer, but it
- has recently been renamed and better adapted to Recoll.
- 2.4. Indexing WEB pages
- you visit
+ 2.4. Indexing the WEB
+ pages which you visit.
The extension works by copying visited WEB pages to an indexing queue directory, which Recoll then processes, indexing the data, storing it into a local cache, then removing the file from the queue.
-This feature can be enabled in the GUI Index configuration panel, or by editing
- the configuration file (set processwebqueue to 1).
Because the WebExtensions API introduces more
+ constraints to what extensions can do, the new version
+ works with one more step: the files are first created in
+ the browser default downloads location (typically
+ $HOME/Downloads), then moved
+ by a script to the old queue location. The script is
+ automatically executed by the Recoll indexer versions 1.23.5 and
+ newer. It could conceivably be executed independently to
+ make the new browser extension compatible with an older
+ Recoll version (the script
+ is named recoll-we-move-files.py).
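
The move step described above can be sketched in Python as follows. This is a minimal illustration only, not the contents of recoll-we-move-files.py: the function name move_queued_files and the "recoll-we-*" file name pattern are assumptions made for the example, and the two directories stand for the webdownloadsdir and webqueuedir configuration values.

```python
import shutil
from pathlib import Path


def move_queued_files(downloads_dir, queue_dir, pattern="recoll-we-*"):
    """Move files matching `pattern` from downloads_dir into queue_dir.

    `pattern` is a hypothetical naming convention chosen for this sketch;
    the real extension may name its files differently.
    Returns the sorted list of moved file names.
    """
    downloads = Path(downloads_dir).expanduser()
    queue = Path(queue_dir).expanduser()
    # Create the queue directory if it does not exist yet.
    queue.mkdir(parents=True, exist_ok=True)

    moved = []
    for src in sorted(downloads.glob(pattern)):
        if not src.is_file():
            continue  # skip directories that happen to match the pattern
        shutil.move(str(src), str(queue / src.name))
        moved.append(src.name)
    return moved
```

Under these assumptions, calling move_queued_files("~/Downloads", "~/.recollweb/ToIndex") would move the matching files from the default downloads location to the old queue location, leaving other downloads untouched.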
For the WebExtensions-based version to work, it is
+ necessary to set the webdownloadsdir value in the
+ configuration if it was changed from the default
+ $HOME/Downloads in the
+ browser preferences.
The visited WEB pages indexing feature can be enabled in
+ the GUI Index configuration
+ panel, or by editing the configuration file (set
+ processwebqueue to 1).
A current pointer to the extension can be found, along
with up-to-date instructions, on the
If set, this defines the file character set
+ (mostly useful for plain text files).
By default, other attributes are handled as
+ https://www.lesbonscomptes.com/recoll/faqsandhowtos/FilteringOutZipArchiveMembers.html
The path to the Web indexing queue. This is
- hard-coded in the plugin as ~/.recollweb/ToIndex so
- there should be no need or possibility to change
- it.
The path to the Web indexing queue. This used to
+ be hard-coded in the old plugin as
+ ~/.recollweb/ToIndex so there would be no need or
+ possibility to change it, but the WebExtensions
+ plugin now downloads the files to the user
+ Downloads directory, and a script moves them to
+ webqueuedir. The script reads this value from the
+ config so it has become possible to change it.
The path to the browser downloads directory. This is
+ where the new browser add-on extension has to
+ create the files. They are then moved by a script
+ to webqueuedir.
webqueuedir
webdownloadsdir
+