From d1cc20d8e9281de3ed9b257b7953b2524eaa429c Mon Sep 17 00:00:00 2001 From: dockes Date: Wed, 21 Nov 2007 09:00:15 +0000 Subject: [PATCH] *** empty log message *** --- src/INSTALL | 35 +++++++++++++++-- src/README | 110 ++++++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 130 insertions(+), 15 deletions(-) diff --git a/src/INSTALL b/src/INSTALL index 85aea8fb..822d02fd 100644 --- a/src/INSTALL +++ b/src/INSTALL @@ -23,7 +23,9 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or 4.4. Configuration overview - 4.5. Extending Recoll + 4.5. The KDE Kicker Recoll applet + + 4.6. Extending Recoll 4.1. Installing a prebuilt copy @@ -87,6 +89,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or * RTF: unrtf + * TeX: Recoll uses the untex program. Your distribution may have a + package for it. If it doesn't, there is a copy of the source on the + Recoll web site, because the program has no obvious home. The filter + can also work with detex and will use it if it is installed. + * dvi: dvips * djvu: DjVuLibre @@ -402,6 +409,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or useful for files with suffix-less names, but it will also cause the indexing of many bogus "text" files. + indexedmimetypes + + Recoll normally indexes any file which it knows how to read. This + list lets you restrict the indexed mime types to what you specify. + If the variable is unspecified or the list empty (the default), + all supported types are processed. + indexallfilenames Recoll indexes file names in a special section of the database to @@ -438,6 +452,21 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or Useful for cases where you don't need the functionality or when it is unusable because aspell crashes during dictionary generation. 
+ nocjk + + If this is set to true, specific East Asian (Chinese, Korean, Japanese) + character/word splitting is turned off. This will save a small + amount of CPU if you have no CJK documents. If your document base + does include such text but you are not interested in searching it, + setting nocjk may be a significant time and space saver. + + cjkngramlen + + This lets you adjust the size of n-grams used for indexing CJK + text. The default value of 2 is probably appropriate in most + cases. A value of 3 would allow more precision and efficiency on + longer words, but the index will be approximately twice as large. + 4.4.2. The mimemap file mimemap specifies the file name extension to mime type mappings. @@ -560,5 +589,5 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or -------------------------------------------------------------------------- - Prev Home Next - Building from source Up Extending Recoll + Prev Home Next + Building from source Up The KDE Kicker Recoll applet diff --git a/src/README b/src/README index f44d8910..322f63ba 100644 --- a/src/README +++ b/src/README @@ -34,11 +34,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or 2.2. Index storage - 2.2.1. Index formats + 2.2.1. Xapian index formats 2.2.2. Security aspects - 2.3. The indexing configuration + 2.3. Indexing configuration + + 2.3.1. The indexing configuration GUI 2.4. Periodic indexing @@ -106,9 +108,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or 4.4.5. Examples of configuration adjustments - 4.5. Extending Recoll + 4.5. The KDE Kicker Recoll applet - 4.5.1. Writing a document filter + 4.6. Extending Recoll + + 4.6.1. 
Writing a document filter ---------------------------------------------------------------------- @@ -315,7 +319,10 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or ---------------------------------------------------------------------- - 2.2.1. Index formats + 2.2.1. Xapian index formats + + If your first installation of Recoll was 1.9.0 or more recent, you can + skip this section. Xapian has had two possible index formats for quite some time. The "old" one named Quartz, and the new one named Flint. Xapian 0.9 used Quartz by @@ -354,15 +361,17 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or in appropriate protection. If you use another setup, you should think of the kind of protection you - need for your index, and set the directory and files access modes - appropriately. + need for your index, set the directory and file access modes + appropriately, and possibly also adjust the umask used during index updates. ---------------------------------------------------------------------- -2.3. The indexing configuration +2.3. Indexing configuration - You can control which areas of the file system are indexed, and how files - are processed, by setting variables inside the Recoll configuration files. + Variables set inside the Recoll configuration files control which areas of + the file system are indexed, and how files are processed. These variables + can be set either by editing the text files or by using the dialogs in the + recoll GUI. You can also use multiple indexes defined by separate configurations, typically to separate personal and shared indexes, or to take advantage of @@ -386,6 +395,31 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or ---------------------------------------------------------------------- + 2.3.1. 
The indexing configuration GUI + + As of Recoll 1.10, most parameters for a given indexing configuration can + be set from a recoll GUI running on this configuration (either as default, + or by setting RECOLL_CONFDIR or the -c option). + + The interface is started from the Preferences menu. It has two main + panels. The first panel allows setting global variables, like the list of + top directories or the list of skipped paths. The second panel allows + setting variables that can be redefined for subdirectories. This second + panel has an initially empty list of customisation directories, to which + you can add. The variables are then set for the currently selected + directory (or at the top level if the empty line is selected). + + The meaning of most entries in the interface is self-evident and + documented by a ToolTip popup on the text label. For more detail, you will + need to refer to the configuration section of this guide. + + The configuration tool normally respects the comments and most of the + formatting inside the configuration file, so that it is quite possible to + use it on hand-edited files, which you might nevertheless want to back up + first. + + ---------------------------------------------------------------------- + 2.4. Periodic indexing 2.4.1. Starting indexing @@ -718,6 +752,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or * ext for specifying the file name extension (Ex: ext:html) + * dir for specifying the file location (Ex: dir:/home/me/somedir). + Please note that this is quite inefficient, that it may produce very + slow searches, and that it may be worthwhile in some cases to set up + separate databases instead. + * mime for specifying the mime type. This one is quite special because you can specify several values which will be OR'ed (the normal default for the language is AND). Ex: mime:text/plain mime:text/html. 
@@ -1203,6 +1242,11 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or * RTF: unrtf + * TeX: Recoll uses the untex program. Your distribution may have a + package for it. If it doesn't, there is a copy of the source on the + Recoll web site, because the program has no obvious home. The filter + can also work with detex and will use it if it is installed. + * dvi: dvips * djvu: DjVuLibre @@ -1500,6 +1544,13 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or useful for files with suffix-less names, but it will also cause the indexing of many bogus "text" files. + indexedmimetypes + + Recoll normally indexes any file which it knows how to read. This + list lets you restrict the indexed mime types to what you specify. + If the variable is unspecified or the list empty (the default), + all supported types are processed. + indexallfilenames Recoll indexes file names in a special section of the database to @@ -1536,6 +1587,21 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or Useful for cases where you don't need the functionality or when it is unusable because aspell crashes during dictionary generation. + nocjk + + If this is set to true, specific East Asian (Chinese, Korean, Japanese) + character/word splitting is turned off. This will save a small + amount of CPU if you have no CJK documents. If your document base + does include such text but you are not interested in searching it, + setting nocjk may be a significant time and space saver. + + cjkngramlen + + This lets you adjust the size of n-grams used for indexing CJK + text. The default value of 2 is probably appropriate in most + cases. A value of 3 would allow more precision and efficiency on + longer words, but the index will be approximately twice as large. + ---------------------------------------------------------------------- 4.4.2. 
The mimemap file @@ -1668,9 +1734,29 @@ More documentation can be found in the doc/ directory or at http://www.recoll.or ---------------------------------------------------------------------- -4.5. Extending Recoll +4.5. The KDE Kicker Recoll applet - 4.5.1. Writing a document filter + The Recoll source tree contains the source code to the recoll_applet, a + small application derived from the find_applet. This can be used to add a + small Recoll launcher to the KDE panel. + + The applet is not automatically built with the main Recoll programs. To + build it, you need to unpack the Recoll source code, then go to the + kde/recoll_applet/ directory, and type the usual configure;make;make + install. + + You can then add the applet to the panel by right-clicking the panel and + choosing the Add applet entry. + + The recoll_applet has a small text window where you can type a Recoll + query (in query language form), and an icon which can be used to restrict + the search to certain types of files. + + ---------------------------------------------------------------------- + +4.6. Extending Recoll + + 4.6.1. Writing a document filter Recoll filters are executable programs which translate from a specific format (ie: openoffice, acrobat, etc.) to the Recoll indexing input