diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml index 13b7cad3..f8ca5075 100644 --- a/src/doc/user/usermanual.sgml +++ b/src/doc/user/usermanual.sgml @@ -45,12 +45,12 @@ Giving it a try - If you do not like reading manuals (who does?) and would - like to give &RCL; a try, just perform installation and start the - recoll user interface, which will index your - home directory by default, allowing you to search immediately after - indexing completes. + If you do not like reading manuals (who does?) and would like + to give &RCL; a try, just install the application and + start the recoll graphical user interface (GUI), + which will ask to index your home directory by default, allowing + you to search immediately after indexing completes. Do not do this if your home directory contains a huge number of documents and you do not want to wait or are very @@ -176,21 +176,24 @@ or by using configuration menus in the recoll GUI - Indexing - is started automatically the first time you execute the - recoll search graphical user interface, or by - executing the recollindex command. + The indexing + process is started automatically the first time you + execute the recoll GUI. Indexing can also be + performed by executing the recollindex + command. Searches are usually - performed inside the recoll graphical user - interface (GUI) program, which has many options to help you find - what you are looking for. However, there are other ways to perform - &RCL; searches: mostly a - command line tool, a + performed inside the recoll GUI, which has many + options to help you find what you are looking for. However, there + are other ways to perform &RCL; searches: mostly a + command line interface, a Python - programming interface, and a - KDE KIO slave module. + programming interface, a + KDE KIO slave module, and + a Ubuntu Unity Lens module. + @@ -251,25 +254,27 @@ configuration files. Most file types, like HTML or word processing files, only hold - one document. Some file types, like mail folder files or zip + one document. Some file types, like email folders or zip archives, can hold many individually indexed documents, which may in turn be themselves compound ones. Such hierarchies can go quite - deep, and &RCL; has no problem processing, for example, an ms-word - document which would be an attachment to an email message part of - a folder file archived inside a zip file... + deep, and &RCL; can process, for example, an + ms-word + document stored as an attachment to an email message inside an + email folder archived in a zip file... - &RCL; indexing processes plain text, HTML, openoffice - and e-mail files, and a few others internally. + &RCL; indexing processes plain text, HTML, OpenDocument + (Open/LibreOffice), email formats, and a few others internally. Other file types (ie: postscript, pdf, ms-word, rtf ...) need external applications for preprocessing. The list is in the installation section. After every indexing operation, &RCL; updates a list of commands that would be needed for indexing existing files - types. This list can be displayed from the - recoll File menu. It is - stored in the missing text file - inside the configuration directory. + types. This list can be displayed by selecting the menu option + File->Show Missing Helpers + in the recoll GUI. It is stored in the + missing text file inside the configuration + directory. Without further configuration, &RCL; will index all appropriate files from your home directory, with a reasonable @@ -353,9 +358,9 @@ recoll indexed). Of course, images, sound and video do not increase the - index size, which means that it will be quite typical nowadays - (2006), that even a big index will be negligible against the - total amount of data on the computer. + index size, which means that nowadays (2012), typically, even a big + index will be negligible against the total amount of data on the + computer. The index data directory (xapiandb) only contains data that can be completely rebuilt by an index run @@ -456,14 +461,20 @@ recoll option.) The interface is started from the - Preferences menu. It has two main - panels. The first panel allows setting global variables, like - the list of top directories or the list of skipped paths. The - second panel allows setting variables that can be redefined - for subdirectories. This second panel has an initially empty list of - customisation directories, to which you can add. The variables - are then set for the currently selected directory (or at the top - level if the empty line is selected). + Preferences->Indexing + Configuration menu entry. It is divided in three tabs, + Global parameters, Local + parameters, and Beagle web history, + which is explained in the next section. + + The first tab allows setting global variables, like the lists + of top directories, skipped paths, or stemming languages. + + The second tab allows setting variables that can be redefined + for subdirectories. This second tab has an initially empty list of + customisation directories, to which you can add. The variables are + then set for the currently selected directory (or at the top level + if the empty line is selected). The meaning for most entries in the interface is self-evident and documented by a ToolTip @@ -538,15 +549,17 @@ recoll if canceled). The recollindex indexing process can be - interrupted by sending an interrupt (^C, SIGINT) or terminate + interrupted by sending an interrupt (Ctrl-C, SIGINT) or terminate (SIGTERM) signal. Some time may elapse before the process exits, - because it needs to properly flush and close the index. The - indexing thread can be equivalently stopped from the menu. + because it needs to properly flush and close the index. This can + also be done from the recoll GUI + File->Stop Indexing + menu entry. After such an interruption, the index will be somewhat inconsistent because some operations which are normally performed at the end of the indexing pass will have been skipped (for - exemple, the stemming and spelling databases will be inexistant + example, the stemming and spelling databases will be inexistant or out of date). You just need to restart indexing at a later time to restore consistency. The indexing will restart at the interruption point (the full file tree will be traversed, @@ -593,7 +606,8 @@ recoll As of version 1.17 the &RCL; GUI has dialogs to manage crontab entries for recollindex. You can reach them from the - Preferences->Indexing Schedule menu. They only + Preferences->Indexing + Schedule menu. They only work with the good old cron, and do not give access to all features of cron scheduling. @@ -669,12 +683,13 @@ fvwm on the log level. When building &RCL;, the real time indexing support can be - customised during package - configuration - with the --with[out]-fam or + customised during package configuration with the + --with[out]-fam or --with[out]-inotify options. The default is - currently to include inotify monitoring on systems that support - it, and, as of recoll 1.17, gamin support on FreeBSD. + currently to include inotify monitoring + on systems that support it, and, as of recoll 1.17, + gamin support on FreeBSD. While it is convenient that data is indexed in real time, repeated indexing can generate a significant load on the @@ -729,7 +744,7 @@ fvwm In most cases, you can enter the terms as you think them, even if they contain embedded punctuation or other non-textual characters. For - exemple, &RCL; can handle things like e-mail addresses, or + example, &RCL; can handle things like email addresses, or arbitrary cut and paste from another text window, punctation and all. @@ -967,7 +982,7 @@ fvwm that you can't actually visualize the folder (there will be an error dialog if you try). &RCL; is unfortunately not yet smart enough to disable the entry in this case. In other cases, the - Open option makes sense, for exemple to + Open option makes sense, for example to start a chm viewer on the parent document for a help page. @@ -1023,7 +1038,7 @@ fvwm create a new preview window. The old one stays open until you close it. - You can close a preview tab by typing ^W + You can close a preview tab by typing Ctrl-W (Ctrl + W) in the window. Closing the last tab for a window will also close the window. @@ -1047,7 +1062,7 @@ fvwm F3 inside the text area to get to the next occurrence. - If you have a search string entered and you use ^Up/^Down + If you have a search string entered and you use Ctrl-Up/Ctrl-Down to browse the results, the search is initiated for each successive document. If the string is found, the cursor will be positioned at the first occurrence of the search string. @@ -1059,8 +1074,8 @@ fvwm the main text but in one of the fields. You can print the current preview window contents by typing - ^P (Ctrl + P) in - the window text. + Ctrl-P (Ctrl + + P) in the window text. @@ -1556,19 +1571,19 @@ fvwm Closing previews - Entering ^W in a tab will + Entering Ctrl-W in a tab will close it (and, for the last tab, close the preview window). Entering Esc will close the preview window and all its tabs. Printing previews - Entering ^P in a preview window will print + Entering Ctrl-P in a preview window will print the currently displayed text. Quitting - Entering ^Q almost anywhere will + Entering Ctrl-Q almost anywhere will close the application. @@ -1605,9 +1620,10 @@ fvwm on startup. The default value is empty, but there is a skeleton style sheet (recoll.qss) inside the /usr/share/recoll/examples - directory. Using a style sheet, you can change most Recoll - graphical parameters: colors, fonts, etc. See the sample - file for a few simple examples. + directory. Using a style sheet, you can change most + recoll graphical parameters: colors, + fonts, etc. See the sample file for a few simple + examples. Maximum text size highlighted for @@ -1847,7 +1863,7 @@ fvwm No more detail will be given about the header part (only useful with the WebKit build), if there are restrictions to what you can do, they are beyond this author's HTML/CSS/Javascript - abilities... There are a few exemples on the + abilities... There are a few examples on the page about customising the result list on the &RCL; web site. @@ -2143,7 +2159,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r potatoes (in any part of the document). An element is composed of an optional field specification, - and a value, separated by a colon. Exemple: + and a value, separated by a colon. Example: Beatles, author:balzac, dc:title:grandet @@ -2180,7 +2196,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r title:prejudice title:pride, and is unlikely to find a result. - Modifiers can be set on a phrase clause, for exemple to specify + Modifiers can be set on a phrase clause, for example to specify a proximity search (unordered). See the modifier section. @@ -2226,7 +2242,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r size for filtering the - results on file size. Exemple: + results on file size. Example: size<10000. You can use <, > or = as operators. You can specify a range like the @@ -2250,7 +2266,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r The days and months parts may be missing. If the / is present but an element is missing, the missing element is interpreted as the lowest or highest date in the - index. Exemples: + index. Examples: 2001-03-01/2002-05-01 the basic syntax for an interval of dates. @@ -2572,7 +2588,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r Subject: for email) when indexing. This is not essential. - You should look to one of the simple filters, for exemple + You should look to one of the simple filters, for example rclps for a starting point. Don't forget to make your filter executable before @@ -3104,7 +3120,7 @@ while query.next >= 0 and query.next < nres: You will only have to check or install supporting applications for the file types that you want to index beyond those that are - natively processed by &RCL; (text, HTML, mail files, and a few + natively processed by &RCL; (text, HTML, email files, and a few others). You should also maybe have a look at the @@ -3276,13 +3292,13 @@ while query.next >= 0 and query.next < nres: Konqueror webarchive format with Python (uses the Tarfile module). - mimehtml web archive format (support based on the mail + mimehtml web archive format (support based on the email filter, which introduces some mild weirdness, but still usable). - Text, HTML, mail folders, and Scribus files are + Text, HTML, email folders, and Scribus files are processed internally. Lyx is used to index Lyx files. Many filters need iconv and the standard sed and awk. @@ -3628,7 +3644,7 @@ skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \ The list in the default configuration does not exclude hidden directories (names beginning with a dot), which means that it may index quite a few things - that you do not want. On the other hand, mail user + that you do not want. On the other hand, email user agents like thunderbird usually store messages in hidden directories, and you probably want this indexed. One possible solution is to @@ -3835,7 +3851,7 @@ skippedPaths = ~/somedir/∗.txt maildefcharset This can be used to define the default - character set specifically for mail messages which don't + character set specifically for email messages which don't specify it. This is mainly useful for readpst (libpst) dumps, which are utf-8 but do not say so. @@ -4098,9 +4114,9 @@ mondelaypatterns = *.log:20 "this one has spaces*:10" filter-specific sections Some filters may need specific - configuration for handling fields. Only the mail message filter + configuration for handling fields. Only the email message filter currently has such a section (named - [mail]). It allows indexing arbitrary mail + [mail]). It allows indexing arbitrary email headers in addition to the ones indexed by default. Other such sections may appear in the future. @@ -4110,9 +4126,9 @@ mondelaypatterns = *.log:20 "this one has spaces*:10" Here follows a small example of a personal fields - file. This would extract a specific mail header and + file. This would extract a specific email header and use it as a searchable field, with data displayable inside result - lists. (Side note: as the mail filter does no decoding on the values, + lists. (Side note: as the email filter does no decoding on the values, only plain ascii headers can be indexed, and only the first occurrence will be used for headers that occur several times).