diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml
index 1574c975..00194db6 100644
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -24,7 +24,7 @@
Dockes
- $Id: usermanual.sgml,v 1.35 2007-01-15 13:03:35 dockes Exp $
+ $Id: usermanual.sgml,v 1.36 2007-01-25 15:47:45 dockes Exp $
This document introduces full text search notions
@@ -178,7 +178,7 @@
is normally incremental: documents will only be processed if
they have been modified. On the first execution, of course, all
documents will need processing. A full index build can be forced
- later on by specifying an option to the indexing command
+ later by specifying an option to the indexing command
(recollindex -z).
&RCL; indexing can be performed with two different
@@ -486,7 +486,7 @@ fvwm
- Search
+ Searching
The recoll program provides the user
interface for searching. It is based on the
@@ -510,19 +510,27 @@ fvwm
- The initial default search mode is Any
- term. This will look for documents with any of the
- search terms (the ones with more terms will get better scores).
- All terms will ensure
- that only documents with all the terms will be
- returned. File name will specifically
- look for file names, and allows using wildcards
- (*, ? ,
- []).
+ The initial default search mode is All
+ terms. This will look for documents containing all
+ of the search terms (the ones with more terms will get better
+ scores). Any term will search for
+ documents where at least one of the terms appears. File
+ name will specifically look for file names.
+
+ The fourth entry (Query Language) is
+ described in its own
+ section.
+
+ All search modes allow wildcards inside terms
+ (*, ?,
+ []). You may want to have a look at the
+ section about wildcards
+ for more information about this.
You can search for exact phrases (adjacent words in a
given order) by enclosing the input inside double quotes. Ex:
"virtual reality".
+
Character case has no influence on search, except that you
can disable stem expansion for any term by capitalizing it. E.g.:
a search for floor will also normally look for
@@ -537,7 +545,7 @@ fvwm
text field). Please note, however, that only the search texts
are remembered, not the mode (all/any/file name).
- Typing EscSpace) while
+ Typing Esc Space while
entering a word in the simple search entry will open a window
with possible completions for the word. The completions are
extracted from the database.
@@ -568,7 +576,10 @@ fvwm
tabs in the existing preview window. You can use
Shift+Click to force the creation of another
preview window, which may be useful to view the documents side
- by side.
+ by side. (You can also browse successive results in a single
+ preview window by typing
+ Shift+ArrowUp/Down in the
+ window.)
Clicking the Edit link will attempt to
start an external viewer. The viewers can be configured through the
@@ -618,9 +629,11 @@ fvwm
The Preview and
Edit entries do the same thing as the
- corresponding links. The two following entries will copy either
- an URL or the file path to the clipboard, for pasting into
- another application.
+ corresponding links.
+
+ The Copy File Name and
+ Copy Url entries copy the relevant data to the
+ clipboard, for later pasting.
The Find similar entry will select
a number of relevant terms from the current document and enter
@@ -628,10 +641,6 @@ fvwm
search, with a good chance of finding documents related to the
current result.
- The Copy File Name and
- Copy Url copy the relevant data to the
- clipboard, for later pasting.
-
The Parent document entry will
appear for documents which are not actually files but are
part of, or attached to, a higher level document. This entry
@@ -653,7 +662,9 @@ fvwm
Preview link inside the result list.
Subsequent preview requests for a given search open new
- tabs in the existing window.
+ tabs in the existing window (unless you hold the
+ Shift key while clicking, which will open a new
+ window for side by side viewing).
Starting another search and requesting a preview will
create a new preview window. The old one stays open until you
@@ -690,12 +701,93 @@ fvwm
+
+ The query language
+
+ The query language processor is activated on the
+ simple search entry when the search mode selector is set to
+ Query Language.
+
+ Here follows a sample request that we are going to
+ explain:
+
+ mime:message/rfc822 author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
+
+
+ This would search for all email messages with
+ John Doe
+ appearing as a phrase in the From: header,
+ and containing either beatles or
+ lennon and either
+ live or
+ unplugged but not
+ potatoes.
+
+ The first element, mime:message/rfc822,
+ is a special switch that restricts the results to email
+ messages. There could be several such switches, which would form
+ a list of allowed types.
+
+ The second element author:"john doe" is
+ a phrase search limited to a specific field. Phrase searches are
+ specified as usual by enclosing the words in double quotes. The
+ field specification appears before the colon. &RCL; currently
+ manages the following fields:
+
+ title,
+ subject or caption are
+ synonyms which specify data to be searched for in the
+ document title or subject.
+
+ author or
+ from for searching the document's originators.
+
+ keyword for searching the
+ keywords specified in the document (few documents actually have any).
+
+
+
+ The query language is currently the only way to use the
+ &RCL; field search capability.
+
+ All elements in the search entry are normally combined
+ with an implicit AND. It is possible to specify that elements be
+ OR'ed instead, as in Beatles
+ OR Lennon. The
+ OR must be entered literally (capitals), and
+ it has priority over the AND associations:
+ word1
+ word2 OR
+ word3
+ means
+ word1 AND
+ (word2 OR
+ word3)
+ not
+ (word1 AND
+ word2) OR
+ word3. Do not enter explicit
+ parentheses: they are not supported for now.
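The precedence rule can be illustrated with a short Python sketch. This is not the &RCL; parser: the function names group_query and render are hypothetical, and the only assumption is the rule stated in this section, namely that a literal OR binds more tightly than the implicit AND.

```python
import shlex


def group_query(query):
    """Group query elements so that OR binds tighter than the implicit AND.

    Returns a list of OR-groups; the groups themselves are ANDed together.
    Illustration only: this is not how Recoll parses queries internally.
    """
    # shlex keeps a "quoted phrase" together as one token (quotes are dropped)
    tokens = shlex.split(query)
    groups = []
    join_next = False  # True right after a literal, capitalized OR
    for tok in tokens:
        if tok == "OR":
            # A leading OR has nothing to attach to, so it is ignored
            join_next = bool(groups)
            continue
        if join_next:
            groups[-1].append(tok)
        else:
            groups.append([tok])
        join_next = False
    return groups


def render(groups):
    """Show the grouping with the parentheses the user must NOT type."""
    parts = []
    for group in groups:
        if len(group) > 1:
            parts.append("(" + " OR ".join(group) + ")")
        else:
            parts.append(group[0])
    return " AND ".join(parts)


print(render(group_query("word1 word2 OR word3")))
# prints: word1 AND (word2 OR word3)
```

The rendered parentheses are only a reading aid: they show how the elements are associated and must not be typed in the search entry.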
+
+ An entry preceded by a - specifies a
+ term that should not appear.
+
+ Words inside phrases and capitalized words are not
+ stem-expanded. Wildcards may be used anywhere.
+
+ You can use the show query link at the
+ top of the result list to check the exact query which was
+ finally executed by Xapian.
+
+
+
Complex/advanced search
- The advanced search dialog has fields that will allow a more
- refined search. It has a number of entry fields, each of which
- is configurable for the following modes:
+ The advanced search dialog has a number of fields that
+ will allow a more refined search. Each entry field is
+ configurable for the following modes:
+
All terms.
@@ -712,16 +804,17 @@ fvwm
Filename search with wildcards.
-
+
Additional entry fields can be created by clicking the
Add clause button.
- All relevant fields will be combined by an implicit AND
- or OR conjunction. All types of clauses except "phrase" and
- "near" can accept a mix of single words and phrases enclosed
- in double quotes. Stemming expansion will be performed for all
- terms not beginning with a capital letter, except for "phrase"
- clauses.
+ You can choose whether the relevant fields are combined
+ by an AND or an OR conjunction. All types of clauses
+ except "phrase" and "near" can accept a mix of single words and
+ phrases enclosed in double quotes. Stemming expansion will be
+ performed for all terms not beginning with a capital letter,
+ except for terms inside "phrase" clauses. Wildcards will be
+ processed everywhere.
Advanced search will also let you search for documents of
specific mime types (ie: only text/plain, or
@@ -764,18 +857,26 @@ fvwm
Wildcard
In this mode of operation, you can enter a
- search string with shell-like wildcards (*, ?). ie:
- xapi* .
+ search string with shell-like wildcards (*, ?, []). E.g.
+ xapi* would display all index terms
+ beginning with xapi. (More
+ about wildcards here.)
Regular expression
This mode will accept a regular expression
as input. Example:
- word[0-9]+ . The regular
- expression is anchored by enclosing in
- ^ and $ before
- execution.
+ word[0-9]+. The expression is
+ implicitly anchored at the beginning. E.g.
+ press will match
+ pression but not
+ expression. You can use
+ .*press to match the latter,
+ but be aware that this will cause a full index term list
+ scan, which can take quite a long time.
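The effect of anchoring at the beginning can be tried out in Python, whose re.match also matches only from the start of the string. This is a sketch under assumptions: the term list is made up, matching_terms is a hypothetical helper, and the regular expression dialect used by the term explorer may differ from Python's in details.

```python
import re


def matching_terms(pattern, terms):
    """Return the terms matched by a pattern anchored at the beginning.

    re.match() only matches at the start of the string, which mimics the
    implicit anchoring described above. Illustration only.
    """
    rx = re.compile(pattern)
    return [term for term in terms if rx.match(term)]


# Hypothetical index terms, for illustration only
terms = ["press", "pression", "expression", "word42", "word"]

print(matching_terms("press", terms))       # ['press', 'pression']
print(matching_terms(".*press", terms))     # ['press', 'pression', 'expression']
print(matching_terms("word[0-9]+", terms))  # ['word42']
```

A pattern starting with .* can match at any position, so no prefix is available to narrow the candidates and every term has to be tested.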
+
@@ -815,6 +916,53 @@ fvwm
+
+ More about wildcards
+ All words entered in &RCL; search fields will be processed
+ for wildcard expansion before the request is finally
+ executed.
+
+ The wildcard characters are:
+
+
+ * which matches 0 or more
+ characters.
+
+ ? which matches
+ a single character.
+
+ [] which allows
+ defining sets of characters to be matched (ex:
+ [abc]
+ matches a single character which may be 'a' or 'b' or 'c',
+ [0-9]
+ matches any single digit).
+
+
+
+ You should be aware of a few things before using
+ wildcards.
+
+
+ Using a wildcard character at the beginning of
+ a word can make for a slow search because &RCL; will have to
+ scan the whole index term list to find the matches.
+
+ Using a * at the end of a
+ word can produce more matches than you would think, and
+ strange search results. You can use the term explorer tool to
+ check what completions exist for a given term. You can also
+ see exactly what search was performed by clicking on the link
+ at the top of the result list. In general, for natural
+ language terms, stem expansion will produce better results
+ than an ending * (stem expansion is turned
+ off when any wildcard character appears in the term).
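Python's fnmatch module implements shell-style wildcards with comparable semantics, so it can give a feel for what a pattern will expand to. This is only a sketch: the term list is invented, expand is a hypothetical helper, and the actual expansion is done by &RCL; against the real index term list.

```python
from fnmatch import fnmatchcase


def expand(pattern, terms):
    """Return the terms matching a shell-style wildcard pattern.

    Every term is tested in turn, which is the kind of full scan that a
    leading wildcard makes unavoidable. Illustration only.
    """
    return [term for term in terms if fnmatchcase(term, pattern)]


# Hypothetical index terms, for illustration only
terms = ["xapi", "xapian", "xapians", "floor", "floors", "flooring",
         "word1", "word42"]

print(expand("xapi*", terms))      # ['xapi', 'xapian', 'xapians']
print(expand("floor?", terms))     # ['floors']
print(expand("word[0-9]", terms))  # ['word1']
print(expand("*ing", terms))       # ['flooring']
```

Note that xapi* also matches xapi itself, since * matches zero or more characters, and that word[0-9] does not match word42 because a bracket set stands for exactly one character.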
+
+
+
+
+
Multiple databases
@@ -861,14 +1009,14 @@ fvwm
A typical usage scenario for the multiple index feature
would be for a system administrator to set up a central index
- for shared data, that you may choose to search, or not, in
- addition to your personal data. Of course, there are other
+ for shared data, which you can choose to search, or not, in addition to
+ your personal data. Of course, there are other
possibilities. There are many cases where you know the subset of
- files that you want to be searched for a given query, and where
- restricting the query will much improve the precision of the
- results. This can also be performed with the directory filter in
- advanced search, but multiple indexes will have much better
- performance and may be worth the trouble.
+ files that should be searched, and where narrowing the search
+ can improve the results. You can achieve approximately the same
+ effect with the directory filter in advanced search, but
+ multiple indexes will have much better performance and may be
+ worth the trouble.
@@ -1167,10 +1315,10 @@ fvwm
/usr/local/recollglobal/xapiandb).
Once entered, the indexes will appear in the
- All indexes list, and you can
- chose which ones you want to use at any moment by transferring
- them to/from the Active indexes
- list.
+ External indexes list, and you can
+ choose which ones you want to use at any moment by checking or
+ unchecking their entries.
+
Your main database (the one the current configuration
indexes to), is always implicitly active. If this is not
desirable, you can set up your configuration so that it indexes,
@@ -1292,8 +1440,11 @@ fvwm
- Text, HTML, mail folders and Openoffice files are
- processed internally.
+ Text, HTML, mail folders, Openoffice and Scribus files
+ are processed internally. Lyx is used to index Lyx files. Many
+ filters need sed and awk.
+
+