diff --git a/src/doc/user/usermanual.html b/src/doc/user/usermanual.html index 043827d3..b5f6d94a 100644 --- a/src/doc/user/usermanual.html +++ b/src/doc/user/usermanual.html @@ -257,20 +257,25 @@ alink="#0000FF">
3.5.1. General + syntax
+
3.5.2. Special field-like + specifiers
+
3.5.3. Range clauses
-
3.5.2. 3.5.4. Modifiers
3.6. Anchored searches and - wildcards
+ "#RCL.SEARCH.ANCHORWILD">Wildcards and anchored + searches
3.6.1. More about - wildcards
+ "#RCL.SEARCH.WILDCARDS">Wildcards
3.6.2. Anchored searches
@@ -423,7 +428,7 @@ alink="#0000FF">

List of Tables

-
3.1. Keyboard shortcuts
+
3.1. Keyboard shortcuts
@@ -2133,22 +2138,23 @@ metadatacmds = ; tags:some/alternate/values or tags:all,these,values (the - compact field search syntax is supported for recoll 1.20 - and later. For older versions, you would need to repeat the - tags: specifier - for each term, e.g. tags:all,these,values. The + compact comma- or slash-based field search syntax is + supported for recoll 1.20 and later. For older versions, + you would need to repeat the tags: specifier for each + term, e.g. tags:some OR tags:alternate).

+ "replaceable">tags:alternate.

Tags changes will not be detected by the indexer if the file itself did not change. One possible workaround would be to update the file ctime when you modify the tags, which would be consistent with how extended attributes function. A pair of chmod commands could - accomplish this, or a touch -a - . Alternatively, just couple the tag update with a + accomplish this, or a touch + -a. Alternatively, just couple the tag update with a recollindex -e -i /path/to/the/file.

@@ -2771,11 +2777,16 @@ fs.inotify.max_user_watches=32768 documents containing all your input terms.

  • -

    Query Language mode - behaves like All Terms - in the absence of special input, but it can also do - much more. This is the best mode for getting the most - of Recoll.

    +

    The Query Language + mode behaves like All + Terms in the absence of special input, but it + can also do much more. This is the best mode for + getting the most of Recoll. It is usable from all + possible interfaces (GUI, command line, WEB UI, ...), + and is described + here.

  • In Any Term mode, @@ -2906,8 +2917,8 @@ fs.inotify.max_user_watches=32768 ?, []). See the section about - wildcards for more details.

    + "3.6.1. Wildcards">section about wildcards for + more details.

    In all modes except File name, you can search for exact phrases (adjacent words in a given order) by enclosing the input inside @@ -2964,9 +2975,9 @@ fs.inotify.max_user_watches=32768 complex searches.

    The File name search mode will specifically look for file names. The point of - having a separate file name search is that wild card + having a separate file name search is that wildcard expansion can be performed more efficiently on a small - subset of the index (allowing wild cards on the left of + subset of the index (allowing wildcards on the left of terms without excessive cost). Things to know:

    • -

      An entry without any wild card character and not +

      An entry without any wildcard character and not capitalized will be prepended and appended with '*' (ie: etc -> xapi. (More about wildcards here ).

      + "3.6.1. Wildcards">here ).

  • Regular expression
    @@ -4064,7 +4075,7 @@ fs.inotify.max_user_watches=32768 given context (e.g. within a preview window, within the result table).

    - +

    Table 3.1. Keyboard shortcuts

    @@ -4291,8 +4302,7 @@ fs.inotify.max_user_watches=32768

    Wildcards. Wildcards can be used inside search terms in all forms of searches. More about - wildcards.

    + "3.6.1. Wildcards">More about wildcards.

    Automatic suffixes. Words like odt or ods can be automatically turned into @@ -4361,7 +4371,7 @@ fs.inotify.max_user_watches=32768 Example: "user manual"p would also match "manual user". Also see the modifier section from + "3.5.4. Modifiers">the modifier section from the query language documentation.

    AutoPhrases. This option can be set in the preferences dialog. If it is set, a phrase will be @@ -5213,389 +5223,447 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r

    +

    The Recoll query + language was based on the now defunct Xesam user search language specification. + It allows defining general boolean searches within the main + body text or specific fields, and has many additional + features, broadly equivalent to those provided by the + complex search + interface in the GUI.

    The query language processor is activated in the GUI simple search entry when the search mode selector is set to - Query Language. It can also - be used with the KIO slave or the command line search. It - broadly has the same capabilities as the complex search - interface in the GUI.

    -

    The language was based on the now defunct Xesam user search language - specification.

    + Query Language. It can also be + used from the command line search, the KIO slave, or the + WEB UI.

    If the results of a query language search puzzle you and you doubt what has been actually searched for, you can use the GUI Show Query link at the top of the result list to check the exact query which was finally executed by Xapian.

    -

    Here follows a sample request that we are going to - explain:

    -
    +        
    +
    +
    +
    +

    3.5.1. General + syntax

    +
    +
    +
    +

    Here follows a sample request that we are going to + explain:

    +
             author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes
           
    -

    This would search for all documents with John Doe appearing as a - phrase in the author field (exactly what this is would - depend on the document type, ie: the From: header, for an email message), and - containing either beatles or lennon and either - live or - unplugged but not - potatoes (in any - part of the document).

    -

    An element is composed of an optional field - specification, and a value, separated by a colon (the field - separator is the last colon in the element). Examples: - Eugenie, - author:balzac, - dc:title:grandet - dc:title:"eugenie - grandet"

    -

    The colon, if present, means "contains". Xesam defines - other relations, which are mostly unsupported for now - (except in special cases, described further down).

    -

    All elements in the search entry are normally combined - with an implicit AND. It is possible to specify that - elements be OR'ed instead, as in Beatles OR Lennon. The OR must be entered literally (capitals), - and it has priority over the AND associations: word1 word2 OR word3 means word1 AND (word2 OR word3) not (word1 AND word2) OR word3.

    -

    Recoll versions 1.21 - and later, allow using parentheses to group elements, which - will sometimes make things clearer, and may allow - expressing combinations which would have been difficult - otherwise.

    -

    An element preceded by a - - specifies a term that should not appear.

    -

    As usual, words inside quotes define a phrase (the order - of words is significant), so that title:"prejudice pride" is - not the same as title:prejudice - title:pride, and is unlikely to find a - result.

    -

    Words inside phrases and capitalized words are not - stem-expanded. Wildcards may be used anywhere inside a - term. Specifying a wild-card on the left of a term can - produce a very slow search (or even an incorrect one if the - expansion is truncated because of excessive size). Also see - More about - wildcards.

    -

    To save you some typing, recent Recoll versions (1.20 and later) - interpret a comma-separated list of terms for a field as an - AND list inside the field. Use slash characters ('/') for - an OR list. No white space is allowed. So

    -
    author:john,lennon
    -

    will search for documents with john and lennon inside the author field (in any order), and

    -
    author:john/ringo
    -

    would search for john or - ringo. This behaviour only - happens for field queries (input without a field, comma- or - slash- separated input will produce a phrase search). You - can use a text field name to - search the main text this way.

    -

    Modifiers can be set on a double-quote value, for - example to specify a proximity search (unordered). See - the modifier section. No space - must separate the final double-quote and the modifiers - value, e.g. "two - one"po10

    -

    Recoll currently - manages the following default fields:

    -
    -
      -
    • -

      title, subject or caption are synonyms which specify - data to be searched for in the document title or - subject.

      -
    • -
    • -

      author or - from for searching the - documents originators.

      -
    • -
    • -

      recipient or - to for searching the - documents recipients.

      -
    • -
    • -

      keyword for searching - the document-specified keywords (few documents - actually have any).

      -
    • -
    • -

      filename for the - document's file name. This is not necessarily set for - all documents: internal documents contained inside a - compound one (for example an EPUB section) do not - inherit the container file name any more, this was - replaced by an explicit field (see next). - Sub-documents can still have a specific filename, if it is implied by the - document format, for example the attachment file name - for an email attachment.

      -
    • -
    • -

      containerfilename. - This is set for all documents, both top-level and - contained sub-documents, and is always the name of - the filesystem directory entry which contains the - data. The terms from this field can only be matched - by an explicit field specification (as opposed to - terms from filename - which are also indexed as general document content). - This avoids getting matches for all the sub-documents - when searching for the container file name.

      -
    • -
    • -

      ext specifies the - file name extension (Ex: ext:html).

      -
    • -
    • -

      rclmd5 the MD5 - checksum for the document. This is used for - displaying the duplicates of a search result (when - querying with the option to collapse duplicate - results). Incidentally, this could be used to find - the duplicates of any given file by computing its MD5 - checksum and executing a query with just the - rclmd5 value.

      -
    • -
    +

    This would search for all documents with John Doe appearing as a + phrase in the author field (exactly what this is would + depend on the document type, e.g. the From: header for an email message), and + containing either beatles or lennon and either + live or + unplugged but + not potatoes + (in any part of the document).

    +

    An element is composed of an optional field + specification, and a value, separated by a colon (the + field separator is the last colon in the element). + Examples:

    +
    +
      +
    • Eugenie
    • +
    • author:balzac
    • +
    • dc:title:grandet
    • +
    • dc:title:"eugenie + grandet"
    • +
    +
    +

    The colon, if present, means "contains". Xesam defines + other relations, which are mostly unsupported for now + (except in special cases, described further down).

    +

    All elements in the search entry are normally combined + with an implicit AND. It is possible to specify that + elements be OR'ed instead, as in Beatles OR Lennon. The OR must be entered literally (capitals), + and it has priority over the AND associations: word1 word2 OR word3 means word1 AND (word2 OR word3) not (word1 AND word2) OR word3.
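The grouping rule above can be illustrated with a minimal Python sketch. It is not Recoll's parser: the function name `group_terms` is hypothetical, and the sketch assumes plain whitespace-separated terms (no quoted phrases, parentheses or `-` exclusions). It only shows that a literal, capitalized `OR` binds more tightly than the implicit AND.

```python
def group_terms(query: str) -> str:
    """Make the implicit AND / explicit OR grouping of a simple query visible.

    Illustrative only: terms are split on white space, so quoted phrases,
    parentheses and '-' exclusions are not handled.
    """
    groups = []         # each inner list holds terms joined by OR
    pending_or = False  # True right after a literal "OR" token
    for tok in query.split():
        if tok == "OR":  # only the capitalized form is an operator
            pending_or = True
            continue
        if pending_or and groups:
            groups[-1].append(tok)  # attach to the previous term's OR group
        else:
            groups.append([tok])    # start a new AND-ed group
        pending_or = False
    parts = [g[0] if len(g) == 1 else "(" + " OR ".join(g) + ")" for g in groups]
    return " AND ".join(parts)


print(group_terms("word1 word2 OR word3"))
print(group_terms("Beatles OR Lennon Live OR Unplugged"))
```

The first call prints `word1 AND (word2 OR word3)`, which matches the interpretation given in the text.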

    +

    You can use parentheses to group elements (from + version 1.21), which will sometimes make things clearer, + and may allow expressing combinations which would have + been difficult otherwise.

    +

    An element preceded by a - specifies a term that should + not appear.

    +

    As usual, words inside quotes define a phrase (the + order of words is significant), so that title:"prejudice pride" + is not the same as title:prejudice + title:pride, and is unlikely to find a + result.

    +

    Words inside phrases and capitalized words are not + stem-expanded. Wildcards may be used anywhere inside a + term. Specifying a wildcard on the left of a term can + produce a very slow search (or even an incorrect one if + the expansion is truncated because of excessive size). + Also see More about + wildcards.

    +

    To save you some typing, Recoll versions 1.20 and later + interpret a field value given as a comma-separated list + of terms as an AND list and a slash-separated list as an + OR list. No white space is allowed. So

    +
    author:john,lennon
    +

    will search for documents with john and lennon inside the author field (in any order), and

    +
    author:john/ringo
    +

    would search for john or + ringo. This behaviour is + only triggered by a field prefix: without it, comma- or + slash-separated input will produce a phrase search. + However, you can use a text + field name to search the main text this way, as an + alternative to using an explicit OR, e.g. text:napoleon/bonaparte would generate a + search for napoleon or bonaparte in the main + text body.
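As a rough illustration of the compact list syntax, the following Python sketch rewrites a field element into the longer form that older versions require (the field specifier repeated for each term). The function name `expand_field_list` is hypothetical and this is not Recoll code. The sketch does not know about the special field-like specifiers, and it rejects values mixing commas and slashes because the text does not describe that case.

```python
def expand_field_list(element: str) -> str:
    """Expand 'field:a,b' to an AND list and 'field:a/b' to an OR list.

    Illustrative only. The field separator is taken to be the last colon
    in the element, as stated in the manual.
    """
    field, sep, value = element.rpartition(":")
    if not sep:
        return element  # no field prefix: the shortcut does not apply
    if "," in value and "/" in value:
        raise ValueError("mixed comma/slash lists are not covered by this sketch")
    if "," in value:
        # comma-separated terms are AND-ed (implicit AND between elements)
        return " ".join(f"{field}:{term}" for term in value.split(","))
    if "/" in value:
        # slash-separated terms are OR-ed
        return " OR ".join(f"{field}:{term}" for term in value.split("/"))
    return element


print(expand_field_list("author:john,lennon"))
print(expand_field_list("text:napoleon/bonaparte"))
```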

    +

    Modifiers can be set on a double-quote value, for + example to specify a proximity search (unordered). See + the modifier section. No + space must separate the final double-quote and the + modifiers value, e.g. "two + one"po10

    +

    Recoll currently + manages the following default fields:

    +
    +
      +
    • +

      title, subject or caption are synonyms which specify + data to be searched for in the document title or + subject.

      +
    • +
    • +

      author or + from for searching the + document's originators.

      +
    • +
    • +

      recipient or + to for searching the + document's recipients.

      +
    • +
    • +

      keyword for + searching the document-specified keywords (few + documents actually have any).

      +
    • +
    • +

      filename for the + document's file name. You can use the shorter + fn alias. This value + is not set for all documents: internal documents + contained inside a compound one (for example an + EPUB section) do not inherit the container file + name any more, this was replaced by an explicit + field (see next). Sub-documents can still have a + filename, if it is + implied by the document format, for example the + attachment file name for an email attachment.

      +
    • +
    • +

      containerfilename, + aliased as cfn. This + is set for all documents, both top-level and + contained sub-documents, and is always the name of + the filesystem file which contains the data. The + terms from this field can only be matched by an + explicit field specification (as opposed to terms + from filename which + are also indexed as general document content). This + avoids getting matches for all the sub-documents + when searching for the container file name.

      +
    • +
    • +

      ext specifies the + file name extension (Ex: ext:html).

      +
    • +
    • +

      rclmd5 the MD5 + checksum for the document. This is used for + displaying the duplicates of a search result (when + querying with the option to collapse duplicate + results). Incidentally, this could be used to find + the duplicates of any given file by computing its + MD5 checksum and executing a query with just the + rclmd5 value.

      +
    • +
    +
    +

    You can define aliases for field names, in order to + use your preferred denomination or to save typing (e.g. + the predefined fn and + cfn aliases for + filename and containerfilename). See the section about the + fields file.

    +

    The document input handlers have the possibility to + create other fields with arbitrary names, and aliases may + be defined in the configuration, so that the exact field + search possibilities may be different for you if someone + took care of the customisation.

    -

    Recoll 1.20 and later - have a way to specify aliases for the field names, which - will save typing, for example by aliasing filename to fn or containerfilename to cfn. See the section about the - fields file.

    -

    The document input handlers used while indexing have the - possibility to create other fields with arbitrary names, - and aliases may be defined in the configuration, so that - the exact field search possibilities may be different for - you if someone took care of the customisation.

    -

    The field syntax also supports a few field-like, but - special, criteria:

    -
    -
      -
    • -

      dir for filtering the - results on file location (Ex: dir:/home/me/somedir). -dir also works to find results not - in the specified directory (release >= 1.15.8). - Tilde expansion will be performed as usual (except - for a bug in versions 1.19 to 1.19.11p1). Wildcards - will be expanded, but please have a look at an - important limitation of wildcards in path - filters.

      -

      Relative paths also make sense, for example, - dir:share/doc would - match either /usr/share/doc or /usr/local/share/doc

      -

      Several dir clauses - can be specified, both positive and negative. For - example the following makes sense:

      -
      -          dir:recoll dir:src -dir:utils -dir:common
      -          
      -

      This would select results which have both - recoll and src in the path (in any order), and - which have not either utils or common.

      -

      You can also use OR - conjunctions with dir: - clauses.

      -

      A special aspect of dir clauses is that the values in - the index are not transcoded to UTF-8, and never - lower-cased or unaccented, but stored as binary. This - means that you need to enter the values in the exact - lower or upper case, and that searches for names with - diacritics may sometimes be impossible because of - character set conversion issues. Non-ASCII UNIX file - paths are an unending source of trouble and are best - avoided.

      -

      You need to use double-quotes around the path - value if it contains space characters.

      -
    • -
    • -

      size for filtering - the results on file size. Example: size<10000. You can use - <, > or = as operators. You can specify a - range like the following: size>100 size<1000. The usual - k/K, m/M, g/G, t/T can - be used as (decimal) multipliers. Ex: size>1k to search for files - bigger than 1000 bytes.

      -
    • -
    • -

      date for searching or - filtering on dates. The syntax for the argument is - based on the ISO8601 standard for dates and time - intervals. Only dates are supported, no times. The - general syntax is 2 elements separated by a - / character. Each - element can be a date or a period of time. Periods - are specified as PnYnMnD. The n numbers are the - respective numbers of years, months or days, any of - which may be missing. Dates are specified as - YYYY-MM-DD. The days and - months parts may be missing. If the / is present but an element is - missing, the missing element is interpreted as the - lowest or highest date in the index. Examples:

      -
      -
        -
      • -

        2001-03-01/2002-05-01 the - basic syntax for an interval of dates.

        -
      • -
      • -

        2001-03-01/P1Y2M the same - specified with a period.

        -
      • -
      • -

        2001/ from the - beginning of 2001 to the latest date in the - index.

        -
      • -
      • -

        2001 the whole - year of 2001

        -
      • -
      • -

        P2D/ means 2 - days ago up to now if there are no documents - with dates in the future.

        -
      • -
      • -

        /2003 all - documents from 2003 or older.

        -
      • -
      +
      +
      +
      +
      +

      3.5.2. Special + field-like specifiers

      -

      Periods can also be specified with small letters - (ie: p2y).

      -
    • -
    • -

      mime or format for specifying the MIME type. - These clauses are processed besides the normal - Boolean logic of the search. Multiple values will be - OR'ed (instead of the normal AND). You can specify - types to be excluded, with the usual -, and use wildcards. Example: - mime:text/* - -mime:text/plain Specifying an explicit - boolean operator before a mime specification is not supported - and will produce strange results.

      -
    • -
    • -

      type or rclcat for specifying the category - (as in text/media/presentation/etc.). The - classification of MIME types in categories is defined - in the Recoll - configuration (mimeconf), and can be modified or - extended. The default category names are those which - permit filtering results in the main GUI screen. - Categories are OR'ed like MIME types above, and can - be negated with -.

      -
    • -
    • -

      issub for specifying - that only standalone (issub:0) or only embedded - (issub:1) documents - should be returned as results.

      -
    • -
    -
    -
    -

    Note

    -

    mime, rclcat, size, issub - and date criteria always - affect the whole query (they are applied as a final - filter), even if set with other terms inside a - parenthese.

    -
    -
    -

    Note

    -

    mime (or the equivalent - rclcat) is the only field with an - OR default. You do need to - use OR with ext terms for example.

    +
    +
    +

    The field syntax also supports a few field-like, but + special, criteria, for which the values are interpreted + differently. Regular processing does not apply (for + example the slash- or comma-separated lists don't work). + A list follows.

    +
    +
      +
    • +

      dir + for filtering the results on file location. For + example, dir:/home/me/somedir will restrict + the search to results found anywhere under the + /home/me/somedir + directory (including subdirectories).

      +

      Tilde expansion will be performed as usual. + Wildcards will be expanded, but please have a look at + an important limitation of wildcards in path + filters.

      +

      You can also use relative paths. For example, + dir:share/doc would + match either /usr/share/doc or /usr/local/share/doc.

      +

      -dir will find + results not + in the specified location.

      +

      Several dir clauses + can be specified, both positive and negative. For + example the following makes sense:

      +
      dir:recoll dir:src -dir:utils -dir:common
      +

      This would select results which have both + recoll and + src in the path (in + any order), and which have neither utils nor common.

      +

      You can also use OR + conjunctions with dir: + clauses.

      +

      A special aspect of dir clauses is that the values in + the index are not transcoded to UTF-8, and never + lower-cased or unaccented, but stored as binary. + This means that you need to enter the values in the + exact lower or upper case, and that searches for + names with diacritics may sometimes be impossible + because of character set conversion issues. + Non-ASCII UNIX file paths are an unending source of + trouble and are best avoided.

      +

      You need to use double-quotes around the path + value if it contains space characters.

      +

      The shortcut syntax to define OR or AND lists + within fields with commas or slash characters is + not available.

      +
    • +
    • +

      size for filtering + the results on file size. Example: size<10000. You can use + <, > or = as operators. You can specify a + range like the following: size>100 size<1000. The + usual k/K, m/M, g/G, + t/T can be used as (decimal) multipliers. + Ex: size>1k to + search for files bigger than 1000 bytes.

      +
    • +
    • +

      date for searching + or filtering on dates. The syntax for the argument + is based on the ISO8601 standard for dates and time + intervals. Only dates are supported, no times. The + general syntax is 2 elements separated by a + / character. Each + element can be a date or a period of time. Periods + are specified as PnYnMnD. The n numbers are the + respective numbers of years, months or days, any of + which may be missing. Dates are specified as + YYYY-MM-DD. The days and + months parts may be missing. If the / is present but an element is + missing, the missing element is interpreted as the + lowest or highest date in the index. Examples:

      +
      +
        +
      • +

        2001-03-01/2002-05-01 the + basic syntax for an interval of dates.

        +
      • +
      • +

        2001-03-01/P1Y2M the same + specified with a period.

        +
      • +
      • +

        2001/ from + the beginning of 2001 to the latest date in + the index.

        +
      • +
      • +

        2001 the + whole year of 2001

        +
      • +
      • +

        P2D/ means 2 + days ago up to now if there are no documents + with dates in the future.

        +
      • +
      • +

        /2003 all + documents from 2003 or older.

        +
      • +
      +
      +

      Periods can also be specified with lowercase letters + (e.g. p2y).

      +
    • +
    • +

      mime or + format for specifying + the MIME type. These clauses are processed apart + from the normal Boolean logic of the search: + multiple values will be OR'ed (instead of the + normal AND). You can specify types to be excluded, + with the usual -, and + use wildcards. Example: mime:text/* + -mime:text/plain. Specifying an + explicit boolean operator before a mime specification is not + supported and will produce strange results.

      +
    • +
    • +

      type or + rclcat for specifying + the category (as in text/media/presentation/etc.). + The classification of MIME types in categories is + defined in the Recoll configuration + (mimeconf), and can + be modified or extended. The default category names + are those which permit filtering results in the + main GUI screen. Categories are OR'ed like MIME + types above, and can be negated with -.

      +
    • +
    • +

      issub for + specifying that only standalone (issub:0) or only embedded + (issub:1) documents + should be returned as results.

      +
    • +
    +
    +
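The size clause described in the list above can be sketched in Python as follows. This is an illustration, not Recoll's implementation: the function name `parse_size_clause` and the regular expression are assumptions. The only behaviour taken from the text is the three operators and the decimal k/m/g/t multipliers.

```python
import re

# Decimal multipliers, as stated in the manual (1k is 1000 bytes, not 1024).
MULTIPLIERS = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}

# Hypothetical pattern for a single size clause such as "size>1k".
SIZE_RE = re.compile(r"^size([<>=])(\d+)([kKmMgGtT]?)$")


def parse_size_clause(clause: str) -> tuple[str, int]:
    """Return (operator, size in bytes) for one size clause."""
    match = SIZE_RE.match(clause)
    if match is None:
        raise ValueError(f"not a size clause: {clause!r}")
    operator, digits, suffix = match.groups()
    return operator, int(digits) * MULTIPLIERS[suffix.lower()]


print(parse_size_clause("size>1k"))
print(parse_size_clause("size<10000"))
```

A range such as `size>100 size<1000` is simply two clauses, each of which can be parsed separately.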
    +

    Note

    +

    mime, rclcat, size, issub and date criteria always affect the whole + query (they are applied as a final filter), even if set + with other terms inside parentheses.

    +
    +
    +

    Note

    +

    mime (or the equivalent + rclcat) is the + only field with + an OR default. You do need + to use OR with + ext terms for example.

    +
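The date argument syntax described in the list of special specifiers can also be made concrete with a small Python sketch. It only classifies each element as a date, a period, or a missing element. It does not compute the resulting interval, because how the bounds are resolved against the index is not something this sketch can know. The names `parse_date_clause` and `parse_element` are hypothetical.

```python
import re

# YYYY, YYYY-MM or YYYY-MM-DD (the days and months parts may be missing).
DATE_RE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")
# PnYnMnD, any component optional, upper or lower case letters accepted.
PERIOD_RE = re.compile(r"^[Pp](?:(\d+)[Yy])?(?:(\d+)[Mm])?(?:(\d+)[Dd])?$")


def parse_element(text: str):
    """Classify one element; an empty string means 'missing element'."""
    if text == "":
        return None
    date = DATE_RE.match(text)
    if date:
        year, month, day = date.groups()
        return ("date", (int(year),
                         int(month) if month else None,
                         int(day) if day else None))
    period = PERIOD_RE.match(text)
    if period and any(period.groups()):
        years, months, days = (int(g) if g else 0 for g in period.groups())
        return ("period", (years, months, days))
    raise ValueError(f"neither a date nor a period: {text!r}")


def parse_date_clause(value: str) -> list:
    """Split the value of a date: clause on '/' and classify each element."""
    parts = value.split("/")
    if len(parts) > 2:
        raise ValueError("at most two elements are allowed")
    return [parse_element(part) for part in parts]


print(parse_date_clause("2001-03-01/P1Y2M"))
print(parse_date_clause("P2D/"))
```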

    3.5.1. Range + id="RCL.SEARCH.LANG.RANGES">3.5.3. Range clauses

    @@ -5634,7 +5702,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r

    3.5.2. Modifiers

    + "RCL.SEARCH.LANG.MODIFIERS">3.5.4. Modifiers
    @@ -5698,8 +5766,8 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r

    3.6. Anchored - searches and wildcards

    + "RCL.SEARCH.ANCHORWILD">3.6. Wildcards and + anchored searches
    @@ -5714,8 +5782,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r

    3.6.1. More - about wildcards

    + id="RCL.SEARCH.WILDCARDS">3.6.1. Wildcards
    diff --git a/src/doc/user/usermanual.xml b/src/doc/user/usermanual.xml index e3f49673..c8b801e5 100644 --- a/src/doc/user/usermanual.xml +++ b/src/doc/user/usermanual.xml @@ -1399,26 +1399,22 @@ metadatacmds = ; tags = tmsu tags %f extend the field configuration. - Once re-indexing is performed (you will need to force the file - reindexing, &RCL; will not detect the need by itself), you will be - able to search from the query language, through any of its aliases: - tags:some/alternate/values or - tags:all,these,values (the compact field search - syntax is supported for recoll 1.20 and later. For older versions, - you would need to repeat the tags: - specifier for each term, e.g. tags:some - OR - tags:alternate). + Once re-indexing is performed (you will need to force the file reindexing, &RCL; will + not detect the need by itself), you will be able to search from the query language, through + any of its aliases: tags:some/alternate/values + or tags:all,these,values. The compact comma- or slash-based field + search syntax is supported for recoll 1.20 and later. For older versions, you would need to + repeat the tags: specifier for each term, + e.g. tags:some OR + tags:alternate. - Tags changes will not be detected by - the indexer if the file itself did not change. One possible - workaround would be to update the file ctime when - you modify the tags, which - would be consistent with how extended attributes function. A pair of - chmod commands could accomplish this, or a - touch -a . Alternatively, just - couple the tag update with a - recollindex -e -i /path/to/the/file. + Tags changes will not be detected by the indexer if the file itself did not change. One + possible workaround would be to update the file ctime when you modify the + tags, which would be consistent with how extended attributes function. A pair + of chmod commands could accomplish this, or a + touch -a. + Alternatively, just couple the tag update with a + recollindex -e -i /path/to/the/file. 
@@ -1918,11 +1914,12 @@ fs.inotify.max_user_watches=32768 In All Terms mode, &RCL; looks for documents containing all your input terms. - Query Language mode behaves like - All Terms in the absence of special input, but - it can also do much more. This is the best mode for getting the - most of &RCL;. + The Query Language mode behaves like All + Terms in the absence of special input, but it can also do much more. This is the + best mode for getting the most of &RCL;. It is usable from all possible interfaces (GUI, + command line, WEB UI, ...), and is described + here. In Any Term mode, &RCL; looks for documents containing any your input terms, preferring those @@ -2067,8 +2064,8 @@ fs.inotify.max_user_watches=32768 The File name search mode will specifically look for file names. The point of having a separate - file name search is that wild card expansion can be performed more - efficiently on a small subset of the index (allowing wild cards on + file name search is that wildcard expansion can be performed more + efficiently on a small subset of the index (allowing wildcards on the left of terms without excessive cost). Things to know: White space in the entry should match white @@ -2077,7 +2074,7 @@ fs.inotify.max_user_watches=32768 The search is insensitive to character case and accents, independently of the type of index. - An entry without any wild card + An entry without any wildcard character and not capitalized will be prepended and appended with '*' (ie: etc -> *etc*, but @@ -3940,24 +3937,26 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r The query language - The query language processor is activated in the GUI - simple search entry when the search mode selector is set to - Query Language. It can also be used with the KIO - slave or the command line search. It broadly has the same - capabilities as the complex search interface in the - GUI. 
+ The &RCL; query language was based on the now defunct + + Xesam user search language specification. It allows defining general boolean + searches within the main body text or specific fields, and has many additional features, + broadly equivalent to those provided by the complex search interface in the + GUI. - The language was based on the now defunct - - Xesam user search language specification. + The query language processor is activated in the GUI simple search entry when the search + mode selector is set to Query Language. It can also be used from the + command line search, the KIO slave, or the WEB UI. If the results of a query language search puzzle you and you - doubt what has been actually searched for, you can use the GUI - Show Query link at the top of the result list to - check the exact query which was finally executed by Xapian. + doubt what has been actually searched for, you can use the GUI Show Query + link at the top of the result list to check the exact query which was finally executed by + Xapian. - Here follows a sample request that we are going to - explain: + + General syntax + + Here follows a sample request that we are going to explain: author:"john doe" Beatles OR Lennon Live OR Unplugged -potatoes @@ -3977,10 +3976,12 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r An element is composed of an optional field specification, and a value, separated by a colon (the field separator is the last colon in the element). Examples: - Eugenie, - author:balzac, - dc:title:grandet - dc:title:"eugenie grandet" + + Eugenie + author:balzac + dc:title:grandet + dc:title:"eugenie grandet" + The colon, if present, means "contains". Xesam defines other @@ -4005,41 +4006,38 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r word2) OR word3. 
- &RCL; versions 1.21 and later, allow using parentheses to - group elements, which will sometimes make things clearer, and may - allow expressing combinations which would have been difficult + You can use parentheses to group elements (from version 1.21), which will sometimes make + things clearer, and may allow expressing combinations which would have been difficult otherwise. An element preceded by a - specifies a - term that should not appear. + term that should not appear. - As usual, words inside quotes define a phrase - (the order of words is significant), so that - title:"prejudice pride" is not the same as - title:prejudice title:pride, and is - unlikely to find a result. + As usual, words inside quotes define a phrase (the order of words is significant), so + that title:"prejudice pride" is not the same + as title:prejudice title:pride, and is unlikely to find a + result. - Words inside phrases and capitalized words are not - stem-expanded. Wildcards may be used anywhere inside a term. - Specifying a wild-card on the left of a term can produce a very - slow search (or even an incorrect one if the expansion is - truncated because of excessive size). Also see - More about wildcards. + Words inside phrases and capitalized words are not stem-expanded. Wildcards may be used + anywhere inside a term. Specifying a wildcard on the left of a term can produce a very slow + search (or even an incorrect one if the expansion is truncated because of excessive + size). Also see More about wildcards. - To save you some typing, recent &RCL; versions (1.20 and later) - interpret a comma-separated list of terms for a field as an AND list - inside the field. Use slash characters ('/') for an OR list. No white - space is allowed. So - author:john,lennon will search for - documents with john and lennon - inside the author field (in any order), and - author:john/ringo would search for - john or ringo. 
This behaviour - only happens for field queries (input without a field, comma- or - slash- separated input will produce a phrase search). You can use a - text field name to search the main text this - way. + To save you some typing, &RCL; versions 1.20 and later + interpret a field value given as a comma-separated list of terms as an AND list and a + slash-separated list as an OR list. No white space is + allowed. So author:john,lennon will search for documents + with john and lennon inside + the author field (in any order), + and author:john/ringo would search + for john or ringo. This behaviour is only triggered by + a field prefix: without it, comma- or slash- separated input will produce a phrase + search. However, you can use a text field name to search the main text + this way, as an alternative to using an explicit OR, + e.g. text:napoleon/bonaparte would generate a search + for napoleon or bonaparte in the main + text body. Modifiers can be set on a double-quote value, for example to specify a proximity search (unordered). See @@ -4073,23 +4071,20 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r filename for the document's - file name. This is not necessarily set for all documents: - internal documents contained inside a compound one (for example - an EPUB section) do not inherit the container file name any more, - this was replaced by an explicit field (see next). Sub-documents - can still have a specific filename, if it is - implied by the document format, for example the attachment file - name for an email attachment. + file name. You can use the shorter fn alias. This value is not set + for all documents: internal documents contained inside a compound one (for example an + EPUB section) do not inherit the container file name any more; this was replaced by an + explicit field (see next). 
Sub-documents can still have a filename, + if it is implied by the document format, for example the attachment file name for an + email attachment. - containerfilename. This is - set for all documents, both top-level and contained - sub-documents, and is always the name of the filesystem directory - entry which contains the data. The terms from this field can - only be matched by an explicit field specification (as opposed - to terms from filename which are also indexed - as general document content). This avoids getting matches for - all the sub-documents when searching for the container file - name. + containerfilename, aliased + as cfn. This is set for all documents, both top-level and contained + sub-documents, and is always the name of the filesystem file which contains the + data. The terms from this field can only be matched by an explicit field specification + (as opposed to terms from filename which are also indexed as general + document content). This avoids getting matches for all the sub-documents when searching + for the container file name. ext specifies the file name extension @@ -4106,66 +4101,69 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r - &RCL; 1.20 and later have a way to specify aliases for the - field names, which will save typing, for example by aliasing - filename to fn or - containerfilename to - cfn. See the - section about the fields file. + You can define aliases for field names, in order to use your preferred denomination or + to save typing (e.g. the predefined fn and cfn aliases + defined for filename and containerfilename). See + the section about the fields + file. - The document input handlers used while indexing have the - possibility to create other fields with arbitrary names, and - aliases may be defined in the configuration, so that the exact - field search possibilities may be different for you if someone - took care of the customisation. 
+ The document input handlers have the possibility to create other fields with arbitrary + names, and aliases may be defined in the configuration, so that the exact field search + possibilities may be different for you if someone took care of the customisation. + - The field syntax also supports a few field-like, but - special, criteria: + + Special field-like specifiers + + The field syntax also supports a few field-like, but special, criteria, for which the + values are interpreted differently. Regular processing does not apply (for example the + slash- or comma- separated lists don't work). A list follows. - dir for filtering the - results on file location - (Ex: dir:/home/me/somedir). - -dir - also works to find results not in the specified directory - (release >= 1.15.8). Tilde expansion will be performed as - usual (except for a bug in versions 1.19 to - 1.19.11p1). Wildcards will be expanded, but - please - have a look - at an important limitation of wildcards in path filters. + + dir for filtering the + results on file location. For example, dir:/home/me/somedir will + restrict the search to results found anywhere under + the /home/me/somedir directory (including + subdirectories). - Relative paths also make sense, for example, - dir:share/doc would match either - /usr/share/doc or - /usr/local/share/doc + Tilde expansion will be performed as usual. Wildcards will be expanded, but + please have a look at an important + limitation of wildcards in path filters. - Several dir clauses can be specified, - both positive and negative. For example the following makes sense: - - dir:recoll dir:src -dir:utils -dir:common - This would select results which have both - recoll and src in the - path (in any order), and which have not either - utils or - common. + You can also use relative paths. For example, dir:share/doc would + match either /usr/share/doc + or /usr/local/share/doc. + + -dir will find + results not in the specified location. 
+ + Several dir clauses can be specified, + both positive and negative. For example the following makes sense: + dir:recoll dir:src -dir:utils -dir:common + This would select results which have both + recoll and src in the + path (in any order), and which have neither + utils nor + common. - You can also use OR conjunctions - with dir: clauses. + You can also use OR conjunctions + with dir: clauses. A special aspect of dir clauses is - that the values in the index are not transcoded to UTF-8, and - never lower-cased or unaccented, but stored as binary. This means - that you need to enter the values in the exact lower or upper - case, and that searches for names with diacritics may sometimes - be impossible because of character set conversion - issues. Non-ASCII UNIX file paths are an unending source of - trouble and are best avoided. + that the values in the index are not transcoded to UTF-8, and never lower-cased or + unaccented, but stored as binary. This means that you need to enter the values in the + exact lower or upper case, and that searches for names with diacritics may sometimes be + impossible because of character set conversion issues. Non-ASCII UNIX file paths are an + unending source of trouble and are best avoided. - You need to use double-quotes around the path value if it - contains space characters. + You need to use double-quotes around the path value if it contains space + characters. + + The shortcut syntax to define OR or AND lists within fields with commas or slash + characters is not available for dir clauses. @@ -4219,17 +4217,13 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r p2y). - mime or - format for specifying the - MIME type. These clauses are processed besides the normal - Boolean logic of the search. Multiple values will be OR'ed - (instead of the normal AND). You can specify types to be + mime or format for specifying the MIME + type. 
These clauses are processed apart from the normal Boolean logic of the search: + multiple values will be OR'ed (instead of the normal AND). You can specify types to be excluded, with the usual -, and use - wildcards. Example: mime:text/* - -mime:text/plain - Specifying an explicit boolean - operator before a mime specification is not - supported and will produce strange results. + wildcards. Example: mime:text/* -mime:text/plain. Specifying an + explicit boolean operator before a mime specification is not supported + and will produce strange results. type or @@ -4264,6 +4258,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r field with an OR default. You do need to use OR with ext terms for example. + Range clauses @@ -4343,20 +4338,18 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r - Anchored searches and wildcards + Wildcards and anchored searches Some special characters are interpreted by &RCL; in search - strings to expand or specialize the search. Wildcards expand a root - term in controlled ways. Anchor characters can restrict a search to - succeed only if the match is found at or near the beginning of the - document or one of its fields. + strings to expand or specialize the search. Wildcards expand a root term in controlled + ways. Anchor characters can restrict a search to succeed only if the match is found at or + near the beginning of the document or one of its fields. - More about wildcards + Wildcards All words entered in &RCL; search fields will be processed - for wildcard expansion before the request is finally - executed. + for wildcard expansion before the request is finally executed. The wildcard characters are: @@ -4376,8 +4369,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r - You should be aware of a few things when using - wildcards. + You should be aware of a few things when using wildcards. Using a wildcard character at the beginning of