Release notes for Recoll 1.18.x
Caveats
Installing over an older version: 1.18 introduces significant index format changes, and resetting the index is advisable in most cases. However, if the 1.18 index is not configured for case and diacritics sensitivity, it remains mostly compatible with 1.17 indexes. Case/diacritics sensitivity can be turned off either by a compile flag or a configuration variable, and the default is still a stripped index (so, mostly compatible with 1.17). If you activate case and diacritics sensitivity, you must reset the index.
Always reset the index when installing over version 1.14 or older. The simplest way to do this is to quit all recoll programs and delete the index directory (rm -rf ~/.recoll/xapiandb), then start recoll or recollindex. recollindex -z will do the same in most cases.
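The manual reset described above can be sketched in Python as follows. This is a minimal illustration and not part of Recoll: the function name reset_index is hypothetical, and the sketch assumes the default layout in which the index lives in an xapiandb directory under the configuration directory (~/.recoll). Quit all recoll programs before running anything like it.

```python
import shutil
from pathlib import Path


def reset_index(confdir="~/.recoll"):
    """Delete the Xapian index directory under confdir (hypothetical helper).

    Returns True if an index directory was removed, False if none existed.
    """
    # Assumption: the index is stored in <confdir>/xapiandb (the default path
    # quoted in these notes); a relocated index would need a different path.
    index_dir = Path(confdir).expanduser() / "xapiandb"
    if not index_dir.is_dir():
        return False
    shutil.rmtree(index_dir)
    return True
```

After the directory is removed, starting recoll or recollindex builds a fresh index, as with the rm -rf command.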
Some new, auxiliary, features also require a full reindex:
- The file size filtering functions if the existing index was created by version 1.16 or older.
- The anchored search feature if the index was created by release 1.15 or older.
Changes
Recoll 1.18 has some major changes, the most visible of which is the ability to search for exact matches of character case and diacritics.
Recoll 1.18.0 changes:
- The index can now be configured for case and diacritics sensitivity, in which case raw terms are indexed. On such an index, search insensitivity to case and diacritics is obtained, when desired, by query time expansion, in a similar manner to what is used for stemming. See the manual chapter for details about controlling the feature.
- Recoll has a new capacity to store page break locations and use them when opening a document at the location for a given match. This currently works with PDF, PostScript and DVI documents, and the evince viewer.
- Recoll can now also pass a search string to the native application.
- The GUI result list has a new "snippets" window for documents with page numbers, which lets the user choose a snippet and open the document at the appropriate page.
- We now allow multiple directory specifications in the query language, as in: dir:/home/me -dir:tmp
- The search inside the GUI preview window has been improved, and now allows selecting one of the initial term groups from a list as the search target.
- A new script dedicated to laptops can start or stop recollindex according to mains power status.
- Added <pre style="white-space: pre-wrap"> to plain text HTML display options. This will often be the best option to display plain text: it will better respect indentation, while folding long lines.
- When running in a UTF-8 locale, if decoding a plain text file as UTF-8 fails, indexing will try again using an 8-bit character set heuristically chosen according to the locale country code.
- A new configuration variable, maxmemberkbs, has been implemented to limit the size of archive members we process. This will avoid recoll trying to read a 4 GB ISO from a zip archive...
- Proper error reporting when a wildcard expansion is truncated for size. An incomplete search could previously be performed without any indication.
- More effort is now put into choosing the terms used to generate the snippets inside the result list.
- Recoll now uses the Xapian "synonyms" mechanism to store all data about stemming, case, and diacritics expansion (this replaces the previous ad-hoc stemming expansion mechanism).
- Partial autodetection of Thunderbird mailboxes found outside the configured location.
- Implemented a list of mime types that should be opened with the locally configured application even when Use Desktop Preferences is checked. This will permit, for example, using evince for its page access capabilities on PDF files, while letting the desktop handle all the other mime types.
- Fixed bugs:
  - The unac_except_trans mechanism could be buggy in some cases and generate wrong character translations.
  - Don't terminate monitor for permissions-related addwatch error.
  - Fix handling of ODF documents exported by Google docs.
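
The query time expansion mentioned in the first item of the 1.18.0 change list can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Recoll's implementation: the names fold and ExpansionTable are hypothetical, stripping is approximated with Unicode NFKD decomposition plus case folding, and a plain dictionary stands in for the Xapian synonyms storage. Raw terms are kept as indexed; an insensitive search folds the user's term and expands it to every raw term sharing that folded form.

```python
import unicodedata
from collections import defaultdict


def fold(term):
    """Strip diacritics and case from a term (an approximation of stripping)."""
    decomposed = unicodedata.normalize("NFKD", term)
    # Drop combining marks (accents) left over after decomposition.
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_marks.casefold()


class ExpansionTable:
    """Maps a folded key to the set of raw index terms that fold to it.

    Hypothetical stand-in for the expansion data kept alongside a raw index.
    """

    def __init__(self):
        self._by_key = defaultdict(set)

    def add_raw_term(self, term):
        """Record a raw (unstripped) term at indexing time."""
        self._by_key[fold(term)].add(term)

    def expand(self, term, sensitive=False):
        """Return the raw terms a query for `term` should search for."""
        if sensitive:
            # Exact match requested: search the raw term only.
            return [term]
        # Insensitive search: every raw term with the same folded form.
        return sorted(self._by_key.get(fold(term), {term}))
```

With the raw terms "résumé", "Resume" and "resume" recorded, an insensitive query for "RESUME" expands to all three, while a sensitive query for "Resume" searches that exact form only.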