244 lines
12 KiB
HTML
244 lines
12 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
||
<html>
|
||
<head>
|
||
<title>Recoll 1.20 series release notes</title>
|
||
<meta name="Author" content="Jean-Francois Dockes">
|
||
<meta name="Description"
|
||
content="recoll is a simple full-text search system for unix and linux based on the powerful and mature xapian engine">
|
||
<meta name="Keywords" content="full text search, desktop search, unix, linux">
|
||
<meta http-equiv="Content-language" content="en">
|
||
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
|
||
<meta name="robots" content="All,Index,Follow">
|
||
<link type="text/css" rel="stylesheet" href="styles/style.css">
|
||
</head>
|
||
|
||
<body>
|
||
|
||
<div class="rightlinks">
|
||
<ul>
|
||
<li><a href="index.html">Home</a></li>
|
||
<li><a href="download.html">Downloads</a></li>
|
||
<li><a href="doc.html">Documentation</a></li>
|
||
</ul>
|
||
</div>
|
||
|
||
<div class="content">
|
||
<h1>Release notes for Recoll 1.20.x</h1>
|
||
|
||
<h2>Caveats</h2>
|
||
|
||
<p><em>Installing over an older version</em>: 1.19 </p>
|
||
|
||
<p>Installing 1.20 over an 1.19 index is possible, but there
|
||
have been small changes in the way compound words (e.g. email
|
||
addresses) are indexed, so it will be best to reset the
|
||
index. Still, in a pinch, 1.20 search can mostly use an 1.19
|
||
index.</p>
|
||
|
||
<p>Always reset the index if you do not know by which version it
|
||
was created (you're not sure it's at least 1.18). The best method
|
||
is to quit all Recoll programs and delete the index directory
|
||
(<span class="literal">
|
||
rm -rf ~/.recoll/xapiandb</span>), then start <code>recoll</code>
|
||
or <code>recollindex</code>. <br>
|
||
|
||
<span class="literal">recollindex -z</span> will do the same
|
||
in most, but not all, cases. It's better to use
|
||
the <tt>rm</tt> method, which will also ensure that no debris
|
||
from older releases remain (e.g.: old stemming files which are
|
||
not used any more).</p>
|
||
|
||
<p>Case/diacritics sensitivity is off by default. It can be
|
||
turned on <em>only</em> by editing
|
||
recoll.conf (
|
||
<a href="usermanual/usermanual.html#RCL.INDEXING.CONFIG.SENS">
|
||
see the manual</a>). If you do so, you must then reset the
|
||
index.</p>
|
||
|
||
|
||
<h2><a name="minor_releases">Minor releases at a glance</a></h2>
|
||
|
||
<ul>
|
||
<li>1.20.5 is 1.20.4 with a few Qt 5 compatibility
|
||
tweaks. Builds and runs with Qt 5.3.2, fails with 5.2.</li>
|
||
<li>1.20.4 has a fix to skip compress file system images
|
||
like xxx.img.gz by default. This should have been in 1.20.3</li>
|
||
<li>1.20.3 has a very minor change to copy the Query
|
||
Fragments Window config file from the default if it does
|
||
not exist in the user config dir.</li>
|
||
<li>1.20.2 fixes a bug which prevented the real time indexer
|
||
from indexing the web history queue (this was still processed
|
||
when starting up). It also adds systray capability to
|
||
the GUI.</li>
|
||
</ul>
|
||
|
||
<h2>Changes in Recoll 1.20.1</h2>
|
||
|
||
<ul>
|
||
|
||
<li>An <em>Open With</em> entry was added to the result list
|
||
and result table popup menus. This lets you choose an
|
||
alternative application to open a document. The list of
|
||
applications is built from the information inside
|
||
the <span class="filename">
|
||
/usr/share/applications</span> desktop files.</li>
|
||
|
||
<li>A new way for specifying multiple terms to be searched
|
||
inside a given field: it used to be that an entry lacking
|
||
whitespace but splittable, like [term1,term2] was
|
||
transformed into a phrase search, which made sense in some
|
||
cases, but no so many. The code was changed so that
|
||
[term1,term2] now means [term1 AND term2], and
|
||
[term1/term2] means [term1 OR term2]. This is
|
||
useful for field searches where you would previously be
|
||
forced to repeat the field name for every term.
|
||
[somefield:term1 somefield:term2] can now be expressed as
|
||
[somefield:term1,term2].
|
||
</li>
|
||
|
||
<li>(1.20.1) The <b>Query Fragments</b> tool was added to
|
||
the GUI. This is a window with customizable buttons to add
|
||
arbitrary query language fragments to the current
|
||
search. The buttons and fragments are defined in an xml
|
||
file inside the recoll configuration
|
||
directory <span class="filename">~/.recoll/fragbuts.xml</span>. This
|
||
makes it easy to define "pre-cooked" filters for things
|
||
that you need repeatedly.
|
||
<a href="usermanual/usermanual.html#RCL.SEARCH.GUI.FRAGBUTS">
|
||
See the manual</a> for more details.</li>
|
||
|
||
<li>We changed the way terms are generated from a compound
|
||
string (e.g. an email address). Previously, for an address
|
||
like <em>jfd@recoll.org</em>, only the simple terms and
|
||
the terms anchored at the start were generated
|
||
(<em>jfd</em>, <em>recoll</em>, <em>org</em>, <em>jfd@recoll</em>, <em>jfd@recoll.org</em>). The
|
||
new text splitter generates all the other possible terms
|
||
(here, <em>recoll.org</em> only), so that it is now
|
||
possible to search for left-truncated versions of the
|
||
compound, e.g., all emails from a given domain.</li>
|
||
|
||
<li>(1.20.1) New keyboard accelerators for the result table: Ctrl+r
|
||
switches the focus from the search entry to the table,
|
||
Ctrl+o opens the document for the current line, Ctrl+Shift+o
|
||
opens document and closes recoll, Ctrl+d previews the
|
||
document.</li>
|
||
|
||
<li>(1.20.1) A special term is now indexed for results from the web
|
||
history: use "-rclbes:BGL" to exclude the web results,
|
||
"rclbes:BGL" to restrict the results to the web ones. This
|
||
is difficult to remember, but the Query Fragments feature
|
||
means that you don't need to (this is in the sample Query
|
||
Fragments file).</li>
|
||
|
||
<li>Recoll now indexes <em>#hashtags</em> as such.</li>
|
||
|
||
<li>It is now possible to configure the GUI in wide form
|
||
factor by dragging the toolbars to one of the sides (their
|
||
location is remembered between sessions), and moving the
|
||
category filters to a menu (can be set in the
|
||
"Preferences->GUI configuration" panel).</li>
|
||
|
||
<li>We added the <em>indexedmimetypes</em> and
|
||
<em>excludedmimetypes</em> variables to the configuration
|
||
GUI, which was also compacted a bit. A bunch of
|
||
ininteresting variables were also removed.</li>
|
||
|
||
<li>When indexing, we no longer add the top container
|
||
file name as a term for the contained sub-documents (if
|
||
any). This made no sense in most cases, as it meant that
|
||
you would get hits on all the sections from a chm or epub
|
||
when the top file name matched the search, when you
|
||
probably wanted only the parent document in this case.<br>
|
||
However, the container file name was sometimes useful for
|
||
filtering results, and it is still accessible, in a
|
||
different way: the top container file name is added as a
|
||
term to all the sub-documents, <em>only for searching with
|
||
a prefix</em>. The field name
|
||
is <span class="literal">containerfilename</span>, and no
|
||
match on the subdocuments will occur if the field is not
|
||
specified (this is different from
|
||
previous <span class="literal">filename</span> processing,
|
||
which was indexed as a general
|
||
term. <span class="literal">containerfilename</span> is
|
||
also set on files without sub-documents (e.g. a pdf).</li>
|
||
|
||
<li>A new attribute, <span class="literal">pfxonly</span>,
|
||
was created to support the above change. This can be set
|
||
on any metadata field inside
|
||
the <span class="literal">[prefixes]</span> section of
|
||
the <span class="filename">fields</span> file. The
|
||
affected field terms will be indexed <em>only with a
|
||
prefix</em>, so they will cause a hit only for a field
|
||
search (the general behaviour is that field terms are
|
||
indexed both prefixed and not, so they can also cause a
|
||
hit when searched as general terms).</li>
|
||
|
||
<li>A new <span class="literal">[queryaliases]</span>
|
||
section was created in
|
||
the <span class="filename">fields</span>, for definining
|
||
field name aliases to be used only at query time (to avoid
|
||
unwanted collection of data on random fields during
|
||
indexing). The section is empty by default, but 2 obvious
|
||
aliases are commented: <span class="literal">filename=fn</span>
|
||
and <span class="literal">containerfilename=cfn</span>. Setting
|
||
them in your personal file may save you some typing if you
|
||
search on file names.</li>
|
||
|
||
<li>You can now use both <em>-e</em> and <em>-i</em> for
|
||
erasing then updating the index for the given file
|
||
arguments with the same <em>recollindex</em> command.</li>
|
||
|
||
<li>We now allow access to the Xapian docid for Recoll
|
||
documents in <span class="command">recollq</span> and
|
||
Python API search results. This allows writing scripts
|
||
which combine Recoll and pure Xapian operations. A sample
|
||
Python program to find document duplicates, using MD5
|
||
terms was added. See
|
||
<span class="filename">src/python/samples/docdups.py</span></li>
|
||
|
||
<li>The command used to identify the mime types of files
|
||
when the internal method fails used to be hard-coded
|
||
as <span class="literal">file -i</span>. It is now
|
||
possible to customize this command by setting
|
||
the <span class="literal">systemfilecommand</span> in the
|
||
configuration. A suggested value would
|
||
be <span class="filename">xdg-mime</span>, which sometimes
|
||
works better than <span class="filename">file</span>.</li>
|
||
|
||
<li>The result list has two new elements: %P substitution
|
||
for printing the parent folder name, and an <tt>F</tt>
|
||
link target which will open the parent folder in a
|
||
file manager window. e.g. <tt><a href='F%N'>Open parent directory</a></tt>
|
||
</li>
|
||
|
||
<li><span class="filename">/media</span> was added to the default
|
||
skippedPaths list mostly as a reminder that blindly
|
||
processing these with the general indexer is a bad idea
|
||
(use separate indexes instead).</li>
|
||
|
||
<li><span class="command">recollq</span>
|
||
and <span class="command">recoll -t</span> get a new
|
||
option <span class="literal">-N</span> to print field
|
||
names between values when
|
||
<span class="literal">-F</span> is used. In addition,
|
||
<span class="literal">-F ""</span> is taken as a
|
||
directive to print all fields.</li>
|
||
|
||
<li>Unicode <span class="literal">hyphen</span> (0x2010) is
|
||
now translated to ASCII
|
||
<span class="literal">minus</span>
|
||
during indexing and searching. There is no good way to
|
||
handle this character, given the varius misuses of minus
|
||
and hyphen. This choice was deemed "less bad" than the
|
||
previous one.</li>
|
||
|
||
<!-- <li>The purple filter (for pidgin and other messaging apps)
|
||
has been updated for newer log formats.</li> -->
|
||
|
||
</ul>
|
||
|
||
|
||
</div>
|
||
</body>
|
||
</html>
|