dockes 2006-12-24 08:02:11 +00:00
parent 40ee0199a9
commit 5f022b80d5
9 changed files with 114 additions and 34 deletions


@ -22,6 +22,7 @@ share/icons/hicolor/48x48/apps/recoll.png
%%DATADIR%%/filters/rclxls
%%DATADIR%%/images/document.png
%%DATADIR%%/images/drawing.png
%%DATADIR%%/images/folder.png
%%DATADIR%%/images/html.png
%%DATADIR%%/images/image.png
%%DATADIR%%/images/message.png


@ -28,4 +28,5 @@ qtgui/recoll.pro
recollinstall
sampleconf/recoll.conf
sysconf
wasabi
wxgui


@ -4,8 +4,7 @@ Bugs that are listed in an older version section are supposedly fixed in
later versions. Bugs listed in the topmost section may also exist in older
versions.
Latest (1.6.2):
Latest (1.6.3):
- 1.6 NEAR crashes: 1.6 has added NEAR searches. Unlike what recoll did
with PHRASES, stemming expansion is performed on terms inside NEAR
clauses (except if prevented by a capitalized entry of course). There is
@ -53,6 +52,11 @@ Latest (1.6.2):
exception handling (recoll catches an exception while trying the
yet nonexistent db).
1.6.2
- Relatively infrequent issue with message boundary detection in mbox
files, which could cause miscellaneous problems.
- Executing an external viewer for a file with single quotes in the name
would not work.
***************************************************************************
1.5.10
- If a defaultcharset was set in the configuration file for a subdirectory,


@ -1,15 +1,30 @@
CHANGES
Updating from 1.2 to 1.3 or 1.4 or 1.5:
---------------------------------------
From version 1.3 up, there is a new feature to search specifically for file
names (with wildcard processing). If you want to take full advantage of
this, you should perform a full reindex after installing the new version
(ie: use recollindex -z, or delete ~/.recoll/xapiandb).
Also, we now use the central copies of configuration files for default
values, and the user ones only for overrides. Your old configuration files
will still work, but, you may want to remove them if they are unmodified,
or keep only the modified parameters.
1.7.0 2006-12-20
- Email attachments are now indexed.
- Right-click menu option to access the parent document of an embedded
result (ie from mail attachment to parent message).
- The sort tool has been improved: no need to restart the query after sort
criteria change.
- Support for real-time indexing with inotify is now enabled by default
when appropriate.
- Recoll now warns when the configured native viewer cannot be found and
starts an interface for choosing another one.
- Categories (text, presentation, spreadsheets, etc.) can be used instead
of raw mime types when filtering on file types in advanced search.
- The port to qt4 is functional and can be enabled with configure --enable-qt4
- 'autophrase' option improved and may now actually be useful.
- Improved highlighting (again...)
- Display term frequencies in term explorer.
- Recollindex -e to remove data from index for listed files.
1.6.3
- Fixed problem with bad detection of mbox message boundaries.
Upgrading can change the message numbering in some cases, and you should
perform a full index update (recollindex -z) after installing
the new version.
- Fixed problem with execution of external viewer for files with
single-quotes in the name.
1.6.2
- Minor solaris compilation glitches only.
@ -34,6 +49,18 @@ or keep only the modified parameters.
managers.
- Improved recall for phrases with composite words like email addresses.
Updating from 1.2 to 1.3 or 1.4 or 1.5:
---------------------------------------
From version 1.3 up, there is a new feature to search specifically for file
names (with wildcard processing). If you want to take full advantage of
this, you should perform a full reindex after installing the new version
(ie: use recollindex -z, or delete ~/.recoll/xapiandb).
Also, we now use the central copies of configuration files for default
values, and the user ones only for overrides. Your old configuration files
will still work, but, you may want to remove them if they are unmodified,
or keep only the modified parameters.
1.5.9
- Fix bad timezone conversion in email dates. Display timezone in result
list dates.


@ -54,21 +54,21 @@
decide what you may want to install.</p>
<h3>Source</h3>
<p><b>Current version:</b>
1.6.1: <a href="recoll-1.6.1.tar.gz">recoll-1.6.1.tar.gz</a>
See the <a href="BUGS.txt">known bugs and issues</a> and <a
href="CHANGES.txt">changes</a>.</p>
<p>recoll 1.6 has the capacity to perform proximity searches (a
bit like phrases, but unordered). There is a still unpatched
problem in Xapian 0.9.9 which will make NEAR searches fail.
If you intend to perform proximity searches, have a look at the
<a href="BUGS.txt">errata</a> for a workaround and Xapian
patch. All the statically linked binary packages below use a
patched Xapian-core library in order for NEAR searches to work.</p>
<p><b>The cutting edge</b>
Version 1.7.0: <a
href="recoll-1.7.0.tar.gz">recoll-1.7.0.tar.gz</a> brings some
nice features such as email attachment indexing, and
improvements to real-time indexing session support. See the
<a href="CHANGES.txt">changes file</a> for more detail.</p>
<p><b>Current version:</b>
1.6.3: <a href="recoll-1.6.3.tar.gz">recoll-1.6.3.tar.gz</a>
See the <a href="BUGS.txt">known bugs and issues</a> and
<a href="CHANGES.txt">changes</a>.</p>
<p>Older recoll releases:
<a href="recoll-1.6.1.tar.gz">1.6.1</a>
<a href="recoll-1.5.11.tar.gz">1.5.11</a>.
<a href="recoll-1.5.6.tar.gz">1.5.6</a>.
<a href="recoll-1.4.3.tar.gz">1.4.3</a>.
@ -94,11 +94,11 @@
<p><b>Mandriva 2006</b> (also works on 2005 and 2007)
RPM:
<a href="recoll-1.6.1-0.1.20060mdk.i586.rpm">
recoll-1.6.1-0.1.20060mdk.i586.rpm</a>.
<a href="recoll-1.6.3-0.1.20060mdk.i586.rpm">
recoll-1.6.3-0.1.20060mdk.i586.rpm</a>.
Source:
<a href="recoll-1.6.1-0.1.20060mdk.src.rpm">
recoll-1.6.1-0.1.20060mdk.src.rpm</a>
<a href="recoll-1.6.3-0.1.20060mdk.src.rpm">
recoll-1.6.3-0.1.20060mdk.src.rpm</a>
</p>
<p><b>Suse 10.1</b>
@ -150,6 +150,9 @@
<a href="http://cvsweb.freebsd.org/ports/deskutils/recoll">
recoll port</a>.</p>
<p>Up-to-date ports for <a href="port-recoll.tgz">recoll-1.6</a> and
<a href="port-xapian-core.tgz">xapian-0.9.9</a> (without the
NEAR patch).</p>
</div>
</body>
</html>


@ -59,7 +59,7 @@
<li><var class="literal">html</var>.</li>
<li><span class="application">OpenOffice</span>
files.</li>
files (needs <b>unzip</b> command).</li>
<li><var class="literal">maildir</var> and <var
class="literal">mailbox</var> (<span class=
@ -122,8 +122,8 @@
<li>Support for multiple charsets. Internal processing and
storage uses Unicode UTF-8.</li>
<li>Stemming performed at query time (can switch stemming
language after indexing).</li>
<li><a href="#Stemming">Stemming</a> performed at query
time (can switch stemming language after indexing).</li>
<li>Easy installation. No database daemon, web server or
exotic language necessary.</li>
@ -134,7 +134,47 @@
</dd>
</ul>
<h2><a name="Stemming"></a>Stemming</h2>
<p>Stemming is a process which transforms inflected words into
their most basic form. For example, <i>flooring</i>,
<i>floors</i>, <i>floored</i> would probably all be transformed
to <i>floor</i> by a stemmer for the English language.</p>
<p>In many search engines, the stemming process occurs during
indexing. The index will only contain the stemmed form of words,
with exceptions for terms which are detected as being probably
proper nouns (ie: capitalized). At query time, the terms entered
by the user are stemmed, then matched against the index.</p>
<p>This process results in a smaller index, but it has the
serious drawback of irrevocably losing information during
indexing.</p>
<p>Recoll works in a different way. No stemming is performed at
indexing time, so that all information gets into the index. The
resulting index is bigger, but most people probably don't care
much about this nowadays, because they have a 100 GB disk 95%
full of binary data <em>which does not get indexed</em>.</p>
<p>At the end of an indexing pass, Recoll builds one or several
stemming dictionaries, where all word stems are listed in
correspondence to the list of their derivatives.</p>
<p>At query time, by default, user-entered terms are stemmed,
then matched against the stem database, and the query is
expanded to include all derivatives. This will yield search
results analogous to those obtained by a classical engine.
The benefit of this approach is that stem expansion can be
controlled instantly at query time in several ways:</p>
<ul>
<li>It can be selectively turned off for any query term by
capitalizing it (<i>Floor</i>).</li>
<li>The stemming language (e.g. English, French...) can be
selected (this supposes that several stemming databases have
been built, which can be configured as part of the indexing,
or done later, in a reasonably fast way).</li>
</ul>
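<p>The query-time expansion described above can be illustrated with
a minimal Python sketch. It is not Recoll's actual code: the
<var class="literal">toy_stem</var> function is a hypothetical
stand-in for a real stemmer, and the names
<var class="literal">build_stem_db</var> and
<var class="literal">expand_query_term</var> are assumptions made
for the example.</p>

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_stem_db(index_terms):
    """Map each stem to the set of unstemmed index terms that share it.

    This mimics the stemming dictionary built at the end of an indexing pass.
    """
    db = defaultdict(set)
    for term in index_terms:
        db[toy_stem(term)].add(term)
    return db


def expand_query_term(term, stem_db):
    """Expand one user-entered term into the set of index terms to search for.

    A capitalized term disables expansion: only its lowercase form is used.
    """
    if term[:1].isupper():
        return {term.lower()}
    lowered = term.lower()
    return stem_db.get(toy_stem(lowered), set()) | {lowered}


# The index stores the full, unstemmed terms.
stem_db = build_stem_db(["floor", "floors", "floored", "flooring", "door"])
expanded = expand_query_term("floors", stem_db)
exact = expand_query_term("Floor", stem_db)
```

<p>In this sketch, <var class="literal">expanded</var> contains all
four <i>floor</i> derivatives, while <var class="literal">exact</var>
contains only <i>floor</i>, because the capitalized entry prevents
expansion.</p>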
</div>
</body>
</html>


@ -47,7 +47,7 @@
<p><span class="application">Recoll</span> is free, open source,
and GPL-licensed. The current version is
<a class="important" href="download.html">1.6.1</a></p>
<a class="important" href="download.html">1.6.3</a></p>
<p>We borrow a lot of code
from other packages, and welcome code and ideas from
contributors, see the <a class="important"


@ -21,6 +21,7 @@
<a href="recoll2.html"><img src="recoll2-thumb.png"></a>
<a href="recoll3.html"><img src="recoll3-thumb.png"></a>
<a href="recoll4.html"><img src="recoll4-thumb.png"></a>
<a href="recoll5.html"><img src="recoll5-thumb.png"></a>
</div>
</body>
</html>


@ -7,7 +7,10 @@
<body>
<h1>Recoll index format details</h1>
<p>Special (capitalized) terms:</p>
<p>Terms are not stemmed before being stored. They are converted to
lowercase and stripped of accents.</p>
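<p>As a rough illustration of this normalization (lowercasing and
accent removal), here is a Python sketch. It is an assumption about
one way to obtain the effect, based on Unicode decomposition, and not
a description of how Recoll implements it; the function name
<var class="literal">normalize_term</var> is hypothetical.</p>

```python
import unicodedata


def normalize_term(term):
    """Return the term in lowercase with accents (combining marks) removed."""
    # Decompose accented characters into base character + combining marks.
    decomposed = unicodedata.normalize("NFD", term)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


normalized = normalize_term("Élégant")
```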
<p>Special prefixed terms:</p>
<ul>
<li>Ddate: modification date of file, like YYYYMMDD</li>
@ -64,7 +67,7 @@
<address><a href="mailto:jean-francois.dockes@wanadoo.fr">Jean-Francois Dockes</a></address>
<!-- Created: Thu Dec 7 13:07:40 CET 2006 -->
<!-- hhmts start -->
Last modified: Thu Dec 7 14:13:36 CET 2006
Last modified: Thu Dec 7 14:19:02 CET 2006
<!-- hhmts end -->
</body>
</html>