Merge branch 'master' into 'master'
fixed typos

See merge request medoc92/recoll!4

commit 808565985a
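The corrections in this commit are all simple word-for-word substitutions, so the same kind of pass could be sketched as a small script. This is only an illustration, not the method used for the commit: the helper name `fix_typos` and the `CORRECTIONS` table are hypothetical, and the table holds a subset of the misspelling/correction pairs visible in the hunks below.

```python
import re

# Misspelling -> correction pairs copied from the hunks of this commit
# (a subset; the table name and the helper below are illustrative only).
CORRECTIONS = {
    "catched": "caught",
    "dependancy": "dependency",
    "dependancies": "dependencies",
    "independant": "independent",
    "exemple": "example",
    "occurences": "occurrences",
    "overriden": "overridden",
    "wilcard": "wildcard",
    "definining": "defining",
    "settting": "setting",
    "implicitely": "implicitly",
    "persistant": "persistent",
    "containg": "containing",
    "Avanced": "Advanced",
    "Ajusting": "Adjusting",
}

# Single alternation over all misspellings, matched on word boundaries so
# that correctly spelled words containing a similar stem are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in CORRECTIONS) + r")\b"
)


def fix_typos(text: str) -> str:
    """Return text with every known misspelling replaced by its correction."""
    return _PATTERN.sub(lambda match: CORRECTIONS[match.group(1)], text)
```

For example, `fix_typos("An exemple should make things")` returns `"An example should make things"`. Matching is case-sensitive on purpose, so only the capitalized forms listed in the table (such as `Avanced`) are touched.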
@@ -488,7 +488,7 @@ recoll (1.21.2-1~ppaPPAVERS~SERIES1) SERIES; urgency=low

 * New special indexing dialog in GUI
 * Fix advanced search dialog "Any Clause" mode
-* Fixed a few bounds issues catched by Windows VC++
+* Fixed a few bounds issues caught by Windows VC++
 * Miscellaneous other minor fixes

 -- Jean-Francois Dockes <jf@dockes.org> Tue, 01 Oct 2015 09:40:00 +0200
@@ -589,7 +589,7 @@ recoll (1.19.12p1-1~ppaPPAVERS~SERIES1) SERIES; urgency=low
 recoll (1.19.12-1~ppaPPAVERS~SERIES1) SERIES; urgency=low

 * Minor fixes to 1.19.11. Make metadata stored length a parameter.
-* New xls dumper based on mso-dumper removes catdoc dependancy
+* New xls dumper based on mso-dumper removes catdoc dependency
 * Move out the code for the Ubuntu Unity lens and scope

 -- Jean-Francois Dockes <jf@dockes.org> Mon, 02 Apr 2014 15:12:00 +0200
@@ -926,7 +926,7 @@ recoll (1.11.0-1) unstable; urgency=low
 + Remembers missing filters in first run (Closes: #500690)
 * debian/control:
 + Added libimage-exiftool-perl as Suggests (Closes: #502427)
-+ Added Python as recommaded due to filters/rclpython script
++ Added Python as recommended due to filters/rclpython script
 although, its not necessary as it will be installed only
 when Python is present
 * debian/patches:
@@ -120,7 +120,7 @@
 src/mk/localdefs.in, src/php/00README.txt, src/python/README.txt,
 src/python/recoll/setup.py:
 Implemented configure --enable-pic flag to build the main lib with
-position-independant objects. This avoids having to edit localdefs
+position-independent objects. This avoids having to edit localdefs
 by hand to build the new php extension, and voids the need for the
 Python module to recompile Recoll source files.
@@ -314,7 +314,7 @@
 2009-12-17 20:23 +0000 dockes <dockes> (95eb8a010525)

 * src/doc/user/usermanual.sgml:
-There was an error in the mimemap format in the config exemple
+There was an error in the mimemap format in the config example

 2009-12-14 10:33 +0000 dockes <dockes> (1e774739395e)
@@ -697,7 +697,7 @@

 * src/kde/kioslave/recoll/CMakeLists.txt:
 added beaglequeue/circache to kio build because of internfile
-dependancy
+dependency

 2009-11-18 14:27 +0000 dockes <dockes> (d1587dd98290)
@@ -1865,7 +1865,7 @@
 2009-01-26 13:27 +0000 dockes <dockes> (61567bc09eab)

 * src/utils/transcode.cpp:
-tested and decided against cacheing iconv_open
+tested and decided against caching iconv_open

 2009-01-23 15:56 +0000 dockes <dockes> (1998b1608eb0)
@@ -6601,7 +6601,7 @@

 * src/qtgui/plaintorich.cpp, src/qtgui/plaintorich.h,
 src/qtgui/preview_w.cpp, src/qtgui/reslist.cpp:
-improve positionning on term groups by storing/passing an occurrence
+improve positioning on term groups by storing/passing an occurrence
 index

 2006-11-18 12:30 +0000 dockes <dockes> (f065c8063ff3)
@@ -6763,7 +6763,7 @@

 * src/common/textsplit.cpp, src/common/textsplit.h,
 src/rcldb/rcldb.cpp:
-phrase queries with bot spans and words must be splitted as words
+phrase queries with bot spans and words must be split as words
 only

 2006-11-11 15:30 +0000 dockes <dockes> (25647c7c5aac)
@@ -8815,7 +8815,7 @@
 * src/excludefile, src/index/indexer.cpp, src/index/indexer.h,
 src/index/recollindex.cpp, src/rcldb/rcldb.cpp, src/rcldb/rcldb.h,
 src/utils/Makefile, src/utils/pathut.cpp, src/utils/pathut.h:
-allow independant creation / deletion of stem dbs
+allow independent creation / deletion of stem dbs

 2006-01-06 13:55 +0000 dockes <dockes> (8831260252d9)
src/INSTALL (14 lines changed)
@@ -27,9 +27,9 @@ Chapter 5. Installation and configuration
 exist any more.

 The package management tools will usually automatically deal with hard
-dependancies for packages obtained from a proper package repository. You
+dependencies for packages obtained from a proper package repository. You
 will have to deal with them by hand for downloaded packages (for example,
-when dpkg complains about missing dependancies).
+when dpkg complains about missing dependencies).

 In all cases, you will have to check or install supporting applications
 for the file types that you want to index beyond those that are natively
@@ -66,7 +66,7 @@ Chapter 5. Installation and configuration

 A list of common file types which need external commands follows. Many of
 the handlers need the iconv command, which is not always listed as a
-dependancy.
+dependency.

 Please note that, due to the relatively dynamic nature of this
 information, the most up to date version is now kept on
@@ -346,7 +346,7 @@ Chapter 5. Installation and configuration
 RECOLL_CONFTOP and RECOLL_CONFMID environment variables. Values from
 configuration files inside the top directory will override user ones,
 values from configuration files inside the middle directory will override
-system ones and be overriden by user ones. These two variables may be of
+system ones and be overridden by user ones. These two variables may be of
 use to applications which augment Recoll functionality, and need to add
 configuration data without disturbing the user's files. Please note that
 the two, currently single, values will probably be interpreted as
@@ -486,7 +486,7 @@ Chapter 5. Installation and configuration

 skippedNames

-A space-separated list of wilcard patterns for names of files or
+A space-separated list of wildcard patterns for names of files or
 directories that should be completely ignored. The list defined in
 the default file is:
@@ -1075,7 +1075,7 @@ Chapter 5. Installation and configuration

 mhmboxquirks

-This allows definining location-related quirks for the mailbox
+This allows defining location-related quirks for the mailbox
 handler. Currently only the tbird flag is defined, and it should
 be set for directories which hold Thunderbird data, as their
 folder format is weird.
@@ -1270,7 +1270,7 @@ Chapter 5. Installation and configuration
 The file has a section for any index which needs translations, either the
 main one or additional query indexes. The sections are named with the
 Xapian index directory names. No slash character should exist at the end
-of the paths (all comparisons are textual). An exemple should make things
+of the paths (all comparisons are textual). An example should make things
 sufficiently clear

 [/home/me/.recoll/xapiandb]
src/README (50 lines changed)
@@ -416,7 +416,7 @@ Chapter 2. Indexing

 The generated indexes can be queried concurrently in a transparent manner.

-For index generation, multiple configurations are totally independant from
+For index generation, multiple configurations are totally independent from
 each other. When multiple indexes need to be used for a single search,
 some parameters should be consistent among the configurations.
@@ -457,7 +457,7 @@ Chapter 2. Indexing
 for subdirectories.

 You can also define an exclusive list of MIME types to be indexed (no
-others will be indexed), by settting the indexedmimetypes configuration
+others will be indexed), by setting the indexedmimetypes configuration
 variable. Example:

 indexedmimetypes = text/html application/pdf
@@ -801,7 +801,7 @@ Chapter 2. Indexing
 After such an interruption, the index will be somewhat inconsistent
 because some operations which are normally performed at the end of the
 indexing pass will have been skipped (for example, the stemming and
-spelling databases will be inexistant or out of date). You just need to
+spelling databases will be inexistent or out of date). You just need to
 restart indexing at a later time to restore consistency. The indexing will
 restart at the interruption point (the full file tree will be traversed,
 but files that were indexed up to the interruption and for which the index
@@ -977,7 +977,7 @@ Chapter 3. Searching
 In most cases, you can enter the terms as you think them, even if they
 contain embedded punctuation or other non-textual characters. For example,
 Recoll can handle things like email addresses, or arbitrary cut and paste
-from another text window, punctation and all.
+from another text window, punctuation and all.

 The main case where you should enter text differently from how it is
 printed is for east-asian languages (Chinese, Japanese, Korean). Words
@@ -1015,7 +1015,7 @@ Chapter 3. Searching
 File name will specifically look for file names. The point of having a
 separate file name search is that wild card expansion can be performed
 more efficiently on a small subset of the index (allowing wild cards on
-the left of terms without excessive penality). Things to know:
+the left of terms without excessive penalty). Things to know:

 o White space in the entry should match white space in the file name,
 and is not treated specially.
@@ -1340,7 +1340,7 @@ Chapter 3. Searching
 search term (the next highlighted zone). If you select a search
 group from the dropdown list and click Next or Previous, the match
 list for this group will be walked. This is not the same as a text
-search, because the occurences will include non-exact matches (as
+search, because the occurrences will include non-exact matches (as
 caused by stemming or wildcards). The search will revert to the
 text mode as soon as you edit the entry area.
@@ -1433,7 +1433,7 @@ Chapter 3. Searching
 Click on the Show query details link at the top of the result page to see
 the query expansion.

-3.1.8.1. Avanced search: the "find" tab
+3.1.8.1. Advanced search: the "find" tab

 This part of the dialog lets you constructc a query by combining multiple
 clauses of different types. Each entry field is configurable for the
@@ -1473,7 +1473,7 @@ Chapter 3. Searching
 search for quick fox with the default slack will match the latter, and
 also a fox is a cunning and quick animal.

-3.1.8.2. Avanced search: the "filter" tab
+3.1.8.2. Advanced search: the "filter" tab

 This part of the dialog has several sections which allow filtering the
 results of a search according to a number of criteria
@@ -1507,7 +1507,7 @@ Chapter 3. Searching
 dirA/dirB would match either /dir1/dirA/dirB/myfile1 or
 /dir2/dirA/dirB/someother/myfile2.

-3.1.8.3. Avanced search history
+3.1.8.3. Advanced search history

 The advanced search tool memorizes the last 100 searches performed. You
 can walk the saved searches by using the up and down arrow keys while the
@@ -1541,7 +1541,7 @@ Chapter 3. Searching
 Regular expression

 This mode will accept a regular expression as input. Example:
-word[0-9]+. The expression is implicitely anchored at the
+word[0-9]+. The expression is implicitly anchored at the
 beginning. Ie: press will match pression but not expression. You
 can use .*press to match the latter, but be aware that this will
 cause a full index term list scan, which can be quite long.
@@ -1725,7 +1725,7 @@ Chapter 3. Searching
 IBM. Searching for the word inside a phrase (ie: "the IBM company") will
 only match the dotted abrreviation if you increase the phrase slack (using
 the advanced search panel control, or the o query language modifier).
-Literal occurences of the word will be matched normally.
+Literal occurrences of the word will be matched normally.

 3.1.13.3. Others
@@ -1734,7 +1734,7 @@ Chapter 3. Searching
 with email, for example only searching emails from a specific originator:
 search tips from:helpfulgui

-Ajusting the result table columns. When displaying results in table mode,
+Adjusting the result table columns. When displaying results in table mode,
 you can use a right click on the table headers to activate a pop-up menu
 which will let you adjust what columns are displayed. You can drag the
 column headers to adjust their order. You can click them to sort by the
@@ -2218,7 +2218,7 @@ Chapter 3. Searching
 occur in a number of circumstances:

 o When using multiple indexes it is a relatively common occurrence that
-some will actually reside on a remote volume, for exemple mounted via
+some will actually reside on a remote volume, for example mounted via
 NFS. In this case, the paths used to access the documents on the local
 machine are not necessarily the same than the ones used while indexing
 on the remote machine. For example, /home/me may have been used as a
@@ -2230,7 +2230,7 @@ Chapter 3. Searching
 disk, but it may happen that the disk is not mounted at the same place
 so that the documents paths from the index are invalid.

-o As a last exemple, one could imagine that a big directory has been
+o As a last example, one could imagine that a big directory has been
 moved, but that it is currently inconvenient to run the indexer.

 More generally, the path translation facility may be useful whenever the
@@ -2567,7 +2567,7 @@ Chapter 3. Searching

 Due to the way that Recoll processes wildcards inside dir path filtering
 clauses, they will have a multiplicative effect on the query size. A
-clause containg wildcards in several paths elements, like, for example,
+clause containing wildcards in several paths elements, like, for example,
 dir:/home/me/*/*/docdir, will almost certainly fail if your indexed tree
 is of any realistic size.
@@ -2609,7 +2609,7 @@ Chapter 3. Searching

 3.8. Desktop integration

-Being independant of the desktop type has its drawbacks: Recoll desktop
+Being independent of the desktop type has its drawbacks: Recoll desktop
 integration is minimal. However there are a few tools available:

 o The KDE KIO Slave was described in a previous section.
@@ -2834,7 +2834,7 @@ Chapter 4. Programming interface
 iso-8859-1 encoding is specified because it is not the utf-8 default,
 and not output by unrtf in the HTML header section.

-o application/x-chm is processed by a persistant handler. This is
+o application/x-chm is processed by a persistent handler. This is
 determined by the execm keyword.

 4.1.4. Input handler HTML output
@@ -2962,7 +2962,7 @@ Chapter 4. Programming interface
 using the appropriate directive in the definition of the result list
 paragraph format. All fields are displayed on the fields screen of the
 preview window (which you can reach through the right-click menu).
-This is independant of the fact that the search which produced the
+This is independent of the fact that the search which produced the
 results used the field or not.

 You can find more information in the section about the fields file, or in
@@ -3337,9 +3337,9 @@ Chapter 5. Installation and configuration
 exist any more.

 The package management tools will usually automatically deal with hard
-dependancies for packages obtained from a proper package repository. You
+dependencies for packages obtained from a proper package repository. You
 will have to deal with them by hand for downloaded packages (for example,
-when dpkg complains about missing dependancies).
+when dpkg complains about missing dependencies).

 In all cases, you will have to check or install supporting applications
 for the file types that you want to index beyond those that are natively
@@ -3362,7 +3362,7 @@ Chapter 5. Installation and configuration

 A list of common file types which need external commands follows. Many of
 the handlers need the iconv command, which is not always listed as a
-dependancy.
+dependency.

 Please note that, due to the relatively dynamic nature of this
 information, the most up to date version is now kept on
@@ -3615,7 +3615,7 @@ Chapter 5. Installation and configuration
 RECOLL_CONFTOP and RECOLL_CONFMID environment variables. Values from
 configuration files inside the top directory will override user ones,
 values from configuration files inside the middle directory will override
-system ones and be overriden by user ones. These two variables may be of
+system ones and be overridden by user ones. These two variables may be of
 use to applications which augment Recoll functionality, and need to add
 configuration data without disturbing the user's files. Please note that
 the two, currently single, values will probably be interpreted as
@@ -3755,7 +3755,7 @@ Chapter 5. Installation and configuration

 skippedNames

-A space-separated list of wilcard patterns for names of files or
+A space-separated list of wildcard patterns for names of files or
 directories that should be completely ignored. The list defined in
 the default file is:
@@ -4344,7 +4344,7 @@ Chapter 5. Installation and configuration

 mhmboxquirks

-This allows definining location-related quirks for the mailbox
+This allows defining location-related quirks for the mailbox
 handler. Currently only the tbird flag is defined, and it should
 be set for directories which hold Thunderbird data, as their
 folder format is weird.
@@ -4539,7 +4539,7 @@ Chapter 5. Installation and configuration
 The file has a section for any index which needs translations, either the
 main one or additional query indexes. The sections are named with the
 Xapian index directory names. No slash character should exist at the end
-of the paths (all comparisons are textual). An exemple should make things
+of the paths (all comparisons are textual). An example should make things
 sufficiently clear

 [/home/me/.recoll/xapiandb]
@@ -2,7 +2,7 @@
 Categories=Qt;Utility;Filesystem;Database;
 Comment=Find documents by specifying search terms
 Comment[ru]=Поиск документов по заданным условиям
-Comment[de]=Finde Dokumente durch angeben von Suchbegriffen
+Comment[de]=Finde Dokumente durch Angeben von Suchbegriffen
 Exec=recoll
 GenericName=Local Text Search
 GenericName[ru]=Локальный текстовый поиск
@@ -115,7 +115,7 @@ properly flush and close the index. This can also be done from the recoll
 GUI (menu entry: File/Stop_Indexing). After such an interruption, the index
 will be somewhat inconsistent because some operations which are normally
 performed at the end of the indexing pass will have been skipped (for
-example, the stemming and spelling databases will be inexistant or out of
+example, the stemming and spelling databases will be inexistent or out of
 date). You just need to restart indexing at a later time to restore
 consistency. The indexing will restart at the interruption point (the full
 file tree will be traversed, but files that were indexed up to the
@@ -1010,7 +1010,7 @@ alink="#0000FF">
 list in the configuration file (1.20 and later). This can
 be redefined for subdirectories.</p>
 <p>You can also define an exclusive list of MIME types to
-be indexed (no others will be indexed), by settting the
+be indexed (no others will be indexed), by setting the
 <a class="link" href=
 "#RCL.INSTALL.CONFIG.RECOLLCONF.INDEXEDMIMETYPES">indexedmimetypes</a>
 configuration variable. Example:</p>
@@ -1384,7 +1384,7 @@ alink="#0000FF">
 "command"><strong>recollindex</strong></span> program,
 used for creating or updating indexes, always works on a
 single index. The different configurations are entirely
-independant (no parameters are ever shared between
+independent (no parameters are ever shared between
 configurations when indexing).</p>
 <p>All the search interfaces (<span class=
 "command"><strong>recoll</strong></span>, <span class=
@@ -2247,7 +2247,7 @@ metadatacmds = ; <em class=
 installed, and if the the <a class="link" href=
 "#RCL.INSTALL.CONFIG.RECOLLCONF.PDFATTACH">pdfattach</a>
 configuration variable is set, the PDF input handler will
-try to extract PDF attachements for indexing as
+try to extract PDF attachments for indexing as
 sub-documents of the PDF file. This is disabled by
 default, because it slows down PDF indexing a bit even if
 not one attachment is ever found (PDF attachments are
@@ -2381,7 +2381,7 @@ metadatacmds = ; <em class=
 inconsistent because some operations which are normally
 performed at the end of the indexing pass will have been
 skipped (for example, the stemming and spelling databases
-will be inexistant or out of date). You just need to
+will be inexistent or out of date). You just need to
 restart indexing at a later time to restore consistency.
 The indexing will restart at the interruption point (the
 full file tree will be traversed, but files that were
@@ -3429,7 +3429,7 @@ fs.inotify.max_user_watches=32768
 "guilabel">Next</span> or <span class=
 "guilabel">Previous</span>, the match list for
 this group will be walked. This is not the same
-as a text search, because the occurences will
+as a text search, because the occurrences will
 include non-exact matches (as caused by stemming
 or wildcards). The search will revert to the text
 mode as soon as you edit the entry area.</p>
@@ -3540,7 +3540,7 @@ fs.inotify.max_user_watches=32768
 <p><span class="application">Recoll</span> keeps a
 history of searches. See <a class="link" href=
 "#RCL.SEARCH.GUI.COMPLEX.HISTORY" title=
-"Avanced search history">Advanced search history</a>.</p>
+"Advanced search history">Advanced search history</a>.</p>
 <p>The dialog has two tabs:</p>
 <div class="orderedlist">
 <ol class="orderedlist" type="1">
@@ -3570,7 +3570,7 @@ fs.inotify.max_user_watches=32768
 <div>
 <h4 class="title"><a name=
 "RCL.SEARCH.GUI.COMPLEX.TERMS" id=
-"RCL.SEARCH.GUI.COMPLEX.TERMS"></a>Avanced
+"RCL.SEARCH.GUI.COMPLEX.TERMS"></a>Advanced
 search: the "find" tab</h4>
 </div>
 </div>
@@ -3642,7 +3642,7 @@ fs.inotify.max_user_watches=32768
 <div>
 <h4 class="title"><a name=
 "RCL.SEARCH.GUI.COMPLEX.FILTER" id=
-"RCL.SEARCH.GUI.COMPLEX.FILTER"></a>Avanced
+"RCL.SEARCH.GUI.COMPLEX.FILTER"></a>Advanced
 search: the "filter" tab</h4>
 </div>
 </div>
@@ -3710,7 +3710,7 @@ fs.inotify.max_user_watches=32768
 <div>
 <h4 class="title"><a name=
 "RCL.SEARCH.GUI.COMPLEX.HISTORY" id=
-"RCL.SEARCH.GUI.COMPLEX.HISTORY"></a>Avanced
+"RCL.SEARCH.GUI.COMPLEX.HISTORY"></a>Advanced
 search history</h4>
 </div>
 </div>
@@ -3771,7 +3771,7 @@ fs.inotify.max_user_watches=32768
 <p>This mode will accept a regular expression as
 input. Example: <em class=
 "replaceable"><code>word[0-9]+</code></em>. The
-expression is implicitely anchored at the
+expression is implicitly anchored at the
 beginning. Ie: <em class=
 "replaceable"><code>press</code></em> will match
 <em class="replaceable"><code>pression</code></em>
@@ -4103,7 +4103,7 @@ fs.inotify.max_user_watches=32768
 dotted abrreviation if you increase the phrase slack
 (using the advanced search panel control, or the
 <code class="literal">o</code> query language
-modifier). Literal occurences of the word will be
+modifier). Literal occurrences of the word will be
 matched normally.</p>
 </div>
 <div class="sect3">
@@ -4124,7 +4124,7 @@ fs.inotify.max_user_watches=32768
 for example only searching emails from a specific
 originator: <code class="literal">search tips
 from:helpfulgui</code></p>
-<p><b>Ajusting the result table columns. </b>When
+<p><b>Adjusting the result table columns. </b>When
 displaying results in table mode, you can use a right
 click on the table headers to activate a pop-up menu
 which will let you adjust what columns are displayed.
@@ -4578,7 +4578,7 @@ fs.inotify.max_user_watches=32768
 QTextBrowser widget to display the HTML, which may be
 necessary if the ones above are not ported on the
 system, or to reduce the application size and
-dependancies. There are limits to what you can do in
+dependencies. There are limits to what you can do in
 this case, but it is still possible to decide what data
 each result will contain, and how it will be
 displayed.</p>
@@ -5563,7 +5563,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
 "application">Recoll</span> processes wildcards inside
 <code class="literal">dir</code> path filtering
 clauses, they will have a multiplicative effect on the
-query size. A clause containg wildcards in several
+query size. A clause containing wildcards in several
 paths elements, like, for example, <code class=
 "literal">dir:</code><em class=
 "replaceable"><code>/home/me/*/*/docdir</code></em>,
@@ -5733,7 +5733,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
 <li class="listitem">
 <p>When using multiple indexes it is a relatively
 common occurrence that some will actually reside on a
-remote volume, for exemple mounted via NFS. In this
+remote volume, for example mounted via NFS. In this
 case, the paths used to access the documents on the
 local machine are not necessarily the same than the
 ones used while indexing on the remote machine. For
@@ -5753,7 +5753,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
 invalid.</p>
 </li>
 <li class="listitem">
-<p>As a last exemple, one could imagine that a big
+<p>As a last example, one could imagine that a big
 directory has been moved, but that it is currently
 inconvenient to run the indexer.</p>
 </li>
@@ -5883,7 +5883,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
 </div>
 </div>
 </div>
-<p>Being independant of the desktop type has its drawbacks:
+<p>Being independent of the desktop type has its drawbacks:
 <span class="application">Recoll</span> desktop integration
 is minimal. However there are a few tools available:</p>
 <div class="itemizedlist">
@@ -6398,7 +6398,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
 </li>
 <li class="listitem">
 <p><code class="literal">application/x-chm</code>
-is processed by a persistant handler. This is
+is processed by a persistent handler. This is
 determined by the <code class=
 "literal">execm</code> keyword.</p>
 </li>
@@ -6640,7 +6640,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
 "The result list format">result list paragraph
 format</a>. All fields are displayed on the fields
 screen of the preview window (which you can reach
-through the right-click menu). This is independant of
+through the right-click menu). This is independent of
 the fact that the search which produced the results
 used the field or not.</p>
 </li>
@@ -7765,11 +7765,11 @@ hasextract = False
 nothing else to install.</p>
 <p>On <span class="application">Unix</span>-like systems,
 the package management tools will automatically install
-hard dependancies for packages obtained from a proper
+hard dependencies for packages obtained from a proper
 package repository. You will have to deal with them by hand
 for downloaded packages (for example, when <span class=
 "command"><strong>dpkg</strong></span> complains about
-missing dependancies).</p>
+missing dependencies).</p>
 <p>In all cases, you will have to check or install
 <a class="link" href="#RCL.INSTALL.EXTERNAL" title=
 "5.2. Supporting packages">supporting applications</a>
@@ -7902,12 +7902,12 @@ hasextract = False
 </div>
 <p>The following prerequisites are described in broad
 terms and not as specific package names (which will
-depend on the exact platform). The dependancies should be
+depend on the exact platform). The dependencies should be
 available as packages on most common Unix derivatives,
 and it should be quite uncommon that you would have to
 build one of them.</p>
 <p>If you do not need the GUI, you can avoid all GUI
-dependancies by disabling its build. (See the configure
+dependencies by disabling its build. (See the configure
 section further).</p>
 <p>The shopping list:</p>
 <div class="itemizedlist">
@@ -7938,7 +7938,7 @@ hasextract = False
 <p>For building the documentation: the <span class=
 "command"><strong>xsltproc</strong></span> command,
 and the Docbook XML and style sheet files. You can
-avoid this dependancy by disabling documentation
+avoid this dependency by disabling documentation
 building with the <code class=
 "literal">--disable-userdoc</code> <span class=
 "command"><strong>configure</strong></span>
@@ -7963,7 +7963,7 @@ hasextract = False
 <p>Development files for <a class="ulink" href=
 "http://qt-project.org/downloads" target=
 "_top"><span class="application">Qt 5</span></a> .
-and its own dependancies (X11 etc.)</p>
+and its own dependencies (X11 etc.)</p>
 </li>
 <li class="listitem">
 <p>Development files for libxslt</p>
@@ -8296,7 +8296,7 @@ hasextract = False
 from configuration files inside the top directory will
 override user ones, values from configuration files inside
 the middle directory will override system ones and be
-overriden by user ones. These two variables may be of use
+overridden by user ones. These two variables may be of use
 to applications which augment <span class=
 "application">Recoll</span> functionality, and need to add
 configuration data without disturbing the user's files.
@ -10164,7 +10164,7 @@ hasextract = False
|
||||
"literal">[guifilters]</code> section, where each control
|
||||
is defined by a variable naming a query language
|
||||
fragment.</p>
|
||||
<p>A simple exemple will hopefully make things
|
||||
<p>A simple example will hopefully make things
|
||||
clearer.</p>
|
||||
<pre class="programlisting">[guifilters]
|
||||
|
||||
@ -10184,7 +10184,7 @@ System Docs = dir:/usr/share/doc
|
||||
<p>Any name text before a colon character will be erased
|
||||
in the display, but used for sorting. You can use this to
|
||||
display the checkboxes in any order you like. For
|
||||
exemple, the following would do exactly the same as
|
||||
example, the following would do exactly the same as
|
||||
above, but ordering the checkboxes in the reverse
|
||||
order.</p>
|
||||
<pre class="programlisting">[guifilters]
|
||||
@ -10345,7 +10345,7 @@ other = rclcat:other
|
||||
indexes. The sections are named with the <span class=
|
||||
"application">Xapian</span> index directory names. No
|
||||
slash character should exist at the end of the paths (all
|
||||
comparisons are textual). An exemple should make things
|
||||
comparisons are textual). An example should make things
|
||||
sufficiently clear</p>
|
||||
<pre class="programlisting">
|
||||
[/home/me/.recoll/xapiandb]
|
||||
|
||||
@ -519,7 +519,7 @@
|
||||
redefined for subdirectories.</para>
|
||||
|
||||
<para>You can also define an exclusive list of MIME types to be
|
||||
indexed (no others will be indexed), by settting
|
||||
indexed (no others will be indexed), by setting
|
||||
the <link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.INDEXEDMIMETYPES">
|
||||
indexedmimetypes</link>
|
||||
configuration variable. Example:<programlisting>
|
||||
@ -797,7 +797,7 @@
|
||||
|
||||
<para>The <command>recollindex</command> program, used for creating
|
||||
or updating indexes, always works on a single index. The different
|
||||
configurations are entirely independant (no parameters are ever
|
||||
configurations are entirely independent (no parameters are ever
|
||||
shared between configurations when indexing). </para>
|
||||
|
||||
<para>All the search interfaces (<command>recoll</command>,
|
||||
@ -1502,7 +1502,7 @@ metadatacmds = ; <replaceable>tags</replaceable> = tmsu tags %f
|
||||
the
|
||||
<link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.PDFATTACH">pdfattach</link>
|
||||
configuration variable is set, the PDF input handler will try to
|
||||
extract PDF attachements for indexing as sub-documents of the PDF
|
||||
extract PDF attachments for indexing as sub-documents of the PDF
|
||||
file. This is disabled by default, because it slows down PDF
|
||||
indexing a bit even if not one attachment is ever found (PDF
|
||||
attachments are uncommon in my experience).</para>
|
||||
@ -1609,7 +1609,7 @@ metadatacmds = ; <replaceable>tags</replaceable> = tmsu tags %f
|
||||
inconsistent because some operations which are normally
|
||||
performed at the end of the indexing pass will have been
|
||||
skipped (for example, the stemming and spelling databases
|
||||
will be inexistant or out of date). You just need to restart
|
||||
will be inexistent or out of date). You just need to restart
|
||||
indexing at a later time to restore consistency. The
|
||||
indexing will restart at the interruption point (the full
|
||||
file tree will be traversed, but files that were indexed up
|
||||
@ -2445,7 +2445,7 @@ fs.inotify.max_user_watches=32768
|
||||
from the dropdown list and click <guilabel>Next</guilabel>
|
||||
or <guilabel>Previous</guilabel>, the match list for this
|
||||
group will be walked. This is not the same as a text
|
||||
search, because the occurences will include non-exact
|
||||
search, because the occurrences will include non-exact
|
||||
matches (as caused by stemming or wildcards). The search
|
||||
will revert to the text mode as soon as you edit the
|
||||
entry area.</para></listitem>
|
||||
@ -2577,7 +2577,7 @@ fs.inotify.max_user_watches=32768
|
||||
the top of the result page to see the query expansion.</para>
|
||||
|
||||
<sect3 id="RCL.SEARCH.GUI.COMPLEX.TERMS">
|
||||
<title>Avanced search: the "find" tab</title>
|
||||
<title>Advanced search: the "find" tab</title>
|
||||
|
||||
<para>This part of the dialog lets you constructc a query by
|
||||
combining multiple clauses of different types. Each entry
|
||||
@ -2635,7 +2635,7 @@ fs.inotify.max_user_watches=32768
|
||||
</sect3>
|
||||
|
||||
<sect3 id="RCL.SEARCH.GUI.COMPLEX.FILTER">
|
||||
<title>Avanced search: the "filter" tab</title>
|
||||
<title>Advanced search: the "filter" tab</title>
|
||||
|
||||
<para>This part of the dialog has several sections which allow
|
||||
filtering the results of a search according to a number of
|
||||
@ -2689,7 +2689,7 @@ fs.inotify.max_user_watches=32768
|
||||
</sect3>
|
||||
|
||||
<sect3 id="RCL.SEARCH.GUI.COMPLEX.HISTORY">
|
||||
<title>Avanced search history</title>
|
||||
<title>Advanced search history</title>
|
||||
|
||||
<para>The advanced search tool memorizes the last 100 searches
|
||||
performed. You can walk the saved searches by using the up and
|
||||
@ -2742,7 +2742,7 @@ fs.inotify.max_user_watches=32768
|
||||
<listitem><para>This mode will accept a regular expression
|
||||
as input. Example:
|
||||
<replaceable>word[0-9]+</replaceable>. The expression is
|
||||
implicitely anchored at the beginning. Ie:
|
||||
implicitly anchored at the beginning. Ie:
|
||||
<replaceable>press</replaceable> will match
|
||||
<replaceable>pression</replaceable> but not
|
||||
<replaceable>expression</replaceable>. You can use
|
||||
@ -3027,7 +3027,7 @@ fs.inotify.max_user_watches=32768
|
||||
company"</literal>) will only match the dotted abrreviation
|
||||
if you increase the phrase slack (using the advanced search
|
||||
panel control, or the <literal>o</literal> query language
|
||||
modifier). Literal occurences of the word will be matched
|
||||
modifier). Literal occurrences of the word will be matched
|
||||
normally.</para>
|
||||
</formalpara>
|
||||
|
||||
@ -3046,7 +3046,7 @@ fs.inotify.max_user_watches=32768
|
||||
</para>
|
||||
</formalpara>
|
||||
|
||||
<formalpara><title>Ajusting the result table columns</title>
|
||||
<formalpara><title>Adjusting the result table columns</title>
|
||||
<para>When displaying results in table mode, you can use a
|
||||
right click on the table headers to activate a pop-up menu
|
||||
which will let you adjust what columns are displayed. You can
|
||||
@ -3458,7 +3458,7 @@ fs.inotify.max_user_watches=32768
|
||||
<para>It is also possible to build &RCL; to use a simpler Qt
|
||||
QTextBrowser widget to display the HTML, which may be necessary
|
||||
if the ones above are not ported on the system, or to reduce
|
||||
the application size and dependancies. There are limits to what
|
||||
the application size and dependencies. There are limits to what
|
||||
you can do in this case, but it is still possible to decide
|
||||
what data each result will contain, and how it will be
|
||||
displayed.</para>
|
||||
@ -4251,7 +4251,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
||||
<para>Due to the way that &RCL; processes wildcards
|
||||
inside <literal>dir</literal> path filtering clauses, they
|
||||
will have a multiplicative effect on the query size. A clause
|
||||
containg wildcards in several paths elements, like, for
|
||||
containing wildcards in several paths elements, like, for
|
||||
example,
|
||||
<literal>dir:</literal><replaceable>/home/me/*/*/docdir</replaceable>,
|
||||
will almost certainly fail if your indexed tree is of any realistic
|
||||
@ -4399,7 +4399,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
||||
<itemizedlist>
|
||||
<listitem><para>When using multiple indexes it is a relatively common
|
||||
occurrence that some will actually reside on a remote volume, for
|
||||
exemple mounted via NFS. In this case, the paths used to access
|
||||
example mounted via NFS. In this case, the paths used to access
|
||||
the documents on the local machine are not necessarily the same
|
||||
than the ones used while indexing on the remote machine. For
|
||||
example, <filename>/home/me</filename> may have been used as
|
||||
@ -4415,7 +4415,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
||||
that the documents paths from the index are
|
||||
invalid.</para></listitem>
|
||||
|
||||
<listitem><para>As a last exemple, one could imagine that a big
|
||||
<listitem><para>As a last example, one could imagine that a big
|
||||
directory has been moved, but that it is currently
|
||||
inconvenient to run the indexer.</para></listitem>
|
||||
</itemizedlist>
|
||||
@ -4527,7 +4527,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
||||
<sect1 id="RCL.SEARCH.DESKTOP">
|
||||
<title>Desktop integration</title>
|
||||
|
||||
<para>Being independant of the desktop type has its drawbacks: &RCL;
|
||||
<para>Being independent of the desktop type has its drawbacks: &RCL;
|
||||
desktop integration is minimal. However there are a few tools
|
||||
available:
|
||||
<itemizedlist>
|
||||
@ -4909,7 +4909,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
||||
<command>unrtf</command> in the HTML header section.</para>
|
||||
</listitem>
|
||||
<listitem><para><literal>application/x-chm</literal> is processed
|
||||
by a persistant handler. This is determined by the
|
||||
by a persistent handler. This is determined by the
|
||||
<literal>execm</literal> keyword.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
@ -5118,7 +5118,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
||||
<link linkend="RCL.SEARCH.GUI.CUSTOM.RESLIST">result list paragraph format</link>.
|
||||
All fields are displayed on the fields screen of
|
||||
the preview window (which you can reach through the right-click
|
||||
menu). This is independant of the fact that the search which
|
||||
menu). This is independent of the fact that the search which
|
||||
produced the results used the field or not.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
@ -6003,10 +6003,10 @@ hasextract = False
|
||||
file, there is nothing else to install.</para>
|
||||
|
||||
<para>On &LIN;, the package management tools will automatically
|
||||
install hard dependancies for packages obtained from a proper package
|
||||
install hard dependencies for packages obtained from a proper package
|
||||
repository. You will have to deal with them by hand for downloaded
|
||||
packages (for example, when <command>dpkg</command> complains about
|
||||
missing dependancies).</para>
|
||||
missing dependencies).</para>
|
||||
|
||||
<para>In all cases, you will have to check or install
|
||||
<link linkend="RCL.INSTALL.EXTERNAL">supporting applications</link>
|
||||
@ -6092,12 +6092,12 @@ hasextract = False
|
||||
|
||||
<para>The following prerequisites are described in broad terms and
|
||||
not as specific package names (which will depend on the exact
|
||||
platform). The dependancies should be available as packages on most
|
||||
platform). The dependencies should be available as packages on most
|
||||
common Unix derivatives, and it should be quite uncommon that you
|
||||
would have to build one of them.</para>
|
||||
|
||||
<para>If you do not need the GUI, you can avoid all GUI
|
||||
dependancies by disabling its build. (See the configure section
|
||||
dependencies by disabling its build. (See the configure section
|
||||
further).</para>
|
||||
|
||||
<para>The shopping list:</para>
|
||||
@ -6116,7 +6116,7 @@ hasextract = False
|
||||
|
||||
<listitem><para>For building the documentation: the
|
||||
<command>xsltproc</command> command, and the Docbook XML and
|
||||
style sheet files. You can avoid this dependancy by disabling
|
||||
style sheet files. You can avoid this dependency by disabling
|
||||
documentation building with the
|
||||
<literal>--disable-userdoc</literal> <command>configure</command>
|
||||
option.</para></listitem>
|
||||
@ -6136,7 +6136,7 @@ hasextract = False
|
||||
|
||||
<listitem> <para>Development files for
|
||||
<ulink url="http://qt-project.org/downloads"><application>Qt 5</application> </ulink>.
|
||||
and its own dependancies (X11 etc.)</para> </listitem>
|
||||
and its own dependencies (X11 etc.)</para> </listitem>
|
||||
|
||||
<listitem><para>Development files for libxslt</para></listitem>
|
||||
|
||||
@ -6405,7 +6405,7 @@ hasextract = False
|
||||
variables. Values from configuration files inside the top
|
||||
directory will override user ones, values from configuration
|
||||
files inside the middle directory will override system ones
|
||||
and be overriden by user ones. These two variables may be of
|
||||
and be overridden by user ones. These two variables may be of
|
||||
use to applications which augment &RCL; functionality, and
|
||||
need to add configuration data without disturbing the user's
|
||||
files. Please note that the two, currently single, values will
|
||||
@ -6787,7 +6787,7 @@ hasextract = False
|
||||
each control is defined by a variable naming a query language
|
||||
fragment.</para>
|
||||
|
||||
<para>A simple exemple will hopefully make things clearer.</para>
|
||||
<para>A simple example will hopefully make things clearer.</para>
|
||||
|
||||
<programlisting>[guifilters]
|
||||
|
||||
@ -6808,7 +6808,7 @@ System Docs = dir:/usr/share/doc
|
||||
|
||||
<para>Any name text before a colon character will be erased in the
|
||||
display, but used for sorting. You can use this to display the
|
||||
checkboxes in any order you like. For exemple, the following would
|
||||
checkboxes in any order you like. For example, the following would
|
||||
do exactly the same as above, but ordering the checkboxes in the
|
||||
reverse order.</para>
|
||||
|
||||
@ -6952,7 +6952,7 @@ other = rclcat:other
|
||||
translations, either the main one or additional query
|
||||
indexes. The sections are named with the &XAP; index
|
||||
directory names. No slash character should exist at the end
|
||||
of the paths (all comparisons are textual). An exemple
|
||||
of the paths (all comparisons are textual). An example
|
||||
should make things sufficiently clear</para>
|
||||
|
||||
<programlisting>
|
||||
|
||||
@ -1,14 +1,14 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Try to guess a text's language and character set by checking how it matches lists of
|
||||
common words. This is not a primary method of detection because it's slow and unreliable, but it
|
||||
may be a help in discrimating, for exemple, before european languages using relatively close
|
||||
may be a help in discrimating, for example, before european languages using relatively close
|
||||
variations of iso-8859.
|
||||
This is used in association with a zip file containing a number of stopwords list: rcllatinstops.zip
|
||||
|
||||
As a note, I am looking for a good iso-8859-7 stop words list for greek, the only ones I found
|
||||
were utf-8 and there are errors when transcoding to iso-8859-7. I guess that there is something
|
||||
about Greek accents that I don't know and would enable fixing this (some kind of simplification
|
||||
allowing transliteration from utf-8 to iso-8859-7). An exemple of difficulty is the small letter
|
||||
allowing transliteration from utf-8 to iso-8859-7). An example of difficulty is the small letter
|
||||
epsilon with dasia (in unicode but not iso). Can this be replaced by either epsilon or epsilon
|
||||
with acute accent ?
|
||||
"""
|
||||
|
||||
@ -17,7 +17,7 @@
|
||||
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
|
||||
########################################################
|
||||
|
||||
# Running OCR programs for Recoll. This is excecuted from,
|
||||
# Running OCR programs for Recoll. This is executed from,
|
||||
# e.g. rclpdf.py if pdftotext returns no data.
|
||||
#
|
||||
# The script tries to retrieve the data from the ocr cache, else it
|
||||
|
||||
@ -34,7 +34,7 @@ import io
|
||||
import keyword, token, tokenize
|
||||
|
||||
#############################################################################
|
||||
### Python Source Parser (does Hilighting)
|
||||
### Python Source Parser (does Highlighting)
|
||||
#############################################################################
|
||||
|
||||
_KEYWORD = token.NT_OFFSET + 1
|
||||
|
||||
@ -172,7 +172,7 @@ awk '
|
||||
s=substr($0, RSTART+5, RLENGTH-5)
|
||||
printf("%s", s)
|
||||
# Note: No way to know if this ends a paragraph, so no "<br>"
|
||||
# but it migth be good to have one in some instances
|
||||
# but it might be good to have one in some instances
|
||||
}
|
||||
else if (match($0, "<para+")) {
|
||||
printf("<br>")
|
||||
|
||||
@ -34,9 +34,9 @@ using namespace std;
|
||||
/// Identification of file from contents. This is called for files with
|
||||
/// unrecognized extensions.
|
||||
///
|
||||
/// The system 'file' utility does not always work for us. For exemple
|
||||
/// The system 'file' utility does not always work for us. For example
|
||||
/// it will mistake mail folders for simple text files if there is no
|
||||
/// 'Received' header, which would be the case, for exemple in a
|
||||
/// 'Received' header, which would be the case, for example in a
|
||||
/// 'Sent' folder. Also "file -i" does not exist on all systems, and
|
||||
/// is quite costly to execute.
|
||||
/// So we first call the internal file identifier, which currently
|
||||
@ -99,7 +99,7 @@ static string mimetypefromdata(RclConfig *cfg, const string &fn, bool usfc)
|
||||
// when 'file' believes that the file name is binary
|
||||
// xdg-mime only outputs the MIME type.
|
||||
|
||||
// If there is no colon and there is a slash, this is hopefuly
|
||||
// If there is no colon and there is a slash, this is hopefully
|
||||
// the mime type
|
||||
if (result.find_first_of(":") == string::npos &&
|
||||
result.find_first_of("/") != string::npos) {
|
||||
|
||||
@ -288,7 +288,7 @@ public:
|
||||
|
||||
// Build a list of things to index, then call purgefiles and/or
|
||||
// indexfiles. This is basically the same as find xxx | recollindex
|
||||
// -i [-e] without the find (so, simpler but less powerfull)
|
||||
// -i [-e] without the find (so, simpler but less powerful)
|
||||
bool recursive_index(RclConfig *config, const string& top,
|
||||
const vector<string>& selpats)
|
||||
{
|
||||
@ -363,7 +363,7 @@ static bool createstemdb(RclConfig *config, const string &lang)
|
||||
return confindexer->createStemDb(lang);
|
||||
}
|
||||
|
||||
// Check that topdir entries are valid (successfull tilde exp + abs
|
||||
// Check that topdir entries are valid (successful tilde exp + abs
|
||||
// path) or fail.
|
||||
// In addition, topdirs, skippedPaths, daemSkippedPaths entries should
|
||||
// match existing files or directories. Warn if they don't
|
||||
|
||||
@ -83,7 +83,7 @@ Misc build problems:
|
||||
|
||||
KUBUNTU 8.10 (updated to 2008-27-11)
|
||||
------------------------------------
|
||||
cmake generates a bad dependancy on
|
||||
cmake generates a bad dependency on
|
||||
/build/buildd/kde4libs-4.1.2/obj-i486-linux-gnu/lib/libkdecore.so
|
||||
inside CMakeFiles/kio_recoll.dir/build.make
|
||||
|
||||
|
||||
@ -222,7 +222,7 @@ bool RecollProtocol::syncSearch(const QueryDesc &qd)
|
||||
}
|
||||
|
||||
// This is used by the html interface, but also by the directory one
|
||||
// when doing file copies for exemple. This is the central dispatcher
|
||||
// when doing file copies for example. This is the central dispatcher
|
||||
// for requests, it has to know a little about both models.
|
||||
void RecollProtocol::get(const KUrl& url)
|
||||
{
|
||||
|
||||
@ -224,7 +224,7 @@ bool RecollProtocol::syncSearch(const QueryDesc& qd)
|
||||
}
|
||||
|
||||
// This is used by the html interface, but also by the directory one
|
||||
// when doing file copies for exemple. This is the central dispatcher
|
||||
// when doing file copies for example. This is the central dispatcher
|
||||
// for requests, it has to know a little about both models.
|
||||
void RecollProtocol::get(const QUrl& url)
|
||||
{
|
||||
|
||||
@ -364,7 +364,7 @@ class CHMFile:
|
||||
'''
|
||||
if self.file:
|
||||
# path = os.path.abspath(document) # wtf?? the index contents
|
||||
# are independant of the os !
|
||||
# are independent of the os !
|
||||
path = document
|
||||
return chmlib.chm_resolve_object(self.file, path)
|
||||
else:
|
||||
|
||||
@ -219,7 +219,7 @@ class ConfTree(ConfSimple):
|
||||
class ConfStack(object):
|
||||
""" A ConfStack manages the superposition of a list of Configuration
|
||||
objects. Values are looked for in each object from the list until found.
|
||||
This typically provides for defaults overriden by sparse values in the
|
||||
This typically provides for defaults overridden by sparse values in the
|
||||
topmost file."""
|
||||
|
||||
def __init__(self, nm, dirs, tp = 'simple'):
|
||||
|
||||
@ -1,6 +1,6 @@
|
||||
#!/usr/bin/env python
|
||||
__doc__ = """
|
||||
An exemple indexer for an arbitrary multi-document file format.
|
||||
An example indexer for an arbitrary multi-document file format.
|
||||
Not supposed to run ''as-is'' or be really useful.
|
||||
|
||||
''Lookup'' notes file indexing
|
||||
|
||||
@ -434,7 +434,7 @@ void RclMain::execViewer(const map<string, string>& subs, bool enterHistory,
|
||||
lcmd.push_back(ncmd);
|
||||
}
|
||||
|
||||
// Also substitute inside the unsplitted command line and display
|
||||
// Also substitute inside the unsplit command line and display
|
||||
// in status bar
|
||||
pcSubst(cmd, ncmd, subs);
|
||||
#ifndef _WIN32
|
||||
|
||||
@ -135,7 +135,7 @@ public:
|
||||
virtual bool takeword(const std::string& term, int pos, int bts, int bte) {
|
||||
LOGDEB1("takeword: [" << term << "] bytepos: "<<bts<<":"<<bte<<endl);
|
||||
// Limit time taken with monster documents. The resulting
|
||||
// abstract will be incorrect or inexistant, but this is
|
||||
// abstract will be incorrect or inexistent, but this is
|
||||
// better than taking forever (the default cutoff value comes
|
||||
// from the snippetMaxPosWalk configuration parameter, and is
|
||||
// 10E6)
|
||||
|
||||
@ -131,7 +131,7 @@ void Query::Native::setDbWideQTermsFreqs()
|
||||
// retrieving the Within Document Frequencies and multiplying by
|
||||
// overal term frequency, then using log-based thresholds.
|
||||
// 2012: it's not too clear to me why exactly we do the log thresholds thing.
|
||||
// Preferring terms wich are rare either or both in the db and the document
|
||||
// Preferring terms which are rare either or both in the db and the document
|
||||
// seems reasonable though
|
||||
// To avoid setting a high quality for a low frequency expansion of a
|
||||
// common stem, which seems wrong, we group the terms by
|
||||
@ -395,7 +395,7 @@ void Query::Native::abstractPopulateQTerm(
|
||||
// the neighboring positions marked, populate the neighbours: for each
|
||||
// term in the document, walk its position list and populate slots
|
||||
// around the query terms. We arbitrarily truncate the list to avoid
|
||||
// taking forever. If we do cutoff, the abstract may be inconsistant
|
||||
// taking forever. If we do cutoff, the abstract may be inconsistent
|
||||
// (missing words, potentially altering meaning), which is bad.
|
||||
void Query::Native::abstractPopulateContextTerms(
|
||||
Xapian::Database& xrdb,
|
||||
@ -524,7 +524,7 @@ int Query::Native::abstractFromIndex(
|
||||
// populating the adjacent slots.
|
||||
unsigned int maxpos = 0;
|
||||
|
||||
// Total number of occurences for all terms. We stop when we have too much
|
||||
// Total number of occurrences for all terms. We stop when we have too much
|
||||
unsigned int totaloccs = 0;
|
||||
|
||||
// First pass to populate the sparse document: we walk the term
|
||||
@ -585,7 +585,7 @@ int Query::Native::abstractFromIndex(
|
||||
LOGABS("makeAbstract:" << chron.millis() <<
|
||||
"mS:chosen number of positions " << totaloccs << "\n");
|
||||
|
||||
// This can happen if there are term occurences in the keywords
|
||||
// This can happen if there are term occurrences in the keywords
|
||||
// etc. but not elsewhere ?
|
||||
if (totaloccs == 0) {
|
||||
LOGDEB("makeAbstract: no occurrences\n");
|
||||
@ -615,7 +615,7 @@ int Query::Native::abstractFromIndex(
|
||||
// query terms. This can either uses the index position lists, or the
|
||||
// stored document text, with very different implementations.
|
||||
//
|
||||
// DatabaseModified and other general exceptions are catched and
|
||||
// DatabaseModified and other general exceptions are caught and
|
||||
// possibly retried by our caller.
|
||||
//
|
||||
// @param[out] vabs the abstract is returned as a vector of snippets.
|
||||
|
||||
@ -596,7 +596,7 @@ private:
|
||||
|
||||
bool getDoc(const std::string& udi, int idxi, Doc& doc);
|
||||
|
||||
/* Copyconst and assignement private and forbidden */
|
||||
/* Copyconst and assignment private and forbidden */
|
||||
Db(const Db &) {}
|
||||
Db& operator=(const Db &) {return *this;};
|
||||
};
|
||||
|
||||
@ -37,7 +37,7 @@ class Query;
|
||||
|
||||
#ifdef IDX_THREADS
|
||||
// Task for the index update thread. This can be
|
||||
// - add/update for a new / update documment
|
||||
// - add/update for a new / update document
|
||||
// - delete for a deleted document
|
||||
// - purgeOrphans when a multidoc file is updated during a partial pass (no
|
||||
// general purge). We want to remove subDocs that possibly don't
|
||||
|
||||
@ -62,7 +62,7 @@ class SearchDataClauseDist;
|
||||
|
||||
The content of each clause when added may not be fully parsed yet
|
||||
(may come directly from a gui field). It will be parsed and may be
|
||||
translated to several queries in the Xapian sense, for exemple
|
||||
translated to several queries in the Xapian sense, for example
|
||||
several terms and phrases as would result from
|
||||
["this is a phrase" term1 term2] .
|
||||
|
||||
|
||||
@ -142,7 +142,7 @@ public:
|
||||
// It may happen in some weird cases that the output from
|
||||
// unac is empty (if the word actually consisted entirely
|
||||
// of diacritics ...) The consequence is that a phrase
|
||||
// search won't work without addional slack.
|
||||
// search won't work without additional slack.
|
||||
return true;
|
||||
}
|
||||
|
||||
|
||||