2261 Commits

Author SHA1 Message Date
Jean-Francois Dockes
0d24b5620b Make unac suppress combining accents found in input. Input in decomposed form was previously not unaccented 2011-11-04 21:06:48 +01:00
Jean-Francois Dockes
ea61e85b8f multi-doc filter: getnext error would cause uncaught exception because of access to uninitialized eof variable 2011-11-04 17:32:14 +01:00
Jean-Francois Dockes
fcf65e3118 Ensure that configure runs even if neither fam nor inotify are available 2011-11-04 17:30:15 +01:00
Jean-Francois Dockes
4a84b6afd2 Ensure that configure runs even if neither fam nor inotify are available 2011-11-04 17:30:12 +01:00
Jean-Francois Dockes
49554e42c2 Factorized common text transcoding code in separate module 2011-10-20 17:53:42 +02:00
Jean-Francois Dockes
f544b28b4a Transcode mh_execm text/plain output like we do for mh_exec. Adjust handling of transcoding errors. These changes should fix most cases of non-utf8 text making it to unac/index 2011-10-20 14:00:38 +02:00
Jean-Francois Dockes
d94a4ec315 doc 2011-10-20 13:45:49 +02:00
Jean-Francois Dockes
90233c0426 doc 2011-10-20 13:39:44 +02:00
Jean-Francois Dockes
8d52e928d1 increase slack for automatic phrases 2011-10-20 13:25:33 +02:00
Jean-Francois Dockes
6c72454396 generate acronyms for dotted abbrevs, e.g. O.E.C.D -> OECD 2011-10-20 13:24:29 +02:00
Jean-Francois Dockes
348421eae7 glitch prevented autophrase from being set by default 2011-10-20 13:22:57 +02:00
Jean-Francois Dockes
3853c5c0da Build the real-time monitor by default on FreeBSD (depend on USE_FAM). Fix a few glitches in the fam/gamin version 2011-10-14 14:06:24 +02:00
Jean-Francois Dockes
6d82d83037 make all ENOENT errors non-fatal: files and dirs disappear. Reset error string when it is retrieved to avoid accumulating memory in long-running programs 2011-10-14 14:05:33 +02:00
Jean-Francois Dockes
ccd58e0843 always build (but not install) recollq 2011-10-14 14:03:59 +02:00
Jean-Francois Dockes
d6c3853de3 doc 2011-10-14 11:47:53 +02:00
Jean-Francois Dockes
f653b0025e define default man viewer 2011-10-14 11:47:35 +02:00
Jean-Francois Dockes
85191eba16 indexing could crash on different "file -i" output for some (binary) file names 2011-10-13 19:33:38 +02:00
Jean-Francois Dockes
e8f63ec124 The mime identification could potentially get a bad length exception while processing garbled "file" output 2011-10-13 16:38:26 +02:00
Jean-Francois Dockes
56fe54412f Protect against deadlock when using fam/gamin by adding a small timeout to the peek for events done between add calls. Add alarm to the addwatch call in case the deadlock happens anyway 2011-10-13 15:20:28 +02:00
Jean-Francois Dockes
bed77d3095 comment 2011-10-13 11:22:56 +02:00
Jean-Francois Dockes
b37ea1915a real time index: generate MODIFY event when receiving inotify MOVED_TO. We do not seem to receive a modify as was apparently the case at some point 2011-10-12 18:30:47 +02:00
Jean-Francois Dockes
0860b559ee get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now 2011-10-12 17:55:58 +02:00
Jean-Francois Dockes
a2c9d2a82b simplify initial memory allocs by using realloc in all cases 2011-10-10 18:44:46 +02:00
Jean-Francois Dockes
d2ad20b4c7 return from main routine instead of exiting to ensure clean-up of temp objects 2011-10-10 18:41:05 +02:00
Jean-Francois Dockes
4a7ff398b2 comments 2011-10-07 08:05:36 +02:00
Jean-Francois Dockes
5fd31172f5 New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile 2011-10-07 07:53:49 +02:00
Jean-Francois Dockes
61bf17aa46 moved routine around to avoid link issues 2011-10-06 13:48:57 +02:00
Jean-Francois Dockes
eda494153e simplify calls to isStop 2011-10-05 17:25:35 +02:00
Jean-Francois Dockes
acb297c9df comments + move the position jump to text_to_words 2011-10-04 16:33:44 +02:00
Jean-Francois Dockes
e4eba0de97 stoplist: use stringToStrings in place of splitter to support quoted space-containing entries 2011-10-04 16:04:28 +02:00
Jean-Francois Dockes
c25272a0d8 new czech translation 2011-10-04 16:02:30 +02:00
Jean-Francois Dockes
bb2685c2f5 Add frequency threshold to avoid adding common terms to the automatic phrase search extension. Use autophrase by default with simple search, with a default freq threshold at 2% 2011-10-04 09:03:43 +02:00
Jean-Francois Dockes
4ced9bee49 add termDocCnt method 2011-10-04 08:04:17 +02:00
Jean-Francois Dockes
3e533298c0 add fully parseable base64-encoded output mode for use by external programs 2011-10-04 08:02:57 +02:00
Jean-Francois Dockes
a3898343a7 GUI: removed redundant setQuery from rclmain_w. We ran a good part of the query code two times... 2011-10-04 07:56:46 +02:00
Jean-Francois Dockes
35e5deb9a4 web: notes about building the midi module 2011-10-02 08:52:09 +02:00
Jean-Francois Dockes
6a2bf1f830 none 2011-10-02 08:50:58 +02:00
medoc
b4a69cf9f2 build: linux 3.0 inotify.h is in include/linux not include/sys 2011-10-01 20:11:01 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
7f0ca13e7f version 2011-10-01 09:35:12 +02:00
Jean-Francois Dockes
9a19d0a3c1 doc 2011-10-01 09:33:22 +02:00
Jean-Francois Dockes
e736dc7a77 log 2011-10-01 09:32:56 +02:00
Jean-Francois Dockes
487b623faf log 2011-10-01 09:31:38 +02:00
Jean-Francois Dockes
3013e843a2 log 2011-10-01 09:20:10 +02:00
Jean-Francois Dockes
e56b286f93 log 2011-09-30 16:19:42 +02:00
Jean-Francois Dockes
0c5f41c41c monitor: properly handle cleanup on directory moves 2011-09-30 08:56:29 +02:00
Jean-Francois Dockes
702fb88a1e Search: remove restriction on empty queries by replacing empty query with Xapian::Query::Matchall. This allows querying all files of a given type, or under a given tree, without an actual text search part 2011-09-30 08:50:50 +02:00
Jean-Francois Dockes
383468e2fc bump doc create/update messages to loginfo so that indexing progress can be monitored with less noise 2011-09-30 08:47:39 +02:00
Jean-Francois Dockes
e0aa67f0dc let dir go through indexfiles() (name will be indexed, non recursive) 2011-09-30 08:44:50 +02:00
Jean-Francois Dockes
91778f8943 lower verbosity 2011-09-30 08:21:43 +02:00