192 Commits

Author SHA1 Message Date
Jean-Francois Dockes
27430403e2 comment 2011-11-25 19:44:37 +01:00
Jean-Francois Dockes
49554e42c2 Factorized common text transcoding code in separate module 2011-10-20 17:53:42 +02:00
Jean-Francois Dockes
6c72454396 generate acronyms for dotted abbrevs. ie O.E.C.D -> OECD 2011-10-20 13:24:29 +02:00
Jean-Francois Dockes
56fe54412f Protect against deadlock when using fam/gamin by adding a small timeout to the peek for events done between add calls. Add alarm to the addwatch call in case the deadlock happens anyway 2011-10-13 15:20:28 +02:00
Jean-Francois Dockes
0860b559ee get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now 2011-10-12 17:55:58 +02:00
Jean-Francois Dockes
5fd31172f5 New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile 2011-10-07 07:53:49 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
3013e843a2 log 2011-10-01 09:20:10 +02:00
Jean-Francois Dockes
91778f8943 lower verbosity 2011-09-30 08:21:43 +02:00
Jean-Francois Dockes
424e4173ba threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65 2011-09-28 15:01:14 +02:00
Jean-Francois Dockes
5b3c5d8a5d small OpenBSD fixes (mount.h and FILE_OFFSET_BITS) 2011-09-23 10:32:41 +02:00
Jean-Francois Dockes
cd27645cc2 Avoid fwrite failure while trying to write empty missing helpers string 2011-09-20 07:37:28 +02:00
Jean-Francois Dockes
c5ff0cdf52 Control memory usage when deleting documents: use idxflushmb as when adding/updating 2011-09-07 19:11:11 +02:00
Jean-Francois Dockes
bc6587f07a get rid of unused guesscharset 2011-08-21 13:27:37 +02:00
Jean-Francois Dockes
ebbcc115a8 Allow setting a weight increase for field terms 2011-07-22 16:43:39 +02:00
Jean-Francois Dockes
36516b091b textsplit: discard - in front of words. Handle cjk punctuation characters 2011-07-16 11:51:38 +02:00
Jean-Francois Dockes
0e37f64a3c added more punctuation 2011-07-16 11:50:02 +02:00
Jean-Francois Dockes
88685d2e64 search/index: fixed a number of bad conversions to properly deal with text documents bigger than 2GB 2011-07-12 08:28:09 -07:00
Jean-Francois Dockes
5e59354535 more punctuation 2011-07-12 03:32:00 -07:00
Jean-Francois Dockes
cb0794e92c textsplit: eliminate some garbage terms (ie long sequences of dashes) 2011-07-06 16:20:32 +02:00
Jean-Francois Dockes
442ff819d0 added a number of unicode punctuation characters 2011-07-06 10:52:16 +02:00
Jean-Francois Dockes
4af5b9b88d rclconfig test: added option to print fields config 2011-06-24 10:57:07 +02:00
Jean-Francois Dockes
9bb4461013 small recoll/kio_recoll build changes: avoid unnecessary recompilations and make them play nicer together 2011-06-22 11:16:09 +02:00
Jean-Francois Dockes
2458541c71 index: stop suffixes were ignored in some cases 2011-05-02 15:09:45 +02:00
Jean-Francois Dockes
55f124725f Fix problems that occurred when multiple threads were trying to read/convert files at the same time (e.g. indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate misc possible initialization conflicts by using static initializers. Hopefully closes issue #51 2011-04-28 10:58:33 +02:00
Jean-Francois Dockes
b28eaf23fb Got rid of all the old RCS id strings 2011-04-27 08:22:17 +02:00
Jean-Francois Dockes
e61712fc90 search gui: allow specifying fields in complex search panel 2011-03-30 18:52:44 +02:00
Jean-Francois Dockes
25f6a75315 none 2011-03-02 19:50:34 +01:00
Jean-Francois Dockes
e1a20aa810 got rid of accesses to global config through getMainConfig() 2011-03-02 13:47:07 +01:00
Jean-Francois Dockes
85b36d3c34 filename search fields: generate an AND of OR lists out of wildcard expansion instead of a global OR which did not make much sense 2011-01-13 11:47:35 +01:00
Jean-Francois Dockes
166399fd62 indexing: create lock / pid file 2011-01-08 19:24:26 +01:00
Jean-Francois Dockes
c5e40d8510 replaced all q3 widgets except textbrowsers 2010-12-01 16:15:22 +01:00
Jean-Francois Dockes
6c03417195 Move localfields parsing code from fsindexer to rclconfig for possible reuse 2010-11-22 15:56:14 +01:00
Jean-Francois Dockes
061ffda545 checked/changed all sprintf calls 2010-11-15 11:57:39 +01:00
Jean-Francois Dockes
9bd082bf39 comments only 2010-10-31 09:56:43 +01:00
Jean-Francois Dockes
0fa92899f9 gcc44 compile 2010-09-23 19:05:11 +02:00
Jean-Francois Dockes
ad4f24923f uncompress file before starting external viewer except if in the nouncompforviewmts list 2010-09-20 10:35:26 +02:00
Jean-Francois Dockes
4385dd1b8b small compilation issues on misc systems 2010-09-13 21:34:23 +02:00
Jean-Francois Dockes
f3b0b49c77 add autosplitting getconfparam() overloads 2010-09-10 09:34:43 +02:00
Jean-Francois Dockes
846bec8a73 fix english indexation -> indexing 2010-07-20 09:48:20 +02:00
Jean-Francois Dockes
e5f41aeb05 Add large file support 2010-07-16 17:08:07 +02:00
Jean-Francois Dockes
e6d5f72886 added the possibility to extract arbitrary mail headers and use them as document fields. This forced an incompatible change in the format of the [stored] section inside the "fields" config file 2010-07-06 17:16:36 +02:00
Jean-Francois Dockes
8520ec668a recognize more numbers: 1e-10, 1.e3 2010-05-17 09:20:09 +02:00
Jean-Francois Dockes
48358c8252 Added option nonumbers not to generate terms for numbers. closes #16 2010-05-05 10:18:56 +02:00
Jean-Francois Dockes
b87a23bfca separated out the cache access part from beaglequeueindexer. this avoids having to link the pure query programs with indexing code 2010-02-05 12:46:41 +01:00
Jean-Francois Dockes
8b2b00bc72 cosmetics: use derived class for actual splitter instead of callback 2010-02-02 15:33:52 +01:00
Jean-Francois Dockes
af603af058 use 3-arg version of ac_define as the 1-arg one is being obsoleted 2010-01-31 19:47:49 +01:00
Jean-Francois Dockes
c4e7ff69f6 Renamed WITHOUT_X11 to DISABLE_X11MON for clarification 2010-01-30 08:21:35 +01:00
dockes
52d5725c54 1.13.00: fixed doc orthographic typos 2010-01-05 07:14:27 +00:00
dockes
69c27db46a add --enable-camelcase option to configure 2009-12-14 10:10:01 +00:00