229 Commits

Author SHA1 Message Date
Jean-Francois Dockes
78bd8d63da use vector instead of list for execmd arg list 2012-04-11 15:36:49 +02:00
Jean-Francois Dockes
9f402d33cb got rid of unused csguess module 2012-04-06 15:14:01 +02:00
Jean-Francois Dockes
80fb2f553c MIME handling: treat content-type=="text" as "text/plain". Needed for some very old messages 2012-03-18 08:26:44 +01:00
Jean-Francois Dockes
0050f96f57 fix test driver 2012-03-18 08:23:33 +01:00
Jean-Francois Dockes
85166c93b2 Changed the way we handle document sizes. The fbytes field should now be in most cases the most "natural" document size. pcbytes holds the top external container size and dbytes the text size 2012-03-07 15:39:30 +01:00
Jean-Francois Dockes
638d468796 clarified the use of string keys inside the Filter metaData array 2012-03-07 10:13:46 +01:00
Jean-Francois Dockes
a5af2b93bd "md5"->cstr_md5 2012-02-25 10:41:27 +01:00
Jean-Francois Dockes
ec87379015 html: handle the html5 charset meta tag 2012-01-26 19:27:58 +01:00
Jean-Francois Dockes
0d8a61ced9 log message 2012-01-26 19:26:54 +01:00
Jean-Francois Dockes
639a434dce comments 2012-01-26 18:17:37 +01:00
Jean-Francois Dockes
eed31f9ef1 html index: throw an exception after parsing in all cases so that the same code path is always used. The previous approach sometimes resulted in a bad charset used for preview 2012-01-25 17:33:41 +01:00
Jean-Francois Dockes
516863b5d6 GUI: perform up to date check before previewing a subdoc. This is for example to avoid showing the wrong message if a mail folder has been compacted 2012-01-20 17:48:55 +01:00
Jean-Francois Dockes
036937e8bf added getmeta() method to Rcl::Doc and use in misc places 2012-01-20 14:48:50 +01:00
Jean-Francois Dockes
1931595637 GUI: added menu entry to show all the mime types actually indexed (by content) 2011-11-25 19:47:56 +01:00
Jean-Francois Dockes
49554e42c2 Factorized common text transcoding code in separate module 2011-10-20 17:53:42 +02:00
Jean-Francois Dockes
f544b28b4a Transcode mh_execm text/plain output like we do for mh_exec. Adjust handling of transcoding errors. These changes should fix most cases of non-utf8 text making it to unac/index 2011-10-20 14:00:38 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
487b623faf log 2011-10-01 09:31:38 +02:00
Jean-Francois Dockes
424e4173ba threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65 2011-09-28 15:01:14 +02:00
Jean-Francois Dockes
802ebc7704 comments 2011-08-21 13:29:06 +02:00
Jean-Francois Dockes
9cefcb7283 Simple optimization makes mh_mbox 3x faster 2011-08-20 14:54:29 +02:00
Jean-Francois Dockes
6b04fe7f2c The record for an attachment for which conversion failed (ie: image without exiftool) would erase the message's record because its ipath was not updated 2011-07-16 11:53:54 +02:00
Jean-Francois Dockes
88685d2e64 search/index: fixed a number of bad conversions to properly deal with text documents bigger than 2GB 2011-07-12 08:28:09 -07:00
Jean-Francois Dockes
5292a97de3 mail handler: remove header names when indexing to avoid artificially increasing the frequency of ie, the "subject" term 2011-06-27 18:38:44 +02:00
Jean-Francois Dockes
c7a241d26e htmlparse: merged some updates from xapian 1.2.6 2011-06-24 10:41:54 +02:00
Jean-Francois Dockes
67ad817e52 internfile: revert 2314:17098b627784 which was unneeded and wrong 2011-06-22 17:49:51 +02:00
Jean-Francois Dockes
ce44c0a875 preview: use the index idea of the mime type after decompression instead of re-running mimetype(). This will fix preview for compressed man pages (which were identified as text/troff after decomp because not under man/) 2011-06-22 16:09:55 +02:00
Jean-Francois Dockes
ba5e0c41b4 index: fixed the way we process some mime type aliases, which resulted in accumulating handlers in the handler cache 2011-06-21 19:18:55 +02:00
Jean-Francois Dockes
631121e24e internfile: keep around temp file for possible caller use 2011-05-09 07:00:34 +02:00
Jean-Francois Dockes
c45cdd7561 common data locking: remove deadlock in mbox cache locking 2011-04-28 14:28:19 +02:00
Jean-Francois Dockes
55f124725f Fix problems that occurred when multiple threads were trying to read/convert files at the same time (ie: indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate misc initialization possible conflicts by using static initializers. Hopefully closes issue #51 2011-04-28 10:58:33 +02:00
Jean-Francois Dockes
b28eaf23fb Got rid of all the old RCS id strings 2011-04-27 08:22:17 +02:00
Jean-Francois Dockes
2d8e57ee4f Gui preview, internfile: handle case where target doc of a compound ipath still needs further translation (is not text or html) 2011-04-26 08:26:09 +02:00
Jean-Francois Dockes
f4c1c3678d indexing: an error on an archive member could crash or block the indexing because of the unclean way the ipath was passed in/out of internfile(). Closes issue #55 2011-04-25 16:41:43 +02:00
Jean-Francois Dockes
52fda2a075 GUI: lock handler cache against multiple thread access 2011-04-24 08:47:27 +02:00
Jean-Francois Dockes
7eb182f53c index: escape colon characters inside ipaths. This could potentially happen with the zip (ie: zipped maildir) and chm filters 2011-03-12 12:03:39 +01:00
Jean-Francois Dockes
e1a20aa810 got rid of accesses to global config through getMainConfig() 2011-03-02 13:47:07 +01:00
Jean-Francois Dockes
292859a3ac Index: improve processing/rejection for binary files disguising as scripts (ie: shar archives). Use "internal text/plain" instead of "exec rcltext" for script files so that normal text/plain processing is done (max size, splits). Reject text if more than 25% iconv errors 2011-03-01 08:39:30 +01:00
Jean-Francois Dockes
93a761785a mh_execm: send/receive charset-related parms (no filter use them for now) 2011-02-01 19:16:32 +01:00
Jean-Francois Dockes
320a869d6e Indexing filters: somewhat clarified and unified some charset-related parameters 2011-02-01 15:04:49 +01:00
Jean-Francois Dockes
91e740074e mh_execm: removed incorrect subdocerrors handling leftover from previous change 2011-01-31 09:31:35 +01:00
Jean-Francois Dockes
9b26100e6a comment 2011-01-29 16:18:37 +01:00
Jean-Francois Dockes
d80f4478fc Support thunderbird naked "^From $" separators 2011-01-11 18:36:40 +01:00
Jean-Francois Dockes
fccc9a590f mimehandler: accept additional parameter from config after internal for using different mime type 2011-01-08 19:22:09 +01:00
Jean-Francois Dockes
6ebc4b4fad fix r2093 which broke indexallfilenames 2010-12-15 15:45:24 +01:00
Jean-Francois Dockes
52e845a9fb debug traces: add is_unknown() method to filters to help with pointing out unhandled mime types 2010-12-14 18:21:39 +01:00
Jean-Francois Dockes
084740cd2b simplified the mbox-reading code 2010-11-30 15:21:44 +01:00
Jean-Francois Dockes
629e62e2b8 mbox: test driver improved 2010-11-30 11:35:21 +01:00
Jean-Francois Dockes
2f837a89b3 fix thunderbird hack breakage for 1,14,4 2010-11-29 22:43:41 +01:00
Jean-Francois Dockes
34151006fe Index: add call to get rid of filter subprocesses at end of indexing (for the GUI thread) 2010-11-23 19:35:44 +01:00