382 Commits

Author SHA1 Message Date
Jean-Francois Dockes
dca18bc585 Try to give possible explanations when opening a preview fails 2019-06-15 19:21:52 +02:00
Jean-Francois Dockes
37e203d535 mh_text: log message when skipping file with size over max 2019-05-17 09:32:46 +02:00
Jean-Francois Dockes
33e1847b26 suppress misc warnings on fedora and macosx 2019-04-28 15:39:15 +02:00
Jean-Francois Dockes
35d2d5bf49 Fixed a number of recollinit invocations. Most in dead/test code 2019-03-21 15:28:02 +01:00
Jean-Francois Dockes
1cf8327525 internfile: let the constructor succeed even on uncompression error, so that the doc record is created and retry choices can be done for other runs 2019-03-19 16:41:01 +01:00
Jean-Francois Dockes
bfa786dfeb internfile: do not process set_document_xx errors. Wait until the next_document() call so that the file names or ipaths are indexed 2019-03-12 14:55:31 +01:00
Jean-Francois Dockes
0cbc46732f Fixed the FSF address 2019-03-04 11:19:14 +01:00
Jean-Francois Dockes
e3c5f51519 Make sure we don't grow the ipath with each consecutive error 2019-02-19 20:54:57 +01:00
Jean-Francois Dockes
478739f1e7 uncomp: better error message. 2019-01-25 15:18:50 +01:00
Jean-Francois Dockes
f6eacd5949 mh_html: print explanation for read errors 2019-01-23 14:50:51 +01:00
Jean-Francois Dockes
3be55f4ad4 Internal xsltproc: small Windows adjustments 2019-01-08 14:38:30 +01:00
Jean-Francois Dockes
f6f4d8426a comments + really compute md5 on uncompressed data 2018-12-28 10:32:01 +01:00
Jean-Francois Dockes
586ff90dc0 internal xslt: openoffice zip format working 2018-12-27 16:20:12 +01:00
Jean-Francois Dockes
00c0c5168b internal xslt working for single-sheet (abw). Still leaking memory? 2018-12-25 10:57:26 +01:00
Jean-Francois Dockes
abc45bc156 internfile: transfer metadata from the last extracted (file-like) stage to the final document 2018-11-30 11:55:30 +01:00
Jean-Francois Dockes
495bd66bf5 mh_text: use c++11 for init 2018-11-22 17:46:06 +01:00
Jean-Francois Dockes
b4dfa40cbf mh_mail: use rfc2047 on additional headers requested by config. comments and small cleanups 2018-11-22 17:44:33 +01:00
Jean-Francois Dockes
0cdcaea437 mimeparse: use cp1252 instead of iso-8859 on values with residual 8bit chars. Also: comments and missing std:: qualifiers 2018-11-22 17:42:00 +01:00
Jean-Francois Dockes
3b226a108a missing using 2018-11-22 17:21:36 +01:00
Jean-Francois Dockes
9a9ce69647 comments and indent 2018-11-22 14:41:06 +01:00
Jean-Francois Dockes
fd12341b99 comments 2018-11-20 16:09:12 +01:00
Jean-Francois Dockes
218b3fbfe2 Fix clear() call super antipattern in handlers 2018-11-14 15:29:07 +01:00
Jean-Francois Dockes
23141307f7 internfile:collectIpathAndMT: simplify a bit 2018-11-14 15:09:45 +01:00
Jean-Francois Dockes
f008457493 comments and unused defs removal 2018-11-14 09:43:20 +01:00
Jean-Francois Dockes
2267e5f2f5 Simplified code by replacing misc direct regex/regex.h invocation with SimpleRegex wrapper 2018-09-03 13:29:16 +02:00
Jean-Francois Dockes
7b8ba96b25 md5 for text/plain attachments was not computed, stayed same as parent so they were not shown if hide duplicates option was active in the GUI 2018-07-07 09:17:46 +02:00
Jean-Francois Dockes
d69d2abbde TempFile: clean-up interface by using internal ref-counted class member. Uncomp: add interface to clear cache 2018-05-17 10:24:01 +02:00
Jean-Francois Dockes
9244e31574 fixed a few spelling errors, mostly in comments and debug messages 2018-05-03 16:20:36 +02:00
Jean-Francois Dockes
60c9f8229a prettify log message 2018-02-09 18:14:48 +01:00
Jean-Francois Dockes
f83490a5ee When indexing arbitrary email headers: sanitize the data to utf-8 to avoid later splitter errors 2017-10-20 17:49:30 +02:00
Jean-Francois Dockes
aa56a3540e mail: must not reset the configured list of additional headers for each message ! 2017-10-18 15:21:43 +02:00
Jean-Francois Dockes
29c6f75423 make sure that python rclextract.idoctofile always retrieves an uncompressed file of the correct MIME type. + misc comments 2017-07-20 12:52:24 +02:00
Jean-Francois Dockes
32e79d301b comments and LOG prettifying 2017-07-20 07:52:22 +02:00
Jean-Francois Dockes
9f02bc8119 prettified LOG lines 2017-07-19 19:15:29 +02:00
Jean-Francois Dockes
19a4b2a287 Do not filter out text/html when it results from a conversion, even if excluded by indexedmimetypes/excludedmimetypes 2017-06-08 10:09:05 +02:00
Jean-Francois Dockes
65387963ed Avoid creating temp files for mh_null and mh_unknown... 2017-06-07 20:57:33 +02:00
Jean-Francois Dockes
5863d29e49 debug function 2017-05-12 10:12:48 +02:00
Jean-Francois Dockes
9d95de032d mail message: multipart/alternative: avoid choosing the text/plain part if it is empty (yes it happens...) 2017-03-26 17:39:49 +02:00
Jean-Francois Dockes
bde991c08a got rid of off_t 2017-02-28 20:36:01 +01:00
Jean-Francois Dockes
b55f4b3b0a add nomd5types parameter to set file types for which dedup is not that useful and computation is expensive (e.g. audio files). Replace "call parent" misfeature with call to virtual in MimeHandler constructor. Fix log calls indent 2017-02-02 18:09:00 +01:00
Jean-Francois Dockes
90bae886c2 increased max attributes value to 200 2017-01-28 10:01:59 +01:00
Jean-Francois Dockes
217eb388e2 log formats 2017-01-28 10:00:07 +01:00
Jean-Francois Dockes
2594b71ae8 log 2017-01-16 11:14:54 +01:00
Jean-Francois Dockes
d80531fa62 Fix mimetype filtering (indexedmimetypes/excludedmimetypes) not working for embedded documents 2017-01-13 09:18:18 +01:00
Jean-Francois Dockes
3595109084 detect unicode BOM in text files 2016-11-15 18:31:34 +01:00
Jean-Francois Dockes
93c0001439 pretty 2016-11-08 12:42:46 +01:00
Jean-Francois Dockes
9ce6530e7b execm filters: the change to let filters set arbitrary metadata lost the top doc size, now saved aside 2016-08-12 18:00:52 +02:00
Jean-Francois Dockes
92da4c00cd use std c++11 initializer instead of create_xx hacks 2016-07-16 11:15:31 +02:00
Jean-Francois Dockes
c1fad4afc7 Replaced pthread with std:: thread and mutex 2016-07-12 18:08:21 +02:00
Jean-Francois Dockes
f6a999de84 logging now uses c++ streams 2016-07-12 09:41:04 +02:00