422 Commits

Author SHA1 Message Date
Jean-Francois Dockes
3be55f4ad4 Internal xsltproc: small Windows adjustments 2019-01-08 14:38:30 +01:00
Jean-Francois Dockes
f6f4d8426a comments + really compute md5 on uncompressed data 2018-12-28 10:32:01 +01:00
Jean-Francois Dockes
586ff90dc0 internal xslt: openoffice zip format working 2018-12-27 16:20:12 +01:00
Jean-Francois Dockes
00c0c5168b internal xslt working for single-sheet (abw). Still leaking memory? 2018-12-25 10:57:26 +01:00
Jean-Francois Dockes
abc45bc156 internfile: transfer metadata from the last extracted (file-like) stage to the final document 2018-11-30 11:55:30 +01:00
Jean-Francois Dockes
495bd66bf5 mh_text: use c++11 for init 2018-11-22 17:46:06 +01:00
Jean-Francois Dockes
b4dfa40cbf mh_mail: use rfc2047 on additional headers requested by config. comments and small cleanups 2018-11-22 17:44:33 +01:00
Jean-Francois Dockes
0cdcaea437 mimeparse: use cp1252 instead of iso-8859 on values with residual 8bit chars. Also: comments and missing std:: qualifiers 2018-11-22 17:42:00 +01:00
Jean-Francois Dockes
3b226a108a missing using 2018-11-22 17:21:36 +01:00
Jean-Francois Dockes
9a9ce69647 comments and indent 2018-11-22 14:41:06 +01:00
Jean-Francois Dockes
fd12341b99 comments 2018-11-20 16:09:12 +01:00
Jean-Francois Dockes
218b3fbfe2 Fix clear() call super antipattern in handlers 2018-11-14 15:29:07 +01:00
Jean-Francois Dockes
23141307f7 internfile:collectIpathAndMT: simplify a bit 2018-11-14 15:09:45 +01:00
Jean-Francois Dockes
f008457493 comments and unused defs removal 2018-11-14 09:43:20 +01:00
Jean-Francois Dockes
2267e5f2f5 Simplified code by replacing misc direct regex/regex.h invocation with SimpleRegex wrapper 2018-09-03 13:29:16 +02:00
Jean-Francois Dockes
7b8ba96b25 md5 for text/plain attachments was not computed, stayed same as parent so they were not shown if hide duplicates option was active in the GUI 2018-07-07 09:17:46 +02:00
Jean-Francois Dockes
d69d2abbde TempFile: clean-up interface by using internal ref-counted class member. Uncomp: add interface to clear cache 2018-05-17 10:24:01 +02:00
Jean-Francois Dockes
9244e31574 fixed a few spelling errors, mostly in comments and debug messages 2018-05-03 16:20:36 +02:00
Jean-Francois Dockes
60c9f8229a prettify log message 2018-02-09 18:14:48 +01:00
Jean-Francois Dockes
f83490a5ee When indexing arbitrary email headers: sanitize the data to utf-8 to avoid later splitter errors 2017-10-20 17:49:30 +02:00
Jean-Francois Dockes
aa56a3540e mail: must not reset the configured list of additional headers for each message ! 2017-10-18 15:21:43 +02:00
Jean-Francois Dockes
29c6f75423 make sure that python rclextract.idoctofile always retrieves an uncompressed file of the correct MIME type. + misc comments 2017-07-20 12:52:24 +02:00
Jean-Francois Dockes
32e79d301b comments and LOG prettifying 2017-07-20 07:52:22 +02:00
Jean-Francois Dockes
9f02bc8119 prettified LOG lines 2017-07-19 19:15:29 +02:00
Jean-Francois Dockes
19a4b2a287 Do not filter out text/html when it results from a conversion, even if excluded by indexedmimetypes/excludedmimetypes 2017-06-08 10:09:05 +02:00
Jean-Francois Dockes
65387963ed Avoid creating temp files for mh_null and mh_unknown... 2017-06-07 20:57:33 +02:00
Jean-Francois Dockes
5863d29e49 debug function 2017-05-12 10:12:48 +02:00
Jean-Francois Dockes
9d95de032d mail message: multipart/alternative: avoid choosing the text/plain part if it is empty (yes it happens...) 2017-03-26 17:39:49 +02:00
Jean-Francois Dockes
bde991c08a got rid of off_t 2017-02-28 20:36:01 +01:00
Jean-Francois Dockes
b55f4b3b0a add nomd5types parameter to set file types for which dedup is not that useful and computation is expensive (e.g. audio files). Replace "call parent" misfeature with call to virtual in MimeHandler constructor. Fix log calls indent 2017-02-02 18:09:00 +01:00
Jean-Francois Dockes
90bae886c2 increased max attributes value to 200 2017-01-28 10:01:59 +01:00
Jean-Francois Dockes
217eb388e2 log formats 2017-01-28 10:00:07 +01:00
Jean-Francois Dockes
2594b71ae8 log 2017-01-16 11:14:54 +01:00
Jean-Francois Dockes
d80531fa62 Fix mimetype filtering (indexedmimetypes/excludedmimetypes) not working for embedded documents 2017-01-13 09:18:18 +01:00
Jean-Francois Dockes
3595109084 detect unicode BOM in text files 2016-11-15 18:31:34 +01:00
Jean-Francois Dockes
93c0001439 pretty 2016-11-08 12:42:46 +01:00
Jean-Francois Dockes
9ce6530e7b execm filters: the change to let filters set arbitrary metadata lost the top doc size, now saved aside 2016-08-12 18:00:52 +02:00
Jean-Francois Dockes
92da4c00cd use std c++11 initializer instead of create_xx hacks 2016-07-16 11:15:31 +02:00
Jean-Francois Dockes
c1fad4afc7 Replaced pthread with std:: thread and mutex 2016-07-12 18:08:21 +02:00
Jean-Francois Dockes
f6a999de84 logging now uses c++ streams 2016-07-12 09:41:04 +02:00
Jean-Francois Dockes
b9e672abda Allow execm input handlers to set arbitrary data fields 2016-07-11 18:13:39 +02:00
Jean-Francois Dockes
1aea57fcb2 defined data access interface for external indexers 2016-06-01 09:46:47 +02:00
Jean-Francois Dockes
627da5a39b Handler timeout should not interrupt the whole indexing pass 2016-04-14 15:48:01 +02:00
Jean-Francois Dockes
f3820471e4 Add cachedir variable allowing to move all data directories by setting a single value. Closes issue #270 2016-04-08 15:09:15 +02:00
Jean-Francois Dockes
a4fd4ee5be moved code around to make smallut and pathut less recoll-specific and reusable. No actual changes 2016-03-21 12:55:31 +01:00
Jean-Francois Dockes
08a810986c Lower log level for xattr op error with errno ENOTSUP 2016-02-23 08:03:17 +01:00
Jean-Francois Dockes
ff15f8fb1c Centralize stat calls to ensure consistency of time fields on windows 2016-01-08 11:23:10 +01:00
Jean-Francois Dockes
a95dcbd4b0 Windows: fix missing O_BINARY 2015-12-02 11:42:44 +01:00
Jean-Francois Dockes
a783ab17dc mh_execm: compute file md5 before activating filter to avoid concurrent open issues on Windows 2015-12-02 10:30:04 +01:00
Jean-Francois Dockes
5ba0be5e58 windows: mh_mbox reverted the test for From lines... 2015-12-01 17:29:44 +01:00