Jean-Francois Dockes
|
61a2e28a7c
|
An absurd global input-source variable in Binc IMAP caused the indexer to crash when an email message contained attachments that were disguised messages (e.g. x-mimehtml): such an attachment causes a recursive call into Binc with a different data source (e.g. a string instead of the original fd), clobbering the original source
|
2012-05-24 14:52:41 +02:00 |
|
Jean-Francois Dockes
|
3accce0b22
|
index: added sanity checks to mail handler
|
2012-05-16 12:25:44 +02:00 |
|
Jean-Francois Dockes
|
ec7b40a52e
|
cosmetics: list -> vector in more places
|
2012-04-11 19:58:08 +02:00 |
|
Jean-Francois Dockes
|
80fb2f553c
|
MIME handling: treat content-type=="text" as "text/plain". Needed for some very old messages
|
2012-03-18 08:26:44 +01:00 |
|
Jean-Francois Dockes
|
638d468796
|
clarified the use of string keys inside the Filter metaData array
|
2012-03-07 10:13:46 +01:00 |
|
Jean-Francois Dockes
|
a5af2b93bd
|
"md5"->cstr_md5
|
2012-02-25 10:41:27 +01:00 |
|
Jean-Francois Dockes
|
f544b28b4a
|
Transcode mh_execm text/plain output like we do for mh_exec. Adjust handling of transcoding errors. These changes should fix most cases of non-utf8 text making it to unac/index
|
2011-10-20 14:00:38 +02:00 |
|
Jean-Francois Dockes
|
38e0957962
|
const string cleanup
|
2011-10-01 16:39:38 +02:00 |
|
Jean-Francois Dockes
|
5292a97de3
|
mail handler: remove header names when indexing to avoid artificially increasing the frequency of, e.g., the "subject" term
|
2011-06-27 18:38:44 +02:00 |
|
Jean-Francois Dockes
|
b28eaf23fb
|
Got rid of all the old RCS id strings
|
2011-04-27 08:22:17 +02:00 |
|
Jean-Francois Dockes
|
e1a20aa810
|
got rid of accesses to global config through getMainConfig()
|
2011-03-02 13:47:07 +01:00 |
|
Jean-Francois Dockes
|
061ffda545
|
checked/changed all sprintf calls
|
2010-11-15 11:57:39 +01:00 |
|
Jean-Francois Dockes
|
e6d5f72886
|
added the possibility to extract arbitrary mail headers and use them as document fields. This forced an incompatible change in the format of the [stored] section inside the "fields" config file
|
2010-07-06 17:16:36 +02:00 |
|
dockes
|
c78a3bb567
|
add cnf(maildefcharset) to set specific mail default charset (mainly for readpst extracts which are utf-8 but have no charset set)
|
2009-11-27 13:23:13 +00:00 |
|
dockes
|
dd6acb07cc
|
mh_mail: use truncate_to_word to avoid cutting a utf8 char. rcldb: logdeb text_to_word errors
|
2009-11-18 10:26:47 +00:00 |
|
dockes
|
7d18c22142
|
reason msg
|
2009-11-16 16:10:31 +00:00 |
|
dockes
|
daae416d98
|
extract msgid + generate abstract at start of txt, excluding headers
|
2009-10-31 09:00:31 +00:00 |
|
dockes
|
229645a0e2
|
added optional extended file attributes support
|
2009-01-21 13:55:12 +00:00 |
|
dockes
|
f57d4a91f9
|
compute md5 checksums for all docs and optionally collapse duplicates in results
|
2009-01-09 14:56:36 +00:00 |
|
dockes
|
9082f3bf65
|
allow specifying format and charset for ext filters. Cache and reuse filters
|
2008-10-04 14:26:59 +00:00 |
|
dockes
|
5cc1de9aad
|
emit field for recipients
|
2008-09-16 08:13:45 +00:00 |
|
dockes
|
022e0e5f43
|
suppressed a few wasteful string-cstr conversions
|
2008-07-01 11:51:51 +00:00 |
|
dockes
|
0460f1016c
|
mh_mail now uses mimetype() to try and better identify application/octet-stream
|
2008-07-01 10:29:45 +00:00 |
|
dockes
|
46a7f05cbc
|
gcc 4 compat, thanks to Kartik Mistry
|
2007-12-13 06:58:22 +00:00 |
|
dockes
|
02475fba71
|
text/plain attachments were not transcoded to utf-8
|
2007-10-17 11:40:35 +00:00 |
|
dockes
|
1d683ad411
|
added field/prefixes for author and title + command line query language
|
2007-01-17 13:53:41 +00:00 |
|
dockes
|
094e465252
|
handle multipart/signed
|
2007-01-13 10:28:37 +00:00 |
|
dockes
|
8fe7cb37d3
|
mh_mail needs to lowercase content types
|
2006-12-18 12:06:11 +00:00 |
|
dockes
|
8f1f2ca66d
|
mail attachments sort of ok
|
2006-12-16 15:39:54 +00:00 |
|
dockes
|
229eb0de78
|
test data indexing result same terms as 1.6.3
|
2006-12-15 16:33:15 +00:00 |
|
dockes
|
33c95ef1ba
|
Dijon filters 1st step: mostly working needs check and optim
|
2006-12-15 12:40:24 +00:00 |
|
dockes
|
9c32ef4f16
|
fix bug with bad message "From " delimiter detection
|
2006-12-07 08:06:54 +00:00 |
|
dockes
|
d5745bdb83
|
fix bug with bad message "From " delimiter detection
|
2006-12-07 07:06:28 +00:00 |
|
dockes
|
290a7272be
|
use regexp to better discriminate From delimiter lines in mbox. Avoid reading mboxes twice
|
2006-12-05 15:25:17 +00:00 |
|
dockes
|
417586fb2b
|
fix newlines
|
2006-09-23 07:39:18 +00:00 |
|
dockes
|
b14021f539
|
clarified depth processing and increased limit
|
2006-09-22 07:19:13 +00:00 |
|
dockes
|
3e2bccd259
|
walk the full mime tree instead of staying at level 1
|
2006-09-19 14:30:39 +00:00 |
|
dockes
|
cfe1dd5d9f
|
Use own code to parse rfc822 dates, strptime() can't do it
|
2006-09-15 16:50:44 +00:00 |
|
dockes
|
804b79ee56
|
let mimeparse handle decoding of param values
|
2006-09-05 17:09:30 +00:00 |
|
dockes
|
92b930f2c4
|
index and display attachment file names
|
2006-09-05 08:05:02 +00:00 |
|
dockes
|
c23b7e452b
|
comments+conventions
|
2006-04-07 08:51:15 +00:00 |
|
dockes
|
2a3075d6a6
|
reference to GPL in all .cpp files
|
2006-01-23 13:32:29 +00:00 |
|
dockes
|
ae6ce2638a
|
freebsd 4 port
|
2005-12-07 15:41:50 +00:00 |
|
dockes
|
ae8ff5abb3
|
*** empty log message ***
|
2005-11-24 07:16:16 +00:00 |
|
dockes
|
6cba3b65c1
|
restructuring on mimehandler files
|
2005-11-18 13:23:46 +00:00 |
|
dockes
|
baa0ff491b
|
renamed MimeHandler::worker to mkDoc + comments for doxygen
|
2005-11-08 21:02:55 +00:00 |
|
dockes
|
5ebcb0c104
|
separate file and document dates (mainly for email folders). Better check configuration at startup
|
2005-11-05 14:40:50 +00:00 |
|
dockes
|
f0f98312cd
|
fixed base64 decoding of email parts: str[x] = ch does not adjust length! and be more lenient with encoding errors
|
2005-10-31 08:59:05 +00:00 |
|
dockes
|
763b5f58c7
|
decode encoded mail headers, plus use message date instead of file mtime
|
2005-10-15 12:18:04 +00:00 |
|
dockes
|
1293f0d834
|
re-port to linux
|
2005-04-06 10:20:11 +00:00 |
|