Jean-Francois Dockes
|
29fe1e4927
|
implemented maxmemberkb limit for multidoc (e.g. archive) members
|
2012-10-06 09:05:35 +02:00 |
|
Jean-Francois Dockes
|
1329265b7b
|
check for empty file name in internfile, else gets stuck later because empty fn is interpreted as read stdin in md5
|
2012-10-05 16:42:13 +02:00 |
|
Jean-Francois Dockes
|
d942b44785
|
mbox: implement member size limit of 100MB and autodetect Thunderbird mboxes (look for .msf)
|
2012-10-04 17:00:50 +02:00 |
|
Jean-Francois Dockes
|
e0bc65bfdd
|
small mods innocuous or auxiliary to case/diac sensitivity but which can live in main branch
|
2012-09-13 12:25:01 +02:00 |
|
"Jean-Francois Dockes ext:(%22)
|
ec3dbb4092
|
comments
|
2012-08-21 08:38:23 +02:00 |
|
"Jean-Francois Dockes ext:(%22)
|
2870274f80
|
slightly simplified temp file handling
|
2012-08-21 08:35:39 +02:00 |
|
Jean-Francois Dockes
|
254a7dc972
|
comment
|
2012-06-05 14:14:02 +02:00 |
|
Jean-Francois Dockes
|
643f4d56bb
|
internals: virtualized the doc fetcher interface
|
2012-06-05 07:16:11 +02:00 |
|
Jean-Francois Dockes
|
61a2e28a7c
|
Absurd input source global variable in Binc imap caused the indexer to crash when an email message contained attachments which were disguised messages (ie: x-mimehtml), because this would cause a recursive call into Binc with a different data source (ie: string instead of original fd), clobbering the original source
|
2012-05-24 14:52:41 +02:00 |
|
Jean-Francois Dockes
|
3accce0b22
|
index: added sanity checks to mail handler
|
2012-05-16 12:25:44 +02:00 |
|
Jean-Francois Dockes
|
0333d83d2e
|
html: small additional cleanup after previous <body> processing modification
|
2012-05-16 10:13:53 +02:00 |
|
Jean-Francois Dockes
|
e6191b51a8
|
Html: Just ignore opening and closing <body> and <html> tags. Current browsers show text before or after the body and ignore multiple body tags. Not pushed to 1.17 maint because of possible disruption. Closes issue #92
|
2012-05-16 10:07:09 +02:00 |
|
Jean-Francois Dockes
|
8b34610dde
|
Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex
|
2012-04-13 09:18:08 +02:00 |
|
Jean-Francois Dockes
|
ec7b40a52e
|
cosmetics: list -> vector in more places
|
2012-04-11 19:58:08 +02:00 |
|
Jean-Francois Dockes
|
78bd8d63da
|
use vector instead of list for execmd arg list
|
2012-04-11 15:36:49 +02:00 |
|
Jean-Francois Dockes
|
9f402d33cb
|
got rid of unused csguess module
|
2012-04-06 15:14:01 +02:00 |
|
Jean-Francois Dockes
|
80fb2f553c
|
MIME handling: treat content-type=="text" as "text/plain". Needed for some very old messages
|
2012-03-18 08:26:44 +01:00 |
|
Jean-Francois Dockes
|
0050f96f57
|
fix test driver
|
2012-03-18 08:23:33 +01:00 |
|
Jean-Francois Dockes
|
85166c93b2
|
Changed the way we handle document sizes. The fbytes field should now be in most cases the most "natural" document size. pcbytes holds the top external container size and dbytes the text size
|
2012-03-07 15:39:30 +01:00 |
|
Jean-Francois Dockes
|
638d468796
|
clarified the use of string keys inside the Filter metaData array
|
2012-03-07 10:13:46 +01:00 |
|
Jean-Francois Dockes
|
a5af2b93bd
|
"md5"->cstr_md5
|
2012-02-25 10:41:27 +01:00 |
|
Jean-Francois Dockes
|
ec87379015
|
html: handle the html5 charset meta tag
|
2012-01-26 19:27:58 +01:00 |
|
Jean-Francois Dockes
|
0d8a61ced9
|
log message
|
2012-01-26 19:26:54 +01:00 |
|
Jean-Francois Dockes
|
639a434dce
|
comments
|
2012-01-26 18:17:37 +01:00 |
|
Jean-Francois Dockes
|
eed31f9ef1
|
html index: throw an exception after parsing in all cases so that the same code path is always used. The previous approach sometimes resulted in a bad charset used for preview
|
2012-01-25 17:33:41 +01:00 |
|
Jean-Francois Dockes
|
516863b5d6
|
GUI: perform up to date check before previewing a subdoc. This is for example to avoid showing the wrong message if a mail folder has been compacted
|
2012-01-20 17:48:55 +01:00 |
|
Jean-Francois Dockes
|
036937e8bf
|
added getmeta() method to Rcl::Doc and use in misc places
|
2012-01-20 14:48:50 +01:00 |
|
Jean-Francois Dockes
|
1931595637
|
GUI: added menu entry to show all the mime types actually indexed (by content)
|
2011-11-25 19:47:56 +01:00 |
|
Jean-Francois Dockes
|
49554e42c2
|
Factorized common text transcoding code in separate module
|
2011-10-20 17:53:42 +02:00 |
|
Jean-Francois Dockes
|
f544b28b4a
|
Transcode mh_execm text/plain output like we do for mh_exec. Adjust handling of transcoding errors. These changes should fix most cases of non-utf8 text making it to unac/index
|
2011-10-20 14:00:38 +02:00 |
|
Jean-Francois Dockes
|
38e0957962
|
const string cleanup
|
2011-10-01 16:39:38 +02:00 |
|
Jean-Francois Dockes
|
487b623faf
|
log
|
2011-10-01 09:31:38 +02:00 |
|
Jean-Francois Dockes
|
424e4173ba
|
threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65
|
2011-09-28 15:01:14 +02:00 |
|
"Jean-Francois Dockes ext:(%22)
|
802ebc7704
|
comments
|
2011-08-21 13:29:06 +02:00 |
|
"Jean-Francois Dockes ext:(%22)
|
9cefcb7283
|
Simple optimization makes mh_mbox 3x faster
|
2011-08-20 14:54:29 +02:00 |
|
"Jean-Francois Dockes ext:(%22)
|
6b04fe7f2c
|
The record for an attachment for which conversion failed (ie: image without exiftool) would erase the message's record because its ipath was not updated
|
2011-07-16 11:53:54 +02:00 |
|
"Jean-Francois Dockes ext:(%22)
|
88685d2e64
|
search/index: fixed a number of bad conversions to properly deal with text documents bigger than 2GB
|
2011-07-12 08:28:09 -07:00 |
|
Jean-Francois Dockes
|
5292a97de3
|
mail handler: remove header names when indexing to avoid artificially increasing the frequency of, e.g., the "subject" term
|
2011-06-27 18:38:44 +02:00 |
|
Jean-Francois Dockes
|
c7a241d26e
|
htmlparse: merged some updates from xapian 1.2.6
|
2011-06-24 10:41:54 +02:00 |
|
Jean-Francois Dockes
|
67ad817e52
|
internfile: revert 2314:17098b627784 which was unneeded and wrong
|
2011-06-22 17:49:51 +02:00 |
|
Jean-Francois Dockes
|
ce44c0a875
|
preview: use the index idea of the mime type after decompression instead of re-running mimetype(). This will fix preview for compressed man pages (which were identified as text/troff after decomp because not under man/)
|
2011-06-22 16:09:55 +02:00 |
|
Jean-Francois Dockes
|
ba5e0c41b4
|
index: fixed the way we process some mime type aliases, which resulted in accumulating handlers in the handler cache
|
2011-06-21 19:18:55 +02:00 |
|
Jean-Francois Dockes
|
631121e24e
|
internfile: keep around temp file for possible caller use
|
2011-05-09 07:00:34 +02:00 |
|
Jean-Francois Dockes
|
c45cdd7561
|
common data locking: remove deadlock in mbox cache locking
|
2011-04-28 14:28:19 +02:00 |
|
Jean-Francois Dockes
|
55f124725f
|
Fix problems that occurred when multiple threads were trying to read/convert files at the same time (ie: indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate misc possible initialization conflicts by using static initializers. Hopefully closes issue #51
|
2011-04-28 10:58:33 +02:00 |
|
Jean-Francois Dockes
|
b28eaf23fb
|
Got rid of all the old RCS id strings
|
2011-04-27 08:22:17 +02:00 |
|
Jean-Francois Dockes
|
2d8e57ee4f
|
Gui preview, internfile: handle case where target doc of a compound ipath still needs further translation (is not text or html)
|
2011-04-26 08:26:09 +02:00 |
|
Jean-Francois Dockes
|
f4c1c3678d
|
indexing: an error on an archive member could crash or block the indexing because of the unclean way the ipath was passed in/out of internfile(). Closes issue #55
|
2011-04-25 16:41:43 +02:00 |
|
Jean-Francois Dockes
|
52fda2a075
|
GUI: lock handler cache against multiple thread access
|
2011-04-24 08:47:27 +02:00 |
|
Jean-Francois Dockes
|
7eb182f53c
|
index: escape colon characters inside ipaths. This could potentially happen with the zip (ie: zipped maildir) and chm filters
|
2011-03-12 12:03:39 +01:00 |
|