31 Commits

Author SHA1 Message Date
Jean-Francois Dockes
7c9e0af8a1 Use readnext() method to read even 1st chunk of text files to perform appropriate end of chunk truncation to eol. Won't affect unchunked files 2015-01-21 16:03:26 +01:00
Jean-Francois Dockes
a7728ceb91 changed the mime handler cache key (was the mime type), to avoid having multiple copies of the same filter when applied to different mime types. This reduces a lot the number of processes during indexing, with no impact on performance 2013-04-25 18:18:48 +02:00
Jean-Francois Dockes
860521be88 internfile: do not compute md5 when in preview mode 2013-04-09 12:40:46 +02:00
Jean-Francois Dockes
66b59c9963 use the "charset" extended attribute for text files if it is set 2013-01-23 12:04:02 +01:00
Jean-Francois Dockes
9f402d33cb got rid of unused csguess module 2012-04-06 15:14:01 +02:00
Jean-Francois Dockes
638d468796 clarified the use of string keys inside the Filter metaData array 2012-03-07 10:13:46 +01:00
Jean-Francois Dockes
a5af2b93bd "md5"->cstr_md5 2012-02-25 10:41:27 +01:00
Jean-Francois Dockes
49554e42c2 Factorized common text transcoding code in separate module 2011-10-20 17:53:42 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
88685d2e64 search/index: fixed a number of bad conversions to properly deal with text documents bigger than 2GB 2011-07-12 08:28:09 -07:00
Jean-Francois Dockes
b28eaf23fb Got rid of all the old RCS id strings 2011-04-27 08:22:17 +02:00
Jean-Francois Dockes
f4c1c3678d indexing: an error on an archive member could crash or block the indexing because of the unclean way the ipath was passed in/out of internfile(). Closes issue #55 2011-04-25 16:41:43 +02:00
Jean-Francois Dockes
e1a20aa810 got rid of accesses to global config through getMainConfig() 2011-03-02 13:47:07 +01:00
Jean-Francois Dockes
292859a3ac Index: improve processing/rejection for binary files disguising as scripts (ie: shar archives). Use "internal text/plain" instead of "exec rcltext" for script files so that normal text/plain processing is done (max size, splits). Reject text if more than 25% iconv errors 2011-03-01 08:39:30 +01:00
Jean-Francois Dockes
320a869d6e Indexing filters: somewhat clarified and unified some charset-related parameters 2011-02-01 15:04:49 +01:00
Jean-Francois Dockes
061ffda545 checked/changed all sprintf calls 2010-11-15 11:57:39 +01:00
Jean-Francois Dockes
e5f41aeb05 Add large file support 2010-07-16 17:08:07 +02:00
dockes
e7b2bc4b46 new glibc missing includes 2009-11-28 09:15:46 +00:00
dockes
a029de8be9 set defaults usedesktoprefs, maxtext 20mb pagesz 1000k webcache 40m 2009-11-28 08:14:05 +00:00
dockes
6bd43301e1 gcc43+linux compile 2009-10-21 11:32:49 +00:00
dockes
a73a1fb097 don't set ipath for the first page in text files to avoid dual records for files under the page size 2009-09-30 15:53:06 +00:00
dockes
a374b2a7b7 implemented paged text files 2009-09-30 15:45:53 +00:00
dockes
0e1cbddb8b textfilemaxmbs 2009-09-29 15:58:45 +00:00
dockes
229645a0e2 added optional extended file attributes support 2009-01-21 13:55:12 +00:00
dockes
f57d4a91f9 compute md5 checksums for all docs and optionally collapse duplicates in results 2009-01-09 14:56:36 +00:00
dockes
33c95ef1ba Dijon filters 1st step: mostly working needs check and optim 2006-12-15 12:40:24 +00:00
dockes
f96fcd6dd3 get rid of unused temp 2006-03-20 15:14:08 +00:00
dockes
2a3075d6a6 reference to GPL in all .cpp files 2006-01-23 13:32:29 +00:00
dockes
be485e8059 allow indexing individual files. Fix pb with preview and charsets (local defcharset ignored) 2005-12-14 11:00:48 +00:00
dockes
ae8ff5abb3 *** empty log message *** 2005-11-24 07:16:16 +00:00
dockes
6cba3b65c1 restructuring on mimehandler files 2005-11-18 13:23:46 +00:00