86 Commits

Author SHA1 Message Date
Jean-Francois Dockes
412a5e6f78 none 2014-06-10 17:40:56 +02:00
Jean-Francois Dockes
030e576cdb add excludedmimetypes configuration variable 2014-05-02 10:07:26 +02:00
Jean-Francois Dockes
12e3f683b9 tests: avoid breaking old html text with data for python 2013-10-30 18:37:26 +01:00
Jean-Francois Dockes
338ec6eb42 python api tests: added rclextract test 2013-10-30 18:28:36 +01:00
Jean-Francois Dockes
27584674d5 added python api tests 2013-10-30 16:56:18 +01:00
Jean-Francois Dockes
56a56500c1 Handle partial indexing of document restricted to metadata from extended attributes 2013-10-04 10:57:11 +02:00
Jean-Francois Dockes
906e58feff added code to purge obsolete messages when a compound document (esp. mbox) is shortened and a partial update is performed (no general purge). Else the orphan docs remained in the index potentially forever (needed actual reindex of the file by a full pass to go away) 2013-04-22 11:32:49 +02:00
Jean-Francois Dockes
d3631b5ddf cleaned up processing of metadata from diverse origins (doc,extattrs,localfields) 2013-01-29 14:33:57 +01:00
Jean-Francois Dockes
f897f087aa HTML: do not concatenate text found before body tag with the title. Fixes issue #125 2013-01-12 14:06:40 +01:00
Jean-Francois Dockes
9e88426d13 adjusted results for ordering etc. on new machine 2013-01-05 15:40:47 +01:00
Jean-Francois Dockes
5b09254700 adjusted results for ordering etc. on new machine 2013-01-05 15:36:25 +01:00
Jean-Francois Dockes
5f61c2edff use static linking on macosx 2013-01-04 14:15:34 +01:00
Jean-Francois Dockes
be4a6a9420 add test with dir: spec on a path containing spaces 2012-12-25 10:21:56 +01:00
Jean-Francois Dockes
27da60c015 set mboxquirks tbird on ~/.mozilla 2012-10-26 10:25:22 +02:00
Jean-Francois Dockes
68955d9427 define non default unac_except_trans for tests 2012-10-16 13:35:35 +02:00
Jean-Francois Dockes
5add2e2384 Arrange so we can now open the parent of a document (e.g. chm file instead of temp copy of html page inside chm), even when the parent is itself embedded in an archive 2012-10-12 16:54:52 +02:00
Jean-Francois Dockes
da4e576330 add epub test 2012-10-12 13:37:23 +02:00
Jean-Francois Dockes
d0a1545fff fix a few tests to better run in an utf-8 locale 2012-10-06 15:49:07 +02:00
Jean-Francois Dockes
84b561b040 For plain text files, try alternate decode from 8bit charset when decode from UTF-8 fails 2012-10-06 15:12:49 +02:00
Jean-Francois Dockes
d29719e0f1 small test fixups 2012-10-06 12:11:51 +02:00
Jean-Francois Dockes
720f113d42 new autodiacsens=false default 2012-10-06 12:10:48 +02:00
Jean-Francois Dockes
4812965693 added a few case and diacritics sensitivity test cases 2012-09-27 20:12:24 +02:00
Jean-Francois Dockes
9b273d94e8 ensure that recoll configured with indexStripChars=1 runs as compiled with -DRCL_INDEX_STRIPCHARS (branch: CASEDIACSENS) 2012-09-15 15:16:20 +02:00
Jean-Francois Dockes
812e7b9fcd Added contributed rcltar filter 2012-05-25 17:04:39 +02:00
Jean-Francois Dockes
97ad15c42c Added contributed rcltar filter 2012-05-25 17:04:22 +02:00
Jean-Francois Dockes
1b397b46c5 test xml 2012-05-21 12:02:17 +02:00
Jean-Francois Dockes
e6191b51a8 Html: Just ignore opening and closing <body> and <html> tags. Current browsers show text before or after the body and ignore multiple body tags. Not pushed to 1.17 maint because of possible disruption. Closes issue #92 2012-05-16 10:07:09 +02:00
Jean-Francois Dockes
c0767c213d "program" unit test: file -i output changed from ie text/x-perl to application/x-perl on freebsd 2012-05-16 10:04:34 +02:00
Jean-Francois Dockes
a4c17941b1 Added a configuration parameter to set specific unaccenting/lowercasing for some characters to be handled differently than would result from using the Unicode database. Example: "a with ring above" could be set to be preserved by a Swedish speaker 2012-04-09 12:42:23 +02:00
Jean-Francois Dockes
e338272f58 dia test cases 2012-04-03 17:34:22 +02:00
Jean-Francois Dockes
ed0aed138a doc+tests 2012-03-28 10:19:39 +02:00
Jean-Francois Dockes
a8f124f637 added test cases 2012-03-20 11:17:41 +01:00
Jean-Francois Dockes
2c5e51fb1f fix tests for new file size handling 2012-03-08 15:58:36 +01:00
Jean-Francois Dockes
c53ca49f07 test: html5 meta charset 2012-01-26 19:31:06 +01:00
Jean-Francois Dockes
94989747ba test: okular notes 2012-01-23 21:19:02 +01:00
Jean-Francois Dockes
17542969a5 new gnumeric and okular notes filters 2012-01-23 20:25:55 +01:00
Jean-Francois Dockes
502f7e783e chm filter: handle files lacking a topics node 2011-12-17 16:41:45 +01:00
Jean-Francois Dockes
f544b28b4a Transcode mh_execm text/plain output like we do for mh_exec. Adjust handling of transcoding errors. These changes should fix most cases of non-utf8 text making it to unac/index 2011-10-20 14:00:38 +02:00
Jean-Francois Dockes
8b239e432b test deeply embedded document 2011-09-21 13:47:31 +02:00
Jean-Francois Dockes
ee0d602ab3 Implement anchored searches: terms to be found at a maximum distance of the start or end of the text 2011-09-20 16:42:56 +02:00
Jean-Francois Dockes
f1f6d0cf07 rerooted test results 2011-08-24 09:37:02 +02:00
Jean-Francois Dockes
07f2ec6dbe none 2011-08-23 11:02:45 +02:00
Jean-Francois Dockes
753636a6e6 rar test 2011-08-23 10:59:20 +02:00
Jean-Francois Dockes
38d5f9a2d9 rerooted test results 2011-08-23 10:29:19 +02:00
Jean-Francois Dockes
8362ee5da6 none 2011-08-23 10:10:53 +02:00
Jean-Francois Dockes
bd25305cee put test config under vc 2011-08-22 10:14:16 +02:00
Jean-Francois Dockes
fb874e7759 none 2011-07-16 11:49:32 +02:00
Jean-Francois Dockes
36a97cb8aa test: added html field extraction test 2011-06-24 11:08:12 +02:00
Jean-Francois Dockes
8fe524bd7f add html charset test 2011-06-24 10:40:29 +02:00
Jean-Francois Dockes
dd8f42253c Improve rcldoc filter and switch back to using it for indexing instead of direct antiword exec. This is slightly slower but it does catch a number of .doc files which would not be indexed otherwise 2011-05-10 09:03:13 +02:00