200 Commits

Author SHA1 Message Date
Jean-Francois Dockes
766a34a8db fix flac mime types in rclaudio + small changes for experimenting with embedding an interpreter in recollindex 2015-08-23 09:29:26 +02:00
Jean-Francois Dockes
83939e45ab import sys 2015-08-09 13:37:30 +02:00
Jean-Francois Dockes
6a6552ee43 exit with meaningful status 2015-07-31 11:24:56 +02:00
Jean-Francois Dockes
922a9384f9 rclpdf: work with newer poppler versions which do escape html text inside <head> 2015-06-30 10:35:22 +02:00
Jean-Francois Dockes
eaddefa7c5 Add capability to run tesseract from rclpdf. Disabled by default, see comments at the top of rclpdf 2015-04-24 18:13:52 +02:00
Jean-Francois Dockes
1e6f56522e Let recollindex execute a script at startup to try and guess if it should retry failed files 2015-04-24 10:46:58 +02:00
Jean-Francois Dockes
fb83946183 Contributed rclscribus fixes, thanks to Morten 2015-04-20 09:16:37 +02:00
Jean-Francois Dockes
47b1d77c5d guard against spaces in filenames inside rclokulnote and rcldoc filters 2015-04-17 13:12:01 +02:00
Francois Botha
d80db8c09f Implement filter for .7z files. Based on rclzip and rcltar 2015-04-06 09:57:00 +02:00
Jean-Francois Dockes
fbb2c257a5 python2->python in script headers 2015-02-27 18:43:27 +01:00
Jean-Francois Dockes
cf33d7531c Make xls-dump.py errors less noisy, hopefully avoiding system reports on Fedora 2014-12-18 15:35:42 +01:00
Jean-Francois Dockes
02874255d8 rclmpdf ok? 2014-10-29 11:57:44 +01:00
Jean-Francois Dockes
86bc0e9104 dquot -> quot! 2014-10-29 11:57:18 +01:00
Jean-Francois Dockes
293468bd58 new pdf filter which can process attachments 2014-10-29 08:20:03 +01:00
Jean-Francois Dockes
7837558909 rclpurple: fix for current log format 2014-10-01 11:37:20 +02:00
Jean-Francois Dockes
552eb0965b rclpdf: also escape text inside meta content attributes 2014-08-25 14:16:45 +02:00
Jean-Francois Dockes
729be49a1b Improved error message, closes issue #207 2014-07-14 08:30:41 +02:00
Jean-Francois Dockes
958a8f6abb zip: improved error output. Fixes issue #201 2014-07-06 16:32:41 +02:00
Jean-Francois Dockes
cada24896f ppt-dump: improve error messages 2014-07-06 16:27:40 +02:00
Jean-Francois Dockes
25271db690 msword docs: avoid generating an error for files containing only a picture (empty antiword output) 2014-07-06 16:24:11 +02:00
Jean-Francois Dockes
62c2ff3d4c OpenOffice filter: do produce white space for tab input! 2014-06-24 08:13:32 +02:00
Jean-Francois Dockes
27f77addd6 rcltar: clean up import statements 2014-06-07 11:45:25 +02:00
Jean-Francois Dockes
28a4e4d8a8 catch ppt-dump errors to avoid bogus system reports 2014-05-06 11:39:27 +02:00
Jean-Francois Dockes
45b845769c Replace catdoc with mso-dumper for XLS too 2014-01-09 17:44:05 +01:00
Jean-Francois Dockes
ea2c80f3a8 PPT filter: fix infinite loop in script (happened on invalid files) 2013-11-21 12:59:13 +01:00
Jean-Francois Dockes
064c247499 PPT filter: use mso-dump 2013-11-19 14:42:05 +01:00
Jean-Francois Dockes
aca05b7b2a comments 2013-11-19 14:41:14 +01:00
Jean-Francois Dockes
f078369cbb rclppt: fix absolute paths 2013-11-14 19:20:36 +01:00
Jean-Francois Dockes
9c42bab11b ppt filter: support unoconv 0.4 by using directory as parameter to -o 2013-11-14 19:09:47 +01:00
Jean-Francois Dockes
134153e412 powerpoint: decide to use unoconv based on the number of lines in catppt output 2013-11-12 10:40:07 +01:00
Jean-Francois Dockes
a9358d2f03 Powerpoint docs: add option to have rclppt use unoconv 2013-11-12 09:56:50 +01:00
Jean-Francois Dockes
9d25a0475f have the zip filter access the config if possible and use the zipSkippedNames variable 2013-06-10 14:03:24 +02:00
Jean-Francois Dockes
ea27248837 test driver: no data output by default 2013-06-10 14:01:03 +02:00
Jean-Francois Dockes
2018ef76b8 extract more svg metadata 2013-03-28 08:49:40 +01:00
Jean-Francois Dockes
d3631b5ddf cleaned up processing of metadata from diverse origins (doc,extattrs,localfields) 2013-01-29 14:33:57 +01:00
Jean-Francois Dockes
e24bd240f9 Implement workaround to character encoding issues in chm files and python HTMLParser 2012-12-05 13:24:02 +01:00
Jean-Francois Dockes
e3664ca88b handle filters returning unicode objects 2012-10-23 16:32:52 +02:00
Jean-Francois Dockes
c92cf26316 extract epub metadata into top document 2012-10-23 16:32:20 +02:00
Jean-Francois Dockes
816980a1c4 implemented advanced search history feature 2012-10-16 13:37:56 +02:00
Jean-Francois Dockes
5add2e2384 Arrange so we can now open the parent of a document (e.g. chm file instead of temp copy of html page inside chm), even when the parent is itself embedded in an archive 2012-10-12 16:54:52 +02:00
Jean-Francois Dockes
c7a35a176c none 2012-10-12 13:35:21 +02:00
Jean-Francois Dockes
7fcb7c9bf7 ensure chm file can be renamed 2012-10-12 13:34:56 +02:00
Jean-Francois Dockes
d4edbbaedb rclepub: use elt ids instead of hrefs + debug traces 2012-10-11 15:35:15 +02:00
Jean-Francois Dockes
7c18d74541 add epub viewer and set rclaptg meta tag for chm and info 2012-10-11 14:03:30 +02:00
Jean-Francois Dockes
7037e1ca38 fix 8bit file name processing 2012-10-06 12:00:05 +02:00
Jean-Francois Dockes
ff2e12f149 glitch in maxmemberkb handling 2012-10-06 11:59:48 +02:00
Jean-Francois Dockes
29fe1e4927 implemented maxmemberkb limit for multidoc (e.g. archive) members 2012-10-06 09:05:35 +02:00
Jean-Francois Dockes
5b3cb69ee9 let rcldvi and rclps emit ^L page markers for use with %p and evince 2012-10-04 09:49:03 +02:00
Jean-Francois Dockes
b321b0babb skip very big files (50M) in zip, tar and rar extractors 2012-10-04 08:22:33 +02:00
Jean-Francois Dockes
2bb14cc6ff none 2012-10-04 08:21:54 +02:00