Jean-Francois Dockes
|
d6b230043c
|
Check for newer pdftotext version to avoid double HTML escaping. fixes issue #318
|
2016-08-05 08:51:34 +02:00 |
|
Jean-Francois Dockes
|
b9e672abda
|
Allow execm input handlers to set arbitrary data fields
|
2016-07-11 18:13:39 +02:00 |
|
Jean-Francois Dockes
|
236900ee2a
|
comments
|
2016-05-23 19:16:31 +02:00 |
|
Jean-Francois Dockes
|
b2bd67cee8
|
added bogus minimum sample execm handler, indexing text lines as docs
|
2016-05-23 18:59:00 +02:00 |
|
Jean-Francois Dockes
|
b421f86f72
|
renamed rclmpdf.py to more normal rclpdf.py
|
2016-04-11 13:59:07 +02:00 |
|
Jean-Francois Dockes
|
4830e35a1b
|
pdf: add config variables to control if we attempt attachment extraction and ocr
|
2016-04-11 13:57:58 +02:00 |
|
Jean-Francois Dockes
|
74088bdada
|
doc
|
2016-04-09 20:01:48 +02:00 |
|
Jean-Francois Dockes
|
b995cfb4e8
|
added module for simplified interface to libxmp
|
2016-04-08 11:37:23 +02:00 |
|
Jean-Francois Dockes
|
031cdf9761
|
converted rcldjvu to python
|
2016-04-08 10:24:52 +02:00 |
|
Jean-Francois Dockes
|
95bd49b420
|
Restore PDF OCR capability from shell version of rclpdf script
|
2016-04-08 09:00:23 +02:00 |
|
Jean-Francois Dockes
|
92bb5bfc43
|
xls filter: catch HTML files disguised as XLS
|
2016-02-26 09:35:23 +01:00 |
|
Jean-Francois Dockes
|
b4c1fd033a
|
effect-less typo
|
2016-02-26 08:45:07 +01:00 |
|
Jean-Francois Dockes
|
d115bcfaa2
|
rclmpdf.py: p2/3 compat
|
2015-11-21 12:46:58 +01:00 |
|
Jean-Francois Dockes
|
5776c4bc20
|
rclinfo: remove trace message
|
2015-11-21 12:46:28 +01:00 |
|
Jean-Francois Dockes
|
953144d131
|
Make sure to execute python2 scripts with python2
|
2015-11-16 15:18:59 +01:00 |
|
Jean-Francois Dockes
|
683a258d4d
|
more python3 tweaks
|
2015-11-16 13:19:44 +01:00 |
|
Jean-Francois Dockes
|
452e5c1c59
|
comments
|
2015-11-16 09:26:19 +01:00 |
|
Jean-Francois Dockes
|
585f651919
|
Use os.devnull instead of /dev/null
|
2015-11-15 16:04:55 +01:00 |
|
Jean-Francois Dockes
|
2e78f573de
|
more py3 fixups
|
2015-11-07 17:19:40 +01:00 |
|
Jean-Francois Dockes
|
dfe00ab11f
|
more filters made compatible with python3
|
2015-11-07 16:59:17 +01:00 |
|
Jean-Francois Dockes
|
f344e8fedd
|
first pass at converting the filters for python 2/3 compat
|
2015-11-06 16:49:03 +01:00 |
|
Jean-Francois Dockes
|
d416acf1c0
|
use $HOME instead of ~
|
2015-10-27 07:38:05 +01:00 |
|
Jean-Francois Dockes
|
8324f09d19
|
Get uncompression to work and fix a few other issues
|
2015-10-13 16:48:16 +02:00 |
|
Jean-Francois Dockes
|
a02a611694
|
let filter 'which' find a command in a specified subdir of PATH elements
|
2015-10-13 10:00:48 +02:00 |
|
Jean-Francois Dockes
|
4c3e112c27
|
Use the python-based filters written for ms-win on Linux too
|
2015-10-11 08:41:15 +02:00 |
|
Jean-Francois Dockes
|
0e6d921f9a
|
added image tag filter based on pyexiv2
|
2015-10-10 18:40:04 +02:00 |
|
Jean-Francois Dockes
|
1e3ce6c36f
|
Pure mingw build ok
|
2015-10-08 15:32:01 +02:00 |
|
Jean-Francois Dockes
|
453ed8748a
|
Windows: manage timeouts, time and size limits
|
2015-10-08 14:08:36 +02:00 |
|
Jean-Francois Dockes
|
374f775092
|
Added possibly incomplete rcluncomp.py script for windows
|
2015-10-01 18:23:30 +02:00 |
|
Jean-Francois Dockes
|
a411d4c964
|
Windows: small fixes for rclmpdf.py to work with alivate poppler
|
2015-10-01 16:36:29 +02:00 |
|
Jean-Francois Dockes
|
376e5485dc
|
prepare rclmpdf->rclmpdf.py for windows
|
2015-10-01 15:09:45 +02:00 |
|
Jean-Francois Dockes
|
031a2a0b4a
|
Small filter fixes
--HG--
branch : WINDOWSPORT
|
2015-09-14 14:19:23 +02:00 |
|
Jean-Francois Dockes
|
7337e5a9ff
|
filters: use rb instead of r
--HG--
branch : WINDOWSPORT
|
2015-09-14 11:36:36 +02:00 |
|
Jean-Francois Dockes
|
002eb67185
|
python scripts for ppt and xls
--HG--
branch : WINDOWSPORT
|
2015-09-14 11:32:16 +02:00 |
|
Jean-Francois Dockes
|
86ef362461
|
rclimg (tweaks for perl)
--HG--
branch : WINDOWSPORT
|
2015-09-14 10:33:39 +02:00 |
|
Jean-Francois Dockes
|
24c77d2984
|
more filter conversion to python: svg and xml. Get rid of rclnull
--HG--
branch : WINDOWSPORT
|
2015-09-14 09:51:11 +02:00 |
|
Jean-Francois Dockes
|
36b36f2c69
|
rcltext.py
--HG--
branch : WINDOWSPORT
|
2015-09-13 10:34:40 +02:00 |
|
Jean-Francois Dockes
|
42401c8f26
|
windows: rclrtf.py and rcldoc.py apparently working ok
--HG--
branch : WINDOWSPORT
|
2015-09-12 16:53:24 +02:00 |
|
Jean-Francois Dockes
|
118982d25e
|
cleanup in new python filters
--HG--
branch : WINDOWSPORT
|
2015-09-12 10:54:26 +02:00 |
|
Jean-Francois Dockes
|
330c7fc30d
|
Python filters beginning to work, still issues.
--HG--
branch : WINDOWSPORT
|
2015-09-11 16:16:16 +02:00 |
|
Jean-Francois Dockes
|
bd58ffb920
|
open xml python + xslt filter
--HG--
branch : WINDOWSPORT
|
2015-09-10 17:39:49 +02:00 |
|
Jean-Francois Dockes
|
8794932158
|
converted/duplicated rclsoff to rclsoff.py, using python-libxslt/xml
--HG--
branch : WINDOWSPORT
|
2015-09-07 15:34:39 +02:00 |
|
Jean-Francois Dockes
|
e40cf64e66
|
New python-based msword filter + basic arch to convert the others
--HG--
branch : WINDOWSPORT
|
2015-09-07 11:16:20 +02:00 |
|
Jean-Francois Dockes
|
f00ed2ba5a
|
actually postprocess
--HG--
branch : WINDOWSPORT
|
2015-09-07 09:23:07 +02:00 |
|
Jean-Francois Dockes
|
16f495a9c0
|
temp ckpt
--HG--
branch : WINDOWSPORT
|
2015-09-06 19:55:43 +02:00 |
|
Jean-Francois Dockes
|
766a34a8db
|
fix flac mime types in rclaudio + small changes for experimenting with embedding an interpreter in recollindex
|
2015-08-23 09:29:26 +02:00 |
|
Jean-Francois Dockes
|
83939e45ab
|
import sys
|
2015-08-09 13:37:30 +02:00 |
|
Jean-Francois Dockes
|
6a6552ee43
|
exit with meaningful status
|
2015-07-31 11:24:56 +02:00 |
|
Jean-Francois Dockes
|
922a9384f9
|
rclpdf: work with newer poppler versions which do escape html text inside <head>
|
2015-06-30 10:35:22 +02:00 |
|
Jean-Francois Dockes
|
eaddefa7c5
|
Add capability to run tesseract from rclpdf. Disabled by default, see comments at the top of rclpdf
|
2015-04-24 18:13:52 +02:00 |
|