Jean-Francois Dockes
7179e0dbf8
cmdtalk: remove remains of python2 support
2021-09-23 11:19:36 +02:00
Jean-Francois Dockes
3df83ec982
Zip archives: set the modification date attribute for members
2021-07-30 10:53:43 +02:00
Jean-Francois Dockes
a67dd3f8a3
ost/pst filter: fix not fetching the message dates
2021-07-23 19:12:34 +02:00
Jean-Francois Dockes
174ad9fe22
rcl ocr with tesseract: fix stupid breakage in script
2021-06-13 07:14:51 +01:00
Jean-Francois Dockes
e42a4e9669
Chm: fix catenate mode which was broken a long time ago
2021-05-01 10:29:44 +02:00
Jean-Francois Dockes
3865e1b05f
rclchm: chmcatenate=1 would get the handler to crash
2021-05-01 08:10:34 +02:00
Jean-Francois Dockes
5656d376c7
Windows: djvu: need to convert file name before subprocess check_output
2021-04-30 08:37:19 +01:00
Jean-Francois Dockes
3f23000b89
rclpython: when not previewing, just output the file text, with no processing at all. Avoids spurious newlines
2021-04-14 14:26:11 +02:00
Jean-Francois Dockes
7a54c3a110
rclpython.py: don't try to subscript an exception
2021-03-29 09:52:38 +02:00
Jean-Francois Dockes
a4b3aff5c4
rclaudio: if mutagen.File() fails, try with mutagen.ID3()
This allows extracting the tags from, e.g., adts files that are
mistaken for mp3 during initial identification, and for which
the later full mp3 init fails because of the wrong kind of frame.
2021-03-03 12:53:59 +01:00
Jean-Francois Dockes
31f6793495
rclaudio: catch exception when parsing bad date, set date to the epoch
2021-02-25 19:27:24 +01:00
Jean-Francois Dockes
dc934b7ddc
comment
2021-02-10 14:57:40 +01:00
freddii
89c7efe682
fixed typos
2021-02-04 17:12:22 +01:00
Jean-Francois Dockes
50b64caf5e
rclaudio: process the Group tag
2021-01-27 09:32:55 +01:00
Jean-Francois Dockes
2998486d54
revert wrong change in rclaudio
2021-01-19 19:27:48 +01:00
Jean-Francois Dockes
baf2ee8d6b
don't make date a field alias for dmtime: it does not make sense because the formats differ in general
2021-01-16 19:19:29 +01:00
Jean-Francois Dockes
cb13b8b6df
"print fields" change in rclexecm options had broken -s
2021-01-15 14:06:52 +01:00
Jean-Francois Dockes
72a9548c88
fix warning from rclaudio regexp
2021-01-06 12:01:42 +01:00
Jean-Francois Dockes
e00767d98c
rclexecm test/debug: add option -f to dump fields
2020-12-29 15:04:49 +01:00
Jean-Francois Dockes
ee1e84b2f3
comments
2020-12-25 17:35:08 +01:00
Jean-Francois Dockes
53edd7b213
rcl7z: use py7zr if available, rather than pylzma, which does not work on some archives
2020-12-25 17:34:15 +01:00
Jean-Francois Dockes
824e305bb0
Add option to limit tesseract threads
2020-12-17 11:08:31 +01:00
Jean-Francois Dockes
b2f0e2e657
Add handler for emacs org-mode files
2020-11-30 09:50:44 +01:00
Jean-Francois Dockes
33725fd02c
simplify stdout redirection for pdftk
2020-11-25 17:54:06 +01:00
Jean-Francois Dockes
8b6082a89f
shared
2020-11-09 12:13:30 +01:00
Jean-Francois Dockes
f0abc1df68
pdf: discard pdftk stdout message "Error occurred during initialization of VM", it breaks pdf indexing when it occurs
2020-11-04 14:33:55 +01:00
Jean-Francois Dockes
f50a4e54b1
rclpython: renamed rclpython.py. Use rclexecm. Only colorize for preview, not indexing
2020-11-04 10:32:18 +01:00
Jean-Francois Dockes
e10cb959b3
add test for python program (different handler)
2020-10-18 18:38:44 +02:00
Jean-Francois Dockes
25eda37bc9
Index pdf annotations separately under field name annotation. Add annot, pdfannot and pa aliases.
2020-10-12 10:05:38 +02:00
Jean-Francois Dockes
694d0f155d
pdf annot: guard against possible exception while formatting results
2020-10-10 12:48:18 +02:00
Jean-Francois Dockes
96104e7d67
fix rclocrtesseract fix
2020-09-28 11:05:12 +02:00
Jean-Francois Dockes
8accec9b88
rclocrtesseract: unquote tesseractcmd parameter and check existence.
2020-09-24 07:13:21 +02:00
Jean-Francois Dockes
0dd609cf1a
python filters: replace misc message printing with single method in rclexecm
2020-09-23 18:38:22 +02:00
Jean-Francois Dockes
10bdf2a0c8
comments
2020-09-05 09:19:10 +02:00
Jean-Francois Dockes
d62bb9016a
pdf: try to extract annotation text if the python3 poppler-glib binding is available
2020-09-03 16:16:54 +02:00
Jean-Francois Dockes
2c0fd8502a
PDF: pdftk as snap (ubuntu): print warning about pdf attachments if TMPDIR does not belong to user
2020-08-20 11:27:12 +02:00
Jean-Francois Dockes
b305c86041
recoll-we-move-files: apply expanduser to the webdownloadsdir config value
2020-08-17 11:02:46 +02:00
Jean-Francois Dockes
d932d19562
epub handler: extract the opf metadata subjects fields as dc:subject tags. Share more code between rclepub and the now redundant rclepub1 (no more lynx usage in rclepub)
2020-08-09 09:49:08 +02:00
Jean-Francois Dockes
19fe03af62
Support visio .vsdx format
2020-08-04 10:57:13 +02:00
Jean-Francois Dockes
b2e68740ba
PDF: attachment extraction was broken since python3 (wrong open mode r instead of rb for the extracted file)
2020-07-27 09:03:58 +02:00
Jean-Francois Dockes
b4306b71c0
openxml word: be more specific for extracting text, avoids treating some image parameters as text
2020-07-15 10:49:06 +02:00
Jean-Francois Dockes
4508b6b064
rclpdf: avoid crash when external metadata filter can't be imported
2020-07-13 10:13:59 +02:00
Jean-Francois Dockes
73f2836317
korean splitter: add inactive option to split on white space before calling the tagger
2020-05-19 09:22:16 +02:00
Jean-Francois Dockes
c6dac9347f
cmdtalk: catch param decoding exceptions
2020-05-14 09:23:46 +02:00
Jean-Francois Dockes
dce3bff5d7
comment
2020-04-19 09:19:28 +02:00
Jean-Francois Dockes
c38db0f160
comment
2020-04-18 09:15:45 +02:00
Jean-Francois Dockes
b63cc1b712
Korean splitter script: use python-mecab-ko if possible, else konlpy
2020-04-10 14:27:06 +02:00
Jean-Francois Dockes
e8194dea9d
comment
2020-04-08 09:51:37 +02:00
Jean-Francois Dockes
d3de1f0d6f
add common execPythonScript method to rclexecm
2020-04-07 10:09:09 +02:00
Jean-Francois Dockes
32ebd65ba8
Windows: small changes for porting back from msvc to mingw
2020-04-07 09:40:00 +02:00