469 Commits

Author SHA1 Message Date
Jean-Francois Dockes
c86cb9438b handlers: remove remnant bits of python2 compat 2022-09-05 10:43:50 +02:00
Jean-Francois Dockes
422d24e94e recoll-we-move-files: do not accept downloadsdir parameter, add -c configdir option
The parameter was not used by recollindex anyway and the script NEEDS access to the config for
retrieving other values. This can also be set with RECOLL_CONFDIR (which recollindex does).
2022-08-26 15:24:05 +02:00
Jean-Francois Dockes
489e88c87d Have the indexer actually set the downloadsdir parameter on the recoll-we-move-files script command line. Previously, a custom directory could only be set for the default recoll configuration or if RECOLL_CONFDIR was set 2022-08-22 18:08:17 +02:00
Jean-Francois Dockes
225081563d rclpdf: encoding issue in attachment extract 2022-08-15 17:44:43 +02:00
Jean-Francois Dockes
e43e223ef8 rclaudio: process COMM==lang= tags from mutagen into comments 2022-08-04 16:11:30 +02:00
Jean-Francois Dockes
2e53989d8f usage string 2022-08-04 16:11:01 +02:00
Jean-Francois Dockes
5a3b366911 revert wrong flac embedded img test change from 4954c1d8 2022-08-04 13:58:55 +02:00
Jean-Francois Dockes
4954c1d855 Use audio/flac as MIME type for flac files as this seems now to be the norm 2022-08-02 15:01:05 +02:00
Jean-Francois Dockes
9b06d05be7 python preview filter: avoid printing spurious encoding value 2022-07-31 21:58:03 +02:00
Jean-Francois Dockes
9a7561517f OCR cache: do not create Path entries for temporary files 2022-06-18 16:06:51 +02:00
Jean-Francois Dockes
6575476d47 rclpdf.py: opening attachment: need to decode the utf-8 ipath 2022-06-14 14:52:26 +02:00
Jean-Francois Dockes
e6596cb26d rclcheckneedretry: avoid useless error messages when there is no $HOME 2022-06-13 13:00:12 +02:00
Jean-Francois Dockes
f2b24cf22d Add orgmodesubdocs recoll.conf parameter to switch rclorgmode between using whole text and creating level-1 subdocs (default is subdocs) 2022-04-08 08:52:23 +02:00
Jean-Francois Dockes
ad2db6cd1c orgmode: also index text before the first heading. Add mimeview config 2022-04-06 19:29:23 +02:00
Jean-Francois Dockes
8b3792026f Renamed a few extension-less python handlers with a .py extension for consistency 2022-01-14 12:12:22 +01:00
Jean-Francois Dockes
667e661c46 Standardize the shebang line of python scripts to using /usr/bin/env, which was already the vastly dominant choice 2022-01-14 09:27:04 +01:00
Jean-Francois Dockes
b3bb3784fc Define specific document type for orgmode sub-documents 2022-01-04 14:02:26 +01:00
Jean-Francois Dockes
5fcffb7654 tesseract ocr: use compressed tif temp pages if pdftocairo is available (10x smaller than ppm) 2021-12-04 09:35:10 +01:00
Jean-Francois Dockes
e121695a3c Python handlers: factorise tmp dir code 2021-12-03 11:03:23 +01:00
Jean-Francois Dockes
1593b1d87f Change the way rclpdf executes rclocr to avoid the command being killed before it can clean up when a signal is raised (e.g. timeout or kbd interrupt) 2021-12-03 10:49:44 +01:00
Jean-Francois Dockes
58d98b5626 PST : account for badly formed headers 2021-10-21 20:42:27 +02:00
Jean-Francois Dockes
1d158f329a pst: account for possible failure in decoding body and possible "unicode" name for encoding 2021-10-19 09:53:59 +02:00
Jean-Francois Dockes
8a98635c3a ipynb: format variations 2021-10-10 09:45:37 +02:00
Jean-Francois Dockes
7b81c16ea0 add support for ipython/jupyter notebooks 2021-10-10 08:11:59 +02:00
Jean-Francois Dockes
7179e0dbf8 cmdtalk: remove remains of python2 support 2021-09-23 11:19:36 +02:00
Jean-Francois Dockes
3df83ec982 Zip archives: set the modification date attribute for members 2021-07-30 10:53:43 +02:00
Jean-Francois Dockes
a67dd3f8a3 ost/pst filter: fix not fetching the message dates 2021-07-23 19:12:34 +02:00
Jean-Francois Dockes
174ad9fe22 rcl ocr with tesseract: fix stupid breakage in script 2021-06-13 07:14:51 +01:00
Jean-Francois Dockes
e42a4e9669 Chm: fix catenate mode which was broken a long time ago 2021-05-01 10:29:44 +02:00
Jean-Francois Dockes
3865e1b05f rclchm: chmcatenate=1 would get the handler to crash 2021-05-01 08:10:34 +02:00
Jean-Francois Dockes
5656d376c7 Windows: djvu: need to convert file name before subprocess check_output 2021-04-30 08:37:19 +01:00
Jean-Francois Dockes
3f23000b89 rclpython: when not previewing, just output the file text, with no processing at all. Avoids spurious newlines 2021-04-14 14:26:11 +02:00
Jean-Francois Dockes
7a54c3a110 rclpython.py: don't try to subscript an exception 2021-03-29 09:52:38 +02:00
Jean-Francois Dockes
a4b3aff5c4 rclaudio: if mutagen.File() fails, try with mutagen.ID3()
This allows extracting the tags e.g. from adts files
mistaken for mp3 during initial identification, and for which
the later full mp3 init fails because of the wrong kind of frame.
2021-03-03 12:53:59 +01:00
Jean-Francois Dockes
31f6793495 rclaudio: catch exception when parsing bad date, set date to the epoch 2021-02-25 19:27:24 +01:00
Jean-Francois Dockes
dc934b7ddc comment 2021-02-10 14:57:40 +01:00
freddii
89c7efe682 fixed typos 2021-02-04 17:12:22 +01:00
Jean-Francois Dockes
50b64caf5e rclaudio: process the Group tag 2021-01-27 09:32:55 +01:00
Jean-Francois Dockes
2998486d54 revert wrong change in rclaudio 2021-01-19 19:27:48 +01:00
Jean-Francois Dockes
baf2ee8d6b don't make date a field alias for dmtime, does not make sense because of different formats in general 2021-01-16 19:19:29 +01:00
Jean-Francois Dockes
cb13b8b6df "print fields" change in rclexecm options had broken -s 2021-01-15 14:06:52 +01:00
Jean-Francois Dockes
72a9548c88 fix warning from rclaudio regexp 2021-01-06 12:01:42 +01:00
Jean-Francois Dockes
e00767d98c rclexecm test/debug: add option -f to dump fields 2020-12-29 15:04:49 +01:00
Jean-Francois Dockes
ee1e84b2f3 comments 2020-12-25 17:35:08 +01:00
Jean-Francois Dockes
53edd7b213 rcl7z: use py7zr if available, rather than pylzma, which does not work on some archives 2020-12-25 17:34:15 +01:00
Jean-Francois Dockes
824e305bb0 Add option to limit tesseract threads 2020-12-17 11:08:31 +01:00
Jean-Francois Dockes
b2f0e2e657 Add handler for emacs org-mode files 2020-11-30 09:50:44 +01:00
Jean-Francois Dockes
33725fd02c simplify stdout redirection for pdftk 2020-11-25 17:54:06 +01:00
Jean-Francois Dockes
8b6082a89f shared 2020-11-09 12:13:30 +01:00
Jean-Francois Dockes
f0abc1df68 pdf: discard pdftk stdout message "Error occurred during initialization of VM", it breaks pdf indexing when it occurs 2020-11-04 14:33:55 +01:00