ae3df4f7c3 cjk phrases: hopefully the right fix this time for slack computation. lastpos-termcount correction was applied twice
Jean-Francois Dockes
2019-07-06 13:52:51 +02:00
55545a7bdb hldata: more tests
Jean-Francois Dockes
2019-07-06 13:50:26 +02:00
5b6436ca08 add test driver for hldata:matchGroup + some help from textsplit
Jean-Francois Dockes
2019-07-06 11:39:09 +02:00
c588fddb83 hldata: matchGroup: return false if no match found
Jean-Francois Dockes
2019-07-06 11:38:21 +02:00
63cfcf0ad5 hldata generation during query processing: increase slack if position increases faster than term count (cjk)
Jean-Francois Dockes
2019-07-06 08:28:05 +02:00
9ae775b095 snippets: allow a little more contiguous expansion of current snippet
Jean-Francois Dockes
2019-07-06 08:26:42 +02:00
c858f16877 updated the message files
Jean-Francois Dockes
2019-07-05 19:12:09 +02:00
fcc6ae556c bump master to 1.26
Jean-Francois Dockes
2019-07-05 18:05:53 +02:00
f877e7e459 Highlighting and snippets extraction: reworked to handle phrases properly. Use a compound position list instead of multiplying the OR groups inside a near clause
Jean-Francois Dockes
2019-07-05 18:02:09 +02:00
00eb803f5d Do not process hangul as words, but as ngrams. Same issues as with Katakana: word separation too hard
Jean-Francois Dockes
2019-07-05 17:57:00 +02:00
4ad8a08030 hldata: cleanup + support phrases
Jean-Francois Dockes
2019-07-05 11:43:14 +02:00
3f33d1d0ea utf8iter driver: read from stdin
Jean-Francois Dockes
2019-07-05 11:33:29 +02:00
75759ed481 GUI: snippets: don't recreate the window each time, allow displaying data for multiple documents. restable: update snippets when changing current row
Jean-Francois Dockes
2019-07-03 13:46:38 +02:00
b79332580a Regularise processing of hangul characters (there was a mixup of cjk/regular processing), and add a build-time option to either use cjk/ngram or regular term splitting for them
Jean-Francois Dockes
2019-07-02 18:02:38 +02:00
36caf02133 bumped version to 1.25.20~pre1
Jean-Francois Dockes
2019-06-28 15:35:28 +02:00
3ccd2364b9 GUI snippets window: add options for the max list length and for sorting the snippets by page number
Jean-Francois Dockes
2019-06-28 14:20:47 +02:00
0a460ea9c6 The container for temp files to be removed was a vector, but it needed stable member addresses. make it a list
Jean-Francois Dockes
2019-06-27 11:12:01 +02:00
3f7d270691 GUI preview: improve operation when the index data is not up to date. Avoid erasing all the file index data in case the subsequent update fails (e.g. the file is locked). Improve the messages. Check for previous indexing error, and modify the message.
Jean-Francois Dockes
2019-06-24 17:37:37 +02:00
4c2fd82d4e pst: wait for pffexport and generate error if exit code is not 0
Jean-Francois Dockes
2019-06-24 11:47:17 +02:00
ee8c5410bd Avoid purging existing subdocuments on file indexing error (e.g.: maybe a file lock issue that will go away)
Jean-Francois Dockes
2019-06-21 17:18:15 +02:00
db9fd248f3 7z: properly list the needed package as pylzma
Jean-Francois Dockes
2019-06-21 16:57:58 +02:00
be81082f38 default config: fixed some mtypes without icons or catgs
Jean-Francois Dockes
2019-06-17 08:12:33 +02:00
e38e58c37a In case the self-doc was not sent first by the handler, its udi was not recalculated, and it clobbered the last subdoc
Jean-Francois Dockes
2019-06-16 13:46:00 +02:00
be214c4a5a Take advantage of text storage when possible to display preview data for an inaccessible document
Jean-Francois Dockes
2019-06-16 11:49:18 +02:00
5d25094107 pst: pass the command line ipath as base64 as there is no msw way to pass utf-8
Jean-Francois Dockes
2019-06-14 14:33:49 +02:00
d20172032b transcode: separate main program
Jean-Francois Dockes
2019-06-14 10:15:17 +02:00
6c73a0d666 pst: reset generator for new file
Jean-Francois Dockes
2019-06-13 16:16:32 +02:00
bec40e9a31 test: fix small issue in config introduced by previous change
Jean-Francois Dockes
2019-06-13 16:15:47 +02:00
5ff1a92a51 pdf: ocr: small fixes, plus make pdfocr redefinable in subdirs
Jean-Francois Dockes
2019-06-13 09:47:25 +02:00
1991e132a7 bumped version to 1.25.19
Jean-Francois Dockes
2019-06-13 08:38:02 +02:00
9dcdb6e9a6 pdf: ocr function was broken for python3 in some cases (depending on how the ocr language was specified)
Jean-Francois Dockes
2019-06-13 08:33:55 +02:00
4c205e44e0 tests: test the xmp metadata extraction
Jean-Francois Dockes
2019-06-12 19:22:30 +02:00
b895980e95 PDF: fix the XMP metadata extraction code for python3 and other issues. Also get metadata from XML attributes
Jean-Francois Dockes
2019-06-12 19:21:37 +02:00
b759490559 gcc 9.1: comparison object needs to be invocable as const. fixes issue #95
Jean-Francois Dockes
2019-06-12 11:17:35 +02:00
f0944ae0b2 rclpst: indexing / searching mostly working with maybe issues in data charset conversions (check). Preview does not work, ipath needs conversion inside pffexport
Jean-Francois Dockes
2019-05-28 18:39:37 +02:00
0101e6e160 bumped version ->1.25.18
Jean-Francois Dockes
2019-05-27 17:08:43 +02:00
2c9bc17587 fix webengine reslist which had stopped working at some point in qt revs. Now working with 5.9-5.12 at least
Jean-Francois Dockes
2019-05-27 17:04:28 +02:00
c1553029b9 Pst on Unix: email message indexing seems fully ok
Jean-Francois Dockes
2019-05-27 12:17:41 +02:00
cc4f4e0c74 ckpt: pst: basic indexing of email. no getipath/preview
Jean-Francois Dockes
2019-05-26 12:30:59 +02:00
c7c413d9e7 email address in copyright
Jean-Francois Dockes
2019-05-26 12:29:36 +02:00
e63ce00935 sanitize version for tagging
Jean-Francois Dockes
2019-05-22 09:11:08 +02:00
2b586a70bd added traces to syngroup building
Jean-Francois Dockes
2019-05-22 09:09:13 +02:00
38b3e63bd4 log: print recoll version in rclinit
Jean-Francois Dockes
2019-05-21 19:34:39 +02:00
bc5ea83a3a qt gui: fix highlighting for the table mode display. the query terms were fetched too early, before executing the query. Also share hiliter between restable and reslist and avoid allocating another DocSource for restable, share the one from reslist.
Jean-Francois Dockes
2019-05-21 11:20:14 +02:00
ef5eed8bc6 interim version 1.25.16~pre1
Jean-Francois Dockes
2019-05-17 10:31:30 +02:00
8ddcc578ac Reverted 34d43d1188adfddb8fd8a4f7c7a28158a8b534f4 ("Keep only the main Snippet-producing makeabstract in rclquery, further formatting done in using modules"). This was just a bad idea. The common methods are also used by the python module
Jean-Francois Dockes
2019-05-17 10:19:03 +02:00
a5810508ed abstract: optimize the way we retrieve the wdfs by sorting the list of terms we query for. Big difference on very big docs
Jean-Francois Dockes
2019-05-17 09:39:26 +02:00
fdb14e60ac building abstract from stored text: limit count of terms explored to avoid taking forever on monster (multi mega-terms) documents
Jean-Francois Dockes
2019-05-17 09:37:39 +02:00
37e203d535 mh_text: log message when skipping file with size over max
Jean-Francois Dockes
2019-05-17 09:32:46 +02:00
780521ec6c When checking if user input contains capital letters, take care of some lowercase letters which don't casefold to themselves
Jean-Francois Dockes
2019-05-16 15:35:11 +02:00