599 Commits

Author SHA1 Message Date
Jean-Francois Dockes
39c152bada Fixed MSVC warnings, all innocuous 2020-04-17 14:26:40 +01:00
Jean-Francois Dockes
12ebb7ac6e Windows: deal with non-ASCII user login, non-ASCII paths in confdir etc. 2020-04-15 14:03:04 +01:00
Jean-Francois Dockes
9565663f09 textsplit: create isNGRAMMED() method to replace isCJK() and let the latter actually return what it says 2020-04-14 09:27:26 +02:00
Jean-Francois Dockes
5dd8774b3c whitespace and indents only 2020-04-14 09:25:13 +02:00
Jean-Francois Dockes
6999284c42 indent and decls 2020-04-05 13:46:47 +01:00
Jean-Francois Dockes
afcacf63c0 Fix page handling in Korean splitter, bug would shift the byte positions, with bad consequences for snippets 2020-03-31 16:11:37 +02:00
Jean-Francois Dockes
b6cd22c320 rcldb: message log level change (docid beyond updated.size()) 2020-03-27 10:56:14 +01:00
Jean-Francois Dockes
414222c003 use conftree conversions 2019-12-02 09:37:34 +01:00
Jean-Francois Dockes
f42338c026 recollq: add option to obtain exact result count 2019-11-28 16:13:27 +01:00
Jean-Francois Dockes
1243c30980 rcldb_p needs to include log.h if threads disabled 2019-11-25 09:58:26 +01:00
Jean-Francois Dockes
b368e4276f do not include excluded terms in the highlight information data 2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
6a405e2089 hldata: comments + map->unordered_map 2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
8ed8d05aab cjk phrases: hopefully the right fix this time for slack computation. lastpos-termcount correction was applied twice 2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
fae0621d76 hldata generation during query processing: increase slack if position increases faster than term count (cjk) 2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
6cd2c9e2ca snippets: allow a little more contiguous expansion of current snippet 2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
35ee3f7a13 Highlighting and snippets extraction: reworked to handle phrases properly. Use a compound position list instead of multiplying the OR groups inside a near clause 2019-07-21 19:09:51 +02:00
Jean-Francois Dockes
736051fcd6 GUI snippets window: add options for the max list length and for sorting the snippets by page number 2019-07-21 19:09:51 +02:00
Jean-Francois Dockes
3f7d270691 GUI preview: improve operation when the index data is not up to date.
Avoid erasing all the file index data in case the subsequent update fails
(e.g. the file is locked). Improve the messages. Check for previous
indexing error, and modify the message.
2019-06-24 17:37:37 +02:00
Jean-Francois Dockes
ee8c5410bd Avoid purging existing subdocuments on file indexing error (e.g.: maybe a file lock issue that will go away) 2019-06-21 17:18:15 +02:00
Jean-Francois Dockes
be214c4a5a Take advantage of text storage when possible to display preview data for an inaccessible document 2019-06-16 11:49:18 +02:00
Jean-Francois Dockes
2a945c9443 abstract: we used to discard snippets too early, before they might get a phrase weight boost 2019-05-24 08:51:11 +02:00
Jean-Francois Dockes
81a91404a4 logs 2019-05-18 16:50:12 +02:00
Jean-Francois Dockes
8ddcc578ac Reverted 34d43d1188adfddb8fd8a4f7c7a28158a8b534f4
Keep only the main Snippet-producing makeabstract in rclquery, further
  formatting done in using modules
This was just a bad idea. The common methods are also used by the python module
2019-05-17 10:19:03 +02:00
Jean-Francois Dockes
a5810508ed abstract: optimize the way we retrieve the wdfs by sorting the list of terms we query for. Big difference on very big docs 2019-05-17 09:39:26 +02:00
Jean-Francois Dockes
fdb14e60ac building abstract from stored text: limit count of terms explored to avoid taking forever on monster (multi mega-terms) documents 2019-05-17 09:37:39 +02:00
Jean-Francois Dockes
8428093f6a synfamily: indent/log formats/extracted test main. No real change 2019-05-16 15:31:41 +02:00
Jean-Francois Dockes
10a500aa1c comment 2019-05-16 15:29:52 +02:00
Jean-Francois Dockes
d4900584c8 comment 2019-05-16 15:29:32 +02:00
Jean-Francois Dockes
34d43d1188 Keep only the main Snippet-producing makeabstract in rclquery, further formatting done in using modules 2019-05-13 18:11:23 +02:00
Jean-Francois Dockes
ee5a260d54 Print xapian error when flush fails during purge 2019-05-02 10:36:24 +02:00
Jean-Francois Dockes
54f0eda990 make doc.meta an unordered_map 2019-04-20 15:04:19 +02:00
Jean-Francois Dockes
34bb62a8d9 got rid of a few unused variable warnings 2019-04-11 15:31:27 +02:00
Jean-Francois Dockes
b914e1d5b9 Fix abstract building when additional indexes are built: the raw text must be fetched by get_metadata() from its own index, not the combined one 2019-03-20 10:57:05 +01:00
Jean-Francois Dockes
0cbc46732f Fixed the FSF address 2019-03-04 11:19:14 +01:00
Jean-Francois Dockes
037aa07bfd Suppress compiler warning about a possibly truncated snprintf (no real problem) by increasing a buffer size 2019-03-04 10:58:49 +01:00
Jean-Francois Dockes
b69912bfab Fix crash during abstract generation, occurring when no matching fragments are found 2019-02-19 19:02:23 +01:00
Jean-Francois Dockes
9574030edc No need for boosting the original term if there was no expansion 2019-02-14 14:54:17 +01:00
Jean-Francois Dockes
b079f0fb94 adjust log message levels and fix a warning 2019-02-04 11:42:35 +01:00
Jean-Francois Dockes
399c633efd Avoid purging documents from absent mountable volumes 2019-02-03 18:51:52 +01:00
Jean-Francois Dockes
a16e39a92b improve readability by fixing LOG statements and using auto and range-fors 2019-02-03 12:31:42 +01:00
Jean-Francois Dockes
04f3449f99 Avoid multiple expansion of xapian term iterator 2019-02-01 09:07:28 +01:00
Jean-Francois Dockes
2909eec062 get rid of redundant rclversion.h 2019-01-30 12:40:55 +01:00
Jean-Francois Dockes
dcd517bcf2 Fix -z always resetting index to non-text-storing independently of configuration 2019-01-29 20:11:43 +01:00
Jean-Francois Dockes
b62478c0cc Indent + comments + use c++11 loops 2018-11-14 10:38:22 +01:00
Jean-Francois Dockes
ea999ed6e5 Indent + comments + use c++11 initializers 2018-11-14 10:30:31 +01:00
Jean-Francois Dockes
036e1da6b4 rcldb: change flush log message level to INF 2018-11-14 09:42:22 +01:00
Jean-Francois Dockes
3ff88a9541 suppress query time updated map spurious error message 2018-10-07 09:07:51 +02:00
Jean-Francois Dockes
8358742132 get things to build on centos7.5 (cosmetic changes) 2018-09-02 18:47:03 +02:00
Jean-Francois Dockes
6441eea8aa Store the origin dbdir inside the GUI doc history, so we can later fetch documents from external indexes 2018-05-31 15:01:17 +02:00
Jean-Francois Dockes
ea3bd23d7c Fixed namespace decls issues 2018-04-18 09:34:58 +02:00