Jean-Francois Dockes
f15e3f21fa
Windows: replace unlink() with unicode-capable path_unlink()
2020-06-02 10:56:55 +01:00
Jean-Francois Dockes
560041cab9
cleared out errant tabs
2020-05-30 15:54:49 +02:00
Jean-Francois Dockes
796db76fc6
When splitting to generate abstract from text, do not set ONLYSPANS, generate all terms. Seems to solve issues with the snippet generator not finding a match when the query term is a partial span
2020-05-30 12:37:14 +02:00
Jean-Francois Dockes
5f76c2527d
GUI searching with saved query: restore external indexes from saved query
2020-05-19 14:20:21 +02:00
Jean-Francois Dockes
2f794be314
Fix Windows gcc build. Needs some def to get w7+ windows api
2020-04-25 11:41:37 +02:00
Jean-Francois Dockes
126ac47dba
tabs and indents
2020-04-24 13:45:41 +02:00
Jean-Francois Dockes
8a29522ef8
Fix issues consequent to type change for searchdata m_minsize and m_maxsize members
2020-04-21 13:45:00 +01:00
Jean-Francois Dockes
39c152bada
Fixed MSVC warnings, all innocuous
2020-04-17 14:26:40 +01:00
Jean-Francois Dockes
12ebb7ac6e
Windows: deal with non-ASCII user login, non-ASCII paths in confdir etc.
2020-04-15 14:03:04 +01:00
Jean-Francois Dockes
9565663f09
textsplit: create isNGRAMMED() method to replace isCJK() and let the latter actually return what it says
2020-04-14 09:27:26 +02:00
Jean-Francois Dockes
5dd8774b3c
whitespace and indents only
2020-04-14 09:25:13 +02:00
Jean-Francois Dockes
6999284c42
indent and decls
2020-04-05 13:46:47 +01:00
Jean-Francois Dockes
afcacf63c0
Fix page handling in Korean splitter: the bug would shift the byte positions, with bad consequences for snippets
2020-03-31 16:11:37 +02:00
Jean-Francois Dockes
b6cd22c320
rcldb: message log level change (docid beyond updated.size())
2020-03-27 10:56:14 +01:00
Jean-Francois Dockes
414222c003
use conftree conversions
2019-12-02 09:37:34 +01:00
Jean-Francois Dockes
f42338c026
recollq: add option to obtain exact result count
2019-11-28 16:13:27 +01:00
Jean-Francois Dockes
1243c30980
rcldb_p needs to include log.h if threads disabled
2019-11-25 09:58:26 +01:00
Jean-Francois Dockes
b368e4276f
do not include excluded terms in the highlight information data
2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
6a405e2089
hldata: comments + map->unordered_map
2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
8ed8d05aab
cjk phrases: hopefully the right fix this time for slack computation. lastpos-termcount correction was applied twice
2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
fae0621d76
hldata generation during query processing: increase slack if position increases faster than term count (cjk)
2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
6cd2c9e2ca
snippets: allow a little more contiguous expansion of current snippet
2019-07-21 19:13:24 +02:00
Jean-Francois Dockes
35ee3f7a13
Highlighting and snippets extraction: reworked to handle phrases properly. Use a compound position list instead of multiplying the OR groups inside a near clause
2019-07-21 19:09:51 +02:00
Jean-Francois Dockes
736051fcd6
GUI snippets window: add options for the max list length and for sorting the snippets by page number
2019-07-21 19:09:51 +02:00
Jean-Francois Dockes
3f7d270691
GUI preview: improve operation when the index data is not up to date.
Avoid erasing all the file index data in case the subsequent update fails
(e.g. the file is locked). Improve the messages. Check for previous
indexing error, and modify the message.
2019-06-24 17:37:37 +02:00
Jean-Francois Dockes
ee8c5410bd
Avoid purging existing subdocuments on file indexing error (e.g.: maybe a file lock issue that will go away)
2019-06-21 17:18:15 +02:00
Jean-Francois Dockes
be214c4a5a
Take advantage of text storage when possible to display preview data for an inaccessible document
2019-06-16 11:49:18 +02:00
Jean-Francois Dockes
2a945c9443
abstract: we used to discard snippets too early, before they might get a phrase weight boost
2019-05-24 08:51:11 +02:00
Jean-Francois Dockes
81a91404a4
logs
2019-05-18 16:50:12 +02:00
Jean-Francois Dockes
8ddcc578ac
Reverted 34d43d1188adfddb8fd8a4f7c7a28158a8b534f4
Keep only the main Snippet-producing makeabstract in rclquery, further
formatting done in using modules
This was just a bad idea. The common methods are also used by the python module
2019-05-17 10:19:03 +02:00
Jean-Francois Dockes
a5810508ed
abstract: optimize the way we retrieve the wdfs by sorting the list of terms we query for. Big difference on very big docs
2019-05-17 09:39:26 +02:00
Jean-Francois Dockes
fdb14e60ac
building abstract from stored text: limit count of terms explored to avoid taking forever on monster (multi mega-terms) documents
2019-05-17 09:37:39 +02:00
Jean-Francois Dockes
8428093f6a
synfamily: indent/log formats/extracted test main. No real change
2019-05-16 15:31:41 +02:00
Jean-Francois Dockes
10a500aa1c
comment
2019-05-16 15:29:52 +02:00
Jean-Francois Dockes
d4900584c8
comment
2019-05-16 15:29:32 +02:00
Jean-Francois Dockes
34d43d1188
Keep only the main Snippet-producing makeabstract in rclquery, further formatting done in using modules
2019-05-13 18:11:23 +02:00
Jean-Francois Dockes
ee5a260d54
Print xapian error when flush fails during purge
2019-05-02 10:36:24 +02:00
Jean-Francois Dockes
54f0eda990
make doc.meta an unordered_map
2019-04-20 15:04:19 +02:00
Jean-Francois Dockes
34bb62a8d9
got rid of a few unused variable warnings
2019-04-11 15:31:27 +02:00
Jean-Francois Dockes
b914e1d5b9
Fix abstract building when additional indexes are built: the raw text must be fetched by get_metadata() from its own index, not the combined one
2019-03-20 10:57:05 +01:00
Jean-Francois Dockes
0cbc46732f
Fixed the FSF address
2019-03-04 11:19:14 +01:00
Jean-Francois Dockes
037aa07bfd
Suppress compiler warning about a possibly truncated snprintf (no real problem) by increasing a buffer size
2019-03-04 10:58:49 +01:00
Jean-Francois Dockes
b69912bfab
Fix crash during abstract generation, occurring when no matching fragments are found
2019-02-19 19:02:23 +01:00
Jean-Francois Dockes
9574030edc
No need for boosting the original term if there was no expansion
2019-02-14 14:54:17 +01:00
Jean-Francois Dockes
b079f0fb94
adjust log message levels and fix a warning
2019-02-04 11:42:35 +01:00
Jean-Francois Dockes
399c633efd
Avoid purging documents from absent mountable volumes
2019-02-03 18:51:52 +01:00
Jean-Francois Dockes
a16e39a92b
improve readability by fixing LOG statements and using auto and range-fors
2019-02-03 12:31:42 +01:00
Jean-Francois Dockes
04f3449f99
Avoid multiple expansion of xapian term iterator
2019-02-01 09:07:28 +01:00
Jean-Francois Dockes
2909eec062
get rid of redundant rclversion.h
2019-01-30 12:40:55 +01:00
Jean-Francois Dockes
dcd517bcf2
Fix -z always resetting index to non-text-storing independently of configuration
2019-01-29 20:11:43 +01:00