66 Commits

Author SHA1 Message Date
Jean-Francois Dockes
8b34610dde Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex 2012-04-13 09:18:08 +02:00
Jean-Francois Dockes
4eaf12fb9c more delistification 2012-04-12 08:15:50 +02:00
Jean-Francois Dockes
ec7b40a52e cosmetics: list -> vector in more places 2012-04-11 19:58:08 +02:00
Jean-Francois Dockes
7ddbbb1ee8 search language: implemented filtering on file size 2012-03-07 17:08:22 +01:00
Jean-Francois Dockes
6cdf9ae12b Accept and process relative/incomplete paths with the dir: directive (don't anchor the path phrase if the path does not start with /) 2012-02-24 19:25:55 +01:00
Jean-Francois Dockes
1931595637 GUI: added menu entry to show all the mime types actually indexed (by content) 2011-11-25 19:47:56 +01:00
Jean-Francois Dockes
8d52e928d1 increase slack for automatic phrases 2011-10-20 13:25:33 +02:00
Jean-Francois Dockes
0860b559ee get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now 2011-10-12 17:55:58 +02:00
Jean-Francois Dockes
5fd31172f5 New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile 2011-10-07 07:53:49 +02:00
Jean-Francois Dockes
eda494153e simplify calls to isStop 2011-10-05 17:25:35 +02:00
Jean-Francois Dockes
bb2685c2f5 Add frequency threshold to avoid adding common term to the automatic phrase search extension. Use autophrase by default with simple search, with a default freq threshold at 2% 2011-10-04 09:03:43 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
702fb88a1e Search: remove restriction on empty queries by replacing empty query with Xapian::Query::Matchall. This allows querying all files of a given type, or under a given tree, without an actual text search part 2011-09-30 08:50:50 +02:00
Jean-Francois Dockes
424e4173ba threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65 2011-09-28 15:01:14 +02:00
Jean-Francois Dockes
ee0d602ab3 Implement anchored searches: terms to be found at a maximum distance of the start or end of the text 2011-09-20 16:42:56 +02:00
Jean-Francois Dockes
ebbcc115a8 Allow setting a weight increase for field terms 2011-07-22 16:43:39 +02:00
Jean-Francois Dockes
91f277ec26 Search: allow setting weights on terms, ie: "important"2.5 2011-05-30 14:03:01 +02:00
Jean-Francois Dockes
ce9e9e4d00 query: support negative mime and catg clauses: -mime:text/plain 2011-05-15 09:29:24 +02:00
Jean-Francois Dockes
b28eaf23fb Got rid of all the old RCS id strings 2011-04-27 08:22:17 +02:00
Jean-Francois Dockes
e883c4d04e Search: allow negative directory filtering (all except from dir). Emit more explicit errors for other unallowed negative search clauses. 2011-03-30 14:35:09 +02:00
Jean-Francois Dockes
e8fcd35fef fix term highlighting for field searches 2011-01-28 15:47:58 +01:00
Jean-Francois Dockes
34511918d9 query: extract the collapse count from xapian + small cleanups 2011-01-17 11:25:05 +01:00
Jean-Francois Dockes
85b36d3c34 filename search fields: generate an AND of OR lists out of wildcard expansion instead of a global OR which did not make much sense 2011-01-13 11:47:35 +01:00
Jean-Francois Dockes
107e02b74a Gui search: make autophrase work with a query language query 2010-12-21 16:00:25 +01:00
Jean-Francois Dockes
21c6025ba7 Use a xapian phrase search on the split path for filtering on directory location (much faster than the current method) 2010-12-16 15:53:40 +01:00
Jean-Francois Dockes
4385dd1b8b small compilation issues on misc systems 2010-09-13 21:34:23 +02:00
Jean-Francois Dockes
ceb996c8fb Implement date: date range filter/searches. Remove restriction on pure negative queries 2010-09-11 12:07:53 +02:00
Jean-Francois Dockes
4006825961 display more complete stats in spell window 2010-05-08 10:38:13 +02:00
Jean-Francois Dockes
48358c8252 Added option nonumbers not to generate terms for numbers. closes #16 2010-05-05 10:18:56 +02:00
Jean-Francois Dockes
8b2b00bc72 cosmetics: use derived class for actual splitter instead of callback 2010-02-02 15:33:52 +01:00
dockes
8ddea418aa field values were not used in case term expansion was not performed (phrase or capitalized term) 2010-01-07 08:29:30 +00:00
dockes
bab030f846 Term expansion: handle field issues inside rcldb::termmatch, ensuring that we take the field name into account for all expansions. Ensures that File Name searches and filename: query language searches work the same, + overall better consistency 2009-12-07 13:27:57 +00:00
dockes
f554960b9b suggest alternate spellings if no results 2009-11-26 14:03:02 +00:00
dockes
7dcc7c61c8 modified the time at which we unaccent so that we can do the Capitalized->nostemming test on single words (this had been broken by the change of noac/split order done earlier to get japanese to work) 2009-01-26 18:30:48 +00:00
dockes
d9b9b41a9d getMainConfig not actually needed and possibly harmful 2008-12-19 09:55:36 +00:00
dockes
0821f0cc29 don't unaccent Japanese + fix bug in unac/split ordering in searchdata 2008-12-19 09:44:39 +00:00
dockes
5463ea258f comment 2008-12-17 14:26:09 +00:00
dockes
c0689dd1cf make gcc happy 2008-12-15 14:39:52 +00:00
dockes
5d27917c66 reorganize code + add boost to phrase element to match boost of original user terms 2008-12-15 09:24:24 +00:00
dockes
3414963810 take care of splitting user string with respect to unicode white space, not only ascii 2008-12-05 11:09:31 +00:00
dockes
b6936d5a60 highlighting would not work with cat filt active because ClausSub did not implement getTerms 2008-10-14 07:50:14 +00:00
dockes
f0538b15f2 move stemlang from RclQuery to SearchData. Allow DocSequences to do the sorting/filtering themselves 2008-09-29 11:33:55 +00:00
dockes
6d48df7a91 move sort params from searchdata to rclquery 2008-09-29 06:58:25 +00:00
dockes
7d30485f87 general field name handling cleanup + sort facility in rclquery 2008-09-16 08:18:30 +00:00
dockes
34864f159a ensure that a negative clause is not first or only in list 2008-08-28 15:42:43 +00:00
dockes
3223e8dc03 express query language OR chains as rcldb subqueries so that field specs will work inside them 2008-01-16 11:14:38 +00:00
dockes
65d9ae06dc splitString filename queries 2008-01-16 08:43:26 +00:00
dockes
9b5de1a4ac when search includes composite spans + other terms, increase slack instead of switching to word split 2007-10-04 12:26:04 +00:00
dockes
844f4f831a comments,formatting 2007-09-20 08:43:12 +00:00
dockes
e892ca4fa4 handle mime: and ext: in qlang 2007-06-22 06:14:04 +00:00