91 Commits

Author SHA1 Message Date
Jean-Francois Dockes
d9e6030b66 reorganized the term expansion code so that the term explorer works fully with case and diac sensitivity options 2013-01-14 18:06:48 +01:00
Jean-Francois Dockes
f8280c88ca small fixups and compilation issues 2013-01-14 09:57:04 +01:00
Jean-Francois Dockes
1b38c5c98c replaced SCLT_EXCL clauses with general excl/neg flag 2013-01-05 18:15:54 +01:00
Jean-Francois Dockes
cbc269abb1 define new searchdataclausepath to replace the old dir: filtering mechanism. ORing dirs now works 2013-01-05 16:21:30 +01:00
Jean-Francois Dockes
9b55eb1cda perform case/diac expansion when processing wildcards 2013-01-04 13:34:26 +01:00
Jean-Francois Dockes
9561309f0b make ugroups a real vector of vectors (the previous "vectors" had only one entry with the user string even if it was made of several words) 2012-12-19 19:57:12 +01:00
Jean-Francois Dockes
5c6db5331c use const rclconfig 2012-11-28 13:20:52 +01:00
Jean-Francois Dockes
7115be2440 stemdb: only need to expand the unac'd term if it differs from raw + comments and traces 2012-11-26 09:13:57 +01:00
Jean-Francois Dockes
48deb73c43 add "soft" term expansion limit not causing error when reached 2012-11-18 17:28:49 +01:00
Jean-Francois Dockes
881794ce2b simplified and dispatched code in the searchdata monster 2012-11-18 13:25:54 +01:00
Jean-Francois Dockes
816980a1c4 implemented advanced search history feature 2012-10-16 13:37:56 +02:00
Jean-Francois Dockes
075f1f7518 filenames used for "filename search" need to be lowercased and stripped 2012-10-15 08:06:04 +02:00
Jean-Francois Dockes
bfeb681574 mimetype T prefix was mishandled for a raw index 2012-10-13 11:08:53 +02:00
Jean-Francois Dockes
a16d047f8d Snippet generation: limit positions walk to max hit position. Return status code when truncated walk possibly generated incomplete snippets. Implement config variable for max pos walk 2012-10-08 14:30:14 +02:00
Jean-Francois Dockes
1329265b7b check for empty file name in internfile, else gets stuck later because empty fn is interpreted as read stdin in md5 2012-10-05 16:42:13 +02:00
Jean-Francois Dockes
c9f6612c10 implemented proper limitation and error reporting in case of truncation for term and query expansions 2012-10-05 12:36:19 +02:00
Jean-Francois Dockes
bfd111ecaa removed list size truncation on filename expansion 2012-10-05 09:19:42 +02:00
Jean-Francois Dockes
2807fa3c18 autodiacsens and autocasesens parameters 2012-10-03 15:35:40 +02:00
Jean-Francois Dockes
c589419267 Abstracts: improve the way we group terms for quality computation 2012-10-03 11:17:16 +02:00
Jean-Francois Dockes
cb654c74e9 comments and small fixes to case/diac code 2012-10-01 17:26:16 +02:00
Jean-Francois Dockes
9b273d94e8 ensure that recoll configured with indexStripChars=1 runs as compiled with -DRCL_INDEX_STRIPCHARS
--HG--
branch : CASEDIACSENS
2012-09-15 15:16:20 +02:00
Jean-Francois Dockes
5b38b9ebd0 case sensitivity: clause mod flags were lost on the way
--HG--
branch : CASEDIACSENS
2012-09-14 15:02:51 +02:00
Jean-Francois Dockes
a7222d4f96 Make Recoll optionally sensitive to case and diacritics
--HG--
branch : CASEDIACSENS
2012-09-14 14:34:27 +02:00
Jean-Francois Dockes
6eada80b08 allow multiple directory specs as in dir:/home/me -dir:tmp 2012-08-19 08:27:12 +02:00
Jean-Francois Dockes
dc7b3420a0 defined data structure to pass around the search term description used for highlighting and other 2012-08-17 10:45:00 +02:00
Jean-Francois Dockes
8b34610dde Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex 2012-04-13 09:18:08 +02:00
Jean-Francois Dockes
4eaf12fb9c more delistification 2012-04-12 08:15:50 +02:00
Jean-Francois Dockes
ec7b40a52e cosmetics: list -> vector in more places 2012-04-11 19:58:08 +02:00
Jean-Francois Dockes
7ddbbb1ee8 search language: implemented filtering on file size 2012-03-07 17:08:22 +01:00
Jean-Francois Dockes
6cdf9ae12b Accept and process relative/incomplete paths with the dir: directive (don't anchor path phrase if path does not start with /) 2012-02-24 19:25:55 +01:00
Jean-Francois Dockes
1931595637 GUI: added menu entry to show all the mime types actually indexed (by content) 2011-11-25 19:47:56 +01:00
Jean-Francois Dockes
8d52e928d1 increase slack for automatic phrases 2011-10-20 13:25:33 +02:00
Jean-Francois Dockes
0860b559ee get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now 2011-10-12 17:55:58 +02:00
Jean-Francois Dockes
5fd31172f5 New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile 2011-10-07 07:53:49 +02:00
Jean-Francois Dockes
eda494153e simplify calls to isStop 2011-10-05 17:25:35 +02:00
Jean-Francois Dockes
bb2685c2f5 Add frequency threshold to avoid adding common term to the automatic phrase search extension. Use autophrase by default with simple search, with a default freq threshold at 2% 2011-10-04 09:03:43 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
702fb88a1e Search: remove restriction on empty queries by replacing empty query with Xapian::Query::Matchall. This allows querying all files of a given type, or under a given tree, without an actual text search part 2011-09-30 08:50:50 +02:00
Jean-Francois Dockes
424e4173ba threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65 2011-09-28 15:01:14 +02:00
Jean-Francois Dockes
ee0d602ab3 Implement anchored searches: terms to be found at a maximum distance of the start or end of the text 2011-09-20 16:42:56 +02:00
Jean-Francois Dockes
ebbcc115a8 Allow setting a weight increase for field terms 2011-07-22 16:43:39 +02:00
Jean-Francois Dockes
91f277ec26 Search: allow setting weights on terms, ie: "important"2.5 2011-05-30 14:03:01 +02:00
Jean-Francois Dockes
ce9e9e4d00 query: support negative mime and catg clauses: -mime:text/plain 2011-05-15 09:29:24 +02:00
Jean-Francois Dockes
b28eaf23fb Got rid of all the old RCS id strings 2011-04-27 08:22:17 +02:00
Jean-Francois Dockes
e883c4d04e Search: allow negative directory filtering (all except from dir). Emit more explicit errors for other unallowed negative search clauses. 2011-03-30 14:35:09 +02:00
Jean-Francois Dockes
e8fcd35fef fix term highlighting for field searches 2011-01-28 15:47:58 +01:00
Jean-Francois Dockes
34511918d9 query: extract the collapse count from xapian + small cleanups 2011-01-17 11:25:05 +01:00
Jean-Francois Dockes
85b36d3c34 filename search fields: generate an AND of OR lists out of wildcard expansion instead of a global OR which did not make much sense 2011-01-13 11:47:35 +01:00
Jean-Francois Dockes
107e02b74a Gui search: make autophrase work with a query language query 2010-12-21 16:00:25 +01:00
Jean-Francois Dockes
21c6025ba7 Use a xapian phrase search on the split path for filtering on directory location (much faster than the current method) 2010-12-16 15:53:40 +01:00