295 Commits

Author SHA1 Message Date
Jean-Francois Dockes
8d52e928d1 increase slack for automatic phrases 2011-10-20 13:25:33 +02:00
Jean-Francois Dockes
0860b559ee get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now 2011-10-12 17:55:58 +02:00
Jean-Francois Dockes
4a7ff398b2 comments 2011-10-07 08:05:36 +02:00
Jean-Francois Dockes
5fd31172f5 New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile 2011-10-07 07:53:49 +02:00
Jean-Francois Dockes
eda494153e simplify calls to isStop 2011-10-05 17:25:35 +02:00
Jean-Francois Dockes
acb297c9df comments + move the position jump to text_to_words 2011-10-04 16:33:44 +02:00
Jean-Francois Dockes
e4eba0de97 stoplist: use stringToStrings in place of splitter to support quoted space-containing entries 2011-10-04 16:04:28 +02:00
Jean-Francois Dockes
bb2685c2f5 Add frequency threshold to avoid adding common term to the automatic phrase search extension. Use autophrase by default with simple search, with a default freq threshold at 2% 2011-10-04 09:03:43 +02:00
Jean-Francois Dockes
4ced9bee49 add termDocCnt method 2011-10-04 08:04:17 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
702fb88a1e Search: remove restriction on empty queries by replacing empty query with Xapian::Query::Matchall. This allows querying all files of a given type, or under a given tree, without an actual text search part 2011-09-30 08:50:50 +02:00
Jean-Francois Dockes
383468e2fc bump doc create/update messages updates to loginfo so that indexing progress can be monitored with less noise 2011-09-30 08:47:39 +02:00
Jean-Francois Dockes
424e4173ba threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65 2011-09-28 15:01:14 +02:00
Jean-Francois Dockes
e0d211d602 none 2011-09-20 17:16:41 +02:00
Jean-Francois Dockes
ee0d602ab3 Implement anchored searches: terms to be found at a maximum distance of the start or end of the text 2011-09-20 16:42:56 +02:00
Jean-Francois Dockes
c5ff0cdf52 Control memory usage when deleting documents: use idxflushmb as when adding/updating 2011-09-07 19:11:11 +02:00
Jean-Francois Dockes
a380873029 suppress some sources of spurious ellipses in abstracts 2011-08-24 14:51:59 +02:00
Jean-Francois Dockes
d3fc258d85 avoid generating empty abstract field 2011-08-19 09:20:11 +02:00
"Jean-Francois Dockes ext:(%22)
ebbcc115a8 Allow setting a weight increase for field terms 2011-07-22 16:43:39 +02:00
"Jean-Francois Dockes ext:(%22)
48e86c99b5 GUI restable: fix sorting by file and doc size 2011-07-20 10:44:04 +02:00
Jean-Francois Dockes
469c544915 GUI: allow setting the snippet separator inside abstract (now a real html ellipsis by default) 2011-07-07 11:11:02 +02:00
Jean-Francois Dockes
b6c73ecdeb debug: improve consistency of log messages about up to date/processed files 2011-06-04 10:18:46 +02:00
Jean-Francois Dockes
91f277ec26 Search: allow setting weights on terms, ie: "important"2.5 2011-05-30 14:03:01 +02:00
Jean-Francois Dockes
ce9e9e4d00 query: support negative mime and catg clauses: -mime:text/plain 2011-05-15 09:29:24 +02:00
Jean-Francois Dockes
08a65f5cfc experiment with xapian spell support (not ready yet) + take care of some static init issues showing up on the mac 2011-05-10 10:15:15 +02:00
Jean-Francois Dockes
ce607032fa Fix a number of potential or actual static object initialization issues 2011-05-09 20:49:15 +02:00
Jean-Francois Dockes
32f4f7b6fc Fix a number of potential or actual static object initialization issues 2011-05-09 20:48:59 +02:00
Jean-Francois Dockes
84d59f18a0 GUI: when opening the index, discriminate errors on the main index from errors on external ones, to avoid starting the initial indexing dialog in the latter case 2011-04-29 16:16:04 +02:00
Jean-Francois Dockes
a4d1689581 try to be more responsive to user interrupts: do not build the aux databases after an interruption, and check for an interruption during the purge pass 2011-04-28 12:27:06 +02:00
Jean-Francois Dockes
55f124725f Fix problems that occurred when multiple threads were trying to read/convert files at the same time (ie: indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate misc possible initialization conflicts by using static initializers. Hopefully closes issue #51 2011-04-28 10:58:33 +02:00
Jean-Francois Dockes
01f24fa5fd cleaning up static variables 2011-04-27 09:09:01 +02:00
Jean-Francois Dockes
b28eaf23fb Got rid of all the old RCS id strings 2011-04-27 08:22:17 +02:00
Jean-Francois Dockes
e883c4d04e Search: allow negative directory filtering (all except from dir). Emit more explicit errors for other unallowed negative search clauses. 2011-03-30 14:35:09 +02:00
Jean-Francois Dockes
ae6d758b34 GUI: display estimated result count in status line 2011-03-11 11:54:50 +01:00
Jean-Francois Dockes
963d7c50fd suppressed some overly repeated log messages 2011-03-11 11:49:54 +01:00
Jean-Francois Dockes
26929e9fb9 index: fixed the fix for path elts too long... 2011-02-14 20:30:26 +01:00
Jean-Francois Dockes
bf39719ac3 Indexing: need to truncate pathologically long path elements (would cause add_document error) 2011-02-13 10:07:25 +01:00
Jean-Francois Dockes
e8fcd35fef fix term highlighting for field searches 2011-01-28 15:47:58 +01:00
Jean-Francois Dockes
50238d5577 restable: highlight match terms 2011-01-28 12:28:27 +01:00
Jean-Francois Dockes
76edc0b290 missing stdio.h 2011-01-17 16:09:14 +01:00
Jean-Francois Dockes
93fb51d59b query: add duplication indicator to relevancy rating 2011-01-17 16:04:07 +01:00
Jean-Francois Dockes
34511918d9 query: extract the collapse count from xapian + small cleanups 2011-01-17 11:25:05 +01:00
Jean-Francois Dockes
85b36d3c34 filename search fields: generate an AND of OR lists out of wildcard expansion instead of a global OR which did not make much sense 2011-01-13 11:47:35 +01:00
Jean-Francois Dockes
58c4c12b04 restable. Set more sensible initial defaults + other small fixes 2011-01-11 08:39:00 +01:00
Jean-Francois Dockes
3bd39d893e Gui restable: add/remove columns 2010-12-24 15:48:44 +01:00
Jean-Francois Dockes
0a6063542f Gui: misc event/signals cleanups. No functional changes 2010-12-22 18:07:18 +01:00
Jean-Francois Dockes
107e02b74a Gui search: make autophrase work with a query language query 2010-12-21 16:00:25 +01:00
Jean-Francois Dockes
45c08165f5 log message format 2010-12-21 10:34:02 +01:00
Jean-Francois Dockes
c79410da94 Move sort/filtering code out of reslist 2010-12-18 15:45:12 +01:00
Jean-Francois Dockes
61348a7731 GUI: got rid of the sort parameters dialog and sort by mime type, replaced by 2 arrows in toolbar for sorting by date, ascending or descending 2010-12-17 13:18:13 +01:00