315 Commits

Author SHA1 Message Date
Jean-Francois Dockes
dc7b3420a0 defined data structure to pass around the search term description used for highlighting and other 2012-08-17 10:45:00 +02:00
Jean-Francois Dockes
f34994d882 Get recoll to compile with clang (on freebsd) and eliminate warnings. You can now build recoll with make CXX=clang LINK=clang 2012-05-20 17:35:03 +02:00
Jean-Francois Dockes
baf450e75a rcldb fix crash caused by 5c8d237c639d in case there is only one index 2012-05-04 11:54:07 +02:00
Jean-Francois Dockes
73a3106a6d GUI: only do the result up to date check before preview for the main index (we cant update the others anyway) 2012-05-04 09:52:14 +02:00
Jean-Francois Dockes
8b34610dde Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex 2012-04-13 09:18:08 +02:00
Jean-Francois Dockes
4eaf12fb9c more delistification 2012-04-12 08:15:50 +02:00
Jean-Francois Dockes
ec7b40a52e cosmetics: list -> vector in more places 2012-04-11 19:58:08 +02:00
Jean-Francois Dockes
c7c9c49437 add -Z "in place reset" option to recollindex 2012-04-11 11:33:33 +02:00
Jean-Francois Dockes
14042528bd dont send cjk terms to stemmers. Sending them didnt seem to hurt, but did not make sense 2012-03-22 15:09:40 +01:00
Jean-Francois Dockes
07813ab6ba Dont store filename in empty title at index time, to keep choice at display time. Define %t as title in addition to %T as title or filename 2012-03-10 14:45:40 +01:00
Jean-Francois Dockes
7ddbbb1ee8 search language: implemented filtering on file size 2012-03-07 17:08:22 +01:00
Jean-Francois Dockes
85166c93b2 Changed the way we handle document sizes. The fbytes field should now be in most cases the most "natural" document size. pcbytes holds the top external container size and dbytes the text size 2012-03-07 15:39:30 +01:00
Jean-Francois Dockes
7b5a891ee3 idx: make Doc parameter to addOrUpdate non const to avoid extra copy 2012-03-07 08:34:25 +01:00
Jean-Francois Dockes
25a99a3b38 add omega-compatible value slot for file size 2012-03-06 07:28:18 +01:00
Jean-Francois Dockes
6cdf9ae12b Accept and process relative/incomplete paths with the dir: directive (don't anchor the path phrase if the path does not start with /) 2012-02-24 19:25:55 +01:00
Jean-Francois Dockes
9bc2fc8958 Experimented with multithreading the indexing pipeline. Left undef'd as 15%-30% improvement of indexing time does not seem worth the complexity 2012-02-21 17:09:02 +01:00
Jean-Francois Dockes
516863b5d6 GUI: perform up to date check before previewing a subdoc. This is for example to avoid showing the wrong message if a mail folder has been compacted 2012-01-20 17:48:55 +01:00
Jean-Francois Dockes
036937e8bf added getmeta() method to Rcl::Doc and use in misc places 2012-01-20 14:48:50 +01:00
Jean-Francois Dockes
1931595637 GUI: added menu entry to show all the mime types actually indexed (by content) 2011-11-25 19:47:56 +01:00
Jean-Francois Dockes
607d3cc27b Add prefix translation for "mtype". Allows using term expansion to retrieve all the types from the index 2011-11-25 19:47:39 +01:00
Jean-Francois Dockes
8d52e928d1 increase slack for automatic phrases 2011-10-20 13:25:33 +02:00
Jean-Francois Dockes
0860b559ee get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now 2011-10-12 17:55:58 +02:00
Jean-Francois Dockes
4a7ff398b2 comments 2011-10-07 08:05:36 +02:00
Jean-Francois Dockes
5fd31172f5 New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile 2011-10-07 07:53:49 +02:00
Jean-Francois Dockes
eda494153e simplify calls to isStop 2011-10-05 17:25:35 +02:00
Jean-Francois Dockes
acb297c9df comments + move the position jump to text_to_words 2011-10-04 16:33:44 +02:00
Jean-Francois Dockes
e4eba0de97 stoplist: use stringToStrings in place of splitter to support quoted space-containing entries 2011-10-04 16:04:28 +02:00
Jean-Francois Dockes
bb2685c2f5 Add frequency threshold to avoid adding common term to the automatic phrase search extension. Use autophrase by default with simple search, with a default freq threshold at 2% 2011-10-04 09:03:43 +02:00
Jean-Francois Dockes
4ced9bee49 add termDocCnt method 2011-10-04 08:04:17 +02:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
702fb88a1e Search: remove restriction on empty queries by replacing empty query with Xapian::Query::Matchall. This allows querying all files of a given type, or under a given tree, without an actual text search part 2011-09-30 08:50:50 +02:00
Jean-Francois Dockes
383468e2fc bump doc create/update messages updates to loginfo so that indexing progress can be monitored with less noise 2011-09-30 08:47:39 +02:00
Jean-Francois Dockes
424e4173ba threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65 2011-09-28 15:01:14 +02:00
Jean-Francois Dockes
e0d211d602 none 2011-09-20 17:16:41 +02:00
Jean-Francois Dockes
ee0d602ab3 Implement anchored searches: terms to be found at a maximum distance of the start or end of the text 2011-09-20 16:42:56 +02:00
Jean-Francois Dockes
c5ff0cdf52 Control memory usage when deleting documents: use idxflushmb as when adding/updating 2011-09-07 19:11:11 +02:00
Jean-Francois Dockes
a380873029 suppress some sources of spurious ellipses in abstracts 2011-08-24 14:51:59 +02:00
Jean-Francois Dockes
d3fc258d85 avoid generating empty abstract field 2011-08-19 09:20:11 +02:00
Jean-Francois Dockes
ebbcc115a8 Allow setting a weight increase for field terms 2011-07-22 16:43:39 +02:00
Jean-Francois Dockes
48e86c99b5 GUI restable: fix sorting by file and doc size 2011-07-20 10:44:04 +02:00
Jean-Francois Dockes
469c544915 GUI: allow setting the snippet separator inside abstract (now a real html ellipsis by default) 2011-07-07 11:11:02 +02:00
Jean-Francois Dockes
b6c73ecdeb debug: improve consistency of log messages about up to date/processed files 2011-06-04 10:18:46 +02:00
Jean-Francois Dockes
91f277ec26 Search: allow setting weights on terms, ie: "important"2.5 2011-05-30 14:03:01 +02:00
Jean-Francois Dockes
ce9e9e4d00 query: support negative mime and catg clauses: -mime:text/plain 2011-05-15 09:29:24 +02:00
Jean-Francois Dockes
08a65f5cfc experiment with xapian spell support (not ready yet) + take care of some static init issues showing up on the mac 2011-05-10 10:15:15 +02:00
Jean-Francois Dockes
ce607032fa Fix a number of potential or actual static object initialization issues 2011-05-09 20:49:15 +02:00
Jean-Francois Dockes
32f4f7b6fc Fix a number of potential or actual static object initialization issues 2011-05-09 20:48:59 +02:00
Jean-Francois Dockes
84d59f18a0 GUI: when opening the index, discriminate errors on the main index from errors on external ones, to avoid starting the initial indexing dialog in the latter case 2011-04-29 16:16:04 +02:00
Jean-Francois Dockes
a4d1689581 try to be more responsive to user interrupts: do not build the aux databases after an interruption, and check for an interruption during the purge pass 2011-04-28 12:27:06 +02:00
Jean-Francois Dockes
55f124725f Fix problems that occurred when multiple threads were trying to read/convert files at the same time (ie: indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate miscellaneous possible initialization conflicts by using static initializers. Hopefully closes issue #51 2011-04-28 10:58:33 +02:00