Jean-Francois Dockes
|
2d6e11c0aa
|
simplified field config a bit by moving some hard coded values from the c++ to the fields file
|
2012-08-28 14:44:53 +02:00 |
|
Jean-Francois Dockes
|
776800f47a
|
arrange to create all stem dicts in one pass
|
2012-08-28 13:39:34 +02:00 |
|
Jean-Francois Dockes
|
fc8b458222
|
create class StemDb as derived class from XapSynFamily
|
2012-08-27 15:38:08 +02:00 |
|
Jean-Francois Dockes
|
bd0f002c1a
|
Reimplemented the stem expansion mechanism over Xapian synonyms feature
|
2012-08-25 11:12:36 +02:00 |
|
Jean-Francois Dockes
|
0ebfc496d8
|
add capability to remember page breaks generated by, e.g. pdftotext, and use them to start an external viewer on a match page
|
2012-08-21 15:03:02 +02:00 |
|
Jean-Francois Dockes
|
baf450e75a
|
rcldb fix crash caused by 5c8d237c639d in case there is only one index
|
2012-05-04 11:54:07 +02:00 |
|
Jean-Francois Dockes
|
73a3106a6d
|
GUI: only do the result up to date check before preview for the main index (we can't update the others anyway)
|
2012-05-04 09:52:14 +02:00 |
|
Jean-Francois Dockes
|
8b34610dde
|
Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex
|
2012-04-13 09:18:08 +02:00 |
|
Jean-Francois Dockes
|
4eaf12fb9c
|
more delistification
|
2012-04-12 08:15:50 +02:00 |
|
Jean-Francois Dockes
|
ec7b40a52e
|
cosmetics: list -> vector in more places
|
2012-04-11 19:58:08 +02:00 |
|
Jean-Francois Dockes
|
c7c9c49437
|
add -Z "in place reset" option to recollindex
|
2012-04-11 11:33:33 +02:00 |
|
Jean-Francois Dockes
|
07813ab6ba
|
Don't store filename in empty title at index time, to keep choice at display time. Define %t as title in addition to %T as title or filename
|
2012-03-10 14:45:40 +01:00 |
|
Jean-Francois Dockes
|
7ddbbb1ee8
|
search language: implemented filtering on file size
|
2012-03-07 17:08:22 +01:00 |
|
Jean-Francois Dockes
|
85166c93b2
|
Changed the way we handle document sizes. The fbytes field should now be in most cases the most "natural" document size. pcbytes holds the top external container size and dbytes the text size
|
2012-03-07 15:39:30 +01:00 |
|
Jean-Francois Dockes
|
7b5a891ee3
|
idx: make Doc parameter to addOrUpdate non const to avoid extra copy
|
2012-03-07 08:34:25 +01:00 |
|
Jean-Francois Dockes
|
9bc2fc8958
|
Experimented with multithreading the indexing pipeline. Left undef'd as 15%-30% improvement of indexing time does not seem worth the complexity
|
2012-02-21 17:09:02 +01:00 |
|
Jean-Francois Dockes
|
516863b5d6
|
GUI: perform up to date check before previewing a subdoc. This is for example to avoid showing the wrong message if a mail folder has been compacted
|
2012-01-20 17:48:55 +01:00 |
|
Jean-Francois Dockes
|
607d3cc27b
|
Add prefix translation for "mtype". Allows using term expansion to retrieve all the types from the index
|
2011-11-25 19:47:39 +01:00 |
|
Jean-Francois Dockes
|
0860b559ee
|
get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now
|
2011-10-12 17:55:58 +02:00 |
|
Jean-Francois Dockes
|
5fd31172f5
|
New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile
|
2011-10-07 07:53:49 +02:00 |
|
Jean-Francois Dockes
|
eda494153e
|
simplify calls to isStop
|
2011-10-05 17:25:35 +02:00 |
|
Jean-Francois Dockes
|
acb297c9df
|
comments + move the position jump to text_to_words
|
2011-10-04 16:33:44 +02:00 |
|
Jean-Francois Dockes
|
4ced9bee49
|
add termDocCnt method
|
2011-10-04 08:04:17 +02:00 |
|
Jean-Francois Dockes
|
38e0957962
|
const string cleanup
|
2011-10-01 16:39:38 +02:00 |
|
Jean-Francois Dockes
|
383468e2fc
|
bump doc create/update messages updates to loginfo so that indexing progress can be monitored with less noise
|
2011-09-30 08:47:39 +02:00 |
|
Jean-Francois Dockes
|
424e4173ba
|
threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65
|
2011-09-28 15:01:14 +02:00 |
|
Jean-Francois Dockes
|
e0d211d602
|
none
|
2011-09-20 17:16:41 +02:00 |
|
Jean-Francois Dockes
|
ee0d602ab3
|
Implement anchored searches: terms to be found at a maximum distance of the start or end of the text
|
2011-09-20 16:42:56 +02:00 |
|
Jean-Francois Dockes
|
c5ff0cdf52
|
Control memory usage when deleting documents: use idxflushmb as when adding/updating
|
2011-09-07 19:11:11 +02:00 |
|
Jean-Francois Dockes
|
a380873029
|
suppress some sources of spurious ellipses in abstracts
|
2011-08-24 14:51:59 +02:00 |
|
Jean-Francois Dockes
|
d3fc258d85
|
avoid generating empty abstract field
|
2011-08-19 09:20:11 +02:00 |
|
Jean-Francois Dockes
|
ebbcc115a8
|
Allow setting a weight increase for field terms
|
2011-07-22 16:43:39 +02:00 |
|
Jean-Francois Dockes
|
48e86c99b5
|
GUI restable: fix sorting by file and doc size
|
2011-07-20 10:44:04 +02:00 |
|
Jean-Francois Dockes
|
469c544915
|
GUI: allow setting the snippet separator inside abstract (now a real html ellipsis by default)
|
2011-07-07 11:11:02 +02:00 |
|
Jean-Francois Dockes
|
b6c73ecdeb
|
debug: improve consistency of log messages about up to date/processed files
|
2011-06-04 10:18:46 +02:00 |
|
Jean-Francois Dockes
|
08a65f5cfc
|
experiment with xapian spell support (not ready yet) + take care of some static init issues showing up on the mac
|
2011-05-10 10:15:15 +02:00 |
|
Jean-Francois Dockes
|
84d59f18a0
|
GUI: when opening the index, discriminate errors on the main index from errors on external ones, to avoid starting the initial indexing dialog in the latter case
|
2011-04-29 16:16:04 +02:00 |
|
Jean-Francois Dockes
|
a4d1689581
|
try to be more responsive to user interrupts: do not build the aux databases after an interruption, and check for an interruption during the purge pass
|
2011-04-28 12:27:06 +02:00 |
|
Jean-Francois Dockes
|
55f124725f
|
Fix problems that occurred when multiple threads were trying to read/convert files at the same time (ie: indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate misc initialization possible conflicts by using static initializers. Hopefully closes issue #51
|
2011-04-28 10:58:33 +02:00 |
|
Jean-Francois Dockes
|
01f24fa5fd
|
cleaning up static variables
|
2011-04-27 09:09:01 +02:00 |
|
Jean-Francois Dockes
|
b28eaf23fb
|
Got rid of all the old RCS id strings
|
2011-04-27 08:22:17 +02:00 |
|
Jean-Francois Dockes
|
963d7c50fd
|
suppressed some overly repeated log messages
|
2011-03-11 11:49:54 +01:00 |
|
Jean-Francois Dockes
|
26929e9fb9
|
index: fixed the fix for path elts too long...
|
2011-02-14 20:30:26 +01:00 |
|
Jean-Francois Dockes
|
bf39719ac3
|
Indexing: need to truncate pathologically long path elements (would cause add_document error)
|
2011-02-13 10:07:25 +01:00 |
|
Jean-Francois Dockes
|
93fb51d59b
|
query: add duplication indicator to relevancy rating
|
2011-01-17 16:04:07 +01:00 |
|
Jean-Francois Dockes
|
85b36d3c34
|
filename search fields: generate an AND of OR lists out of wildcard expansion instead of a global OR which did not make much sense
|
2011-01-13 11:47:35 +01:00 |
|
Jean-Francois Dockes
|
0a6063542f
|
Gui: misc event/signals cleanups. No functional changes
|
2010-12-22 18:07:18 +01:00 |
|
Jean-Francois Dockes
|
45c08165f5
|
log message format
|
2010-12-21 10:34:02 +01:00 |
|
Jean-Francois Dockes
|
c79410da94
|
Move sort/filtering code out of reslist
|
2010-12-18 15:45:12 +01:00 |
|
Jean-Francois Dockes
|
61348a7731
|
GUI: got rid of the sort parameters dialog and sort by mime type, replaced by 2 arrows in toolbar for sorting by date, ascending or descending
|
2010-12-17 13:18:13 +01:00 |
|