Jean-Francois Dockes
|
075f1f7518
|
filenames used for "filename search" need to be lowercased and stripped
|
2012-10-15 08:06:04 +02:00 |
|
Jean-Francois Dockes
|
bfeb681574
|
mimetype T prefix was mishandled for a raw index
|
2012-10-13 11:08:53 +02:00 |
|
Jean-Francois Dockes
|
3a2b15da10
|
comment
|
2012-10-12 13:36:38 +02:00 |
|
Jean-Francois Dockes
|
a16d047f8d
|
Snippet generation: limit positions walk to max hit position. Return status code when truncated walk possibly generated incomplete snippets. Implement config variable for max pos walk
|
2012-10-08 14:30:14 +02:00 |
|
Jean-Francois Dockes
|
c9f6612c10
|
implemented proper limitation and error reporting in case of truncation for term and query expansions
|
2012-10-05 12:36:19 +02:00 |
|
Jean-Francois Dockes
|
bfd111ecaa
|
removed list size truncation on filename expansion
|
2012-10-05 09:19:42 +02:00 |
|
Jean-Francois Dockes
|
3f331ebb3e
|
fix glitch caused by udi prefix change
|
2012-10-03 08:05:39 +02:00 |
|
Jean-Francois Dockes
|
be27f404d2
|
Prefixes for unique identifier and parent terms were not wrapped
|
2012-10-02 19:16:57 +02:00 |
|
Jean-Francois Dockes
|
4a0a4fcf8e
|
fix 2 glitches in pdf page number handling
|
2012-10-01 11:27:16 +02:00 |
|
Jean-Francois Dockes
|
af2d031e50
|
moved snippets generation code from db to query object
|
2012-09-26 12:13:40 +02:00 |
|
Jean-Francois Dockes
|
52bc9f4aa3
|
merged the case/diac sensitivity code back into trunk
|
2012-09-25 19:20:24 +02:00 |
|
Jean-Francois Dockes
|
ab32062fcc
|
Separate count and context for snippets in the snippets popup from the default values for the result list
|
2012-09-23 18:19:43 +02:00 |
|
Jean-Francois Dockes
|
d9dc7cf142
|
preliminary implementation for the snippets "open to page" popup window
|
2012-09-20 13:51:40 +02:00 |
|
Jean-Francois Dockes
|
d25d79ea42
|
changed variable names for clarity
|
2012-09-19 19:49:43 +02:00 |
|
Jean-Francois Dockes
|
1b5136539f
|
Bad concatenation generated absurd page numbers for documents with several multiple page breaks
|
2012-09-19 14:04:20 +02:00 |
|
Jean-Francois Dockes
|
9b273d94e8
|
ensure that recoll configured with indexStripChars=1 runs as compiled with -DRCL_INDEX_STRIPCHARS
--HG--
branch : CASEDIACSENS
|
2012-09-15 15:16:20 +02:00 |
|
Jean-Francois Dockes
|
a7222d4f96
|
Make Recoll optionally sensitive to case and diacritics
--HG--
branch : CASEDIACSENS
|
2012-09-14 14:34:27 +02:00 |
|
Jean-Francois Dockes
|
3dfaa7525b
|
Display page numbers inside abstracts when possible (e.g.: for pdfs)
|
2012-09-11 12:44:40 +02:00 |
|
Jean-Francois Dockes
|
3343a7f724
|
Fix the page break recording function for multiple page breaks at the same term position
|
2012-09-10 18:14:21 +02:00 |
|
Jean-Francois Dockes
|
de812094b5
|
more small prefix fixups
|
2012-08-28 17:36:24 +02:00 |
|
Jean-Francois Dockes
|
2d6e11c0aa
|
simplified field config a bit by moving some hard coded values from the c++ to the fields file
|
2012-08-28 14:44:53 +02:00 |
|
Jean-Francois Dockes
|
776800f47a
|
arrange to create all stem dicts in one pass
|
2012-08-28 13:39:34 +02:00 |
|
Jean-Francois Dockes
|
fc8b458222
|
create class StemDb as derived class from XapSynFamily
|
2012-08-27 15:38:08 +02:00 |
|
Jean-Francois Dockes
|
bd0f002c1a
|
Reimplemented the stem expansion mechanism over Xapian synonyms feature
|
2012-08-25 11:12:36 +02:00 |
|
Jean-Francois Dockes
|
0ebfc496d8
|
add capability to remember page breaks generated by, e.g. pdftotext, and use them to start an external viewer on a match page
|
2012-08-21 15:03:02 +02:00 |
|
Jean-Francois Dockes
|
baf450e75a
|
rcldb fix crash caused by 5c8d237c639d in case there is only one index
|
2012-05-04 11:54:07 +02:00 |
|
Jean-Francois Dockes
|
73a3106a6d
|
GUI: only do the result up to date check before preview for the main index (we can't update the others anyway)
|
2012-05-04 09:52:14 +02:00 |
|
Jean-Francois Dockes
|
8b34610dde
|
Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex
|
2012-04-13 09:18:08 +02:00 |
|
Jean-Francois Dockes
|
4eaf12fb9c
|
more delistification
|
2012-04-12 08:15:50 +02:00 |
|
Jean-Francois Dockes
|
ec7b40a52e
|
cosmetics: list -> vector in more places
|
2012-04-11 19:58:08 +02:00 |
|
Jean-Francois Dockes
|
c7c9c49437
|
add -Z "in place reset" option to recollindex
|
2012-04-11 11:33:33 +02:00 |
|
Jean-Francois Dockes
|
07813ab6ba
|
Don't store filename in empty title at index time, to keep choice at display time. Define %t as title in addition to %T as title or filename
|
2012-03-10 14:45:40 +01:00 |
|
Jean-Francois Dockes
|
7ddbbb1ee8
|
search language: implemented filtering on file size
|
2012-03-07 17:08:22 +01:00 |
|
Jean-Francois Dockes
|
85166c93b2
|
Changed the way we handle document sizes. The fbytes field should now be in most cases the most "natural" document size. pcbytes holds the top external container size and dbytes the text size
|
2012-03-07 15:39:30 +01:00 |
|
Jean-Francois Dockes
|
7b5a891ee3
|
idx: make Doc parameter to addOrUpdate non const to avoid extra copy
|
2012-03-07 08:34:25 +01:00 |
|
Jean-Francois Dockes
|
9bc2fc8958
|
Experimented with multithreading the indexing pipeline. Left undef'd as 15%-30% improvement of indexing time does not seem worth the complexity
|
2012-02-21 17:09:02 +01:00 |
|
Jean-Francois Dockes
|
516863b5d6
|
GUI: perform up to date check before previewing a subdoc. This is for example to avoid showing the wrong message if a mail folder has been compacted
|
2012-01-20 17:48:55 +01:00 |
|
Jean-Francois Dockes
|
607d3cc27b
|
Add prefix translation for "mtype". Allows using term expansion to retrieve all the types from the index
|
2011-11-25 19:47:39 +01:00 |
|
Jean-Francois Dockes
|
0860b559ee
|
get rid of a few garbage terms during indexing. Set a threshold for conversion errors after which we discard the doc. Stabilize the new termproc pipeline but no commongrams for now
|
2011-10-12 17:55:58 +02:00 |
|
Jean-Francois Dockes
|
5fd31172f5
|
New text to terms processing pipelines: results identical to 1.16 when used with empty stopfile
|
2011-10-07 07:53:49 +02:00 |
|
Jean-Francois Dockes
|
eda494153e
|
simplify calls to isStop
|
2011-10-05 17:25:35 +02:00 |
|
Jean-Francois Dockes
|
acb297c9df
|
comments + move the position jump to text_to_words
|
2011-10-04 16:33:44 +02:00 |
|
Jean-Francois Dockes
|
4ced9bee49
|
add termDocCnt method
|
2011-10-04 08:04:17 +02:00 |
|
Jean-Francois Dockes
|
38e0957962
|
const string cleanup
|
2011-10-01 16:39:38 +02:00 |
|
Jean-Francois Dockes
|
383468e2fc
|
bump doc create/update messages to loginfo so that indexing progress can be monitored with less noise
|
2011-09-30 08:47:39 +02:00 |
|
Jean-Francois Dockes
|
424e4173ba
|
threading cleanup: add mutex protection around moronic change to transcode. Add mutex to equiv issue in unac. Rename const strings everywhere to cstr_xx to ease future detection of potentially problematic static variables. Most probably close issue #65
|
2011-09-28 15:01:14 +02:00 |
|
Jean-Francois Dockes
|
e0d211d602
|
none
|
2011-09-20 17:16:41 +02:00 |
|
Jean-Francois Dockes
|
ee0d602ab3
|
Implement anchored searches: terms to be found at a maximum distance of the start or end of the text
|
2011-09-20 16:42:56 +02:00 |
|
Jean-Francois Dockes
|
c5ff0cdf52
|
Control memory usage when deleting documents: use idxflushmb as when adding/updating
|
2011-09-07 19:11:11 +02:00 |
|
Jean-Francois Dockes
|
a380873029
|
suppress some sources of spurious ellipses in abstracts
|
2011-08-24 14:51:59 +02:00 |
|