136 Commits

Author SHA1 Message Date
Jean-Francois Dockes
d69d2abbde TempFile: clean-up interface by using internal ref-counted class member. Uncomp: add interface to clear cache 2018-05-17 10:24:01 +02:00
Jean-Francois Dockes
29c6f75423 make sure that python rclextract.idoctofile always retrieves an uncompressed file of the correct MIME type. + misc comments 2017-07-20 12:52:24 +02:00
Jean-Francois Dockes
9f02bc8119 prettified LOG lines 2017-07-19 19:15:29 +02:00
Jean-Francois Dockes
19a4b2a287 Do not filter out text/html when it results from a conversion, even if excluded by indexedmimetypes/excludedmimetypes 2017-06-08 10:09:05 +02:00
Jean-Francois Dockes
bde991c08a got rid of off_t 2017-02-28 20:36:01 +01:00
Jean-Francois Dockes
2594b71ae8 log 2017-01-16 11:14:54 +01:00
Jean-Francois Dockes
d80531fa62 Fix mimetype filtering (indexedmimetypes/excludedmimetypes) not working for embedded documents 2017-01-13 09:18:18 +01:00
Jean-Francois Dockes
9ce6530e7b execm filters: the change to let filters set arbitrary metadata lost the top doc size, now saved aside 2016-08-12 18:00:52 +02:00
Jean-Francois Dockes
f6a999de84 logging now uses c++ streams 2016-07-12 09:41:04 +02:00
Jean-Francois Dockes
1aea57fcb2 defined data access interface for external indexers 2016-06-01 09:46:47 +02:00
Jean-Francois Dockes
ff15f8fb1c Centralize stat calls to ensure consistency of time fields on windows 2016-01-08 11:23:10 +01:00
Jean-Francois Dockes
f70c92c629 rcldb::getSubDocs() (called from GUI show subdocs) was returning too many results because the parent/child ipath test was flawed 2015-11-03 08:40:13 +01:00
Jean-Francois Dockes
3b18facc16 Fixed some "unused xxx" warnings + include autoconfig 2015-10-07 08:30:49 +02:00
Jean-Francois Dockes
1cbf02f713 Suppressed many integer size warnings by a mix of type adjustments and casts,
none of which should have a real effect.

--HG--
branch : WINDOWSPORT
2015-09-01 19:39:20 +02:00
Jean-Francois Dockes
14c8e740d6 Windows: fixed a number of int size warnings mostly by casting them away
--HG--
branch : WINDOWSPORT
2015-08-30 17:30:31 +02:00
Jean-Francois Dockes
d4cd1dd91c 1st mods to get a build under windows. Does not build yet, far from it
--HG--
branch : WINDOWSPORT
2015-08-30 11:19:18 +02:00
Jean-Francois Dockes
c6e228b7c6 Prepared windows port by removing a number of spurious references to unix-specific interfaces, and using some xapian posix adaptor includes 2015-08-19 14:41:10 +02:00
Jean-Francois Dockes
4d1f679eac Use std[::tr1]::shared_ptr instead of local RefCntr by default 2015-08-09 13:54:24 +02:00
Jean-Francois Dockes
0840daf20e Avoid replacing (instead of concatenating) the current author field value with the internal one when the document is a top-level one. This allows metadata from metadatacmds to be used 2015-08-06 08:08:36 +02:00
Jean-Francois Dockes
4d35cbabfb Also index non-html files from the web queue and fix the Open operation for them 2015-07-24 16:30:13 +02:00
Jean-Francois Dockes
d630cbbaec Delete RCL_USE_XATTR configure/compile time variable, it was not
useful. Add configuration variable to use mtime instead of ctime for update
detection. Useful on a system where xattrs would be modified but not
indexed, to avoid excessive reindexing.
2014-12-09 11:15:17 +01:00
Jean-Francois Dockes
4ac34cb134 Off by one error in maximum embedding depth test caused overflow of FileInterner m_tmpflgs temp flags array and possibly bus error depending on arch (only seen on 32 bits arch) 2014-05-15 15:15:01 +02:00
Jean-Francois Dockes
9487a0cffa Code for reaping xattrs and cmd metadata did not need to be implemented as internfile members and can be used in other contexts 2013-10-03 09:38:35 +02:00
Jean-Francois Dockes
ebe9b44a2c fix metadatacmds multifield modif, didn't set anything at all... 2013-09-27 13:04:05 +02:00
Jean-Francois Dockes
3fbcbc8c2b allow multiple field output from metadatacmds entry beginning with rclmulti. Add noxattrfields config variable to allow disabling extended attributes usage 2013-09-27 12:07:32 +02:00
medoc
641acd3d68 move the execution of external metadata-gathering commands from fsindexer to internfile for consistency of handling with filter-generated metadata 2013-09-06 11:51:00 +02:00
Jean-Francois Dockes
243ac82526 missing return statement... 2013-05-26 15:25:16 +02:00
Jean-Francois Dockes
a1b7018cfd Fix problems which occurred when using functions like open-parents with multiple indexes containing identical paths (udis) 2013-05-25 11:26:57 +02:00
Jean-Francois Dockes
167c8a4286 fix minor issues in multisave and popup menus 2013-04-28 16:58:05 +02:00
Jean-Francois Dockes
a7728ceb91 changed the mime handler cache key (was the mime type), to avoid having multiple copies of the same filter when applied to different mime types. This reduces a lot the number of processes during indexing, with no impact on performance 2013-04-25 18:18:48 +02:00
Jean-Francois Dockes
2b80c77c23 Add possibility to display a list of sub-documents for a given result 2013-04-24 16:33:53 +02:00
Jean-Francois Dockes
3c80e51940 simplified temp file handling for compressed documents and, for querying, implemented caching for last file uncompressed 2013-03-06 18:52:57 +01:00
Jean-Francois Dockes
50135e3428 process extended attributes by default 2013-02-19 16:12:24 +01:00
Jean-Francois Dockes
d3631b5ddf cleaned up processing of metadata from diverse origins (doc,extattrs,localfields) 2013-01-29 14:33:57 +01:00
Jean-Francois Dockes
d2f7f11715 Use dynamic lib for shared recoll code 2012-12-29 14:27:01 +01:00
Jean-Francois Dockes
2d5c2a8058 split the iDocToFile method into static and member parts for use from python module 2012-12-20 11:15:10 +01:00
Jean-Francois Dockes
5fc8f240fe from 1.18 branch: Adjust things for using the new Firefox plugin: remove visible Beagle references + fix 1.18 web queue indexing bugs 2012-11-01 11:30:39 +01:00
Jean-Francois Dockes
ee7d0f2ee7 1st parallel multithreaded version of indexing which can do my home without crashing... Let's checkpoint 2012-11-01 11:19:48 +01:00
Jean-Francois Dockes
b8963db4b1 cleaned up the missing helper storage class 2012-10-28 16:43:19 +01:00
Jean-Francois Dockes
95ef518ec7 the missing filter detection code was broken 2012-10-23 19:40:51 +02:00
Jean-Francois Dockes
5add2e2384 Arrange so we can now open the parent of a document (e.g. chm file instead of temp copy of html page inside chm), even when the parent is itself embedded in an archive 2012-10-12 16:54:52 +02:00
Jean-Francois Dockes
8e1ed842d2 message 2012-10-09 14:52:32 +02:00
Jean-Francois Dockes
1329265b7b check for empty file name in internfile, else gets stuck later because empty fn is interpreted as read stdin in md5 2012-10-05 16:42:13 +02:00
Jean-Francois Dockes
2870274f80 slightly simplified temp file handling 2012-08-21 08:35:39 +02:00
Jean-Francois Dockes
643f4d56bb internals: virtualized the doc fetcher interface 2012-06-05 07:16:11 +02:00
Jean-Francois Dockes
8b34610dde Cleaned up file name handling. Fixes that file names were sometimes indexed split, sometimes not. They now always are both, with different prefixes. Forces reindex 2012-04-13 09:18:08 +02:00
Jean-Francois Dockes
ec7b40a52e cosmetics: list -> vector in more places 2012-04-11 19:58:08 +02:00
Jean-Francois Dockes
78bd8d63da use vector instead of list for execmd arg list 2012-04-11 15:36:49 +02:00
Jean-Francois Dockes
85166c93b2 Changed the way we handle document sizes. The fbytes field should now be in most cases the most "natural" document size. pcbytes holds the top external container size and dbytes the text size 2012-03-07 15:39:30 +01:00
Jean-Francois Dockes
638d468796 clarified the use of string keys inside the Filter metaData array 2012-03-07 10:13:46 +01:00