35 Commits

Author SHA1 Message Date
Jean-Francois Dockes
f6a999de84 logging now uses c++ streams 2016-07-12 09:41:04 +02:00
Jean-Francois Dockes
75517f7497 recollindex builds. Still need to implement quite a lot of ifndefed stuff (pathut, rclconfig) --HG-- branch : WINDOWSPORT 2015-08-30 15:30:50 +02:00
Jean-Francois Dockes
e867f855ad get rid of numerous probably innocuous valgrind/helgrind messages by ensuring that actual string copies are passed between threads, without refcount/shared data magic 2014-05-05 19:01:58 +02:00
Jean-Francois Dockes
f897f087aa HTML: do not concatenate text found before body tag with the title. Fixes issue #125 2013-01-12 14:06:40 +01:00
Jean-Francois Dockes
6457fb4100 take care of pathologic charset decls with empty value 2012-11-26 11:40:08 +01:00
Jean-Francois Dockes
17f8b652d4 Support explicit HTML markup in fields when the markup="html" attribute is present 2012-10-25 14:22:20 +02:00
Jean-Francois Dockes
0333d83d2e html: small additional cleanup after previous <body> processing modification 2012-05-16 10:13:53 +02:00
Jean-Francois Dockes
e6191b51a8 Html: Just ignore opening and closing <body> and <html> tags. Current browsers show text before or after the body and ignore multiple body tags. Not pushed to 1.17 maint because of possible disruption. Closes issue #92 2012-05-16 10:07:09 +02:00
Jean-Francois Dockes
638d468796 clarified the use of string keys inside the Filter metaData array 2012-03-07 10:13:46 +01:00
Jean-Francois Dockes
ec87379015 html: handle the html5 charset meta tag 2012-01-26 19:27:58 +01:00
Jean-Francois Dockes
38e0957962 const string cleanup 2011-10-01 16:39:38 +02:00
Jean-Francois Dockes
c7a241d26e htmlparse: merged some updates from xapian 1.2.6 2011-06-24 10:41:54 +02:00
Jean-Francois Dockes
55f124725f Fix problems that occurred when multiple threads were trying to read/convert files at the same time (i.e. indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate miscellaneous possible initialization conflicts by using static initializers. Hopefully closes issue #51 2011-04-28 10:58:33 +02:00
dockes
4876574e3e accept iso date format (2008-09-05T11:55:32) 2008-09-05 10:33:27 +00:00
dockes
46a7f05cbc gcc 4 compat, thanks to Kartik Mistry 2007-12-13 06:58:22 +00:00
dockes
a2659b48e4 renamed the html charset values to stick to omega usage 2007-06-19 12:17:07 +00:00
dockes
750d1c918d updated html parser to omega 1.0.1 + moved entity decoder to myhtmlparse to minimize amount of diffs 2007-06-19 10:28:40 +00:00
dockes
0c74bd6e36 added open-ended field name handling 2007-06-19 08:36:24 +00:00
dockes
1d683ad411 added field/prefixes for author and title + command line query language 2007-01-17 13:53:41 +00:00
dockes
e3f89dca7e don't throw away text even if html is weird 2006-09-21 05:59:59 +00:00
dockes
3872f8cf38 *** empty log message *** 2006-01-30 11:15:28 +00:00
dockes
c8213f76d3 strip whitespace and newlines (as the original version), except in pre tags 2006-01-27 13:38:18 +00:00
dockes
65d00b9c74 reenable stripping newlines 2006-01-25 08:39:07 +00:00
dockes
0122545ece process text from html files without a </body> tag 2005-12-08 08:44:14 +00:00
dockes
c8e18ccc81 previous html fix didn't work 2005-12-06 09:40:18 +00:00
dockes
d2b54d6af2 fix nasty html parse bug introduced in 1.0.9 2005-12-06 08:35:48 +00:00
dockes
5b5be0c853 glitches in linux/solaris compil. + install 2005-11-21 17:18:58 +00:00
dockes
ad67a6cbb7 mimemap processing recentered in rclconfig. Handle directory-local suffix to mime-type definitions. Implement gaim log handling 2005-11-21 14:31:24 +00:00
dockes
d392d317bb mail ckpt 2005-03-25 09:40:28 +00:00
dockes
152d47306e added support for openoffice and word + optimized decomp temp dir usage 2005-02-09 12:07:30 +00:00
dockes
2e35f674a6 *** empty log message *** 2005-02-08 14:45:54 +00:00
dockes
82334f2957 ckpt 2005-01-28 15:25:40 +00:00
dockes
6d35f5430c merged modifs from xapian/omega 0.8.5 2005-01-28 09:37:37 +00:00
dockes
44d2b70fdf import from xapian 0.8.5 2005-01-28 08:56:26 +00:00
dockes
e3ec2bb7bf *** empty log message *** 2005-01-28 08:45:40 +00:00