17 Commits

Author SHA1 Message Date
Jean-Francois Dockes
f897f087aa HTML: do not concatenate text found before body tag with the title. Fixes issue #125 2013-01-12 14:06:40 +01:00
Jean-Francois Dockes
e6191b51a8 Html: Just ignore opening and closing <body> and <html> tags. Current browsers show text before or after the body and ignore multiple body tags. Not pushed to 1.17 maint because of possible disruption. Closes issue #92 2012-05-16 10:07:09 +02:00
Jean-Francois Dockes
c7a241d26e htmlparse: merged some updates from xapian 1.2.6 2011-06-24 10:41:54 +02:00
Jean-Francois Dockes
55f124725f Fix problems that occurred when multiple threads were trying to read/convert files at the same time (i.e. indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, eliminate miscellaneous possible initialization conflicts by using static initializers. Hopefully closes issue #51 2011-04-28 10:58:33 +02:00
dockes
a2659b48e4 renamed the html charset values to stick to omega usage 2007-06-19 12:17:07 +00:00
dockes
750d1c918d updated html parser to omega 1.0.1 + moved entity decoder to myhtmlparse to minimize amount of diffs 2007-06-19 10:28:40 +00:00
dockes
0c74bd6e36 added open-ended field name handling 2007-06-19 08:36:24 +00:00
dockes
1d683ad411 added field/prefixes for author and title + command line query language 2007-01-17 13:53:41 +00:00
dockes
229eb0de78 test data indexing result same terms as 1.6.3 2006-12-15 16:33:15 +00:00
dockes
33c95ef1ba Dijon filters 1st step: mostly working needs check and optim 2006-12-15 12:40:24 +00:00
dockes
3872f8cf38 *** empty log message *** 2006-01-30 11:15:28 +00:00
dockes
65d00b9c74 reenable stripping newlines 2006-01-25 08:39:07 +00:00
dockes
0122545ece process text from html files without a </body> tag 2005-12-08 08:44:14 +00:00
dockes
ad67a6cbb7 mimemap processing recentered in rclconfig. Handle directory-local suffix to mime-type definitions. Implement gaim log handling 2005-11-21 14:31:24 +00:00
dockes
6d35f5430c merged modifs from xapian/omega 0.8.5 2005-01-28 09:37:37 +00:00
dockes
44d2b70fdf import from xapian 0.8.5 2005-01-28 08:56:26 +00:00
dockes
e3ec2bb7bf *** empty log message *** 2005-01-28 08:45:40 +00:00