Jean-Francois Dockes
|
0333d83d2e
|
html: small additional cleanup after previous <body> processing modification
|
2012-05-16 10:13:53 +02:00 |
|
Jean-Francois Dockes
|
e6191b51a8
|
Html: Just ignore opening and closing <body> and <html> tags. Current browsers show text before or after the body and ignore multiple body tags. Not pushed to 1.17 maint because of possible disruption. Closes issue #92
|
2012-05-16 10:07:09 +02:00 |
|
Jean-Francois Dockes
|
638d468796
|
clarified the use of string keys inside the Filter metaData array
|
2012-03-07 10:13:46 +01:00 |
|
Jean-Francois Dockes
|
ec87379015
|
html: handle the html5 charset meta tag
|
2012-01-26 19:27:58 +01:00 |
|
Jean-Francois Dockes
|
38e0957962
|
const string cleanup
|
2011-10-01 16:39:38 +02:00 |
|
Jean-Francois Dockes
|
c7a241d26e
|
htmlparse: merged some updates from xapian 1.2.6
|
2011-06-24 10:41:54 +02:00 |
|
Jean-Francois Dockes
|
55f124725f
|
Fix problems that occurred when multiple threads were trying to read/convert files at the same time (i.e. indexing and previewing threads in the GUI calling internfile()). Either get rid of or lock-protect all shared data, and eliminate miscellaneous possible initialization conflicts by using static initializers. Hopefully closes issue #51
|
2011-04-28 10:58:33 +02:00 |
|
dockes
|
4876574e3e
|
accept iso date format (2008-09-05T11:55:32)
|
2008-09-05 10:33:27 +00:00 |
|
dockes
|
46a7f05cbc
|
gcc 4 compat, thanks to Kartik Mistry
|
2007-12-13 06:58:22 +00:00 |
|
dockes
|
a2659b48e4
|
renamed the html charset values to stick to omega usage
|
2007-06-19 12:17:07 +00:00 |
|
dockes
|
750d1c918d
|
updated html parser to omega 1.0.1 + moved entity decoder to myhtmlparse to minimize amount of diffs
|
2007-06-19 10:28:40 +00:00 |
|
dockes
|
0c74bd6e36
|
added open-ended field name handling
|
2007-06-19 08:36:24 +00:00 |
|
dockes
|
1d683ad411
|
added field/prefixes for author and title + command line query language
|
2007-01-17 13:53:41 +00:00 |
|
dockes
|
e3f89dca7e
|
don't throw away text even if html is weird
|
2006-09-21 05:59:59 +00:00 |
|
dockes
|
3872f8cf38
|
*** empty log message ***
|
2006-01-30 11:15:28 +00:00 |
|
dockes
|
c8213f76d3
|
strip whitespace and newlines (as the original version), except in pre tags
|
2006-01-27 13:38:18 +00:00 |
|
dockes
|
65d00b9c74
|
reenable stripping newlines
|
2006-01-25 08:39:07 +00:00 |
|
dockes
|
0122545ece
|
process text from html files without a </body> tag
|
2005-12-08 08:44:14 +00:00 |
|
dockes
|
c8e18ccc81
|
previous html fix didn't work
|
2005-12-06 09:40:18 +00:00 |
|
dockes
|
d2b54d6af2
|
fix nasty html parse bug introduced in 1.0.9
|
2005-12-06 08:35:48 +00:00 |
|
dockes
|
5b5be0c853
|
glitches in linux/solaris compilation + install
|
2005-11-21 17:18:58 +00:00 |
|
dockes
|
ad67a6cbb7
|
mimemap processing recentered in rclconfig. Handle directory-local suffix to mime-type definitions. Implement gaim log handling
|
2005-11-21 14:31:24 +00:00 |
|
dockes
|
d392d317bb
|
mail ckpt
|
2005-03-25 09:40:28 +00:00 |
|
dockes
|
152d47306e
|
added support for openoffice and word + optimized decomp temp dir usage
|
2005-02-09 12:07:30 +00:00 |
|
dockes
|
2e35f674a6
|
*** empty log message ***
|
2005-02-08 14:45:54 +00:00 |
|
dockes
|
82334f2957
|
ckpt
|
2005-01-28 15:25:40 +00:00 |
|
dockes
|
6d35f5430c
|
merged modifications from xapian/omega 0.8.5
|
2005-01-28 09:37:37 +00:00 |
|
dockes
|
44d2b70fdf
|
import from xapian 0.8.5
|
2005-01-28 08:56:26 +00:00 |
|
dockes
|
e3ec2bb7bf
|
*** empty log message ***
|
2005-01-28 08:45:40 +00:00 |
|