doc
parent ad466ee42d
commit 8c816f50cf
@@ -3446,43 +3446,48 @@ fs.inotify.max_user_watches=32768
|
|||||||
WEB history.</p>
|
WEB history.</p>
|
||||||
<p>Here follows an example:</p>
|
<p>Here follows an example:</p>
|
||||||
<pre class="programlisting">
|
<pre class="programlisting">
|
||||||
<?xml version="1.0" encoding="UTF-8"?>
|
<?xml version="1.0" encoding="UTF-8"?>
|
||||||
|
<fragbuts version="1.0">
|
||||||
|
|
||||||
<fragbuts version="1.0">
|
<radiobuttons>
|
||||||
|
<!-- Actually useful: toggle WEB queue results inclusion -->
|
||||||
|
<fragbut>
|
||||||
|
<label>Include Web Results</label>
|
||||||
|
<frag></frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<radiobuttons>
|
<fragbut>
|
||||||
|
<label>Exclude Web Results</label>
|
||||||
|
<frag>-rclbes:BGL</frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
<fragbut>
|
||||||
<label>Include Web Results</label>
|
<label>Only Web Results</label>
|
||||||
<frag></frag>
|
<frag>rclbes:BGL</frag>
|
||||||
</fragbut>
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
</radiobuttons>
|
||||||
<label>Exclude Web Results</label>
|
|
||||||
<frag>-rclbes:BGL</frag>
|
|
||||||
</fragbut>
|
|
||||||
|
|
||||||
<fragbut>
|
<buttons>
|
||||||
<label>Only Web Results</label>
|
|
||||||
<frag>rclbes:BGL</frag>
|
|
||||||
</fragbut>
|
|
||||||
|
|
||||||
</radiobuttons>
|
<fragbut>
|
||||||
|
<label>Example: Year 2010</label>
|
||||||
|
<frag>date:2010-01-01/2010-12-31</frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<buttons>
|
<fragbut>
|
||||||
|
<label>Example: c++ files</label>
|
||||||
|
<frag>ext:cpp OR ext:cxx</frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
<fragbut>
|
||||||
<label>Year 2010</label>
|
<label>Example: My Great Directory</label>
|
||||||
<frag>date:2010-01-01/2010-12-31</frag>
|
<frag>dir:/my/great/directory</frag>
|
||||||
</fragbut>
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
</buttons>
|
||||||
<label>My Great Directory Only</label>
|
|
||||||
<frag>dir:/my/great/directory</frag>
|
|
||||||
</fragbut>
|
|
||||||
|
|
||||||
</buttons>
|
</fragbuts>
|
||||||
</fragbuts>
|
|
||||||
</pre>
|
</pre>
|
||||||
<p>Each <code class="literal">radiobuttons</code> or
|
<p>Each <code class="literal">radiobuttons</code> or
|
||||||
<code class="literal">buttons</code> section defines a
|
<code class="literal">buttons</code> section defines a
|
||||||
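Review note: the reworked example above groups fragments into one radiobuttons section and one buttons section. As a rough illustration of that layout, the following Python sketch parses a copy of the new example and lists each label with its query fragment. It is not how Recoll itself reads the file; the function name list_fragments and the FRAGBUTS_XML constant are made up for this note.

```python
import xml.etree.ElementTree as ET

# Copy of the new example from the hunk above (whitespace condensed).
FRAGBUTS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<fragbuts version="1.0">
  <radiobuttons>
    <!-- Actually useful: toggle WEB queue results inclusion -->
    <fragbut><label>Include Web Results</label><frag></frag></fragbut>
    <fragbut><label>Exclude Web Results</label><frag>-rclbes:BGL</frag></fragbut>
    <fragbut><label>Only Web Results</label><frag>rclbes:BGL</frag></fragbut>
  </radiobuttons>
  <buttons>
    <fragbut><label>Example: Year 2010</label><frag>date:2010-01-01/2010-12-31</frag></fragbut>
    <fragbut><label>Example: c++ files</label><frag>ext:cpp OR ext:cxx</frag></fragbut>
    <fragbut><label>Example: My Great Directory</label><frag>dir:/my/great/directory</frag></fragbut>
  </buttons>
</fragbuts>
"""


def list_fragments(xml_bytes):
    """Return (section, label, fragment) tuples for every fragbut element.

    Hypothetical helper for illustration only, not part of Recoll.
    """
    root = ET.fromstring(xml_bytes)
    result = []
    for section in root:
        # Only the two section kinds named in the manual are considered.
        if section.tag not in ("radiobuttons", "buttons"):
            continue
        for fragbut in section.findall("fragbut"):
            label = fragbut.findtext("label", default="")
            frag = fragbut.findtext("frag", default="") or ""
            result.append((section.tag, label, frag))
    return result


if __name__ == "__main__":
    for section, label, frag in list_fragments(FRAGBUTS_XML.encode("utf-8")):
        print(section, "|", label, "|", repr(frag))
```

The empty fragment of "Include Web Results" comes back as an empty string, which matches its role as the entry that adds nothing to the query.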
@@ -3781,6 +3786,20 @@ fs.inotify.max_user_watches=32768
|
|||||||
your NLS environment. Weird things will probably
|
your NLS environment. Weird things will probably
|
||||||
happen if languages are mixed up.</p>
|
happen if languages are mixed up.</p>
|
||||||
</dd>
|
</dd>
|
||||||
|
<dt><span class="term">Show index
|
||||||
|
statistics</span></dt>
|
||||||
|
<dd>
|
||||||
|
<p>This will print a long list of boring numbers
|
||||||
|
about the index</p>
|
||||||
|
</dd>
|
||||||
|
<dt><span class="term">List files which could not be
|
||||||
|
indexed</span></dt>
|
||||||
|
<dd>
|
||||||
|
<p>This will show the files which caused errors,
|
||||||
|
usually because <span class=
|
||||||
|
"command"><strong>recollindex</strong></span> could
|
||||||
|
not translate their format into text.</p>
|
||||||
|
</dd>
|
||||||
</dl>
|
</dl>
|
||||||
</div>
|
</div>
|
||||||
<p>Note that in cases where <span class=
|
<p>Note that in cases where <span class=
|
||||||
@@ -3862,14 +3881,9 @@ fs.inotify.max_user_watches=32768
|
|||||||
<code class="envar">RECOLL_ACTIVE_EXTRA_DBS</code>, you
|
<code class="envar">RECOLL_ACTIVE_EXTRA_DBS</code>, you
|
||||||
can add and activate the index for the mounted volume
|
can add and activate the index for the mounted volume
|
||||||
when starting <span class=
|
when starting <span class=
|
||||||
"command"><strong>recoll</strong></span>.</p>
|
"command"><strong>recoll</strong></span>. Unreachable
|
||||||
<p><code class="envar">RECOLL_ACTIVE_EXTRA_DBS</code> is
|
indexes will automatically be deactivated when starting
|
||||||
available for <span class="application">Recoll</span>
|
up.</p>
|
||||||
versions 1.17.2 and later. A change was made in the same
|
|
||||||
update so that <span class=
|
|
||||||
"command"><strong>recoll</strong></span> will
|
|
||||||
automatically deactivate unreachable indexes when
|
|
||||||
starting up.</p>
|
|
||||||
</div>
|
</div>
|
||||||
<div class="sect2">
|
<div class="sect2">
|
||||||
<div class="titlepage">
|
<div class="titlepage">
|
||||||
@@ -5579,8 +5593,9 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<p><b>Term synonyms: </b>there are a number of ways to
|
<p><b>Term synonyms and text search: </b>in general,
|
||||||
use term synonyms for searching text:</p>
|
there are two main ways to use term synonyms for searching
|
||||||
|
text:</p>
|
||||||
<div class="itemizedlist">
|
<div class="itemizedlist">
|
||||||
<ul class="itemizedlist" style="list-style-type: disc;">
|
<ul class="itemizedlist" style="list-style-type: disc;">
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
@@ -5829,15 +5844,25 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
is minimal. However there are a few tools available:</p>
|
is minimal. However there are a few tools available:</p>
|
||||||
<div class="itemizedlist">
|
<div class="itemizedlist">
|
||||||
<ul class="itemizedlist" style="list-style-type: disc;">
|
<ul class="itemizedlist" style="list-style-type: disc;">
|
||||||
|
<li class="listitem">
|
||||||
|
<p>Users of recent Ubuntu-derived distributions, or
|
||||||
|
any other Gnome desktop systems (e.g. Fedora) can
|
||||||
|
install the <a class="ulink" href=
|
||||||
|
"https://www.lesbonscomptes.com/recoll/download.html#gssp"
|
||||||
|
target="_top">Recoll GSSP</a> (Gnome Shell Search
|
||||||
|
Provider).</p>
|
||||||
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
<p>The <span class="application">KDE</span> KIO Slave
|
<p>The <span class="application">KDE</span> KIO Slave
|
||||||
was described in a <a class="link" href=
|
was described in a <a class="link" href=
|
||||||
"#RCL.SEARCH.KIO" title=
|
"#RCL.SEARCH.KIO" title=
|
||||||
"3.3. Searching with the KDE KIO slave">previous
|
"3.3. Searching with the KDE KIO slave">previous
|
||||||
section</a>.</p>
|
section</a>. It can provide search results inside
|
||||||
|
<span class=
|
||||||
|
"command"><strong>Dolphin</strong></span>.</p>
|
||||||
</li>
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
<p>If you use a recent version of Ubuntu Linux, you
|
<p>If you use an oldish version of Ubuntu Linux, you
|
||||||
may find the <a class="ulink" href=
|
may find the <a class="ulink" href=
|
||||||
"https://www.lesbonscomptes.com/recoll/faqsandhowtos/UnityLens"
|
"https://www.lesbonscomptes.com/recoll/faqsandhowtos/UnityLens"
|
||||||
target="_top">Ubuntu Unity Lens</a> module
|
target="_top">Ubuntu Unity Lens</a> module
|
||||||
@@ -5975,8 +6000,8 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
C++ and live inside <span class=
|
C++ and live inside <span class=
|
||||||
"command"><strong>recollindex</strong></span>. This latter
|
"command"><strong>recollindex</strong></span>. This latter
|
||||||
kind will not be described here.</p>
|
kind will not be described here.</p>
|
||||||
<p>There are currently (since version 1.13) two kinds of
|
<p>There are two kinds of external executable input
|
||||||
external executable input handlers:</p>
|
handlers:</p>
|
||||||
<div class="itemizedlist">
|
<div class="itemizedlist">
|
||||||
<ul class="itemizedlist" style="list-style-type: disc;">
|
<ul class="itemizedlist" style="list-style-type: disc;">
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
@@ -6180,10 +6205,11 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
have a look. Before the C++ import, the xsl-based
|
have a look. Before the C++ import, the xsl-based
|
||||||
handlers used a common module <code class=
|
handlers used a common module <code class=
|
||||||
"filename">rclgenxslt.py</code>, it is still around
|
"filename">rclgenxslt.py</code>, it is still around
|
||||||
but unused. The handler for OpenXML presentations
|
but unused at the moment. The handler for OpenXML
|
||||||
is still the Python version because the format did
|
presentations is still the Python version because
|
||||||
not fit with what the C++ code does. It would be a
|
the format did not fit with what the C++ code does.
|
||||||
good base for another similar issue.</p>
|
It would be a good base for another similar
|
||||||
|
issue.</p>
|
||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
</div>
|
</div>
|
||||||
@@ -6366,14 +6392,14 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
minimal like the following example:</p>
|
minimal like the following example:</p>
|
||||||
<pre class="programlisting">
|
<pre class="programlisting">
|
||||||
<html>
|
<html>
|
||||||
<head>
|
<head>
|
||||||
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
|
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/>
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
Some text content
|
Some text content
|
||||||
</body>
|
</body>
|
||||||
</html>
|
</html>
|
||||||
</pre>
|
</pre>
|
||||||
<p>You should take care to escape some characters inside
|
<p>You should take care to escape some characters inside
|
||||||
the text by transforming them into appropriate entities.
|
the text by transforming them into appropriate entities.
|
||||||
At the very minimum, "<code class="literal">&</code>"
|
At the very minimum, "<code class="literal">&</code>"
|
||||||
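Review note: the context above says a filter producing HTML can emit a very minimal document, provided some characters in the text are turned into entities. A minimal Python sketch of such output follows, assuming the standard library html.escape is acceptable for the escaping step (it handles "&", "<" and ">"). The function name minimal_html is made up for this note and is not part of the Recoll filter modules.

```python
import html


def minimal_html(text):
    """Wrap plain text in the minimal HTML skeleton shown in the manual.

    Hypothetical helper for illustration only. The text is escaped so that
    "&", "<" and ">" become entities instead of being read as markup.
    """
    body = html.escape(text, quote=False)
    return (
        "<html>\n"
        "<head>\n"
        '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/>\n'
        "</head>\n"
        "<body>\n"
        + body + "\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(minimal_html("Some text content with R&D and <tags>"))
```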
@@ -6613,11 +6639,17 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
for creating/updating an index. Bindings exist for
|
for creating/updating an index. Bindings exist for
|
||||||
Python2 and Python3.</p>
|
Python2 and Python3.</p>
|
||||||
<p>The search interface is used in a number of active
|
<p>The search interface is used in a number of active
|
||||||
projects: the <span class="application">Recoll</span>
|
projects: the <a class="ulink" href=
|
||||||
|
"https://www.lesbonscomptes.com/recoll/download.html#gssp"
|
||||||
|
target="_top"><span class="application">Recoll</span>
|
||||||
<span class="application">Gnome Shell Search
|
<span class="application">Gnome Shell Search
|
||||||
Provider</span>, the <span class=
|
Provider</span></a> , the <a class="ulink" href=
|
||||||
"application">Recoll</span> Web UI, and the upmpdcli UPnP
|
"https://opensourceprojects.eu/p/recollwebui/code/"
|
||||||
Media Server, in addition to many small scripts.</p>
|
target="_top"><span class="application">Recoll</span> Web
|
||||||
|
UI</a>, and the <a class="ulink" href=
|
||||||
|
"https://www.lesbonscomptes.com/upmpdcli/upmpdcli-manual.html#UPRCL"
|
||||||
|
target="_top">upmpdcli UPnP Media Server</a>, in addition
|
||||||
|
to many small scripts.</p>
|
||||||
<p>The index update section of the API may be used to
|
<p>The index update section of the API may be used to
|
||||||
create and update <span class="application">Recoll</span>
|
create and update <span class="application">Recoll</span>
|
||||||
indexes on specific configurations (separate from the
|
indexes on specific configurations (separate from the
|
||||||
|
|||||||
@@ -2454,46 +2454,52 @@ fs.inotify.max_user_watches=32768
|
|||||||
contains an example which filters the results from the WEB
|
contains an example which filters the results from the WEB
|
||||||
history.</para>
|
history.</para>
|
||||||
|
|
||||||
|
|
||||||
<para>Here follows an example:
|
<para>Here follows an example:
|
||||||
<programlisting>
|
<programlisting><![CDATA[
|
||||||
<?xml version="1.0" encoding="UTF-8"?>
|
<?xml version="1.0" encoding="UTF-8"?>
|
||||||
|
<fragbuts version="1.0">
|
||||||
|
|
||||||
<fragbuts version="1.0">
|
<radiobuttons>
|
||||||
|
<!-- Actually useful: toggle WEB queue results inclusion -->
|
||||||
|
<fragbut>
|
||||||
|
<label>Include Web Results</label>
|
||||||
|
<frag></frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<radiobuttons>
|
<fragbut>
|
||||||
|
<label>Exclude Web Results</label>
|
||||||
|
<frag>-rclbes:BGL</frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
<fragbut>
|
||||||
<label>Include Web Results</label>
|
<label>Only Web Results</label>
|
||||||
<frag></frag>
|
<frag>rclbes:BGL</frag>
|
||||||
</fragbut>
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
</radiobuttons>
|
||||||
<label>Exclude Web Results</label>
|
|
||||||
<frag>-rclbes:BGL</frag>
|
|
||||||
</fragbut>
|
|
||||||
|
|
||||||
<fragbut>
|
<buttons>
|
||||||
<label>Only Web Results</label>
|
|
||||||
<frag>rclbes:BGL</frag>
|
|
||||||
</fragbut>
|
|
||||||
|
|
||||||
</radiobuttons>
|
<fragbut>
|
||||||
|
<label>Example: Year 2010</label>
|
||||||
|
<frag>date:2010-01-01/2010-12-31</frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<buttons>
|
<fragbut>
|
||||||
|
<label>Example: c++ files</label>
|
||||||
|
<frag>ext:cpp OR ext:cxx</frag>
|
||||||
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
<fragbut>
|
||||||
<label>Year 2010</label>
|
<label>Example: My Great Directory</label>
|
||||||
<frag>date:2010-01-01/2010-12-31</frag>
|
<frag>dir:/my/great/directory</frag>
|
||||||
</fragbut>
|
</fragbut>
|
||||||
|
|
||||||
<fragbut>
|
</buttons>
|
||||||
<label>My Great Directory Only</label>
|
|
||||||
<frag>dir:/my/great/directory</frag>
|
|
||||||
</fragbut>
|
|
||||||
|
|
||||||
</buttons>
|
</fragbuts>
|
||||||
</fragbuts>
|
]]></programlisting>
|
||||||
</programlisting>
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>Each <literal>radiobuttons</literal> or
|
<para>Each <literal>radiobuttons</literal> or
|
||||||
@@ -2745,6 +2751,16 @@ fs.inotify.max_user_watches=32768
|
|||||||
environment. Weird things will probably happen if
|
environment. Weird things will probably happen if
|
||||||
languages are mixed up.</para></listitem>
|
languages are mixed up.</para></listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
<varlistentry>
|
||||||
|
<term>Show index statistics</term> <listitem><para>This will
|
||||||
|
print a long list of boring numbers about the index</para>
|
||||||
|
</listitem></varlistentry>
|
||||||
|
<varlistentry>
|
||||||
|
<term>List files which could not be indexed</term>
|
||||||
|
<listitem><para>This will show the files which caused errors,
|
||||||
|
usually because <command>recollindex</command> could not
|
||||||
|
translate their format into text.</para>
|
||||||
|
</listitem></varlistentry>
|
||||||
</variablelist>
|
</variablelist>
|
||||||
|
|
||||||
<para>Note that in cases where &RCL; does not know the beginning
|
<para>Note that in cases where &RCL; does not know the beginning
|
||||||
@@ -2804,22 +2820,16 @@ fs.inotify.max_user_watches=32768
|
|||||||
</para>
|
</para>
|
||||||
<screen>export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db</screen>
|
<screen>export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db</screen>
|
||||||
|
|
||||||
<para>Another environment variable,
|
<para>Another environment
|
||||||
<envar>RECOLL_ACTIVE_EXTRA_DBS</envar> allows adding to the active
|
variable, <envar>RECOLL_ACTIVE_EXTRA_DBS</envar> allows adding to
|
||||||
list of indexes. This variable was suggested and implemented by a
|
the active list of indexes. This variable was suggested and
|
||||||
&RCL; user. It is mostly useful if you use scripts to mount
|
implemented by a &RCL; user. It is mostly useful if you use scripts
|
||||||
external volumes with &RCL; indexes. By using
|
to mount external volumes with &RCL; indexes. By
|
||||||
<envar>RECOLL_EXTRA_DBS</envar> and
|
using <envar>RECOLL_EXTRA_DBS</envar>
|
||||||
<envar>RECOLL_ACTIVE_EXTRA_DBS</envar>, you can add and activate
|
and <envar>RECOLL_ACTIVE_EXTRA_DBS</envar>, you can add and
|
||||||
the index for the mounted volume when starting
|
activate the index for the mounted volume when
|
||||||
<command>recoll</command>.
|
starting <command>recoll</command>. Unreachable indexes will
|
||||||
</para>
|
automatically be deactivated when starting up.</para>
|
||||||
|
|
||||||
<para><envar>RECOLL_ACTIVE_EXTRA_DBS</envar> is available for
|
|
||||||
&RCL; versions 1.17.2 and later. A change was made in the same
|
|
||||||
update so that <command>recoll</command> will
|
|
||||||
automatically deactivate unreachable indexes when starting
|
|
||||||
up.</para>
|
|
||||||
|
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
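Review note: the rewritten paragraph above now states that unreachable indexes are deactivated at startup. The sketch below illustrates that idea in Python; it is not the code recoll runs. It assumes the colon-separated format shown in the RECOLL_EXTRA_DBS example and treats an index as reachable when its directory exists. The function name reachable_extra_dbs is made up for this note.

```python
import os


def reachable_extra_dbs(environ, varname="RECOLL_EXTRA_DBS"):
    """Return the index directories listed in `varname` that currently exist.

    Hypothetical helper for illustration only. The value is split on ":",
    as in the manual's export example, and empty entries are ignored.
    """
    raw = environ.get(varname, "")
    candidates = [path for path in raw.split(":") if path]
    # An index on an unmounted volume has no directory, so it is dropped.
    return [path for path in candidates if os.path.isdir(path)]


if __name__ == "__main__":
    print(reachable_extra_dbs(os.environ))
```

A script that mounts external volumes could run a check like this before starting recoll, to see which of the listed indexes would actually be usable.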
@@ -4261,8 +4271,9 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
<sect1 id="RCL.SEARCH.SYNONYMS">
|
<sect1 id="RCL.SEARCH.SYNONYMS">
|
||||||
<title>Using Synonyms (1.22)</title>
|
<title>Using Synonyms (1.22)</title>
|
||||||
|
|
||||||
<formalpara><title>Term synonyms:</title>
|
<formalpara><title>Term synonyms and text search:</title> <para>in
|
||||||
<para>there are a number of ways to use term synonyms for searching text:
|
general, there are two main ways to use term synonyms for
|
||||||
|
searching text:
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
<listitem><para>At index creation time, they can be used to alter the
|
<listitem><para>At index creation time, they can be used to alter the
|
||||||
indexed terms, either increasing or decreasing their number, by
|
indexed terms, either increasing or decreasing their number, by
|
||||||
@@ -4478,11 +4489,20 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
available:
|
available:
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>The <application>KDE</application> KIO Slave was
|
<para>Users of recent Ubuntu-derived distributions, or
|
||||||
described in a <link linkend="RCL.SEARCH.KIO">previous section</link>.</para>
|
any other Gnome desktop systems (e.g. Fedora) can install the
|
||||||
|
<ulink
|
||||||
|
url="https://www.lesbonscomptes.com/recoll/download.html#gssp">
|
||||||
|
Recoll GSSP</ulink> (Gnome Shell Search Provider).</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>If you use a recent version of Ubuntu Linux, you may
|
<para>The <application>KDE</application> KIO Slave was described
|
||||||
|
in a <link linkend="RCL.SEARCH.KIO">previous
|
||||||
|
section</link>. It can provide search results
|
||||||
|
inside <command>Dolphin</command>. </para>
|
||||||
|
</listitem>
|
||||||
|
<listitem>
|
||||||
|
<para>If you use an oldish version of Ubuntu Linux, you may
|
||||||
find the <ulink url="&FAQS;UnityLens">Ubuntu Unity
|
find the <ulink url="&FAQS;UnityLens">Ubuntu Unity
|
||||||
Lens</ulink> module useful.</para>
|
Lens</ulink> module useful.</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
@@ -4583,8 +4603,7 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
in C++ and live inside <command>recollindex</command>. This latter
|
in C++ and live inside <command>recollindex</command>. This latter
|
||||||
kind will not be described here.</para>
|
kind will not be described here.</para>
|
||||||
|
|
||||||
<para>There are currently (since version 1.13) two kinds of
|
<para>There are two kinds of external executable input handlers:
|
||||||
external executable input handlers:
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
<listitem><para>Simple <literal>exec</literal> handlers
|
<listitem><para>Simple <literal>exec</literal> handlers
|
||||||
run once and exit. They can be bare programs like
|
run once and exit. They can be bare programs like
|
||||||
@@ -4711,34 +4730,32 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
<listitem><para><literal>rclimg</literal> is written in Perl and
|
<listitem><para><literal>rclimg</literal> is written in Perl and
|
||||||
handles the execm protocol all by itself (showing how trivial it
|
handles the execm protocol all by itself (showing how trivial it
|
||||||
is).</para></listitem>
|
is).</para></listitem> <listitem><para>All the Python handlers
|
||||||
<listitem><para>All the Python handlers share at least the
|
share at least the <filename>rclexecm.py</filename> module, which
|
||||||
<filename>rclexecm.py</filename> module, which handles the
|
handles the communication. Have a look at, for
|
||||||
communication. Have a look at, for example,
|
example, <filename>rclzip</filename> for a handler which
|
||||||
<filename>rclzip</filename> for a handler which uses
|
uses <filename>rclexecm.py</filename>
|
||||||
<filename>rclexecm.py</filename> directly.</para></listitem>
|
directly.</para></listitem> <listitem><para>Most Python handlers
|
||||||
<listitem><para>Most Python handlers which process
|
which process single-document files by executing another command
|
||||||
single-document files by executing another command are further
|
are further abstracted by using
|
||||||
abstracted by using the <filename>rclexec1.py</filename>
|
the <filename>rclexec1.py</filename> module. See for
|
||||||
module. See for example <filename>rclrtf.py</filename> for a
|
example <filename>rclrtf.py</filename> for a simple one,
|
||||||
simple one, or <filename>rcldoc.py</filename> for a slightly more
|
or <filename>rcldoc.py</filename> for a slightly more complicated
|
||||||
complicated one (possibly executing several
|
one (possibly executing several
|
||||||
commands).</para></listitem>
|
commands).</para></listitem> <listitem><para>Handlers which
|
||||||
<listitem><para>Handlers which extract text from an XML document
|
extract text from an XML document by using an XSLT style sheet
|
||||||
by using an XSLT style sheet are now executed inside
|
are now executed inside <command>recollindex</command>, with only
|
||||||
<command>recollindex</command>, with only the style sheet stored
|
the style sheet stored in the <filename>filters/</filename>
|
||||||
in the <filename>filters/</filename> directory. These can
|
directory. These can use a single style sheet
|
||||||
use a single style sheet (e.g. <filename>abiword.xsl</filename>),
|
(e.g. <filename>abiword.xsl</filename>), or two sheets for the
|
||||||
or two sheets for the data and metadata
|
data and metadata (e.g. <filename>opendoc-body.xsl</filename>
|
||||||
(e.g. <filename>opendoc-body.xsl</filename> and
|
and <filename>opendoc-meta.xsl</filename>). The <filename>mimeconf</filename>
|
||||||
<filename>opendoc-meta.xsl</filename>). The
|
configuration file defines how the sheets are used, have a
|
||||||
<filename>mimeconf</filename> configuration file defines how the
|
look. Before the C++ import, the xsl-based handlers used a common
|
||||||
sheets are used, have a look. Before the C++ import, the
|
module <filename>rclgenxslt.py</filename>, it is still around but
|
||||||
xsl-based handlers used a common module
|
unused at the moment. The handler for OpenXML presentations is
|
||||||
<filename>rclgenxslt.py</filename>, it is still around but
|
still the Python version because the format did not fit with what
|
||||||
unused. The handler for OpenXML presentations is still the Python
|
the C++ code does. It would be a good base for another similar
|
||||||
version because the format did not fit with what the C++ code
|
|
||||||
does. It would be a good base for another similar
|
|
||||||
issue.</para></listitem>
|
issue.</para></listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
</para>
|
</para>
|
||||||
@@ -4878,16 +4895,16 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
|
|
||||||
<para>For filters producing HTML, the output could be very minimal
|
<para>For filters producing HTML, the output could be very minimal
|
||||||
like the following example:
|
like the following example:
|
||||||
<programlisting>
|
<programlisting><![CDATA[
|
||||||
<html>
|
<html>
|
||||||
<head>
|
<head>
|
||||||
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
|
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/>
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
Some text content
|
Some text content
|
||||||
</body>
|
</body>
|
||||||
</html>
|
</html>
|
||||||
</programlisting>
|
]]></programlisting>
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>You should take care to escape some
|
<para>You should take care to escape some
|
||||||
@@ -5087,9 +5104,16 @@ text/html [file:///Users/uncrypted-dockes/projets/bateaux/ilur/factEtCie/r
|
|||||||
Python2 and Python3.</para>
|
Python2 and Python3.</para>
|
||||||
|
|
||||||
<para>The search interface is used in a number of active projects:
|
<para>The search interface is used in a number of active projects:
|
||||||
the &RCL; <application>Gnome Shell Search Provider</application>,
|
the <ulink
|
||||||
the &RCL; Web UI, and the upmpdcli UPnP Media Server, in addition
|
url="https://www.lesbonscomptes.com/recoll/download.html#gssp">
|
||||||
to many small scripts.</para>
|
&RCL; <application>Gnome Shell Search Provider</application>
|
||||||
|
</ulink>,
|
||||||
|
the <ulink url="https://opensourceprojects.eu/p/recollwebui/code/">
|
||||||
|
&RCL; Web UI</ulink>, and the
|
||||||
|
<ulink
|
||||||
|
url="https://www.lesbonscomptes.com/upmpdcli/upmpdcli-manual.html#UPRCL">
|
||||||
|
upmpdcli UPnP Media Server</ulink>, in addition
|
||||||
|
to many small scripts.</para>
|
||||||
|
|
||||||
<para>The index update section of the API may be used to create and
|
<para>The index update section of the API may be used to create and
|
||||||
update &RCL; indexes on specific configurations (separate from the
|
update &RCL; indexes on specific configurations (separate from the
|
||||||
|
|||||||
@@ -1,118 +0,0 @@
|
|||||||
#!/usr/bin/env python3
|
|
||||||
# Copyright (C) 2014 J.F.Dockes
|
|
||||||
# This program is free software; you can redistribute it and/or modify
|
|
||||||
# it under the terms of the GNU General Public License as published by
|
|
||||||
# the Free Software Foundation; either version 2 of the License, or
|
|
||||||
# (at your option) any later version.
|
|
||||||
#
|
|
||||||
# This program is distributed in the hope that it will be useful,
|
|
||||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
||||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
||||||
# GNU General Public License for more details.
|
|
||||||
#
|
|
||||||
# You should have received a copy of the GNU General Public License
|
|
||||||
# along with this program; if not, write to the
|
|
||||||
# Free Software Foundation, Inc.,
|
|
||||||
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
|
|
||||||
######################################
|
|
||||||
|
|
||||||
from __future__ import print_function
|
|
||||||
|
|
||||||
import sys
|
|
||||||
import rclexecm
|
|
||||||
import rclgenxslt
|
|
||||||
|
|
||||||
stylesheet_all = '''<?xml version="1.0"?>
|
|
||||||
<xsl:stylesheet version="1.0"
|
|
||||||
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
|
|
||||||
xmlns:ab="http://www.abisource.com/awml.dtd"
|
|
||||||
exclude-result-prefixes="ab"
|
|
||||||
>
|
|
||||||
|
|
||||||
<xsl:output method="html" encoding="UTF-8"/>
|
|
||||||
|
|
||||||
<xsl:template match="/">
|
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<xsl:apply-templates select="ab:abiword/ab:metadata"/>
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<!-- This is for the older abiword format with no namespaces -->
|
|
||||||
<xsl:for-each select="abiword/section">
|
|
||||||
<xsl:apply-templates select="p"/>
|
|
||||||
</xsl:for-each>
|
|
||||||
|
|
||||||
<!-- Newer namespaced format -->
|
|
||||||
<xsl:for-each select="ab:abiword/ab:section">
|
|
||||||
<xsl:for-each select="ab:p">
|
|
||||||
<p><xsl:value-of select="."/></p><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:for-each>
|
|
||||||
</xsl:for-each>
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
||||||
</xsl:template>
|
|
||||||
|
|
||||||
<xsl:template match="p">
|
|
||||||
<p><xsl:value-of select="."/></p><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:template>
|
|
||||||
|
|
||||||
<xsl:template match="ab:metadata">
|
|
||||||
<xsl:for-each select="ab:m">
|
|
||||||
<xsl:choose>
|
|
||||||
<xsl:when test="@key = 'dc.creator'">
|
|
||||||
<meta>
|
|
||||||
<xsl:attribute name="name">author</xsl:attribute>
|
|
||||||
<xsl:attribute name="content">
|
|
||||||
<xsl:value-of select="."/>
|
|
||||||
</xsl:attribute>
|
|
||||||
</meta><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:when>
|
|
||||||
<xsl:when test="@key = 'abiword.keywords'">
|
|
||||||
<meta>
|
|
||||||
<xsl:attribute name="name">keywords</xsl:attribute>
|
|
||||||
<xsl:attribute name="content">
|
|
||||||
<xsl:value-of select="."/>
|
|
||||||
</xsl:attribute>
|
|
||||||
</meta><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:when>
|
|
||||||
<xsl:when test="@key = 'dc.subject'">
|
|
||||||
<meta>
|
|
||||||
<xsl:attribute name="name">keywords</xsl:attribute>
|
|
||||||
<xsl:attribute name="content">
|
|
||||||
<xsl:value-of select="."/>
|
|
||||||
</xsl:attribute>
|
|
||||||
</meta><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:when>
|
|
||||||
<xsl:when test="@key = 'dc.description'">
|
|
||||||
<meta>
|
|
||||||
<xsl:attribute name="name">abstract</xsl:attribute>
|
|
||||||
<xsl:attribute name="content">
|
|
||||||
<xsl:value-of select="."/>
|
|
||||||
</xsl:attribute>
|
|
||||||
</meta><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:when>
|
|
||||||
<xsl:when test="@key = 'dc.title'">
|
|
||||||
<title><xsl:value-of select="."/></title><xsl:text>
|
|
||||||
</xsl:text>
|
|
||||||
</xsl:when>
|
|
||||||
<xsl:otherwise>
|
|
||||||
</xsl:otherwise>
|
|
||||||
</xsl:choose>
|
|
||||||
</xsl:for-each>
|
|
||||||
</xsl:template>
|
|
||||||
|
|
||||||
</xsl:stylesheet>
|
|
||||||
'''
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
proto = rclexecm.RclExecM()
|
|
||||||
extract = rclgenxslt.XSLTExtractor(proto, stylesheet_all)
|
|
||||||
rclexecm.main(proto, extract)
|
|
||||||
@@ -1,112 +0,0 @@
|
|||||||
#!/usr/bin/env python3
|
|
||||||
# Copyright (C) 2014 J.F.Dockes
|
|
||||||
# This program is free software; you can redistribute it and/or modify
|
|
||||||
# it under the terms of the GNU General Public License as published by
|
|
||||||
# the Free Software Foundation; either version 2 of the License, or
|
|
||||||
# (at your option) any later version.
|
|
||||||
#
|
|
||||||
# This program is distributed in the hope that it will be useful,
|
|
||||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
||||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
||||||
# GNU General Public License for more details.
|
|
||||||
#
|
|
||||||
# You should have received a copy of the GNU General Public License
|
|
||||||
# along with this program; if not, write to the
|
|
||||||
# Free Software Foundation, Inc.,
|
|
||||||
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
|
|
||||||
######################################
|
|
||||||
|
|
||||||
from __future__ import print_function
|
|
||||||
|
|
||||||
import sys
|
|
||||||
import rclexecm
|
|
||||||
import rclgenxslt
|
|
||||||
|
|
||||||
|
|
||||||
stylesheet_all = '''<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
  xmlns:ooo="http://openoffice.org/2004/office"
  xmlns:gnm="http://www.gnumeric.org/v10.dtd"

  exclude-result-prefixes="office xlink meta ooo dc"
  >

<xsl:output method="html" encoding="UTF-8"/>

<xsl:template match="/">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <xsl:apply-templates select="//office:document-meta/office:meta"/>
  </head>

  <body>
    <xsl:apply-templates select="//gnm:Cells"/>
    <xsl:apply-templates select="//gnm:Objects"/>
  </body>
</html>
</xsl:template>

<xsl:template match="//dc:date">
  <meta>
    <xsl:attribute name="name">date</xsl:attribute>
    <xsl:attribute name="content"><xsl:value-of select="."/></xsl:attribute>
  </meta>
</xsl:template>

<xsl:template match="//dc:description">
  <meta>
    <xsl:attribute name="name">abstract</xsl:attribute>
    <xsl:attribute name="content"><xsl:value-of select="."/></xsl:attribute>
  </meta>
</xsl:template>

<xsl:template match="//meta:keyword">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content"><xsl:value-of select="."/></xsl:attribute>
  </meta>
</xsl:template>

<xsl:template match="//dc:subject">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content"><xsl:value-of select="."/></xsl:attribute>
  </meta>
</xsl:template>

<xsl:template match="//dc:title">
  <title> <xsl:value-of select="."/> </title>
</xsl:template>

<xsl:template match="//meta:initial-creator">
  <meta>
    <xsl:attribute name="name">author</xsl:attribute>
    <xsl:attribute name="content"><xsl:value-of select="."/></xsl:attribute>
  </meta>
</xsl:template>

<xsl:template match="office:meta/*"/>

<xsl:template match="gnm:Cell">
  <p><xsl:value-of select="."/></p>
</xsl:template>

<xsl:template match="gnm:CellComment">
  <blockquote><xsl:value-of select="@Text"/></blockquote>
</xsl:template>

</xsl:stylesheet>
'''


if __name__ == '__main__':
    proto = rclexecm.RclExecM()
    extract = rclgenxslt.XSLTExtractor(proto, stylesheet_all, gzip=True)
    rclexecm.main(proto, extract)

@@ -1,70 +0,0 @@
#!/usr/bin/env python3
# Copyright (C) 2014 J.F.Dockes
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
######################################
from __future__ import print_function

import sys
import rclexecm
import rclgenxslt

stylesheet_all = '''<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" encoding="UTF-8"/>
<xsl:strip-space elements="*" />

<xsl:template match="/">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>
      Okular notes about: <xsl:value-of select="/documentInfo/@url" />
    </title>
  </head>
  <body>
    <xsl:apply-templates />
  </body>
</html>
</xsl:template>

<xsl:template match="node()">
  <xsl:apply-templates select="@* | node() "/>
</xsl:template>

<xsl:template match="text()">
  <p><xsl:value-of select="."/></p>
  <xsl:text >
</xsl:text>
</xsl:template>

<xsl:template match="@contents|@author">
  <p><xsl:value-of select="." /></p>
  <xsl:text >
</xsl:text>
</xsl:template>

<xsl:template match="@*"/>

</xsl:stylesheet>
'''

if __name__ == '__main__':
    proto = rclexecm.RclExecM()
    extract = rclgenxslt.XSLTExtractor(proto, stylesheet_all)
    rclexecm.main(proto, extract)

@@ -1,137 +0,0 @@
#!/usr/bin/env python3
# Copyright (C) 2014-2018 J.F.Dockes
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
######################################

import sys
import rclexecm
import rclgenxslt

stylesheet = '''<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
  xmlns:ooo="http://openoffice.org/2004/office"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  exclude-result-prefixes="office xlink meta ooo dc text"
>

<xsl:output method="html" encoding="UTF-8"/>

<xsl:template match="/">
<html>
  <head>
    <xsl:apply-templates select="/office:document/office:meta" />
  </head>
  <body>
    <xsl:apply-templates select="/office:document/office:body" />
  </body></html>
</xsl:template>


<xsl:template match="/office:document/office:meta">
  <xsl:apply-templates select="dc:title"/>
  <xsl:apply-templates select="dc:description"/>
  <xsl:apply-templates select="dc:subject"/>
  <xsl:apply-templates select="meta:keyword"/>
  <xsl:apply-templates select="dc:creator"/>
</xsl:template>

<xsl:template match="/office:document/office:body">
  <xsl:apply-templates select=".//text:p" />
  <xsl:apply-templates select=".//text:h" />
  <xsl:apply-templates select=".//text:s" />
  <xsl:apply-templates select=".//text:line-break" />
  <xsl:apply-templates select=".//text:tab" />
</xsl:template>

<xsl:template match="dc:title">
  <title> <xsl:value-of select="."/> </title><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:description">
  <meta>
    <xsl:attribute name="name">abstract</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:subject">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:creator">
  <meta>
    <xsl:attribute name="name">author</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="meta:keyword">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="office:body//text:p">
  <p><xsl:apply-templates/></p><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="office:body//text:h">
  <p><xsl:apply-templates/></p><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="office:body//text:s">
  <xsl:text> </xsl:text>
</xsl:template>

<xsl:template match="office:body//text:line-break">
  <br />
</xsl:template>

<xsl:template match="office:body//text:tab">
  <xsl:text> </xsl:text>
</xsl:template>

</xsl:stylesheet>
'''

if __name__ == '__main__':
    proto = rclexecm.RclExecM()
    extract = rclgenxslt.XSLTExtractor(proto, stylesheet)
    rclexecm.main(proto, extract)
@@ -1,170 +0,0 @@
#!/usr/bin/env python3
# Copyright (C) 2014 J.F.Dockes
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
######################################

from __future__ import print_function

import sys
import rclexecm
import rclxslt
from rclbasehandler import RclBaseHandler
from zipfile import ZipFile

stylesheet_meta = '''<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
  xmlns:ooo="http://openoffice.org/2004/office"
  exclude-result-prefixes="office xlink meta ooo dc"
  >

<xsl:output method="html" encoding="UTF-8"/>

<xsl:template match="/office:document-meta">
  <xsl:apply-templates select="office:meta/dc:description"/>
  <xsl:apply-templates select="office:meta/dc:subject"/>
  <xsl:apply-templates select="office:meta/dc:title"/>
  <xsl:apply-templates select="office:meta/meta:keyword"/>
  <xsl:apply-templates select="office:meta/dc:creator"/>
</xsl:template>

<xsl:template match="dc:title">
  <title> <xsl:value-of select="."/> </title><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:description">
  <meta>
    <xsl:attribute name="name">abstract</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:subject">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:creator">
  <meta>
    <xsl:attribute name="name">author</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="meta:keyword">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

</xsl:stylesheet>
'''

stylesheet_content = '''<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  exclude-result-prefixes="text"
>

<xsl:output method="html" encoding="UTF-8"/>

<xsl:template match="text:p">
  <p><xsl:apply-templates/></p><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="text:h">
  <p><xsl:apply-templates/></p><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="text:s">
  <xsl:text> </xsl:text>
</xsl:template>

<xsl:template match="text:line-break">
  <br />
</xsl:template>

<xsl:template match="text:tab">
  <xsl:text> </xsl:text>
</xsl:template>

</xsl:stylesheet>
'''

class OOExtractor(RclBaseHandler):
    def __init__(self, em):
        super(OOExtractor, self).__init__(em)


    def html_text(self, fn):

        # Note the space before content=, so the two attributes stay separate
        docdata = b'<html>\n<head>\n<meta http-equiv="Content-Type" ' \
                  b'content="text/html; charset=UTF-8">'

        # Open the archive in a with block so the file is closed on exit
        with ZipFile(fn) as zip:
            # Wrap metadata extraction because it can sometimes throw
            # while the main text will be valid
            try:
                metadata = zip.read("meta.xml")
                if metadata:
                    res = rclxslt.apply_sheet_data(stylesheet_meta, metadata)
                    docdata += res
            except Exception:
                # To be checked. I'm under the impression that I get this when
                # nothing matches?
                #self.em.rclog("No/bad metadata in %s" % fn)
                pass

            docdata += b'</head>\n<body>\n'

            content = zip.read("content.xml")
            if content:
                res = rclxslt.apply_sheet_data(stylesheet_content, content)
                docdata += res
        docdata += b'</body></html>'

        return docdata


if __name__ == '__main__':
    proto = rclexecm.RclExecM()
    extract = OOExtractor(proto)
    rclexecm.main(proto, extract)
@@ -1,105 +0,0 @@
#!/usr/bin/env python3
# Copyright (C) 2014 J.F.Dockes
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
######################################
from __future__ import print_function

import sys
import rclexecm
import rclgenxslt

stylesheet_all = '''<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:svg="http://www.w3.org/2000/svg"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  exclude-result-prefixes="svg"
  >

<xsl:output method="html" encoding="UTF-8"/>

<xsl:template match="/">
<html>
  <head>
    <xsl:apply-templates select="svg:svg/svg:title"/>
    <xsl:apply-templates select="svg:svg/svg:desc"/>
    <xsl:apply-templates select="svg:svg/svg:metadata/descendant::dc:creator"/>
    <xsl:apply-templates select="svg:svg/svg:metadata/descendant::dc:subject"/>
    <xsl:apply-templates select="svg:svg/svg:metadata/descendant::dc:description"/>
  </head>
  <body>
    <xsl:apply-templates select="//svg:text"/>
  </body>
</html>
</xsl:template>

<xsl:template match="svg:desc">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:creator">
  <meta>
    <xsl:attribute name="name">author</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:subject">
  <meta>
    <xsl:attribute name="name">keywords</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="dc:description">
  <meta>
    <xsl:attribute name="name">description</xsl:attribute>
    <xsl:attribute name="content">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </meta><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="svg:title">
  <title><xsl:value-of select="."/></title><xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="svg:text">
  <p><xsl:value-of select="."/></p><xsl:text>
</xsl:text>
</xsl:template>

</xsl:stylesheet>
'''

if __name__ == '__main__':
    proto = rclexecm.RclExecM()
    extract = rclgenxslt.XSLTExtractor(proto, stylesheet_all)
    rclexecm.main(proto, extract)