XMLC Tutorial: Manipulating URLs

Contents

  1. Using XMLC with URLs
  2. A Sample HTML Page
  3. Resulting Java Class
  4. Resulting HTML

Using XMLC with URLs

This chapter shows how XMLC's -urlmapping and -urlsetting command line options can be used to map URLs from generic addresses to values specific to a particular project. The -docout option is also demonstrated.

These command line options are documented at the Enhydra site.

The -urlmapping option takes two arguments. The input HTML page is searched for URLs that exactly match the first argument. Each matching URL is replaced in the output by the second argument.
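
As the sample output later in this chapter shows, a URL is replaced only when the whole URL matches the first argument. That rule can be pictured with a small shell function. This is only a model of the behaviour described here, not XMLC's implementation; the function name map_url is hypothetical, and the URLs are the ones used in the sample page.

```shell
# Hypothetical model of -urlmapping: a URL is replaced only when the
# whole string equals the first argument of one of the mappings.
map_url() {
    case "$1" in
        "Corporate.gif")
            echo "AmazingStock.gif" ;;
        "http://www.CorporateHost/ProjectPage/index.html")
            echo "http://www.AmazingStock.com/apps/killer/index.html" ;;
        *)
            # No exact match: the URL passes through unchanged.
            echo "$1" ;;
    esac
}

# The first URL matches a mapping; the second differs by "#team"
# and is therefore left alone.
map_url "Corporate.gif"
map_url "http://www.CorporateHost/ProjectPage/index.html#team"
```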

The -urlsetting option takes two arguments. The input HTML page is searched for tags with an id attribute. If the value of the id attribute is the same as the first argument, and the tag also contains a URL, the URL is replaced in the output by the second argument.

The -docout option takes one argument, interpreted as a file name. This option short-circuits the generation of Java code. If it is present on the command line, XMLC writes HTML to the named file instead. The HTML is identical to the HTML that the generated Java class would have written. This can be useful where XMLC is being used to update the URLs in static pages, and where Enhydra applications have some static pages.

A Sample HTML Page

Consider the following HTML, which we will name "demo_urls.html".


<HTML>
<HEAD></HEAD>
<BODY>
<IMG SRC="Corporate.gif" ALIGN=LEFT>
<IMG SRC="Corporate.gif" ALIGN=RIGHT>
<A HREF="http://www.CorporateHost/ProjectPage/index.html">Project index</A>
<A HREF="http://www.CorporateHost/ProjectPage/index.html#team">Project team</A>
<A HREF="mailto:ProjectLeader@CorporateHost">Project leader</A>
<A HREF="http://www.CorporateHost/VeryOldPage" ID="defunct">First version</A>
<A HREF="http://www.CorporateHost/OldPage" ID="defunct">Second version</A>
</BODY>
</HTML>

Resulting Java Class

Because this example uses the -docout option, no Java code is generated.

Resulting HTML

The URL manipulation command line options can be invoked as follows.


$ENHYDRA/output/bin/xmlc \
    -urlmapping Corporate.gif AmazingStock.gif \
    -urlmapping http://www.CorporateHost/ProjectPage/index.html http://www.AmazingStock.com/apps/killer/index.html \
    -urlmapping mailto:ProjectLeader@CorporateHost mailto:catbert@AmazingStock.com \
    -urlsetting defunct http://www.CorporateHost/NotAvailable \
    -docout demo_urls_new.html demo_urls.html

This generates the following output. Note that only those URLs that exactly matched the first argument of a -urlmapping option were replaced. In particular, appending #team to a URL was sufficient to stop it being replaced.


<html>
<head></head>
<body>
<img align='left' src='AmazingStock.gif'></img>
<img align='right' src='AmazingStock.gif'></img>
<a href='http://www.AmazingStock.com/apps/killer/index.html'>Project index</a>
<a href='http://www.CorporateHost/ProjectPage/index.html#team'>Project team</a>
<a href='mailto:catbert@AmazingStock.com'>Project leader</a>
<a id='defunct' href='http://www.CorporateHost/NotAvailable'>First version</a>
<a id='defunct' href='http://www.CorporateHost/NotAvailable'>Second version</a>
</body>
</html>
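
To see at a glance which URLs still point at the old host, the generated file can be searched with grep. The following is a sketch that assumes the output file name demo_urls_new.html from the command above; the function name list_old_urls is hypothetical.

```shell
# Hypothetical helper: print every line of FILE that still mentions HOST.
# Usage: list_old_urls HOST FILE
list_old_urls() {
    # -n prefixes each match with its line number; -F treats HOST as a
    # literal string rather than a regular expression.
    grep -n -F -- "$1" "$2"
}

# Example (only runs if the generated file is present).
if [ -f demo_urls_new.html ]; then
    list_old_urls CorporateHost demo_urls_new.html
fi
```

For the output shown above, this lists the "Project team" link and the two "defunct" links, since those are the only lines whose URLs still contain CorporateHost.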

Where the URLs in several HTML files need to be updated, it might be useful to put the URL mapping options into a shell environment variable, as follows.


URL_MAPPINGS="-urlmapping Corporate.gif AmazingStock.gif \
    -urlmapping http://www.CorporateHost/ProjectPage/index.html http://www.AmazingStock.com/apps/killer/index.html \
    -urlmapping mailto:ProjectLeader@CorporateHost mailto:catbert@AmazingStock.com \
    -urlsetting defunct http://www.CorporateHost/NotAvailable"
export URL_MAPPINGS
$ENHYDRA/output/bin/xmlc $URL_MAPPINGS -docout demo_urls_new.html demo_urls.html
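
With the mappings held in a variable, updating a set of pages can be reduced to a short loop. The following is a sketch and not part of XMLC: the function name update_urls and the XMLC override variable are hypothetical, and the _new.html suffix simply follows the file naming used above.

```shell
# Hypothetical helper: run xmlc with $URL_MAPPINGS on each file given,
# writing the result for page.html to page_new.html.
# Set XMLC to use a different xmlc script than $ENHYDRA/output/bin/xmlc.
update_urls() {
    for page in "$@"; do
        # $URL_MAPPINGS is deliberately unquoted so that the shell
        # splits it back into separate options and arguments.
        "${XMLC:-$ENHYDRA/output/bin/xmlc}" $URL_MAPPINGS \
            -docout "${page%.html}_new.html" "$page" || return 1
    done
}

# Example: update_urls demo_urls.html other_page.html
```

The files are passed as arguments rather than matched with a *.html pattern, so that a second run does not pick up the _new.html files written by the first.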