XMLC Tutorial: Creating Dynamic Content

Contents

  1. Using <SPAN> tags and id Attributes
  2. A Sample HTML Page
  3. The Associated Document Object Model (DOM)
  4. Resulting Java Class
  5. Usage from a Manipulation Java Class
  6. Resulting HTML
  7. Adding a Node

Using <SPAN> tags and id Attributes

This chapter shows how XMLC can be used to build a dynamic HTML page from a static HTML template file. This separation is very useful: static HTML templates can be created and modified in HTML creation tools, whereas the code that manipulates them is kept in completely separate files. Programmers and artists can cooperate on XMLC-based projects without stepping on each other's work.

The first two XMLC features to be covered are the most commonly used ones. Both involve marking up an HTML template with valid but little-used HTML 4.0 constructs: the id attribute and <SPAN> tags. Both can be created in HTML builder tools which are HTML 4.0 compliant.

id attributes are used to name HTML tags in such a way that XMLC can find them. Once a reference to the tag is established, XMLC working in conjunction with the W3C DOM Java classes can provide access methods which are appropriate to the given tag type. That is, an HTML header will have access methods which make sense for an HTML header.

However, having access to HTML tags may not be sufficient. What if we wish to dynamically generate bits of text in an HTML document? For that, we use <SPAN> tags. Any text that we may wish to change is surrounded by <SPAN> tags. Note that the opening <SPAN> tag also carries an id attribute.

Most HTML that is created as input to XMLC is expected to come from HTML design tools, but not all design tools support HTML 4.0. If you are using an older design tool, you may not be able to add <SPAN> tags directly. Most design tools do, however, have a facility for adding new or unsupported tags.

All id attributes must be given a name that is unique within the HTML document. These names must also be legal Java identifiers, since they will be used as (or part of) Java variable names. To review, legal Java identifiers are case sensitive and must begin with a letter, an underscore ('_') or a dollar sign ('$'). Subsequent characters can also be digits ('0' to '9'). Naturally, Java identifiers cannot clash with Java reserved words (e.g. 'abstract' or 'boolean').
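The identifier rules above can be checked mechanically before a template is handed to XMLC. The following is a minimal sketch using only the standard JDK; the class IdCheck and its method isLegalId() are hypothetical names chosen for illustration and are not part of XMLC.

```java
import javax.lang.model.SourceVersion;

// Hypothetical helper, not part of XMLC: reports whether an id value
// could also serve as a Java identifier.
final class IdCheck {

    static boolean isLegalId(String id) {
        // isIdentifier() applies the character rules (first character,
        // subsequent characters); isKeyword() rejects reserved words
        // such as "abstract" or "boolean".
        return SourceVersion.isIdentifier(id) && !SourceVersion.isKeyword(id);
    }

    public static void main(String[] args) {
        String[] candidates = {"para1", "_title", "$total", "1para", "my-id", "abstract"};
        for (String id : candidates) {
            System.out.println(id + " -> " + isLegalId(id));
        }
    }
}
```

Note that this sketch does not check uniqueness within the document; that would need a pass over all id attributes in the template.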

A Sample HTML Page

Let's look at a sample HTML page that shows these concepts. Consider this page, which we will name "hello.html":


<HTML>
<HEAD>
    <TITLE>Hello, World</TITLE>
</HEAD>

<BODY BGCOLOR=#FFFFFF TEXT="#000000">

<H1>Hello, World</H1>

This is a text of XMLC, showing how one can change text
located in SPAN tags.  Text outside of those tags and not within
another tag with an id attribute cannot be changed.

</BODY>
</HTML>

Now, let's make some minor changes to the page so that XMLC can access its contents in an efficient manner. Note that XMLC depends on legal HTML/XML markup that will not interfere with other uses of the tags. HTML browsers are required to ignore tags and attributes that they do not understand or use.


<HTML>
<HEAD>
    <TITLE id="title">Hello, World</TITLE>
</HEAD>

<BODY BGCOLOR=#FFFFFF TEXT="#000000">

<H1>Hello, World</H1>

<SPAN id="para1">This is a text of XMLC, showing how one can change text 
located in SPAN tags.</SPAN>  Text outside of those tags and not within 
another tag with an id attribute cannot be changed.

</BODY>
</HTML>

All that we have added is a single id attribute on the <TITLE> tag, called "title", and a <SPAN> tag surrounding part of the text, called "para1". We will show how these minor additions allow XMLC to create Java code that can manipulate these parts of the document.

The Associated Document Object Model (DOM)

XMLC has a command line option that lets us view the DOM for a document. This is accomplished with the "-dump" option on the xmlc command line:


$ $ENHYDRA/bin/xmlc -dump hello.html

If we look at the DOM for the first HTML page (without the id attribute or <SPAN> tag), we see a familiar looking structure which represents the HTML:


DOM hierarchy:
    HTMLDocument:null DocumentType
        HTMLHtmlElement: HTML
            HTMLHeadElement: HEAD
                HTMLTitleElement: TITLE
                    Text: Hello, World
            HTMLBodyElement: BODY: bgcolor='#FFFFFF' text='#000000'
                HTMLHeadingElement: H1
                    Text: Hello, World
                Text: This is a text of XMLC, showing how one can change text located in SPAN tags. Text outside of those tags and not within another tag with an id attribute cannot be changed.

If we look at the DOM for the second HTML page, we will see some nice additions:


DOM hierarchy:
    HTMLDocument:null DocumentType
        HTMLHtmlElement: HTML
            HTMLHeadElement: HEAD
                HTMLTitleElement: TITLE: id='title'
                    Text: Hello, World
            HTMLBodyElement: BODY: bgcolor='#FFFFFF' text='#000000'
                HTMLHeadingElement: H1
                    Text: Hello, World
                HTMLElement: SPAN: id='para1'
                    Text: This is a text of XMLC, showing how one can change text located in SPAN tags.
                Text:  Text outside of those tags and not within another tag with an id attribute cannot be changed.

The DOM now shows an id for the title. You will also notice that the paragraph text is split into two parts. The part outside of the <SPAN> tag is represented by a Text node, as before. The part within the <SPAN>, however, now has an HTMLElement node of its own, with an id and an associated Text node. We shall see what happens to those elements with ids in the Java code shortly.
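To get a feel for how such a hierarchy is produced, here is a minimal sketch that parses a well-formed stand-in for the page with the JDK's own XML parser and prints a similar indented tree. It is not XMLC's implementation of "-dump": the class DomDump, the shortened page text and the output format are assumptions made for illustration, and the node names are generic rather than the HTML-specific ones XMLC reports.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLC.
final class DomDump {

    // Well-formed, shortened stand-in for the marked-up hello.html.
    static final String PAGE =
        "<HTML><HEAD><TITLE id=\"title\">Hello, World</TITLE></HEAD>"
        + "<BODY><H1>Hello, World</H1>"
        + "<SPAN id=\"para1\">Changeable text.</SPAN> Fixed text.</BODY></HTML>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse page", e);
        }
    }

    // Returns one line per element or text node, indented by depth.
    static String dump(Node node, String indent) {
        StringBuilder out = new StringBuilder();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(indent).append("Element: ").append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(": ").append(attr.getNodeName())
                   .append("='").append(attr.getNodeValue()).append("'");
            }
            out.append('\n');
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(indent).append("Text: ").append(node.getNodeValue()).append('\n');
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            out.append(dump(child, indent + "    "));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump(parse(PAGE).getDocumentElement(), ""));
    }
}
```

As in the XMLC dump, the text inside the <SPAN> and the text after it show up as two separate Text nodes at different depths.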

Resulting Java Class

If we wish to keep the Java source code (and we do, for the purposes of this tutorial), then we use the "-keep" command line option to xmlc:


$ $ENHYDRA/bin/xmlc -keep hello.html

This will create both a compiled Java class file ('hello.class') and a source code file ('hello.java').

The Java class that results from the above example provides us with access to each element named with an id or a <SPAN>. Although it may look a bit strange to you if you are not yet comfortable with Java, don't be concerned. We will show you the important parts. Here is 'hello.java':


/*
 ************************************
 * XMLC GENERATED CODE, DO NOT EDIT *
 ************************************
 */
import org.w3c.dom.*;
import org.enhydra.xml.xmlc.XMLCUtil;
import org.enhydra.xml.xmlc.XMLCError;
import org.enhydra.xml.xmlc.dom.XMLCDomFactory;

public class hello extends org.enhydra.xml.xmlc.html.HTMLObjectImpl {
    /**
     * Field that is used to identify this as an XMLC
     * generated class.  Contains an reference to the
     * class object.
     */
    public static final Class XMLC_GENERATED_CLASS = hello.class;

    /**
     * Field containing CLASSPATH relative name of the source file
     * that this class was generated from.
     */
    public static final String XMLC_SOURCE_FILE = "nullhello.html";

    /**
     * Get the element with id para1.
     * @see org.w3c.dom.html.HTMLElement
     */
    public org.w3c.dom.html.HTMLElement getElementPara1() {
        return $elementpara1;
    }
    private org.w3c.dom.html.HTMLElement $elementpara1;

    /**
     * Get the value of text child of element para1.
     * @see org.w3c.dom.Text
     */
    public void setTextPara1(String text) {
        XMLCUtil.getFirstText($elementpara1).setData(text);
    }

    /**
     * Get the element with id title.
     * @see org.w3c.dom.html.HTMLTitleElement
     */
    public org.w3c.dom.html.HTMLTitleElement getElementTitle() {
        return $elementtitle;
    }
    private org.w3c.dom.html.HTMLTitleElement $elementtitle;

    /**
     * Create document as a DOM and initialize accessor method fields.
     */
    public void buildDocument() {
        XMLCDomFactory domFactory = org.enhydra.xml.xmlc.dom.XMLCDomFactoryCache.getFactory("org.enhydra.xml.xmlc.dom.DefaultHTMLDomFactory");
        Document document = domFactory.createDocument(null, null);
        setDocument(document);

        domFactory.setErrorChecking(document, false);

        Node $node0, $node1, $node2, $node3;
        Element $elem0, $elem1, $elem2;
        Attr $attr0, $attr1, $attr2;

        $elem0 = document.getDocumentElement();
        $elem1 = document.createElement("HEAD");;
        $elem0.appendChild($elem1);
        
        $elem2 = document.createElement("TITLE");;
        $elem1.appendChild($elem2);
        
        $elementtitle = (org.w3c.dom.html.HTMLTitleElement)$elem2;
        $attr2 = document.createAttribute("id");
        $elem2.setAttributeNode($attr2);
        
        $node3 = document.createTextNode("title");;
        $attr2.appendChild($node3);
        
        $node3 = document.createTextNode("Hello, World");;
        $elem2.appendChild($node3);
        
        $elem1 = document.createElement("BODY");;
        $elem0.appendChild($elem1);
        
        $attr1 = document.createAttribute("bgcolor");
        $elem1.setAttributeNode($attr1);
        
        $node2 = document.createTextNode("#FFFFFF");;
        $attr1.appendChild($node2);
        
        $attr1 = document.createAttribute("text");
        $elem1.setAttributeNode($attr1);
        
        $node2 = document.createTextNode("#000000");;
        $attr1.appendChild($node2);
        
        $elem2 = document.createElement("H1");;
        $elem1.appendChild($elem2);
        
        $node3 = document.createTextNode("Hello, World");;
        $elem2.appendChild($node3);
        
        $elem2 = document.createElement("SPAN");;
        $elem1.appendChild($elem2);
        
        $elementpara1 = (org.w3c.dom.html.HTMLElement)$elem2;
        $attr2 = document.createAttribute("id");
        $elem2.setAttributeNode($attr2);
        
        $node3 = document.createTextNode("para1");;
        $attr2.appendChild($node3);
        
        $node3 = document.createTextNode("This is a text of XMLC, showing how one can change text located in SPAN tags.");;
        $elem2.appendChild($node3);
        
        $node2 = document.createTextNode(" Text outside of those tags and not within another tag with an id attribute cannot be changed.");;
        $elem1.appendChild($node2);
        

        domFactory.setErrorChecking(document, true);

    }

    /**
     * Recursize function to do set access method fields from the DOM.
     * Missing ids have fields set to null.
     */
    protected void syncWithDocument(Node node) {
        if (node instanceof Element) {
            String id = ((Element)node).getAttribute("id");
            if (id.length() == 0) {
            } else if (id.equals("para1")) {
                $elementpara1 = (org.w3c.dom.html.HTMLElement)node;
            } else if (id.equals("title")) {
                $elementtitle = (org.w3c.dom.html.HTMLTitleElement)node;
            }
        }
    }

    /**
     * Default constructor.
     */
    public hello() {
        buildDocument();
    }

    /**
     * Constructor with optional building of the DOM. 
     *
     * @param buildDOM If false, the DOM will not be built until
     * buildDocument() is called by the derived class.  If true, 
     * the DOM is built immediatly.
     */
    public hello(boolean buildDOM) {
        if (buildDOM) {
            buildDocument();
        }
    }

    /**
     * Copy constructor.
     * @param src The document to clone.
     */
    public hello(hello src) {
        setDocument((Document)src.getDocument().cloneNode(true));
        syncAccessMethods();
    }

    /**
      * Clone the document.
      * @param deep Must be true, only deep clone is supported.
      */
    public Node cloneNode(boolean deep) {
        cloneDeepCheck(deep);
        return new hello(this);
    }

} 

The first thing to notice is that this class extends org.enhydra.xml.xmlc.html.HTMLObjectImpl. If we look at that class, we find that it is an abstract class which provides a number of useful methods. (Note that Javadoc is located at $ENHYDRA/doc/user-doc/index.html, and the source for this class is at $ENHYDRA/modules/XMLC/src/org/enhydra/xml/xmlc/html/HTMLObjectImpl.java in the src distribution.) Since our class extends that abstract class, we gain access to all of those methods. The most interesting one is:

Get methods exist for other HTML elements as well, such as getApplets(), getImages(), getLinks(), getForms(), getAnchors(), etc. These methods return an object of type HTMLCollection - we'll leave that to the reader to research.

The next thing to notice is the accessor methods near the top of the class: getElementPara1(), setTextPara1() and getElementTitle(). These methods extend the methods available to any HTMLObject. Note that we have been given a getElement* method for every HTML element that we gave an id attribute. We also have getElement* and setText* methods for every bit of text that we surrounded with <SPAN> tags.

We have getElement* methods for the elements with the ids "title" and "para1" (getElementTitle() and getElementPara1()), and a setText* method for "para1" (setTextPara1()).

For any tag with an id attribute, the associated getElement* method returns an object of type org.w3c.dom.html.HTMLElement (or one of its subclasses). In fact, every type of tag has its own subclass of this interface. Thus, our "title" element has a getElement* method called "getElementTitle()", which returns an org.w3c.dom.html.HTMLTitleElement object.

If we look at the Javadoc for org.w3c.dom.html.HTMLTitleElement, we would see all of the methods we can call to manipulate our HTML page's title, "title". The next section shows an example of this.

Similarly, the text that we wrapped in <SPAN> tags has a method called getElementPara1(), which returns just an org.w3c.dom.html.HTMLElement object.

The setText* methods for <SPAN> tags are necessary since they refer to text that is not otherwise bounded by tags of its own. That is, a <SPAN> refers to a subset of a larger block of text, such as a paragraph. XMLC uses the <SPAN> tags as a cue to create the necessary methods.
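To make the accessor pattern concrete, here is a hand-written sketch of roughly what such a class does, using only the JDK's DOM classes. It is not the code XMLC generates: the class HelloSketch is a hypothetical name, the accessors return plain org.w3c.dom.Element rather than the HTML-specific interfaces, and firstText() is our own stand-in for XMLCUtil.getFirstText().

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical hand-written analogue of an XMLC page class.
final class HelloSketch {
    private final Document document;
    private Element elementPara1;
    private Element elementTitle;

    HelloSketch() {
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("could not create DOM", e);
        }
        Element html = add(document, "HTML", null, null);
        Element head = add(html, "HEAD", null, null);
        add(head, "TITLE", "title", "Hello, World");
        Element body = add(html, "BODY", null, null);
        add(body, "SPAN", "para1", "Changeable text.");
        sync(html);
    }

    // Creates a child element with an optional id and optional text.
    private Element add(Node parent, String tag, String id, String text) {
        Element e = document.createElement(tag);
        if (id != null) e.setAttribute("id", id);
        if (text != null) e.appendChild(document.createTextNode(text));
        parent.appendChild(e);
        return e;
    }

    // Walks the tree and records the elements whose ids we know about.
    private void sync(Node node) {
        if (node instanceof Element) {
            String id = ((Element) node).getAttribute("id");
            if (id.equals("para1")) elementPara1 = (Element) node;
            else if (id.equals("title")) elementTitle = (Element) node;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) sync(c);
    }

    Element getElementPara1() { return elementPara1; }
    Element getElementTitle() { return elementTitle; }

    void setTextPara1(String text) { firstText(elementPara1).setData(text); }

    private static Text firstText(Node parent) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Text) return (Text) c;
        }
        throw new IllegalStateException("element has no text child");
    }

    public static void main(String[] args) {
        HelloSketch page = new HelloSketch();
        page.setTextPara1("We changed this!");
        System.out.println(page.getElementPara1().getTextContent());
    }
}
```

The point of the sketch is the division of labour: the id locates the element, and the setText* method only touches the first Text node beneath it.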

Usage from a Manipulation Java Class

At this point, we have a Java class that represents our HTML document. We also have methods that we can call to change the values of the HTML title and some specially-identified text ("para1"). Now we need a Java class to use this class. We will call this new class a "manipulation" class. It is responsible for creating an instance of the page class, changing its dynamic content, and producing the final HTML.

We have given the <TITLE> tag an id attribute, so we can change it. We also want to change the text within the <SPAN> tags. A Java manipulation class that makes these changes and prints the final HTML page looks like this:


import org.w3c.dom.html.*;
public class hello_creator {

    public static void main (String[] args) {

        // Create an instance of the HTML page object.
        hello hello = new hello();

        // Get a reference to the title and change it.
        HTMLTitleElement title = hello.getElementTitle();
        title.setText("Hello, New World!");

        // Change some text within the <SPAN> tags.
        hello.setTextPara1("We changed this!");

        // Print out the results.
        System.out.print( hello.toDocument() );

    }

}

As you can see, this example is a simple class that can be run from a command line and prints to STDOUT. Within the confines of Enhydra, we could replace this class with a presentation object as we shall show later.

Resulting HTML

We now have two Java classes: one that represents the HTML template and another that creates the final dynamic HTML page. The following commands compile and run the manipulation class:


$ javac hello_creator.java
$ java hello_creator

When the toDocument() method of the page class is finally called, it will create the HTML that the user will see. Here is the HTML page created by the above example:


<HTML>
  <HEAD>
    <TITLE id='title'>Hello, New World!</TITLE>
  </HEAD>
  <BODY bgcolor='#FFFFFF' text='#000000'>
    <H1>Hello, World</H1>
    <SPAN>We changed this!</SPAN>
    Text outside of those tags and not within another tag with an id
    attribute cannot be changed.
  </BODY>
</HTML>

You should note that the HTML that is actually generated does not maintain the formatting shown above. That is, there are no line separators or whitespace between the tags. This is most efficient for both network transport and machine parsing, but it does make it hard for a human to read. We have included formatting here for ease of reading.
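The effect is easy to reproduce with the JDK alone. The following is a minimal sketch, assuming a small DOM built by hand; it serializes the tree with the standard javax.xml.transform API and indentation switched off. The class CompactOutput is a hypothetical name, and this is not how XMLC's toDocument() is implemented.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of XMLC.
final class CompactOutput {

    // Builds a tiny page: <HTML><BODY><H1>..</H1><SPAN id="para1">..</SPAN></BODY></HTML>
    static Document samplePage() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("HTML");
            Element body = doc.createElement("BODY");
            Element h1 = doc.createElement("H1");
            Element span = doc.createElement("SPAN");
            span.setAttribute("id", "para1");
            h1.appendChild(doc.createTextNode("Hello, World"));
            span.appendChild(doc.createTextNode("We changed this!"));
            body.appendChild(h1);
            body.appendChild(span);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not build page", e);
        }
    }

    // Writes the DOM as markup with no added line breaks or indentation.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize page", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(samplePage()));
    }
}
```

Because the DOM itself contains no whitespace-only Text nodes between the tags, the serialized page comes out on a single line.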

Adding a Node

Let's get more complicated.  Let's say that you wanted to replace the contents of a <SPAN> tag with some text that included a hyperlink or an HTML header.  This is easy to do by creating an HTML node, then inserting it in the right place.

This example started life as a question on the Enhydra mailing list and was answered by Mark Diekhans.

Suppose we wanted to replace <span id="replaceme"> Replace Me with some text </span> with something like <h1> Hello World </h1> <A HREF=Welcome.po> Welcome Page </A>.

To add HTML elements to an XMLC generated object, which is a DOM Document object, you must create new DOM element objects and add them as children to the parent element.

To do this, use the createElement() method of the DOM Document.  The following (untested) code demonstrates this:


import org.w3c.dom.*;
import org.w3c.dom.html.*;

// HelloHTML is assumed to be the XMLC-generated class for a page
// that contains <span id="replaceme">.
HelloHTML htmlObj = new HelloHTML();

// Construct heading
HTMLHeadingElement head = (HTMLHeadingElement) htmlObj.createElement("h1");
Text headText = htmlObj.createTextNode("Hello World");
head.appendChild(headText);

// Construct anchor
HTMLAnchorElement anchor = (HTMLAnchorElement) htmlObj.createElement("a");
anchor.setHref("Welcome.po");
Text anchorText = htmlObj.createTextNode("Welcome Page");
anchor.appendChild(anchorText);

// Locate the id-labeled node and its parent.
Element replace = htmlObj.getElementReplaceme();
Node parent = replace.getParentNode();

// Start with the last new child so we can use insertBefore
parent.replaceChild(anchor, replace);
parent.insertBefore(head, anchor);
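
The same replaceChild()/insertBefore() sequence can be tried without XMLC at all. The following is a minimal sketch using only the JDK's DOM classes on a small hand-built page; the class ReplaceSpan and its method names are hypothetical, and plain Element objects stand in for the HTML-specific interfaces.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical stand-alone demo, not part of XMLC.
final class ReplaceSpan {

    // Builds <html><body><span id="replaceme">Replace Me</span></body></html>.
    static Document buildPage() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element span = doc.createElement("span");
            span.setAttribute("id", "replaceme");
            span.appendChild(doc.createTextNode("Replace Me"));
            body.appendChild(span);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not build page", e);
        }
    }

    // Swaps the span for a heading followed by a link.
    static void replaceWithHeadingAndLink(Document doc, Element span) {
        Element head = doc.createElement("h1");
        head.appendChild(doc.createTextNode("Hello World"));

        Element anchor = doc.createElement("a");
        anchor.setAttribute("href", "Welcome.po");
        anchor.appendChild(doc.createTextNode("Welcome Page"));

        Node parent = span.getParentNode();
        // Put the last new child in the span's place first, then insert
        // the heading in front of it.
        parent.replaceChild(anchor, span);
        parent.insertBefore(head, anchor);
    }

    public static void main(String[] args) {
        Document doc = buildPage();
        Element span = (Element) doc.getElementsByTagName("span").item(0);
        replaceWithHeadingAndLink(doc, span);
        Node body = doc.getDocumentElement().getFirstChild();
        for (Node c = body.getFirstChild(); c != null; c = c.getNextSibling()) {
            System.out.println(c.getNodeName() + ": " + c.getTextContent());
        }
    }
}
```

Replacing first and inserting second means the new anchor serves as the reference node for insertBefore(), so the original position of the <SPAN> never has to be remembered separately.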