Compiling HTML Pages

The xmlc command is used to run XMLC. All of the command options are explained in detail in the XMLC command reference page. It parses an HTML page and normally creates a Java class that contains a DOM representation of that page.

Basics of XMLC compilation

The XMLC compilation process is straightforward: normally only a class name, an output directory and an HTML file are required as parameters. For example:

    xmlc -d ../../classes -class app.presentation.user.UserTable ../html/usertable.html
generates an XMLC class named app.presentation.user.UserTable from the HTML file ../html/usertable.html and writes the resulting class file under the directory ../../classes.

Using the Enhydra make rules

If one is developing applications using the standard Enhydra make configuration, it is very easy to compile with XMLC. The rules compile an HTML file named in the form foo.html into a class named fooHTML in the package the makefile is associated with. The following variables, defined in stdrules.mk, are used when compiling HTML files with XMLC:

The following is an example Enhydra makefile for compiling four HTML objects. More examples may be found in the

    ROOT = ../../../..

    PACKAGEDIR = golfShop/presentation/xmlc/login

    HTML_DIR = ../../html/login
    HTML_XMLC_OPTS_FILE = login.xmlc

    HTML_CLASSES = LoginHTML \
                   LogoutHTML \
                   CheckVersionHTML \
                   NewAccountHTML

    include $(ROOT)/config.mk
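
As an illustration of the naming rule, the class generated for a given HTML file can be worked out with a small shell function. This is only a sketch: html_class_name is a hypothetical helper, not part of Enhydra, and deriving the package by replacing the slashes in PACKAGEDIR with dots is an assumption about how the make rules associate a makefile with a package.

```shell
# Hypothetical helper: print the class name expected for an HTML file.
# Arguments: path to the HTML file, value of PACKAGEDIR.
# foo.html -> fooHTML comes from the rule described above; the package
# prefix (PACKAGEDIR with "/" turned into ".") is an assumption.
html_class_name() {
    base=$(basename "$1" .html)
    package=$(printf '%s' "$2" | tr '/' '.')
    printf '%s.%sHTML\n' "$package" "$base"
}
```

By that rule, a file named Login.html in the HTML directory of the example makefile would correspond to the LoginHTML entry in HTML_CLASSES:

    html_class_name ../../html/login/Login.html golfShop/presentation/xmlc/login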

In certain cases, it may be necessary to define specific rules for HTML objects that need options that don't apply to all HTML files. The following variables may be used in constructing new rules:

Choice of Parser

The HTML parser defaults to HTML Tidy; earlier versions of XMLC used the Swing HTML parser. Because the two parsers handle non-conforming HTML differently, the resulting DOM trees may not be the same. If one does not wish to fix these inconsistencies, it is suggested that existing documents continue to use the Swing parser. This is specified with the -parser swing option. When using the Enhydra make rules, the parser can be specified by setting:

    XMLC_HTML_OPTS = -parser swing
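
Outside the make rules, the same choice can be made directly on the command line. The following is a minimal sketch that combines -parser swing with the options from the basic compilation example; compile_with_swing is a hypothetical wrapper, not an XMLC command.

```shell
# Hypothetical wrapper: compile one page with the Swing parser instead of
# the default HTML Tidy parser.
# Arguments: output directory, class name, HTML file.
compile_with_swing() {
    xmlc -parser swing -d "$1" -class "$2" "$3"
}
```

For the page used earlier, it would be called as:

    compile_with_swing ../../classes app.presentation.user.UserTable ../html/usertable.html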

Non-conforming HTML

The default XMLC HTML parser is built on the Java port of the HTML Tidy program. This parser locates, and often corrects, many errors in HTML. Problems that can't be corrected must be fixed before the page will compile. The HTML Tidy program may be useful in producing corrected HTML files.

The HTML Tidy parser will reject proprietary tags it does not understand.

Diagnosing Problems

Several options are useful for understanding the results of XMLC. The -verbose option provides a trace of the overall execution of XMLC (but not the parser details obtained with -parseinfo). The -info option produces a dump of information about the page being compiled, currently consisting of the ids and URLs found in the page. With the -methods option, a list of all of the generated access methods for the class is produced. The -parseinfo option, which traces the execution of the HTML parser, can be very useful in debugging page problems that are not obvious from the parser error messages. To see the DOM tree that is produced, use the -dump option. These options can all be used with -nocompile to get only the status output without generating a class file.
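
A typical inspection run might combine several of these options. The sketch below uses only options described above; inspect_page is a hypothetical wrapper, and it assumes that no further options are needed when -nocompile is given.

```shell
# Hypothetical wrapper: report the ids, URLs, generated access methods and
# DOM tree for a page without generating a class file.
# Argument: HTML file.
inspect_page() {
    xmlc -nocompile -info -methods -dump "$1"
}
```

For example:

    inspect_page ../html/usertable.html

If the parser messages are not enough to locate a problem, -parseinfo can be added to the option list in the same way.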

Creating Modified HTML pages using XMLC

Occasionally the only difference between the mockup HTML page and the page in the application is in the URLs. This is often the case for frame sets. XMLC can address this by using the -urlmapping options to update the URLs and then the -docout option to write a new HTML file with the updated URLs instead of producing a class file.
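
A rough sketch of such a run follows. The argument forms are assumptions that should be checked against the XMLC command reference page: -urlmapping is taken to accept the original URL followed by its replacement, and -docout is taken to accept the name of the output file. rewrite_urls is a hypothetical wrapper, not an XMLC command.

```shell
# Hypothetical wrapper: write a copy of a page with one URL replaced,
# instead of generating a class file.
# Arguments: original URL, replacement URL, input HTML file, output HTML file.
# Assumed option syntax: -urlmapping <original> <replacement>, -docout <file>.
rewrite_urls() {
    xmlc -urlmapping "$1" "$2" -docout "$4" "$3"
}
```

With illustrative file names and URLs (none of them taken from a real application), it could be called as:

    rewrite_urls menu.html Menu.po ../html/frames.html ../output/frames.html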