Also see the XMLC 2.2 Release Note, XMLC 2.2.1 Release Note, XMLC 2.2.2 Release Note, XMLC 2.2.3 Release Note, XMLC 2.2.4 Release Note, XMLC 2.2.5 Release Note, XMLC 2.2.6 Release Note, XMLC 2.2.7.1 Release Note, XMLC 2.2.8.1 Release Note, XMLC 2.2.9 Release Note, XMLC 2.2.10 Release Note, XMLC 2.2.11 Release Note, and XMLC 2.2.12 Release Note.
It took a *lot* of work (and even some patches, which have been applied to the official Xerces2 source repository at http://xerces.apache.org) to make it happen, but XMLC is now compatible with Xerces2! As of XMLC 2.2.6, XMLC could already run in an environment containing the DOM3 APIs, such as JDK 1.5. However, DOM3 support in XMLC 2.2.6+ is only API-deep: the DOM3 methods are present but not actually implemented. With Xerces2 as the new DOM implementation base for XMLC 2.3, DOM3 is finally fully supported.
Note that versions of Xerces prior to 2.8.0 are totally incompatible with XMLC 2.3, because XMLC 2.3 requires the fix for issue XERCESJ-1133. Even then, fixes available in Xerces 2.8.1, such as XERCESJ-1181 and XERCESJ-1187, make 2.8.1 the recommended version. Additionally, there is at least one post-2.8.1 fix (XERCESJ-1200) that led me to package not 2.8.1, but a version based on the very latest Xerces2 source. If another version is released before the final XMLC 2.3 release, I will use that instead of a moving target.
Also note that the Xerces2 binary is re-packaged as enhydraXercesImpl.jar using JarJar 0.7 for two reasons:
Xerces2 and XMLCommons Resolver (and NekoHTML) are packaged in xmlc.jar and xmlc-all-runtime.jar for convenience, all under the org.enhydra.* namespace. Because Xerces2 is larger than Xerces1, the XMLC jars are larger. However, the plan is to use the official releases under the official namespaces once all required binaries are available as official releases. When that happens, the XMLC jars will be significantly smaller (for a hint at how small, look at xmlc-base.jar) and the third-party binaries can be shared by multiple applications. And because the packaging for XMLC 2.3 is consistent with 2.2.xx, the upgrade is simple: just replace the 2.2.xx jar(s) with the 2.3 jar(s) and everything should continue to work as before, assuming you compiled with -for-deferred-parsing. Please report observations to the contrary.
All existing parsers have been removed in favor of a new XML parser (XercesDOMParser) extending the Xerces2 DOMParser and a new HTML parser (XercesHTMLDOMParser) extending the XML parser, using the NekoHTML HTMLConfiguration. Parsing is now performed 100% inside Xerces2/NekoHTML, with XMLC's minimal parser extensions merely gathering XMLCDocument information and performing entity resolution. DOM correctness is now enforced entirely by the respective underlying parser.
Note that in cases where "tidy" or "swing" parser types have been specified in XMLC configuration, NekoHTML will be used instead. No need to recompile with a new configuration. To explicitly specify NekoHTML as the parser, use the new parser type "nekohtml".
Xerces2 is much more strict about DOM creation. Static Loading builds the DOM from scratch. When it comes to appending DocumentType, Entity, and EntityReference nodes, Xerces2 throws exceptions stating that operations on these objects are read only. Using Deferred Parsing and Dynamic Loading solves this issue, since the DOM is always built by the parser. The DOM is parsed once (and reparsed if a change to the source markup is detected) and cached. DOM instances are clones of the master cached DOM, with the cloning performed internally by Xerces, allowing for write operations on normally read-only nodes. An added benefit was the opportunity to lighten up XMLC by removing a good deal of code that is now unnecessary.
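The parse-once, clone-per-instance idea described above can be sketched with the standard JAXP and DOM APIs. This is only an illustration, not XMLC's actual implementation: the class name CachedDomSketch and its methods are hypothetical, the markup is a made-up sample, and it relies on the parser's Document implementation supporting a deep cloneNode() on the document itself (which Xerces-based parsers, including the JDK's built-in one, do).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch: parse markup once, then hand out deep clones. */
public class CachedDomSketch {
    /** The master DOM, always built by the parser and never handed out. */
    private final Document master;

    public CachedDomSketch(String markup) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        master = builder.parse(new InputSource(new StringReader(markup)));
    }

    /** Returns a fresh, independently writable copy of the cached DOM. */
    public Document newInstance() {
        // The deep clone is performed by the DOM implementation itself,
        // so nodes such as the DocumentType are copied along with the tree.
        return (Document) master.cloneNode(true);
    }

    public static void main(String[] args) throws Exception {
        CachedDomSketch cache = new CachedDomSketch(
            "<!DOCTYPE page [<!ELEMENT page (title)><!ELEMENT title (#PCDATA)>]>"
            + "<page><title>Hello</title></page>");

        Document copy = cache.newInstance();
        copy.getDocumentElement().getFirstChild().setTextContent("Changed");

        // The copy reflects the edit; a new instance still shows the original.
        System.out.println(copy.getDocumentElement().getTextContent());
        System.out.println(cache.newInstance().getDocumentElement().getTextContent());
    }
}
```

Because each caller works on its own clone, edits never leak back into the cached master, which is what makes a single parse safe to share.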
The -for-deferred-parsing flag is still recognized, but ignored since Deferred Parsing is now used by default. As such, there is no longer a requirement to specify -for-deferred-parsing.
getDocumentClassName() method - meant to return the fully qualified class name of the Document implementation being represented by the factory. This is used to feed the Xerces2 XMLParserConfiguration "http://apache.org/xml/properties/dom/document-class-name" property (see Xerces2 properties), informing Xerces of the custom DOM with which to bind.
createAccessorGenerator() and createDocBuilderGenerator() methods - the only AccessorGenerator and DocBuilderGenerator implementations now in use are the Deferred Parsing implementations, so there is no longer a need to specify these.
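As a rough illustration of how the value returned by getDocumentClassName() might reach the parser, the sketch below collects it under the Xerces2 property id quoted above. The DomFactory interface here is a hypothetical stand-in (only the method name comes from this note), the class name com.example.dom.MyDocumentImpl is invented, and the sketch deliberately avoids a Xerces dependency by building a plain property map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of feeding a factory's document class name to Xerces2. */
public class DocumentClassNameSketch {
    /** Xerces2 property id quoted in this release note. */
    public static final String DOCUMENT_CLASS_NAME =
        "http://apache.org/xml/properties/dom/document-class-name";

    /** Stand-in for the real factory; only the method name is from the note. */
    public interface DomFactory {
        String getDocumentClassName();
    }

    /** Builds the parser properties implied by the given factory. */
    public static Map<String, Object> parserProperties(DomFactory factory) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put(DOCUMENT_CLASS_NAME, factory.getDocumentClassName());
        return props;
    }

    public static void main(String[] args) {
        // Invented class name, for illustration only.
        DomFactory factory = () -> "com.example.dom.MyDocumentImpl";

        // With Xerces2 on the classpath, each entry would be handed to
        // org.apache.xerces.parsers.DOMParser.setProperty(String, Object).
        parserProperties(factory)
            .forEach((id, value) -> System.out.println(id + " = " + value));
    }
}
```

Keeping the class name behind a single factory method means the parser setup does not need to know which custom DOM is in use.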