org.enhydra.xml.xmlc.html.parsers.tidy
Class TidyHTMLParser
java.lang.Object
|
+--org.enhydra.xml.xmlc.html.parsers.HTMLParserBase
|
+--org.enhydra.xml.xmlc.html.parsers.tidy.TidyHTMLParser
- All Implemented Interfaces:
- XMLCParser
- public class TidyHTMLParser
- extends HTMLParserBase
- implements XMLCParser
XMLCParser object for HTML and HTML framesets that uses the Java version of
the W3C HTML tidy program. It uses Tidy to convert HTML to XHTML and then
parses it with an XML parser.
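The second stage described above, parsing XHTML with an XML parser, can be sketched with the JDK's own DOM parser. This is a minimal illustration only, not the code TidyHTMLParser runs: the Tidy conversion step is omitted, the input is assumed to already be well-formed XHTML of the kind Tidy would produce, and the class name XhtmlParseSketch and method parseXhtml are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: not part of XMLC. Shows only the "parse XHTML as XML" stage.
public class XhtmlParseSketch {

    /** Parses well-formed XHTML text into a DOM Document using the JDK parser. */
    public static Document parseXhtml(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        // Assumed input: XHTML as it might look after an HTML-to-XHTML cleanup pass.
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head><title>Hello</title></head>"
            + "<body><p id=\"greeting\">Hi</p></body></html>";
        Document doc = parseXhtml(xhtml);
        System.out.println(doc.getElementsByTagName("title").item(0).getTextContent());
    }
}
```

Raw HTML with unclosed tags would fail in parseXhtml, which is why a cleanup pass such as Tidy has to come first.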
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TidyHTMLParser
public TidyHTMLParser()
throws XMLCException
- Constructor.
parse
public XMLCDocument parse(InputSource input,
LineNumberMap lineNumberMap,
XMLCDomFactory domFactory,
MetaData metaData,
ErrorReporter errorReporter,
ParseTracer tracer)
throws java.io.IOException,
XMLCException
- Description copied from interface:
XMLCParser
- Parse an XML file (or any file, such as HTML, that can be converted into
XML).
- Specified by:
parse
in interface XMLCParser
- Parameters:
input
- The input source to parse.
lineNumberMap
- If not null, a dynamic map of input stream
line numbers and offsets to source files and line numbers.
This object is dynamically updated as input is read. It may not
have valid mappings for characters that have not been read.
domFactory
- The DOM factory object.
metaData
- MetaData for the document.
errorReporter
- Object for reporting errors during the parse.
tracer
- Object for parser info tracing.
- Returns:
- An XMLC document object that contains the actual DOM Document.
- Throws:
XMLCException
- Thrown for fatal errors found parsing the
document.
java.io.IOException
- See Also:
XMLCParser.parse(org.xml.sax.InputSource, org.enhydra.xml.xmlc.misc.LineNumberMap, org.enhydra.xml.xmlc.dom.XMLCDomFactory, org.enhydra.xml.xmlc.metadata.MetaData, org.enhydra.xml.io.ErrorReporter, org.enhydra.xml.xmlc.parsers.ParseTracer)
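The input parameter of parse is a standard org.xml.sax.InputSource, so it can be prepared with JDK classes alone. The sketch below shows one way to build it from in-memory HTML. The class InputSourceSketch, the method fromString, and the sample system ID are hypothetical, and this page does not say whether the parser reads the character stream, the byte stream, or the system ID, so treat the choice of a character stream as an assumption. Constructing the remaining arguments (lineNumberMap, domFactory, metaData, errorReporter, tracer) is not covered here.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;

// Hypothetical sketch: builds only the first argument to parse().
public class InputSourceSketch {

    /** Wraps HTML text in an InputSource and labels it with a system ID. */
    public static InputSource fromString(String html, String systemId) {
        InputSource source = new InputSource(new StringReader(html));
        // The system ID names the source, typically for use in error messages.
        source.setSystemId(systemId);
        return source;
    }

    public static void main(String[] args) {
        InputSource input = fromString("<html><body><p>Hi</p></body></html>",
                                       "file:/tmp/page.html"); // assumed sample path
        System.out.println(input.getSystemId());
    }
}
```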