Processor API

1. Scope

This section documents the PresentationServer Processor API, a Java API for writing custom processors. You can use custom processors in your PresentationServer applications just like the standard processors bundled with PresentationServer.

2. Why Write Custom Processors?

In general, PresentationServer processors encapsulate business logic to perform generic tasks such as calling a Web service or accessing a database using SQL. With those processors, the developer can describe the specifics of a task at a high level in a declarative way.

However, there are special cases where:

  • no existing processor exactly encapsulates the task to be performed
  • or it is more suitable to write Java code to get the job done than to use an existing processor

In those cases, it makes sense for the developer to write a custom processor in Java. This section goes through the essential APIs used to write processors in Java.

3. Prerequisites

Writing PresentationServer processors is expected to be done by Java developers who are comfortable with the Java language as well as compiling and deploying onto J2EE application servers. In addition, we assume that the developer is comfortable with both:

4. Processor With Outputs

4.1 Example

Consider a very simple processor with an input named number and an output named double. The processor doubles the number it receives as input. For instance, if the input is <number>21</number>, the output will be <number>42</number>.

import org.orbeon.oxf.pipeline.api.PipelineContext;
import org.orbeon.oxf.processor.SimpleProcessor;
import org.orbeon.oxf.processor.ProcessorInputOutputInfo;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.dom4j.Document;

public class MyProcessor extends SimpleProcessor {

    public MyProcessor() {
        addInputInfo(new ProcessorInputOutputInfo("number"));
        addOutputInfo(new ProcessorInputOutputInfo("double"));
    }

    public void generateDouble(PipelineContext context,
                               ContentHandler contentHandler)
            throws SAXException {

        // Get number from input using DOM4J
        Document numberDocument = readInputAsDOM4J(context, "number");
        String numberString = (String)
            numberDocument.selectObject("string(/number)");
        int number = Integer.parseInt(numberString);
        String doubleString = Integer.toString(number * 2);

        // Generate output document with SAX
        contentHandler.startDocument();
        contentHandler.startElement("", "number", "number",
                                    new AttributesImpl());
        contentHandler.characters(doubleString.toCharArray(), 0,
                                  doubleString.length());
        contentHandler.endElement("", "number", "number");
        contentHandler.endDocument();
    }
}

4.2 Deriving from SimpleProcessor

All processors must implement the Processor interface (in the package org.orbeon.oxf.pipeline.processors). SimpleProcessor is an abstract class that implements all the methods of Processor and that can be used as a base class to create a custom processor (such as MyProcessor in the example above).

4.3 Declaring Inputs and Outputs

The processor must declare its mandatory static inputs and outputs. This is done in the default constructor by calling the addInputInfo and addOutputInfo methods and passing an object of type ProcessorInputOutputInfo. For instance:

public MyProcessor() {
    addInputInfo(new ProcessorInputOutputInfo("number"));
    addOutputInfo(new ProcessorInputOutputInfo("double"));
}

In addition to the name of the input/output, one can pass an optional schema URI declared in the PresentationServer properties. If a schema URI is specified, the corresponding input or output can be validated.

Note
The processor may also have optional inputs and outputs, and/or read dynamic inputs and generate dynamic outputs. Such inputs and outputs do not need to be declared with addInputInfo and addOutputInfo.

4.4 Implementing generate Methods

For each declared output, the class must declare a corresponding generate method. For instance, in the example, we have an output named double. The document for this output is produced by the method generateDouble. generate methods must have two arguments:

  • A PipelineContext. This context needs to be passed to other methods that need one, typically to read inputs (more on this later).
  • A ContentHandler. This is a SAX content handler that receives the document produced by the generate method.
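To make the role of the content handler concrete, here is a minimal sketch that runs outside PresentationServer, using only the SAX classes shipped with the JDK. The names GenerateSketch, RecordingHandler, emitNumber, and render are hypothetical and introduced only for illustration. emitNumber issues the same sequence of SAX calls as generateDouble in the example, and RecordingHandler stands in for the content handler that a generate method receives.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class GenerateSketch {

    // Hypothetical stand-in for the handler passed to a generate method:
    // it records the SAX events it receives as text.
    static class RecordingHandler extends DefaultHandler {
        final StringBuilder events = new StringBuilder();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            events.append('<').append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.append("</").append(qName).append('>');
        }
    }

    // Same event sequence as generateDouble: one <number> element with text content.
    static void emitNumber(int value, ContentHandler contentHandler)
            throws SAXException {
        String text = Integer.toString(value);
        contentHandler.startDocument();
        contentHandler.startElement("", "number", "number", new AttributesImpl());
        contentHandler.characters(text.toCharArray(), 0, text.length());
        contentHandler.endElement("", "number", "number");
        contentHandler.endDocument();
    }

    // Runs emitNumber against a recording handler and returns what it received.
    static String render(int value) {
        RecordingHandler handler = new RecordingHandler();
        try {
            emitNumber(value, handler);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.events.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(42)); // prints <number>42</number>
    }
}
```

Inside a real processor there is no need for such a recording handler: the generate method only makes the calls on the handler it is given.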

4.5 Reading Inputs

If the output depends on the inputs, the generate method needs to read those inputs. There are three different APIs to read an input:

  • One can get the W3C DOM representation of the input document by calling the readInputAsDOM(context, name) method.
  • One can get the DOM4J representation of the input document by calling the readInputAsDOM4J(context, name) method.
  • One can provide a custom SAX content handler to the method readInputAsSAX(context, name, contentHandler).

Depending on what the generate method needs to do with the input document, one API might be more appropriate than the others.

In our example, we want to get the value inside the <number> element. We decided to go with the DOM4J API, calling numberDocument.selectObject("string(/number)") on the DOM4J document.
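For comparison, the same value can be extracted from a W3C DOM document, which is the representation returned by readInputAsDOM. The following is only a sketch under stated assumptions: it runs outside PresentationServer, the class ReadNumberSketch and its parse helper are hypothetical names, and parse merely stands in for the document a processor would obtain from its input. The XPath evaluation uses the standard javax.xml.xpath API from the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ReadNumberSketch {

    // Hypothetical helper: parses a string into a W3C DOM document, standing in
    // for the document that readInputAsDOM would return inside a processor.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates the same XPath expression as the DOM4J example and parses the result.
    static int readNumber(Document numberDocument) {
        try {
            String numberString = XPathFactory.newInstance().newXPath()
                    .evaluate("string(/number)", numberDocument);
            return Integer.parseInt(numberString.trim());
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int number = readNumber(parse("<number>21</number>"));
        System.out.println(number * 2); // prints 42
    }
}
```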

4.6 Generating a Document

The output document can be generated in any of the following ways:

  • Call the methods of the content handler received by the generate method directly. This is what we do in the example detailed in this section. Here is the code generating the output document:

    contentHandler.startDocument();
    contentHandler.startElement("", "number", "number",
                                new AttributesImpl());
    contentHandler.characters(doubleString.toCharArray(), 0,
                              doubleString.length());
    contentHandler.endElement("", "number", "number");
    contentHandler.endDocument();

  • Create a DOM4J document and send it to the content handler using a LocationSAXWriter (in package org.orbeon.oxf.xml.dom4j):

    Document doc = ...;
    LocationSAXWriter saxWriter = new LocationSAXWriter();
    saxWriter.setContentHandler(contentHandler);
    saxWriter.write(doc);

    Note
    Using the LocationSAXWriter provided with PresentationServer is the preferred way to write a DOM4J document to a SAX content handler. The standard JAXP API (calling transform with an org.dom4j.io.DocumentSource) can also be used, but in that case the location information stored in the DOM4J document is lost.

  • Create a W3C DOM document and send it to the content handler using the standard JAXP API:

    Document doc = ...;
    Transformer identity = TransformerUtils.getIdentityTransformer();
    identity.transform(new DOMSource(doc), new SAXResult(contentHandler));

    Note
    TransformerUtils is a PresentationServer class (in package org.orbeon.oxf.xml). It creates and caches the appropriate transformer factory. The developer is of course free to create their own factory and transformer by calling the JAXP API directly.
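The JAXP route can be tried outside PresentationServer, since it relies only on classes from the JDK. The sketch below is an illustration under assumptions, not the PresentationServer implementation: it obtains the identity transformer from the JDK's TransformerFactory rather than from TransformerUtils, and DomToSaxSketch, TextCollector, and replay are hypothetical names. TextCollector stands in for the content handler received by a generate method.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class DomToSaxSketch {

    // Hypothetical stand-in for the content handler of a generate method:
    // counts the elements and collects the character data it receives.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        int elements = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            elements++;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Builds <number>value</number> as a W3C DOM document and replays it
    // as SAX events through an identity transformer.
    static TextCollector replay(String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element number = doc.createElement("number");
            number.setTextContent(value);
            doc.appendChild(number);

            // A transformer created without a stylesheet performs an identity copy.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            TextCollector collector = new TextCollector();
            identity.transform(new DOMSource(doc), new SAXResult(collector));
            return collector;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TextCollector collector = replay("42");
        System.out.println(collector.elements + " element, text: " + collector.text);
    }
}
```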

5. Processor With No Output

5.1 Implementing The start Method

Implementing a processor with no output is very similar to implementing a processor with outputs (see above). The only difference is that you need to implement the start() method, instead of the generate() methods.

5.2 Example

The processor below reads its data input and writes the content of the XML document to the standard output stream.

package org.orbeon.oxf;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;
import org.orbeon.oxf.common.OXFException;
import org.orbeon.oxf.processor.ProcessorInputOutputInfo;
import org.orbeon.oxf.processor.SimpleProcessor;
import org.orbeon.oxf.pipeline.api.PipelineContext;

import java.io.IOException;
import java.io.StringWriter;

public class SystemOutProcessor extends SimpleProcessor {

    public SystemOutProcessor() {
        addInputInfo(new ProcessorInputOutputInfo("data"));
    }

    public void start(PipelineContext context) {
        try {
            Document dataDocument = readInputAsDOM4J(context, "data");
            OutputFormat format = OutputFormat.createPrettyPrint();
            format.setIndentSize(4);
            StringWriter writer = new StringWriter();
            XMLWriter xmlWriter = new XMLWriter(writer, format);
            xmlWriter.write(dataDocument);
            xmlWriter.close();
            System.out.println(writer.toString());
        } catch (IOException e) {
            throw new OXFException(e);
        }
    }
}