The article The Process Virtual Machine describes the concepts and background of this implementation.
In essence, the process virtual machine is a framework for implementing advanced state machines. It is designed so that it's easy to build workflow, BPM, orchestration and other graph based execution languages on top of it.
Process information can be split into 3 distinct categories:
Definition: A {@link org.jbpm.pvm.Process} is made up of {@link org.jbpm.pvm.Node}s and {@link org.jbpm.pvm.Transition}s. These relate to the graphical representation and form the structure of the process. Typically a process is static and it can have many executions. With nodes and transitions, all graph based process languages can be represented. But note that because {@link org.jbpm.pvm.Node} has a hierarchical parent-child relationship, it's also possible to model block structured process languages with the PVM.
Execution: An {@link org.jbpm.pvm.Execution} represents the runtime state for one path of execution. The most important part of the state is the current node pointer {@link org.jbpm.pvm.ExecutionAccessor#getNode()}. Also the process variable instances are part of the runtime state of an execution.
History: The history of an execution is a kind of audit trail that captures all the detailed information about when and how an execution changed its state. It includes external triggers, state changes, process variable updates and so on. Potentially, this information could be sufficient to replay or to roll back an execution. Of course, generating and storing this history information should be optional. From the raw logging information, business intelligence information can be derived. This is basically the same information, but structured differently so that it results in an easy-to-query relational database schema.
The nodes and transitions form the basis of a process. But with nodes and transitions alone, there is no runtime behaviour associated with the process graph. That is delegated to interfaces {@link org.jbpm.pvm.NodeBehaviour} and {@link org.jbpm.pvm.Action}. With those interfaces, programming logic is bound to the process structure.
A process language typically specifies a number of reusable process constructs. In PVM, this translates to a number of reusable and configurable {@link org.jbpm.pvm.NodeBehaviour} implementations. The configuration of a process construct is represented in the member fields of the {@link org.jbpm.pvm.NodeBehaviour}.
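As an illustration of a reusable construct whose configuration lives in member fields, here is a minimal sketch. The `NodeBehaviour` and `Execution` types below are hypothetical stand-ins declared inside the example; their method names and signatures are assumptions and are not taken from the real org.jbpm.pvm classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are NOT the real org.jbpm.pvm types.
class NodeBehaviourSketch {

  /** Minimal runtime state for the sketch: a log of what happened. */
  static class Execution {
    final List<String> log = new ArrayList<>();
  }

  /** Assumed shape of a behaviour interface: one callback when a node is executed. */
  interface NodeBehaviour {
    void execute(Execution execution);
  }

  /** A reusable construct; its configuration is held in a member field. */
  static class PrintMessage implements NodeBehaviour {
    private final String message; // configured per node in the process definition

    PrintMessage(String message) {
      this.message = message;
    }

    @Override
    public void execute(Execution execution) {
      execution.log.add(message);
    }
  }

  public static void main(String[] args) {
    Execution execution = new Execution();
    // The same implementation class is reused with two different configurations.
    new PrintMessage("hello").execute(execution);
    new PrintMessage("world").execute(execution);
    System.out.println(execution.log);
  }
}
```

The point of the sketch is that one implementation class serves many nodes, and only the field values differ per node.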
So here is an overview of the process definition classes.
Transitions can connect nodes in the form of a directed graph. Nodes can also group a set of child nodes. Both forms can be mixed (as with superstates in a UML state diagram). This basic structure supports free graph based process languages and block structured process languages.
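The mix of graph structure and block structure can be sketched as follows. This is a simplified model under stated assumptions: the `Node` and `Transition` classes and the `addChild` and `connectTo` methods are hypothetical and only mirror the concepts, not the real org.jbpm.pvm API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the definition structure; not the real org.jbpm.pvm classes.
class ProcessGraphSketch {

  static class Node {
    final String name;
    Node parent;                                         // null for top-level nodes
    final List<Node> children = new ArrayList<>();       // block structure
    final List<Transition> outgoing = new ArrayList<>(); // graph structure

    Node(String name) {
      this.name = name;
    }

    /** Nests a child node and records its parent. */
    Node addChild(Node child) {
      child.parent = this;
      children.add(child);
      return child;
    }

    /** Creates a directed transition from this node to the destination. */
    Transition connectTo(Node destination) {
      Transition transition = new Transition(this, destination);
      outgoing.add(transition);
      return transition;
    }
  }

  static class Transition {
    final Node source;
    final Node destination;

    Transition(Node source, Node destination) {
      this.source = source;
      this.destination = destination;
    }
  }

  public static void main(String[] args) {
    Node start = new Node("start");
    Node review = new Node("review"); // composite node, comparable to a superstate
    Node draft = review.addChild(new Node("draft"));
    Node approve = review.addChild(new Node("approve"));
    Node end = new Node("end");

    start.connectTo(draft);   // a transition into a nested node
    draft.connectTo(approve); // a transition between siblings
    review.connectTo(end);    // a transition leaving the composite as a whole
    System.out.println(draft.parent.name + " has " + review.children.size() + " children");
  }
}
```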
This structure can be decorated with actions. Actions are pieces of Java code (command pattern like) that are associated with events in a process. Actions (unlike nodes and transitions) are NOT to be represented graphically. This decouples the graphical representation from the actual implementation of the process, ensuring that a developer does not have to mess with a business analyst's diagram to get it working.
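A minimal sketch of how actions could hang off events without changing the graph. Everything here is an assumption made for illustration: the `Action` interface shape, the `addAction` and `fire` methods, and the event name "node-enter" are hypothetical and not taken from the PVM codebase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real Action interface and event names may differ.
class ActionSketch {

  /** Command-pattern style callback. */
  interface Action {
    void execute(List<String> log);
  }

  static class Node {
    final String name;
    // Actions are stored next to the node but are not part of the diagram.
    private final Map<String, List<Action>> actionsByEvent = new HashMap<>();

    Node(String name) {
      this.name = name;
    }

    void addAction(String event, Action action) {
      actionsByEvent.computeIfAbsent(event, key -> new ArrayList<>()).add(action);
    }

    /** Runs every action registered for the given event, in registration order. */
    void fire(String event, List<String> log) {
      List<Action> actions = actionsByEvent.getOrDefault(event, Collections.emptyList());
      for (Action action : actions) {
        action.execute(log);
      }
    }
  }

  public static void main(String[] args) {
    Node node = new Node("approve");
    node.addAction("node-enter", log -> log.add("send notification"));
    List<String> log = new ArrayList<>();
    node.fire("node-enter", log);
    System.out.println(log);
  }
}
```

Adding or removing an action changes only the map on the node, so the nodes and transitions that an analyst sees stay the same.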
Node behaviours and actions are referenced as {@link org.jbpm.ref.ObjectReference}s. This is done for persistence reasons. Those objects will not be persisted with Hibernate. Instead, a wire description of how these objects are created and configured is stored in the database. This way, the database schema remains unrelated to the {@link org.jbpm.pvm.NodeBehaviour} implementations. So a new process language (or new process language features) can be added without changing that part of the database.
TODO Exception handlers
TODO Events
An asynchronous continuation (aka safepoint) means that logically, the process keeps executing, but the current transaction must be committed and a new transaction must be started automatically in which the execution is resumed.
A process is executed in atomic steps called {@link org.jbpm.pvm.impl.AtomicOperation}s. An atomic operation is an uninterruptible piece of process execution.
By default process execution is synchronous, which means that method {@link org.jbpm.pvm.Execution#signal()} will block until the process arrives in a wait state.
Potentially, the calculation time needed to execute the process till that next wait state might be too long, for example when automatic activities perform long calculations such as PDF generation. That is why process execution is split up in a number of sequential atomic operations.
Before and after an atomic operation, process execution can be stopped, which means that the blocking method {@link org.jbpm.pvm.Execution#signal()} will return and the process can potentially be saved. Before the method returns, the execution will also have sent an asynchronous message with the {@link org.jbpm.svc.MessageService}. The destination of the message is responsible for resuming process execution by executing the next atomic operation.
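The interplay between atomic operations, wait states and asynchronous continuations can be sketched with an in-memory queue. This is a simplified model under stated assumptions: `Step`, the `asyncBefore` and `waitState` flags, `resumeFromMessage` and the plain `Queue` standing in for a message service are all hypothetical and do not reflect the real AtomicOperation or MessageService signatures.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; not the real org.jbpm.pvm.impl.AtomicOperation or MessageService.
class AsyncContinuationSketch {

  /** One uninterruptible step of process execution. */
  static class Step {
    final String name;
    final boolean asyncBefore; // stop before this step and continue it via a message
    final boolean waitState;   // stop after this step until an external signal arrives

    Step(String name, boolean asyncBefore, boolean waitState) {
      this.name = name;
      this.asyncBefore = asyncBefore;
      this.waitState = waitState;
    }
  }

  static class Execution {
    final List<Step> steps;
    final List<String> trace = new ArrayList<>();
    int next = 0;

    Execution(List<Step> steps) {
      this.steps = steps;
    }

    /** External trigger: blocks until a wait state or an asynchronous continuation. */
    void signal(Queue<Execution> messages) {
      proceed(messages, false);
    }

    /** Called by the message destination to execute the next atomic step. */
    void resumeFromMessage(Queue<Execution> messages) {
      proceed(messages, true);
    }

    private void proceed(Queue<Execution> messages, boolean resumed) {
      while (next < steps.size()) {
        Step step = steps.get(next);
        if (step.asyncBefore && !resumed) {
          messages.add(this); // stands in for sending an asynchronous message
          return;             // the caller's transaction could commit here
        }
        resumed = false;      // only the first step after a message skips the check
        trace.add(step.name);
        next++;
        if (step.waitState) {
          return;             // wait for the next external signal
        }
      }
    }
  }

  public static void main(String[] args) {
    Execution execution = new Execution(Arrays.asList(
        new Step("validate", false, false),
        new Step("generate-pdf", true, false),
        new Step("wait-for-approval", false, true),
        new Step("archive", false, false)));
    Queue<Execution> messages = new ArrayDeque<>();

    execution.signal(messages);                 // runs "validate", then posts a message
    messages.poll().resumeFromMessage(messages); // a job executor would do this
    execution.signal(messages);                 // external trigger leaves the wait state
    System.out.println(execution.trace);
  }
}
```

In this sketch the caller of `signal` returns as soon as the message is queued, so the long-running step is executed by whoever consumes the queue.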
An execution can stop running in 3 specific situations:
The latter two are asynchronous continuations. The asynchronous continuation implementation is based on a jBPM-defined asynchronous message service API. Configurable implementations of this API can bind the service to JMS, in-memory blocking queues or any other implementation of asynchronous messaging.
Jobs are the messages that will be sent to the job executor component over the asynchronous messaging service.
During an asynchronous continuation, the execution is locked so that competing external triggers cannot interfere with it. Locking does not include a security aspect here. It's just a precautionary measure that prevents other signals from being given to the execution, which is conceptually still running.
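A minimal sketch of such a precautionary lock. The `lock`, `unlock` and `signal` methods are hypothetical, and rejecting the signal with an `IllegalStateException` is an assumption made for illustration; the text above does not say how a competing signal is refused.

```java
// Hypothetical sketch; how the real Execution rejects signals while locked is an assumption.
class ExecutionLockSketch {

  static class Execution {
    private boolean locked;
    private int signals;

    /** Taken when an asynchronous continuation starts. */
    void lock() {
      locked = true;
    }

    /** Released when the continuation has resumed the execution. */
    void unlock() {
      locked = false;
    }

    /** External trigger; refused while the execution is conceptually still running. */
    void signal() {
      if (locked) {
        throw new IllegalStateException("execution is locked by an asynchronous continuation");
      }
      signals++;
    }

    int getSignals() {
      return signals;
    }
  }

  public static void main(String[] args) {
    Execution execution = new Execution();
    execution.lock();
    try {
      execution.signal();
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    execution.unlock();
    execution.signal();
    System.out.println("accepted signals: " + execution.getSignals());
  }
}
```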
All jBPM exceptions are runtime exceptions. The base class of exceptions is JbpmException.
jBPM can throw an exception when the interpretation of the process cannot be completed successfully. Of course, many of these problems (but not all) can be analysed and detected at deployment time. E.g. if the process is parsed from XML, the schema validation will already provide a first level of validation. Then during the DOM model parsing, a second level of validation is done. But still the parsed process might be unexecutable.
The second source of exceptions is the plugins, like Executable implementations or services. These exceptions will be wrapped in JbpmExceptions, pass completely through jBPM and are then thrown to the original client.
As mentioned above, the execution can only be stopped in 3 specific situations. At those points during execution, some invariants will be met (e.g. all transient data can be lost). While a process is running and is not at one of those points, the invariants might be violated. So when an exception occurs, users always have to discard the execution. Typically in a persistent scenario, the transaction will be rolled back.
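The wrapping of plugin exceptions can be sketched like this. `SketchException` is a stand-in for the base runtime exception, and the `Plugin` interface and `invoke` method are hypothetical names introduced only for this example.

```java
import java.io.IOException;

// Hypothetical sketch; SketchException stands in for the real JbpmException.
class ExceptionWrappingSketch {

  /** Unchecked base exception, so clients are not forced to catch it. */
  static class SketchException extends RuntimeException {
    SketchException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Assumed shape of user-provided code that may throw anything. */
  interface Plugin {
    void run() throws Exception;
  }

  /** Invokes a plugin and wraps whatever it throws, preserving the cause. */
  static void invoke(Plugin plugin) {
    try {
      plugin.run();
    } catch (SketchException e) {
      throw e; // already the engine's own exception type
    } catch (Exception e) {
      throw new SketchException("plugin failed: " + e.getMessage(), e);
    }
  }

  public static void main(String[] args) {
    try {
      invoke(() -> {
        throw new IOException("disk full");
      });
    } catch (SketchException e) {
      // The original client sees the wrapper and can inspect the cause.
      System.out.println(e.getMessage() + " / cause: " + e.getCause().getClass().getSimpleName());
    }
  }
}
```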
All methods of process definition objects that are called during execution of a process are supposed to be thread safe. Process execution objects don't have to be thread safe. There will always be 1 object per thread.
Building the process definition object graph is usually done single threaded. Also, executions typically have their own set of objects per thread, as this is a core paradigm in current ORM solutions like Hibernate and JPA.
Every process language might have a number of extensions that are related to the execution. E.g. jPDL has Tasks, Swimlanes, Comments and possibly some node state information. BPEL has partnerlinks and correlation sets.
Since every process language needs variables, they are moved to the execution itself. (Is this a good choice?)
In this process model and throughout the PVM codebase, there are several bidirectional relations. By convention, the bidirectional relations are maintained on the many side. This means that when you call for example {@link org.jbpm.pvm.Process#addNode(org.jbpm.pvm.Node)}, that method will also update the inverse pointer by invoking the {@link org.jbpm.pvm.Node#setProcess(org.jbpm.pvm.Process)} setter. Setter methods are plain setters and do NOT update the inverse reference (that would cause an infinite loop).
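The convention can be sketched with two simplified classes. The `Process` and `Node` classes below are hypothetical stand-ins declared inside the example; only the method names `addNode` and `setProcess` come from the text above, and their bodies are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Process and Node; bodies are assumptions.
class BidirectionalRelationSketch {

  static class Process {
    private final List<Node> nodes = new ArrayList<>();

    /** Maintains both directions of the relation in one place. */
    Node addNode(Node node) {
      nodes.add(node);
      node.setProcess(this); // update the inverse pointer
      return node;
    }

    List<Node> getNodes() {
      return nodes;
    }
  }

  static class Node {
    private Process process;

    /** Plain setter: it does not call process.addNode(this), which would recurse forever. */
    void setProcess(Process process) {
      this.process = process;
    }

    Process getProcess() {
      return process;
    }
  }

  public static void main(String[] args) {
    Process process = new Process();
    Node node = process.addNode(new Node());
    System.out.println(node.getProcess() == process);
    System.out.println(process.getNodes().size());
  }
}
```

Calling `setProcess` directly in this sketch leaves the node out of the process's list, which is why client code should go through `addNode`.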