Introduction

DOGEN is a code generation system that aims to eliminate the hand-coding of persistent Java objects stored in relational databases. DOGEN uses Java's built-in reflection mechanism to examine specially written Java classes which serve as declarations for persistent objects. DOGEN translates these declarations into classes whose instances are persistent and have other useful properties as well.

We're releasing the source for DOGEN under an Enhydra-like license in the hope that other developers will find it useful and contribute to its future.

Audience

Since there's essentially no documentation for DOGEN at the moment (other than what you're reading), you should be very comfortable with Enhydra and JDBC concepts before trying to understand DOGEN.

About the Demo App

To see DOGEN in action, you'll want to compile the Enhydra application in this directory. If you're using the latest release of Enhydra (2.1) you shouldn't have any problems.

You'll also need to do whatever database administration tasks are required so that JDBC can log in and create a session. With Oracle, you'll need to create a user and a tablespace. Other RDBMSs have slightly different setup requirements. Before you run the application, make sure the information in the DogenDemo.conf file is correct.

In order to run the application, you'll need to install the appropriate JDBC drivers and make sure they're available on your classpath. If the application crashes before you can get to the main page, it's probably because the app can't find its JDBC drivers.

Once you've successfully started the application, visit the following URL with your browser:

http://localhost:9000/

There's only one HTML page to the demo app. With it you can create and destroy the SQL schema on the database. The page will show you any data that already exists in the tables, and you can create new Student records. The application itself is trivial. However, it does serve as an example to put DOGEN through its paces.

Theory of Operation

The DOGEN program itself is a collection of Java routines that you merge with your application. The code lives in three packages: dogen.generator, dogen.runtime, and dogen.types. The generator package contains the code generator and driver program for DOGEN. The runtime package contains various classes which your application will use in conjunction with DOGEN-generated code. Finally, the types package defines the mapping between Java types and SQL. Because the mapping of types and the behavior of the different types are not part of the code generator itself, it's easy to extend DOGEN.

Unlike most other approaches to code generation, DOGEN parses class objects rather than formatted files to get its information. This happens through use of Java's built-in reflection mechanism. In essence, DOGEN steps through the fields of a specially "formatted" class and converts declarations like, "public int foo;" into SQL column and table definitions.
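
To make the idea concrete, here's a minimal sketch of stepping through a declaration class with reflection. It is not DOGEN's generator code: WidgetDECL, ColumnSketch and the int/String type mapping are all made up for illustration, and the real mapping lives in dogen.types.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical declaration class, written in the style DOGEN expects.
class WidgetDECL {
    public String label;
    public int count;
    public boolean count$canBeNull = false; // "$" marks a configuration field
}

// Illustrative sketch only; this is not DOGEN's generator code.
class ColumnSketch {
    // Assumed type mapping, hard-coded here to keep the example short.
    static String sqlType(Class<?> javaType) {
        if (javaType == int.class) return "INTEGER";
        if (javaType == String.class) return "VARCHAR(255)";
        throw new IllegalArgumentException("unmapped type: " + javaType.getName());
    }

    // Returns column name -> SQL type for every public field without a "$".
    static Map<String, String> columns(Class<?> decl) {
        Map<String, String> result = new TreeMap<>(); // sorted for stable output
        for (Field f : decl.getFields()) {
            if (f.getName().indexOf('$') >= 0) continue; // skip configuration fields
            result.put(f.getName(), sqlType(f.getType()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(columns(WidgetDECL.class));
    }
}
```

Running the sketch prints {count=INTEGER, label=VARCHAR(255)}. The configuration field count$canBeNull is skipped because its name contains a "$".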

The Schema Declaration

The first thing that DOGEN sees is a schema declaration. In this case, a schema is simply a Java class which contains a public field declaration for each class or table that should be made persistent. The demo application's schema declaration can be found in dogenDemo/data/DogenDemoSchemaDECL.java. In general, the name of any class which DOGEN will parse should end in "DECL".

Here's the interesting part of this file:

public class DogenDemoSchemaDECL { 
  public OracleTypeMapping $typemap;
  public boolean $emitSQLConstraints = false; 
  public ObjectIDDECL table0;
  public StudentDECL table1;
  public TeacherDECL table2; 
}

The definitions of table1 and table2 are the interesting ones. They tell DOGEN to look at the StudentDECL and TeacherDECL classes and to generate SQL tables which correspond to the fields they define. More about that in a minute. The ObjectIDDECL class is also important: it defines the table that Enhydra uses to keep track of OID numbers. It has to be present in every schema, but it's not really a data object declaration like the other two. The interested reader can take a look at dogenDemo/data/ObjectIDDECL.java to see what it contains.

The declaration of $typemap tells DOGEN to use the Oracle-specific mapping of Java types to SQL. The next line after that tells DOGEN to omit any foreign key SQL constraints that it might otherwise add to the schema. Since the constraint mechanism in DOGEN doesn't work all that well yet, it's best to turn it off.

Object Declarations

Another interesting file to look at is dogenDemo/data/EnhydraDataObjectDECL.java. Here's the meat of that file:

package dogenDemo.data;

import dogen.runtime.*;
import dogen.types.*;
import dogen.generator.*;
import java.util.*;
public class EnhydraDataObjectDECL {
    public String $superclass =             // runtime instances are derived
        "dogen.runtime.DataObject";         // from a hand-coded class
    public OID  objid;                      // Enhydra's OID field
    public boolean objid$isKey = true;      // key for this table
    public boolean objid$canBeNull = false; // every object has an OID
    public boolean objid$isUnique = true;   // every object's OID is unique
    public int          version;            // Enhydra's version field
}

This declaration creates the standard Enhydra data object fields. There's an object identifier named "objid" (rather than "oid", to avoid a Postgres reserved word) and an integer version field. This declaration serves as the ultimate ancestor of the other data objects. DOGEN translates this class into EnhydraDataObject (without the trailing "DECL") which, among other things, contains two fields: objid and version. All the other fields that appear in the DECL have a "$" in the name, which tells DOGEN to treat them as configuration parameters for generating the code for this class.

The code generator isn't hard-wired to use EnhydraDataObjectDECL. The other declaration objects in this example simply happen to derive from it. You could easily write some other declaration class (just make sure there's a key field of type OID). In fact, you could have two different "root" classes if there were a reason to.

Here's PersonDECL.java:

package dogenDemo.data;
import dogen.generator.*;


public class PersonDECL extends EnhydraDataObjectDECL {
 public String firstName;
 public int firstName$size = 255;
 public String lastName;
 public int lastName$size = 255;
 public int age;
} 

This declaration creates a class with three instance variables which is derived from EnhydraDataObject. While Java has Strings of unbounded length, SQL needs an upper bound. The declarations that end in $size specify the maximum size of the string for that particular field. Since PersonDECL doesn't appear explicitly in the DogenDemoSchemaDECL class, it will be treated as an abstract class. Its only purpose is to be the parent of StudentDECL and TeacherDECL.
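
If you're wondering how a generator could get at a value like firstName$size, here's one possible approach, sketched with plain reflection. RootDECL, PersonSketchDECL and SizeSketch are stand-ins invented for this example, and DOGEN's real lookup logic may well differ. Because the $size fields are ordinary instance fields with initializers, the sketch creates an instance of the declaration and reads the value from it.

```java
import java.lang.reflect.Field;

// Simplified stand-ins for the demo's declaration classes (hypothetical).
class RootDECL {
    public int version;
}

class PersonSketchDECL extends RootDECL {
    public String firstName;
    public int firstName$size = 255;
    public String lastName;
    public int lastName$size = 100;
}

// Illustrative only; DOGEN's real lookup logic may differ.
class SizeSketch {
    // Reads the "<field>$size" initializer from a fresh instance of the declaration.
    static int sizeOf(Class<?> decl, String fieldName) throws ReflectiveOperationException {
        Object instance = decl.getDeclaredConstructor().newInstance();
        Field sizeField = decl.getField(fieldName + "$size");
        return sizeField.getInt(instance);
    }

    // True when the named public field comes from an ancestor declaration.
    static boolean isInherited(Class<?> decl, String fieldName) throws NoSuchFieldException {
        return decl.getField(fieldName).getDeclaringClass() != decl;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sizeOf(PersonSketchDECL.class, "lastName"));
        System.out.println(isInherited(PersonSketchDECL.class, "version"));
    }
}
```

Class.getField() also finds public fields declared in superclasses, which is why the same lookup works for fields inherited from a root declaration.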

Data Object Semantics

Most of the classes that DOGEN generates are for data objects. Each data object represents a single row in its table. Because the ultimate ancestor for any data object is required to have a key field of type OID, DOGEN guarantees that every object has a unique identifier. Since the identifiers for all data objects are unique, DOGEN guarantees that there's never more than one data object which corresponds to a particular identifier. In other words, there's going to be at most one Java object that corresponds to a given row in the database.

This means you can use Java's '==' operator (rather than the equals() method) to test whether two Java objects are the same. It also means you shouldn't write code which modifies a "copy" of a data object which is then discarded.
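
The "one object per row" guarantee is commonly implemented with an identity map keyed by OID. The sketch below shows that general technique only. Pupil and PupilCache are hypothetical names, and nothing here is taken from DOGEN's runtime.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical data object; real DOGEN data classes are generated.
class Pupil {
    final String oid;
    String firstName;

    Pupil(String oid) { this.oid = oid; }
}

// Illustrative identity map; not DOGEN's runtime.
class PupilCache {
    private final Map<String, Pupil> byOid = new HashMap<>();

    // Returns the one shared instance for this OID, creating it on first use.
    synchronized Pupil lookup(String oid) {
        return byOid.computeIfAbsent(oid, Pupil::new);
    }

    public static void main(String[] args) {
        PupilCache cache = new PupilCache();
        Pupil a = cache.lookup("42");
        Pupil b = cache.lookup("42");
        a.firstName = "Ada";
        System.out.println(a == b);       // same instance, so == is enough
        System.out.println(b.firstName);  // the change made through a is visible through b
    }
}
```

Because both lookups return the same instance, a change made through one reference is visible through the other, which is why working on a throwaway "copy" makes no sense.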

Transactions

DOGEN provides some support for transactions in its API. Any code which creates, deletes or modifies data objects should be wrapped in a begin()/end() pair. Here's an example:

try {
 DogenDemoSchema.begin();
 Student s = new Student();
 s.setFirstName(firstName); 
 s.setLastName(lastName);
 s.setAge(Integer.parseInt(age));
 s.setMajor(major); 
 DogenDemoSchema.end();
} catch (Exception e) {
 DogenDemoSchema.abort();
}

The schema objects generated by DOGEN provide the transaction API. In addition to begin() and end(), there's also abort(), which discards the current transaction. When a transaction is aborted, any objects affected by that transaction have their attributes invalidated. The next call to an access method of one of those objects will query the database for the "correct" attribute values. Transactions can be nested and each thread has its own distinct transaction context.
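
To illustrate how nesting and per-thread contexts can fit together, here's a small sketch built on ThreadLocal. TxSketch is a hypothetical class with no database behind it. It also assumes that abort() unwinds every nesting level for the current thread, which the DOGEN API may or may not do.

```java
// Illustrative only; names are hypothetical and no database is involved.
class TxSketch {
    // Each thread gets its own nesting depth.
    private static final ThreadLocal<Integer> DEPTH = ThreadLocal.withInitial(() -> 0);

    static void begin() {
        DEPTH.set(DEPTH.get() + 1);
    }

    // Returns true when the outermost transaction ends (where a commit would go).
    static boolean end() {
        int d = DEPTH.get();
        if (d == 0) throw new IllegalStateException("end() without begin()");
        DEPTH.set(d - 1);
        return d == 1;
    }

    // Assumption: abort discards the whole transaction stack of this thread.
    static void abort() {
        DEPTH.set(0);
    }

    static int currentDepth() {
        return DEPTH.get();
    }

    public static void main(String[] args) {
        begin();
        begin();                      // nested transaction
        System.out.println(end());    // inner end: nothing to commit yet
        System.out.println(end());    // outer end: this is the commit point
    }
}
```

In this design only the outermost end() is a commit point, so code that opens its own begin()/end() pair can be called safely from inside a larger transaction.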

Queries

For each data object referenced in a schema declaration, DOGEN creates two query classes. The first one is a general purpose query which will return all objects of that particular class. You can use this class directly or, more typically, subclass it to perform more interesting queries. DOGEN handles all the object formatting duties. All you have to do is supply a SELECT statement that returns the objects you want. The following complicated example extends the DOGEN-generated CommentAnswerQuery class with a hand-coded SELECT statement:

import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ComplicatedQuery extends CommentAnswerQuery {
  protected ReportDO report;
  boolean retrieveSelfComments;

  // Captures the report and the self-comment flag used as query parameters.
  public ComplicatedQuery(ReportDO report,
                          boolean retrieveSelfComments)
  {
    this.report = report;
    this.retrieveSelfComments = retrieveSelfComments;
  }

  // Binds the two "?" placeholders of the template below, in order.
  protected void setStatementParameters(PreparedStatement s)
      throws SQLException
  {
    s.setString(1, report.get$OID().toString());
    s.setBoolean(2, retrieveSelfComments);
  }

  // Selects the answers given by this report's raters.
  protected String getSqlTemplate() {
    return
      "SELECT * FROM CommentAnswer WHERE " +
      "RaterOId IN " +
        "(SELECT Rater.objid from Rater, ReportRaters WHERE " +
                "ReportRaters.ReportOId = ? AND " +
                "Rater.objid = ReportRaters.RaterOId AND " +
                "Rater.Self = ?)";
  }
}

DOGEN's approach to queries is based on the belief that SQL provides a great way to describe database queries. The generated code tries to handle all the other data management details while making it easy for the programmer to use the power of SELECT.

The second query class generated by DOGEN lets you query the database for objects by their identifiers. The constructors for these queries let you pass in one or more OIDs (in String form).
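
As a rough idea of what a lookup by identifier involves, the sketch below builds a SELECT template with one placeholder per OID and binds the OID strings to it. OidQuerySketch and its methods are hypothetical, and the code DOGEN actually generates may look quite different. Only the "objid" column name is borrowed from the demo.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Illustrative only; not the query class that DOGEN generates.
class OidQuerySketch {
    // Builds "SELECT * FROM <table> WHERE objid IN (?, ?, ...)", one "?" per OID.
    static String template(String table, int oidCount) {
        if (oidCount < 1) throw new IllegalArgumentException("need at least one OID");
        String marks = String.join(", ", Collections.nCopies(oidCount, "?"));
        return "SELECT * FROM " + table + " WHERE objid IN (" + marks + ")";
    }

    // Binds each OID string to its placeholder; JDBC parameters are 1-based.
    static void bind(PreparedStatement s, List<String> oids) throws SQLException {
        for (int i = 0; i < oids.size(); i++) {
            s.setString(i + 1, oids.get(i));
        }
    }

    public static void main(String[] args) {
        System.out.println(template("Student", 3));
    }
}
```

Generating the placeholders rather than pasting the OID strings into the SQL keeps the statement safe to prepare no matter what the strings contain.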

Business Objects

DOGEN doesn't create business objects.

Schemas

In addition to providing a transaction API, the schema objects let you download a schema into your database. There are also methods to tell whether the schema exists in the database and to remove it. Here's a code snippet that will initialize a new schema and set up the tables in the database:

if(DogenDemoSchema.getInstance() == null) {
   new DogenDemoSchema(Enhydra.getApplication().getConfig(),
                       null); // null means the default db


   if(!DogenDemoSchema.exists()) {
      DogenDemoSchema.create();
   }
}

Build System

At the moment, the DOGEN source code is copied into the tree for any application which uses it. This means that you'll need to copy the classes and modify the top-level Makefile to ensure that the dogen package is compiled before the rest of the application.

You should also examine the Makefile in the dogenDemo/data directory because it contains the rules that will invoke DOGEN and then compile all the generated code.

To Do List