Specifies the interface to the Write Ahead Logging (WAL) system.

Introduction

The Write Ahead Log plays a crucial role in a DBMS. It provides the basis for recoverability. It is also a critical part of the system that has a massive impact on performance of an OLTP system.

Conceptually, the Log can be thought of as an ever growing sequential file. In the form of Log Records, the Log contains a history of all changes made to the database. Each Log Record is uniquely identified by a number called the Log Sequence Number (LSN). The LSN is designed in such a way that given an LSN, the system can locate the corresponding Log Record quickly. LSNs are assigned in strict ascending order (monotonicity). This is an important property when it comes to recovery.

During the progress of a Transaction, the a DBMS records in the Log all the changes made by the transaction. The Log records can be used to recover the system if there is a failure, or they can be used to undo the changes made by a transaction.

Initially, Log Records are stored in memory. They are flushed to disk during transaction commits, and also during checkpoints. In the event of a crash, it is possible to lose the log records that were not flushed to disk. This does not cause a problem, however, because by definition these log records must correspond to changes made by incomplete transactions. Also, the WAL protocol (described below) ensures that such Log records do contain changes that have already been persisted within the database.

Write Ahead Log (WAL) Protocol

The WAL protocol requires the following conditions to hold true:

  1. All changes made by a transaction must be saved to the Log and the Log must be flushed to disk before the transaction is committed.
  2. A database buffer page may not be modified until its modifications have been logged. A buffer page may not be saved to disk until all its associated log records have been saved to disk.
  3. While the buffer page is being modified and the Log is being updated, an Exclusive latch (a type of fast lock) must be held on the page to ensure that order in which changes are recorded in the Log correspond to the order in which they were made.

Consequences of above rules are:

  1. If a Log Record was not saved to disk, it can be safely ignored, because any changes contained in it are guaranteed to belong to uncommitted transactions. Also, such Log Records cannot represent changes that have been made persistent in the database.
  2. Log records represent changes to the system in the correct order. The latching protocol ensures that if two Log records represent changes to the same Page, then the ordering of these records reflects the order in which the changes were made to the page.

Advantages of WAL

Typically, in an OLTP system, updates tend to be random and can affect different parts of the disk at a point in time. In comparison, writes to the Log are always sequential. If it were necessary to flush all changes made by the DBMS to disk at commit time, it would have a massive impact on performance because of the randomness of the disk writes. However, in a WAL system, only the Log needs to be flushed to disk at Commit. Thus, the Log has the effect of transforming random writes into serial writes, thereby improving performance significantly.

Design Issues

There are several design decisions that must be made when implementing a Write Ahead Log.

  1. Should the Log Manager share the page buffering system used by the DBMS to manage IO to data files? The database page buffering system usually operates for fixed size pages, whereas log records are typically variable length. Using a page buffering system would require mapping of log records to pages. There is also the chicken and egg problem. Changes made to disk pages are supposed to be logged in the write ahead log. Having the log system depend on the page buffer system would introduces a cyclical dependency, which is best to avoid. Last, but not the least, log systems usually require very high io performance, and can take advantage of the fact that log writes are always sequential. A custom algorithm for handling log writes is likely to achieve better performance than the more generic solution in page buffering systems.
  2. Should Log Records be allowed to span multiple log files? It is easier to keep things simple and insist that the largest of log records must fit into an individual log file. Clients that require larger log records can always split them into multiple records.
  3. Should log files be allocated as and when needed? There is often a middle ground; a fixed set of online log files can be permanently allocated. Having a fixed set of online log files makes the system more reliable, because there is no need to allocate a file when writing to the Log. New files can be allocated when archiving log files. This ensures that even if the log archival fails, the main log files will remain intact.
  4. What is the level of concurrency that should be supported in the log system? Using fine-grained locking increases the complexity of the code enormously, but should result in greater scalability. Concurrency solution can take advantage of the fact that log writes always occur at the end of the log.