Journaling


Introduction

The goal of both Journaling and Soft-Updates is to speed up file writes, especially the writing of meta-data.

Journaling does this by writing all the meta-data changes en bloc to a contiguous log before the data write. If a crash occurs during the write, the logged changes are either replayed or undone.

The Journal

The journal is a contiguous, circular buffer that stores the meta-data. It can be a file in the current partition, a partition of its own, or even a separate disk; the separate disk is the most efficient option.

A journal contains three types of entries:

  • the actual meta-data
  • descriptors which say where the meta-data belongs, and
  • a start and end marker

Before each write of real data, the meta-data is written to the log and the end marker is moved.

After the data and the meta-data writes have completed, the start marker moves to reflect the new start of the journal.

(If a journal contains several entries, they may be completed out of their log sequence. In that case the system remembers which parts are complete and, once the oldest entry is on disk as well, moves the start marker forward past all of them.)
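The marker movement described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the layout of any particular file system: the class name `Journal`, its methods, and the use of ever-growing sequence numbers that wrap via modulo are all assumptions made for the example.

```python
class Journal:
    """Toy circular journal (hypothetical names; memory only, no real disk I/O)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot holds (descriptor, metadata)
        self.start = 0                  # start marker: oldest entry still needed
        self.end = 0                    # end marker: position of the next entry
        self.done = set()               # entries completed out of log order

    def append(self, descriptor, metadata):
        """Log a meta-data change before the data write; return its sequence number."""
        if self.end - self.start >= self.capacity:
            raise RuntimeError("journal full")
        seq = self.end
        self.slots[seq % self.capacity] = (descriptor, metadata)
        self.end += 1                   # the end marker moves
        return seq

    def complete(self, seq):
        """Record that the data and meta-data writes for entry `seq` are on disk."""
        if not (self.start <= seq < self.end):
            raise ValueError("entry is not open")
        self.done.add(seq)
        # Advance the start marker over the contiguous run of completed entries.
        while self.start in self.done:
            self.done.remove(self.start)
            self.slots[self.start % self.capacity] = None
            self.start += 1


journal = Journal(capacity=4)
a = journal.append("inode 12", b"size=4096")
b = journal.append("dir 3", b"add entry 'x'")

journal.complete(b)          # finished out of order: start marker cannot move yet
assert journal.start == 0
journal.complete(a)          # oldest entry done: marker moves past both entries
assert journal.start == 2
```

Keeping the set of out-of-order completions separate from the start marker means the marker only ever describes a contiguous, fully written prefix of the log.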

Crash recovery

If a crash occurs during operation, the file system may be left in an inconsistent state. This inconsistency may come from any of the operations still open in the log. To repair the system, the log is read, the start and end markers are located, and the open operations are either all undone or all completed.

This takes a little time, but far less than checking the entire file system.
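As a rough illustration of the "completed" variant of recovery, the sketch below replays every entry between the two markers. It assumes each log entry holds the complete new contents of its target block, so that applying an entry twice is harmless; the function name `replay` and the dictionary standing in for the disk are hypothetical.

```python
def replay(log, start, end, disk):
    """Re-apply all logged changes between the start and end markers.

    `log` is the circular buffer, `disk` a dict mapping targets to contents.
    Returns the number of entries replayed.
    """
    capacity = len(log)
    replayed = 0
    for seq in range(start, end):
        target, metadata = log[seq % capacity]
        disk[target] = metadata   # idempotent: rewriting an applied block is harmless
        replayed += 1
    return replayed


# State after a simulated crash: "inode 7" reached the disk, "dir 2" did not.
log = [("inode 7", b"new"), ("dir 2", b"new"), None, None]
disk = {"inode 7": b"new", "dir 2": b"old", "inode 99": b"untouched"}

count = replay(log, start=0, end=2, disk=disk)
assert count == 2
assert disk["dir 2"] == b"new"
```

The loop visits only the open entries, which is why the cost depends on the amount of outstanding work in the log and not on the size of the file system.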

Efficiency

At first glance, logging increases the number of disk writes: meta-data is written not once but twice, first to the journal and then to the real file system.

This effect is dwarfed by the savings that come from writing the meta-data asynchronously and from writing the journal in large chunks.

Both techniques reduce disk seeks, the most expensive part of a write. In addition, not all operations that are logged need to be carried out: if, for example, a directory is changed twice in short succession, only the second state needs to reach the disk.
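The directory example can be made concrete with a small sketch. It assumes that each pending log entry carries the full new state of its target, so a later entry for the same target supersedes an earlier one; the function name `coalesce` and the string targets are made up for illustration.

```python
def coalesce(pending):
    """Keep only the most recent logged state for each target.

    `pending` is a list of (target, state) pairs in log order.
    """
    latest = {}
    for target, state in pending:
        latest[target] = state    # a later entry overwrites an earlier one
    return latest


pending = [
    ("dir 3", "contains: a"),
    ("inode 9", "size=512"),
    ("dir 3", "contains: a, b"),  # second change to the same directory
]

writes = coalesce(pending)
assert len(writes) == 2           # three logged changes, two writes to the file system
assert writes["dir 3"] == "contains: a, b"
```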

In tests reported in the literature, Journaling yielded a performance increase of 50 to 400 % over standard operation.

Links