Mnemofs's Journal

Continuing a reoccurring theme, to understand this, we need to understand something else. Since the journal stores updated information about a file or a directory, first we need to look into how mnemofs stores a file or a directory.

Count Trailing Zero (CTZ)

On a seemingly unrelated note, have you heard of the CTZ operation? It, as its name suggests, counts the number of trailing zeros in the binary representation of a number.

e.g., 1860 is 11101000100 in binary, and there are 2 trailing zeroes in it. Thus, ctz(1860) == 2.

GNU compilers provide a __builtin_ctz(x) for this, and in C 23, it's become a part of the official standard. Most CPU architectures support this instruction.

CTZ Skip List

A skip list is a modified singly linked list that, instead of containing one pointer per node to point to the next node, contains more pointers in addition to the original pointer. Also, as shown by littlefs (who pioneered the CTZ skip list data structure) (opens in a new tab), Copy-On-Write benefits more from a backward linked list (or a backward skip list) than a forward skip list.

Skip lists prefer to keep the number of pointers per node at random to lower the cost of insertion and deletion. However, a Copy-On-Write file system has no need for "insertion" and "deletion." All of the operations are in the form of "appending" and all modifications requested are done in memory. We'll discuss how a CTZ skip list works in mnemofs, but first we need to know what the structure of a CTZ skip list is.

In CTZ skip lists, each CTZ skip list block at index x has ctz(x) + 1 number of pointers (0 for the 0th CTZ skip list block). Each CTZ skip list block has pointer to the (x - 2^i)th CTZ skip list block for every i such that x is divisible by 2^i. For example, a CTZ skip list block with index 6 will have pointers to 5th and 4th CTZ skip list blocks, while a CTZ skip list block with index 8 will have pointers to 7th, 6th, 4th, and 0th CTZ skip list blocks.

In mnemofs, each CTZ skip list block takes exactly one page of space.

Since it's possible to iterate over a CTZ skip list to reach any CTZ skip list block from the very last CTZ skip list block, only the page number and index of the last CTZ skip list block are stored, along with the size of the file. Mnemofs uses CTZ skip lists like its creator, littefs, does. However, mnemofs uses it to represent files and directories.

Travel and Offset

We'll use the word "offset" to refer to "data offset," which is the offset into the actual data contained in the CTZ skip list, which doesn't include the pointers.

Conversion of the offset into its CTZ skip list block index and page offset can be done through the derivations done by littlefs (opens in a new tab).

Travel from one CTZ skip list block to the other can be done using a greedy approach that utilizes the fact that the powers of 2 that change from one CTZ skip list block to the other while traveling first monotonically increase and then monotonically decrease. The graph might be discontinuous, but that's not an issue. It's actually easier to understand by reading the code in this case.

Journal Logs

Back to the journal. Now, once the updated information on the CTZ skip list is received, it is logged to the journal along with a CTZ skip list representation of the path of the FS object. This log is followed by a checksum to make sure that the entire log was written correctly to the flash.

Structure

The journal consists of blocks from the NAND flash. The last two blocks allocated for the journal are reserved for the master node and are called master blocks (more on that later). The first block starts with an 8-byte magic sequence, followed by the number of blocks (all n + 2 blocks) allocated to the journal, and then an array that contains the block numbers of all the blocks allocated to the journal.

This is like a modified version of a singly linked list. A traditional singly linked list design was not used to allow the mount process to quickly find the master node once the journal was found (more on that later), and it reduces space, as a traditional design would require the last page of every block in the journal to be reserved specifically for storing the block number of the next block.

The first n blocks (out of the n + 2 blocks) store the logs, and once full, the journal gets flushed, but more on that later.

Conclusion

So, this is how the journal works, and the "more on that later" parts will be explained in the next section that discusses the master node of mnemofs.



⌂ Home

MIT 2024 © resyfer.