From 5d026c7242201e7c9d0e12fcb2bcaffead9d59fd Mon Sep 17 00:00:00 2001
From: Kai Bankett <chaosman@ontika.net>
Date: Fri, 17 Feb 2012 05:59:20 +0100
Subject: fs: initial qnx6fs addition

Adds support for qnx6fs readonly support to the linux kernel.

* Mount option
  The option mmi_fs can be used to mount Harman Becker/Audi MMI 3G
  HDD qnx6fs filesystems.

* Documentation
  A high level filesystem stucture description can be found in the
  Documentation/filesystems directory. (qnx6.txt)

* Additional features
  - Active (stable) superblock selection
  - Superblock checksum check (enforced)
  - Supports mount of qnx6 filesystems with to host different endianess
  - Automatic endianess detection
  - Longfilename support (with non-enfocing crc check)
  - All blocksizes (512, 1024, 2048 and 4096 supported)

Signed-off-by: Kai Bankett <chaosman@ontika.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 Documentation/filesystems/qnx6.txt | 174 +++++++++++++++++++++++++++++++++++++
 1 file changed, 174 insertions(+)
 create mode 100644 Documentation/filesystems/qnx6.txt

(limited to 'Documentation')

diff --git a/Documentation/filesystems/qnx6.txt b/Documentation/filesystems/qnx6.txt
new file mode 100644
index 000000000000..050223ea03c7
--- /dev/null
+++ b/Documentation/filesystems/qnx6.txt
@@ -0,0 +1,174 @@
+The QNX6 Filesystem
+===================
+
+The qnx6fs is used by newer QNX operating system versions. (e.g. Neutrino)
+It got introduced in QNX 6.4.0 and is used default since 6.4.1.
+
+Option
+======
+
+mmi_fs		Mount filesystem as used for example by Audi MMI 3G system
+
+Specification
+=============
+
+qnx6fs shares many properties with traditional Unix filesystems. It has the
+concepts of blocks, inodes and directories.
+On QNX it is possible to create little endian and big endian qnx6 filesystems.
+This feature makes it possible to create and use a different endianness fs
+for the target (QNX is used on quite a range of embedded systems) plattform
+running on a different endianess.
+The Linux driver handles endianness transparently. (LE and BE)
+
+Blocks
+------
+
+The space in the device or file is split up into blocks. These are a fixed
+size of 512, 1024, 2048 or 4096, which is decided when the filesystem is
+created.
+Blockpointers are 32bit, so the maximum space that can be adressed is
+2^32 * 4096 bytes or 16TB
+
+The superblocks
+---------------
+
+The superblock contains all global information about the filesystem.
+Each qnx6fs got two superblocks, each one having a 64bit serial number.
+That serial number is used to identify the "active" superblock.
+In write mode with reach new snapshot (after each synchronous write), the
+serial of the new master superblock is increased (old superblock serial + 1)
+
+So basically the snapshot functionality is realized by an atomic final
+update of the serial number. Before updating that serial, all modifications
+are done by copying all modified blocks during that specific write request
+(or period) and building up a new (stable) filesystem structure under the
+inactive superblock.
+
+Each superblock holds a set of root inodes for the different filesystem
+parts. (Inode, Bitmap and Longfilenames)
+Each of these root nodes holds information like total size of the stored
+data and the adressing levels in that specific tree.
+If the level value is 0, up to 16 direct blocks can be adressed by each
+node.
+Level 1 adds an additional indirect adressing level where each indirect
+adressing block holds up to blocksize / 4 bytes pointers to data blocks.
+Level 2 adds an additional indirect adressig block level (so, already up
+to 16 * 256 * 256 = 1048576 blocks that can be adressed by such a tree)a
+
+Unused block pointers are always set to ~0 - regardless of root node,
+indirect adressing blocks or inodes.
+Data leaves are always on the lowest level. So no data is stored on upper
+tree levels.
+
+The first Superblock is located at 0x2000. (0x2000 is the bootblock size)
+The Audi MMI 3G first superblock directly starts at byte 0.
+Second superblock position can either be calculated from the superblock
+information (total number of filesystem blocks) or by taking the highest
+device address, zeroing the last 3 bytes and then substracting 0x1000 from
+that address.
+
+0x1000 is the size reserved for each superblock - regardless of the
+blocksize of the filesystem.
+
+Inodes
+------
+
+Each object in the filesystem is represented by an inode. (index node)
+The inode structure contains pointers to the filesystem blocks which contain
+the data held in the object and all of the metadata about an object except
+its longname. (filenames longer than 27 characters)
+The metadata about an object includes the permissions, owner, group, flags,
+size, number of blocks used, access time, change time and modification time.
+
+Object mode field is POSIX format. (which makes things easier)
+
+There are also pointers to the first 16 blocks, if the object data can be
+adressed with 16 direct blocks.
+For more than 16 blocks an indirect adressing in form of another tree is
+used. (scheme is the same as the one used for the superblock root nodes)
+
+The filesize is stored 64bit. Inode counting starts with 1. (whilst long
+filename inodes start with 0)
+
+Directories
+-----------
+
+A directory is a filesystem object and has an inode just like a file.
+It is a specially formatted file containing records which associate each
+name with an inode number.
+'.' inode number points to the directory inode
+'..' inode number points to the parent directory inode
+Eeach filename record additionally got a filename length field.
+
+One special case are long filenames or subdirectory names.
+These got set a filename length field of 0xff in the corresponding directory
+record plus the longfile inode number also stored in that record.
+With that longfilename inode number, the longfilename tree can be walked
+starting with the superblock longfilename root node pointers.
+
+Special files
+-------------
+
+Symbolic links are also filesystem objects with inodes. They got a specific
+bit in the inode mode field identifying them as symbolic link.
+The directory entry file inode pointer points to the target file inode.
+
+Hard links got an inode, a directory entry, but a specific mode bit set,
+no block pointers and the directory file record pointing to the target file
+inode.
+
+Character and block special devices do not exist in QNX as those files
+are handled by the QNX kernel/drivers and created in /dev independant of the
+underlaying filesystem.
+
+Long filenames
+--------------
+
+Long filenames are stored in a seperate adressing tree. The staring point
+is the longfilename root node in the active superblock.
+Each data block (tree leaves) holds one long filename. That filename is
+limited to 510 bytes. The first two starting bytes are used as length field
+for the actual filename.
+If that structure shall fit for all allowed blocksizes, it is clear why there
+is a limit of 510 bytes for the actual filename stored.
+
+Bitmap
+------
+
+The qnx6fs filesystem allocation bitmap is stored in a tree under bitmap
+root node in the superblock and each bit in the bitmap represents one
+filesystem block.
+The first block is block 0, which starts 0x1000 after superblock start.
+So for a normal qnx6fs 0x3000 (bootblock + superblock) is the physical
+address at which block 0 is located.
+
+Bits at the end of the last bitmap block are set to 1, if the device is
+smaller than addressing space in the bitmap.
+
+Bitmap system area
+------------------
+
+The bitmap itself is devided into three parts.
+First the system area, that is split into two halfs.
+Then userspace.
+
+The requirement for a static, fixed preallocated system area comes from how
+qnx6fs deals with writes.
+Each superblock got it's own half of the system area. So superblock #1
+always uses blocks from the lower half whilst superblock #2 just writes to
+blocks represented by the upper half bitmap system area bits.
+
+Bitmap blocks, Inode blocks and indirect addressing blocks for those two
+tree structures are treated as system blocks.
+
+The rational behind that is that a write request can work on a new snapshot
+(system area of the inactive - resp. lower serial numbered superblock) while
+at the same time there is still a complete stable filesystem structer in the
+other half of the system area.
+
+When finished with writing (a sync write is completed, the maximum sync leap
+time or a filesystem sync is requested), serial of the previously inactive
+superblock atomically is increased and the fs switches over to that - then
+stable declared - superblock.
+
+For all data outside the system area, blocks are just copied while writing.
-- 
cgit v1.2.3