At present, plenty of electronic devices are designed to store information in electronic form: personal and server computers, personal digital assistants (PDAs), mobile phones and so on. These devices store huge amounts of personal or corporate information. The information is stored in the form of so-called 'files', which take up some amount of 'storage space' and contain the actual information: documents, presentations, pictures, music, video, databases, email messages etc.
If you are interested in recovering lost information, it is recommended to have a basic understanding of how information is stored on a computer device.
Before understanding file systems, you must know what a file is. So what is a file?
According to Wikipedia:-
A computer file is a block of arbitrary information, or resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is durable in the sense that it remains available for programs to use after the current program has finished. Computer files can be considered as the modern counterpart of paper documents which traditionally are kept in offices' and libraries' files, and this is the source of the term.
OR
In simpler terms:- A file is the smallest allotment of logical secondary storage.
What is a file system?
According to Wikipedia:-
A file system (or filesystem) is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device(s) which contain it. A file system organizes data in an efficient manner and is tuned to the specific characteristics of the device.
Now let us go deeper to find a more accurate definition of file systems (long but informative).
Any computer file is stored on some kind of storage: a hard disk, CD, DVD, flash memory and so on. Each storage device has a specific, model-dependent capacity. From the software point of view (including the operating system), each storage device is a linear space from which digital information can be read, or both read and written. Each byte on a storage device has its own offset from the start of the storage (its address) and can be referenced by this address. You may imagine storage as a grid of numbered cells, each cell holding a single byte. Any file saved to storage takes up a number of these cells.
Historically, computer storage devices such as hard disks, CDs, DVDs or flash memory use a pair of values, a sector and an in-sector offset, to reference any byte of information. The sector is a group of bytes (usually 512 bytes) that is the minimum addressable unit of the physical storage. For example, byte 1040 on a hard disk is referenced as sector #3 with an in-sector offset of 16 bytes ([sector]+[sector]+[16 bytes]). This scheme optimizes storage addressing and allows smaller numbers to be used to reference any portion of information on the storage.
To omit the second part of the address (the in-sector offset), files are usually stored starting at a sector boundary and occupy whole sectors (e.g. a 10-byte file takes a whole sector, a 512-byte file also takes one sector, a 514-byte file takes two whole sectors and so on).
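The sector arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, sectors are numbered from 1 to match the "sector #3" wording, and the 512-byte sector size is the usual value rather than a universal one.

```python
SECTOR_SIZE = 512  # bytes per sector (the usual value; an assumption here)


def locate_byte(address, sector_size=SECTOR_SIZE):
    """Return (sector_number, offset_in_sector) for a byte address.

    Sectors are numbered from 1 to match the 'sector #3' wording.
    """
    sector_index, offset = divmod(address, sector_size)
    return sector_index + 1, offset


def sectors_needed(file_size, sector_size=SECTOR_SIZE):
    """Number of whole sectors occupied by a file of file_size bytes."""
    return -(-file_size // sector_size)  # ceiling division


# Two full sectors (1024 bytes) plus 16 bytes
print(locate_byte(1040))  # (3, 16)

for size in (10, 512, 514):
    print(size, "bytes ->", sectors_needed(size), "sector(s)")
```

The last loop reproduces the 10, 512 and 514 byte cases: one, one and two sectors respectively.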
Each file is stored in 'unused' sectors and can later be read using its known position and size. However, how do we know which sectors are used or unused? Where are the file size and position stored? Where is the file name? The file system gives us these answers.
A file system is a kind of structured data representation on a storage device, plus a set of metadata describing the stored data. Unlike plain storage, a file system can be located on a disk partition, an isolated segment of storage. It usually operates on blocks, not sectors. File system blocks are groups of sectors intended to optimize storage addressing. Modern file systems generally use block sizes from 1 to 128 sectors (512-65536 bytes). Files are usually stored from the start of a block and take up entire blocks.
Many write/delete operations on a file system can cause fragmentation: files can no longer be stored as contiguous wholes and are divided into fragments. Here is an example of “fragmentation”: imagine a storage entirely taken up by files of about 4 blocks each (e.g. a picture collection). A user wants to store a file that would take 8 blocks and therefore deletes the first and the last file. This releases 8 blocks; however, the first free segment is near the start of the storage and the second near the end. In this case the 8-block file will be split into two parts (4 blocks each) that fill the free-space 'holes'. Information about both fragments, which are parts of a single file, will be stored in the file system.
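The fragmentation scenario can be simulated with a small Python sketch. It assumes a hypothetical storage of 24 blocks (six 4-block files) and a very simple allocator that takes free blocks in order; real file systems use more elaborate allocation strategies.

```python
def allocate(free_map, blocks_needed):
    """Take free blocks in order and return the fragments used.

    free_map is a list of booleans (True = block is free) and is updated
    in place. Each fragment is a (first_block, length) pair.
    """
    if sum(free_map) < blocks_needed:
        raise ValueError("not enough free blocks")
    fragments = []
    remaining = blocks_needed
    i = 0
    while i < len(free_map) and remaining > 0:
        if free_map[i]:
            start = i
            # Extend the fragment while blocks are free and still needed
            while i < len(free_map) and free_map[i] and remaining > 0:
                free_map[i] = False
                remaining -= 1
                i += 1
            fragments.append((start, i - start))
        else:
            i += 1
    return fragments


# Hypothetical storage: 6 files of 4 blocks each, completely full
free_map = [False] * 24

# Delete the first and the last file (blocks 0-3 and 20-23)
for block in list(range(0, 4)) + list(range(20, 24)):
    free_map[block] = True

fragments = allocate(free_map, 8)
print(fragments)  # [(0, 4), (20, 4)]
```

The 8-block file ends up in two 4-block fragments at opposite ends of the storage, and the file system has to remember both.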
Apart from user files, a file system also stores its own parameters (such as block size), file descriptors (which include file size, file location, its fragments etc.), file names and the directory hierarchy. It may also store security information, extended attributes and other parameters.
There are many requirements for performance, stability and other qualities of a file system. Many different types of file systems have been developed to best suit specific purposes, so at present there are plenty of file systems used on different types of computer systems.
Now let's quickly look at various file systems and their features:-
Various file systems:-
- FAT (File Allocation Table):
- Introduction:-
FAT is one of the simplest types of file systems. It consists of a file system descriptor sector (boot sector or superblock), a file system block allocation table (referred to as the File Allocation Table) and plain storage space to store files and folders. Files on FAT are stored in directories. Each directory is an array of 32-byte records, each of which defines a file or a file's extended attributes (such as a long file name). A file record references the first block of the file. Each subsequent block can be found through the block allocation table, which is used as a linked list.
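To make the 32-byte directory record more concrete, here is a hedged Python sketch that unpacks one such record. The field layout follows the commonly documented FAT short-name entry, which is not spelled out in this article, so treat it as an assumption; the sample record is fabricated purely for the demonstration.

```python
import struct

# Assumed layout of a 32-byte FAT short-name directory record (little-endian):
# name(8) ext(3) attr(1) reserved(1) ctime_tenths(1) ctime(2) cdate(2)
# adate(2) first_block_high(2) mtime(2) mdate(2) first_block_low(2) size(4)
DIR_ENTRY = struct.Struct("<8s3sBBBHHHHHHHI")


def parse_dir_entry(raw):
    """Decode one directory record into a small dictionary."""
    (name, ext, attr, _reserved, _ctime_tenths, _ctime, _cdate, _adate,
     block_high, _mtime, _mdate, block_low, size) = DIR_ENTRY.unpack(raw)
    return {
        "name": name.decode("ascii").rstrip(),
        "ext": ext.decode("ascii").rstrip(),
        "is_directory": bool(attr & 0x10),  # 0x10 is the directory flag
        "first_block": (block_high << 16) | block_low,
        "size": size,
    }


# Made-up sample record: README.TXT, 1234 bytes, starting at block 5
sample = DIR_ENTRY.pack(b"README  ", b"TXT", 0x20, 0, 0, 0, 0, 0, 0, 0, 0, 5, 1234)
entry = parse_dir_entry(sample)
print(DIR_ENTRY.size, entry)
```

The `first_block` value is the entry point into the allocation table; everything after the first block has to be looked up there.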
The block allocation table contains an array of block descriptors. A zero value indicates that the block is not used; a non-zero value is either a reference to the next block of the file or a special value marking the end of the file.
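Following a file through the allocation table works like walking a linked list. The sketch below is a simplified model: the table is a plain Python list, and `END_OF_FILE` is a stand-in value chosen for this example (real FAT variants use reserved high values for the end-of-file marker).

```python
FREE = 0          # zero means the block is not used
END_OF_FILE = -1  # stand-in for the special end-of-file value


def file_blocks(fat, first_block):
    """Return the list of blocks of a file, in order, starting at first_block."""
    blocks = []
    seen = set()
    current = first_block
    while current != END_OF_FILE:
        # A loop or a free block inside a chain means the table is damaged
        if current in seen or fat[current] == FREE:
            raise ValueError("corrupt allocation chain")
        seen.add(current)
        blocks.append(current)
        current = fat[current]  # descriptor holds the number of the next block
    return blocks


# Tiny made-up table with 8 blocks; one file occupies blocks 2 -> 5 -> 4
fat = [FREE, FREE, 5, FREE, END_OF_FILE, 4, FREE, FREE]
print(file_blocks(fat, 2))  # [2, 5, 4]
```

Note that the blocks of the file do not have to be adjacent or even in ascending order; the table alone defines the sequence.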
The number in the FAT12, FAT16 and FAT32 names indicates how many bits are used to number a file system block. This means that FAT12 can use up to 4096 different block references, FAT16 up to 65536 and FAT32 up to 4294967296. The actual maximum block count is lower still and depends on the file system driver implementation; FAT32, for example, uses only 28 of its 32 bits for block numbers.
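These figures are simply powers of two, as the short calculation below shows. The 32 KiB block size used for the size estimate is an assumption made for this example, not a value taken from the text.

```python
def max_block_references(bits):
    """Number of distinct values an index of the given bit width can hold."""
    return 2 ** bits


for name, bits in (("FAT12", 12), ("FAT16", 16), ("FAT32", 32)):
    print(name, max_block_references(bits))

# Rough upper bound on volume size: block count times block size.
BLOCK_SIZE = 32 * 1024  # assumed 32 KiB blocks
fat16_limit = max_block_references(16) * BLOCK_SIZE
print("FAT16 with 32 KiB blocks:", fat16_limit // 2**30, "GiB")
```

With these assumptions the FAT16 bound works out to 2 GiB, which is where the familiar 2GB limit comes from.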
- Versions of the FAT file system:-
FAT12 - The initial FAT version used by MS-DOS; the FAT design itself dates from 1977.
• Primary file system for Microsoft operating systems up to MS-DOS 4.0.
FAT16 - Introduced in 1988; primary file system from MS-DOS 4.0 up to Windows 95.
• Supports drive sizes up to 2 GB.
FAT32 - The latest version of FAT, introduced in 1996 for Windows 95 OSR2 users.
• Supports drive sizes up to 8 TB in theory.
- Applications:-
FAT12 was used for old floppy disks. FAT16 (or simply FAT) and FAT32 are widely used for flash memory cards, USB flash sticks and so on. They are supported by mobile phones, digital cameras and other portable devices.
FAT or FAT32 is typically the file system found on Windows-compatible external storage or disk partitions smaller than 2GB (for FAT) or 32GB (for FAT32). Windows cannot create a FAT32 file system larger than 32GB (however, Linux supports FAT32 up to 2TB).
- NTFS (New Technology File System):
- Introduction:-
NTFS was introduced in Windows NT and at present is the main file system for Windows. It is the default file system for disk partitions and the only file system that is supported for disk partitions over 32GB. The file system is quite extensible and supports many file properties, including access control, encryption etc. Each file on NTFS is stored as a file descriptor in the Master File Table plus the file content. The Master File Table contains all information about the file: size, allocation, name and so on. The first and the last sectors of the file system contain file system settings (the boot record or superblock). The file system uses 48- and 64-bit values to reference files, so it supports quite large disk storage.
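- Features:-
1. It uses 64-bit disk addresses and can support disk partitions up to 2^64 bytes.
2. Individual file names in NTFS are limited to 255 characters. Names are case-preserving; NTFS itself can store names that differ only in case, although Windows treats names as case-insensitive by default.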
3. Encryption & Data recovery.
4. Compression.
5. File level security.
- File encryption:-
The Encrypting File System (EFS) is used to encrypt files on NTFS. It generally relies on public-key cryptography.
- Data recovery:-
NTFS offers a data recovery mechanism.
- File compression:-
NTFS can compress individual files or all data files in a directory.
- Security:-
NTFS allows file-level security. With NTFS permissions, one can control which users have what kind of access to which files.
Security can be assigned at two different levels:
1. On a per-user basis
2. On a group basis
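The idea of granting access per user or per group can be illustrated with a deliberately simplified Python model. The `FilePermissions` class and `is_allowed` function are hypothetical names invented for this sketch; real NTFS access control lists are richer (they also have deny entries and inheritance, for example).

```python
from dataclasses import dataclass, field


@dataclass
class FilePermissions:
    """Hypothetical, simplified permission record for one file."""
    user_rights: dict = field(default_factory=dict)   # user name -> set of rights
    group_rights: dict = field(default_factory=dict)  # group name -> set of rights


def is_allowed(perms, user, groups, right):
    """True if the right is granted to the user directly or through a group."""
    if right in perms.user_rights.get(user, set()):
        return True
    return any(right in perms.group_rights.get(g, set()) for g in groups)


report = FilePermissions(
    user_rights={"alice": {"read", "write"}},
    group_rights={"staff": {"read"}},
)

print(is_allowed(report, "alice", [], "write"))       # True: granted per user
print(is_allowed(report, "bob", ["staff"], "read"))   # True: granted via group
print(is_allowed(report, "bob", ["staff"], "write"))  # False: not granted
```

Group-level rights keep the permission list short when many users need the same access, while per-user rights cover the exceptions.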
- Network File System (NFS):
- Introduction:-
Network File System (NFS) is a network file system protocol originally developed by Sun Microsystems in 1984, allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed. NFS, like many other protocols, builds on the Open Network Computing Remote Procedure Call (ONC RPC) system. The Network File System is an open standard defined in RFCs, allowing anyone to implement the protocol.
- NFS protocols:-
NFS relies on two client-server protocols:
1. The mount protocol, which handles mounting.
2. The NFS protocol, which handles directory and file access.
- Daemons:-
The NFS server daemon (nfsd) accepts RPC calls from clients.
- The extended file system (ext)
- Introduction:-
The extended file system (ext) was released in April 1992 as the first file system using the VFS API and was included in Linux version 0.96c.
- Features:-
1) Allowed up to 2 gigabytes of data.
2) Allowed filenames of up to 255 characters.
- Limitation of ext:-
There was no support for separate timestamps for access, inode modification and data modification.
- Solution:-
A new file system, ext2, was developed in January 1993 by Rémy Card.
- The second extended file system (ext2)
- Introduction:-
The Second Extended File system was devised as an extensible and powerful file system for Linux. It is also the most successful file system so far in the Linux community and is the basis for all of the currently shipping Linux distributions.
[Figures: ext2 data structures; physical layout of the ext2 file system; i-node structure of ext2]
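- Features of ext2:-
Support for POSIX ACLs and extended attributes was introduced.
ext2 has no journaling, which makes it a common choice for flash drives, where fewer writes are desirable.
- Disadvantages:-
The number of subdirectories in a directory is limited to about 32,000 (31,998 because of the link-count limit).
Cannot handle files larger than 2TB.
Block size is limited by the architecture.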
- The ext3 or third extended file system
- Introduction:-
The ext3, or third extended file system, is a journaled file system that is commonly used by the Linux kernel. It is the default file system for many popular Linux distributions, including Debian. Stephen Tweedie first revealed that he was working on extending ext2 in the 1998 paper "Journaling the Linux ext2fs Filesystem" and later in a February 1999 kernel mailing list posting. The file system was merged into the mainline Linux kernel in November 2001, from version 2.4.15 onward.
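- Disadvantages:-
Functionality: lacks features of more recent file systems, such as extents.
Defragmentation: no online defragmentation tool.
Compression: no built-in support.
No checksumming in the journal.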
- The ext4 or fourth extended filesystem
- Introduction:-
Ext4 is the evolution of the most used Linux filesystem, Ext3. In many ways, Ext4 is a deeper improvement over Ext3 than Ext3 was over Ext2. Ext3 was mostly about adding journaling to Ext2, but Ext4 modifies important data structures of the filesystem such as the ones destined to store the file data. The result is a filesystem with an improved design, better performance, reliability, and features.
- Features:-
An existing ext3 file system can be mounted as ext4 without converting it. This gives:
- Compatibility (the filesystem can continue to be mounted as ext3). This allows users to still read the filesystem from other distributions/operating systems without ext4 support (e.g. Windows with ext3 drivers).
- Improved performance (though not as much as a fully converted ext4 partition).
- ReiserFS
- Introduction:- an alternative Linux file system designed mainly to store huge numbers of small files. It has good file search capabilities and can 'compact' file allocation by storing file tails or small files along with the metadata, so that large file system blocks are not used for them.
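To see why packing small files matters, the sketch below estimates how much space is wasted when every file must occupy whole blocks. It does not model ReiserFS itself; the file count, file size and block size are hypothetical numbers chosen for the example.

```python
def allocated_bytes(file_size, block_size):
    """Space taken on disk when a file must occupy whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def wasted_bytes(file_sizes, block_size):
    """Total unused space at the end of the last block of each file."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)


# Hypothetical workload: 1000 small files of 100 bytes each, 4096-byte blocks
sizes = [100] * 1000
print(sum(sizes), "bytes of data")                    # 100000
print(wasted_bytes(sizes, 4096), "bytes of slack")    # 3996000
```

In this made-up workload the unused space is roughly forty times larger than the data, which is the kind of overhead that storing small files and file tails next to the metadata is meant to avoid.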
- Some Other file systems:-
- XFS - a file system from SGI, which initially used it for its IRIX servers. The XFS specifications are now open, and support for the file system has been implemented in Linux. XFS has great performance and is thus widely used for file storage.
- JFS - a file system developed by IBM for its powerful computing systems. JFS usually means JFS, second edition (JFS2). This file system is currently open-source and is implemented in most modern Linux distributions.
- UFS:- The most common file system for Unix-family operating systems is UFS (the Unix File System). It is also often called FFS (the Fast File System; it is 'fast' in comparison with the previous file system used for Unix). UFS is the source of ideas for many other file system implementations.
Currently UFS (in different editions) is supported by all Unix-family OSes and is the main file system of the BSD OSes and Sun Solaris. The modern tendency is to implement replacements for UFS in different OSes (ZFS for Solaris, JFS and derived file systems for Unix and so on).
- Clustered file systems:-
Clustered file systems are used in computer cluster systems. These file systems have built-in support for distributed storage.
- ZFS - Sun's 'Zettabyte File System', a new file system developed for distributed storage on Sun Solaris.
- Apple Xsan - the Apple company evolution of CentraVision and later StorNext file systems.
- VMFS - the 'Virtual Machine File System' developed by VMware company for its VMware ESX Server.
- GFS - the Red Hat Linux 'Global File System'.
- JFS1 - the original (legacy) design of IBM JFS file system used in older AIX storage systems.