Windows 8 will include a next-generation file system. The Resilient File System, or ReFS for short, was designed for new approaches to data storage. During development, special emphasis was placed on backward compatibility with NTFS. Its defining trait is the ability to repair damaged data on the fly: there is no need to take the entire volume offline and have it checked by chkdsk. Its structure also scales better than that of NTFS.
Key features
ReFS is primarily a robust and highly available file system. It can detect and repair corrupted data by itself, and it is optimized for extreme scalability, which makes it suitable both for embedded devices running the .NET Micro Framework and for data centers running Windows Azure. The system was designed to isolate erroneous data so that the administrator can restore it from a backup without shutting down the entire system. Storage virtualization is intended to simplify the creation and management of file systems.
Structure
All data in the file system is stored in B+ trees, unlike NTFS, which keeps some data in additional tables and metafiles. This significantly simplified the system and reduced the amount of source code. Representing the file system as a whole by a single tree also allows for much better scalability. In addition, the key-value elements of individual nodes can be iterated over, so the nodes can be viewed as tables. After this simplification, the whole file system can be looked at as a set of such tables.
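As a rough illustration of what "viewing a node as a table" means, here is a minimal Python sketch. It is not the ReFS on-disk format: the `TableNode` class, its methods, and the sample directory entries are hypothetical, and a real B+ tree consists of many such nodes linked together rather than one.

```python
import bisect


class TableNode:
    """One sorted node of key-value pairs (hypothetical, for illustration only)."""

    def __init__(self):
        self._keys = []
        self._values = []

    def put(self, key, value):
        # Keep keys sorted so the node can be scanned in order, like a table.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key):
        # Binary search for an exact key match.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        raise KeyError(key)

    def rows(self):
        # Iterate over the node as (key, value) rows in key order.
        return list(zip(self._keys, self._values))


# A directory modelled as a table: file name -> made-up metadata.
directory = TableNode()
directory.put("report.docx", {"size": 18432})
directory.put("archive.zip", {"size": 1048576})
directory.put("notes.txt", {"size": 512})

names = [name for name, _ in directory.rows()]
```

Because the keys stay sorted, the same structure serves both exact lookups and ordered scans, which is why a tree of such nodes can stand in for several special-purpose structures.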
NTFS compatibility
Some advanced NTFS features will be preserved, while others will not be implemented because they added too much to the overall complexity of the system or complicated backing up an entire partition. The preserved features include:
- BitLocker encryption
- Access control lists
- Journaling
- Change notifications
- Symbolic links to files and directories
- Mounting a volume in a directory
- Additional directory information
- Volume snapshots (shadow copies)
- Unique file identifiers
- File locking
Features that will no longer be supported include:
- Secondary streams
- Object identifiers
- Short file names
- File compression
- Hard links
- Sparse files
- File-level encryption
- File details
- Disk quotas
Secondary streams, also called alternate streams or named streams, treat a file much like a directory. This little-known feature lets you attach additional named pieces of data to a file. For example, the path C:\muj_soubor.txt:stream.xml is perfectly valid and points to the stream named stream.xml. When a browser downloads a file from the Internet, it records that origin in such a stream. The feature has major disadvantages, however. Streams are not retained when files are transferred between different file systems, sent over FTP, compressed, and so on. In addition, the size of the streams does not count towards the file size. Last but not least, Explorer cannot even display the data stored in those streams.
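To make the path syntax concrete, the following sketch splits such a path into the file part and the stream name. The helper `split_stream_path` is hypothetical and only parses the string; it assumes the simple `file:stream` form and ignores optional stream-type suffixes.

```python
import ntpath


def split_stream_path(path):
    """Split 'C:\\file.txt:stream' into (file path, stream name or None).

    Hypothetical helper for illustration; it does not touch the disk.
    """
    # Separate the drive letter first so its colon is not mistaken
    # for the stream separator.
    drive, rest = ntpath.splitdrive(path)
    base, sep, stream = rest.partition(":")
    return drive + base, (stream if sep and stream else None)


file_part, stream_part = split_stream_path(r"C:\muj_soubor.txt:stream.xml")
plain_part, no_stream = split_stream_path(r"C:\muj_soubor.txt")
```

On Windows with an NTFS volume, passing the full `file:stream` path to Python's built-in `open()` should read or write the stream itself. On other file systems the same path is either invalid or names an ordinary file, which mirrors the portability problem described above.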
Short file names, with 8 characters for the name and 3 for the extension, are a remnant of MS-DOS and have no practical use today. As for compression, the SandForce controller in an SSD can compress data much more efficiently, so compressing the files themselves tends to do more harm than good these days. Hard links have the disadvantage that they can only be created within a single volume. Disk quotas are losing their importance because user data is now more often stored in SQL Server, which has its own mechanism for limiting the size of a database. File details are replaced with metadata.
Resilience
Windows 8 brings new data storage features, collectively called Storage Spaces. They let you organize physical disks connected via SATA, SAS and USB into a storage pool, from which you can create virtual partitions that are resistant to hardware failures. Basically, it is RAID for home use. The physical disks can differ in capacity as well as in the bus through which they are connected.
When space starts to run out while data is being written, Windows displays a warning and prompts the user to purchase a new disk. The disk is simply plugged into the PC and added to the storage, and the system takes care of the rest. Individual Spaces, such as Documents or Desktop, can be mirrored, so that the files in them are written to two or even three physically different disks. For Videos, Pictures, Podcasts or Virtual Machines, parity is more suitable: redundant information from which lost data can be recalculated in the event of a hardware failure.
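How parity allows lost data to be recalculated can be shown with a small Python sketch. It uses plain XOR parity in the style of RAID 5; this is an assumption made for illustration, since the article does not say which parity scheme Storage Spaces uses, and the block contents and the `xor_blocks` helper are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a different physical disk.
data_blocks = [b"DISK-ONE", b"DISK-TWO", b"DISK-3!!"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: XOR-ing the surviving blocks
# with the parity block yields the missing block again.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

In this sketch parity costs one extra block for three data blocks, whereas a two-way mirror doubles the space used. That trade-off fits the article's suggestion to mirror documents and use parity for large media files.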
For example, a storage space named Documents can be mirrored to two physical disks connected via USB. Storage Spaces can also be made available to other computers in your home.
If a disk fails, Action Center displays a warning. Control Panel then explains the situation in more detail.
Many errors are caused by data corruption that results from writes landing in the wrong place or from writes being lost altogether. Cases where the hardware itself fails are relatively rare. ReFS can detect these errors thanks to checksums. When it finds one, ReFS instructs Storage Spaces to overwrite the faulty data with an intact copy.
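The detect-and-repair cycle can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `MirroredStore` class is hypothetical, dictionaries stand in for disks, and CRC-32 from the standard `zlib` module stands in for whatever checksum ReFS actually uses.

```python
import zlib


class MirroredStore:
    """Toy mirrored storage that verifies a checksum on every read (hypothetical)."""

    def __init__(self, copies=2):
        self.disks = [dict() for _ in range(copies)]
        self.checksums = {}

    def write(self, key, data):
        # Record the checksum separately from the data, then mirror the data.
        self.checksums[key] = zlib.crc32(data)
        for disk in self.disks:
            disk[key] = data

    def read(self, key):
        expected = self.checksums[key]
        for disk in self.disks:
            data = disk[key]
            if zlib.crc32(data) == expected:
                # Found an intact copy: overwrite any corrupted mirrors with it.
                for other in self.disks:
                    if zlib.crc32(other[key]) != expected:
                        other[key] = data
                return data
        raise IOError(f"all copies of {key!r} are corrupt")


store = MirroredStore()
store.write("block-7", b"important data")

# Simulate silent corruption on the first disk.
store.disks[0]["block-7"] = b"imp0rtant data"

recovered = store.read("block-7")
```

The read succeeds without any downtime, and the damaged copy is repaired as a side effect, which is the behaviour the paragraph above describes for ReFS working together with Storage Spaces.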
ReFS has nothing in common with WinFS. WinFS was mainly about indexing data, whereas ReFS is about the robustness of the infrastructure. Given the significant changes and the conservative approach to SQL Server development, it seems that it will not yet be possible to create databases on ReFS.
The article was written for the Czech MSDN Blog and TechNet Blog CZ/SK.