What’s up with Mac OS X Resource forks, Extended Attributes, NTFS Streams and Dot-Underscore files?
Let's review the latest developments in the way Macintosh files are stored.
Macintosh files have always had at least three parts, and now, with Extended Attributes, they can have any number of parts. Extended Attributes are explained below. Statements that the “resource fork is dead” are really misinterpretations of the evolution from Resource Forks to Extended Attributes.
System administrators need to understand Mac multi-part files in order to preserve files created on a Macintosh during data replication, HSM, backup, and other administrative movement of data. Failure to understand Mac multi-part files can lead to data corruption, which will cost your organization money.
The original Mac OS 9 file structure had three “forks” or parts: the data fork, the resource fork, and the Finder information.
- Data fork – This is where the data is saved by an application. It is the equivalent of a “file” on the PC.
- Resource fork – This hidden file fork contains additional information about the file, depending on which application created it. For example, BBEdit, a text editing application, stores the text of a document in the data fork like any other application, but also saves the location of the cursor in the resource fork so that the next time you open the document the cursor will be right where you left off. Although different applications use the resource fork for different purposes, in general it is used to store additional information about a file beyond the generic data.
- Finder info – This stores information called “metadata” (data about data). The metadata, maintained by Mac OS X, is required to provide the full Macintosh user experience. Some of this metadata (such as creation date and owner) is common to both the Windows and the Mac file systems. Other metadata (such as type, creator, and label) is unique to the Macintosh. Finder info is sometimes called HFS metadata because on Macintosh disks that are formatted with HFS and HFS+ the Finder info is stored directly in the file system. In other cases it is referred to as AFP_Info because that is the name of the NTFS alternate data stream used by ExtremeZ-IP and the old Services for Macintosh to store it. The Finder information includes owner, type, creator, date modified, date created, name, label, and visibility.
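To make the type, creator, and visibility fields more concrete, here is a minimal Python sketch that decodes them from a raw Finder info block. It assumes the classic 32-byte Finder info layout (four-character type, four-character creator, then a 16-bit flags word whose 0x4000 bit marks the file invisible). That layout is not described in this article, the function name parse_finder_info is hypothetical, and the sample bytes are made up for illustration. The input is the bare Finder info block, not a complete AFP_Info stream, which carries additional header fields.

```python
import struct

INVISIBLE_FLAG = 0x4000  # assumed: the "invisible" bit in the Finder flags word


def parse_finder_info(blob):
    """Decode type, creator, and visibility from a raw Finder info block."""
    if len(blob) < 10:
        raise ValueError("Finder info block is too short")
    # Big-endian: 4-byte type, 4-byte creator, 2-byte Finder flags.
    file_type, creator, flags = struct.unpack(">4s4sH", blob[:10])
    return {
        "type": file_type.decode("mac_roman"),
        "creator": creator.decode("mac_roman"),
        "invisible": bool(flags & INVISIBLE_FLAG),
    }


# Illustrative sample only: a text file with a made-up creator code,
# flagged invisible, padded out to 32 bytes.
sample = b"TEXT" + b"ABCD" + struct.pack(">H", INVISIBLE_FLAG) + bytes(22)
info = parse_finder_info(sample)
print(info)
```

If the type and creator codes are lost, the file's data is still intact, but the Finder can no longer use them to link the file to the application that created it.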
Starting with Mac OS X 10.4 Tiger and refined in Mac OS X 10.5 Leopard, Apple replaced the original Mac OS 9 fork file structure with a new structure called Extended Attributes, which is described in the Wikipedia article “Extended File Attributes.” As that article explains, many modern file systems have adopted Extended Attributes to gain the power of extensible files for metadata tagging. Extended Attributes allow for an unlimited number of “forks,” so that each vendor may have its own and standards will emerge for shared types.
Different File Systems
On Mac OS X, Extended Attributes are managed by HFS+.
On Windows, Microsoft’s Services for Macintosh (SFM) set a standard for mapping Macintosh file forks to the NTFS version of Extended Attributes, known as NTFS Streams. This “SFM format” was adopted by ExtremeZ-IP as well as other products such as Thursby’s Dave and AdmitMac. Although Microsoft never updated SFM to support Extended Attributes, ExtremeZ-IP has led the way in extending the SFM format approach with NTFS Streams for that purpose.
ExtremeZ-IP supports Macintosh resource forks and metadata using Windows NTFS alternate data streams. ExtremeZ-IP’s use of alternate data streams is compatible with Microsoft’s “Services for Macintosh,” meaning that files saved using SFM can be shared by ExtremeZ-IP without the need to migrate or convert the data. In addition, ExtremeZ-IP offers support for the 3.x versions of the Apple Filing Protocol that Mac OS X introduced.
Unlike the format used over SMB, which needs two files to keep track of the data fork, resource fork, and metadata, ExtremeZ-IP uses one file that has a main data stream and two alternate data streams. The main stream holds the data fork, the AFP_Info stream holds metadata about the file (Finder info), and the AFP_Resource stream holds the resource fork if one is present. All files created by the Mac will have an AFP_Info stream to maintain the Finder information, but not all files will have a resource fork. Because the NTFS file system shows all of these data streams as one file, when a Windows user moves this file to a new folder, the entire file including all three streams is moved to the new folder, and all of the Macintosh metadata such as type and creator is maintained.
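The one-file, three-stream layout can be sketched in a few lines of Python. This is only an illustration: it relies on the standard NTFS "file:stream" naming for alternate data streams, the helper names mac_stream_paths and read_stream are hypothetical, and the example path is made up. The read helper is only meaningful on Windows with an NTFS volume.

```python
# Stream names as given in this article; the dictionary keys are illustrative.
MAC_STREAMS = {"finder_info": "AFP_Info", "resource_fork": "AFP_Resource"}


def mac_stream_paths(path):
    """Return the NTFS paths for the three parts of a Mac file."""
    paths = {"data_fork": path}  # the main (unnamed) stream is the file itself
    for part, stream in MAC_STREAMS.items():
        # NTFS addresses an alternate data stream as "filename:streamname".
        paths[part] = f"{path}:{stream}"
    return paths


def read_stream(path, stream):
    """Read one alternate data stream; return None if it is absent.

    Assumption: running on Windows against an NTFS volume.
    """
    try:
        with open(f"{path}:{stream}", "rb") as handle:
            return handle.read()
    except OSError:
        return None


# Hypothetical file on a Windows share.
paths = mac_stream_paths(r"D:\Shares\Art\logo.eps")
for part, stream_path in paths.items():
    print(part, "->", stream_path)
```

Because a file without a resource fork has no AFP_Resource stream, a backup or replication check built on this idea should treat a missing AFP_Resource stream as normal and a missing AFP_Info stream on a Mac-created file as a warning sign.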
With the SMB client, Apple needed a solution that would be compatible with SMB servers that did not support Extended Attributes, such as some versions of Unix and Linux. The solution is a special format called AppleDouble, which encodes the metadata and resource fork. As the name implies, the AppleDouble format uses two files: the data fork, and another file that contains the combined metadata and resource fork information. This second file is the so-called “dot-underscore” (._) file. Because the dot-underscore file contains the metadata for the file in addition to the resource fork, if this file is lost, you will lose the linkage between the file and its creating application. In cross-platform environments this is a frequent occurrence, because when Windows users move the file to a new folder, the dot-underscore file is not moved at the same time. It is also fairly common for a Windows user to delete the similarly named “._” file because they do not know what it is.
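An administrator can spot the damage described above by looking for dot-underscore files whose data file is no longer beside them. The following Python sketch is one possible way to do that, not a vendor tool: the function names and the demo file names are hypothetical, and it assumes only the naming rule that the companion of "name" is "._name" in the same folder.

```python
import tempfile
from pathlib import Path


def appledouble_name(data_file):
    """Return the path of the dot-underscore companion for a data file."""
    return data_file.with_name("._" + data_file.name)


def find_orphans(folder):
    """List dot-underscore files whose data file is missing from the folder."""
    orphans = []
    for entry in sorted(folder.iterdir()):
        if entry.is_file() and entry.name.startswith("._") and len(entry.name) > 2:
            # Strip the "._" prefix to get the name of the expected data file.
            if not (folder / entry.name[2:]).exists():
                orphans.append(entry)
    return orphans


# Demo in a throwaway folder: report.txt still has its companion, while
# moved.txt was moved elsewhere and left its companion behind.
with tempfile.TemporaryDirectory() as tmp:
    share = Path(tmp)
    (share / "report.txt").write_text("data fork")
    appledouble_name(share / "report.txt").write_bytes(b"")
    (share / "._moved.txt").write_bytes(b"")
    orphan_names = [p.name for p in find_orphans(share)]

print(orphan_names)
```

The same naming rule works in the other direction: a data file that came from a Mac but has no "._" companion may have lost its Finder info and resource fork.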
Some other useful material may be found in the following:
You can learn more about this in a webcast presentation given by Group Logic in the MacEnterprise webcast series:
Group Logic/MacEnterprise March 2007 Webcast
When I save a file using SMB protocol, what information is saved in the “dot-underscore” (._) files? How is this information stored on an NTFS file system? Group Logic KB217
Migrating from MacServerIP 8.1 to ExtremeZ-IP. Group Logic KB189