2009-08-18 00:39:28

Additional information about file extensions

Usage of file extensions

Filename extensions can be considered a type of metadata. They are most commonly used to indicate how the data in a file is stored. Exactly which part of a file name counts as its extension is defined by the rules of the specific filesystem in use. Usually the extension is the substring that follows the last occurrence, if any, of the dot character (e.g. txt is the file extension of both readme.txt and readme.some.file-base-name.additional.text.txt).
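The "substring after the last dot" rule can be sketched in a few lines of Python. The function name extension_of is hypothetical and chosen only for this illustration; real filesystems and libraries may apply additional rules (for example, Python's own os.path.splitext treats a leading dot as part of the base name).

```python
def extension_of(filename: str) -> str:
    """Return the text after the last dot, or "" if the name has no dot."""
    # rpartition splits at the LAST occurrence of the separator and
    # returns ("", "", filename) when the separator is absent.
    _base, dot, ext = filename.rpartition(".")
    return ext if dot else ""


if __name__ == "__main__":
    for name in ("readme.txt",
                 "readme.some.file-base-name.additional.text.txt",
                 "README"):
        print(name, "->", repr(extension_of(name)))
```

Only the final dot matters here, so a name containing several dots still yields a single extension.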

Under Microsoft Windows and the older MS-DOS, some extensions, including exe, com, bat, cmd and vbs, indicate that a file is an executable program. This differs from Unix-like operating systems, where a suffix is not a separate namespace from the base filename and where having a filename suffix at all is optional, because file system permissions decide whether a file is executable.
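As a rough illustration of the Unix-like approach, the following Python sketch inspects a file's permission bits instead of its name. The helper name is_executable is an assumption made for this example, and the check is only meaningful on Unix-like systems; on Windows the permission bits reported by Python do not carry the same meaning.

```python
import os
import stat


def is_executable(path: str) -> bool:
    """True if any execute bit (user, group or other) is set on the file."""
    mode = os.stat(path).st_mode
    # The file name, and any suffix it may have, plays no part in this test.
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
```

A file called notes.txt with mode 755 would count as executable under this check, while one called run.exe with mode 644 would not.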

With the advent of graphical user interfaces, the issue of file management and interface behavior arose. Microsoft Windows allowed multiple applications to be associated with a given extension, and offered different actions for selecting the required application, such as a context menu with a choice between viewing (playing), editing, burning or printing the file.


Windows Vista Explorer window with file extension display enabled, list view with file types

Windows Vista Explorer window with file extension display enabled, large thumbnail view with various file type actions

Apple Mac OS X (Leopard) Finder window with file type listing and file extensions enabled


Historical limitations

File extensions were used in Digital Equipment Corporation (DEC) operating systems (for example, TOPS-10, OS/8 and RT-11). The CP/M operating system adopted the convention, and Microsoft's MS-DOS, as a re-implementation of CP/M, did so as well.

The DEC operating systems internally divided the filename into a base filename and a filename extension. The base filename length was restricted to five to eight characters (initially six characters in RT-11 and nine characters in RSX and VMS). The filename extension was limited to two or three characters. When a filename was typed in commands, a period (.) was placed between the base filename and the filename extension.

CP/M worked the same way: the base filename was limited to eight characters and the filename extension to three, with a dot between them.

Early versions of the FAT filesystem used in Microsoft's MS-DOS and Microsoft Windows imposed the same file naming limitations. This is sometimes referred to as the 8.3 filename convention. It can be generalized as FILENAME.EXT, where FILENAME stands for a base filename of up to eight characters and EXT for a filename extension of up to three characters.
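The length limits of the 8.3 convention can be checked with a short Python sketch. The function name fits_8_3 is hypothetical, and the allowed character set (letters, digits, underscore and hyphen) is a deliberate simplification for this example; the real FAT rules permit further punctuation characters and reserve certain names.

```python
import re

# Simplified 8.3 shape: 1-8 name characters, optionally followed by
# a dot and 1-3 extension characters. The character class is an
# assumption for illustration, not the full FAT rule set.
_PATTERN_8_3 = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")


def fits_8_3(filename: str) -> bool:
    """True if the name fits the simplified 8.3 length limits."""
    return _PATTERN_8_3.fullmatch(filename) is not None
```

Under this check AUTOEXEC.BAT and README.TXT fit, while index.html fails because its extension has four characters.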

Most modern Microsoft operating systems still use file extensions, but do not attribute any special meaning to the "." (period, or dot) character in file names. The 8.3 filename limit has been raised to allow much longer names, and there are new file extensions of more than three characters that can describe the type of file better. Some historically common file types still use three-character extensions (e.g. txt, pdf, mp3), which are also quick to type. Newer four- and five-character file extensions are often used for variants of file types based on a three-character extension (e.g. docx, docm or xlsx, xlsm, xlsb).

Improvements

The filename extension was originally used to easily determine the file's generic type. The need to condense a file's type into three characters frequently led to bizarre extensions. Examples include .txt for plain text files, .gfx for graphics files, .mus for music files and .dat for data files. However, because many different programs have been written that handle these data types (and others) in a variety of ways, filename extensions started to become closely associated with certain products, and even with specific product versions. For example, early graphic files from Corel Draw! used .cdr or a versioned form such as .cdr3 or .cdr4 (.cdrN, where N was the program's version number), and the program later went back to .cdr without a version number for all versions.

Also, the same filename extensions began to conflict between separate file types. One example is the .mpp file extension, which is used for Microsoft Project files, Musepack audio files and several other file types. Another is the .ttf file extension, which is shared by TrueType fonts, Quartus II tabular text files, Optigraphics tiled image files and many others.

The High Performance File System (HPFS), used in Microsoft and IBM's OS/2, also supported long file names and didn't divide the file name into a name and an extension. However, the convention of using extensions continued, even though HPFS supported extended file attributes, which allowed a file's type information to be stored with the file as an extended attribute.

NTFS, the native file system of Microsoft's Windows NT, supported long file names and didn't divide the file name into a name and an extension but, again, the convention of using suffixes to simulate extensions continued, for compatibility with existing versions of Windows.

When the Internet age first arrived, those using Windows systems that were still restricted to 8.3 filename formats had to create web pages with names ending in .htm, while those using Macintosh or Unix computers could use the recommended .html filename extension. This also became a problem for programmers experimenting with the Java programming language, since it requires source code files to have the four-letter suffix .java and compiles object code output files with the five-letter .class suffix.

Eventually, Microsoft Windows introduced support for long file names, and removed the 8.3 name/extension split in file names, in an extended version of the commonly used FAT file system called VFAT. VFAT first appeared in Windows NT 3.5 and Windows 95. The internal implementation of long file names in VFAT is largely considered to be a kludge, but it removed the important length restriction, and allowed files to have a mix of upper case and lower case letters, on machines that would not run Windows NT well. However, the use of three-character extensions under Microsoft Windows has continued, originally for backward compatibility with older versions of Windows and now by habit, along with the problems it creates.

File extension alternatives

In network contexts, files are regarded as streams of bits and do not have filenames or extensions. On the Internet, the type of a bitstream is stated as the Internet media type of the stream (also called the MIME type or content type). This is given in a line of text preceding the stream sent to the client, such as Content-type: text/plain for a simple text file.
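To show how a server might produce such a line from a local file name, here is a minimal Python sketch built on the standard library's mimetypes module. The function name content_type_header and the fallback to application/octet-stream for unknown extensions are choices made for this example, not something the text above prescribes. Header names are case-insensitive, so Content-Type and Content-type are equivalent.

```python
import mimetypes


def content_type_header(filename: str) -> str:
    """Build a Content-Type header line from a file name's extension."""
    media_type, _encoding = mimetypes.guess_type(filename)
    if media_type is None:
        # Assumed fallback for extensions the lookup table does not know.
        media_type = "application/octet-stream"
    return f"Content-Type: {media_type}"


if __name__ == "__main__":
    print(content_type_header("readme.txt"))
    print(content_type_header("index.html"))
```

The lookup is driven purely by the extension; the file's contents are never read.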

Here at File-Extensions.org you can look up the MIME types of many file extensions and file formats. Where one exists, the MIME type is stated in the description of the file extension.

The BFS file system from BeOS supports extended attributes and tags a file with its Internet media type as an extended attribute. The KDE and GNOME desktop environments associate an Internet media type with a file by examining both the filename extension and the contents of the file, in the fashion of the file command, as a heuristic. They choose the application to launch when a file is opened based on that Internet media type, reducing the dependency on filename extensions.
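The idea of combining file contents with the filename extension can be illustrated with a small Python sketch. It is not how KDE or GNOME actually implement the lookup; the function name sniff_media_type and the tiny table of leading byte signatures are assumptions made for this example, and a real detector would use a far larger database.

```python
import mimetypes
from typing import Optional

# A few well-known leading byte signatures ("magic numbers").
MAGIC_NUMBERS = (
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def sniff_media_type(head: bytes, filename: str = "") -> Optional[str]:
    """Guess a media type from the first bytes, then from the extension."""
    # Contents take priority: a PDF named report.txt is still a PDF.
    for magic, media_type in MAGIC_NUMBERS:
        if head.startswith(magic):
            return media_type
    if not filename:
        return None
    # Fall back to the extension when the contents are not recognized.
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed
```

Checking the contents first means a misleading extension cannot override a recognized signature, while the extension still helps for formats such as plain text that have no signature.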

Mac OS X uses both filename extensions and media types, as well as file type codes, to select a Uniform Type Identifier by which to identify the file type internally.

 

Additional resources: Credit: Wikipedia, the free encyclopedia: Filename extension, Computer file, File format

 

 


 
 
