Term frequently applied to filesystems on computers.
Data is usually stored on your hard drive in blocks. The blocks for a particular file need not be in order, or even together on the disk (contiguous). In fact, over time they are likely to end up out of order and scattered: as files are moved, deleted, copied, and replaced, the free space on your drive becomes fragmented. The computer will then happily break your files into thousands of tiny pieces to fit them into the available gaps, and you'll never know the difference - except in terms of speed (and possibly equipment wear), as the hard drive whirs and clicks all over its recordable surfaces to reconstruct your data. Locality is also reduced, so caching and read-ahead are less effective, and more physical reads and writes are likely to happen per file operation.
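To make that concrete, here is a toy sketch in Python - not any real filesystem's allocator; the 16-block "disk" and the file names are invented for illustration. A first-fit allocator happily hands out whatever free blocks it finds, so once a deletion has punched a hole in the middle of the disk, the next large file gets split across non-contiguous runs.

```python
# Toy illustration: a disk as a list of blocks, allocated first-fit.
# Deleting a file leaves scattered free blocks, so a later, larger file
# ends up split across non-contiguous runs.

DISK_BLOCKS = 16
disk = [None] * DISK_BLOCKS          # None = free block

def allocate(name, nblocks):
    """Grab the first free blocks found, wherever they happen to be."""
    placed = []
    for i, owner in enumerate(disk):
        if owner is None:
            disk[i] = name
            placed.append(i)
            if len(placed) == nblocks:
                return placed
    raise RuntimeError("disk full")

def delete(name):
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None

allocate("a.txt", 4)                 # blocks 0-3
allocate("b.txt", 4)                 # blocks 4-7
allocate("c.txt", 4)                 # blocks 8-11
delete("b.txt")                      # frees a hole in the middle
print(allocate("d.txt", 6))          # [4, 5, 6, 7, 12, 13] -- fragmented
```

The printed block list shows d.txt occupying the hole left by b.txt plus a second run further along the disk - exactly the scattering described above.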
Defragmentation is therefore a maintenance process some people perform on their hard drives (via a defragmenting program such as the one in Norton Utilities, or Windows' built-in disk utilities) which "organizes" the data on the drive - maneuvering all the blocks of all the files so that they become contiguous again. Some more sophisticated defragmenters attempt to group files that are frequently used together into the same region of the disk, and/or move the most frequently used files to the "front" of the disk (usually the outermost tracks of the platter, which sustain the highest transfer rates), which is theoretically slightly "faster."
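Continuing the toy example above, a naive "defragmenter" might simply rebuild the block map so that every file occupies one contiguous run, with all free space pushed to the end. Real defragmenters have to move data in place with limited spare room, update the filesystem's block pointers, and survive interruption; this sketch only shows the end state they aim for.

```python
def defragment(disk):
    """Return a block map in which every file's blocks are contiguous."""
    order = []                                   # file names, first-seen order
    for owner in disk:
        if owner is not None and owner not in order:
            order.append(owner)
    compacted = []
    for name in order:
        compacted.extend([name] * disk.count(name))   # one contiguous run per file
    compacted.extend([None] * (len(disk) - len(compacted)))
    return compacted

print(defragment(disk))   # each file in one contiguous run, free blocks at the end
```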
Most Windows machines and Macintoshes could benefit from occasional defragmenting - perhaps once a year, depending on use. I have known a number of people (even otherwise very technically competent people) who have become obsessive defragmenters; in many ways the trend towards defragmenting every night or every week is reminiscent of the "screen saver" craze. It plays on flattery by letting you think you are being proactive and clever in maintaining your machine, and it is "theoretically" beneficial - infinitesimally so. It's also an outlet for frustration with the speed of your computer - though I have never seen a defragmentation produce an improvement large enough to be noticeable to the human eye, even fifteen years ago, when it was conceivable that I might have.
Most modern filesystems, such as ext2 or ReiserFS (on Linux), are what can be described as self-defragmenting. That is, they use on-disk organizational schemes that promote contiguity by their nature. I quote from Theodore Ts'o's excellent ext2 paper:
Ext2fs also contains many allocation optimizations. Block groups are used to cluster together related inodes and data: the kernel code always tries to allocate data blocks for a file in the same group as its inode. This is intended to reduce the disk head seeks made when the kernel reads an inode and its data blocks.
When writing data to a file, Ext2fs preallocates up to 8 adjacent blocks when allocating a new block. Preallocation hit rates are around 75% even on very full filesystems. This preallocation achieves good write performances under heavy load. It also allows contiguous blocks to be allocated to files, thus it speeds up the future sequential reads.
These two allocation optimizations produce a very good locality of:
- related files through block groups
- related blocks through the 8 bits clustering of block allocations.
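As a rough sketch of the two heuristics the excerpt describes (block groups, plus preallocation of up to 8 adjacent blocks), the following Python is illustrative only - it is not ext2's kernel code, and the group size, class names, and the way preallocation is modelled are all simplifications. In particular, the 8-block figure here merely caps the size of each contiguous run grabbed for a file, whereas ext2 actually reserves adjacent blocks ahead of the current write.

```python
# Sketch (assumed, simplified): the disk is split into block groups, a
# file's data is allocated from the same group as its inode when
# possible, and blocks are taken in adjacent runs of up to 8 so
# sequential writes stay contiguous.

GROUP_SIZE = 32   # invented for illustration
PREALLOC = 8

class Disk:
    def __init__(self, ngroups):
        # free[g] is the set of free block numbers inside group g
        self.free = [set(range(g * GROUP_SIZE, (g + 1) * GROUP_SIZE))
                     for g in range(ngroups)]

    def alloc_in_group(self, group, goal_blocks):
        """Take up to `goal_blocks` adjacent free blocks from `group`."""
        for start in sorted(self.free[group]):
            run = []
            b = start
            while b in self.free[group] and len(run) < goal_blocks:
                run.append(b)
                b += 1
            if run:
                self.free[group] -= set(run)
                return run
        return []

    def alloc_for_file(self, inode_group, nblocks):
        """Prefer the inode's own group; take blocks in runs of up to 8."""
        blocks = []
        groups = [inode_group] + [g for g in range(len(self.free))
                                  if g != inode_group]
        for g in groups:
            while len(blocks) < nblocks:
                run = self.alloc_in_group(g, min(PREALLOC, nblocks - len(blocks)))
                if not run:
                    break             # group exhausted, spill to the next one
                blocks.extend(run)
            if len(blocks) == nblocks:
                break
        return blocks

d = Disk(ngroups=4)
print(d.alloc_for_file(inode_group=1, nblocks=12))   # blocks 32..43, contiguous
```

Because data blocks are drawn from the same group as the file's inode and taken in adjacent runs, a freshly written file tends to come out contiguous without any later cleanup pass - which is the sense in which such filesystems are "self-defragmenting."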