external sorting?

August 31, 2007

A sorting program that sorts items held on secondary storage (disk or tape) rather than in primary storage (memory) is called an external sort. Exactly how to sort large data depends on what is meant by "too large to fit in memory."

If the items to be sorted are themselves too large to fit in memory (such as images), but there aren't many of them, you can keep in memory only each item's sort key and a value indicating the data's location on disk. After the key/value pairs are sorted, the data is rearranged on disk into the correct order.

If "too large to fit in memory" means that there are too many items to fit into memory at one time, the data can be sorted in groups that will fit into memory, with each sorted group written to its own file, and then the resulting files can be merged.

A sort such as a radix sort can also be used as an external sort, by making each bucket in the sort a file. Even quicksort can be an external sort: the data is partitioned around a pivot by writing it to two smaller files, and this is repeated until the partitions are small enough to fit in memory. They are then sorted in memory and concatenated, in order, to form the sorted file.
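
The "sort in groups, then merge" approach is probably the easiest of these to sketch in code. What follows is only a minimal illustration in Python, not a tuned implementation. It assumes the input is a text file with one item per line and that items compare as plain strings. The function name external_sort and the max_items parameter are made up for this example.

```python
import heapq
import itertools
import os
import tempfile


def external_sort(input_path, output_path, max_items=100_000):
    """Sort a text file (one item per line) without loading it all at once.

    max_items is a hypothetical knob: how many lines we assume fit in memory.
    """
    run_paths = []
    try:
        # Phase 1: read one group at a time, sort it in memory,
        # and write it out as a sorted "run" file.
        with open(input_path, "r", encoding="utf-8") as src:
            while True:
                chunk = [line.rstrip("\n")
                         for line in itertools.islice(src, max_items)]
                if not chunk:
                    break
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".run")
                with os.fdopen(fd, "w", encoding="utf-8") as run:
                    run.writelines(item + "\n" for item in chunk)
                run_paths.append(path)

        # Phase 2: merge the sorted runs. heapq.merge is lazy, so only
        # one pending line per run is held in memory at a time.
        runs = [open(p, "r", encoding="utf-8") for p in run_paths]
        try:
            streams = [(line.rstrip("\n") for line in f) for f in runs]
            with open(output_path, "w", encoding="utf-8") as out:
                out.writelines(item + "\n" for item in heapq.merge(*streams))
        finally:
            for f in runs:
                f.close()
    finally:
        # Remove the temporary run files whether or not the merge succeeded.
        for p in run_paths:
            os.remove(p)
```

Calling external_sort("big.txt", "sorted.txt", max_items=1_000_000) would sort big.txt in runs of at most a million lines each. This sketch opens every run file at once during the merge, so with a very large number of runs you would likely need to merge in several passes instead.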