The Burrows–Wheeler transform (BWT, also called block-sorting compression) is an algorithm used in data compression techniques such as bzip2. It was invented by Michael Burrows and David Wheeler in 1994 while working at the DEC Systems Research Center in Palo Alto, California. It is based on a previously unpublished transformation discovered by Wheeler in 1983.
When a character string is transformed by the BWT, none of its characters change value; the transformation only permutes their order. If the original string had several substrings that occurred often, then the transformed string will have several places where a single character is repeated multiple times in a row. This is useful for compression, since a string with runs of repeated characters tends to be easy to compress with techniques such as the move-to-front transform and run-length encoding.
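The permutation described above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration rather than the method used by bzip2 or any other compressor: it sorts all rotations of the input and keeps the last column. The function names `bwt` and `inverse_bwt`, and the choice to return the row index of the original string so that the transform can be undone, are assumptions made for this sketch.

```python
from typing import Tuple


def bwt(text: str) -> Tuple[str, int]:
    """Naive Burrows-Wheeler transform (illustrative sketch only).

    Returns the last column of the sorted rotation table together with
    the row at which the original string appears in that table.
    """
    if not text:
        return "", 0
    n = len(text)
    # Build every rotation of the input and sort them lexicographically.
    rotations = sorted(text[i:] + text[:i] for i in range(n))
    # The transform is the last character of each sorted rotation.
    last_column = "".join(rotation[-1] for rotation in rotations)
    return last_column, rotations.index(text)


def inverse_bwt(last_column: str, index: int) -> str:
    """Undo bwt() by rebuilding the sorted rotation table column by column."""
    n = len(last_column)
    if n == 0:
        return ""
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to every row, then re-sort the rows.
        table = sorted(ch + row for ch, row in zip(last_column, table))
    return table[index]


transformed, row = bwt("banana")
print(transformed, row)               # nnbaaa 3
print(inverse_bwt(transformed, row))  # banana
```

For the input `banana` the output `nnbaaa` contains exactly the same characters as the input, but the repeated letters now sit next to each other. Sorting full rotations costs far more time and memory than practical implementations would accept; the sketch favors readability over efficiency.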
The output is easier to compress because it has many repeated characters. In the transformed string there are six runs of identical characters: XX, SS, PP, .., II, and III, which together account for 13 of its 44 characters.
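The run count quoted above can be checked with a short script. The helper `repeated_runs` is a hypothetical name introduced for this sketch, and the `sample` string is made up for illustration, since the transformed string itself is not reproduced in this text; only the list of six runs and the total length of 44 come from the passage.

```python
from itertools import groupby
from typing import List


def repeated_runs(s: str) -> List[str]:
    """Return every maximal run of two or more identical characters in s."""
    runs = ("".join(group) for _, group in groupby(s))
    return [run for run in runs if len(run) >= 2]


# The six runs listed in the text, and the share of a 44-character string they cover.
quoted_runs = ["XX", "SS", "PP", "..", "II", "III"]
covered = sum(len(run) for run in quoted_runs)
print(covered, round(covered / 44, 3))  # 13 0.295

# Hypothetical sample string, not taken from the article.
sample = "aXXbSSScdd"
print(repeated_runs(sample))  # ['XX', 'SSS', 'dd']
```

A run-length encoder benefits from exactly these runs: each one can be stored as a character plus a count instead of as repeated characters.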