Burrows–Wheeler Transform - Sample Implementation

Sample Implementation

This Python implementation sacrifices speed for simplicity: the program is short, but takes more than the linear time that would be desired in a practical implementation.

Using the null character as the end of file marker, and using s + s to construct the ith rotation of s, the forward transform takes the last character of each of the sorted rows:

def bwt(s): """Apply Burrows-Wheeler transform to input string.""" assert "" not in s, "Input string cannot contain null character ('\0')" s += "" # Add end of file marker table = sorted(s + s for i in range(len(s))) # Table of rotations of string last_column = for row in table] # Last characters of each row return "".join(last_column) # Convert list of characters into string

The inverse transform repeatedly inserts r as the left column of the table and sorts the table. After the whole table is built, it returns the row that ends with null, minus the null.

def ibwt(r): """Apply inverse Burrows-Wheeler transform.""" table = * len(r) # Make empty table for i in range(len(r)): table = sorted(r + table for i in range(len(r))) # Add a column of r s = # Find the correct row (ending in "") return s.rstrip("") # Get rid of trailing null character

Here is another, more efficient method for the inverse transform. Although more complex, it increases the speed greatly when decoding lengthy strings.

def ibwt(r, *args): "Inverse Burrows-Wheeler transform. args is the original index if it was not indicated by a null byte" firstCol = "".join(sorted(r)) count = *256 byteStart = *256 output = * len(r) shortcut = *len(r) #Generates shortcut lists for i in range(len(r)): shortcutIndex = ord(r) shortcut = count count += 1 shortcutIndex = ord(firstCol) if byteStart == -1: byteStart = i localIndex = (r.index("x00") if not args else args) for i in range(len(r)): #takes the next index indicated by the transformation vector nextByte = r output = nextByte shortcutIndex = ord(nextByte) #assigns localIndex to the next index in the transformation vector localIndex = byteStart + shortcut return "".join(output).rstrip("x00")

Read more about this topic:  Burrows–Wheeler Transform

Other articles related to "sample implementation, implementation":

Push-relabel Maximum Flow Algorithm - Sample Implementation
... C implementation #include #include #define NODES 6 #define MIN(X,Y) X < Y ? X Y #define INFINITE 10000000 void push(const int **C, int **F, int *excess, int u. 0 # start from front of list else p += 1 return sum(F) Note that the above implementation is not very efficient ...

Famous quotes containing the word sample:

    All that a city will ever allow you is an angle on it—an oblique, indirect sample of what it contains, or what passes through it; a point of view.
    Peter Conrad (b. 1948)