霍夫曼编码算法
霍夫曼编码是一种无损数据压缩算法。这种算法将可变长度代码分配给输入的不同字符。代码长度与字符的使用频率相关。使用最频繁的字符具有最短代码,而最不频繁的字符具有最长代码。
主要有两个部分。第一个部分是创建霍夫曼树,另一个部分是遍历树以查找代码。
例如,考虑一些字符串“YYYZXXYYX”,字符 Y 的频率大于 X,而字符 Z 的频率最低。因此,Y 的代码长度小于 X,而 X 的代码长度小于 Z。
根据每个字符的频率为其分配代码的复杂度为 O(n log n)
输入和输出
Input: A string with different characters, say “ACCEBFFFFAAXXBLKE” Output: Code for different characters: Data: K, Frequency: 1, Code: 0000 Data: L, Frequency: 1, Code: 0001 Data: E, Frequency: 2, Code: 001 Data: F, Frequency: 4, Code: 01 Data: B, Frequency: 2, Code: 100 Data: C, Frequency: 2, Code: 101 Data: X, Frequency: 2, Code: 110 Data: A, Frequency: 3, Code: 111
算法
huffmanCoding(string)
输入:不同字符的字符串。
输出:每个单个字符的代码。
Begin define a node with character, frequency, left and right child of the node for Huffman tree. create a list ‘freq’ to store frequency of each character, initially, all are 0 for each character c in the string do increase the frequency for character ch in freq list. done for all type of character ch do if the frequency of ch is non zero then add ch and its frequency as a node of priority queue Q. done while Q is not empty do remove item from Q and assign it to left child of node remove item from Q and assign to the right child of node traverse the node to find the assigned code done End
traverseNode(n: node, code)
输入:霍夫曼树的节点 n 及之前调用分配的代码
输出:与每个字符分配的代码
if a left child of node n ≠φ then traverseNode(leftChild(n), code+’0’) //traverse through the left child traverseNode(rightChild(n), code+’1’) //traverse through the right child else display the character and data of current node.
例子
#include #include#include using namespace std; struct node { int freq; char data; const node *child0, *child1; node(char d, int f = -1) { //assign values in the node data = d; freq = f; child0 = NULL; child1 = NULL; } node(const node *c0, const node *c1) { data = 0; freq = c0->freq + c1->freq; child0=c0; child1=c1; } bool operator<( const node &a ) const { //< operator performs to find priority in queue return freq >a.freq; } void traverse(string code = "")const { if(child0!=NULL) { child0->traverse(code+'0'); //add 0 with the code as left child child1->traverse(code+'1'); //add 1 with the code as right child }else { cout << "Data: " << data<< ", Frequency: "< qu; int frequency[256]; for(int i = 0; i<256; i++) frequency[i] = 0; //clear all frequency for(int i = 0; i1) { node *c0 = new node(qu.top()); //get left child and remove from queue qu.pop(); node *c1 = new node(qu.top()); //get right child and remove from queue qu.pop(); qu.push(node(c0, c1)); //add freq of two child and add again in the queue } cout << "The Huffman Code: "<
输出
The Huffman Code: Data: K, Frequency: 1, Code: 0000 Data: L, Frequency: 1, Code: 0001 Data: E, Frequency: 2, Code: 001 Data: F, Frequency: 4, Code: 01 Data: B, Frequency: 2, Code: 100 Data: C, Frequency: 2, Code: 101 Data: X, Frequency: 2, Code: 110 Data: A, Frequency: 3, Code: 111
广告