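C++ Program to Implement the Bitap Algorithm for String Matching
This article presents a C++ program that implements the Bitap algorithm for string matching. Bitap determines whether a given text contains a substring that is "approximately equal" to a given pattern, where approximate equality is defined by Levenshtein distance: if a substring and the pattern are within a given distance k of each other, the algorithm treats them as equal. The algorithm begins by precomputing a set of bitmasks, one per alphabet symbol, each holding one bit for every position of the pattern. Most of the work can then be done with bitwise operations, which are very cheap: as long as the pattern fits in a machine word, each text character costs only a few OR, shift, and AND instructions. The program below covers the exact-matching case (k = 0) and reports the position of the first occurrence of the pattern.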
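To make the precomputation step concrete, here is a minimal sketch of how the per-character bitmasks could be built. The helper names build_pattern_masks and low_bits are hypothetical and do not appear in the program in this article; the sketch also assumes a byte-sized alphabet and a pattern of at most 63 characters.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: build one 64-bit mask per byte value.
// Bit i of masks[c] is 0 exactly when pattern[i] == c; all other bits are 1.
// Assumes pattern.size() <= 63.
std::array<std::uint64_t, 256> build_pattern_masks(const std::string& pattern) {
    std::array<std::uint64_t, 256> masks;
    masks.fill(~std::uint64_t{0});
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const auto c = static_cast<unsigned char>(pattern[i]);
        masks[c] &= ~(std::uint64_t{1} << i);
    }
    return masks;
}

// Hypothetical helper: render the low `width` bits of a mask,
// most significant bit first, so the masks can be inspected.
std::string low_bits(std::uint64_t mask, std::size_t width) {
    std::string out;
    for (std::size_t i = width; i-- > 0;)
        out += ((mask >> i) & 1u) ? '1' : '0';
    return out;
}
```

For the pattern "point", low_bits(masks['p'], 5) gives 11110 (only position 0 holds 'p') and low_bits(masks['t'], 5) gives 01111 (only position 4 holds 't'). A character that does not occur in the pattern keeps the mask 11111.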
Algorithm
Begin
   Take the text and the pattern as input.
   function bitmap_search(), which takes the text t and the pattern p as arguments:
      m = p.length()
      if m == 0 or m > 63
         return -1
      Initialize the bit array A = ~1.
      Initialize the pattern bitmasks, p_mask[300]:
         for i = 0 to 299
            p_mask[i] = ~0
         for i = 0 to m-1
            p_mask[p[i]] and= ~(1L left shift i)
      Update the bit array:
         for i = 0 to t.length()-1
            A or= p_mask[t[i]]
            A left shift= 1
            if (A and (1L left shift m)) == 0
               return i - m + 1
      return -1
End
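The update loop is easier to follow when the bit array is printed after every text character. The following is a rough sketch, not part of the original program: the function name trace_bitap_states is hypothetical, and it assumes a non-empty pattern of at most 63 characters. It runs the same OR-then-shift update as the pseudocode above and records the low m+1 bits of the state at each step.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical tracing helper: returns one string per text character,
// showing bits m..0 of the state (bit m first) after that character is read.
// A leading '0' means the whole pattern ends at that text position.
std::vector<std::string> trace_bitap_states(const std::string& text,
                                            const std::string& pattern) {
    std::vector<std::string> rows;
    const std::size_t m = pattern.size();
    if (m == 0 || m > 63) return rows;  // same limits as the pseudocode

    std::uint64_t masks[256];
    for (auto& mask : masks) mask = ~std::uint64_t{0};
    for (std::size_t i = 0; i < m; ++i)
        masks[static_cast<unsigned char>(pattern[i])] &= ~(std::uint64_t{1} << i);

    std::uint64_t state = ~std::uint64_t{1};  // bit 0 clear: empty prefix matched
    for (std::size_t i = 0; i < text.size(); ++i) {
        state |= masks[static_cast<unsigned char>(text[i])];
        state <<= 1;
        std::string row;
        for (std::size_t b = m + 1; b-- > 0;)
            row += ((state >> b) & 1u) ? '1' : '0';
        rows.push_back(row);
    }
    return rows;
}
```

For the text "abab" and the pattern "ab", the rows are 100, 010, 100, 010. The leading 0 in the second and fourth rows marks matches ending at text indices 1 and 3, which correspond to start positions 0 and 2.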
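Example Code
#include <iostream>
#include <string>
using namespace std;

// Returns the index of the first exact occurrence of p in t, or -1 if there is none.
int bitmap_search(const string &t, const string &p) {
   const size_t m = p.length();
   unsigned long long p_mask[300];   // one bitmask per character value
   unsigned long long A = ~1ULL;     // bit array: bit 0 clear, all other bits set
   if (m == 0)
      return -1;
   if (m > 63) {                     // bit m must fit in a 64-bit word
      cout << "Pattern is too long!\n";
      return -1;
   }
   // Start with all bits set, then clear bit i in the mask of character p[i].
   for (int i = 0; i <= 299; ++i)
      p_mask[i] = ~0ULL;
   for (size_t i = 0; i < m; ++i)
      p_mask[static_cast<unsigned char>(p[i])] &= ~(1ULL << i);
   // Scan the text; bit m of A is clear when the whole pattern ends at t[i].
   for (size_t i = 0; i < t.length(); ++i) {
      A |= p_mask[static_cast<unsigned char>(t[i])];
      A <<= 1;
      if ((A & (1ULL << m)) == 0)
         return static_cast<int>(i + 1 - m);
   }
   return -1;
}

void findPattern(const string &t, const string &p) {
   int position = bitmap_search(t, p);   // first match position, or -1
   if (position == -1)
      cout << "\nNo Match\n";
   else
      cout << "\nPattern found at position : " << position << "\n";
}

int main() {
   cout << "Enter Text:\n";
   string t;
   cin >> t;   // reads a single whitespace-delimited word
   cout << "Enter Pattern:\n";
   string p;
   cin >> p;
   findPattern(t, p);
   return 0;
}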
Output
Enter Text:
Tutorialspoint
Enter Pattern:
point

Pattern found at position : 9
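The program above finds exact matches only. As a rough illustration of how the same bit-parallel idea can tolerate errors, the sketch below keeps k+1 bit arrays, one per allowed error count. It is a simplification, not the full algorithm described in the introduction: it counts substitutions only (Hamming distance) and does not handle the insertions and deletions that Levenshtein distance also allows. The function name bitap_search_substitutions is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: returns the start index of the first window of text
// that equals pattern after at most k character substitutions, or -1.
// Assumes a non-empty pattern of at most 63 characters.
long long bitap_search_substitutions(const std::string& text,
                                     const std::string& pattern,
                                     std::size_t k) {
    const std::size_t m = pattern.size();
    if (m == 0 || m > 63) return -1;

    std::uint64_t masks[256];
    for (auto& mask : masks) mask = ~std::uint64_t{0};
    for (std::size_t i = 0; i < m; ++i)
        masks[static_cast<unsigned char>(pattern[i])] &= ~(std::uint64_t{1} << i);

    // states[d]: a clear bit j means a prefix of length j ends here
    // with at most d substituted characters.
    std::vector<std::uint64_t> states(k + 1, ~std::uint64_t{1});
    const std::uint64_t match_bit = std::uint64_t{1} << m;

    for (std::size_t i = 0; i < text.size(); ++i) {
        const std::uint64_t mask = masks[static_cast<unsigned char>(text[i])];
        std::uint64_t previous_old = states[0];  // states[d-1] before this step
        states[0] = (states[0] | mask) << 1;
        for (std::size_t d = 1; d <= k; ++d) {
            const std::uint64_t current_old = states[d];
            // Either spend one substitution on top of d-1 errors,
            // or extend a d-error prefix with a matching character.
            states[d] = (previous_old & (current_old | mask)) << 1;
            previous_old = current_old;
        }
        if ((states[k] & match_bit) == 0)
            return static_cast<long long>(i + 1 - m);
    }
    return -1;
}
```

With the text "Tutorialspoint", the pattern "paint" is not found when k = 0, but with k = 1 the call returns 9, because "point" differs from "paint" in a single character.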