长度为 K 的公共前缀字符串的最大数量

在这个问题中，我们需要计算具有长度为 K 的公共前缀的最大字符串数。我们可以从所有字符串中提取长度为 K 的前缀，并使用 map 数据结构计算相似前缀的最大数量。此外，我们还可以使用 Trie 数据结构来解决这个问题。

问题陈述 - 我们给定一个包含多个字符串的 strs[] 数组。我们需要计算包含长度为 K 的公共前缀的字符串的最大数量。

示例

输入

strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
 K = 3;

输出

解释 - 这 4 个字符串包含长度为 3 的公共前缀 'tut'。

输入

strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
 K = 8;

输出

解释 - 只有 2 个字符串包含相同的长度为 8 的前缀，即 'tutorial'。

输入

strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
 K = 8;

输出

解释 - 数组不包含长度为 2 的公共前缀的字符串。因此，我们可以输出 1。

方法 1

在这种方法中，我们将使用 map 数据结构来计算每个子字符串长度为 K 的前缀的频率。最后，我们取频率最高的前缀作为输出。

算法

步骤 1 - 将 'ans' 初始化为 0，用于存储具有公共前缀的最大字符串数。此外，定义 'pref' map 来存储字符串前缀的频率。

步骤 2 - 开始遍历字符串。

步骤 3 - 从 0 开始获取长度为 K 的子字符串，并将其存储到 'temp' 字符串中。

步骤 4 - 在 map 中将 'temp' 的频率加 1。

步骤 5 - 从 'ans' 和 'temp' 字符串的频率中获取最大值。

步骤 6 - 最后，返回 'ans' 值。

示例

#include <bits/stdc++.h>
using namespace std;

int getMaxStrs(vector<string> strs, int K) {
   int ans = 0;
   // Map to store prefix of length K
   map<string, int> pref;
   // Traverse string
   for (int p = 0; p < strs.size(); p++) {
      // Taking substring of length K
      string temp = strs[p].substr(0, K);
      // Insert the prefix into the map
      pref[temp]++;
      // Get the maximum answer
      ans = max(ans, pref[temp]);
   }
   return ans;
}
int main() {
   vector<string> strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
   int K = 3;
   cout << "The number of strings having a common prefix of length K is " << getMaxStrs(strs, K);
   return 0;
}

输出

The number of strings having a common prefix of length K is 4

时间复杂度 - 遍历字符串为 O(N)。

空间复杂度 - 在 map 中存储前缀的频率为 O(N)。

方法 2

在这种方法中，我们将使用 Trie 数据结构来查找具有公共前缀的最大字符串数。我们将所有字符串的前缀插入到 Trie 中，并检查它是否出现次数最多。

算法

步骤 1 - 初始化 Trie 的节点，该节点包含长度为 26 的数组，指向当前节点的每个字母字符，并将 'cnt' 变量初始化为 0，表示公共前缀的数量。

步骤 2 - 开始遍历数组的每个字符串，并执行 insertNode() 函数以将字符串长度为 K 的前缀插入到 Trie 中。此外，将 'ans' 变量作为引用传递，以存储具有公共前缀的最大字符串数。

步骤 3 - 在 insertNode() 函数中，使用 'head' 节点初始化 'temp' 节点，并遍历字符串以将其前缀插入到 Trie 中。

步骤 4 - 使用 toLower() 方法将字符转换为小写。

步骤 5 - 如果 temp 节点的 'chars' 数组的 (ch - 'a') 索引为空，则将其赋值为新节点。

步骤 6 - 将 temp->chars[ch - 'a'] 节点的 'cnt' 加 1。

步骤 7 - 如果 p + 1 等于 K，则我们将长度为 K 的前缀插入到 Trie 中。因此，使用 'ans' 和 temp->chars[ch - 'a']->cnt 的最大值更新 'ans'，并中断循环。

步骤 8 - 将 temp 节点移动到下一个节点。

步骤 9 - 最后，打印 'ans' 值。

示例

#include <bits/stdc++.h>
using namespace std;

struct Node {
   Node *chars[26];
   int cnt = 0;
};
Node* head;
void insertNode(string &str, int K, int &ans) {
   // Temporary node
   Node *temp = head;
   // Traverse string characters
   for (int p = 0; p < str.size(); p++) {
      // Change character to lowercase
      char ch = tolower(str[p]);
      // If the node does not exist for the current character, initialize it
      if (temp->chars[ch - 'a'] == NULL) {
         temp->chars[ch - 'a'] = new Node();
      }
      // Increase count to increment the length of the prefix
      temp->chars[ch - 'a']->cnt++;
      // If p + 1 is equal to K, then get the maximum result and break the loop.
      if (p + 1 == K) {
         ans = max(ans, temp->chars[ch - 'a']->cnt);
         break;
      }
      // Go to the next pointer
      temp = temp->chars[ch - 'a'];
   }
}
int main() {
   vector<string> strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
   int K = 3;
   int ans = 0;
   // Node initialization
   head = new Node(); 
   // Insert all the strings into Trie
   for (auto str : strs) {
      insertNode(str, K, ans);
   }
   cout << "The number of strings having a common prefix of length K is " << ans;
   return 0;
}

输出

The number of strings having a common prefix of length K is 4

时间复杂度 - O(N*K)，其中 N 是数组长度，K 是前缀长度。

空间复杂度 - 在 Trie 中存储所有字符串为 O(N*K)。

第一种方法更有效，并且对于初学者更容易理解。第二种方法可能比较复杂，但学习 Trie 数据结构的概念是必要的。

Shubham Vora

更新于：2023年8月31日

浏览量 128

开启你的职业生涯

完成课程获得认证

开始学习