C++程序：查找序列中出现次数第二多的单词

给定一个单词数组，我们需要找到频率第二高的单词。

让我们假设一些简单的输入和输出场景

假设我们有一个包含如下元素的数组：[“point,” “world,” “articles,” “articles,” “articles,” “world,” “world,” “world,” “point”]。

单词的频率为：

“point”: 2
“world”: 4
“articles”: 3 // This will be the second most repeated word in the array.

因此，出现次数第二多的单词是“articles”，我们的输出是“articles”。

让我们考虑另一种情况，其中数组中有一组相似的单词，例如[“abc”, “abc”, “abc”, “bab”, “bab”, “cat”]。序列中出现次数第二多的单词将是“bab”。

“abc” = 3
“bab” = 2 // This will be the second most repeated word in the array.
“cat” = 1

我们可以使用哈希表查找给定单词数组中每个单词的频率，然后返回第二高的频率。在C++中，我们可以使用`unordered_map`数据结构。

算法

执行此任务需要遵循以下步骤：

实现一个哈希表。
将向量中每个字符串的频率存储在哈希表中。
在将每个字符串存储到哈希表后遍历哈希表，以查找具有第二高频率的字符串。
然后，打印该字符串作为输出

示例

查找列表中出现频率第二高的单词的C++实现如下：


#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;
string solve(vector<string> words) {
   unordered_map<string, int> hash;
   for (auto word: words) {
      hash[word]++;
   }
   int first_max = INT32_MIN, sec_max = INT32_MIN;
   for (auto it: hash) {
      if (it.second > first_max) {
         sec_max = first_max;
         first_max = it.second;
      } else if (it.second > sec_max && it.second != first_max) {
         sec_max = it.second;
      }
   }
   for (auto it: hash) {
      if (it.second == sec_max) {
         return it.first;
      }
   }
   return "";
}
int main() {
   vector<string> words = { "point", "world", "articles", "articles", "articles", "world", "world", "world", "point" };
   cout << solve(words);
   return 0;
}

输出

“articles”

结论

我们也可以在C++中使用`map`，但这没有必要，因为我们不希望在哈希时按字典顺序对单词进行排序。使用`map`也会比较昂贵，因为插入操作需要O(log(n))的时间复杂度，而`unordered_map`只需要O(1)。然而，对于更大的输入，为了避免冲突并保持时间复杂度为O(1)，我们必须编写自定义哈希函数。

Prateek Jangid

更新于：2022年8月10日

浏览量：313

开启你的职业生涯

通过完成课程获得认证

开始学习