Finding words in a sentence that start with a given prefix


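In natural language processing and text analysis, we often need to search a larger body of text for specific words or phrases. A common task is finding all the words in a sentence that start with a given prefix. This article shows how to do that in C, C++, Java, and Python.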

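Algorithm

  • Read the input sentence and the prefix.

  • Split the input sentence into individual words.

  • For each word in the sentence, check whether it starts with the given prefix.

  • If the word starts with the prefix, add it to the list of matching words.

  • Print the list of matching words.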
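
The steps above can be sketched as a single function. This is a minimal illustration rather than a definitive implementation: the function name `find_words_with_prefix` is a hypothetical choice for this sketch, and it assumes that words are separated by whitespace and that the comparison is case-sensitive.

```python
def find_words_with_prefix(sentence, prefix):
    """Return the words in `sentence` that start with `prefix` (case-sensitive)."""
    matches = []
    # Split the sentence into words on runs of whitespace.
    for word in sentence.split():
        # Keep the word only if it begins with the prefix.
        if word.startswith(prefix):
            matches.append(word)
    return matches


sentence = "The quick brown fox jumps over the lazy dog"
print(find_words_with_prefix(sentence, "fox"))  # ['fox']
print(find_words_with_prefix(sentence, "th"))   # ['the']
```

Returning the list instead of printing it inside the function keeps the matching logic separate from the output step, so the same function can be reused or tested directly.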

Example

The following programs solve the problem in C, C++, Java, and Python. The C version comes first:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main() {
   // Declare variables
   char sentence[] = "The quick brown fox jumps over the lazy dog";
   char prefix[] = "fox";
   char *word = strtok(sentence, " "); // Tokenization using space as delimiter
   char *words[100]; // Array to store tokenized words
   int wordCount = 0;

   // Tokenize the sentence into individual words
   while (word != NULL && wordCount < 100) { // Stop before overflowing words[]
      words[wordCount++] = word; // Store each word in the array
      word = strtok(NULL, " "); // Get the next word
   }

   char *matches[100]; // Array to store matching words
   int matchCount = 0;

   // Check for words that start with the given prefix
   for (int i = 0; i < wordCount; i++) {
      if (strncmp(words[i], prefix, strlen(prefix)) == 0) { // Compare with prefix
         matches[matchCount++] = words[i]; // Store matching words
      }
   }

   // Print the matching words
   printf("Matching words:\n");
   for (int i = 0; i < matchCount; i++) {
      printf("%s\n", matches[i]);
   }

   return 0;
}

Output

Matching words:
fox

The same program in C++:

#include <iostream>
#include <string>
#include <vector>

using namespace std;

int main() {
   string sentence, prefix;
   vector<string> words;
   
   // Read in the input sentence and prefix
   sentence="The quick brown fox jumps over the lazy dog";
   prefix="fox";
   
   // Tokenize the input sentence into individual words
   string word = "";
   for (auto c : sentence) {
      if (c == ' ') {
         words.push_back(word);
         word = "";
      }
      else {
         word += c;
      }
   }
   words.push_back(word);

   // Find all words in the sentence that start with the given prefix
   vector<string> matches;
   for (auto w : words) {
      if (w.substr(0, prefix.length()) == prefix) {
         matches.push_back(w);
      }
   }
   
   // Print the list of matching words
   cout << "Matching words:" << endl;
   for (auto m : matches) {
      cout << m << endl;
   }
   
   return 0;
}

Output

Matching words:
fox

The same program in Java:

import java.util.ArrayList;

public class Main {
   public static void main(String[] args) {
      // Declare variables
      String sentence = "The quick brown fox jumps over the lazy dog";
      String prefix = "fox";
      String[] words = sentence.split(" "); // Tokenization using space as delimiter

      ArrayList<String> matches = new ArrayList<>(); // ArrayList to store matching words

      // Check for words that start with the given prefix
      for (String w : words) {
         if (w.startsWith(prefix)) {
            matches.add(w); // Store matching words in ArrayList
         }
      }

      // Print the matching words
      System.out.println("Matching words:");
      for (String m : matches) {
         System.out.println(m);
      }
   }
}

Output

Matching words:
fox

The same program in Python:

sentence = "The quick brown fox jumps over the lazy dog"
prefix = "fox"
words = sentence.split() # Tokenization: split() with no arguments splits on any whitespace

matches = [w for w in words if w.startswith(prefix)] # List comprehension to find matching words

print("Matching words:")
for m in matches:
   print(m)

Output

Matching words:
fox

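Example test case

Suppose we have the following input sentence:

The quick brown fox jumps over the lazy dog

and we want to find every word that starts with the prefix "fox". Running any of the programs above on this input produces the following output:

Matching words:
fox

In this example, the only word in the sentence that starts with the prefix "fox" is "fox" itself, so it is the only word printed as a match.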
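
The programs above compare characters exactly, so the match is case-sensitive: with the prefix "the", the word "the" matches but "The" does not. If case should be ignored, one possible variant is sketched below. The function name `find_words_with_prefix_ignore_case` is hypothetical, and the sketch assumes that folding both sides with Python's `str.casefold()` is an acceptable definition of "same letters" for the text being processed.

```python
def find_words_with_prefix_ignore_case(sentence, prefix):
    """Return the words in `sentence` that start with `prefix`, ignoring case."""
    # Fold the prefix once so it is not recomputed for every word.
    folded_prefix = prefix.casefold()
    # Compare folded copies, but return the words as they appear in the sentence.
    return [
        word
        for word in sentence.split()
        if word.casefold().startswith(folded_prefix)
    ]


sentence = "The quick brown fox jumps over the lazy dog"
print(find_words_with_prefix_ignore_case(sentence, "the"))  # ['The', 'the']
```

The words are returned in their original spelling, so only the comparison is affected by the case folding.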

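Conclusion

Finding all the words in a sentence that start with a given prefix is a useful task in natural language processing and text analysis. By splitting the input sentence into individual words and checking whether each word begins with the prefix, we can accomplish it easily.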

Updated on: October 20, 2023
