How to filter rows in Pandas by regex?

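A regular expression (regex) is a sequence of characters that defines a search pattern. To filter rows in Pandas by regex, we can use the str.match() method, which checks whether each string matches the pattern starting from its first character.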
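As a rough illustration of that anchoring behaviour, the sketch below compares str.match() with two related methods, str.contains() and str.fullmatch(), on a small made-up Series. The sample names are assumptions chosen only for this illustration, and str.fullmatch() is assumed to be available (pandas 1.1 or later).

```python
import pandas as pd

# Hypothetical sample data, used only to show the difference between methods
names = pd.Series(["John", "Jacob", "Tom", "MJ"])

# str.match: the pattern must match starting at the first character
starts_with_j = names.str.match("J")

# str.contains: the pattern may match anywhere in the string
contains_j = names.str.contains("J")

# str.fullmatch: the pattern must match the entire string
exactly_j_hn = names.str.fullmatch("J.hn")

print(names[starts_with_j].tolist())
print(names[contains_j].tolist())
print(names[exactly_j_hn].tolist())
```

Because str.match() is already anchored at the start, a pattern such as 'J' selects the same rows as 'J.*'.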

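Steps

Create two-dimensional, size-mutable, potentially heterogeneous tabular data, df.
Print the input DataFrame, df.
Initialize a variable regex for the expression. Supply a string value as the regex; for example, the string 'J.*' selects all entries that start with the letter 'J'.
Use df.column_name.str.match(regex) to build a boolean mask for the given column, then index df with that mask to keep only the matching rows.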
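The steps above can be sketched as a small reusable helper. The function name filter_rows_by_regex and the students DataFrame are hypothetical, introduced here only for illustration; the helper uses df[column] instead of attribute access so that it also works when the column name is held in a variable.

```python
import pandas as pd

def filter_rows_by_regex(df, column, pattern):
    """Return the rows of df whose value in `column` matches `pattern` from the start."""
    # Boolean mask: True where the string matches the pattern
    mask = df[column].str.match(pattern)
    return df[mask]

# Hypothetical sample data
students = pd.DataFrame({"name": ["John", "Jacob", "Tom"], "marks": [89, 23, 100]})

j_rows = filter_rows_by_regex(students, "name", "J.*")
print(j_rows)
```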

Example

import pandas as pd

df = pd.DataFrame(
   dict(
      name=['John', 'Jacob', 'Tom', 'Tim', 'Ally'],
      marks=[89, 23, 100, 56, 90],
      subjects=["Math", "Physics", "Chemistry", "Biology", "English"]
   )
)

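# Show the full DataFrame before filtering
print("Input DataFrame is:")
print(df)

# Keep only the rows whose name starts with "J"
regex = 'J.*'
print(f"After applying {regex} DataFrame is:")
print(df[df.name.str.match(regex)])

# Keep only the rows whose name starts with "A"
regex = 'A.*'
print(f"After applying {regex} DataFrame is:")
print(df[df.name.str.match(regex)])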

Output

Input DataFrame is:

     name    marks   subjects
0    John     89        Math
1   Jacob     23     Physics
2     Tom    100   Chemistry
3     Tim     56     Biology
4    Ally     90     English

After applying J.* DataFrame is:

    name   marks   subjects
0   John     89        Math
1  Jacob     23     Physics

After applying A.* DataFrame is:

    name   marks   subjects
4   Ally     90     English
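Real data often contains missing values or inconsistent capitalisation. The sketch below, which uses a made-up DataFrame, passes two optional parameters of str.match(): na=False so that missing entries count as non-matches, and case=False for case-insensitive matching. Without na=False, a missing value can leave a non-boolean entry in the mask (the exact behaviour depends on the pandas version and column dtype), and indexing with such a mask may fail.

```python
import pandas as pd

# Hypothetical data with a missing name and mixed capitalisation
df = pd.DataFrame(
    {
        "name": ["John", "jacob", None, "Ally"],
        "marks": [89, 23, 100, 90],
    }
)

# na=False: treat missing values as "no match"
# case=False: ignore upper/lower case when matching
mask = df["name"].str.match("j.*", case=False, na=False)

result = df[mask]
print(result)
```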

Rishikesh Kumar Rishi

Updated on: 14 September 2021
