How to use "pd.Series.str.contains()" when whitespace is present in a column name?

Question

I want to find the first occurrence of a substring in a column.

For example, if I have the following dataframe:

# Create example dataframe
import pandas as pd
data = {
        'Tunnel ID':['Tom', 'Dick', 'Harry'],
        'State':['Grumbly', 'Very Happy', "Happy"],
        'Length':[302, 285, 297]
        }
df = pd.DataFrame(data)

I can find the first occurrence of 'Happy' in the 'State' column with:

# Returns index 1
first_match = df.State.str.contains('Happy').idxmax()

However, if I want to find the first match for 'ic' in 'Tunnel ID':

# Returns syntax error because of space in col name.
first_match = df.Tunnel ID.str.contains('ic').idxmax()
# Would ideally return index: 1; containing ID: 'Dick'.

So how do I use pd.Series.str.contains() when the column name contains whitespace?

Answer 1

You can also access your column by indexing into your dataframe with brackets rather than using dot notation. So just do

first_match = df["Tunnel ID"].str.contains('ic').idxmax()

and you should be good to go.
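
One thing worth keeping in mind with this approach: idxmax() on a boolean Series returns the first index label even when nothing matched, because every value is False. Below is a minimal sketch that guards against that case, assuming the example dataframe from the question. The helper name first_match_index is hypothetical, not part of pandas, and regex=False is used on the assumption that a literal substring match is what is wanted.

```python
import pandas as pd

# Example dataframe from the question.
df = pd.DataFrame({
    'Tunnel ID': ['Tom', 'Dick', 'Harry'],
    'State': ['Grumbly', 'Very Happy', 'Happy'],
    'Length': [302, 285, 297],
})


def first_match_index(series, substring):
    """Hypothetical helper: index label of the first row containing
    `substring`, or None if no row matches."""
    # regex=False treats `substring` as a literal string, not a pattern.
    mask = series.str.contains(substring, regex=False)
    # Without the any() check, idxmax() would return the first label
    # of the Series when there is no match at all.
    return mask.idxmax() if mask.any() else None


# Bracket indexing works for any column name, including ones with spaces.
first_match = first_match_index(df['Tunnel ID'], 'ic')
matched_id = df.loc[first_match, 'Tunnel ID']
no_match = first_match_index(df['Tunnel ID'], 'zz')
```

Here first_match is the index label of the 'Dick' row, and no_match is None rather than a misleading 0.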

python

series

pandas