Find substring in pandas dataframe and save in new column

Question

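I have a dataframe with approx. 10,000 rows and 10 columns. I have a string, 'atmosphere', that I want to search for in the dataframe. The string can be found only once per row. I want to keep only the cells that contain this string, with their entire content, and save them in a new column. I have already found the following solution, but it only returns True (when the cell contains the string) or False (when it does not):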

df.apply(lambda col: col.str.contains('atmosphere', case=False), axis=1)
Output:
  col_1  col_2  col_3  col_4 ...
1 True   False  False  False
2 False  True   False  False
3 True   False  False  False 
...

How can I get from this to this?

   new_col
1 today**atmosphere**is
2 **atmosphere**humid
3 the**atmosphere**now

Answer 1

If you already have that boolean result, you can use it to mask the dataframe and simply stack it:

import pandas as pd

df = pd.DataFrame({"a": ["apple", "orange", "today atmosphere"],
                   "b": ["pineapple", "atmosphere humid", "kiwi"],
                   "c": ["the atmosphere now", "watermelon", "grapes"]})

                  a                 b                   c
0             apple         pineapple  the atmosphere now
1            orange  atmosphere humid          watermelon
2  today atmosphere              kiwi              grapes


# Mask df with the boolean result (non-matching cells become NaN), then stack
print(df[df.apply(lambda col: col.str.contains('atmosphere', case=False), axis=1)].stack())

0  c    the atmosphere now
1  b      atmosphere humid
2  a      today atmosphere
dtype: object
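
The stacked result is a Series indexed by (row, column) rather than a column on df. If you want it as new_col, as in the question, one possible sketch is to blank out the non-matching cells and back-fill along each row. The names mask and matched are just illustrative, and na=False is an assumption added here so that missing values count as non-matches:

```python
import pandas as pd

df = pd.DataFrame({"a": ["apple", "orange", "today atmosphere"],
                   "b": ["pineapple", "atmosphere humid", "kiwi"],
                   "c": ["the atmosphere now", "watermelon", "grapes"]})

# Boolean mask: True where the cell contains the substring (case-insensitive).
# na=False treats missing cells as non-matches (an assumption, not in the question).
mask = df.apply(lambda col: col.str.contains("atmosphere", case=False, na=False))

# Keep matching cells, replace everything else with NaN.
matched = df.where(mask)

# Back-fill along each row so the first column holds the row's match,
# then take that first column as the new column.
df["new_col"] = matched.bfill(axis=1).iloc[:, 0]

print(df["new_col"])
```

Rows without any match end up with NaN in new_col. Since the string occurs only once per row, taking the first match per row loses nothing.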
