pandas - 'DataFrame' object has no attribute 'str'
I am trying to filter a DataFrame that contains a list of products. However, whenever I run the code, I get the error: 'DataFrame' object has no attribute 'str'.
Here is the line of code:
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
If anyone has ideas or suggestions, please let me know. I have searched many times, but I am stumped.
Product is of object dtype.
Edit:
import __future__
import os
import pandas as pd
import numpy as np
import tensorflow as tf
import math
data = pd.read_csv("FILE.csv", header = None)
headerName=["DRID","Product","M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]
cliques = [(Confidential)]
data.columns=[headerName]
log_df = data
log_df = np.log(1+data[["M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]])
copy = data[["DRID","Product"]].copy()
log_df = copy.join(log_df)
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
Here is the head:
ID PRODUCT M24 M23 M22 M21
0 123421 A 0.000000 0.000000 1.098612 0.0
1 141840 A 0.693147 1.098612 0.000000 0.0
2 212006 A 0.693147 0.000000 0.000000 0.0
3 216097 A 1.098612 0.000000 0.000000 0.0
4 219517 A 1.098612 0.693147 1.098612 0.0
Edit 2: Here is print(data); A is the product. When printed, A does not seem to line up under the Product column.
DRID Product M24 M23 M22 M21 M20 \
0 52250 A 0.0 0.0 2.0 0.0 0.0
1 141840 A 1.0 2.0 0.0 0.0 0.0
2 212006 A 1.0 0.0 0.0 0.0 0.0
3 216097 A 2.0 0.0 0.0 0.0 0.0
Short answer: change data.columns=[headerName] to data.columns=headerName.
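Explanation: when you set data.columns=[headerName], you are passing a list that contains a list, so the columns become a MultiIndex object. As a result, log_df['Product'] is a DataFrame, and a DataFrame has no str attribute.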
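The effect can be checked directly on a toy frame. This is a minimal sketch: the two rows below are made-up stand-ins for FILE.csv, and the variable names `columns_type`, `product_type`, and `has_str` are introduced only for illustration.

```python
import pandas as pd

# Hypothetical toy rows standing in for FILE.csv (not the asker's real data)
data = pd.DataFrame([[52250, "Product A", 0.0], [141840, "Product B", 1.0]])
header_name = ["DRID", "Product", "M24"]

# Same assignment as in the question: the list is wrapped in another list
data.columns = [header_name]

columns_type = type(data.columns).__name__       # kind of column index pandas built
product_type = type(data["Product"]).__name__    # what selecting one label returns
has_str = hasattr(data["Product"], "str")        # is the .str accessor available?

print(columns_type, product_type, has_str)
```

If the explanation above holds, this should report a MultiIndex, a DataFrame, and no str accessor.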
When you set data.columns=headerName, log_df['Product'] is a single column (a Series), and you can use the str attribute.
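A minimal end-to-end sketch of the corrected flow, assuming a small made-up frame in place of pd.read_csv("FILE.csv", header=None) and only two month columns. `np.log1p` is swapped in here for the question's np.log(1 + ...); it computes the same quantity.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_csv("FILE.csv", header=None)
data = pd.DataFrame([
    [52250, "Product A", 0.0, 2.0],
    [141840, "Product B", 1.0, 0.0],
    [212006, "Product A", 2.0, 0.0],
])
header_name = ["DRID", "Product", "M24", "M23"]
month_cols = ["M24", "M23"]

# Plain list, not [header_name], so the columns stay a flat Index
data.columns = header_name

# log(1 + x) on the numeric columns, then re-attach the identifier columns
log_df = data[["DRID", "Product"]].join(np.log1p(data[month_cols]))

# log_df["Product"] is now a Series, so the .str accessor works
include_clique = log_df.loc[log_df["Product"].str.contains("Product A")]
print(include_clique)
```

Passing names=headerName to pd.read_csv together with header=None is another way to get flat column labels at load time.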
If for any reason you need to keep your data as a MultiIndex object, there is another solution: first convert log_df['Product'] to a Series. After that, the str attribute is available.
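# Flatten the one-column DataFrame into a plain Series so that .str is available
products = pd.Series(log_df['Product'].values.flatten())
# Note: this yields the matching product values, not the matching rows of log_df
include_clique = products[products.str.contains("Product A")]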
That said, I think the first solution is the one you are looking for.
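If the MultiIndex columns really must stay and the goal is to select whole rows of log_df, one possible variant is to take the single underlying column with `.iloc[:, 0]`, which keeps the row index, and use the resulting boolean mask on the frame. This is a sketch under assumptions, not part of the original answer: the toy rows and the names `product` and `mask` are hypothetical.

```python
import pandas as pd

# Hypothetical toy frame; columns deliberately left as a MultiIndex
log_df = pd.DataFrame([
    [52250, "Product A", 0.0],
    [141840, "Product B", 0.693147],
    [212006, "Product A", 1.098612],
])
log_df.columns = [["DRID", "Product", "M24"]]

# log_df["Product"] is a one-column DataFrame; .iloc[:, 0] pulls out that
# column as a Series that still carries the original row index
product = log_df["Product"].iloc[:, 0]

# Boolean mask aligned with log_df's rows
mask = product.str.contains("Product A")

# Row selection; the MultiIndex columns are left untouched
include_clique = log_df.loc[mask]
print(include_clique)
```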