Converting from SPSS to Pandas...result gives "b'var_name'" for all variables

Question

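I'm trying to convert an SPSS file to Pandas, and that part works fine. However, every variable shows up as "b'variable_name'": a 'b' is put in front of each variable and the original variable name is wrapped in single quotes. Is there a way to do the conversion and keep the original variable names?

I tried renaming the variables, but the quotes interfere with the code... and there are a lot of variables, so that is tedious and not ideal.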

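import pandas as pd
import savReaderWriter as s  # assumed import: the code below calls s.SavReader

df = pd.DataFrame(list(s.SavReader(r'C:\Users\Nick\Desktop\GitProjects\Data\M2.sav', returnHeader=True,
                                   recodeSysmisTo='NaN', ioUtf8=True, rawMode=True)))
df.head(10)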

# Create a new variable called 'header' from the first row of the dataset
header = df.iloc[0]
# Replace the dataframe with a new one which does not contain the first row
df = df[1:]
# Rename the dataframe's column values with the header variable
M2 = df.rename(columns = header)
M2.head(10)

Here is the resulting DataFrame. It's fine, but I need to get rid of the 'b' and the single quotes around each variable.

Answer 1

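For a quick fix:

# Decode the bytes header values to str (calling str() on bytes would keep the b'...' prefix)
header = df.iloc[0].str.decode('utf-8')

The b'' means that all of your header names are bytes, not strings. This is probably due to the function used to read the .sav file.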
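The bytes-versus-string point can be checked without the SPSS file. The following is a minimal sketch rather than the answerer's code: `decode_names` and the sample names are hypothetical, and UTF-8 is assumed as the encoding.

```python
def decode_names(names, encoding="utf-8"):
    """Return the names as str, decoding any bytes values."""
    return [
        name.decode(encoding) if isinstance(name, bytes) else str(name)
        for name in names
    ]


# Hypothetical header values, shaped like the bytes names shown in the question
raw_header = [b"var_name", b"age", "already_text"]

# str() on a bytes object keeps the b'...' wrapper, which is the symptom described
print(str(raw_header[0]))        # b'var_name'

# Decoding yields the plain variable names
print(decode_names(raw_header))  # ['var_name', 'age', 'already_text']
```

With a DataFrame, the same helper could be applied as `df.columns = decode_names(df.columns)` once the header row has been moved into the column labels.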


python

spss

dataframe

pandas