How to replace different binary values across columns with 1/0
I have a dataset that contains several columns of binary values.
import pandas as pd

df = pd.DataFrame({"a": ["y", "n"], "b": ["t", "f"],
                   "c": ["known", "unknown"], "d": ['found', 'not found']})
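I want to replace all the binary columns with 1/0 without affecting the other, numeric columns. Is there a simple solution in one or two lines? The dataset has more than 500 columns, so it is hard to check and replace them one by one. Thanks.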
You can use pd.get_dummies with drop_first=True (credit to @piRSquared):
pd.get_dummies(df, drop_first=True)
#    a_y  b_t  c_unknown  d_not found
# 0    1    1          0            0
# 1    0    0          1            1
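One caveat worth sketching: as far as I know, pandas 2.0 and later return boolean (True/False) dummy columns by default, so getting the literal 1/0 shown above may require passing dtype=int. A minimal example on the question's frame, assuming a reasonably recent pandas:

```python
import pandas as pd

df = pd.DataFrame({"a": ["y", "n"], "b": ["t", "f"],
                   "c": ["known", "unknown"], "d": ["found", "not found"]})

# dtype=int requests integer 0/1 columns instead of True/False
encoded = pd.get_dummies(df, drop_first=True, dtype=int)
print(encoded)
```

With drop_first=True the alphabetically first value of each column is dropped, so which value ends up as 1 depends on sort order, not on its meaning.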
If you only want to do this for the subset of object columns that are binary (exactly two unique values):
df = pd.DataFrame({'a': ['y', 'n', 'c'],
                   'b': ['t', 'f', 't'],
                   'c': ['known', 'unknown', 'known'],
                   'd': ['found', 'not found', 'found'],
                   'e': [1, 2, 2]})

pd.get_dummies(df.loc[:, df.agg('nunique') == 2].select_dtypes(include='object'),
               drop_first=True)

#    b_t  c_unknown  d_not found
# 0    1          0            0
# 1    0          1            1
# 2    1          0            0
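The call above returns only the encoded columns. Since the question asks to leave the other columns untouched, one possible variation is to pass the binary column names through the columns= parameter of pd.get_dummies, which encodes just those and passes the rest through. This is a sketch, not part of the original answer; the names non_numeric, binary_cols and result are my own, and "binary" is assumed to mean "non-numeric with exactly two distinct values":

```python
import pandas as pd

df = pd.DataFrame({'a': ['y', 'n', 'c'],
                   'b': ['t', 'f', 't'],
                   'c': ['known', 'unknown', 'known'],
                   'd': ['found', 'not found', 'found'],
                   'e': [1, 2, 2]})

# Assumption: a "binary" column is non-numeric and has exactly two distinct values
non_numeric = df.select_dtypes(exclude='number')
binary_cols = [c for c in non_numeric.columns if non_numeric[c].nunique() == 2]

# columns= restricts the encoding to binary_cols; `a` and `e` are kept as they are
result = pd.get_dummies(df, columns=binary_cols, drop_first=True, dtype=int)
print(result)
```

Here `a` (three values) and `e` (numeric) come through unchanged, while `b`, `c` and `d` are replaced by `b_t`, `c_unknown` and `d_not found`.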
If there are only a few distinct binary responses across the columns, consider creating a dictionary and mapping the values:
d = {'y': 1, 'n': 0,
     't': 1, 'f': 0,
     'known': 1, 'unknown': 0,
     'found': 1, 'not found': 0}

# Boolean mask: object columns with exactly two unique values
s = (df.agg('nunique') == 2) & (df.dtypes == 'object')

for col in s[s].index:
    df[col] = df[col].map(d)

#    a  b  c  d  e
# 0  y  1  1  1  1
# 1  n  0  0  0  2
# 2  c  1  1  1  2
#    |
#  `a` not mapped because trinary
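One thing to watch with the dictionary approach: Series.map returns NaN for any value that is not a key of the dictionary, so on a 500-column dataset a binary column with unexpected labels would silently turn into NaN. A hedged sketch of one way to guard against that, mapping only the columns whose values are all covered by `d` and reporting the rest (the names candidates, mappable and skipped are hypothetical, not from the original answer):

```python
import pandas as pd

df = pd.DataFrame({'a': ['y', 'n', 'c'],
                   'b': ['t', 'f', 't'],
                   'c': ['known', 'unknown', 'known'],
                   'd': ['found', 'not found', 'found'],
                   'e': [1, 2, 2]})

d = {'y': 1, 'n': 0, 't': 1, 'f': 0,
     'known': 1, 'unknown': 0, 'found': 1, 'not found': 0}

# Consider only non-numeric columns, so `e` is never touched
candidates = df.select_dtypes(exclude='number').columns

# A column is mappable only if every non-missing value is a key of d
mappable = [c for c in candidates if df[c].dropna().isin(list(d)).all()]
skipped = [c for c in candidates if c not in mappable]

for col in mappable:
    df[col] = df[col].map(d)

print(skipped)  # columns left alone because they hold values missing from d
```

This selects columns by dictionary coverage instead of by the nunique() == 2 test, which is a different criterion from the one in the answer; the skipped list shows which columns still need a manual look.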