How to change values in a column and generate a new DataFrame in Python
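I have a DataFrame, and I want to generate a new one with changed values in one column while keeping the original DataFrame intact. I tried using mask, where, and iloc, but the original DataFrame always changes.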
import pandas as pd
data = {
"age": [50, 40, 30, 40, 20, 10, 30],
"qualified": [True, False, False, False, False, True, True]
}
df = pd.DataFrame(data)
newdf = df
newdf["age"] = newdf["age"].where(newdf["age"] > 30, 2)
print(newdf)
print(df)
Result:
age qualified
0 50 True
1 40 False
2 2 False
3 40 False
4 2 False
5 2 True
6 2 True
age qualified
0 50 True
1 40 False
2 2 False
3 40 False
4 2 False
5 2 True
6 2 True
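Is there a way to change these values and keep the original?
Use df.copy(deep=True). The assignment newdf = df does not copy anything; it only makes newdf a second name for the same DataFrame, so a change made through either name shows up in both.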
See also: What is the difference between a deep copy and a shallow copy?
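To make the aliasing concrete, here is a minimal sketch that compares a plain assignment with a copy. The variable names (alias, copied, and the three-row frame) are illustrative and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({"age": [50, 40, 30]})

alias = df          # second name for the same object; no data is copied
copied = df.copy()  # independent object (deep=True is the default)

alias_is_same = alias is df
copy_is_same = copied is df

# Writing to the copy leaves the original untouched
copied.loc[0, "age"] = 99
original_first_age = int(df.loc[0, "age"])

print(alias_is_same, copy_is_same, original_first_age)  # True False 50
```

The identity check with `is` is a quick way to tell whether two names refer to the same DataFrame.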
import pandas as pd
import numpy as np
data = {
"age": [50, 40, 30, 40, 20, 10, 30],
"qualified": [True, False, False, False, False, True, True]
}
df = pd.DataFrame(data)
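# Deep copy: newdf gets its own data, so changes to it do not affect df
newdf = df.copy(deep=True)
# Keep ages above 30 and replace everything else with 2
newdf["age"] = np.where(newdf["age"] > 30, newdf["age"], 2)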
print(newdf)
age qualified
0 50 True
1 40 False
2 2 False
3 40 False
4 2 False
5 2 True
6 2 True
print(df)
age qualified
0 50 True
1 40 False
2 30 False
3 40 False
4 20 False
5 10 True
6 30 True
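As a possible alternative that avoids both the explicit copy and NumPy, the sketch below uses DataFrame.assign together with Series.where. This is not the approach from the answer above, just another way to get a new DataFrame while leaving df as it was.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [50, 40, 30, 40, 20, 10, 30],
    "qualified": [True, False, False, False, False, True, True],
})

# Series.where keeps values where the condition is True and uses 2 elsewhere.
# DataFrame.assign returns a new DataFrame, so df itself is not modified.
newdf = df.assign(age=df["age"].where(df["age"] > 30, 2))

print(newdf["age"].tolist())  # [50, 40, 2, 40, 2, 2, 2]
print(df["age"].tolist())     # [50, 40, 30, 40, 20, 10, 30]
```

Because assign never writes into df, there is no intermediate step at which the original could be changed by accident.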