How to do a math calculation on each number in a specific column
I imported an Excel file in Python using pandas.read_excel().
Then I want to do a math calculation on each number in a specific column and generate a new column, but I get an error:
TypeError: cannot convert the series to
How can I solve this? My code is below.
import pandas as pd
import math
N_DATA=pd.read_excel(r"path\datajl.xls",index_col='R')
rchdecay=N_DATA['column_name']
rchdcayf=math.exp(-rchdecay*0.008)
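I think you need numpy.exp, because math.exp accepts only a single number, while numpy.exp works element-wise on a whole Series: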
import numpy as np
rchdecay=N_DATA['column_name']
rchdcayf=np.exp(-rchdecay*0.008)
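To make the difference concrete, here is a minimal sketch that reproduces the failure and the fix side by side. The three-element Series is a hypothetical stand-in for the column read from the Excel file, and the flag `scalar_call_failed` is only there for illustration.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical stand-in for the column read from the Excel file
rchdecay = pd.Series([1, 2, 3], name="column_name")

try:
    # math.exp expects one number, so passing a multi-element Series fails
    math.exp(-rchdecay * 0.008)
    scalar_call_failed = False
except TypeError as err:
    scalar_call_failed = True
    print("math.exp failed:", err)

# np.exp is applied element-wise and returns a Series of the same length
rchdcayf = np.exp(-rchdecay * 0.008)
print(rchdcayf)
```

The index and the Series name are carried over to the result, so it lines up with the original column.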
Sample:
import pandas as pd
import numpy as np
N_DATA = pd.DataFrame({'column_name':[1,2,3]})
print (N_DATA)
   column_name
0            1
1            2
2            3
rchdcayf=np.exp(-N_DATA['column_name']*0.008)
print (rchdcayf)
0    0.992032
1    0.984127
2    0.976286
Name: column_name, dtype: float64
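Since the question asks for a new column, the result can presumably be assigned straight back to the DataFrame. This is a small sketch under that assumption; the new column name `rchdcayf` is my own choice, reused from the variable name in the question.

```python
import numpy as np
import pandas as pd

N_DATA = pd.DataFrame({"column_name": [1, 2, 3]})

# Store the element-wise result as a new column.
# The column name "rchdcayf" is an assumption, not something the question specifies.
N_DATA["rchdcayf"] = np.exp(-N_DATA["column_name"] * 0.008)
print(N_DATA)
```

The assignment aligns on the index, so it works the same way with the index_col set in read_excel.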
Or use apply with math.exp, but it is slower:
rchdcayf1=(-N_DATA['column_name']*0.008).apply(math.exp)
print (rchdcayf1)
0    0.992032
1    0.984127
2    0.976286
Name: column_name, dtype: float64
Timings:
len(df)=3:
In [61]: %timeit (-N_DATA['column_name']*0.008).apply(math.exp)
The slowest run took 5.46 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 209 µs per loop
In [62]: %timeit np.exp(-N_DATA['column_name']*0.008)
The slowest run took 4.59 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 168 µs per loop
len(df)=3k:
In [64]: %timeit np.exp(-N_DATA['column_name']*0.008)
1000 loops, best of 3: 214 µs per loop
In [65]: %timeit (-N_DATA['column_name']*0.008).apply(math.exp)
1000 loops, best of 3: 873 µs per loop
Code for timings:
import pandas as pd
import numpy as np
import math
N_DATA = pd.DataFrame({'column_name':[1,2,3]})
N_DATA = pd.concat([N_DATA]*1000).reset_index(drop=True)
rchdcayf=np.exp(-N_DATA['column_name']*0.008)
print (rchdcayf)
rchdcayf1=(-N_DATA['column_name']*0.008).apply(math.exp)
print (rchdcayf1)
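The %timeit lines above only work inside IPython. If you want to repeat the comparison in a plain script, something like the following sketch with the standard-library timeit module should do. The helper names `with_numpy` and `with_apply` and the run count `n_runs` are my own, and the absolute numbers will differ by machine and pandas version.

```python
import math
import timeit

import numpy as np
import pandas as pd

# Same 3k-row frame as in the timing code above
N_DATA = pd.DataFrame({"column_name": [1, 2, 3]})
N_DATA = pd.concat([N_DATA] * 1000).reset_index(drop=True)


def with_numpy():
    # Vectorized: one call over the whole column
    return np.exp(-N_DATA["column_name"] * 0.008)


def with_apply():
    # Calls math.exp once per element from Python
    return (-N_DATA["column_name"] * 0.008).apply(math.exp)


n_runs = 200  # arbitrary number of repetitions, chosen for illustration
t_numpy = timeit.timeit(with_numpy, number=n_runs) / n_runs
t_apply = timeit.timeit(with_apply, number=n_runs) / n_runs

print(f"np.exp:          {t_numpy * 1e6:.0f} µs per call")
print(f"apply(math.exp): {t_apply * 1e6:.0f} µs per call")
```

Both functions return the same values, so the only thing being compared is how the exponential is evaluated.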