Doing a loop calculation
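toto =

| C1Date | C1Type | C2Date | C2Type | ..... | C10Date | C10Type | PolDate |
|---|---|---|---|---|---|---|---|
| dd-mm-yyyy | Proposer | NaT | NaN | ..... | NaT | NaN | dd-mm-yyyy |
| dd-mm-yyyy | Proposer | NaT | NaN | ..... | NaT | NaN | dd-mm-yyyy |
| dd-mm-yyyy | Other | dd-mm-yyyy | Proposer | ..... | NaT | NaN | dd-mm-yyyy |
| dd-mm-yyyy | Proposer | NaT | NaN | ..... | NaT | NaN | dd-mm-yyyy |
| dd-mm-yyyy | Other | dd-mm-yyyy | Other | ..... | NaT | NaN | dd-mm-yyyy |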
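Here `C` stands for Claim (`C1` is the first claim, and so on), so each row holds up to 10 claims.

I need to determine whether any claims come from the `Proposer` and, for those claims, whether they occurred within 3 years of `PolDate` (`PolDate` is always later than any claim date).

I was able to do the following, but I can't get the date subtraction to work inside the loop: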
    CLM = {}
    for i in range(1, 11):
        CLM[i] = toto.loc[toto[f'C{i}Type'] == 'Proposer']
        # can't get this date subtraction to work within the loop. But can do the subtraction outside of the loop.
        CLM[i]['diff'] = (CLM[i]['PolDate'].sub(CLM[i][f'C{i}Date'], axis=0)).dt.days
        use_cols = ['CustomerID', f'C{i}Type', f'C{1}Date', 'PolDate ']
        CLM[i] = CLM[i][use_cols]
        print("Claim:" + f'{i}' + " " + str(CLM[i].shape))
Error:

    A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

I also can't get the 3-year comparison to work:
    if (CLM[1]['diff'] > 1095):
        # 1095 = (365 * 3)
        CLM[1]['CLMLAST3'] = 0
    else:
        CLM[1]['diff'] = 1
Error:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

In short, try this; it worked for me. (I don't know pandas well, so efficiency may be more your area. I only removed the errors from the posted code.)
    import pandas as pd

    CLM = {}
    for i in range(1, 11):
        CLM[i] = toto.loc[toto[f'C{i}Type'] == 'Proposer']
        # Convert both columns to datetime first, then subtract to get whole days.
        pol = pd.to_datetime(CLM[i]['PolDate'], format='%d-%m-%Y')
        claim = pd.to_datetime(CLM[i][f'C{i}Date'], format='%d-%m-%Y')
        CLM[i].loc[:, 'diff'] = pol.sub(claim).dt.days
        # Select this claim's own date column (C{i}Date, not C{1}Date) and keep 'diff'.
        use_cols = ['CustomerID', f'C{i}Type', f'C{i}Date', 'PolDate', 'diff']
        CLM[i] = CLM[i][use_cols]
        print("Claim:" + f'{i}' + " " + str(CLM[i].shape))
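Notes:

The warning "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead" can still show up with this code, because `CLM[i]` is a filtered slice of `toto` and pandas cannot tell whether you meant to modify the original. Creating the subset with `.copy()` avoids it. Also note that `CLM[i]['diff']` selects a column, whereas `CLM[i].loc['diff']` looks up a row label; the column form with `.loc` is `CLM[i].loc[:, 'diff']`.

If `CLM[i]['PolDate']` is a "list" (Series) of strings, you cannot subtract one string from another, but you can subtract one pandas datetime object from another. So convert both columns to datetime objects first and then subtract.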
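The two notes above can be sketched on a small made-up frame. The sample values and `CustomerID` numbers are invented for illustration; only the column names follow the question. This assumes the date columns hold `dd-mm-yyyy` strings:

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
toto = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "C1Date": ["01-06-2019", "15-03-2015", "20-11-2020"],
    "C1Type": ["Proposer", "Proposer", "Other"],
    "PolDate": ["01-01-2021", "01-01-2021", "01-01-2021"],
})

# .copy() makes the filtered rows an independent DataFrame,
# so adding a column to it is unambiguous and raises no warning.
clm1 = toto.loc[toto["C1Type"] == "Proposer"].copy()

# Convert the dd-mm-yyyy strings to datetimes, then subtract.
pol = pd.to_datetime(clm1["PolDate"], format="%d-%m-%Y")
claim = pd.to_datetime(clm1["C1Date"], format="%d-%m-%Y")
clm1["diff"] = (pol - claim).dt.days

print(clm1[["CustomerID", "diff"]])
```

If the columns are already `datetime64` (the `NaT` entries in the table suggest they might be), the `pd.to_datetime` calls are harmless and the subtraction works the same way.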
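Your second problem, comparing a whole "list" (Series) against a single value, has the same root: `CLM[1]['diff'] > 1095` produces one True/False per row, and `if` cannot turn that into a single decision. If you want one answer for the whole Series, reduce the comparison explicitly, for example `if (CLM[1]['diff'] > 1095).all():` or `.any()`. Note that `CLM[1]['diff'].all() > 1095` does not do this: `.all()` returns a single boolean first, and that is never greater than 1095. If you want a flag for each row, assign the comparison result to the new column directly instead of using `if`.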
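As a rough sketch of the per-row approach, the flag can be built with element-wise comparisons across all claim slots, without a dictionary of sub-frames. The sample data is made up and uses two claim slots instead of ten. `N_CLAIMS`, `LIMIT_DAYS` and the reading of "within 3 years" as at most 1095 days are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample with two claim slots instead of ten.
toto = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "C1Date": ["01-06-2019", "15-03-2015", "20-11-2020", "10-10-2016"],
    "C1Type": ["Proposer", "Proposer", "Other", "Other"],
    "C2Date": [None, None, "05-05-2020", "01-02-2017"],
    "C2Type": [None, None, "Proposer", "Other"],
    "PolDate": ["01-01-2021"] * 4,
})

N_CLAIMS = 2          # 10 in the question's data
LIMIT_DAYS = 365 * 3  # 1095, the threshold used in the question

pol = pd.to_datetime(toto["PolDate"], format="%d-%m-%Y")
recent_proposer = pd.Series(False, index=toto.index)

for i in range(1, N_CLAIMS + 1):
    claim = pd.to_datetime(toto[f"C{i}Date"], format="%d-%m-%Y")
    days = (pol - claim).dt.days  # NaN where the slot has no claim
    is_proposer = toto[f"C{i}Type"] == "Proposer"
    # Element-wise comparison: one True/False per row, so no `if` is needed.
    recent_proposer = recent_proposer | (is_proposer & (days <= LIMIT_DAYS))

# 1 if any Proposer claim falls within the limit, else 0.
toto["CLMLAST3"] = recent_proposer.astype(int)
print(toto[["CustomerID", "CLMLAST3"]])
```

Empty claim slots compare as False on both conditions, so they never set the flag.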