Pass another column to spread with duplicate identifiers
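I have the data frame below, and I am trying to spread FEATURE_CODE using ACTV_AMT as the value, so that each feature code column holds its corresponding ACTV_AMT. I tried setting count_FEATURE = ACTV_AMT; that passes the values through, but the rows are not merged.
For reference, I asked a related question earlier.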
Input type 1:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU
7/27/16 7/27/16 265 O 15 1 INTEREST 855
7/27/16 7/27/16 265 O 14 1 INTEREST 855
Getting output:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_INTEREST L_NU
7/27/16 7/27/16 265 O 29 1 2 855
Expected output:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_INTEREST L_NU
7/27/16 7/27/16 265 O 29 1 29 855
Input type 2:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU
7/27/16 7/27/16 265 O 15 1 INTEREST 855
7/27/16 7/27/16 265 O 14 1 INSTALLMENT 855
Getting output:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO INTEREST INSTALLMENT L_NU
7/27/16 7/27/16 265 O 29 1 1 1 855
Expected output:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO INTEREST INSTALLMENT L_NU
7/27/16 7/27/16 265 O 29 1 15 14 855
Code:
library(dplyr)
library(tidyr)

dt %>%
  # Total amount and date range per account/loan group
  group_by(AB_NO, LO_NO, L_NU) %>%
  mutate(ACTV_AMT = sum(ACTV_AMT),
         ST_DATE = min(ST_DATE),
         ND_DATE = max(ND_DATE)) %>%
  ungroup() %>%
  # Row id keeps rows unique for spread(); the count columns are set to 1
  mutate(id = row_number(),
         FEATURE_CODE = paste0("FTR_", FEATURE_CODE),
         ACTV_CODE = paste0("ACTV_", ACTV_CODE),
         count_FEATURE = 1,
         count_ACTV = 1) %>%
  spread(FEATURE_CODE, count_FEATURE) %>%
  spread(ACTV_CODE, count_ACTV) %>%
  select(-id) %>%
  # Collapse back to one row per group, summing the counts
  group_by(ST_DATE, ND_DATE, LO_NO, ACTV_AMT, AB_NO, L_NU) %>%
  summarise_all(sum, na.rm = TRUE) %>%
  ungroup()
Can anyone help me get the expected output?
You can try it like this:
library(reshape2)
df <- read.table(text = "ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU
7/27/16 7/27/16 265 O 15 1 INTEREST 855
7/27/16 7/27/16 265 O 14 1 INTEREST 855", header = T)
dcast(df, ST_DATE+ND_DATE+LO_NO+ACTV_CODE+AB_NO+L_NU~FEATURE_CODE, value.var = "ACTV_AMT", fun.aggregate = sum)
output:
-------
ST_DATE ND_DATE LO_NO ACTV_CODE AB_NO L_NU INTEREST
1 7/27/16 7/27/16 265 O 1 855 29
input2:
-------
df <- read.table(text = "ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU
7/27/16 7/27/16 265 O 15 1 INTEREST 855
7/27/16 7/27/16 265 O 14 1 INSTALLMENT 855", header = T)
dcast(df, ST_DATE+ND_DATE+LO_NO+ACTV_CODE+AB_NO+L_NU~FEATURE_CODE, value.var = "ACTV_AMT", fun.aggregate = sum)
output:
-------
ST_DATE ND_DATE LO_NO ACTV_CODE AB_NO L_NU INSTALLMENT INTEREST
1 7/27/16 7/27/16 265 O 1 855 14 15
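The dcast() result above drops the ACTV_AMT total that the expected output in the question still shows. As a rough sketch only, assuming dplyr (1.0.0 or later, for the .groups argument) and tidyr are available, the same reshaping can be done with spread() while keeping that total. The helper column name FEATURE_AMT is made up for this illustration and does not appear in the question.

```r
library(dplyr)
library(tidyr)

# Sample data: input type 2 from the question.
df <- read.table(text = "ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU
7/27/16 7/27/16 265 O 15 1 INTEREST 855
7/27/16 7/27/16 265 O 14 1 INSTALLMENT 855", header = TRUE)

result <- df %>%
  # Sum ACTV_AMT per feature so every identifier/feature pair is unique.
  # FEATURE_AMT is a hypothetical helper column, not part of the original data.
  group_by(ST_DATE, ND_DATE, LO_NO, ACTV_CODE, AB_NO, L_NU, FEATURE_CODE) %>%
  summarise(FEATURE_AMT = sum(ACTV_AMT), .groups = "drop") %>%
  # Recompute the overall total per identifier group and store it as ACTV_AMT.
  group_by(ST_DATE, ND_DATE, LO_NO, ACTV_CODE, AB_NO, L_NU) %>%
  mutate(ACTV_AMT = sum(FEATURE_AMT)) %>%
  ungroup() %>%
  # One column per feature code, filled with the summed amount.
  spread(FEATURE_CODE, FEATURE_AMT, fill = 0)

print(result)
```

Aggregating before spread() is what removes the duplicate identifiers, so no row id is needed. With input type 1, where both rows are INTEREST, the same pipeline should collapse them into a single INTEREST column holding 29.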