通过引用另一个数据框更新数据框列值
Updating dataframe column value by referring to another dataframe
我的第一个数据框 (df) 包含 Entrydate 和 ExitDate 列。另一个数据框 (n1) 具有所有交易日期。我需要在第一个数据框中添加一个新列,计算为根据第二个数据框计算的天数。如何为 df 的每一行调用此 dayCount 函数。当我尝试使用 mapply 时,我无法将 n1 作为参数传递。
dayCount <- function (startDate, endDate, n1) {
return (nrow(subset(n1, Date >= startDate & Date <= endDate)))
}
df<- structure(list(EntryDate = structure(c(11355, 11418, 11436, 11449,
11520, 11523, 11548, 11620, 11768, 11773), class = "Date"), ExitDate = structure(c(11360,
11422, 11438, 11457, 11522, 11526, 11554, 11625, 11772, 11778
), class = "Date")), row.names = c(22L, 65L, 76L, 84L, 135L,
138L, 155L, 204L, 305L, 307L), class = "data.frame")
n1<- structure(c(11354, 11355, 11358, 11359, 11360, 11361, 11362,
11365, 11366, 11367, 11368, 11369, 11372, 11373, 11374, 11375,
11376, 11379, 11380, 11381, 11382, 11383, 11386, 11388, 11389,
11390, 11393, 11394, 11395, 11396, 11397, 11400, 11401, 11402,
11403, 11404, 11407, 11408, 11409, 11410, 11411, 11414, 11415,
11416, 11418, 11421, 11422, 11423, 11424, 11428, 11429, 11430,
11431, 11432, 11435, 11436, 11437, 11438, 11439, 11442, 11444,
11445, 11446, 11449, 11450, 11451, 11452, 11453, 11456, 11457,
11458, 11459, 11460, 11463, 11464, 11465, 11466, 11467, 11470,
11471, 11472, 11473, 11474, 11477, 11478, 11479, 11480, 11481,
11484, 11485, 11486, 11487, 11488, 11491, 11492, 11493, 11494,
11495, 11498, 11499, 11500, 11501, 11502, 11505, 11506, 11507,
11508, 11509, 11512, 11513, 11514, 11515, 11516, 11519, 11520,
11521, 11522, 11523, 11526, 11527, 11528, 11529, 11530, 11533,
11534, 11535, 11536, 11537, 11540, 11541, 11542, 11543, 11544,
11547, 11548, 11550, 11551, 11554, 11555, 11557, 11558, 11561,
11562, 11563, 11564, 11565, 11568, 11569, 11570, 11571, 11572,
11575, 11576, 11577, 11578, 11579, 11582, 11583, 11584, 11585,
11586, 11589, 11590, 11591, 11592, 11593, 11596, 11598, 11599,
11600, 11603, 11604, 11605, 11606, 11607, 11610, 11611, 11612,
11613, 11614, 11617, 11618, 11619, 11620, 11624, 11625, 11626,
11627, 11628, 11631, 11632, 11633, 11634, 11635, 11638, 11639,
11640, 11641, 11645, 11646, 11647, 11648, 11649, 11652, 11653,
11654, 11655, 11659, 11660, 11661, 11662, 11663, 11666, 11667,
11668, 11669, 11670, 11674, 11675, 11676, 11677, 11680, 11682,
11683, 11684, 11687, 11688, 11689, 11690, 11691, 11694, 11695,
11696, 11697, 11698, 11701, 11702, 11703, 11704, 11705, 11708,
11709, 11710, 11711, 11712, 11715, 11716, 11717, 11718, 11719,
11722, 11723, 11724, 11725, 11726, 11729, 11730, 11731, 11732,
11733, 11736, 11737, 11738, 11739, 11740, 11743, 11744, 11745,
11746, 11747, 11750, 11751, 11752, 11753, 11754, 11757, 11758,
11759, 11760, 11761, 11764, 11765, 11766, 11767, 11768, 11772,
11773, 11774, 11778), class = "Date")
您可以使用 %in%
计算每个 EntryDate
和 ExitDate
之间 n1
的天数。
df$dayCount <- colSums(mapply(function(x, y) n1 %in% seq(x, y, by = '1 day'),
df$EntryDate, df$ExitDate))
df
# EntryDate ExitDate dayCount
#22 2001-02-02 2001-02-07 4
#65 2001-04-06 2001-04-10 3
#76 2001-04-24 2001-04-26 3
#84 2001-05-07 2001-05-15 7
#135 2001-07-17 2001-07-19 3
#138 2001-07-20 2001-07-23 2
#155 2001-08-14 2001-08-20 4
#204 2001-10-25 2001-10-30 3
#305 2002-03-22 2002-03-26 2
#307 2002-03-27 2002-04-01 3
我的第一个数据框 (df) 包含 Entrydate 和 ExitDate 列。另一个数据框 (n1) 具有所有交易日期。我需要在第一个数据框中添加一个新列,计算为根据第二个数据框计算的天数。如何为 df 的每一行调用此 dayCount 函数。当我尝试使用 mapply 时,我无法将 n1 作为参数传递。
dayCount <- function (startDate, endDate, n1) {
return (nrow(subset(n1, Date >= startDate & Date <= endDate)))
}
df<- structure(list(EntryDate = structure(c(11355, 11418, 11436, 11449,
11520, 11523, 11548, 11620, 11768, 11773), class = "Date"), ExitDate = structure(c(11360,
11422, 11438, 11457, 11522, 11526, 11554, 11625, 11772, 11778
), class = "Date")), row.names = c(22L, 65L, 76L, 84L, 135L,
138L, 155L, 204L, 305L, 307L), class = "data.frame")
n1<- structure(c(11354, 11355, 11358, 11359, 11360, 11361, 11362,
11365, 11366, 11367, 11368, 11369, 11372, 11373, 11374, 11375,
11376, 11379, 11380, 11381, 11382, 11383, 11386, 11388, 11389,
11390, 11393, 11394, 11395, 11396, 11397, 11400, 11401, 11402,
11403, 11404, 11407, 11408, 11409, 11410, 11411, 11414, 11415,
11416, 11418, 11421, 11422, 11423, 11424, 11428, 11429, 11430,
11431, 11432, 11435, 11436, 11437, 11438, 11439, 11442, 11444,
11445, 11446, 11449, 11450, 11451, 11452, 11453, 11456, 11457,
11458, 11459, 11460, 11463, 11464, 11465, 11466, 11467, 11470,
11471, 11472, 11473, 11474, 11477, 11478, 11479, 11480, 11481,
11484, 11485, 11486, 11487, 11488, 11491, 11492, 11493, 11494,
11495, 11498, 11499, 11500, 11501, 11502, 11505, 11506, 11507,
11508, 11509, 11512, 11513, 11514, 11515, 11516, 11519, 11520,
11521, 11522, 11523, 11526, 11527, 11528, 11529, 11530, 11533,
11534, 11535, 11536, 11537, 11540, 11541, 11542, 11543, 11544,
11547, 11548, 11550, 11551, 11554, 11555, 11557, 11558, 11561,
11562, 11563, 11564, 11565, 11568, 11569, 11570, 11571, 11572,
11575, 11576, 11577, 11578, 11579, 11582, 11583, 11584, 11585,
11586, 11589, 11590, 11591, 11592, 11593, 11596, 11598, 11599,
11600, 11603, 11604, 11605, 11606, 11607, 11610, 11611, 11612,
11613, 11614, 11617, 11618, 11619, 11620, 11624, 11625, 11626,
11627, 11628, 11631, 11632, 11633, 11634, 11635, 11638, 11639,
11640, 11641, 11645, 11646, 11647, 11648, 11649, 11652, 11653,
11654, 11655, 11659, 11660, 11661, 11662, 11663, 11666, 11667,
11668, 11669, 11670, 11674, 11675, 11676, 11677, 11680, 11682,
11683, 11684, 11687, 11688, 11689, 11690, 11691, 11694, 11695,
11696, 11697, 11698, 11701, 11702, 11703, 11704, 11705, 11708,
11709, 11710, 11711, 11712, 11715, 11716, 11717, 11718, 11719,
11722, 11723, 11724, 11725, 11726, 11729, 11730, 11731, 11732,
11733, 11736, 11737, 11738, 11739, 11740, 11743, 11744, 11745,
11746, 11747, 11750, 11751, 11752, 11753, 11754, 11757, 11758,
11759, 11760, 11761, 11764, 11765, 11766, 11767, 11768, 11772,
11773, 11774, 11778), class = "Date")
您可以使用 %in%
计算每个 EntryDate
和 ExitDate
之间 n1
的天数。
df$dayCount <- colSums(mapply(function(x, y) n1 %in% seq(x, y, by = '1 day'),
df$EntryDate, df$ExitDate))
df
# EntryDate ExitDate dayCount
#22 2001-02-02 2001-02-07 4
#65 2001-04-06 2001-04-10 3
#76 2001-04-24 2001-04-26 3
#84 2001-05-07 2001-05-15 7
#135 2001-07-17 2001-07-19 3
#138 2001-07-20 2001-07-23 2
#155 2001-08-14 2001-08-20 4
#204 2001-10-25 2001-10-30 3
#305 2002-03-22 2002-03-26 2
#307 2002-03-27 2002-04-01 3