Sorting dates row-wise
I have the following dataset:
my_data <- structure(list(id = 1:5, d1 = structure(c(16764, 11375, 13873,
10665, 14395), class = "Date"), d2 = structure(c(14487, 14709,
11417, 13531, 13457), class = "Date"), d3 = structure(c(15706,
13542, 16722, 17001, 12621), class = "Date"), d4 = structure(c(14628,
11210, 14134, 14597, 15119), class = "Date"), d5 = structure(c(10664,
18150, 15536, 16109, 12236), class = "Date")), row.names = c(NA,
5L), class = "data.frame")
id d1 d2 d3 d4 d5
1 1 2015-11-25 2009-08-31 2013-01-01 2010-01-19 1999-03-14
2 2 2001-02-22 2010-04-10 2007-01-29 2000-09-10 2019-09-11
3 3 2007-12-26 2001-04-05 2015-10-14 2008-09-12 2012-07-15
4 4 1999-03-15 2007-01-18 2016-07-19 2009-12-19 2014-02-08
5 5 2009-05-31 2006-11-05 2004-07-22 2011-05-25 2003-07-03
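For each row, I want to assign the smallest of the 5 dates to d1, the second smallest to d2, the third smallest to d3, the fourth smallest to d4, and the largest of the 5 dates to d5.
The result would look like this (shown for the first row; every other row would be handled the same way):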
id d1 d2 d3 d4 d5
1 1 1999-03-14 2009-08-31 2010-01-19 2013-01-01 2015-11-25
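I tried looking for similar examples on Stack Overflow, but so far I could not find anything that matches what I am trying to accomplish.
Can someone please show me how to do this?
Thanks!
Note: the closest example I managed to get working (it does not involve dates):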
a = rnorm(100,100,100)
b = rnorm(100,100, 100)
c = rnorm(100,100,100)
d = rnorm(100,100,100)
e = rnorm(100,100,100)
df = data.frame(a,b,c,d,e)
head(df)
a b c d e
1 56.48320 -55.274406 83.32993 -96.55970 181.65859
2 94.17800 88.084876 31.58830 -44.06487 156.85318
3 257.56950 28.500542 34.76845 57.00857 51.76036
4 100.22489 6.946803 60.88848 116.18413 37.34444
5 21.77935 -7.538119 89.29565 -67.28311 43.98728
6 200.18950 -1.555829 123.91148 106.45983 107.50339
# now sort them (check if for each row, numbers increase from left to right)
n = data.frame(t(apply(df, 1, sort)))
X1 X2 X3 X4 X5
1 -96.559698 -55.274406 56.48320 83.32993 181.65859
2 -44.064873 31.588295 88.08488 94.17800 156.85318
3 28.500542 34.768449 51.76036 57.00857 257.56950
4 6.946803 37.344439 60.88848 100.22489 116.18413
5 -67.283113 -7.538119 21.77935 43.98728 89.29565
6 -1.555829 106.459828 107.50339 123.91148 200.18950
Base R:
Here we use apply to sort every row of my_data across all columns except the first, then transpose the result. Finally we add the first column back with cbind(my_data[1], ...).
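# apply() sorts each row; t() turns the result back into one row per id
my_data_changed <- cbind(my_data[1], t(apply(my_data[-1], 1, sort)))
# setNames() (base R, capital N) restores the original column names
my_data_changed <- setNames(my_data_changed, colnames(my_data))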
id d1 d2 d3 d4 d5
1 1 1999-03-14 2009-08-31 2010-01-19 2013-01-01 2015-11-25
2 2 2000-09-10 2001-02-22 2007-01-29 2010-04-10 2019-09-11
3 3 2001-04-05 2007-12-26 2008-09-12 2012-07-15 2015-10-14
4 4 1999-03-15 2007-01-18 2009-12-19 2014-02-08 2016-07-19
5 5 2003-07-03 2004-07-22 2006-11-05 2009-05-31 2011-05-25
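One thing worth checking with this approach: apply() converts the data frame to a matrix first, so the dates are sorted as "YYYY-MM-DD" strings and the sorted columns come back as character rather than Date. A minimal sketch of that behaviour, using a small made-up data frame toy (hypothetical, not the data from the question):

```r
# Hypothetical toy data, only for illustration (not the data from the question)
toy <- data.frame(
  id = 1:2,
  d1 = as.Date(c("2015-11-25", "2001-02-22")),
  d2 = as.Date(c("2009-08-31", "2010-04-10")),
  d3 = as.Date(c("1999-03-14", "2000-09-10"))
)

# apply() first calls as.matrix(), which formats Date columns as strings
m <- as.matrix(toy[-1])
is.character(m)

# "YYYY-MM-DD" strings sort in the same order as the dates they represent
d <- c(toy$d1[1], toy$d2[1], toy$d3[1])
same_order <- identical(sort(format(d)), format(sort(d)))
same_order

# Same steps as in the answer; the sorted columns come back as character
# (assumes R >= 4.0, where data frames no longer turn strings into factors)
sorted <- cbind(toy[1], t(apply(toy[-1], 1, sort)))
sorted <- setNames(sorted, colnames(toy))
sapply(sorted, class)
```

The printed result looks identical to a data frame of dates, so the difference only shows up once date arithmetic or comparisons with Date objects are attempted.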
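Or
we can first convert to long format with pivot_longer, then sort, and go back to wide format with pivot_wider.
The trick: mutate(value = sort(value)) changes only the value column, so the name column stays in its original order and the sorted dates line up with d1 to d5.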
library(dplyr)
library(tidyr)
my_data %>%
  pivot_longer(-id) %>%
  group_by(id) %>%
  mutate(value = sort(value)) %>%
  pivot_wider()
id d1 d2 d3 d4 d5
<int> <date> <date> <date> <date> <date>
1 1 1999-03-14 2009-08-31 2010-01-19 2013-01-01 2015-11-25
2 2 2000-09-10 2001-02-22 2007-01-29 2010-04-10 2019-09-11
3 3 2001-04-05 2007-12-26 2008-09-12 2012-07-15 2015-10-14
4 4 1999-03-15 2007-01-18 2009-12-19 2014-02-08 2016-07-19
5 5 2003-07-03 2004-07-22 2006-11-05 2009-05-31 2011-05-25
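The apply approach also works for dates. They just get coerced to a character matrix, but we can convert the result with as.data.frame and then lapply as.Date over its columns.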
my_data[-1] <- as.data.frame(t(apply(my_data[-1], 1, sort))) |> lapply(as.Date)
This gives
my_data
# id d1 d2 d3 d4 d5
# 1 1 1999-03-14 2009-08-31 2010-01-19 2013-01-01 2015-11-25
# 2 2 2000-09-10 2001-02-22 2007-01-29 2010-04-10 2019-09-11
# 3 3 2001-04-05 2007-12-26 2008-09-12 2012-07-15 2015-10-14
# 4 4 1999-03-15 2007-01-18 2009-12-19 2014-02-08 2016-07-19
# 5 5 2003-07-03 2004-07-22 2006-11-05 2009-05-31 2011-05-25
where
str(my_data)
# 'data.frame': 5 obs. of 6 variables:
# $ id: int 1 2 3 4 5
# $ d1: Date, format: "1999-03-14" "2000-09-10" "2001-04-05" "1999-03-15" ...
# $ d2: Date, format: "2009-08-31" "2001-02-22" "2007-12-26" "2007-01-18" ...
# $ d3: Date, format: "2010-01-19" "2007-01-29" "2008-09-12" "2009-12-19" ...
# $ d4: Date, format: "2013-01-01" "2010-04-10" "2012-07-15" "2014-02-08" ...
# $ d5: Date, format: "2015-11-25" "2019-09-11" "2015-10-14" "2016-07-19" ...
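If the round trip through character strings is unwanted, one possible variation is to sort the underlying day counts instead and convert them back to Date. The sketch below is only an illustration: the helper name sort_dates_rowwise and the toy data are hypothetical, it assumes at least two date columns, and passing na.last = TRUE is an added assumption so that rows with missing dates keep their length (the plain sort drops NA values).

```r
# Hypothetical helper: sort the Date columns of each row, keeping class Date.
# Assumes `cols` names at least two Date columns of `data`.
sort_dates_rowwise <- function(data, cols) {
  # Days since 1970-01-01 as a numeric matrix, one column per date column
  days <- matrix(unlist(lapply(data[cols], as.numeric)), nrow = nrow(data))
  # Sort each row; na.last = TRUE keeps missing dates at the end of the row
  sorted <- t(apply(days, 1, sort, na.last = TRUE))
  # Write the sorted values back column by column, converted to Date
  data[cols] <- lapply(seq_along(cols), function(j) {
    as.Date(sorted[, j], origin = "1970-01-01")
  })
  data
}

# Hypothetical toy data with one missing date
toy <- data.frame(
  id = 1:2,
  d1 = as.Date(c("2015-11-25", "2001-02-22")),
  d2 = as.Date(c("2009-08-31", NA)),
  d3 = as.Date(c("1999-03-14", "2000-09-10"))
)

res <- sort_dates_rowwise(toy, c("d1", "d2", "d3"))
res
str(res)
```

Sorting numbers rather than strings gives the same order here; the main difference is that the columns never stop being numeric day counts, so no date parsing is needed on the way back.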