R: replace duplicate values in a factor level
Here is my attempt at creating reproducible data for my problem:
df<-as.data.frame(cbind("value"=rnorm(29),"both.dates"=c(
"July July","July August","July October","July November","July December",
"August August","August October","August November","August December",
"September August","September September", "September October",
"September November","September December","October August",
"October September", "October October","October November",
"October December","November August","November September",
"November October", "November November","November December",
"December August", "December September", "December October",
"December November","December December")))
df$value<-as.numeric(df$value)
head(df)
> head(df)
  value    both.dates
1     2     July July
2     8   July August
3    22  July October
4     3 July November
5    12 July December
6    17 August August
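I have data similar to the "both.dates" column. "August December" is the same as "December August", and I want to replace every occurrence of "December August" with "August December".
I have tried replace(), but that does not work on factors.
Thanks.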
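Assuming you want to replace every duplicate (not only "December August"), you can split each string into its two month names and then sort them after converting to a factor whose levels are month.name. Sorting a factor follows the order of its levels, so the two names always come out in calendar order.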
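# Split each entry into its two month names, sort them in calendar order
# (factor levels = month.name), then paste them back into one string.
df$both.dates <- sapply(strsplit(as.character(df$both.dates), ' '),
                        function(x) paste(sort(factor(x, levels = month.name)),
                                          collapse = ' '))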
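To make the effect of the split-and-sort step easier to check, here is a minimal sketch on a small vector. The helper name normalize_pairs and the sample values are illustrative assumptions and are not part of the original answer. It also assumes every entry consists of full English month names separated by a single space; a name that is not in month.name becomes NA and is silently dropped by sort().

```r
# Hypothetical helper: put the month names of each entry into calendar order.
normalize_pairs <- function(x) {
  # as.character() is needed because strsplit() does not accept a factor
  parts <- strsplit(as.character(x), " ", fixed = TRUE)
  vapply(parts, function(p) {
    # A factor sorts by the order of its levels, here January..December
    in_order <- sort(factor(p, levels = month.name))
    paste(in_order, collapse = " ")
  }, character(1))
}

# Small illustrative input, stored as a factor like the column in the question
x <- factor(c("December August", "August December",
              "July July", "October September"))

fixed <- normalize_pairs(x)
print(fixed)

# The result is a character vector; convert back if a factor is needed
fixed_factor <- factor(fixed)
print(levels(fixed_factor))
```

After this step "December August" and "August December" collapse into the same value, so the rebuilt factor has one level for both.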