How to continue to use a factor when one level is missing (has been removed)?
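I'm working with a set of Fitbit data that I downloaded. It has a list of weekdays that I'm trying to put in the correct order. The current dataset has no "Fridays", but I want the factor to include that level anyway.
How can I keep factoring the weekdays as 1-7 even though only 6 weekdays appear in the dataset?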
file<-choose.files()
slp<-data.frame(read.csv(file))
wkdaylevels<-c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
slp$FellAsleepAt<-strptime(slp$FellAsleepAt, format="%B %e, %Y at %I:%M%p")
slp$AwokeAt<-strptime(slp$AwokeAt,format="%B %e, %Y at %I:%M%p")
slp$TotalTimeSlept<-gsub("h ",":",slp$TotalTimeSlept)
slp$TotalTimeSlept<-gsub("m","",slp$TotalTimeSlept)
slp$TimeAsleep<-as.numeric(difftime(slp$AwokeAt,slp$FellAsleepAt))
slp$Date<-as.Date(slp$FellAsleepAt, format="%M/%D/%Y")
slp$DayofWeek<-as.factor(weekdays(slp$Date),levels=wkdaylevels)
ggplot(slp,aes(x=DayofWeek,y=TimeAsleep))+
geom_point()
The data is here: https://docs.google.com/spreadsheets/d/1Vdgmtwx0vNKDKEZFMEGAWQ58H66ia-xjI0evR7idfkc/edit?usp=sharing
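Use levels<- (see ?levels for help). Note that levels<- relabels the existing levels by position rather than matching them by name, so it is only safe when the factor's current levels are exactly the first entries of the new level vector, in the same order:
wdays <- factor(c("Sunday", "Monday"), levels = c("Sunday", "Monday"))
wkdaylevels <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
levels(wdays) <- wkdaylevels
wdays
# [1] Sunday Monday
# Levels: Sunday Monday Tuesday Wednesday Thursday Friday Saturday
If you want to remove unused levels, you can use
droplevels(wdays)
# [1] Sunday Monday
# Levels: Sunday Monday
or
factor(wdays)
# [1] Sunday Monday
# Levels: Sunday Monday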
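One thing worth checking before relying on levels<- is how it treats levels that were created in alphabetical order, which is what as.factor() produces. The following is a minimal sketch, using a made-up `days` vector (not the Fitbit data), that compares positional relabelling with name-based matching through factor(x, levels = ...):

```r
# Hypothetical sample data: a few weekday names, with no Friday present.
days <- c("Sunday", "Monday", "Saturday", "Monday")
wkdaylevels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                 "Thursday", "Friday", "Saturday")

# as.factor() sorts the levels alphabetically: "Monday" "Saturday" "Sunday".
f <- as.factor(days)

# levels<- overwrites the labels by position, so the level that used to be
# "Monday" is now called "Sunday", "Saturday" becomes "Monday", and so on.
relabelled <- f
levels(relabelled) <- wkdaylevels

# factor() with levels= matches values by name and leaves them intact.
matched <- factor(days, levels = wkdaylevels)

as.character(relabelled)  # no longer agrees with `days`
as.character(matched)     # identical to `days`
```

If the two results differ on your own data, factor(x, levels = ...) is probably the safer way to add the missing level.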
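Answer: use factor() instead of as.factor().
The function as() coerces an object to a class. In your case, as.<type> coerces to a type (a factor, in your case). The function factor() is used to encode an object as a factor. The key difference is that as.factor() does not accept a levels argument, whereas factor() does.
If you inspect the source code of each function, you will see that as.factor() performs the coercion by using the object's unique values as its levels. factor() does the same when the levels= argument is not specified, but it also lets you supply the levels yourself.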
For example:
x <- 1:6
x2 <- factor(x, levels= 1:7)
levels(x2)
# [1] "1" "2" "3" "4" "5" "6" "7"
x2 <- as.factor(x, levels= 1:7) # as.factor() has no levels argument, so this call fails
# Error in as.factor(x, levels = 1:7) : unused argument (levels = 1:7)
TBH, I'm not sure why your R session didn't give you this error. Are you using R 3.2.3?
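Applied to the weekday case from the question, the same idea might look like the sketch below. The data frame here is a hypothetical stand-in for the Fitbit export (the column `Day` and the `TimeAsleep` values are invented for illustration), with Friday deliberately absent:

```r
# Hypothetical stand-in for the sleep data: six of the seven days, no Friday.
slp <- data.frame(
  Day = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Saturday"),
  TimeAsleep = c(7.5, 6.8, 7.1, 6.4, 7.9, 8.2)
)
wkdaylevels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                 "Thursday", "Friday", "Saturday")

# factor() accepts levels=, so Friday is kept as a level with no observations.
slp$DayofWeek <- factor(slp$Day, levels = wkdaylevels)

levels(slp$DayofWeek)      # all seven days, Friday included
table(slp$DayofWeek)       # Friday is listed with a count of 0
as.integer(slp$DayofWeek)  # codes run over 1-7, with 6 (Friday) unused
```

For the plot itself, ggplot2 drops unused levels from a discrete axis by default, so adding scale_x_discrete(drop = FALSE) to the ggplot call should keep the empty Friday position on the x axis.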