rowMeans if column name is a number

Question

My data frame looks like this:

 Tester Type    Subject Type    Time        1     2     3
 TType1         SType1          Day 1       11    2     1         
 TType1         SType2          Day 1       3     2     13
 TType1         SType1          Day 2       2     3     15
 TType2         SType3          Day 2       1     4     3
 TType3         SType3          Day 2       2     3     4
 TType1         SType1          Day 1       7     2     2
 TType2         SType1          Day 2       2     6     7

So my column names are c(Tester.Type, Subject.Type, Time, 1, 2, 3).

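I want to create a column holding the row means, but computed only over the columns whose names are numbers.

I know how to do it directly: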

avgdata <- rowMeans(data[,c(4:6)],na.rm=TRUE)

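But is there a way to code it so that the columns are picked up automatically whenever the column name is numeric (something like is.numeric)?

That way, if I later have more columns with numeric names, I would not have to change the column range.

Thanks.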

Answer 1

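When you read in the data, remember to pass the argument check.names = FALSE. Otherwise R rewrites the numeric column names into syntactically valid ones (1 becomes X1) and they can no longer be detected as numbers.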

df1 <- read.table(text="
TesterType    SubjectType    Time        1     2     3
TType1         SType1          Day1       11    2     1
TType1         SType2          Day1       3     2     13
TType1         SType1          Day2       2     3     15
TType2         SType3          Day2       1     4     3
TType3         SType3          Day2       2     3     4
TType1         SType1          Day1       7     2     2
TType2         SType1          Day2       2     6     7",
                 header = TRUE, as.is = TRUE, check.names = FALSE)

df1
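# Keep only the columns whose names convert to a number; non-numeric names
# become NA (suppressWarnings hides the expected coercion warning)
rowMeans(df1[colnames(df1)[!is.na(suppressWarnings(as.numeric(colnames(df1))))]])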
# [1] 4.666667 6.000000 6.666667 2.666667 3.000000 3.666667 5.000000

Or use a regular expression:

rowMeans(df1[colnames(df1)[grepl("^\\d+$", colnames(df1))]])
# [1] 4.666667 6.000000 6.666667 2.666667 3.000000 3.666667 5.000000
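To get the new column the question asks for, the regular-expression test can be wrapped in a small function. This is only a sketch: the helper name add_numeric_rowmean, the new_col argument, and the toy data are made up for illustration, and na.rm = TRUE is carried over from the question's own call.

```r
# Hypothetical helper: append the mean of every column whose name is all digits
add_numeric_rowmean <- function(df, new_col = "rowmean") {
  is_num_name <- grepl("^\\d+$", colnames(df))
  # drop = FALSE keeps a data frame even if only one column matches
  df[[new_col]] <- rowMeans(df[, is_num_name, drop = FALSE], na.rm = TRUE)
  df
}

# Made-up data; check.names = FALSE keeps "1", "2", "3" as column names
toy <- data.frame(Time = c("Day1", "Day2"),
                  `1` = c(11, 3), `2` = c(2, NA), `3` = c(1, 13),
                  check.names = FALSE)
res <- add_numeric_rowmean(toy)

# A further numeric-named column is picked up without changing any range
toy[["4"]] <- c(6, 4)
res4 <- add_numeric_rowmean(toy)
```

Because the columns are selected by name pattern rather than by position, the call stays the same when more numeric-named columns are added.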

Answer 2

It is not advisable to use column names that start with numbers. We can use make.names to change them so that the prefix 'X' is added:

rowMeans(df1[grep('^X', make.names(names(df1)))])
#[1] 4.666667 6.000000 6.666667 2.666667 3.000000 3.666667 5.000000
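A short sketch of what make.names does to such names may help here. The name vector below is made up, and "Xray" is a hypothetical column added only to show that the pattern '^X' would also match a name that genuinely starts with X; a stricter pattern is one possible way around that.

```r
# Made-up column names, including one that really starts with "X"
nms <- c("TesterType", "Xray", "1", "2", "3")
fixed <- make.names(nms)   # numeric names get an "X" prefix

loose  <- grep("^X", fixed)        # also picks up "Xray"
strict <- grep("^X\\d+$", fixed)   # only "X" followed by digits
```

Here loose contains the position of "Xray" as well, while strict keeps only the three originally numeric names.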

Or use dplyr:

library(dplyr)
df1 %>% 
    select(matches('^\\d+')) %>%
    Reduce(`+`, .)/3   # 3 = number of matched columns
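The division by 3 above fixes the number of matched columns by hand. A possible variant that avoids this, sketched here on made-up data and assuming dplyr with the magrittr pipe is available, lets rowMeans do the counting:

```r
library(dplyr)

# Made-up data; check.names = FALSE keeps "1", "2", "3" as column names
toy <- data.frame(Time = c("Day1", "Day2"),
                  `1` = c(11, 3), `2` = c(2, 2), `3` = c(1, 13),
                  check.names = FALSE)

# `.` is the piped data frame; select() keeps the all-digit column names
res <- toy %>%
  mutate(rowmean = rowMeans(select(., matches("^\\d+$")), na.rm = TRUE))
```

Since rowMeans divides by however many columns were selected, nothing needs to change when more numeric-named columns appear.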

Answer 3

Building on @Ven Yao's answer, use mutate to create a column of row means:

require(dplyr)
df1 <- read.table(text="
TesterType    SubjectType    Time        1     2     3
TType1         SType1          Day1       11    2     1
TType1         SType2          Day1       3     2     13
TType1         SType1          Day2       2     3     15
TType2         SType3          Day2       1     4     3
TType3         SType3          Day2       2     3     4
TType1         SType1          Day1       7     2     2
TType2         SType1          Day2       2     6     7",
                  header = TRUE, as.is = TRUE, check.names = FALSE)

# Positions of the columns whose names convert to a number
l <- which(!is.na(suppressWarnings(as.numeric(colnames(df1)))))
df1 <- df1 %>%
  mutate(rowmean = apply(select(., all_of(l)), 1, mean))
df1
  TesterType SubjectType Time  1 2  3  rowmean
1     TType1      SType1 Day1 11 2  1 4.666667
2     TType1      SType2 Day1  3 2 13 6.000000
3     TType1      SType1 Day2  2 3 15 6.666667
4     TType2      SType3 Day2  1 4  3 2.666667
5     TType3      SType3 Day2  2 3  4 3.000000
6     TType1      SType1 Day1  7 2  2 3.666667
7     TType2      SType1 Day2  2 6  7 5.000000
