R: substitute multiple column names in a data frame and keep their numerical value
I have a data.frame named dataOrder in which the columns correspond to sample names (n=384) and the rows correspond to gene entities (n=180200).
sample1 sample2 sample3 sample4 sample5 sample6
ENST00000000233 9 0 3499.51 0 0 0
ENST00000000412 0 0 0.00 0 0 0
ENST00000000442 0 0 0.00 0 0 0
ENST00000001008 0 0 0.00 0 0 0
ENST00000001146 0 0 0.00 0 0 0
ENST00000002125 0 0 0.00 0 0 0
I want to replace part of the column names (the string sample) with five different prefixes: t1_, t2_, t3_, t4_ and t5_.
I tried using the gsub function to replace the names:
nameVec <- names(dataOrder)
nameVec <- gsub("sample","t2_",nameVec[1:96])
nameVec <- gsub("sample","t3_",nameVec[97:163])
nameVec <- gsub("sample","t4_",nameVec[164:259])
nameVec <- gsub("sample","t5_",nameVec[260:333])
nameVec <- gsub("sample","t1_",nameVec[334:384])
names(dataOrder) <- nameVec
head(dataOrder)
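However, all of my column names were replaced with NA.
How can I replace the 'sample' string in the header and keep the numerical index in the column name?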
t1_1 t1_96 t2_97 t2_163 t3_164 t3_259
ENST00000000233 9 0 3499.51 0 0 0
ENST00000000412 0 0 0.00 0 0 0
ENST00000000442 0 0 0.00 0 0 0
ENST00000001008 0 0 0.00 0 0 0
ENST00000001146 0 0 0.00 0 0 0
ENST00000002125 0 0 0.00 0 0 0
Here is a reproducible data example (written by @RuiBarradas):
mydf <-
structure(list(target_id = c("ENST00000000233", "ENST00000000412",
"ENST00000000442", "ENST00000001008", "ENST00000001146", "ENST00000002125"
), sample1 = c(9L, 0L, 0L, 0L, 0L, 0L), sample10 = c(0L, 0L,
0L, 0L, 0L, 0L), sample100 = c(3499.51, 0, 0, 0, 0, 0), sample101 = c(0L,
0L, 0L, 0L, 0L, 0L), sample102 = c(0L, 0L, 0L, 0L, 0L, 0L), sample103 = c(0L,
0L, 0L, 0L, 0L, 0L)), .Names = c("target_id", "sample1", "sample10",
"sample100", "sample101", "sample102", "sample103"), class = "data.frame", row.names = c("1:",
"2:", "3:", "4:", "5:", "6:"))
result <- mydf[-1]
row.names(result) <- mydf$target_id
result
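Thanks!
Each assignment overwrites the whole vector with only one slice of it, so the later index ranges no longer exist and return NA. Assign each result back to the matching slice instead. Try: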
nameVec <- names(dataOrder)
nameVec[1:96] <- gsub("sample", "t2_", nameVec[1:96])
nameVec[97:163] <- gsub("sample", "t3_", nameVec[97:163])
nameVec[164:259] <- gsub("sample", "t4_", nameVec[164:259])
nameVec[260:333] <- gsub("sample", "t5_", nameVec[260:333])
nameVec[334:384] <- gsub("sample", "t1_", nameVec[334:384])
names(dataOrder) <- nameVec
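To see why the original attempt produced NA, here is a minimal sketch on a hypothetical six-name vector standing in for names(dataOrder). The six names and the two prefixes are assumptions chosen only to keep the example short. It also shows one possible alternative that builds a prefix per column with rep() and swaps it in with a single sub() call.

```r
# Hypothetical stand-in for names(dataOrder): 6 columns instead of 384.
nameVec <- paste0("sample", 1:6)

# Pattern from the question: each step overwrites the whole vector with a slice.
broken <- nameVec
broken <- gsub("sample", "t2_", broken[1:3])  # broken now has length 3
broken <- gsub("sample", "t3_", broken[4:6])  # positions 4:6 are gone, so NA
print(broken)

# Alternative sketch: one prefix per column position, then replace the
# leading "sample" in a single vectorised step.
prefix <- rep(c("t2_", "t3_"), times = c(3, 3))
fixed <- paste0(prefix, sub("^sample", "", nameVec))
print(fixed)
```

For the full data set, the same idea would presumably use rep(c("t2_", "t3_", "t4_", "t5_", "t1_"), times = c(96, 67, 96, 74, 51)), where the counts match the index ranges above and sum to 384. Note that the prefix is chosen by column position, not by the number inside the column name.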