Decision based on the headers of a data frame
I have a list of data frames:
df
[[1]]
ID SignalIntensity SNR
1 109 6.182309 0.8453577
2 110 10.172777 4.3837078
3 111 7.292275 1.0725751
4 112 8.898467 2.3192185
5 113 9.591034 3.7133402
7 116 7.789323 1.3636656
8 117 7.194835 1.1349738
9 118 6.572773 0.9041846
11 120 9.371126 2.9968457
12 121 6.154944 0.7777584
[[2]]
ID SignalIntensity SNR
1 118 6.572773 0.9041846
2 119 5.377519 0.7098581
3 120 9.371126 2.9968457
4 121 6.154944 0.7777584
5 123 5.797446 0.7235425
6 124 5.573614 0.7019574
7 125 7.014537 0.3433343
8 126 6.089159 0.7971650
9 127 6.314820 0.7845944
10 131 5.342544 1.2300000
Each one has the headers ID, SignalIntensity and SNR. I check the headers with names(df[[1]]). After checking them I need to make a decision: for example, if the headers of df[[1]] are ID, SignalIntensity and SNR, then do something like
If(names(df[[1]]=="ID"))
{
print("This is data from Illumina platform")
my code..........
}
else if{my code...........}
As you can see, it has three headers.
I know that my approach below is wrong:
if(names(df[[1]]=="ID, SignalIntensity, SNR"))
It gives me
Error in if (names(df[[1]] == "ID, SignalIntensity, SNR")) { :
argument is of length zero
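which is to be expected.
How can I set up the if {} so that it matches all three headers (or whichever of them we choose: 1, 2 or 3) and runs one block of code if TRUE, and does something else otherwise? Thanks.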
Try this code:
headers <- c("ID", "SNR") # can add more header names here
hasHeader <- is.element(headers, names(df[[1]])) # c(TRUE, TRUE)
sumHeader <- sum(hasHeader, na.rm = TRUE) # 2
result <- sumHeader == length(headers) # compare with the number of wanted headers
# result is TRUE if both "ID" and "SNR" are names of df[[1]]
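The same check can be written more compactly with all() and %in%. This is only a minimal sketch: has_headers is a hypothetical helper name, and the small data frame is made up for illustration.

```r
# Hypothetical helper: TRUE only if every wanted header is a column name of x
has_headers <- function(x, wanted) {
  all(wanted %in% names(x))
}

# Made-up example data with the three headers from the question
d <- data.frame(ID = 1:3,
                SignalIntensity = c(6.1, 10.2, 7.3),
                SNR = c(0.8, 4.4, 1.1))

has_headers(d, c("ID", "SNR"))      # TRUE
has_headers(d, c("ID", "Missing"))  # FALSE
```

Because the result is a single TRUE or FALSE, it can be used directly as the condition of an if statement.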
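If you want to go straight to the code:
wanted_colnames <- c("ID", "SignalIntensity", "SNR")
lapply(df, function(u) {
  # all() requires every wanted header; use any() to accept at least one
  if (all(wanted_colnames %in% names(u))) {
    # do something
  } else {
    # do something else
  }
})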
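Which function wraps the %in% test decides how strict the match is. The sketch below contrasts three options on a made-up data frame that has only one of the wanted headers (the name partial and its contents are assumptions for illustration):

```r
wanted_colnames <- c("ID", "SignalIntensity", "SNR")

# Made-up data frame that contains "ID" but not the other two headers
partial <- data.frame(ID = 1:3, x = c(0.1, 0.2, 0.3))

any(wanted_colnames %in% names(partial))   # TRUE: "ID" is present
all(wanted_colnames %in% names(partial))   # FALSE: two headers are missing
setequal(names(partial), wanted_colnames)  # FALSE: not exactly the same set
```

any() covers the "1 or 2 or 3 of the headers" case from the question, all() requires every wanted header, and setequal() additionally rejects data frames with extra columns.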
Expanding on my comment, try this:
#dummy data
df <-
list(
data.frame(ID=1:5,
SignalIntensity=runif(5),
SNR=runif(5)),
data.frame(ID=1:3,
x=runif(3)),
data.frame(ID=1:5,
SignalIntensity=runif(5),
SNR=runif(5)))
#check 1st data frame
if(length(intersect(names(df[[1]]),c("ID","SignalIntensity","SNR")))==3){
print("Illumina platform")} else {
print("Non Illumina platform")}
# [1] "Illumina platform"
#check all dataframes
lapply(df,function(i)
if(length(intersect(names(i),c("ID","SignalIntensity","SNR")))==3){
"Illumina platform"} else {
"Non Illumina platform"})
# [[1]]
# [1] "Illumina platform"
#
# [[2]]
# [1] "Non Illumina platform"
#
# [[3]]
# [1] "Illumina platform"
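As for the error in the question, it most likely comes from a misplaced parenthesis: names(df[[1]] == "ID") compares every cell of the data frame with "ID" first and only then asks for names. A small sketch on made-up data (d and cell_cmp are illustrative names):

```r
# Made-up example data with the three headers from the question
d <- data.frame(ID = 1:3,
                SignalIntensity = c(6.1, 10.2, 7.3),
                SNR = c(0.8, 4.4, 1.1))

# Misplaced parenthesis: the comparison runs on the cells, giving a logical matrix
cell_cmp <- d == "ID"
is.matrix(cell_cmp)       # TRUE
is.null(names(cell_cmp))  # TRUE, so if() receives a zero-length argument

# Intended comparison: one logical value per column name
names(d) == "ID"          # TRUE FALSE FALSE
```

Even the intended comparison returns one value per column, which is why the answers above collapse it to a single TRUE or FALSE before passing it to if.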