Generate strings using data in a data.frame, using the colnames, and combining the data in different ways based on the value

I have a data frame that I want to use to build SQL queries.

Here is a sample of my data:

Data <- structure(list(inclass = c("01", "99", "99"), childage = c("0", 
"2", "4"), high_edu = c("00", "00", "14"), ref_race = c("1", 
"1", "1")), .Names = c("inclass", "childage", "high_edu", "ref_race"
), row.names = c(1L, 2L, 3L), class = "data.frame")

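I need to create several kinds of phrases from every row of the data, and a value of '99' has to be handled differently.

Phrase1: if the value is '99', it should read "'99' as <column name>"; otherwise it should use just the column name.

So the second row would look like: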

'99' as inclass, childage, high_edu, ref_race

Phrase2: if the value is not '99', combine the column name with the value.

So the second row would look like:

childage = '2', high_edu = '00', ref_race = '1'

Phrase3: paste together the names of the columns whose value is not '99'.

childage, high_edu, ref_race

I'm having trouble figuring out how to combine the data depending on whether a value is or isn't '99'.

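Edit

I think my question may have been a bit confusing. I am trying to get these three phrases for every row. This data should make it clearer what I am trying to do.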

structure(list(inclass = c("01", "99", "99", "1", "2"), childage = c("0", 
"99", "4", "6", "3"), high_edu = c("00", "99", "14", "99", "99"
), ref_race = c("1", "1", "1", "99", "4"), phrase1 = c("inclass, childage, high_edu, ref_race", 
"'99' as inclass, '99' as childage, '99' as high_edu, ref_race", 
"'99' as inclass, childage, high_edu, ref_race", "inclass, childage, '99' as high_edu, '99' as ref_race", 
"inclass, childage, '99' as high_edu, ref_race"), phrase2 = c("inclass = '01', childage = '0', high_edu = '00', ref_race = '1'", 
"ref_race = '1'", "childage = '4', high_edu = '14', ref_race = '1'", 
"inclass = '1', childage = '6'", "inclass = '2', childage = '3', ref_race = '4'"
), phrase3 = c("inclass, childage, high_edu, ref_race", "ref_race", 
"childage, high_edu, ref_race", "inclass, childage", "inclass, childage, ref_race"
)), .Names = c("inclass", "childage", "high_edu", "ref_race", 
"phrase1", "phrase2", "phrase3"), row.names = c(NA, 5L), class = "data.frame")

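I believe this should solve your problem. It may not be the most elegant solution, but I think it is at least straightforward. When you want specific output like this, look at functions such as any and get comfortable with paste. Essentially, all I do is work out which column names have the value '99' and which do not. After that it is just a matter of making good use of paste.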

cols <- c("inclass", "childage", "high_edu", "ref_race")
for (i in seq_len(nrow(Data))) {
    # values of row i as a plain character vector
    vals <- as.character(unlist(Data[i, cols]))
    is99 <- vals == "99"
    colsn99 <- cols[!is99]

    # phrase1: every column in its original order, '99' columns prefixed
    Data[i, "phrase1"] <- paste(
        ifelse(is99, paste("'99' as", cols), cols),
        collapse = ", ")

    # phrase2 and phrase3 only cover the non-'99' columns; a row that is
    # all '99' keeps NA in both
    if (any(!is99)) {
        Data[i, "phrase2"] <- paste(
            paste0(colsn99, " = '", vals[!is99], "'"),
            collapse = ", ")
        Data[i, "phrase3"] <- paste(colsn99, collapse = ", ")
    }
}
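
The same phrases can also be built without an explicit loop. The following is only a minimal sketch, assuming the four value columns hold character data and that a row made up entirely of '99' values should get NA for phrase2 and phrase3. The helper name make_phrases is hypothetical and not part of the code above. It keeps the columns in their original order in phrase1 and writes the pairs as col = 'value', matching the expected output in the question.

```r
# Hypothetical helper (an illustration, not the original answer's code):
# builds the three phrases for one row from its values and column names.
make_phrases <- function(vals, cols = names(vals)) {
    is99 <- vals == "99"
    keep <- cols[!is99]
    c(
        # every column in order, with '99' columns written as "'99' as col"
        phrase1 = paste(ifelse(is99, paste("'99' as", cols), cols),
                        collapse = ", "),
        # NA when every value in the row is '99'
        phrase2 = if (length(keep)) {
            paste0(keep, " = '", vals[!is99], "'", collapse = ", ")
        } else NA_character_,
        phrase3 = if (length(keep)) paste(keep, collapse = ", ") else NA_character_
    )
}

# Sample data from the question
Data <- data.frame(
    inclass  = c("01", "99", "99"),
    childage = c("0", "2", "4"),
    high_edu = c("00", "00", "14"),
    ref_race = c("1", "1", "1"),
    stringsAsFactors = FALSE
)

cols <- c("inclass", "childage", "high_edu", "ref_race")
# apply() calls the helper once per row and returns one column per row,
# so the result is transposed before it is bound to the data frame.
phrases <- t(apply(as.matrix(Data[cols]), 1, make_phrases, cols = cols))
Data <- data.frame(Data, phrases, stringsAsFactors = FALSE)
```

Keeping the per-row logic in one small function makes it easy to try on a single vector, for example make_phrases(c(inclass = "99", childage = "2")), before running it over the whole data frame.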