Change the name of fields in R
I'm new to this group (and a fairly new R user), and I have a question. I have a data.table like this:
   Date   V2                        Deal Type
1: 2009-1 Public sector bank        Corporate Bond-Investment-Grade
2: 2009-1 Private sector bank       Corporate Bond-Investment-Grade
3: 2009-7 Private sector industrial Corporate Bond-Investment-Grade
4: 2009-1 Private sector bank       Corporate Bond-Investment-Grade
5: 2009-1 Private sector bank       Covered Bond
6: 2009-1 Public sector bank        Corporate Bond-Investment-Grade
7: 2009-1 Private sector bank       Corporate Bond-Investment-Grade
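My question is how to change the names of the values in column V2. For example, I would like "Public sector bank" and "Private sector bank" to appear as "financial" in a new column, and "Private sector industrial" and "Public sector industrial" to appear as "non-financial". I hope I've been clear enough. Thanks a lot for your help.
Assuming your data frame is named df, you can do the following: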
df <- read.csv("data.csv", stringsAsFactors=FALSE)
df$newColumn[df$V2 == "Public sector bank" | df$V2 == "Private sector bank"] <- "financial"
df$newColumn[df$V2 == "Public sector industrial" | df$V2 == "Private sector industrial"] <- "non-financial"
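Or, if you are sure that every value in V2 contains either the word "bank" or the word "industrial", and that word is what decides the value of the new column, you can do this: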
df$newColumn[grepl("bank", df$V2)] <- "financial"
df$newColumn[grepl("industrial", df$V2)] <- "non-financial"
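As a minimal sketch of the pattern-matching approach above, the snippet below rebuilds a few rows from the question as a small data frame. The row values are copied from the question. Initializing newColumn to NA and passing ignore.case = TRUE are additions here, made on the assumption that unmatched or differently cased labels should be easy to spot rather than silently skipped.

```r
# Sample rows copied from the question (only the V2 column matters here).
df <- data.frame(
  Date = c("2009-1", "2009-1", "2009-7", "2009-1"),
  V2 = c("Public sector bank", "Private sector bank",
         "Private sector industrial", "Private sector bank"),
  stringsAsFactors = FALSE
)

# Start with NA so rows that match neither pattern stay visibly missing.
df$newColumn <- NA_character_

# ignore.case = TRUE is an assumption: it also catches "Bank" or "BANK".
df$newColumn[grepl("bank", df$V2, ignore.case = TRUE)] <- "financial"
df$newColumn[grepl("industrial", df$V2, ignore.case = TRUE)] <- "non-financial"

print(df)
```

Any row still showing NA in newColumn afterwards is a label that neither pattern covered, which is worth checking before relying on the new column.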
This works for data tables as well.
If DT is your data.table:
DT[, V3 := ifelse(V2 %in% c("Public sector bank", "Private sector bank"), "financial", "non-financial")]
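One thing to keep in mind with a two-way ifelse() is that every value that is not a bank label ends up in the other group, including typos or unexpected categories. A possible alternative, sketched here under the assumption that data.table 1.13.0 or later is installed (for fcase()), maps each group explicitly and leaves anything else as NA. The "Unknown issuer" row is made up for illustration and is not in the original data.

```r
library(data.table)

# Sample rows copied from the question, plus one made-up unexpected label.
DT <- data.table(
  Date = c("2009-1", "2009-1", "2009-7", "2009-1"),
  V2 = c("Public sector bank", "Private sector bank",
         "Private sector industrial", "Unknown issuer")
)

# Map each group explicitly; anything else becomes NA instead of a guess.
DT[, V3 := fcase(
  V2 %in% c("Public sector bank", "Private sector bank"), "financial",
  V2 %in% c("Public sector industrial", "Private sector industrial"), "non-financial",
  default = NA_character_
)]

print(DT)
```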
Standardizing text fields is usually good practice, so you might consider:
DT[, V3 := ifelse(tolower(gsub(" ", "", V2)) %in% c("publicsectorbank", "privatesectorbank"), "financial", "non-financial")]
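To show what the normalization buys you, here is a small base R sketch (so it runs without data.table). The helper name normalize_label and the messy input values are hypothetical, and the helper strips all whitespace rather than only single spaces, which is a slight variation on the line above.

```r
# Hypothetical helper: lower-case the text and drop all whitespace.
normalize_label <- function(x) {
  tolower(gsub("[[:space:]]+", "", x))
}

# Made-up variants of the same labels, as they might appear in messy data.
v2 <- c("Public sector bank", "private  sector bank", "PRIVATE SECTOR BANK ",
        "Private sector industrial")

# Keys are written in the normalized form: lower case, no spaces.
bank_keys <- c("publicsectorbank", "privatesectorbank")

v3 <- ifelse(normalize_label(v2) %in% bank_keys, "financial", "non-financial")
print(v3)
```

Without the normalization step, the second and third values would not match the exact strings "Public sector bank" or "Private sector bank" and would fall into the wrong group.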
Hope this helps. I also recommend the data.table cheat sheet: https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf
replace() can come in handy here. Assuming your data frame is DF and your new column is V2new:
# Creating new column V2new and replacing "Public/Private sector bank" to "financial"
DF$V2new <- replace(DF$V2 ,DF$V2 =="Public sector bank"|DF$V2=="Private sector bank","financial")
# Replacing "Public/Private sector industrial" from V2new to "non-financial"
DF$V2new <- replace(DF$V2new ,DF$V2new =="Public sector industrial"|DF$V2new =="Private sector industrial","non-financial")
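One caveat with replace(): if V2 was read in as a factor, assigning a string that is not an existing level produces NA with a warning. The sketch below, using a made-up three-row data frame, converts the column to character first and then applies the same two replacements. The intermediate names v2_chr, is_bank and is_industrial are only for illustration.

```r
# Made-up example where V2 ended up as a factor.
DF <- data.frame(
  V2 = c("Public sector bank", "Private sector industrial", "Private sector bank"),
  stringsAsFactors = TRUE
)

# Convert to character first; replace() on a factor would give NA (with a
# warning) for values that are not existing levels.
v2_chr <- as.character(DF$V2)

is_bank <- v2_chr %in% c("Public sector bank", "Private sector bank")
is_industrial <- v2_chr %in% c("Public sector industrial", "Private sector industrial")

DF$V2new <- replace(v2_chr, is_bank, "financial")
DF$V2new <- replace(DF$V2new, is_industrial, "non-financial")

print(DF)
```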