Add a new column based on string values in R
Consider the dataset:
activity <- c("play football", "basketball player", "guitar sono","cinema", "piano")
country_and_type <- c("uk", "uk", "spain", "uk", "uk")
dataset <- data.frame(activity, country_and_type)
|activity |country_and_type |
|play football |uk |
|basketball player |uk |
|guitar sono |spain |
|cinema |uk |
|piano |uk |
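and these lists:
sport <- c("football", "basketball", "handball", "baseball")
music <- c("guitar", "piano", "microphone")
If the initial dataset$country_and_type value is "uk", my goal is to append, in parentheses, the name of the list that matches the activity string to the dataset$country_and_type column.
If there is no matching value, the type should be "other".
To make it clearer, this is the expected output: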
|activity |country_and_type |
|play football |uk (sport) |
|basketball player |uk (sport) |
|guitar sono |spain |
|cinema |uk (other) |
|piano |uk (music) |
Do you know how to do this?
One approach, shown on an example data frame whose text column is named a:
> dataset$type=NA
> dataset$type[grepl(paste(sport,collapse = "|"),dataset$a)]="sport"
> dataset$type[grepl(paste(music,collapse = "|"),dataset$a)]="music"
> dataset
                      a  type
1         play football sport
2     basketball player sport
3           guitar sono music
4          french piano music
5           ok handball sport
6         baseball game sport
7 microphone for singer music
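The approach above relies on `paste(..., collapse = "|")` turning a keyword vector into a single regular-expression alternation, which `grepl` then tests against each string. A small sketch of that step in isolation (the `pattern` and `hits` variables and the three sample strings are introduced here only for illustration):

```r
sport <- c("football", "basketball", "handball", "baseball")

# Collapse the keywords into one alternation pattern.
pattern <- paste(sport, collapse = "|")
pattern
# [1] "football|basketball|handball|baseball"

# grepl returns one logical value per input string.
hits <- grepl(pattern, c("play football", "cinema", "ok handball"))
hits
# [1]  TRUE FALSE  TRUE
```

Because the match is a plain substring search, a keyword also matches inside longer words, which is why "basketball player" is labelled as sport.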
Revised version, for the dataset in the question:
> # paste each row into one string, then look for a keyword followed by "uk"
> sp=grepl(paste(sport,".*uk",collapse = "|"),do.call(paste,dataset))
> ms=grepl(paste(music,".*uk",collapse = "|"),do.call(paste,dataset))
> uk=grepl("uk",do.call(paste,dataset))
> dataset$type=""
> dataset$type[sp]="(sport)"
> dataset$type[ms]="(music)"
> dataset$type[!(ms|sp)&uk]="(other)"
> # trimws drops the trailing space on non-uk rows; [-3] drops the helper column
> transform(dataset,country_and_type=trimws(paste(country_and_type,type)))[-3]
           activity country_and_type
1     play football       uk (sport)
2 basketball player       uk (sport)
3       guitar sono            spain
4            cinema       uk (other)
5             piano       uk (music)
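The same result can also be sketched with a named list of keyword vectors, so that adding a category only means adding a list element. This is a minimal sketch under stated assumptions, not the method from the answer above: `keywords`, `label_type`, `is_uk` and `out` are hypothetical names introduced here, and when an activity matches more than one category the earlier list entry is assumed to win.

```r
activity <- c("play football", "basketball player", "guitar sono", "cinema", "piano")
country_and_type <- c("uk", "uk", "spain", "uk", "uk")
dataset <- data.frame(activity, country_and_type, stringsAsFactors = FALSE)

# Hypothetical named list: each name is a label, each value its keywords.
keywords <- list(
  sport = c("football", "basketball", "handball", "baseball"),
  music = c("guitar", "piano", "microphone")
)

# Return the first category whose keywords match, or "other" if none do.
label_type <- function(x, keywords) {
  type <- rep("other", length(x))
  # Walk the list backwards so that earlier entries overwrite later ones.
  for (name in rev(names(keywords))) {
    hit <- grepl(paste(keywords[[name]], collapse = "|"), x)
    type[hit] <- name
  }
  type
}

# Only rows whose country is exactly "uk" get a label appended.
is_uk <- dataset$country_and_type == "uk"
out <- dataset
out$country_and_type[is_uk] <- paste0(
  "uk (", label_type(dataset$activity[is_uk], keywords), ")"
)
out
```

Comparing the country column with `==` instead of searching the pasted row for "uk" avoids false hits when the activity text itself happens to contain those letters.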