How to reorder the words in each string alphabetically
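I have a dataset containing a list of keywords (one keyword phrase per row).
I am looking for a way to create a new column (ALPHABETICAL) based on the KEYWORD column. The values of ALPHABETICAL should be generated automatically from KEYWORD, with the words of each entry sorted alphabetically.
Like this: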
| KEYWORD | ALPHABETICAL |
| house blue | blue house |
| blue house | blue house |
| my blue house | blue house my |
| this house is blue | blue house is this |
| sky orange | orange sky |
| orange sky | orange sky |
| the orange sky | orange sky the |
Thanks for your help!
Iterate over the rows, split each one on " " (strsplit), sort the words, and collapse them back together:
# Generate data
df <- data.frame(KEYWORD = c(paste(sample(letters, 3), collapse = " "),
paste(sample(letters, 3), collapse = " ")))
# KEYWORD
# z e s
# d a u
df$ALPHABETICAL <- apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),
collapse = " "))
# KEYWORD ALPHABETICAL
# z e s e s z
# d a u a d u
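One caveat with the apply() version: it passes every column of each row to the function, so it only gives the intended result while df contains nothing but KEYWORD. A minimal sketch, assuming a hypothetical data frame df2 with an extra ID column (neither is part of the original question), restricts the work to the KEYWORD column:

```r
# Hypothetical data frame with an extra ID column (illustration only)
df2 <- data.frame(ID = c(1, 2),
                  KEYWORD = c("z e s", "d a u"),
                  stringsAsFactors = FALSE)

# Split only the KEYWORD column, sort each word vector,
# and paste it back into a single string per row
df2$ALPHABETICAL <- vapply(
  strsplit(df2$KEYWORD, " ", fixed = TRUE),
  function(words) paste(sort(words), collapse = " "),
  character(1)
)
```

vapply() is used here instead of apply() so the result is guaranteed to be a character vector with one element per row.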
A solution using dplyr + stringr:
library(dplyr)
library(stringr)
KEYWORDS <- c('house blue','blue house','my blue house','this house is blue','sky orange','orange sky','the orange sky')
ALPHABETICAL <- KEYWORDS %>% str_split(., ' ') %>% lapply(., 'sort') %>% lapply(., 'paste', collapse=' ') %>% unlist(.)
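The last line uses str_split() to split KEYWORDS into a list of character vectors, applies sort to each list element, joins each sorted vector back into one string with paste, and finally flattens the list into a character vector with unlist().
The result is: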
> cbind(KEYWORDS, ALPHABETICAL)
KEYWORDS ALPHABETICAL
[1,] "house blue" "blue house"
[2,] "blue house" "blue house"
[3,] "my blue house" "blue house my"
[4,] "this house is blue" "blue house is this"
[5,] "sky orange" "orange sky"
[6,] "orange sky" "orange sky"
[7,] "the orange sky" "orange sky the"
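The split, sort, paste, unlist sequence described above can also be traced one step at a time. The sketch below swaps in base R's strsplit() for str_split() so that it runs without extra packages; the intermediate names (parts, sorted, joined) are only illustrative.

```r
keywords <- c("my blue house", "sky orange")

# 1. Split each string on spaces: a list with one character vector per keyword
parts <- strsplit(keywords, " ", fixed = TRUE)

# 2. Sort the words inside each list element
sorted <- lapply(parts, sort)

# 3. Paste each sorted vector back into a single string
joined <- lapply(sorted, paste, collapse = " ")

# 4. Flatten the list into a plain character vector
alphabetical <- unlist(joined)
```

Printing parts, sorted, and joined in turn shows what each stage of the pipe hands to the next.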
df$ALPHABETICAL <- sapply(strsplit(df$KEYWORD," "),function(x) paste(sort(x),collapse=" "))
df
# KEYWORD ALPHABETICAL
# 1 house blue blue house
# 2 blue house blue house
# 3 my blue house blue house my
# 4 this house is blue blue house is this
# 5 sky orange orange sky
# 6 orange sky orange sky
# 7 the orange sky orange sky the
Data
df <- data.frame(KEYWORD = c(
'house blue',
'blue house',
'my blue house',
'this house is blue',
'sky orange',
'orange sky',
'the orange sky'),stringsAsFactors = FALSE)
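If the real keywords contain repeated spaces or leading whitespace, splitting on a single " " leaves empty strings among the words. A possible variant, assuming that situation and using a hypothetical helper name sort_words, trims each entry and splits on any run of whitespace:

```r
# Hypothetical helper (not from the original answers):
# sorts the words of each string, tolerating irregular whitespace
sort_words <- function(x) {
  # Trim the ends, then split on one or more whitespace characters
  words <- strsplit(trimws(x), "[[:space:]]+")
  # Sort each word vector and join it back with single spaces
  vapply(words, function(w) paste(sort(w), collapse = " "), character(1))
}

result <- sort_words(c("  house   blue ", "the orange sky"))
```

With the data above it would be applied as df$ALPHABETICAL <- sort_words(df$KEYWORD).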