How to reorder the words in a string alphabetically

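I have a dataset containing a list of keywords (one keyword per row).

I am looking for a way to create a new column (ALPHABETICAL) based on the KEYWORD column. The values of ALPHABETICAL should be generated automatically from the keyword, but with its words sorted in alphabetical order.

Like this: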

 | KEYWORD            | ALPHABETICAL       |
 | house blue         | blue house         | 
 | blue house         | blue house         | 
 | my blue house      | blue house my      | 
 | this house is blue | blue house is this | 
 | sky orange         | orange sky         | 
 | orange sky         | orange sky         | 
 | the orange sky     | orange sky the     | 

Thanks for your help!

Iterate over the rows, split each one on " " (strsplit), sort the words, and collapse them back into a single string:

# Generate data (sample() is random, so the letters shown below are just one possible draw)
df <- data.frame(KEYWORD = c(paste(sample(letters, 3), collapse = " "), 
                             paste(sample(letters, 3), collapse = " ")))
#  KEYWORD
#   z e s
#   d a u

df$ALPHABETICAL  <- apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),
                                                   collapse = " "))
#  KEYWORD ALPHABETICAL
#   z e s        e s z
#   d a u        a d u
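
The split, sort, and collapse steps above can also be wrapped in a small function that works on the KEYWORD vector directly. This is only a sketch: `sort_words` is a hypothetical helper name, and the extra `ID` column is made up for illustration. The likely benefit is that it touches only the column you pass in, whereas `apply(df, 1, ...)` hands the whole row to the function, so in a data frame with more than one column the other values would end up mixed into the result.

```r
# sort_words() is a hypothetical helper, not part of base R.
# It splits each string on single spaces, sorts the words,
# and pastes them back together, returning one string per input.
sort_words <- function(x) {
  vapply(strsplit(x, " ", fixed = TRUE),
         function(words) paste(sort(words), collapse = " "),
         character(1))
}

# Illustrative data with an extra column (ID is an assumption for this example)
df <- data.frame(ID = 1:2,
                 KEYWORD = c("house blue", "my blue house"),
                 stringsAsFactors = FALSE)

df$ALPHABETICAL <- sort_words(df$KEYWORD)
df$ALPHABETICAL
# "blue house"    "blue house my"
```

Note that `sort()` orders strings according to the current locale, so mixed-case or accented words may sort differently on different systems.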

A solution with dplyr + stringr

library(dplyr)
library(stringr)
KEYWORDS  <- c('house blue','blue house','my blue house','this house is blue','sky orange','orange sky','the orange sky')

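ALPHABETICAL <- KEYWORDS %>%
  str_split(' ') %>%                # split each keyword into a vector of words
  lapply(sort) %>%                  # sort the words of each keyword
  lapply(paste, collapse = ' ') %>% # join the sorted words back together
  unlist()                          # flatten the list into a character vector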

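The last line uses str_split() to split KEYWORDS into a list of vectors, then applies sort to each list element, joins each vector back together with paste, and finally flattens the list into a vector.

The result is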

> cbind(KEYWORDS, ALPHABETICAL)
     KEYWORDS             ALPHABETICAL        
[1,] "house blue"         "blue house"        
[2,] "blue house"         "blue house"        
[3,] "my blue house"      "blue house my"     
[4,] "this house is blue" "blue house is this"
[5,] "sky orange"         "orange sky"        
[6,] "orange sky"         "orange sky"        
[7,] "the orange sky"     "orange sky the" 
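
If the keywords live in a data frame column rather than a standalone vector, the same pipeline can probably be placed inside `mutate()`. The following is a minimal sketch under that assumption, using a shortened version of the example data; `vapply()` is swapped in for the final `lapply()` + `unlist()` so that each row is guaranteed to produce exactly one string.

```r
library(dplyr)
library(stringr)

# Shortened example data (three of the keywords from the question)
df <- data.frame(KEYWORD = c("house blue", "this house is blue", "the orange sky"),
                 stringsAsFactors = FALSE)

df <- df %>%
  mutate(ALPHABETICAL = str_split(KEYWORD, " ") %>%             # list of word vectors
           lapply(sort) %>%                                     # sort each vector
           vapply(paste, character(1), collapse = " "))         # one string per row

df$ALPHABETICAL
# "blue house"  "blue house is this"  "orange sky the"
```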

A base R alternative that splits the column once and sorts each piece with sapply:

df$ALPHABETICAL <- sapply(strsplit(df$KEYWORD, " "), function(x) paste(sort(x), collapse = " "))

df
#              KEYWORD       ALPHABETICAL
# 1         house blue         blue house
# 2         blue house         blue house
# 3      my blue house      blue house my
# 4 this house is blue blue house is this
# 5         sky orange         orange sky
# 6         orange sky         orange sky
# 7     the orange sky     orange sky the

Data

df <- data.frame(KEYWORD = c(
  'house blue',
  'blue house',
  'my blue house',
  'this house is blue',
  'sky orange',
  'orange sky',
  'the orange sky'),stringsAsFactors = FALSE)