How to elegantly split strings in half by word count?
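I need to split strings in half by word count (when the number of words is odd, the middle word should appear on both the left and the right side). I also need to know which side each string came from.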
my_question <- data.frame(string_id = c(001, 002, 003),
                          string = c("how do I split", "how do I split this", "how do I split this string"))
# Expected result: left halves are labelled "L", right halves "R"
my_answer <- data.frame(string_id = c(001, 002, 003, 001, 002, 003),
                        string = c("how do", "how do I", "how do I", "I split", "I split this", "split this string"),
                        side = c("L", "L", "L", "R", "R", "R"))
I would prefer to use stringr/tidyverse/hadleyverse.
We can write a few helper functions to make this easier.
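library(tidyverse)

# Return the left or right half (by word count) of each string in x.
# With an odd number of words, the middle word lands in both halves.
word_split <- function(x, side = c("left", "right"), sep = " ") {
  side <- match.arg(side)  # fail early on anything other than "left"/"right"
  words <- strsplit(as.character(x), sep)
  nwords <- lengths(words)
  if (side == "left") {
    start <- 1
    end <- ceiling(nwords / 2)
  } else {
    start <- ceiling((nwords + 1) / 2)
    end <- nwords
  }
  # Paste the selected words of one string back together
  cw <- function(words, start, stop) paste(words[start:stop], collapse = sep)
  pmap_chr(list(words, start, end), cw)
}

# Thin wrappers; any `side` argument passed by the caller is ignored
left_words <- function(..., side) word_split(..., side = "left")
right_words <- function(..., side) word_split(..., side = "right")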
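To see why the `ceiling()` expressions put the middle word in both halves, here is a minimal base R sketch of the index arithmetic only. `half_ranges` is a hypothetical helper introduced for illustration; it is not part of the answer's code.

```r
# Hypothetical helper: first and last word index of each half
# for a string that contains n words.
half_ranges <- function(n) {
  list(
    left  = c(1, ceiling(n / 2)),        # words 1 .. ceiling(n/2)
    right = c(ceiling((n + 1) / 2), n)   # words ceiling((n+1)/2) .. n
  )
}

even <- half_ranges(4)  # left covers words 1-2, right covers words 3-4
odd  <- half_ranges(5)  # left covers words 1-3, right covers words 3-5
```

With an even count the two ranges do not overlap; with an odd count both ranges contain the middle index, which is the behaviour the question asks for.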
Then we can use a more traditional pipe chain to get the result you want:
my_question %>%
  mutate(L = left_words(string),
         R = right_words(string)) %>%
  select(-string) %>%
  gather(side, string, L:R)
This results in:
  string_id side            string
1         1    L            how do
2         2    L          how do I
3         3    L          how do I
4         1    R           I split
5         2    R      I split this
6         3    R split this string
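Since the question mentions a preference for stringr, the same result can probably be reached without a custom helper. The sketch below is an alternative to the answer's approach, not the answer's own method. It assumes the words are separated by single spaces, that `stringr::word()` is used to pull out a range of words, and that `tidyr::pivot_longer()` (tidyr 1.0.0 or later) stands in for `gather()`. The intermediate column `n` is introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

my_question <- data.frame(
  string_id = c(1, 2, 3),
  string = c("how do I split", "how do I split this", "how do I split this string")
)

result <- my_question %>%
  mutate(
    n = str_count(string, "\\S+"),             # number of words per string
    L = word(string, 1, ceiling(n / 2)),       # first half, middle word included
    R = word(string, ceiling((n + 1) / 2), n)  # second half, middle word included
  ) %>%
  select(-string, -n) %>%
  pivot_longer(c(L, R), names_to = "side", values_to = "string") %>%
  arrange(side, string_id)                     # same row order as the gather() output

result
```

`pivot_longer()` returns the rows grouped by `string_id`, so the final `arrange()` is only there to reproduce the ordering shown above.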