How to split a text using a string delimiter with R's strsplit?

Question

Suppose I have a text file of a book that contains several chapters of text.

x <- "Chapter 1 Text. Text. Chapter 2 Text. Text. Chapter 3 Text. Text."

I want to split this text and get a separate file for each chapter.

"Chapter 1 Text. Text." "Chapter 2 Text. Text." "Chapter 3 Text. Text."

Ideally, I would like to save each file by chapter, so Chapter 1, Chapter 2, and Chapter 3.

I have tried the following:

unlist(strsplit(x, "Chapter", perl = TRUE))

Unfortunately, this removes the delimiter, which I want to keep.

I have also tried the following:

unlist(strsplit(x, "(?<=Chapter)", perl=TRUE))

Unfortunately, this seems to work only for single characters and not for strings.

Thank you very much for your help!

Answer 1

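We need to use a regex lookahead: split on the whitespace that is immediately followed by "Chapter", so that the word "Chapter" itself is not consumed by the split.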

strsplit(x, "\\s(?=Chapter)", perl = TRUE)[[1]]
#[1] "Chapter 1 Text. Text." "Chapter 2 Text. Text." "Chapter 3 Text. Text."
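
The question also asks for one file per chapter. A minimal sketch of that step follows, assuming each piece starts with "Chapter" followed by a number and that a name such as "Chapter 1.txt" is acceptable. The variable names (`chapters`, `chapter_names`, `out_dir`, `paths`) and the use of a temporary directory are illustrative choices, not part of the original answer.

```r
x <- "Chapter 1 Text. Text. Chapter 2 Text. Text. Chapter 3 Text. Text."

# Split on the whitespace that precedes each "Chapter", keeping the word itself.
chapters <- strsplit(x, "\\s(?=Chapter)", perl = TRUE)[[1]]

# Take the leading "Chapter <number>" of each piece as its file name.
# Assumption: every piece begins with this pattern.
chapter_names <- regmatches(chapters, regexpr("^Chapter [0-9]+", chapters))

# Assumption: write to a temporary directory; replace with a real folder.
out_dir <- tempdir()
paths <- file.path(out_dir, paste0(chapter_names, ".txt"))

# Write one file per chapter.
for (i in seq_along(chapters)) {
  writeLines(chapters[i], paths[i])
}
```

If some pieces might not start with "Chapter <number>" (a preface before the first chapter, for instance), `regmatches` would return fewer names than there are pieces, so that case would need separate handling.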
