Replacing strings in vector: Every instance replaced by previous found instance

Question

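I'm working with a number of text files loaded into R, and I'm trying to replace every instance (or token) of </SPEAKER> with a specific string found earlier in the same text.

Example: "<BOB> Lots of text here </SPEAKER> <HARRY> More text here by a different speaker </SPEAKER>"

I want to replace each instance of "</SPEAKER>" with the NAME found in the opening tag before it ("<BOB>" and "<HARRY>" here), so that I end up with this:

"<BOB> Lots of text here </BOB> <HARRY> More text here by a different speaker </HARRY>"

I was thinking of looping over the text vector, but since my experience with R is limited, I don't know how to approach this.

If anyone has suggestions on how to do this, possibly even outside R using Notepad++ or another text/tag editor, I would be very grateful.

Thanks!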

Answer 1

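Match

<,
word characters (capturing them in capture group 1),
>,
the shortest string (capturing it in capture group 2) up to
</SPEAKER>

and then replace it with

<,
capture group 1,
>,
capture group 2,
</ followed by
capture group 1 and
>

This gives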

x <- "<BOB> Lots of text here </SPEAKER> <HARRY> More text here by a different speaker </SPEAKER>"

# Backslashes must be doubled inside R string literals
gsub("<(\\w+)>(.*?)</SPEAKER>", "<\\1>\\2</\\1>", x)
## [1] "<BOB> Lots of text here </BOB> <HARRY> More text here by a different speaker </HARRY>"
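
Since the question mentions many text files, here is a minimal sketch of the same substitution applied to a whole character vector at once. The helper name close_speaker_tags and the sample texts are made up for illustration. It assumes perl = TRUE with the (?s) flag so that a speaker's text may span several lines; gsub is vectorized, so no explicit loop is needed.

```r
# Hypothetical helper: replaces each </SPEAKER> with a closing tag named
# after the nearest preceding opening tag.
close_speaker_tags <- function(x) {
  # (?s) lets "." match newlines under perl = TRUE
  gsub("(?s)<(\\w+)>(.*?)</SPEAKER>", "<\\1>\\2</\\1>", x, perl = TRUE)
}

# Made-up sample input; the second element contains a line break
texts <- c(
  "<BOB> Hi </SPEAKER> <HARRY> Hello </SPEAKER>",
  "<ANNA> First line\nsecond line </SPEAKER>"
)

fixed <- close_speaker_tags(texts)
print(fixed)
```

If each file was read with readLines, it is probably easiest to collapse its lines into one string first, e.g. paste(lines, collapse = "\n"), so that an opening tag and its </SPEAKER> end up in the same element.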
