Finding longest text fragment in string array with conditions

Question

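Good evening. I have a text file that contains multiple lines of text with separators. I need to find the longest text fragment in which the last letter of each word is the first letter of the word that follows it. The fragment can continue across several lines, not just one.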

For example:

We have this string array:

Hello, obvious smile eager ruler.
Rave, eyes, random.

From these two lines, our text fragment will be:

Hello, obvious smile eager ruler.
Rave, eyes

Our text fragment ends with the word "eyes" because "random" does not start with "s".

The next two lines in our txt file:

Johnny, you use.
Eye eager sun.

From these two lines, our text fragment will be:

Johnny, you use.
Eye eager

Our text fragment ends with the word "eager" because "sun" does not start with "r".

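So our input file (txt) contains multiple lines of text with separators, and I need to find the longest text fragment in the whole text. The fragment contains both words and separators.

I don't even know where to start. I guess I will have to use functions like String.Length, Substring and String.Split, and maybe Regex would come in handy, but I'm not very familiar with Regex and its functions yet.

I tried to explain it as clearly as I could. English isn't my native language, so it's a bit difficult.

My question is: which algorithm should I use to split my text into separate strings, where each string contains one word and the separators after that word?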

Answer 1

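Let's create the algorithm:

1. Read the file line by line using a while loop.
2. Split the line using the Split method with a comma as the separator.
3. Iterate through the resulting array with a for loop, comparing the last character of arr[i] with the first character of arr[i+1].
4. After the comparison, count the words.
5. Compare the current length with the previous length and keep the longer text fragment.
6. Go to step 2 and repeat.
7. Print the text fragment.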
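The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions that are not in the answer: the class name ChainFinder and its methods are made up, spaces and common punctuation are treated as separators in addition to the comma (the sample words are separated by spaces), the comparison ignores case so that "ruler" links to "Rave", and "longest" is measured in words. The chain is carried over from one line to the next because the fragment may span several lines.

```csharp
using System;
using System.Collections.Generic;

public static class ChainFinder
{
    // Separator set assumed for illustration; adjust it to the real input.
    private static readonly char[] Separators = { ' ', ',', '.', ';', '!', '?', '\t' };

    // Returns the words of the longest run in which each word starts with
    // the last letter of the word before it. A run may span several lines.
    public static List<string> LongestChain(IEnumerable<string> lines)
    {
        var best = new List<string>();
        var current = new List<string>();

        foreach (string line in lines)
        {
            string[] words = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                // Chain broken: start a new chain at this word.
                if (current.Count > 0 && !Links(current[current.Count - 1], word))
                {
                    current.Clear();
                }
                current.Add(word);

                // Remember the longest chain seen so far (by word count).
                if (current.Count > best.Count)
                {
                    best = new List<string>(current);
                }
            }
        }
        return best;
    }

    // True when the last letter of 'previous' equals the first letter of 'next'.
    private static bool Links(string previous, string next)
    {
        return char.ToLowerInvariant(previous[previous.Length - 1])
            == char.ToLowerInvariant(next[0]);
    }
}
```

For a real file, the lines could come from File.ReadLines(path) in System.IO. Because the lines are split into words, this version returns only the words and loses the original separators.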

Answer 2

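You need to do the following:

Split the text into individual words.
Check whether the last letter of each word matches the first letter of the next word.
If they do not match, go back to the original text and take the initial text fragment that comes before the word where the mismatch occurs.

One way to do this is as follows: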

using System;
using System.Text.RegularExpressions;

String text = "Johnny, you use.\nEye eager sun.";
String newText = text;
char? lastLetter = null; // No previous word yet

// Walk through the words in order; each Match keeps the word's position in the original text.
// Note: this only finds the fragment that starts at the first word.
foreach (Match word in Regex.Matches(text, @"\p{L}+"))
{
    // Case-insensitive check that this word starts with the last letter of the previous word
    if (lastLetter == null || char.ToLowerInvariant(word.Value[0]) == lastLetter.Value)
    {
        lastLetter = char.ToLowerInvariant(word.Value[word.Length - 1]);
    }
    else
    {
        // Cut the original text just before the word where the inconsistency happens.
        // Using the match index avoids splitting on an earlier occurrence of the same letters.
        newText = text.Substring(0, word.Index).TrimEnd();
        break;
    }
}

Console.WriteLine(text);
Console.WriteLine(newText);
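
The snippet above stops at the first mismatch, so it only returns the fragment that begins at the first word. A possible extension that looks for the longest fragment anywhere in the text, and keeps the original separators and line breaks, is sketched below. The names FragmentFinder and LongestFragment are hypothetical, a word is assumed to be a run of letters, the comparison ignores case, and "longest" is measured in characters here; none of this comes from the question or the answers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FragmentFinder
{
    // Returns the longest slice of the original text (separators and line
    // breaks included) in which every word starts with the last letter of
    // the previous word.
    public static string LongestFragment(string text)
    {
        // Assumption: a word is a run of letters; everything else separates words.
        MatchCollection words = Regex.Matches(text, @"\p{L}+");

        int bestStart = 0, bestLength = 0;
        int runStart = 0;          // index in text where the current run begins
        char previousLast = '\0';  // last letter of the previous word

        for (int i = 0; i < words.Count; i++)
        {
            Match word = words[i];
            char first = char.ToLowerInvariant(word.Value[0]);

            // Chain broken (or very first word): restart the run at this word.
            if (i == 0 || first != previousLast)
            {
                runStart = word.Index;
            }
            previousLast = char.ToLowerInvariant(word.Value[word.Length - 1]);

            // The run ends at the end of this word; trailing separators are excluded.
            int runLength = word.Index + word.Length - runStart;
            if (runLength > bestLength)
            {
                bestStart = runStart;
                bestLength = runLength;
            }
        }
        return text.Substring(bestStart, bestLength);
    }
}
```

Calling FragmentFinder.LongestFragment(File.ReadAllText(path)) would apply this to a whole file. Working with match positions instead of Split is a design choice: the fragment is cut directly out of the original text, so punctuation, casing and line breaks survive unchanged.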
