Output only lines with duplicate words

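I'm trying to take a list of lines and have PHP output only the lines that contain the same word (a variable) twice. It should match both the singular and plural forms of the word.

Example list of lines: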

This is a best website of all the websites out there

This is a great website

Here is a website I found while looking for websites

Website is a cool new word

I would put these lines into a text box, and the script would output:

This is a best website of all the websites out there

Here is a website I found while looking for websites


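There is no need to display any counts, only the original lines that contain the word twice.

I'm fairly good at manipulating lines, but I've looked everywhere for an answer and there doesn't seem to be one.

For testing purposes I didn't use something like $text = $_POST['text']; and instead stored the text in a variable. Also, the class I use to pluralize the words comes from here.

Note: I rolled back the answer so that it addresses exactly the question; the previous version, which tried to address the comments, has been moved here.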

<?php    

$text = "This is a best website of all the websites out there
    This is a great website
    Here is a website I found while looking for websites
    Website is a cool new word";
// helps us pluralize all words, so we can check the duplicates 
include('class.php'); 

// loop into each line one by one
foreach(explode("\n", $text) as $line)
{
        // remove special characters
        $tline = preg_replace('/[^A-Za-z0-9\-\s]/', '', $line);

        // create a list of words from current line, skipping empty tokens
        // left by leading or trailing whitespace
        $words_list = preg_split('/\s+/', strtolower($tline), -1, PREG_SPLIT_NO_EMPTY);

        // convert all singular words to plural
        $w = [];
        foreach($words_list as $word)
        {
                $w[] = Inflect::pluralize($word);
        }

         // if this line has more words than unique words,
         // it contains duplicates, so echo the line out
        if( count($w) > count(array_unique($w)) )
                echo $line . '<br>';

        // empty the array for next line
        $w = [];
}

The output for the given text is:

This is a best website of all the websites out there
Here is a website I found while looking for websites

However, the correctness of the code really depends on how our pluralize method works.
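To illustrate that dependence, here is a small sketch with two hypothetical stand-ins for the pluralize method. Neither is the Inflect class used above, and the names naive_singular, careful_singular and has_duplicates are made up for this example. The first stand-in strips any trailing "s", so "is" collapses to "i" and collides with the word "I"; the second only touches longer words.

```php
<?php
// Hypothetical normalizers for illustration only; not the Inflect class.

// Strips a single trailing "s" from any word, so "is" becomes "i".
function naive_singular(string $word): string
{
    return preg_replace('/s$/', '', $word);
}

// Strips a trailing "s" only from words longer than three characters
// that do not end in "ss" (so "is" and "class" are left alone).
function careful_singular(string $word): string
{
    if (strlen($word) > 3 && substr($word, -1) === 's' && substr($word, -2) !== 'ss') {
        return substr($word, 0, -1);
    }
    return $word;
}

// Returns true when two words of the line normalize to the same form.
function has_duplicates(string $line, callable $normalize): bool
{
    $words = preg_split('/\s+/', strtolower(trim($line)), -1, PREG_SPLIT_NO_EMPTY);
    $normalized = array_map($normalize, $words);
    return count($normalized) > count(array_unique($normalized));
}

$line = 'Here is what I found';
var_dump(has_duplicates($line, 'naive_singular'));   // bool(true): "is" and "I" both become "i"
var_dump(has_duplicates($line, 'careful_singular')); // bool(false)
```

The same line is reported as a duplicate or not depending only on the normalizer, so it is worth testing whichever inflection class you use against short words like these.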


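How it works

First, I loop over the provided text line by line. On each iteration I build a list of the words in that line, and then convert all singular words to plural (or plural to singular, it doesn't really matter which). Now I have a list of words that are all in plural form, so I can easily check whether they are all unique. If the number of words in the line is greater than the number of unique words, the line contains duplicate words, so I print it.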
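The steps above can also be sketched as a self-contained function that runs without class.php. This is only a sketch under stated assumptions: normalize_word is a hypothetical, deliberately simple substitute for Inflect::pluralize (it maps plain "-s" plurals to the singular instead), and lines_with_duplicates is a made-up name.

```php
<?php
// Hypothetical stand-in for Inflect::pluralize(): reduces simple "-s"
// plurals to the singular. Irregular plurals are not handled.
function normalize_word(string $word): string
{
    if (strlen($word) > 3 && substr($word, -1) === 's' && substr($word, -2) !== 'ss') {
        return substr($word, 0, -1);
    }
    return $word;
}

// Returns the lines of $text in which some word occurs at least twice
// after normalization.
function lines_with_duplicates(string $text): array
{
    $matches = [];
    // \R matches any line ending, including the \r\n sent by a textarea
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // remove special characters, then split into lowercase words
        $clean = preg_replace('/[^A-Za-z0-9\-\s]/', '', $line);
        $words = preg_split('/\s+/', strtolower($clean), -1, PREG_SPLIT_NO_EMPTY);
        $normalized = array_map('normalize_word', $words);
        // more words than unique words means at least one duplicate
        if (count($normalized) > count(array_unique($normalized))) {
            $matches[] = $line;
        }
    }
    return $matches;
}

$text = "This is a best website of all the websites out there
This is a great website
Here is a website I found while looking for websites
Website is a cool new word";

print_r(lines_with_duplicates($text));
```

Returning an array instead of echoing keeps the check separate from the output, so the caller can decide whether to join the lines with '<br>' for a page or with "\n" for plain text.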