How to format I/O data from a script

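I am using a script to exclude one list of words from another list of keywords. I would like to change the format of the output. (I found the script on this site and made some modifications.)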

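Example:

Phrase from the result: my word

I want to add quotes: "my word"

I was thinking I should put the result in new-file.txt and then rewrite it, but I don't know how to capture the result. Please give me some hints. This is my first script :)

Here is the code: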

<?php
    $myfile = fopen("newfile1.txt", "w") or die("Unable to open file!");
    //    Open a file to write the changes - test
    $file = file_get_contents("test-action-write-a-doc-small.txt");
    //  In small.txt there are words that will be excluded from the big list  
    $searchstrings = file_get_contents("test-action-write-a-doc-full.txt");
    //  From this list the script is excluding the words that are in small.txt      
    $breakstrings = explode(',',$searchstrings);
    foreach ($breakstrings as $values){
      if(!strpos($file, $values)) {
        echo $values." = Not found;\n";
      } 
      else {
        echo $values." = Found; \n";
      }
    }
    echo "<h1>Outcome:</h1>";  
    foreach ($breakstrings as $values){
      if(!strpos($file, $values)) {
        echo $values."\n";
      } 
    }
    fwrite($myfile, $values); //    write the result in newfile1.txt - test

    //    a loop is missing?

    fclose($myfile); //    close newfile1.txt - test
?>   

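There is also a small bug in the script. It works fine, but before entering the word lists in test-action-write-a-doc-full.txt and test-action-write-a-doc-small.txt, I have to put a line break on the first line, otherwise it does not find the first word.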

Example:

Words in test-action-write-a-doc-small.txt:

pick, lol, file, cool,

Words in test-action-write-a-doc-full.txt:

pick, bad, computer, lol, break, file.

Result:

Pick = Not found -- here is the mistake.

This happens if I don't put a line break on the first line of the .txt files.

lol = Found

file = Found

Thanks in advance for your help! :)

You could collect the accepted words in an array, then glue all those array elements together into one text, and write that to the file. Like this:

echo "<h1>Outcome:</h1>";  
// Build an array with accepted words
$keepWords = array();
foreach ($breakstrings as $values){
  // remove white space surrounding word
  $values = trim($values);
  // compare with false, and skip empty strings
  if ($values !== "" and false === strpos($file, $values)) {
    // Add word to end of array, you can add quotes if you want
    $keepWords[] = '"' . $values . '"';
  } 
}
// Glue all words together with commas
$keepText = implode(",", $keepWords);
// Write that to file
fwrite($myfile, $keepText);

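Note that you should not write !strpos(..), but false === strpos(..), as explained in the docs. strpos returns the offset of the match, and a match at the very start of the string has offset 0, which !strpos(..) treats the same as "not found".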
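To make the difference concrete, here is a minimal sketch. The sample string is made up to resemble the small.txt contents from the question; it is not read from the real file. It may also be relevant to the first-word symptom described in the question, since a leading line break moves the first word away from offset 0.

```php
<?php
// Hypothetical sample data, mirroring the question's small.txt contents.
$file = "pick, lol, file, cool,";

// strpos() returns the integer offset of the match, or false if there is none.
$pos     = strpos($file, "pick"); // match at the very start of the string
$missing = strpos($file, "bad");  // no match at all

var_dump($pos);     // int(0)
var_dump($missing); // bool(false)

// Loose negation cannot tell offset 0 apart from "no match".
$looseSaysNotFound  = !$pos;            // true, which is the wrong conclusion
$strictSaysNotFound = ($pos === false); // false, which is correct

var_dump($looseSaysNotFound, $strictSaysNotFound);
```

With the strict comparison, "pick" is reported as found even though it sits at the start of the string.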

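Also note that this way of searching in $file may produce unexpected results. For instance, if you have "misery" in your $file string, then the word "is" (if it is separated by commas in the original file) will be rejected, because it occurs inside $file. You may want to review this.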
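One possible way around the substring problem is to compare whole words instead of searching the raw string. The following is only a sketch under assumptions: splitWords is a hypothetical helper that is not part of the original script, and the two sample strings stand in for the file contents.

```php
<?php
// Hypothetical helper (not part of the original script): split a
// comma-separated string into trimmed, non-empty words.
function splitWords(string $text): array {
    $words = array_map('trim', explode(',', $text));
    return array_values(array_filter($words, function ($w) {
        return $w !== '';
    }));
}

// Made-up sample data standing in for small.txt and full.txt.
$small = "misery, lol";
$full  = "is, lol, bad";

$excluded  = splitWords($small);
$keepWords = array();
foreach (splitWords($full) as $word) {
    // Strict whole-word comparison: "is" does not match "misery".
    if (!in_array($word, $excluded, true)) {
        $keepWords[] = $word;
    }
}

echo implode(",", $keepWords), "\n"; // is,bad
```

This compares words exactly, so differences in letter case or trailing punctuation (such as the final "file." in the question's example) would still count as different words and would need separate handling.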

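About the second question

The fact that it does not work without first adding a line break to your file makes me think it is related to the Byte-Order Mark (BOM) that appears at the beginning of many UTF-8 encoded files. This problem and possible solutions have been discussed in several places.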

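If this is indeed the problem, there are two solutions I would suggest:

Use your text editor to save the file as UTF-8, but without a BOM. For instance, Notepad++ offers this in its Encoding menu.

Or, add this to your code: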

function removeBOM($str = "") {
    if (substr($str, 0,3) == pack("CCC",0xef,0xbb,0xbf)) {
        $str = substr($str, 3);
    }
    return $str;
}

Then wrap all your file_get_contents calls with that function, like this:

$file = removeBOM(file_get_contents("test-action-write-a-doc-small.txt"));
//  In small.txt there are words that will be excluded from the big list
$searchstrings = removeBOM(file_get_contents("test-action-write-a-doc-full.txt"));
//  From this list the script is excluding the words that are in small.txt

This will strip those funny bytes from the start of the string read from the file.
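As a quick check of what the BOM does to the first word, here is a small sketch. The removeBOM function is repeated from above so the snippet runs on its own, and the input string is simulated for illustration rather than read from a real file.

```php
<?php
// Same function as above, repeated so this snippet is self-contained.
function removeBOM($str = "") {
    if (substr($str, 0, 3) == pack("CCC", 0xef, 0xbb, 0xbf)) {
        $str = substr($str, 3);
    }
    return $str;
}

// Simulated file contents (an assumption for illustration): a UTF-8 BOM
// followed by the word list.
$raw = "\xEF\xBB\xBF" . "pick, bad, computer";

// Without stripping, the invisible BOM bytes stay glued to the first word.
$first      = explode(',', $raw)[0];
$firstClean = explode(',', removeBOM($raw))[0];

var_dump(strlen($first));         // int(7): three BOM bytes plus "pick"
var_dump($firstClean === "pick"); // bool(true)
```

A string without a BOM passes through removeBOM unchanged, so wrapping every file_get_contents call is harmless for files saved without one.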