How to format I/O data from a script

Question

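I am using a script to exclude a list of words from another list of keywords. I would like to change the format of the output. (I found the script on this site and made some modifications.)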

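Example:

Phrase in the result: my word

I want to add quotes: "my word"

I was thinking I should put the result in new-file.txt and then rewrite it, but I don't know how to capture the result. Please give me some hints. This is my first script :)

The code is as follows: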

<?php
    $myfile = fopen("newfile1.txt", "w") or die("Unable to open file!");
    //    Open a file to write the changes - test
    $file = file_get_contents("test-action-write-a-doc-small.txt");
    //  In small.txt there are words that will be excluded from the big list  
    $searchstrings = file_get_contents("test-action-write-a-doc-full.txt");
    //  From this list the script is excluding the words that are in small.txt      
    $breakstrings = explode(',',$searchstrings);
    foreach ($breakstrings as $values){
      if(!strpos($file, $values)) {
        echo $values." = Not found;\n";
      } 
      else {
        echo $values." = Found; \n";
      }
    }
    echo "<h1>Outcome:</h1>";  
    foreach ($breakstrings as $values){
      if(!strpos($file, $values)) {
        echo $values."\n";
      } 
    }
    fwrite($myfile, $values); //    write the result in newfile1.txt - test

    //    a loop is missing?

    fclose($myfile); //    close newfile1.txt - test
?>

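There is also a small bug in the script. It works fine, but before entering the word lists in test-action-write-a-doc-full.txt and test-action-write-a-doc-small.txt, I have to put a line break on the first line, otherwise it does not find the first word.

Example:

Words in test-action-write-a-doc-small.txt: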

pick, lol, file, cool,

Words in test-action-write-a-doc-full.txt:

pick, bad, computer, lol, break, file.

Result:

Pick = Not found -- here is the mistake.

This happens if I do not put a line break on the first line of the .txt files.

lol = Found

file = Found

Thanks in advance for your help! :)

Answer 1

You can collect the accepted words in an array, then glue all of those array elements together into one text, and write that to the file. Like this:

echo "<h1>Outcome:</h1>";  
// Build an array with accepted words
$keepWords = array();
foreach ($breakstrings as $values){
  // remove white space surrounding word
  $values = trim($values);
  // compare with false, and skip empty strings
  if ($values !== "" and false === strpos($file, $values)) {
    // Add word to end of array, you can add quotes if you want
    $keepWords[] = '"' . $values . '"';
  } 
}
// Glue all words together with commas
$keepText = implode(",", $keepWords);
// Write that to file
fwrite($myfile, $keepText);

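Note that you should not write !strpos(..) but false === strpos(..), as explained in the docs. strpos returns 0 when the match is at the very start of the string, and !0 evaluates to true, so a word found at position 0 would be treated as not found.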
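To illustrate why the strict comparison matters, here is a minimal sketch. The sample string and the variable names are made up for illustration; only strpos itself is taken from the code above.

```php
<?php
// Hypothetical sample data: "pick" sits at the very start of the string.
$haystack = "pick, lol, file, cool,";

// strpos returns the integer 0 for a match at offset 0 ...
$pos = strpos($haystack, "pick");
var_dump($pos);             // int(0)

// ... and !0 evaluates to true, so the loose test wrongly reports "not found".
$looseNotFound = !$pos;

// The strict test only treats a real false as "not found".
$strictNotFound = ($pos === false);

var_dump($looseNotFound);   // bool(true)
var_dump($strictNotFound);  // bool(false)
```

A word located at the very beginning of $file sits at offset 0, so this pitfall may also contribute to a "first word not found" symptom.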

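Also note that this way of searching in $file can give unexpected results. For example, if your $file string contains "misery", then the word "is" (if it is separated by commas in the original file) will be rejected, because it occurs inside $file. You might want to look into this.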
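One possible way around the substring problem is to compare whole words instead of searching inside one long string. The sketch below assumes both files hold comma-separated words; splitWords is a hypothetical helper and the sample data is invented, neither is part of the original script.

```php
<?php
// Hypothetical helper (not part of the original script): split a
// comma-separated string into trimmed, non-empty words.
function splitWords(string $text): array {
    $words = array_map('trim', explode(',', $text));
    // Drop empty entries, e.g. the one produced by a trailing comma.
    return array_values(array_filter($words, function ($w) {
        return $w !== '';
    }));
}

// Sample data for illustration only.
$excluded = splitWords("pick, lol, misery, cool,");
$full     = splitWords("pick, bad, computer, lol, is, file");

// array_diff keeps the values of $full that are not present in $excluded,
// comparing whole strings rather than substrings.
$keepWords = array_values(array_diff($full, $excluded));

// Quote each kept word and join the result with commas.
$keepText = implode(",", array_map(function ($w) {
    return '"' . $w . '"';
}, $keepWords));

echo $keepText, "\n";   // "bad","computer","is","file"
```

Here "is" is kept even though "misery" is on the exclusion list, because only exact matches count. The comparison is case-sensitive, so "Pick" and "pick" would be treated as different words unless both lists are normalized first.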

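Concerning the second problem

The fact that it does not work without first adding a line break in your file makes me think it is related to the Byte-Order Mark (BOM) that appears at the beginning of many UTF-8 encoded files. The problem and possible solutions are discussed here and elsewhere.

If that is indeed the issue, I would suggest two solutions: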

Use a text editor to save the file as UTF-8 without a BOM. For example, Notepad++ offers this option in its Encoding menu.

Alternatively, add this to your code:

function removeBOM($str = "") {
    if (substr($str, 0,3) == pack("CCC",0xef,0xbb,0xbf)) {
        $str = substr($str, 3);
    }
    return $str;
}

Then wrap all file_get_contents calls with that function, like this:

$file = removeBOM(file_get_contents("test-action-write-a-doc-small.txt"));
//  In small.txt there are words that will be excluded from the big list
$searchstrings = removeBOM(file_get_contents("test-action-write-a-doc-full.txt"));
//  From this list the script is excluding the words that are in small.txt

This strips those funny bytes from the start of the strings read from the files.
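To check whether a BOM is actually present before changing anything, a quick diagnostic along these lines could help. hasBOM is a hypothetical helper, and the strings only simulate what file_get_contents would return for a file saved with or without a BOM.

```php
<?php
// Hypothetical helper: report whether a string starts with the UTF-8 BOM.
function hasBOM(string $str): bool {
    return strncmp($str, "\xEF\xBB\xBF", 3) === 0;
}

// Simulated file contents; in the real script these would come
// from file_get_contents().
$withBom    = "\xEF\xBB\xBF" . "pick, lol, file, cool,";
$withoutBom = "pick, lol, file, cool,";

var_dump(hasBOM($withBom));     // bool(true)
var_dump(hasBOM($withoutBom));  // bool(false)

// Printing the first three bytes in hex makes the BOM visible.
echo bin2hex(substr($withBom, 0, 3)), "\n";   // efbbbf
```

If the check returns false for both files, the BOM is probably not the cause and the strpos comparison is the more likely culprit.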

php

file-get-contents