Searching a file for multiple strings and outputting the data

How can I search a .tsv file for multiple matches to a string and export them to a database?

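What I am trying to do is search a large file called mdata.tsv (1.5 million rows) for the strings given in an array, and then output column data from the matching rows.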

This is the code I have so far, and where I am stuck:

<?php 

$file = fopen("mdata.tsv","r"); //open file
$movies = glob('./uploads/Videos/*/*/*/*.mp4', GLOB_BRACE); //Find all the movies
$movID = array(); //Array for movies IDs
//Get XML and add the IDs to $movID()
foreach ($movies as $movie){ 
    $pos = strrpos($movie, '/');
    $xml = simplexml_load_file((substr($movie, 0, $pos + 1) .'movie.xml'));
    array_push($movID, $xml->id);

}

//Loop through the TSV rows and search for the $tmdbID then print out the movies category.
foreach ($movID as $tmdbID) { 
    while(($row = fgetcsv($file, 0, "\t")) !== FALSE) {
        fseek($file,0);
        $myString = $row[0];

        $b = strstr( $myString, $tmdbID );
        //Dump out the row for the sake of clarity.
        //var_dump($row);
        $myString = $row[0];
        if ($b == $tmdbID){
            echo 'Match ' . $row[0] .' '. $row[8];
        }       // Displays movie ID and category
    }
    }

fclose($file);

?>

Example of the tsv file:

tt0043936   movie   The Lawton Story    The Lawton Story    0   1949    \N  \N  Drama,Family
tt0043937   short   The Prize Pest  The Prize Pest  0   1951    \N  7   Animation,Comedy,Family
tt0043938   movie   The Prowler The Prowler 0   1951    \N  92  Drama,Film-Noir,Thriller
tt0043939   movie   Przhevalsky Przhevalsky 0   1952    \N  \N  Biography,Drama

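It looks like you can simplify this code by using in_array() to check whether the current row's ID is in the list of wanted IDs, instead of using a nested loop. The one change needed for this to work is to make sure you store strings in the $movID array, because $xml->id is a SimpleXMLElement object rather than a plain string.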
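To illustrate why the cast matters, here is a minimal sketch. It assumes a tiny inline XML string as a hypothetical stand-in for one movie.xml file, and it uses the strict mode of in_array() only to make the type difference visible.

```php
<?php
// Hypothetical inline XML standing in for the contents of one movie.xml file.
$xml = simplexml_load_string('<movie><id>tt0043936</id></movie>');

$rawId    = $xml->id;           // a SimpleXMLElement object, not a string
$stringId = (string) $xml->id;  // the element's text content as a plain string

var_dump($rawId instanceof SimpleXMLElement);         // bool(true)
var_dump(in_array('tt0043936', [$rawId], true));      // bool(false): an object is never identical to a string
var_dump(in_array('tt0043936', [$stringId], true));   // bool(true)
```

The answer's code, which follows, applies the same (string) cast when filling $movID.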

$file = fopen("mdata.tsv","r"); //open file
$movies = glob('./uploads/Videos/*/*/*/*.mp4', GLOB_BRACE); //Find all the movies
$movID = array(); //Array for movies IDs
//Get XML and add the IDs to $movID()
foreach ($movies as $movie){
    $pos = strrpos($movie, '/');
    $xml = simplexml_load_file((substr($movie, 0, $pos + 1) .'movie.xml'));
    // Store ID as string
    $movID[] = (string) $xml->id;
}

while(($row = fgetcsv($file, 0, "\t")) !== FALSE) {
    // Displays movie ID and category for rows whose ID is in the list
    if ( in_array($row[0], $movID) ){
        echo 'Match ' . $row[0] .' '. $row[8] . PHP_EOL;
    }
}

fclose($file);
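
With 1.5 million rows, in_array() rescans the whole ID list for every row, which may become slow if the list is long. One possible variation, sketched here under assumptions, is to flip the ID list into array keys and test each row with isset(). The temporary file, the sample rows, and the hard-coded $movID list are hypothetical stand-ins for mdata.tsv and the IDs read from the XML files. The sketch also swaps fgetcsv() for fgets() plus explode(), assuming the file is plain tab-separated text with no quoted fields.

```php
<?php
// Hypothetical stand-ins: a small temporary TSV file and a hard-coded ID list.
$path = tempnam(sys_get_temp_dir(), 'tsv');
file_put_contents($path, implode("\n", [
    "tt0043936\tmovie\tThe Lawton Story\tThe Lawton Story\t0\t1949\t\\N\t\\N\tDrama,Family",
    "tt0043937\tshort\tThe Prize Pest\tThe Prize Pest\t0\t1951\t\\N\t7\tAnimation,Comedy,Family",
    "tt0043938\tmovie\tThe Prowler\tThe Prowler\t0\t1951\t\\N\t92\tDrama,Film-Noir,Thriller",
]) . "\n");
$movID = ['tt0043936', 'tt0043938'];

// Flip the list so each ID becomes an array key; isset() on a key is a hash lookup.
$wanted = array_flip($movID);

$matches = [];
$file = fopen($path, 'r');
while (($line = fgets($file)) !== false) {
    // Split the line on tabs after stripping the trailing line ending.
    $row = explode("\t", rtrim($line, "\r\n"));
    // Keep the row only if its ID is wanted and the category column exists.
    if (isset($wanted[$row[0]]) && isset($row[8])) {
        $matches[$row[0]] = $row[8];
        echo 'Match ' . $row[0] . ' ' . $row[8] . PHP_EOL;
    }
}
fclose($file);
unlink($path); // remove the temporary sample file
```

For the sample rows above this prints a match for tt0043936 and tt0043938 and skips tt0043937. The file is still read only once, as in the in_array() version.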