Import big CSV file to dependent MySQL tables using PHP with duplicate checking

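I am reading a .csv file of roughly 50,000+ rows and building INSERT queries from it. One of the inserted values is dynamic: if the destination already exists, it comes from a SELECT; if not, it comes from an INSERT followed by fetching the inserted ID. This is for testing purposes only, so please ignore the security holes.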

$multiquery = "";
$lines = file($furl);

foreach ($lines as $line_num => $line) {
    if (($line_num + 1) % 1000 == 0) {
        include('connection.php');
    }

    // split() was removed in PHP 7; explode() does the same job for a fixed delimiter.
    $cols = explode(';', $line);

    $originid = 1;
    $dest = $cols[3];
    $cost = (int)$cols[5];

    // === start: consume a lot of connections ===
    $query = "SELECT id FROM dests WHERE name = '$dest'";
    if (!$dests = mysqli_query($link, $query)) {
        die(json_encode(array("errmsg" => "Selecting existing shipdest. Error: ".mysqli_error($link))));
    }
    if (mysqli_num_rows($dests) > 0) {
        $dest = mysqli_fetch_assoc($dests);
    }
    else {
        $query = "INSERT INTO dests (name) VALUES ('$dest')";
        if (!mysqli_query($link, $query)) {
            die(json_encode(array("errmsg" => "Inserting new dest.")));
        }

        // $dest is still a string here, so replace it with an array instead of indexing into it.
        $dest = array('id' => mysqli_insert_id($link));
    }
    // === end: consume a lot of connections ===

    $multiquery .= "INSERT INTO packages (id_origin, id_dest, cost) VALUES ($originid, ".$dest['id'].", ".$cost."); ";

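    if (($line_num + 1) % 1000 == 0) {
        if (!mysqli_multi_query($link, $multiquery)) {
            die(json_encode(array("errmsg" => "Failed at line ".$line_num)));
        }
        // Consume the pending results so the next SELECT does not fail with "Commands out of sync".
        while (mysqli_more_results($link) && mysqli_next_result($link)) {}
        // Start a fresh batch; otherwise every earlier INSERT is sent again.
        $multiquery = "";
    }
}

// Send whatever is left after the last full batch of 1000.
if ($multiquery !== "" && !mysqli_multi_query($link, $multiquery)) {
    die(json_encode(array("errmsg" => "Failed on the final batch. Error: ".mysqli_error($link))));
}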

How can I fold the PHP block (the SELECT/INSERT lookup between the two marker comments) into the id_dest value inside $multiquery?
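One way to cut the round trips in the marked block without restructuring the SQL is to remember the IDs already resolved during the run, since destination names presumably repeat across rows. The following is only a sketch under that assumption: `makeDestResolver` and the fake lookup are hypothetical names, and the closure passed in stands in for the real SELECT/INSERT pair.

```php
<?php
// Hypothetical helper (not part of the original script): caches name => id so
// each distinct destination costs at most one lookup per run.
function makeDestResolver(callable $lookupOrInsert): callable
{
    $cache = array();
    return function (string $name) use (&$cache, $lookupOrInsert): int {
        if (!isset($cache[$name])) {
            // Only reached the first time a name is seen in this run.
            $cache[$name] = (int) $lookupOrInsert($name);
        }
        return $cache[$name];
    };
}

// Stand-in for the real SELECT/INSERT pair, used here only for demonstration.
$calls = 0;
$fakeDb = array();
$resolve = makeDestResolver(function ($name) use (&$calls, &$fakeDb) {
    $calls++;
    if (!isset($fakeDb[$name])) {
        $fakeDb[$name] = count($fakeDb) + 1; // pretend auto-increment ID
    }
    return $fakeDb[$name];
});

// 'Berlin' appears twice but triggers only one lookup.
$ids = array_map($resolve, array('Berlin', 'Paris', 'Berlin'));
```

In the real loop the closure would wrap the existing mysqli_query() calls, so the database is only hit once per distinct name rather than once per row.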

I ended up with this code:

ini_set('max_execution_time', 900); // 15 minutes

$lines = file($furl);

$multiquery = "";
foreach ($lines as $line_num => $line) {
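    $cols = explode(';', $line); // split() was removed in PHP 7
    $dest = $cols[3];

    // INSERT IGNORE only skips duplicates if dests.name has a UNIQUE index.
    $multiquery .= "INSERT IGNORE INTO dests (name) VALUES ('$dest'); ";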

    if (($line_num + 1) % 1000 == 0) {
        include('connection.php');
        if (!mysqli_multi_query($link, $multiquery)) {
            die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
        }
        $multiquery = "";
    }
}
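// Flush the remainder. Skip it when nothing is pending (row count is an exact
// multiple of 1000), because an empty query string makes mysqli_multi_query() fail.
if ($multiquery !== "") {
    include('connection.php');
    if (!mysqli_multi_query($link, $multiquery)) {
        die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
    }
}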

$multiquery = "";
foreach ($lines as $line_num => $line) {
    $cols = explode(';', $line); // split() was removed in PHP 7

    $originid = 1;
    $dest = $cols[3];
    $cost = (int)$cols[5];

    $multiquery .= "INSERT INTO packages (id_origin, id_dest, cost) VALUES ($originid, (SELECT id FROM dests WHERE name = '$dest'), ".$cost."); ";

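    // Send in batches of 1000, as in the first loop, instead of reconnecting for every row.
    if (($line_num + 1) % 1000 == 0) {
        include('connection.php');
        if (!mysqli_multi_query($link, $multiquery)) {
            die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
        }
        $multiquery = "";
    }
}
// Flush the remainder, if any.
if ($multiquery !== "") {
    include('connection.php');
    if (!mysqli_multi_query($link, $multiquery)) {
        die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
    }
}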
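The same subquery idea can also be expressed as one parameterised multi-row INSERT per batch instead of many concatenated statements, which would sidestep both the quoting problem and the multi-query result handling. This is a rough sketch, not a drop-in replacement: `buildPackagesInsert` is a hypothetical helper, and it assumes dests.name is unique so the scalar subquery returns at most one row.

```php
<?php
// Hypothetical helper: turns a batch of [destName, cost] pairs into one
// parameterised multi-row INSERT plus its flat parameter list.
function buildPackagesInsert(int $originId, array $rows): array
{
    if (count($rows) === 0) {
        throw new InvalidArgumentException('Batch must contain at least one row');
    }
    // One tuple per CSV row; id_dest is resolved by a scalar subquery.
    $tuple = '(?, (SELECT id FROM dests WHERE name = ?), ?)';
    $sql = 'INSERT INTO packages (id_origin, id_dest, cost) VALUES '
         . implode(', ', array_fill(0, count($rows), $tuple));

    $params = array();
    foreach ($rows as $row) {
        list($destName, $cost) = $row;
        // Order matches the placeholders: id_origin, name, cost.
        array_push($params, $originId, $destName, (int) $cost);
    }
    return array($sql, $params);
}

// Example batch of two rows, as they might come out of the CSV.
list($sql, $params) = buildPackagesInsert(1, array(
    array('Berlin', '10'),
    array('Paris', '25'),
));
```

The resulting string could then be passed to mysqli_prepare() and the values bound with mysqli_stmt_bind_param(), using the type string "isi" repeated once per row.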