Import big CSV file to dependent MySQL tables using PHP with duplicate checking
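I am reading a .csv file of 50,000+ lines and building INSERT queries from it. One value in each INSERT is dynamic: if the destination already exists, its id comes from a SELECT; if not, the destination is inserted first and the id comes from the inserted row. This is for testing purposes only, so please ignore the security holes.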
$multiquery = "";
$lines = file($furl);
foreach ($lines as $line_num => $line) {
    if (($line_num + 1) % 1000 == 0) {
        include('connection.php');
    }
    $cols = explode(';', $line); // split() was removed in PHP 7
    $originid = 1;
    $dest = $cols[3];
    $cost = (int)$cols[5];
    // === start: consume a lot of connections ===
    $query = "SELECT id FROM dests WHERE name = '$dest'";
    if (!$dests = mysqli_query($link, $query)) {
        die(json_encode(array("errmsg" => "Selecting existing shipdest. Error: ".mysqli_error($link))));
    }
    if (mysqli_num_rows($dests) > 0) {
        $dest = mysqli_fetch_assoc($dests);
    }
    else {
        $query = "INSERT INTO dests (name) VALUES ('$dest')";
        if (!mysqli_query($link, $query)) {
            die(json_encode(array("errmsg" => "Inserting new dest.")));
        }
        // $dest is still a string here, so replace it with an array before reading ['id']
        $dest = array('id' => mysqli_insert_id($link));
    }
    // === end: consume a lot of connections ===
    $multiquery .= "INSERT INTO packages (id_origin, id_dest, cost) VALUES ($originid, ".$dest['id'].", ".$cost."); ";
    if (($line_num + 1) % 1000 == 0 && !mysqli_multi_query($link, $multiquery)) {
        die(json_encode(array("errmsg" => "Failed at line ".$line_num)));
    }
}
How can I fold the PHP lookup block into the id_dest value inside $multiquery?
I ended up with this code:
ini_set('max_execution_time', 900); // 15 minutes
$lines = file($furl);

// Pass 1: insert every destination name.
// INSERT IGNORE only skips duplicates if dests.name has a unique index.
$multiquery = "";
foreach ($lines as $line_num => $line) {
    $cols = explode(';', $line); // split() was removed in PHP 7
    $dest = $cols[3];
    $multiquery .= "INSERT IGNORE INTO dests (name) VALUES ('$dest'); ";
    if (($line_num + 1) % 1000 == 0) {
        include('connection.php');
        if (!mysqli_multi_query($link, $multiquery)) {
            die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
        }
        $multiquery = "";
    }
}
// Flush the rest of the last batch; skip it when empty, because an empty query fails.
if ($multiquery != "") {
    include('connection.php');
    if (!mysqli_multi_query($link, $multiquery)) {
        die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
    }
}

// Pass 2: insert the packages, resolving id_dest with a subquery.
$multiquery = "";
foreach ($lines as $line_num => $line) {
    $cols = explode(';', $line);
    $originid = 1;
    $dest = $cols[3];
    $cost = (int)$cols[5];
    $multiquery .= "INSERT INTO packages (id_origin, id_dest, cost) VALUES ($originid, (SELECT id FROM dests WHERE name = '$dest'), ".$cost."); ";
    // Send in batches of 1000 lines, as in pass 1, instead of reconnecting on every line.
    if (($line_num + 1) % 1000 == 0) {
        include('connection.php');
        if (!mysqli_multi_query($link, $multiquery)) {
            die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
        }
        $multiquery = "";
    }
}
if ($multiquery != "") {
    include('connection.php');
    if (!mysqli_multi_query($link, $multiquery)) {
        die(json_encode(array("errmsg" => "Failed at line ".$line_num." Error: ".mysqli_error($link))));
    }
}
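
A possible refinement of the first pass, offered only as a sketch: since many CSV lines probably share the same destination, the names could be deduplicated in PHP first and sent as one multi-row INSERT IGNORE per batch rather than one statement per line. The function name buildDestInserts, the $escape callback, and the assumption that dests.name carries a unique index are mine, not part of the code above; the column position (index 3) and the ';' delimiter are taken from it.

```php
<?php
// Hypothetical helper (not from the original post): returns a list of
// multi-row INSERT IGNORE statements covering each distinct destination once.
// Assumes dests.name has a unique index, which INSERT IGNORE relies on.
function buildDestInserts(array $lines, callable $escape, int $batchSize = 1000): array
{
    $names = array();
    foreach ($lines as $line) {
        $cols = explode(';', rtrim($line, "\r\n"));
        if (isset($cols[3]) && $cols[3] !== '') {
            $names[$cols[3]] = true; // array keys deduplicate the names
        }
    }

    $statements = array();
    foreach (array_chunk(array_keys($names), $batchSize) as $chunk) {
        $values = array_map(function ($name) use ($escape) {
            // Cast back to string: PHP turns numeric-looking keys into integers.
            return "('" . $escape((string) $name) . "')";
        }, $chunk);
        $statements[] = 'INSERT IGNORE INTO dests (name) VALUES ' . implode(', ', $values);
    }
    return $statements;
}
```

Each returned string is a single statement, so it could be run with mysqli_query($link, $sql) on one open connection, passing a closure around mysqli_real_escape_string($link, ...) as $escape. Whether this is actually faster than the multi-query version would need measuring on the real data.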