为 MySQL 构建批量查询,每插入 1000 个项目
Build a batch query for MySQL insert each 1000 items
我需要在 MySQL/MariaDB 中执行批量插入,但由于数据是动态的,我需要构建正确的 SQL 查询。几个步骤:
- 我应该在 table 中查找当前行是否存在 - 这是循环中的第一个 SELECT
- 现在我有 1454 但必须稍后插入大约 150k,批量查询比循环中每个项目 150k INSERT 更好
- 如果记录已经存在,我应该更新它,如果不存在,那么我应该插入,我只是不关心更新,你看到的代码只用于插入
所以这就是我正在做的事情:
// Get values from Csv file as an array of values
$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";
$i = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";
// Processing on each row of data
foreach ($data as $row) {
$sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
echo "DEBUG: ", $sql, "\n";
$rs = $conn->query($sql);
if ($rs === false) {
echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
} else {
$rows_returned = $rs->num_rows;
$veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
$first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
$last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
$email = "'".$conn->real_escape_string($row['Email'])."'";
$username = "'".$conn->real_escape_string($row['Username'])."'";
$display_name = "'".$conn->real_escape_string(
ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
)."'";
// VALUES should be added only if row doesn't exists
if ($rows_returned === 0) {
// VALUES should be append until they reach 1000
while ($i % 1000 !== 0) {
$sqlInsert .= "VALUES($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW())";
++$i;;
}
// QUERY should be output to console to see if it's right or something is wrong
echo "DEBUG: ", $sqlInsert, "\n";
// QUERY should be executed if there are 1000 VALUES ready to add as a batch
/*$rs = $conn->query($sqlInsert);
if ($rs === false) {
echo 'Wrong SQL: '.$sqlInsert.' Error: '.$conn->error, E_USER_ERROR;*/
}
} else {
// UPDATE
echo "UPDATE";
}
}
}
但是这行代码:echo "DEBUG: ", $sql, "\n";
没有向控制台输出任何内容。我一定做错了什么,但我找不到什么。任何人都可以帮助我构建正确的批处理查询并每附加 1000 个值执行一次吗?
正确的输出应该是:
DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008ReolAAC'
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='005800000039SIWAA2'
....
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES(...), VALUES(...), VALUES(...)
得到的结果:
DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RGg6AAG'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RQ4CAAW'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
.... // until reach 1454 results
table 是空的,所以它永远不会通过 ELSE
条件(更新一个)。
编辑
在答案的帮助下,代码现在是这样的:
$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";
$i = 1;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";
foreach ($data as $row) {
$sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
$rs = $conn->query($sql);
if ($rs === false) {
echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
} else {
$rows_returned = $rs->num_rows;
$veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
$first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
$last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
$email = "'".$conn->real_escape_string($row['Email'])."'";
$username = "'".$conn->real_escape_string($row['Username'])."'";
$display_name = "'".$conn->real_escape_string(
ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
)."'";
if ($rows_returned === 0) {
if ($i % 1000 === 0) {
file_put_contents("output.log", $sqlInsert."\n", FILE_APPEND);
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";
} else {
$sqlInsert .= "($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW()), ";
}
$i++;
} else {
echo "UPDATE";
}
}
}
但仍然有问题,因为:
- 我有第一个空的 INSERT 查询:
INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES
- 我有第二个 INSERT 查询,追加了 1000 个 VALUES(),但其余部分发生了什么?剩下的454个?
可以给我另一个提示吗?帮忙?
你应该有这样的东西:
// Try fetching data from table 1
// If there is no record available, then fetch some data from table 2
// and insert that data inito table 1
你刚刚写了
$sql = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";
// Processing on each row of data
foreach ($data as $row) {
但是从插入没有数据是 selected 和第二...你没有 运行 select,$data
来自哪里?
更新 使用if ($i % 1000 === 0) {
代替while ($i % 1000 !== 0) {
$i = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,...) ";
// Processing on each row of data
foreach ($data as $row) {
$sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
echo "DEBUG: ", $sql, "\n";
$rs = $conn->query($sql);
if ($rs === false) {
echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
} else {
$veeva_rep_id = ...;
$first = ...;
$last = ...;
$email = ...;
// ...
// VALUES should be added only if row doesn't exists
if($rs->num_rows == 0) {
// Insert some data
$i++;
if ($i % 1000 === 0) {
echo "DEBUG: ", $sqlInsert, "\n";
// execSql($sqlInsert);
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,...) "; // reset
} else {
$sqlInsert .= "VALUES($veeva_rep_id,$first,$last,$email,...) ";
}
} else {
echo "UPDATE";
}
}
}
考虑使用 INSERT IGNORE INTO table 检查记录是否已经存在。 How to 'insert if not exists' in MySQL?
如果您还没有这样做,请将 veeva_rep_id 设为主键,这样 INSERT IGNORE 就可以工作了
还检查使用 PDO 处理事务、准备好的语句以及使用 PDO 动态生成查询
PDO Prepared Inserts multiple rows in single query
<?php
$sql = 'INSERT IGNORE INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES ';
$insertQuery = array();
$insertData = array();
/*
assuming the array from the csv is like this
$data = array(
0 => array('name' => 'Robert', 'value' => 'some value'),
1 => array('name' => 'Louise', 'value' => 'another value')
);
*/
foreach ($data as $row) {
$insertQuery[] = '(:veeva_rep_id' . $n . ', :first' . $n . ', :last' . $n . ', :email' . $n . ', :username' . $n . ', :lastLoginAt' . $n . ', :lastSyncAt' . $n . ', :display_name' . $n . ', :rep_type' . $n . ', :avatar_url' . $n . ', :createdAt' . $n . ', :updatedAt' . $n . ')';
$insertData['veeva_rep_id' . $n] = $row['name'];
$insertData['first' . $n] = $row['value'];
$insertData['last' . $n] = $row['name'];
$insertData['email' . $n] = $row['value'];
$insertData['username' . $n] = $row['name'];
$insertData['lastLoginAt' . $n] = $row['value'];
$insertData['lastSyncAt' . $n] = $row['value'];
$insertData['display_name' . $n] = $row['name'];
$insertData['rep_type' . $n] = $row['value'];
$insertData['avatar_url' . $n] = $row['value'];
$insertData['createdAt' . $n] = $row['name'];
$insertData['updatedAt' . $n] = $row['value'];
$n++;
}
$db->beginTransaction();
if (!empty($insertQuery) and count($insertQuery)>1000) {
$sql .= implode(', ', $insertQuery);
$stmt = $db->prepare($sql);
$stmt->execute($insertData);
}
$db->commit();
print $sql . PHP_EOL;
如果有帮助请告诉我。
由于您似乎正在尝试从 CSV 文件加载数据,因此您可能需要考虑使用专门为此目的设计的 LOAD DATA INFILE
功能。
这里是link文档:https://dev.mysql.com/doc/refman/5.6/en/load-data.html
我需要在 MySQL/MariaDB 中执行批量插入,但由于数据是动态的,我需要构建正确的 SQL 查询。几个步骤:
- 我应该在 table 中查找当前行是否存在 - 这是循环中的第一个 SELECT
- 现在我有 1454 但必须稍后插入大约 150k,批量查询比循环中每个项目 150k INSERT 更好
- 如果记录已经存在,我应该更新它,如果不存在,那么我应该插入,我只是不关心更新,你看到的代码只用于插入
所以这就是我正在做的事情:
// Get values from Csv file as an array of values
$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";
$i = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";
// Processing on each row of data
foreach ($data as $row) {
$sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
echo "DEBUG: ", $sql, "\n";
$rs = $conn->query($sql);
if ($rs === false) {
echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
} else {
$rows_returned = $rs->num_rows;
$veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
$first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
$last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
$email = "'".$conn->real_escape_string($row['Email'])."'";
$username = "'".$conn->real_escape_string($row['Username'])."'";
$display_name = "'".$conn->real_escape_string(
ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
)."'";
// VALUES should be added only if row doesn't exists
if ($rows_returned === 0) {
// VALUES should be append until they reach 1000
while ($i % 1000 !== 0) {
$sqlInsert .= "VALUES($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW())";
++$i;;
}
// QUERY should be output to console to see if it's right or something is wrong
echo "DEBUG: ", $sqlInsert, "\n";
// QUERY should be executed if there are 1000 VALUES ready to add as a batch
/*$rs = $conn->query($sqlInsert);
if ($rs === false) {
echo 'Wrong SQL: '.$sqlInsert.' Error: '.$conn->error, E_USER_ERROR;*/
}
} else {
// UPDATE
echo "UPDATE";
}
}
}
但是这行代码:echo "DEBUG: ", $sql, "\n";
没有向控制台输出任何内容。我一定做错了什么,但我找不到什么。任何人都可以帮助我构建正确的批处理查询并每附加 1000 个值执行一次吗?
正确的输出应该是:
DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008ReolAAC'
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='005800000039SIWAA2'
....
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES(...), VALUES(...), VALUES(...)
得到的结果:
DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RGg6AAG'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RQ4CAAW'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
.... // until reach 1454 results
table 是空的,所以它永远不会通过 ELSE
条件(更新一个)。
编辑
在答案的帮助下,代码现在是这样的:
$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";
$i = 1;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";
foreach ($data as $row) {
$sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
$rs = $conn->query($sql);
if ($rs === false) {
echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
} else {
$rows_returned = $rs->num_rows;
$veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
$first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
$last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
$email = "'".$conn->real_escape_string($row['Email'])."'";
$username = "'".$conn->real_escape_string($row['Username'])."'";
$display_name = "'".$conn->real_escape_string(
ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
)."'";
if ($rows_returned === 0) {
if ($i % 1000 === 0) {
file_put_contents("output.log", $sqlInsert."\n", FILE_APPEND);
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";
} else {
$sqlInsert .= "($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW()), ";
}
$i++;
} else {
echo "UPDATE";
}
}
}
但仍然有问题,因为:
- 我有第一个空的 INSERT 查询:
INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES
- 我有第二个 INSERT 查询,追加了 1000 个 VALUES(),但其余部分发生了什么?剩下的454个?
可以给我另一个提示吗?帮忙?
你应该有这样的东西:
// Try fetching data from table 1
// If there is no record available, then fetch some data from table 2
// and insert that data inito table 1
你刚刚写了
$sql = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";
// Processing on each row of data
foreach ($data as $row) {
但是从插入没有数据是 selected 和第二...你没有 运行 select,$data
来自哪里?
更新 使用if ($i % 1000 === 0) {
代替while ($i % 1000 !== 0) {
$i = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,...) ";
// Processing on each row of data
foreach ($data as $row) {
$sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
echo "DEBUG: ", $sql, "\n";
$rs = $conn->query($sql);
if ($rs === false) {
echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
} else {
$veeva_rep_id = ...;
$first = ...;
$last = ...;
$email = ...;
// ...
// VALUES should be added only if row doesn't exists
if($rs->num_rows == 0) {
// Insert some data
$i++;
if ($i % 1000 === 0) {
echo "DEBUG: ", $sqlInsert, "\n";
// execSql($sqlInsert);
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,...) "; // reset
} else {
$sqlInsert .= "VALUES($veeva_rep_id,$first,$last,$email,...) ";
}
} else {
echo "UPDATE";
}
}
}
考虑使用 INSERT IGNORE INTO table 检查记录是否已经存在。 How to 'insert if not exists' in MySQL? 如果您还没有这样做,请将 veeva_rep_id 设为主键,这样 INSERT IGNORE 就可以工作了
还检查使用 PDO 处理事务、准备好的语句以及使用 PDO 动态生成查询 PDO Prepared Inserts multiple rows in single query
<?php
$sql = 'INSERT IGNORE INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES ';
$insertQuery = array();
$insertData = array();
/*
assuming the array from the csv is like this
$data = array(
0 => array('name' => 'Robert', 'value' => 'some value'),
1 => array('name' => 'Louise', 'value' => 'another value')
);
*/
foreach ($data as $row) {
$insertQuery[] = '(:veeva_rep_id' . $n . ', :first' . $n . ', :last' . $n . ', :email' . $n . ', :username' . $n . ', :lastLoginAt' . $n . ', :lastSyncAt' . $n . ', :display_name' . $n . ', :rep_type' . $n . ', :avatar_url' . $n . ', :createdAt' . $n . ', :updatedAt' . $n . ')';
$insertData['veeva_rep_id' . $n] = $row['name'];
$insertData['first' . $n] = $row['value'];
$insertData['last' . $n] = $row['name'];
$insertData['email' . $n] = $row['value'];
$insertData['username' . $n] = $row['name'];
$insertData['lastLoginAt' . $n] = $row['value'];
$insertData['lastSyncAt' . $n] = $row['value'];
$insertData['display_name' . $n] = $row['name'];
$insertData['rep_type' . $n] = $row['value'];
$insertData['avatar_url' . $n] = $row['value'];
$insertData['createdAt' . $n] = $row['name'];
$insertData['updatedAt' . $n] = $row['value'];
$n++;
}
$db->beginTransaction();
if (!empty($insertQuery) and count($insertQuery)>1000) {
$sql .= implode(', ', $insertQuery);
$stmt = $db->prepare($sql);
$stmt->execute($insertData);
}
$db->commit();
print $sql . PHP_EOL;
如果有帮助请告诉我。
由于您似乎正在尝试从 CSV 文件加载数据,因此您可能需要考虑使用专门为此目的设计的 LOAD DATA INFILE
功能。
这里是link文档:https://dev.mysql.com/doc/refman/5.6/en/load-data.html