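Reformatting a CSV file with PHP
I need to reformat a CSV file exported from one database so that it fits the standards of another database. I have sorted the CSV by the "recipient" field (an email address). What I need is this: if an email address is repeated, the "Concat" value of the later row should be appended to the "Concat" value of the previous row, with "|" as the separator. The result needs to look like this: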
recipient,lastSent,aftersunset,notes,fk_rty_id,confirmed,rty_id,rty_type,EnglishDate,,Concat
" bheller@email.org",1/21/17 5:00,1,,1,1,1,Yahrzeit,1/9/1991,01/09/1991,JOEL E. WEINGARTEN-01/09/1991
" 123456@email.com",6/29/16 5:00,0,,1,1,1,Yahrzeit,6/11/2015,06/11/2015,ANN SCHONBERG-06/11/2015|ALEXANDER SCHONBERG-12/26/2009
1234benn@email.net,3/24/17 5:00,0,,1,1,1,Yahrzeit,3/20/1985,03/20/1985,LEE I HOWARD-03/20/1985|IDA GALES-02/27/1990
Here is my CSV:
recipient,lastSent,aftersunset,notes,fk_rty_id,confirmed,rty_id,rty_type,EnglishDate,,Concat
" bheller@email.org",1/21/17 5:00,1,,1,1,1,Yahrzeit,1/9/1991,01/09/1991,JOEL E. WEINGARTEN-01/09/1991
" 123456@email.com",6/29/16 5:00,0,,1,1,1,Yahrzeit,6/11/2015,06/11/2015,ANN SCHONBERG-06/11/2015
" 123456@email.com",1/6/17 5:00,0,,1,1,1,Yahrzeit,12/26/2009,12/26/2009,ALEXANDER SCHONBERG-12/26/2009
1234benn@email.net,3/24/17 5:00,0,,1,1,1,Yahrzeit,3/20/1985,03/20/1985,LEE I HOWARD-03/20/1985
1234benn@email.net,2/27/17 5:00,0,,1,1,1,Yahrzeit,2/27/1990,02/27/1990,IDA GALES-02/27/1990
Here is the PHP code I have so far:
<?php
$file = fopen("yz-email.csv","r");
while(! feof($file))
{
$data = fgetcsv($file);
$num = count($data);
$concat = $data[22];
if ($concat != $newConcat ) {
/*for ( $c=0; $c<$num;$c++) {
print $data[$c].",";
} */
$newConcat = $concat;
} else {
array_push($data, $newConcat);
}
print "<pre>";
print_r($data);
print "</pre>";
}
fclose($file);
?>
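The simplest approach is to load the whole data set into an array and then write the resulting CSV file. This only becomes a problem when the data set is too large to fit within PHP's memory limit. Here is a sample script that does the job. It assumes the first row is the header.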
<?php
$fp = fopen('yz-email.csv','r');
$hdr = false;
$skip_header = true; // Set to false to keep the header row in result.csv
$data = [];
$contact_index = null; // Will take the last column index, if not set
if ($fp) {
    // fgetcsv() returns false at end of file, so test its result instead of feof()
    while (($row = fgetcsv($fp)) !== false) {
        // Skip empty lines
        if ((count($row) === 1) && is_null($row[0])) continue;
        // Take the Concat column index from the header, then optionally skip the header
        if (!$hdr) {
            $hdr = true;
            if (!isset($contact_index)) $contact_index = count($row)-1;
            if ($skip_header) continue;
        }
        // Normalize the email so that differences in case or padding do not matter
        $email = strtolower(trim($row[0]));
        if (isset($data[$email])) $data[$email][$contact_index] .= '|'.trim($row[$contact_index]);
        else $data[$email] = array_map('trim',$row);
    }
    fclose($fp);
}
$fp = fopen('result.csv','w');
if ($fp) {
    foreach($data as $row) {
        fputcsv($fp,$row);
    }
    fclose($fp);
}
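If the file really is too large to hold in memory, the fact that it is already sorted by recipient means duplicate addresses are adjacent, so the rows can be merged one group at a time. The following is only a sketch under that assumption: the function name `mergeSortedCsv` is made up for illustration, and it treats the last column of each row as the Concat column, as the script above does by default.

```php
<?php
// Hypothetical helper (not from the answer above): merges adjacent rows that
// share the same email in column 0, joining their last column with "|".
// Assumes the input is already sorted by email, so only one row is kept in memory.
function mergeSortedCsv($in, $out): void
{
    $current = null;     // row currently being accumulated
    $currentKey = null;  // normalized email of that row
    // Separator, enclosure and escape are passed explicitly (these are PHP's defaults)
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // blank line
        }
        $key = strtolower(trim($row[0]));
        $last = count($row) - 1;
        if ($current !== null && $key === $currentKey) {
            // Same email as the previous row: append only the Concat value
            $current[$last] .= '|' . trim($row[$last]);
            continue;
        }
        if ($current !== null) {
            fputcsv($out, $current, ',', '"', '\\'); // flush the finished group
        }
        $current = array_map('trim', $row);
        $currentKey = $key;
    }
    if ($current !== null) {
        fputcsv($out, $current, ',', '"', '\\'); // flush the final group
    }
}
```

To run it on the files from the question, open `yz-email.csv` for reading and `result.csv` for writing with `fopen()` and pass both handles. The header row passes through unchanged because "recipient" never matches an email address.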
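I started from scratch, so please forgive me for not taking your exact code and building on it. I added inline comments, so it should be easy to follow.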
<?php
$fname = "emails.csv"; // name of the input file
$strOut = ""; // output string
$fileContents = file_get_contents($fname); // read the contents of the file
// split into lines (any line-ending style), then parse each line as CSV
$arrData = array_map("str_getcsv", preg_split('/\r*\n+|\r+/', $fileContents));
$lastEmail = "";
foreach ($arrData as $row) { // loop over the rows
    if (count($row) > 1) { // a blank line parses to a single-element array, so skip those
        if (compareEmails($row[0], $lastEmail)) { // same email as the previous row: append only the Concat column
            $strOut .= "|" . $row[10];
        } else {
            if ($strOut !== "") {
                $strOut .= "\r\n"; // end the previous row, because we know this is a new email
            }
            $strOut = appendToString($row, $strOut); // append the whole row to the string
        }
        $lastEmail = $row[0];
    }
}
function appendToString($arrIn, $strOut) { // append the first 11 columns of the row onto the string
    return $strOut . implode(",", array_slice($arrIn, 0, 11));
}
function compareEmails($curEmail, $lastEmail) {
    $curEmail = trim(str_replace('"', "", $curEmail)); // remove quotes and surrounding whitespace
    $lastEmail = trim(str_replace('"', "", $lastEmail)); // remove quotes and surrounding whitespace
    return $curEmail == $lastEmail; // true when both rows have the same email
}
?>
<pre>
<?php echo $strOut; ?>
</pre>
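One caveat with building each line by joining values with commas: a value that itself contains a comma or a double quote would produce a malformed row. As a hedged sketch, a small helper (the name `csvLine` is hypothetical) can let `fputcsv()` handle the quoting by writing to an in-memory stream:

```php
<?php
// Hypothetical helper: formats one row as a single CSV line,
// letting fputcsv() add quotes where a field needs them.
function csvLine(array $fields): string
{
    $buf = fopen('php://memory', 'w+');
    fputcsv($buf, $fields, ',', '"', '\\'); // explicit defaults: separator, enclosure, escape
    rewind($buf);
    $line = stream_get_contents($buf);
    fclose($buf);
    return rtrim($line, "\n"); // drop the line ending that fputcsv() appends
}
```

In the script above, this could stand in for the `implode()`-style joining when a merged row is written out, at the cost of opening a temporary stream per row.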