Getting wrong output when cells contain HTML special characters when using PhpSpreadsheet

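I am using the PhpSpreadsheet library ( https://github.com/PHPOffice/PhpSpreadsheet ) to read an uploaded Excel file. The file contains HTML tags and HTML special characters. When my function iterates over the cells, I get the wrong result.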

Sample code:

// Detect the file type and create a matching reader
$fileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($inputFile);
$objReader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($fileType);
$objReader->setInputEncoding('ISO-8859-1');
$objReader->setReadDataOnly(true);
$spreadsheet = $objReader->load($inputFile);

// Convert each worksheet to an array (the last worksheet wins)
foreach ($spreadsheet->getWorksheetIterator() as $worksheet) {
    $array = $worksheet->toArray();
}

var_dump($array);

Output:

array(2) { [0]=> array(1) { [0]=> string(20) "cell_1,cell_2,cell_3" } [1]=> array(1) { [0]=> string(480) "
Heading – 2
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum   dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
,," } }

Expected output:

array(2) { [0]=> array(3) { [0]=> string(6) "cell_1" [1]=> string(6) "cell_2" [2]=> string(6) "cell_3" } [1]=> array(3) { [0]=> string(472) "
Heading - 2
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
" [1]=> NULL [2]=> NULL } }

Note that cell 1 contains the HTML special character "–", which is written in the file as the entity "&#8211;".

Excel file used: https://docs.google.com/spreadsheets/d/1IdLJsEmnIXiL0xPEl0J2np0fLP9gl41twk3yNHl3DzI/edit?usp=sharing

File format: CSV

You should update to the latest development version of PhpSpreadsheet, specifically a version that includes that commit.

Then, the following code:

<?php

require __DIR__ . '/vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

$csv = <<<STRING
cell_1,cell_2,cell_3
"<h2>Heading &#8211; 2</h2><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>",,
STRING;

$file = 'test.csv';
file_put_contents($file, $csv);
$spreadsheet = IOFactory::load($file);

var_dump($spreadsheet->getActiveSheet()->toArray());

will correctly output:

array(2) {
  [0] =>
  array(3) {
    [0] =>
    string(6) "cell_1"
    [1] =>
    string(6) "cell_2"
    [2] =>
    string(6) "cell_3"
  }
  [1] =>
  array(3) {
    [0] =>
    string(478) "<h2>Heading &#8211; 2</h2><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>"
    [1] =>
    NULL
    [2] =>
    NULL
  }
}
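
If you want the plain text rather than the raw markup, you can post-process the cell value yourself. Below is a minimal sketch that uses only PHP built-ins, not PhpSpreadsheet. The helper name cellToPlainText and the shortened sample row are my own, for illustration only. It first checks with str_getcsv that the quoted row really splits into three fields, then strips the tags and decodes the entity to UTF-8.

```php
<?php

// Hypothetical helper (not part of PhpSpreadsheet): strip the tags first,
// then decode HTML entities such as &#8211; into UTF-8 characters.
function cellToPlainText(string $html): string
{
    return html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Shortened sample row, modelled on the CSV used above (an assumption for brevity).
$line = '"<h2>Heading &#8211; 2</h2><p>Lorem ipsum</p>",,';

// Parse a single CSV line: comma separator, double-quote enclosure, backslash escape.
$fields = str_getcsv($line, ',', '"', '\\');

var_dump(count($fields));              // int(3)
var_dump(cellToPlainText($fields[0])); // "Heading – 2Lorem ipsum"
```

Note that strip_tags simply removes the markup, so the heading and the paragraph text end up joined without a separator. If you need line breaks, replace the closing block tags with "\n" before stripping.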