Parsing pdftk dump_data_fields using PHP?

I need some advice on the best way to parse the output of pdftk dump_data_fields using PHP.

The attributes I need to extract are FieldName and FieldNameAlt, and optionally FieldMaxLength and the FieldStateOption values.

FieldType: Text
FieldName: TestName1
FieldNameAlt: TestName1
FieldFlags: 29360128
FieldJustification: Left
FieldMaxLength: 5
---
FieldType: Button
FieldName: TestName3
FieldFlags: 0
FieldJustification: Left
FieldStateOption: Off
FieldStateOption: Yes
---
...

Would something like this be enough?

$handle = fopen("/tmp/bla.txt", "r");
if ($handle) {
    $output = array();
    while (($line = fgets($handle)) !== false) {
        if (trim($line) === "---") {
            // Block completed; process it
            if (sizeof($output) > 0) {
                print_r($output);
            }
            $output = array();
            continue;
        }
        // Process contents of data block
        // Limit of 2: split only at the first colon so values containing ":" stay intact
        $parts = explode(":", $line, 2);
        if (sizeof($parts) === 2) {
            $key = trim($parts[0]);
            $value = trim($parts[1]);
            if (isset($output[$key])) {
                $i = 1;
                while(isset($output[$key.$i])) $i++;
                $output[$key.$i] = $value;
            }
            else {
                $output[$key] = $value;
            }
        }
        else {
            // handle malformed input
        }
    }

    // process final block
    if (sizeof($output) > 0) {
        print_r($output);
    }
    fclose($handle);
}
else {
    // error while opening the file
}

This gives you the following output:

Array
(
    [FieldType] => Text
    [FieldName] => TestName1
    [FieldNameAlt] => TestName1
    [FieldFlags] => 29360128
    [FieldJustification] => Left
    [FieldMaxLength] => 5
)
Array
(
    [FieldType] => Button
    [FieldName] => TestName3
    [FieldFlags] => 0
    [FieldJustification] => Left
    [FieldStateOption] => Off
    [FieldStateOption1] => Yes
)

Inside the loop, at the point where a completed block is processed, reading a value is as simple as:

echo $output["FieldName"];
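
If only the attributes from the question are needed, each parsed block can be reduced with a small helper. This is just a rough sketch: `extractFieldInfo` and the `FieldStateOptions` result key are names made up for illustration, and it assumes the numbered-key layout (`FieldStateOption`, `FieldStateOption1`, ...) that the parser above produces.

```php
<?php
// Hypothetical helper (not part of the answer above): reduce one parsed
// block to the attributes asked about in the question.
function extractFieldInfo(array $block): array
{
    $info = [
        'FieldName'         => $block['FieldName'] ?? null,
        'FieldNameAlt'      => $block['FieldNameAlt'] ?? null,
        'FieldMaxLength'    => isset($block['FieldMaxLength'])
            ? (int) $block['FieldMaxLength']
            : null,
        'FieldStateOptions' => [],
    ];

    // Repeated keys were stored as FieldStateOption, FieldStateOption1, ...
    foreach ($block as $key => $value) {
        if (preg_match('/^FieldStateOption\d*$/', (string) $key)) {
            $info['FieldStateOptions'][] = $value;
        }
    }

    return $info;
}

// Sample block copied from the output shown above.
$block = [
    'FieldType'          => 'Button',
    'FieldName'          => 'TestName3',
    'FieldFlags'         => '0',
    'FieldJustification' => 'Left',
    'FieldStateOption'   => 'Off',
    'FieldStateOption1'  => 'Yes',
];

$info = extractFieldInfo($block);
print_r($info);
```

Optional attributes that are missing from a block come back as `null` (or an empty list for the state options), so the caller does not need its own `isset` checks.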

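I made some modifications to the code above and fixed a few issues, such as the last field not making it into the array. The updated code that builds the array is below.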

// Get the form data fields as a string
$fieldsDataStr = $pdf->getDataFields();

// Split the string into an array of lines.
$lines = explode("\n", $fieldsDataStr);
// Append a final '---' so the last field is also flushed by the logic below.
$lines[] = "---";

$output = array();
$pdfDataArray = array();
$counterField = 0;
foreach ($lines as $line) {
    if (trim($line) === "---") {
        // Block completed; store it
        if (count($output) > 0) {
            $pdfDataArray[] = $output;
            $counterField++; // fields counter
        }
        $output = array();
        continue;
    }
    // Process contents of data block.
    // The limit of 2 splits only at the first colon, so values containing ':' stay intact.
    $parts = explode(":", $line, 2);
    if (count($parts) === 2) {
        $key = trim($parts[0]);
        $value = trim($parts[1]);
        // Note: a repeated key (e.g. FieldStateOption) overwrites the earlier value.
        $output[$key] = $value;
    }
}

print_r($pdfDataArray);

It returns the expected array.
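
One thing to watch with a plain `$output[$key] = $value` assignment: a key that appears several times in one block, such as FieldStateOption, keeps only its last value. A possible variant that collects those values into a list is sketched here. The function name `parseDataFields` is hypothetical, and it assumes the dump_data_fields text is already available as a string.

```php
<?php
// Hypothetical function: parse dump_data_fields text into one array per field.
function parseDataFields(string $dump): array
{
    $fields = [];
    $current = [];

    // \R matches any newline sequence (\n, \r\n, \r).
    foreach (preg_split('/\R/', $dump) as $line) {
        if (trim($line) === '---') {
            // Separator line: store the finished block, if any.
            if ($current !== []) {
                $fields[] = $current;
            }
            $current = [];
            continue;
        }

        // Split at the first colon only.
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue; // blank or malformed line
        }

        $key = trim($parts[0]);
        $value = trim($parts[1]);
        if ($key === 'FieldStateOption') {
            $current[$key][] = $value; // may repeat, so always keep a list
        } else {
            $current[$key] = $value;
        }
    }

    // Store the last block when the text does not end with '---'.
    if ($current !== []) {
        $fields[] = $current;
    }

    return $fields;
}

$dump = <<<TXT
FieldType: Text
FieldName: TestName1
FieldNameAlt: TestName1
FieldMaxLength: 5
---
FieldType: Button
FieldName: TestName3
FieldStateOption: Off
FieldStateOption: Yes
TXT;

$fields = parseDataFields($dump);
print_r($fields);
```

Keeping FieldStateOption as a list even when it has a single entry means the calling code can always loop over it without checking the type first.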