I have a text file and want to extract specific paragraphs between two special lines in PHP

Question

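I have this code and want to extract data from a text file that sits between two specific lines. I want to extract every section between those delimiter lines. A sample of the text file is here: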

---
 - ID: some random id

 \_______________________________\_
HELLO 
This is an example text.
I AM SECTION 1
\_______________________________\_
HELLO 
This is an example text.
I AM SECTION 2
\_______________________________\_
HELLO 
This is an example text.
I AM SECTION 3
\_______________________________\_
hello 
this is example text here
and i am section 4

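Here is some code in which I match those delimiter lines, but I have not figured out how to extract each section from the text file, including the last one.

I need output like this: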

[0] => ' HELLO 
         This is an example text.
         I AM SECTION 1',
[1] => ' HELLO 
         This is an example text.
         I AM SECTION 2',
[2] => ' HELLO 
         This is an example text.
         I AM SECTION 3',
[3] => ' HELLO 
         This is an example text.
         I AM SECTION 4',


public static function find_section_in_file($file = '', $directory = '')
{
    $response = ['error' => true, 'section' => NULL];
    if (isset($file) && isset($directory)) {
        $handle = fopen($directory."\\".$file, "r");
        $section = [];
        if ($handle) {

            while (($line = fgets($handle)) !== false) {
                $new_line = trim(preg_replace('/\s+/', ' ', $line));
                $start = self::startsWith($new_line, '\__');
                $end = self::endsWith($new_line, '_\_');

                if ($start && $end){
                    array_push($section, $line);
                }
            }
            fclose($handle);
            $response = ['error' => false, 'section' => $section];

        }
        //need To write Query to save section in DB
    }
    return $response;
}

Answer 1

You can match all lines that do not start with the backslash/underscores pattern and capture those lines in capture group 1.

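^\h*\\_+\\_\R((?:.*\R(?!\h*\\_+\\_).*)*)

Explanation

^ Start of the line (the example uses the m flag, so ^ matches at the start of every line)
\h*\\_+\\_\R Match 0+ horizontal whitespace characters, \, 1+ underscores, \, _ and a Unicode newline sequence
( Capture group 1
- (?: Non-capture group
  - .*\R Match the whole line and a newline
  - (?!\h*\\_+\\_) Negative lookahead, assert that the next line does not start with the backslash/underscores pattern
  - .* Match the whole line
- )* Close the non-capture group and repeat it 0+ times
) Close the capture group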
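
To see which lines the delimiter part of the pattern accepts, it can be tried on its own against single lines. This is only a sketch: the helper name `is_delimiter` and the trailing `$` anchor are assumptions added for the per-line check and are not part of the answer's pattern.

```php
<?php
// Hypothetical helper: true when $line is a backslash/underscores delimiter line.
// In a single-quoted PHP string, \\\\ yields the regex \\ (one literal backslash).
function is_delimiter(string $line): bool
{
    return preg_match('/^\h*\\\\_+\\\\_$/', $line) === 1;
}

$lines = [
    ' \\_______\\_', // delimiter with a leading space
    '\\___\\_',      // delimiter
    'HELLO ',        // ordinary content line
    '_____',         // underscores only, no backslashes
];

foreach ($lines as $line) {
    echo $line, ' => ', is_delimiter($line) ? 'delimiter' : 'content', "\n";
}
```

Only the first two lines are reported as delimiters, because the pattern requires the literal backslashes before the first and the last underscore.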


For example

$re = '/^\h*\\\\_+\\\\_\R((?:.*\R(?!\h*\\\\_+\\\\_).*)*)/m';
$str = '---
 - ID: some random id

 \_______________________________\_
HELLO 
This is an example text.
I AM SECTION 1
\_______________________________\_
HELLO 
This is an example text.
I AM SECTION 2
\_______________________________\_
HELLO 
This is an example text.
I AM SECTION 3
\_______________________________\_
hello 
this is example text here
and i am section 4';

preg_match_all($re, $str, $matches);
print_r($matches[1]);

Output

Array
(
    [0] => HELLO 
This is an example text.
I AM SECTION 1
    [1] => HELLO 
This is an example text.
I AM SECTION 2
    [2] => HELLO 
This is an example text.
I AM SECTION 3
    [3] => hello 
this is example text here
and i am section 4
)
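
To apply the same idea to a file rather than an inline string, the pattern can be wrapped in a small function. The following is a minimal sketch, not the answerer's code: the function name `extract_sections` and the choice to read the whole file with `file_get_contents` are assumptions, and the return value only mirrors the `error`/`section` shape used in the question.

```php
<?php
// Hypothetical helper (not from the original post): reads the whole file and
// returns the sections in the same response shape as the question's function.
function extract_sections(string $path): array
{
    $response = ['error' => true, 'section' => null];

    if (!is_readable($path)) {
        return $response;
    }

    $contents = file_get_contents($path);
    if ($contents === false) {
        return $response;
    }

    // Pattern from the answer; each literal backslash is written as \\\\ in PHP.
    $re = '/^\h*\\\\_+\\\\_\R((?:.*\R(?!\h*\\\\_+\\\\_).*)*)/m';
    if (preg_match_all($re, $contents, $matches) === false) {
        return $response;
    }

    // $matches[1] holds the text captured after each delimiter line.
    return ['error' => false, 'section' => $matches[1]];
}
```

Calling `extract_sections($directory . DIRECTORY_SEPARATOR . $file)` should then give one array entry per section, including the last one. Note that the repeated group always consumes a line, a newline and a following line, so a section consisting of a single line comes back as an empty string with this pattern; the sections in the sample file all have three lines.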

php

regex

file

file-handling