.txt Log File Data Extraction Output to CSV with REGEX

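I asked this question before and LotPings came up with a perfect result. After speaking with the user, though, it turned out I had only been given half the information to begin with!

Now that I know exactly what is required, I will explain the scenario again...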

Things to note:

This is the log file:

L      TRANSACTIONS LOGGED FROM 01/05/2018 0001 TO 31/05/2018 2359
        SELECTED FOR OPERATOR  891234

START                 TERMINAL    USER        ENQUIRER                    TERMINAL IP
========================================================================================================================
01/05/18 1603       A555        CART87565       46573 RBCO NPC SERVICES GW/10/0043                           
        SEARCH ENQUIRY               RECORD NO : S48456/06P     CHAPTER CODE =   
                                 RECORD DISPLAYED : S48853/98D

                                  PRINT REQUESTED : SINGLE RECORD
========================================================================================================================
03/05/18 1107       A555        CERT16574       BTD/54/1786 16475                                    
        REF ENQUIRY                  DHF ID : 58/94710W     CHAPTER CODE =   
                                 RECORD DISPLAYED : S585988/84H
========================================================================================================================
24/05/18 1015       A555        CERT15473       19625 CBRS DDS SERVICES NM/18/0199                           

        IMAGE ENQUIRY                      NAME : TREVOR SMITH CHAPTER CODE =  

                                    DATE OF BIRTH :   /  /1957
========================================================================================================================
24/05/18 1025       A555        CERT15473       15325 CBRS DDS SERVICES NM/12/0999                           
        REF ENQUIRY                  DDS ID : 04/102578R     CHAPTER CODE =  
========================================================================================================================

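Here is an example of the log file, showing what needs to be extracted and under which header.

A CSV like this:

The PowerShell script from LotPings works perfectly. I just need the User ID to be extracted from the top line, and the script has to account for the fact that not all records have a DOB and that there is more than one type of enquiry, i.e. REF ENQUIRY, SEARCH ENQUIRY and IMAGE ENQUIRY.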

$FileIn   = '.\SO_51209341_data.txt'
$TodayCsv = '.\SO_51209341_data.csv'

$RE1 = [RegEx]'(?m)(?<Date>\d{2}\/\d{2}\/\d{2}) (?<Time>\d{4}) +(?<Terminal>A\d{3}) +(?<User>C[A-Z0-9]+) +(?<Enquirer>.*)$'
$RE2 = [RegEx]'\s+SEARCH REF\s+NAME : (?<Enquiry>.+?) (PAGE|CHAPTER) CODE ='
$RE3 = [RegEx]'\s+DATE OF BIRTH : (?<DOB>[0-9 /]+?/\d{4})'

$Sections = (Get-Content $FileIn -Raw) -split "={30,}`r?`n" -ne ''

$Csv = ForEach($Section in $Sections){
    $Row= @{} | Select-Object Date, Time, Terminal, User, Enquirer, Enquiry, DOB
    $Cnt = 0
    if ($Section -match $RE1) {
        ++$Cnt
        $Row.Date     = $Matches.Date
        $Row.Time     = $Matches.Time
        $Row.Terminal = $Matches.Terminal
        $Row.User     = $Matches.User
        $Row.Enquirer = $Matches.Enquirer.Trim()
    }
    if ($Section -match $RE2) {
        ++$Cnt
        $Row.Enquiry  = $Matches.Enquiry
    }
    if ($Section -match $RE3){
        ++$Cnt
        $Row.DOB      = $Matches.DOB
    }
    if ($Cnt -eq 3) {$Row}
}

$csv | Format-Table
$csv | Export-Csv $Todaycsv -NoTypeInformation

With such precise data, the first answer could have been:

## Q:\Test18\SO_51311417.ps1
$FileIn   = '.\SO_51311417_data.txt'
$TodayCsv = '.\SO_51311417_data.csv'

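# Operator ID from the report header
$RE0 = [RegEx]'SELECTED FOR OPERATOR\s+(?<UserID>\d{6})'
# First line of a record: date, time, terminal, then the rest of the line
$RE1 = [RegEx]'(?m)(?<Date>\d{2}\/\d{2}\/\d{2}) (?<Time>\d{4}) +(?<Terminal>A\d{3}) +(?<Enquirer>.*)$'
# Enquiry line: the text between the enquiry type and PAGE/CHAPTER CODE
$RE2 = [RegEx]'\s+(SEARCH|REF|IMAGE) ENQUIRY\s+(?<SearchResult>.+?)\s+(PAGE|CHAPTER) CODE'
# Optional date of birth; day and month may be blank
$RE3 = [RegEx]'\s+DATE OF BIRTH : (?<DOB>[0-9 /]+?/\d{4})'

# Split on separator lines of 30 or more '=' and drop empty pieces
$Sections = (Get-Content $FileIn -Raw) -split "={30,}`r?`n" -ne ''
# Fallback value in case no header section is found
$UserID = "n/a"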
$Csv = ForEach($Section in $Sections){
    If ($Section -match $RE0){
        $UserID = $Matches.UserID
    } Else {
        $Row= @{} | Select-Object Date,Time,Terminal,UserID,Enquirer,SearchResult,DOB
        If ($Section -match $RE1){
            $Row.Date     = $Matches.Date
            $Row.Time     = $Matches.Time
            $Row.Terminal = $Matches.Terminal
            $Row.Enquirer = $Matches.Enquirer.Trim()
            $Row.UserID   = $UserID
        }
        If ($Section -match $RE2){
            $Row.SearchResult  = $Matches.SearchResult
        }
        If ($Section -match $RE3){
            $Row.DOB      = $Matches.DOB
        }
        $Row
    }
}

$Csv | Format-Table
$Csv | Export-Csv $TodayCsv -NoTypeInformation
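
One edge case that may be worth guarding against: the `-ne ''` filter on the `$Sections` line only drops pieces that are exactly empty. If the file happens to have a blank line after the last separator, that whitespace-only piece survives the split and would reach the Else branch, where it produces an empty row. The sketch below uses a made-up in-memory sample (the `$Text`, `$Sep` and `$Pieces` names are only for illustration) to show the effect and one possible guard.

```powershell
# Hypothetical sample: header, one record, and a blank line after the last separator
$Sep  = '=' * 40
$Text = @(
    '        SELECTED FOR OPERATOR  891234'
    $Sep
    '24/05/18 1025       A555        CERT15473       15325 CBRS DDS SERVICES NM/12/0999'
    '        REF ENQUIRY                  DDS ID : 04/102578R     CHAPTER CODE =  '
    $Sep
    ''
    ''
) -join "`r`n"

# Same split as in the script: the trailing piece is "`r`n", which is not equal to ''
$Pieces = $Text -split "={30,}`r?`n" -ne ''

# Possible guard: keep only pieces that contain at least one non-whitespace character
$Sections = @($Pieces | Where-Object { $_ -match '\S' })

"{0} pieces before the guard, {1} after" -f $Pieces.Count, $Sections.Count
```

With the sample file shown in the question, where nothing follows the last separator, the guard should not change the result.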

Sample output

Date     Time Terminal UserID Enquirer                                           SearchResult           DOB
----     ---- -------- ------ --------                                           ------------           ---
01/05/18 1603 A555     891234 CART87565       46573 RBCO NPC SERVICES GW/10/0043 RECORD NO : S48456/06P
03/05/18 1107 A555     891234 CERT16574       BTD/54/1786 16475                  DHF ID : 58/94710W
24/05/18 1015 A555     891234 CERT15473       19625 CBRS DDS SERVICES NM/18/0199 NAME : TREVOR SMITH      /  /1957
24/05/18 1025 A555     891234 CERT15473       15325 CBRS DDS SERVICES NM/12/0999 DDS ID : 04/102578R
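
To see how the single `$RE2` pattern covers all three enquiry types, it can be tried on its own against the enquiry lines from the log. This is only a small check, not part of the script; `$Samples` and `$Results` are names introduced here for illustration.

```powershell
# Same pattern as $RE2 in the script
$RE2 = [RegEx]'\s+(SEARCH|REF|IMAGE) ENQUIRY\s+(?<SearchResult>.+?)\s+(PAGE|CHAPTER) CODE'

# Enquiry lines copied from the sample log, one per enquiry type
$Samples = @(
    '        SEARCH ENQUIRY               RECORD NO : S48456/06P     CHAPTER CODE =   '
    '        REF ENQUIRY                  DHF ID : 58/94710W     CHAPTER CODE =   '
    '        IMAGE ENQUIRY                      NAME : TREVOR SMITH CHAPTER CODE =  '
)

# Collect the named group for every line that matches
$Results = ForEach ($Line in $Samples) {
    If ($Line -match $RE2) { $Matches.SearchResult }
}

$Results
```

The lazy `.+?` followed by `\s+` stops the capture before the padding in front of PAGE/CHAPTER CODE, which is why the values in the SearchResult column carry no trailing spaces.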