Extract number after particular substring from a flat file

Question

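I need to write a VBScript utility for a flat file that finds the string MD*. If MD* is found, it should determine the length of the number next to MD*, and if that length is greater than 10, replace MD* with XXXXXX*.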

So far I have written this:

Dim index,str
str = "MD*"
index = InStr(str, "MD*") + 1
Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\Users\Test\test.txt", ForReading)

strText = objFile.ReadAll
objFile.Close
If Len(InStr("MD*") + 1) > 9 Then
    strText = Replace(strText, "MD*", "XXXX*")
End If

Set objFile = objFSO.OpenTextFile("C:\Users\Test\test.txt", ForWriting)
objFile.WriteLine strText

objFile.Close

Sample data from the file:

NM1*IL*1*GOTODS*NEOL*X***MD*70238
NM1*IL*1*GOTODS*DAVID****MD*19446836789

Answer 1

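Your code already looks pretty good, but I think you have to go through the file line by line and check each line for the MD* field. Looking at the sample data, I think it is best to search for *MD* (with the leading asterisk), so that a different field that merely ends with "MD" is not matched by mistake.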

Try this:

Option Explicit

Const ForReading = 1
Const ForWriting = 2

Dim objFSO, objFile, strText, lines, i, md, number

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\Users\Test\test.txt", ForReading)

' read all text
strText = objFile.ReadAll
' split it into an array of lines on Newline (vbCrLf)
lines = Split(strText, vbCrLf)
objFile.Close

'loop through each line to see if it contains the string "*MD*"
For i = 0 to uBound(lines)
    md = InStr(1, lines(i), "*MD*", vbBinaryCompare )  'use vbTextCompare for case insensitive search
    If md > 0 Then
        number = Trim(Mid(lines(i), md + 4))
        If Len(number) > 9 Then
            'update this line
            lines(i) = Replace(lines(i),"*MD*","*XXXX*")
        End If
    End If
Next

'now write the updated array back to file
'set a different filename here so as not to overwrite your original source file
'the third argument (True) creates the file if it does not exist yet
Set objFile = objFSO.OpenTextFile("C:\Users\Test\test_updated.txt", ForWriting, True)
For i = 0 to uBound(lines)
    objFile.WriteLine lines(i)
Next
objFile.Close

'clean up objects used
Set objFile = Nothing
Set objFSO  = Nothing

Explanation

Inside the For i = 0 to uBound(lines) loop, the first thing to do is check whether the line actually contains the string *MD*. The test for this is

md = InStr(1, lines(i), "*MD*", vbBinaryCompare)

Read about InStr()

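If the test succeeds, the variable md is greater than 0, so next we try to get the value to the right of the *MD* string. Since md holds the position in the string (lines(i)) where *MD* starts, we only need to add 4 (the length of *MD*) to get the starting position of the value that follows it.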

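According to your example, nothing follows this value on the line, so we can retrieve it with the Mid() function, starting at position md + 4 and omitting the length argument so that it returns everything remaining on the line. The value is then captured in a variable named number, because it always represents a numeric value: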

number = Trim(Mid(lines(i), md + 4))

Read about Mid()

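From your comment I understand that there may be whitespace characters such as spaces, tabs and/or newlines around the value you want, so I wrapped it in Trim() to get rid of those. (Note that Trim() itself removes only leading and trailing spaces.)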

Read about Trim()
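
If tabs or stray line-break characters really do surround the value, a broader trim may be needed, because VBScript's Trim() strips spaces only. The following is a minimal sketch under that assumption; the function name TrimWhitespace is hypothetical and not part of the original answer. It uses the built-in RegExp object to remove any whitespace from both ends of a string.

```vbscript
Option Explicit

' Hypothetical helper: removes spaces, tabs, CR and LF from both ends of a string.
Function TrimWhitespace(rawValue)
    Dim re
    Set re = New RegExp
    re.Pattern = "^\s+|\s+$"   ' leading or trailing whitespace
    re.Global = True           ' replace both matches, not only the first
    TrimWhitespace = re.Replace(rawValue, "")
End Function

' Example: a value padded with a tab, spaces and a carriage return
WScript.Echo Len(TrimWhitespace(vbTab & " 19446836789 " & vbCr))
```

In the script above, Trim(Mid(lines(i), md + 4)) could then be written as TrimWhitespace(Mid(lines(i), md + 4)).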

Please note: number is just a STRING value, but that is exactly what you want to test, so we check whether the length of this string is greater than 9:

If Len(number) > 9 Then

Read about Len()

If that is the case, replace *MD* in the line with *XXXX*, then move on to the next line until we are done.
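
To try the per-line check in isolation, the steps described above can be wrapped in a small function and run against the two sample lines from the question. This is only a sketch: the function name MaskLongMd is hypothetical, and it keeps the same threshold (more than 9 characters) and replacement text (*XXXX*) as the script above.

```vbscript
Option Explicit

' Hypothetical helper: returns the line with *MD* masked when the value
' following it is longer than 9 characters; otherwise returns the line unchanged.
Function MaskLongMd(textLine)
    Dim md, number
    MaskLongMd = textLine
    md = InStr(1, textLine, "*MD*", vbBinaryCompare)
    If md > 0 Then
        ' everything after *MD* up to the end of the line
        number = Trim(Mid(textLine, md + 4))
        If Len(number) > 9 Then
            MaskLongMd = Replace(textLine, "*MD*", "*XXXX*")
        End If
    End If
End Function

' Sample lines from the question: a 5-digit value and an 11-digit value
WScript.Echo MaskLongMd("NM1*IL*1*GOTODS*NEOL*X***MD*70238")
WScript.Echo MaskLongMd("NM1*IL*1*GOTODS*DAVID****MD*19446836789")
```

Run with cscript, the first line should come back unchanged and the second should have its *MD* field replaced by *XXXX*.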

Hope that helps.

P.S. I am not affiliated with w3schools, but for VBScript beginners it is a good place to find information.

vbscript