How to Extract a List of Text File Values Between Two Identifiers That Are Used Several Times
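I'm trying to make a VB.NET form in Visual Studio 2019 that asks for a text file and outputs a list of names in a textbox I've called TextBox4, so that I don't have to create any files (or maybe create a text file, copy it into TextBox4, and then delete it?). The names in the text file sit between "Customer_Name" and "Customer_ID". Other than those 2 identifiers, the file doesn't seem to have any rhyme or reason, which makes it hard to split effectively. If it's relevant, each file typically has 100 to 1000 entries.
Sample (mock) data: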
"Customer_name":"JOHN DOE","Customer_id":"9251954","Customer_team_id":"HOST","Customer_position_id":"MGR","Customer_short_name":"Joey","Customer_eligibility":"LT5","Customer_page_url":"google.com","Customer_alt_id":"M7","Customer_name":"JANE DOE","Customer_id":"8734817","Customer_team_id":"HOST","Customer_position_id":"TECH","Customer_name":"JOSEPH DOE","Customer_id":"8675307",
What I'd like to display in the textbox:
JOHN DOE
JANE DOE
JOSEPH DOE
Take a look at this regular expression pattern:
(?:")(?<key>\w+)(?:":")(?<value>((\w|\s|\.))+)(?:",)
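This does several things:
(?:")
- creates a non-capturing group to match the opening quotation mark
(?<key>\w+)
- creates a named group to match the key (e.g. Customer_name)
(?:":")
- creates a non-capturing group to match the closing quotation mark, the colon, and the next opening quotation mark
(?<value>((\w|\s|\.))+)
- creates a named group to match the value (e.g. JOHN DOE); only word characters, whitespace, and periods are allowed
(?:",)
- creates a non-capturing group to match the closing quotation mark and the trailing comma (so a final pair with no trailing comma is not matched)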
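To make the breakdown above concrete, here is a minimal sketch that runs the same pattern over a two-pair input and reports what the key and value groups capture. The module name PatternDemo and the function ListPairs are made up for illustration; only Regex.Matches and the Groups indexer come from .NET.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PatternDemo
    ' Hypothetical helper: returns one "key = value" string per match.
    Function ListPairs(input As String) As List(Of String)
        ' Same pattern as above; quotation marks are doubled inside a VB string literal.
        Dim pattern As String = "(?:"")(?<key>\w+)(?:"":"")(?<value>((\w|\s|\.))+)(?:"",)"
        Dim pairs As New List(Of String)()
        For Each m As Match In Regex.Matches(input, pattern)
            ' Named groups can be read directly by name.
            pairs.Add(m.Groups("key").Value & " = " & m.Groups("value").Value)
        Next
        Return pairs
    End Function
End Module
```

Calling ListPairs on the text "Customer_name":"JOHN DOE","Customer_id":"9251954", should give two entries, one for each pair.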
With this, you can loop over the matches and each match's groups to get just the customer names:
' requires Imports System.Text.RegularExpressions at the top of the file
' declare the pattern and input (escaping quotation marks) as well as a collection to store just the Customer_name values
Dim pattern As String = "(?:"")(?<key>\w+)(?:"":"")(?<value>((\w|\s|\.))+)(?:"",)"
Dim input As String = """Customer_name"":""JOHN DOE"",""Customer_id"":""9251954"",""Customer_team_id"":""HOST"",""Customer_position_id"":""MGR"",""Customer_short_name"":""Joey"",""Customer_eligibility"":""LT5"",""Customer_page_url"":""google.com"",""Customer_alt_id"":""M7"",""Customer_name"":""JANE DOE"",""Customer_id"":""8734817"",""Customer_team_id"":""HOST"",""Customer_position_id"":""TECH"",""Customer_name"":""JOSEPH DOE"",""Customer_id"":""8675307"""
Dim matches As MatchCollection = Regex.Matches(input, pattern)
Dim names As New List(Of String)()
' loop over each match
For Each match As Match In matches
    ' loop over each group in the match
    For index As Integer = 0 To match.Groups.Count - 1
        Dim group As Group = match.Groups.Item(index)
        ' only do something if we're on the "key" group, the "key" group's value is Customer_name, and there's at least one more group left
        If (group.Name = "key" AndAlso group.Value = "Customer_name" AndAlso index < match.Groups.Count - 1) Then
            ' only do something if the next group is the "value" group
            Dim valueGroup As Group = match.Groups.Item(index + 1)
            If (valueGroup.Name = "value") Then
                ' add the value that belongs to the Customer_name key
                names.Add(valueGroup.Value)
                Exit For
            End If
        End If
    Next
Next
' set the TextBox's lines
TextBox4.Lines = names.ToArray()
Fiddle: https://dotnetfiddle.net/Zja46U
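Since the question starts from a text file rather than a hard-coded string, the following is a rough sketch of how the same pattern might be applied to a file's contents. It reads the named groups directly by name instead of walking every group, which should give the same result with less code. The module name CustomerNameReader and both function names are hypothetical, and the key comparison is case-insensitive only because the question writes "Customer_Name" while the sample data uses "Customer_name".

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module CustomerNameReader
    ' Hypothetical helper: returns every value whose key equals keyName (ignoring case).
    Function ExtractValues(text As String, keyName As String) As List(Of String)
        Dim pattern As String = "(?:"")(?<key>\w+)(?:"":"")(?<value>((\w|\s|\.))+)(?:"",)"
        Dim values As New List(Of String)()
        For Each m As Match In Regex.Matches(text, pattern)
            If String.Equals(m.Groups("key").Value, keyName, StringComparison.OrdinalIgnoreCase) Then
                values.Add(m.Groups("value").Value)
            End If
        Next
        Return values
    End Function

    ' Hypothetical helper: reads the whole file into memory, then extracts the names.
    ' Assumes the file is small enough to load at once (100 to 1000 entries should be).
    Function ReadCustomerNames(path As String) As List(Of String)
        Return ExtractValues(File.ReadAllText(path), "Customer_name")
    End Function
End Module
```

In the form, this could be wired up with something like TextBox4.Lines = CustomerNameReader.ReadCustomerNames(filePath).ToArray(), where filePath is whatever path the user picked (for example from an OpenFileDialog). No temporary file is needed.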
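Edit - Keep in mind that because we're using named groups, this code can now be extended to get any key/value pair. For the sake of this example, I'm only getting the Customer_name key/value pair.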
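As one possible way to extend it, this sketch collects every key/value pair into a dictionary keyed by the key name, so the values for any key can be looked up afterwards. The module name KeyValueCollector and the function CollectPairs are assumptions made for this example and are not part of the fiddle.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module KeyValueCollector
    ' Hypothetical helper: groups all matched values by their key, in order of appearance.
    Function CollectPairs(text As String) As Dictionary(Of String, List(Of String))
        Dim pattern As String = "(?:"")(?<key>\w+)(?:"":"")(?<value>((\w|\s|\.))+)(?:"",)"
        ' Case-insensitive keys, so "Customer_ID" and "Customer_id" share one bucket.
        Dim result As New Dictionary(Of String, List(Of String))(StringComparer.OrdinalIgnoreCase)
        For Each m As Match In Regex.Matches(text, pattern)
            Dim key As String = m.Groups("key").Value
            Dim bucket As List(Of String) = Nothing
            If Not result.TryGetValue(key, bucket) Then
                bucket = New List(Of String)()
                result.Add(key, bucket)
            End If
            bucket.Add(m.Groups("value").Value)
        Next
        Return result
    End Function
End Module
```

With the result in hand, CollectPairs(text)("Customer_id") would return the IDs in the same order as the names under "Customer_name", as long as every entry in the file has both keys.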