How to extract comma separated substrings from a string?
I need to parse the comma-separated algorithms into a group.
SSH Enabled - version 2.0
Authentication methods:publickey,keyboard-interactive,password
Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc
MAC Algorithms:hmac-sha1,hmac-sha1-96
Authentication timeout: 120 secs; Authentication retries: 3
Minimum expected Diffie Hellman key size : 1024 bits
IOS Keys in SECSH format(ssh-rsa, base64 encoded):
I tried splitting them on the comma, but did not get the expected result:
^Encryption Algorithms:(.*?)(?:,|$)
The expected result is each algorithm in group 1, with no empty groups:
aes128-ctr
aes192-ctr
aes256-ctr
aes128-cbc
3des-cbc
aes192-cbc
aes256-cbc
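For reference, a minimal sketch of what the attempted pattern returns in Python. The variable names `text` and `attempt` are only illustrative, and `re.findall` is just one way to inspect the matches:

```python
import re

text = (
    "SSH Enabled - version 2.0\n"
    "Authentication methods:publickey,keyboard-interactive,password\n"
    "Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
    "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
)

# ^ ties every match to the start of a line, so the prefix can match only once;
# the lazy (.*?) then stops at the first comma.
attempt = re.findall(r"^Encryption Algorithms:(.*?)(?:,|$)", text, re.MULTILINE)
print(attempt)  # ['aes128-ctr']
```

Only the first algorithm is captured, because the remaining ones are not preceded by the anchored prefix.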
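This may not be the best approach, but one way would be to split the string into three parts, possibly even before running it through the RegEx engine. If that is not an option and we wish to have a single expression, this might come close: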
(.+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC.+)
If the input also contains new lines, you might want to test with other expressions, maybe similar to:
([\s\S]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\s\S]+)
([\w\W]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\w\W]+)
([\d\D]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\d\D]+)
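The idea of splitting the string up before it ever reaches the RegEx engine could be sketched roughly like this. The helper name `extract_algorithms` and its `label` parameter are hypothetical, and the sketch assumes the list always sits on a single line that starts with the label:

```python
def extract_algorithms(text, label="Encryption Algorithms:"):
    """Return the comma separated values from the first line starting with label."""
    for line in text.splitlines():
        if line.startswith(label):
            values = line[len(label):]
            # Drop empty items so no empty strings end up in the result
            return [value.strip() for value in values.split(",") if value.strip()]
    return []


sample = (
    "SSH Enabled - version 2.0\n"
    "Authentication methods:publickey,keyboard-interactive,password\n"
    "Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
    "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
)

algorithms = extract_algorithms(sample)
print(algorithms)
```

This avoids regular expressions entirely, which may be enough when the label text is fixed.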
RegEx
If this expression is not what you want, it can be modified or changed on regex101.com.
RegEx Circuit
jex.im visualizes regular expressions.
Test
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re
regex = r"([\w\W]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\w\W]+)"
test_str = ("SSH Enabled - version 2.0\n"
"Authentication methods:publickey,keyboard-interactive,password\n"
"Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
"MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
"Authentication timeout: 120 secs; Authentication retries: 3\n"
"Minimum expected Diffie Hellman key size : 1024 bits\n"
"IOS Keys in SECSH format(ssh-rsa, base64 encoded):\n")
# Raw string, so that \2 is a reference to group 2 and not an octal escape
subst = r"\2 "
# You can limit the number of replacements with the count argument (0 replaces all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)
# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
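If a list is wanted rather than a substituted string, the same expression can probably be reused to collect group 2 directly. This is only a sketch; the list comprehension with `re.finditer` is my own addition, not part of the generated test code:

```python
import re

regex = r"([\w\W]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\w\W]+)"

test_str = ("SSH Enabled - version 2.0\n"
            "Authentication methods:publickey,keyboard-interactive,password\n"
            "Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
            "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
            "Authentication timeout: 120 secs; Authentication retries: 3\n"
            "Minimum expected Diffie Hellman key size : 1024 bits\n"
            "IOS Keys in SECSH format(ssh-rsa, base64 encoded):\n")

# Group 2 is None for the matches that consume the text before and after the
# list (groups 1 and 3), so those matches are filtered out.
algorithms = [m.group(2) for m in re.finditer(regex, test_str) if m.group(2)]
print(algorithms)
```

The first and third alternatives only exist to swallow the surrounding text, so that the second alternative cannot match anything outside the list.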
Demo
const regex = /(.+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC.+)/gm;
const str = `SSH Enabled - version 2.0 Authentication methods:publickey,keyboard-interactive,password Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc MAC Algorithms:hmac-sha1,hmac-sha1-96 Authentication timeout: 120 secs; Authentication retries: 3 Minimum expected Diffie Hellman key size : 1024 bits IOS Keys in SECSH format(ssh-rsa, base64 encoded):`;
const subst = `$2 `;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
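Another option is to match a string that starts with Encryption Algorithms: and then capture in a group a repeating pattern that matches the hyphenated parts, with each repetition preceded by a comma.
If there is a match, you can split the first capturing group on the comma.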
^Encryption Algorithms:(\w+-\w+(?:,\w+-\w+)*)
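Explanation
^
  Start of the string (or of each line when re.MULTILINE is set)
Encryption Algorithms:
  Match literally
(
  Start capturing group
\w+-\w+
  Match 1+ word characters, a -, and 1+ word characters
(?:,\w+-\w+)*
  Repeat 0+ times a comma followed by 1+ word characters, a -, and 1+ word characters
)
  Close capturing group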
import re
regex = r"^Encryption Algorithms:(\w+-\w+(?:,\w+-\w+)*)"
test_str = ("SSH Enabled - version 2.0\n"
"Authentication methods:publickey,keyboard-interactive,password\n"
"Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
"MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
"Authentication timeout: 120 secs; Authentication retries: 3\n"
"Minimum expected Diffie Hellman key size : 1024 bits\n"
"IOS Keys in SECSH format(ssh-rsa, base64 encoded):")
matches = re.search(regex, test_str, re.MULTILINE)
if matches:
    print(matches.group(1).split(","))
Result:
['aes128-ctr', 'aes192-ctr', 'aes256-ctr', 'aes128-cbc', '3des-cbc', 'aes192-cbc', 'aes256-cbc']
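Note that `\w+-\w+` expects exactly one hyphen per item, so it would cut off a value such as hmac-sha1-96 and would not match publickey at all. If the other lists in the output need parsing too, a looser character class might be worth trying. The sketch below is an assumption on my part: the helper name `parse_list` and the `[\w-]+` class are not from the answer above.

```python
import re


def parse_list(text, label):
    """Return the comma separated items that follow 'label:' at the start of a line."""
    # [\w-]+ accepts items with zero, one, or several hyphens
    pattern = r"^" + re.escape(label) + r":([\w-]+(?:,[\w-]+)*)"
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1).split(",") if match else []


sample = (
    "SSH Enabled - version 2.0\n"
    "Authentication methods:publickey,keyboard-interactive,password\n"
    "Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
    "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
)

print(parse_list(sample, "MAC Algorithms"))
print(parse_list(sample, "Authentication methods"))
```

The trade-off is that `[\w-]+` is less strict, so it would also accept items that start or end with a hyphen.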