Regex for python request
Hello, I'm looking for a way to write a function that returns a list of dictionaries with the following structure.
Example:
example_dict = {"host":"146.204.224.152",
                "user_name":"feest6811",
                "time":"21/Jun/2019:15:45:24 -0700",
                "request":"POST /incentivize HTTP/1.1"}
The data looks like this:
146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701
*(more entries follow...)*
My function looks like this:
import re
def logs():
    with open("assets/logdata.txt", "r") as file:
        logdata = file.read()
    pattern="""
    (?P<host>.[\d.]*\s?) #host
    (?P<user_name>[\s\w-]*\s?) #user_name
    (?P<time>[\w\/\:\.\[\s-]*[\]\s]) #time
    (?P<request>[\w\/\"\s.]*"?) #request"""
    group=[]
    for item in re.finditer(pattern,logdata,re.VERBOSE):
        group.append(item.groupdict())
    return group
    raise NotImplementedError()
And it returns something like this:
[{'host': '146.204.224.152 ',
'user_name': '- feest6811 ',
'time': '[21/Jun/2019:15:45:24 -0700]',
'request': ' "POST /incentivize HTTP/1.1" 302 4622\n197.109.77.178 '},
{'host': '- ',
'user_name': 'kertzmann3129 ',
'time': '[21/Jun/2019:15:45:25 -0700]',
'request': ' "DELETE /virtual/solutions/target/web'},
{'host': '+',
'user_name': 'services',
'time': ' ',
'request': 'HTTP/2.0" 203 26554\n156.127.178.177 '}]
What can I change to fix this?
You could try this regular expression:
(?P<host>[\d.]+)(?:\s*-\s*)(?P<user_name>\w+)(?:\s*\[)(?P<time>.*?)(?:\])(?:\s*)(?P<request>\".*?\")
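Here is a minimal sketch of how that pattern could be applied, assuming the log text is already loaded into a string (the `sample` and `entries` names are only for illustration). As written, the `request` group keeps the surrounding double quotes, so the sketch strips them to match the desired output. Also note that `\w+` will not match a user name logged as `-`, so such lines would be skipped.

```python
import re

# Pattern from the answer above, unchanged.
pattern = r'(?P<host>[\d.]+)(?:\s*-\s*)(?P<user_name>\w+)(?:\s*\[)(?P<time>.*?)(?:\])(?:\s*)(?P<request>\".*?\")'

# Two lines copied from the question's sample data (illustrative input).
sample = (
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] '
    '"POST /incentivize HTTP/1.1" 302 4622\n'
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] '
    '"DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554\n'
)

entries = []
for match in re.finditer(pattern, sample):
    entry = match.groupdict()
    # The request group includes the literal quotes; drop them.
    entry["request"] = entry["request"].strip('"')
    entries.append(entry)

for entry in entries:
    print(entry)
```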
Try this:
pattern = r"""
(?P<host>\d{1,3}(?:\.\d{1,3}){3})\s-\s  # host (IPv4 only)
(?P<user_name>[\s\w-]*?)\s?             # user_name
\[(?P<time>[\w\/\:\.\s-]*)\]\s?         # time
"(?P<request>.*?)"\s?                   # request
(?P<code>\d{3})\s?                      # response code
(?P<bytes>\d+)\s?                       # bytes sent or received
"""