How to extract the related fields for base, version and others from this url?

Question

I have a URL:

http://example.com/embed/comments/?base=default&version=17f88c4&f=fir&t_i=article_25&t_u=http%3A%2F%2Fwww.firstpost.com%2Fsports%2Fkotla-test-22.html%09&t_e=Kotla%20Test20struggle&t_d=Kotla%20Test20struggle&t_t=Kotla%20Test20struggle&s_o=default

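The query parameters are base, version, f, t_i, t_u, t_e, t_d, t_t, s_o.

Constraints:

base, version and f are required.
The others (t_i, t_u, t_e, t_d, t_t, s_o) are optional, i.e. some are sometimes present and sometimes not.

I need to find the right regex. I read up on regular expressions and came up with this: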

r'^embed/comments/?base=(\w+)&version=(\w+)&f=\w+&t_i=\w+&t_u=.+&t_e=.+&t_d=.+&t_t=.+&s_o=\w+'

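I am using Django, so in urls.py the above should match, and it does.

Q.0. How do I extract the related fields for base, version and so on? Given the constraints, how should the regex be modified?

For example, to capture the forum, the regex below is used. I searched for more than two hours but could not find out what ?P<forum> does.

Q.1. What does ?P<forum> mean?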

r'^forum/(?P<forum>.+)/$'

P.S. I am new to regex, so please be patient and explain in simple terms. Many thanks.

Answer 1

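Q.0: They are query parameters, so you do not have to put them in the URL regex. Django matches URL patterns against the path only, not the query string. Instead, check in the view whether any required query parameter is missing.

Here is a complete example.

In your urls.py file, use this regex: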

url(r'^embed/comments/$', views.your_view),

Then in your views.py file:

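from django.http import HttpResponse, HttpResponseBadRequest

def your_view(request):
    # read the query parameters; .get() returns None when a key is absent
    base = request.GET.get('base')
    version = request.GET.get('version')
    f = request.GET.get('f')
    # reject the request if a required parameter is missing or empty
    if not (base and version and f):
        return HttpResponseBadRequest('base, version and f are required')
    # do what you want with the parameters here
    return HttpResponse('ok')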
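
The same idea can be tried outside Django. The sketch below uses Python's standard urllib.parse module to split the URL and read the query parameters, with a shortened version of the URL from the question. The helper name extract_fields and the REQUIRED tuple are made up for illustration; they are not part of Django or of this answer.

```python
from urllib.parse import urlsplit, parse_qs

REQUIRED = ('base', 'version', 'f')  # hypothetical name for the required keys


def extract_fields(url):
    """Return the query parameters of url as a dict of decoded strings."""
    query = urlsplit(url).query   # everything after the '?'
    parsed = parse_qs(query)      # maps each key to a list of decoded values
    # keep the first value of each key (an assumption: keys are not repeated)
    fields = {key: values[0] for key, values in parsed.items()}
    missing = [key for key in REQUIRED if key not in fields]
    if missing:
        raise ValueError('missing required parameters: ' + ', '.join(missing))
    return fields


url = ('http://example.com/embed/comments/'
       '?base=default&version=17f88c4&f=fir&t_i=article_25&s_o=default')
fields = extract_fields(url)
print(fields['base'], fields['version'], fields['f'])  # default 17f88c4 fir
print(fields.get('t_e'))  # None, because this optional parameter is absent
```

Note that parse_qs also percent-decodes the values, so a parameter such as t_u comes back as a readable URL rather than with %3A and %2F sequences.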

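Q.1: It is a named group. In Django, this syntax passes the captured text to your view as a keyword argument with that name.

For example, if the user visits forum/hello-n00b/, then in your view: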

def example(request, forum):
    # forum equals 'hello-n00b'
    ...
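
Named groups are a feature of Python's re module, not something specific to Django, so they can be tried in a plain Python shell. A minimal sketch using the forum pattern from the question (the sample path is only an illustration):

```python
import re

pattern = re.compile(r'^forum/(?P<forum>.+)/$')

match = pattern.match('forum/hello-n00b/')
print(match.group('forum'))  # hello-n00b, looked up by name
print(match.group(1))        # hello-n00b, the same group looked up by position
print(match.groupdict())     # {'forum': 'hello-n00b'}
```

The (?P<forum>...) part means "capture whatever matches here and label it forum". Django takes each named group and passes it to the view as a keyword argument, which is why the view signature has a forum parameter.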

Answer 2

Using named groups, I would do it like this:

import re

anyChar = "[^&]" # pattern to match any character inside a parameter value
necessary = ['base', 'version', 'f'] # list of necessary segments
others = ['t_i', 't_u', 't_e', 't_d', 't_t', 's_o'] # list of other allowed segments
pattern = r"^embed/comments/\?" # start of pattern (raw string, "?" escaped)

# wrap the value of each necessary segment in a named group
necessaryPatterns = [name + "=(?P<" + name + ">" + anyChar + "+)" for name in necessary]
# wrap the value of each other segment in a named group
othersPatterns = [name + "=(?P<" + name + ">" + anyChar + "+)" for name in others]


pattern += "&".join(necessaryPatterns) # append all necessary segments separated by "&"
pattern += "(?:&" # open the first optional non-capturing group
pattern += ")?(?:&".join(othersPatterns) # close one optional group and open the next
pattern += ")?$" # close the last optional group; $ matches end of string

regex = re.compile(pattern) # compile pattern
url = "your_super_long_url" # path and query string to match, starting at "embed/comments/?"
match = regex.match(url) # run match operation
if match: # check if it matched
    base = match.group('base') # access matched named groups easily
    version = match.group('version')
    # ... and so on for the other groups

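This example may contain errors, but it should give you the basic idea. Note that the pattern expects the parameters in the listed order. The segment names should be defined as constants, and wrapping a name into a named group could be done by a function, but my current Python skills do not allow me to write a complete class in a reasonable time.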
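
The refactoring suggested above, with the names as constants and a function that wraps each name in a named group, could look roughly like this. It is only a sketch: named_group, build_pattern, REQUIRED and OPTIONAL are hypothetical names, the last optional field is assumed to be t_t, and the parameters are still expected in a fixed order.

```python
import re

REQUIRED = ('base', 'version', 'f')
OPTIONAL = ('t_i', 't_u', 't_e', 't_d', 't_t', 's_o')


def named_group(name):
    """Return a fragment matching 'name=value' that captures value as name."""
    return '{0}=(?P<{0}>[^&]+)'.format(name)


def build_pattern(prefix, required, optional):
    """Build a regex: required fields in order, then optional ones in order."""
    parts = [re.escape(prefix), r'\?']
    parts.append('&'.join(named_group(name) for name in required))
    # each optional field sits in its own non-capturing group marked with '?'
    parts.extend('(?:&' + named_group(name) + ')?' for name in optional)
    return re.compile('^' + ''.join(parts) + '$')


regex = build_pattern('embed/comments/', REQUIRED, OPTIONAL)
match = regex.match('embed/comments/?base=default&version=17f88c4&f=fir&s_o=default')
# groups for optional fields that did not appear are None, so drop them
fields = {k: v for k, v in match.groupdict().items() if v is not None}
print(fields)
# {'base': 'default', 'version': '17f88c4', 'f': 'fir', 's_o': 'default'}
```

Unlike a query-string parser, this approach leaves values percent-encoded and fails to match as soon as the parameters arrive in a different order, so it is best treated as an exercise in named groups rather than a robust parser.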

Tags: python, regex, django, django-urls