Trying to use regex in Python
Can anyone help me understand this code snippet, from http://garethrees.org/2007/05/07/python-challenge/ Level 2?
>>> import urllib
>>> def get_challenge(s):
...     return urllib.urlopen('http://www.pythonchallenge.com/pc/' + s).read()
...
>>> src = get_challenge('def/ocr.html')
>>> import re
>>> text = re.compile('<!--((?:[^-]+|-[^-]|--[^>])*)-->', re.S).findall(src)[-1]
>>> counts = {}
>>> for c in text: counts[c] = counts.get(c, 0) + 1
>>> counts
In this line:
re.compile('<!--((?:[^-]+|-[^-]|--[^>])*)-->', re.S).findall(src)[-1]
why do we have [-1] here, and what is it for? Is it there to convert the result to a list?
It is already a list. If you have a list myList, then myList[-1] returns the last element of that list.
Read this: https://docs.python.org/2/tutorial/introduction.html#lists
It already is a list. re.findall() returns a list of all matches. Have a look at the documentation:
re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of
strings. The string is scanned left-to-right, and matches are returned
in the order found. If one or more groups are present in the pattern,
return a list of groups; this will be a list of tuples if the pattern
has more than one group. Empty matches are included in the result
unless they touch the beginning of another match.
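The part of the documentation about groups matters here, because the pattern in the question contains exactly one capturing group. A minimal sketch of how the shape of the result changes with the number of groups, using a made-up sample string (the string and variable names are only for illustration):

```python
import re

sample = "a=1, b=2"  # hypothetical input, not from the challenge page

# No capturing group: each list item is the whole match.
no_group = re.findall(r"\w=\d", sample)

# One capturing group: each list item is just that group's text.
one_group = re.findall(r"\w=(\d)", sample)

# Two capturing groups: each list item is a tuple of the groups.
two_groups = re.findall(r"(\w)=(\d)", sample)

print(no_group)    # ['a=1', 'b=2']
print(one_group)   # ['1', '2']
print(two_groups)  # [('a', '1'), ('b', '2')]
```

In all three cases the return value is a list, so indexing it with [-1] is always valid as long as there is at least one match.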
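When you index the result with [-1], you access the first element counting from the end of the list, that is, the last element.
For example: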
>>> a = [1,2,3,4,5]
>>> a[-1]
5
And also:
>>> re.compile('.*?-').findall('-foo-bar-')[-1]
'bar-'
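To tie this back to the question, here is a rough sketch that applies the same comment pattern to a small hand-written string instead of the downloaded page. The contents of src below are invented for illustration; the real page is not reproduced here.

```python
import re

# Hypothetical stand-in for the page source: two HTML comments.
src = "<html><!-- first\ncomment --><body></body><!--\n%%$@_$^\n--></html>"

# Same pattern as in the question; the single group captures the comment body.
comment_re = re.compile(r'<!--((?:[^-]+|-[^-]|--[^>])*)-->', re.S)
comments = comment_re.findall(src)

# [-1] picks the last comment found, the same as index len(comments) - 1.
text = comments[-1]

# Count how often each character occurs, as in the original snippet.
counts = {}
for c in text:
    counts[c] = counts.get(c, 0) + 1

print(len(comments))  # 2
print(repr(text))     # '\n%%$@_$^\n'
print(counts['%'])    # 2
```

So in the original code, [-1] does not convert anything; it presumably selects the last comment on the page because that is the one holding the puzzle text.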