Regex for finding the gene product in text
What regex should I use to match text like this:
/product="hypothetical protein"
So far I have tried this pattern:
x = re.match(r"^s*\=product(.*)",line)
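Use
import re
test_str = ' /product="hypothetical protein"'
# re.search scans the whole string, so the leading space and "/" do not matter
match = re.search(r'product="([^"]+)"', test_str)
if match:
    print(match.group(1))  # the text between the quotes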
See the regex proof.
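For comparison, here is a minimal sketch of why the pattern from the question returns no match while the one above does. The variable names `line`, `attempt`, and `fixed` are illustrative only, and the sample line is the one from the question.

```python
import re

line = ' /product="hypothetical protein"'

# Pattern from the question: "s*" is a literal letter s (whitespace would be
# "\s*"), and "\=product" expects "=" BEFORE "product", which never occurs here.
# re.match also only tries to match at the very start of the string.
attempt = re.match(r"^s*\=product(.*)", line)
print(attempt)

# re.search looks for the pattern anywhere in the string, and the capture
# group keeps only the text between the double quotes.
fixed = re.search(r'product="([^"]+)"', line)
if fixed:
    print(fixed.group(1))
```

The first call fails because the string starts with a space and a slash, neither of which the attempted pattern allows before `=product`.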
Explanation
--------------------------------------------------------------------------------
  product="                'product="'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^"]+                    any character except: '"' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  "                        '"'
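If the input holds more than one such entry, the same pattern can be applied to the whole text at once. The following is only a sketch under assumptions: the sample text is made up and loosely modelled on GenBank-style feature qualifiers, and `find_products` is a hypothetical helper name, not something from the question.

```python
import re

# Hypothetical sample text with two /product qualifiers (made up for illustration).
sample = '''
     CDS             1..300
                     /product="hypothetical protein"
     CDS             400..900
                     /product="DNA polymerase III subunit beta"
'''

# Same pattern as in the answer, compiled once for reuse.
PRODUCT_RE = re.compile(r'product="([^"]+)"')

def find_products(text):
    """Return every captured product name, in order of appearance."""
    # With a single capture group, findall returns a list of the group's strings.
    return PRODUCT_RE.findall(text)

print(find_products(sample))
```

Because `[^"]` also matches newline characters, a quoted value that wraps over several lines would still be captured, with the line break and indentation left inside the result.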