获取 'package' 和 'endpackage' 可选字符串之外的结构名称列表
Get list of struct names that are outside of 'package' and 'endpackage' optional strings
我正在尝试获取 package
和 endpackage
可选字符串之外的结构名称。
如果没有 package
和 endpackage
字符串,那么脚本应该 return 所有结构名称。
这是我的脚本:
import re
a = """
package new;
typedef struct packed
{
logic a;
logic b;
} abc_y;
typedef struct packed
{
logic a;
logic b;
} abc_t;
endpackage
typedef struct packed
{
logic a;
logic b;
} abc_x;
"""
print(re.findall(r'(?!package)*.*?typedef\s+struct\s+packed\s*{.*?}\s*(\w+);.*?(?!endpackage)*', a, re.MULTILINE|re.DOTALL))
这是输出:
['abc_y', 'abc_t', 'abc_x']
预期输出:
['abc_x']
我在正则表达式中遗漏了一些东西,但无法弄清楚是什么。有人可以帮我解决这个问题吗?提前致谢。
使用
\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);
解释
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
package 'package'
--------------------------------------------------------------------------------
.*? any character except \n (0 or more times
(matching the least amount possible))
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
endpackage 'endpackage'
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
typedef 'typedef'
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
struct 'struct'
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
packed 'packed'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
{ '{'
--------------------------------------------------------------------------------
[^{}]* any character except: '{', '}' (0 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
} '}'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
( group and capture to :
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of
--------------------------------------------------------------------------------
; ';'
print(list(filter(None,re.findall(r'\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);', a, re.DOTALL))))
结果:['abc_x']
我正在尝试获取 package
和 endpackage
可选字符串之外的结构名称。
如果没有 package
和 endpackage
字符串,那么脚本应该 return 所有结构名称。
这是我的脚本:
import re
a = """
package new;
typedef struct packed
{
logic a;
logic b;
} abc_y;
typedef struct packed
{
logic a;
logic b;
} abc_t;
endpackage
typedef struct packed
{
logic a;
logic b;
} abc_x;
"""
print(re.findall(r'(?!package)*.*?typedef\s+struct\s+packed\s*{.*?}\s*(\w+);.*?(?!endpackage)*', a, re.MULTILINE|re.DOTALL))
这是输出:
['abc_y', 'abc_t', 'abc_x']
预期输出:
['abc_x']
我在正则表达式中遗漏了一些东西,但无法弄清楚是什么。有人可以帮我解决这个问题吗?提前致谢。
使用
\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);
解释
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
package 'package'
--------------------------------------------------------------------------------
.*? any character except \n (0 or more times
(matching the least amount possible))
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
endpackage 'endpackage'
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
typedef 'typedef'
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
struct 'struct'
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
packed 'packed'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
{ '{'
--------------------------------------------------------------------------------
[^{}]* any character except: '{', '}' (0 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
} '}'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
( group and capture to :
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of
--------------------------------------------------------------------------------
; ';'
print(list(filter(None,re.findall(r'\bpackage.*?\bendpackage\b|typedef\s+struct\s+packed\s*{[^{}]*}\s*(\w+);', a, re.DOTALL))))
结果:['abc_x']