Matching free() and malloc() calls with regular expressions
I am creating a PowerShell script that parses files containing C code and detects whether they contain calls to the free(), malloc() or realloc() functions.
file_one.c
int MethodOne()
{
    return 1;
}
int MethodTwo()
{
    free();
    return 1;
}
file_two.c
int MethodOne()
{
    //free();
    return 1;
}
int MethodTwo()
{
    free();
    return 1;
}
check.ps1
$regex = "(^[^/]*free\()|(^[^/]*malloc\()|(^[^/]*realloc\()"
$file_one = "Z:\PATH\file_one.txt"
$file_two = "Z:\PATH\file_two.txt"
$contentOne = Get-Content $file_one -Raw
$contentOne -match $regex
$contentTwo = Get-Content $file_two -Raw
$contentTwo -match $regex
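Processing the whole file at once seems to work fine for contentOne:
I do get True (because of the free() in MethodTwo).
Processing contentTwo is not so lucky: it returns False instead of True
(because of the free() in MethodTwo).
Can someone help me write a better regex that works in both cases?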
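The behaviour described in the question can be reproduced without touching the file system. The following is a minimal sketch that inlines the two sample files as here-strings; the names $resultOne and $resultTwo are only illustrative. A likely reason for the False result is that, once the file is read with -Raw, ^ matches only at the very start of the string and [^/]* cannot step over the first / of the commented-out line, so the pattern never reaches the real call further down.

```powershell
# The pattern from the question, unchanged.
$regex = '(^[^/]*free\()|(^[^/]*malloc\()|(^[^/]*realloc\()'

# Inline stand-ins for what Get-Content -Raw would return.
$contentOne = @'
int MethodOne()
{
return 1;
}
int MethodTwo()
{
free();
return 1;
}
'@

$contentTwo = @'
int MethodOne()
{
//free();
return 1;
}
int MethodTwo()
{
free();
return 1;
}
'@

# No '/' precedes the call in the first sample, so [^/]* can reach it.
$resultOne = $contentOne -match $regex

# The '//' of the comment stops [^/]* before the real free( is reached.
$resultTwo = $contentTwo -match $regex

"file_one: $resultOne, file_two: $resultTwo"
```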
OK, here it is.
Raw:
^(?>(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n))|(?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?!\b(?:free|malloc|realloc)\()[\S\s](?:(?!\b(?:free|malloc|realloc)\()[^/"'\\])*))*(?:(\bfree\()|(\bmalloc\()|(\brealloc\())
Escaped string:
"^(?>(?:/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/|//(?:[^\\\\]|\\\\(?:\\r?\\n)?)*?(?:\\r?\\n))|(?:\"[^\"\\\\]*(?:\\\\[\\S\\s][^\"\\\\]*)*\"|'[^'\\\\]*(?:\\\\[\\S\\s][^'\\\\]*)*'|(?!\\b(?:free|malloc|realloc)\\()[\\S\\s](?:(?!\\b(?:free|malloc|realloc)\\()[^/\"'\\\\])*))*(?:(\\bfree\\()|(\\bmalloc\\()|(\\brealloc\\())"
Verbatim string:
@"^(?>(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n))|(?:""[^""\\]*(?:\\[\S\s][^""\\]*)*""|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?!\b(?:free|malloc|realloc)\()[\S\s](?:(?!\b(?:free|malloc|realloc)\()[^/""'\\])*))*(?:(\bfree\()|(\bmalloc\()|(\brealloc\())"
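The escaped and verbatim forms above read like C#-style string literals. In a PowerShell script, one way to carry the raw pattern unchanged is a single-quoted here-string, because nothing inside it is escaped or expanded. The sketch below assumes that approach and inlines the content of file_two.c instead of reading it from disk; $found is an illustrative name. It uses -cmatch rather than -match, since -match ignores case while C identifiers are case-sensitive.

```powershell
# Raw pattern, stored verbatim in a single-quoted here-string.
$regex = @'
^(?>(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n))|(?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?!\b(?:free|malloc|realloc)\()[\S\s](?:(?!\b(?:free|malloc|realloc)\()[^/"'\\])*))*(?:(\bfree\()|(\bmalloc\()|(\brealloc\())
'@

# Inline stand-in for: Get-Content $file_two -Raw
$contentTwo = @'
int MethodOne()
{
//free();
return 1;
}
int MethodTwo()
{
free();
return 1;
}
'@

# -cmatch is the case-sensitive variant of -match.
$found = $contentTwo -cmatch $regex

# The commented-out call is skipped; group 1 holds the real free( call.
"found: $found, group 1: $($Matches[1])"
```

With a real file, the here-string for $contentTwo would be replaced by the Get-Content ... -Raw call from the question.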
Explained:
^
(?>
     (?:                                # Comments
          /\*                           # Start /* .. */ comment
          [^*]* \*+
          (?: [^/*] [^*]* \*+ )*
          /                             # End /* .. */ comment
       |
          //                            # Start // comment
          (?:                           # Possible line-continuation
               [^\\]
            |  \\
               (?: \r? \n )?
          )*?
          (?: \r? \n )                  # End // comment
     )
  |                                     # OR,
     (?:                                # Non - comments
          "
          [^"\\]*                       # Double quoted text
          (?: \\ [\S\s] [^"\\]* )*
          "
       |  '
          [^'\\]*                       # Single quoted text
          (?: \\ [\S\s] [^'\\]* )*
          '
       |                                # OR,
          (?!                           # ASSERT: Here, cannot be free( / malloc( / realloc(
               \b
               (?: free | malloc | realloc )
               \(
          )
          [\S\s]                        # Any char which could start a comment, string, etc..
                                        # (Technically, we're going past a C++ source code error)
          (?:                           # -------------------------
               (?!                      # ASSERT: Here, cannot be free( / malloc( / realloc(
                    \b
                    (?: free | malloc | realloc )
                    \(
               )
               [^/"'\\]                 # Char which doesn't start a comment, string, escape,
                                        # or line continuation (escape + newline)
          )*                            # -------------------------
     )                                  # Done Non - comments
)*
(?:
     ( \b free\( )                      # (1), Free()
  |
     ( \b malloc\( )                    # (2), Malloc()
  |
     ( \b realloc\( )                   # (3), Realloc()
)
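A layout like the one above can be handed to the .NET regex engine as written, using the IgnorePatternWhitespace option, which ignores unescaped whitespace and treats # as the start of a comment. As a small sketch, the snippet below does this for the double-quoted-text branch only; the sample text and the variable names are made up for illustration.

```powershell
# Double-quoted-text branch, written in free-spacing form.
$stringPattern = @'
"                           # opening quote
[^"\\]*                     # run of characters that are neither quote nor backslash
(?: \\ [\S\s] [^"\\]* )*    # a backslash escape, then more ordinary characters
"                           # closing quote
'@

$options = [System.Text.RegularExpressions.RegexOptions]::IgnorePatternWhitespace

# Hypothetical C line: the string literal contains an escaped quote and the text free(
$text = 'puts("call \"free(\" later"); free(p);'

$m = [regex]::Match($text, $stringPattern, $options)

# The whole literal is consumed as one unit, escaped quotes included.
$m.Value
```

Because the literal is swallowed in one piece, the free( inside it is never offered to the capture groups of the full pattern.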
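Some notes:
Because of the ^ anchor, this only finds the first call, scanning from the beginning of the string.
To find them all, remove the ^ from the regex. Replacing it with \G is the safer variant: \G makes each match start exactly where the previous one ended, so a later scan cannot begin inside a comment or string.
This works because the regex matches everything up to what you are looking for.
In this case, what it finds ends up in capture group 1, 2 or 3.
Good luck!!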
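To list every call rather than only the first, the pattern can be run through [regex]::Matches. The sketch below swaps ^ for \G, which is my substitution and not part of the original pattern: with no anchor at all, a failed scan can restart in the middle of a comment and report a commented-out call. The C snippet in $source and the variable names are made up for illustration.

```powershell
# Same pattern as above, with \G in place of ^ so that each match
# must begin exactly where the previous one ended.
$pattern = @'
\G(?>(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n))|(?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?!\b(?:free|malloc|realloc)\()[\S\s](?:(?!\b(?:free|malloc|realloc)\()[^/"'\\])*))*(?:(\bfree\()|(\bmalloc\()|(\brealloc\())
'@

# Hypothetical C source mixing real calls with look-alikes.
$source = @'
char *p = malloc(10);   /* malloc( inside a comment */
// free(p); commented out
puts("realloc( in a string");
p = realloc(p, 20);
free(p);
'@

# Translate each match into the name of the function whose group succeeded.
$calls = foreach ($m in [regex]::Matches($source, $pattern)) {
    if ($m.Groups[1].Success) { 'free' }
    elseif ($m.Groups[2].Success) { 'malloc' }
    else { 'realloc' }
}

# Only the three real calls are reported, in source order.
$calls -join ', '
```

Each match spans everything from the end of the previous match up to and including the call, so the position of the call itself is the index of the successful capture group, not of the match.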
What the regex contains:
----------------------------------
* Format Metrics
----------------------------------
Atomic Groups = 1
Cluster Groups = 10
Capture Groups = 3
Assertions = 2
(?! = 2
Free Comments = 25
Character Classes = 12
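Edit
As requested, here is an explanation of the part of the regex that handles /**/ comments. This -> /\*[^*]*\*+(?:[^/*][^*]*\*+)*/
This is a modified unrolled-loop regex that assumes an opening delimiter of /* and a closing delimiter of */.
Note that the opening and closing delimiters share a common character, /, in their sequences.
To handle this without lookaround assertions, the asterisk of the trailing delimiter is shifted inside the loop.
With this factoring, all that remains is to check for a closing / to complete the delimiter sequence.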
/\*             # Opening delimiter /*
[^*]*           # Optionally, consume all non-asterisks
\*+             # This must be 1 or more asterisks (the anchor), or FAIL.
                # It is matched here to align with the optional loop below,
                # because the loop is looking for the closing /.
(?:             # The optional loop part
     [^/*]      # Specifically a single non-/ character (nor an asterisk).
                # Since a / here would be the closing delimiter, it must be excluded.
     [^*]*      # Optional non-asterisks.
                # This will accept a / because it is supposed to consume ALL
                # opening delimiters as it goes
                # and will consider the very next */ as a close.
     \*+        # This must be 1 or more asterisks (the anchor), or FAIL.
)*              # Repeat 0 to many times.
/               # Closing delimiter /
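The block-comment sub-pattern can be tried on its own. The following sketch runs it against a few made-up samples (the sample strings and variable names are illustrative only) to show that it accepts an empty comment, tolerates stray asterisks and slashes inside the body, and stops at the first closing */.

```powershell
# The /* .. */ sub-pattern from the full regex.
$blockComment = '/\*[^*]*\*+(?:[^/*][^*]*\*+)*/'

$samples = @(
    '/**/',                              # empty comment
    '/* free(p); */',                    # call hidden inside a comment
    '/* a * b / c **/',                  # asterisks and a slash inside the body
    '/* first */ x = 1; /* second */'    # two comments on one line
)

# Collect the matched text for each sample.
$results = foreach ($s in $samples) {
    [regex]::Match($s, $blockComment).Value
}

# The first three samples match in full; the last yields only '/* first */'.
$results
```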