Unable to maintain code structure when removing comments
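I am trying to strip comments of every kind (single-line, inline, and multi-line). The initial regex works very well as long as // and /* */ do not appear inside quotes of any kind, "" or """""". When I modified the regex slightly to handle and exclude occurrences of // inside quotes, it failed and messed up the original code structure.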
Here is my initial regex (Regex:1): (?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|(?://.*)
Here is my adjusted regex, which tries to handle single-line comments inside quotes (Regex:2): (?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|[^\"](?://.*)[^\"]
Consider this sample data:
// Comment 1
/* Multiline comments
ends here */ Some text
Random statement // something else
import something..
import something else /* few random stuff
that goes on */ /* Lets try this again */
Text to show
val tryThis = " something // else "
val tryAgain = "12345"
val again = " /* kskokds // */ "
Actual result with Regex:1 =>
Some text
Random statement
import something..
import something else
Text to show
val tryThis = " something
val tryAgain = "12345"
val again = " "
Actual result with Regex:2 =>
// Comment 1
Some text
Random statementimport something..
import something else
Text to show
val tryThis = " somethingval tryAgain = "12345"
val again = " "
Expected result =>
Some text
Random statement
import something..
import something else
Text to show
val tryThis = " something // else "
val tryAgain = "12345"
val again = " /* kskokds // */ "
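The merged lines in the Regex:2 output can be reproduced in isolation. The sketch below assumes Python's re module as a stand-in for whichever regex engine is actually in use; the pattern is Regex:2 unchanged, and the sample string is one line pair from the data above. It suggests that each [^\"] consumes one real character: the space before // and the line break after the comment.

```python
import re

# Regex:2 from the question, unchanged.
REGEX_2 = re.compile(r'(?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|[^\"](?://.*)[^\"]')

sample = "Random statement // something else\nimport something..\n"

match = REGEX_2.search(sample)
# The match begins at the space before `//` and ends after the newline,
# because each `[^\"]` matches one character outside the comment itself.
print(repr(match.group(0)))
# Removing the match therefore glues the two lines together.
print(repr(REGEX_2.sub("", sample)))
```

The same leading [^\"] would also explain why // Comment 1 survives: at the very start of the input there is no preceding character for it to match.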
I am the first to post an answer that links to this famous question:
RegEx match open tags except XHTML self-contained tags
The serious answer there is:
I think the flaw here is that HTML is a Chomsky Type 2 grammar
(context free grammar) and RegEx is a Chomsky Type 3 grammar (regular
grammar). Since a Type 2 grammar is fundamentally more complex than a
Type 3 grammar (see the Chomsky hierarchy), it is mathematically
impossible to parse XML with RegEx.
The grammar for Java comments is not a context-free grammar either, so everything said there about parsing HTML applies here as well.
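One way to act on this is to walk the text once, keeping track of whether the current position is inside a string literal or a comment, instead of relying on a single regex. What follows is only a minimal sketch under stated assumptions: strip_comments is a hypothetical name, block comments are assumed not to nest, strings are assumed to be either "..." with backslash escapes or """...""", and character literals are not handled.

```python
def strip_comments(source: str) -> str:
    """Remove // and /* */ comments while leaving string literals untouched."""
    out = []
    i, n = 0, len(source)
    while i < n:
        if source.startswith('"""', i):
            # Triple-quoted string: copy verbatim up to the closing """.
            end = source.find('"""', i + 3)
            end = n if end == -1 else end + 3
            out.append(source[i:end])
            i = end
        elif source[i] == '"':
            # Ordinary string: copy up to the closing quote, skipping escapes.
            j = i + 1
            while j < n and source[j] not in '"\n':
                j += 2 if source[j] == '\\' else 1
            if j < n and source[j] == '"':
                j += 1
            j = min(j, n)
            out.append(source[i:j])
            i = j
        elif source.startswith('//', i):
            # Line comment: drop it but keep the newline that ends the line.
            end = source.find('\n', i)
            i = n if end == -1 else end
        elif source.startswith('/*', i):
            # Block comment (assumed non-nested): drop through the closing */.
            end = source.find('*/', i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(source[i])
            i += 1
    return ''.join(out)


print(strip_comments('val tryThis = " something // else " // real comment'))
```

Whitespace that surrounded a removed comment is left in place, so trailing spaces or blank lines may still need trimming afterwards.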