How to split a string on multiple regular expressions while keeping the splitting characters
I'm writing a lexer/scanner for the first time and have run into a problem splitting the input string.
Example:
val result = "func add(Num x, Num y) = x+y;".split(???)
result == Array("func", "add", "(", "Num", "x", ",", "Num", "y", ")", "=", "x", "+", "y", ";")
The problem is that I can't simply split on whitespace, because that would not, for example, separate add from (.
Any help?
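This will give you a bunch of empty items that you will have to handle, but adding the word boundary, \b, to the pattern should do it.
I.e. ...split('\s|\b') (or /\s|\b/). In Scala the pattern is written as a string: split("\\s|\\b").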
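A minimal sketch of this suggestion in Scala, assuming Java 8 or later regex semantics. The names SplitOnBoundaries and tokenize are made up for illustration. It splits on whitespace or a word boundary and then drops the empty items:

```scala
object SplitOnBoundaries {
  // Split on a whitespace character or a zero-width word boundary,
  // then discard the empty strings produced where a space is followed by a boundary.
  def tokenize(input: String): List[String] =
    input.split("""\s|\b""").filter(_.nonEmpty).toList

  def main(args: Array[String]): Unit = {
    // prints: func add ( Num x , Num y ) = x + y ;
    println(tokenize("func add(Num x, Num y) = x+y;").mkString(" "))
  }
}
```

Note that \b only matches between a word character and a non-word character, so adjacent symbols such as () or == would stay together as a single item with this pattern.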
Regards
Look into http://www.scala-lang.org/api/rc/index.html#scala.util.parsing.combinator.RegexParsers
Here is an unfinished example:
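import scala.util.parsing.combinator.RegexParsers

sealed trait Element
case class Function(name: String,
                    params: Map[String, String],
                    expression: Seq[String]) extends Element
case class Class(name: String,
                 params: Map[String, String],
                 body: Seq[String]) extends Element

object LanguageParser extends RegexParsers {
  // Assumption: names and types are plain identifiers (".*" would swallow the whole input).
  val name: Parser[String] = """[A-Za-z_]\w*""".r
  val `type`: Parser[String] = name

  // "(Num x, Num y)" becomes Map("x" -> "Num", "y" -> "Num"); keyed by parameter name
  // so that two parameters of the same type do not overwrite each other.
  val parameters: Parser[Map[String, String]] =
    "(" ~> repsep(`type` ~ name, ",") <~ ")" ^^ { ps =>
      ps.map { case t ~ n => n -> t }.toMap
    }

  // Placeholder (assumption): an expression is a flat run of identifiers, numbers and operators.
  val expression: Parser[Seq[String]] = rep1(name | """\d+""".r | """[-+*/]""".r)

  val function: Parser[Function] =
    "func " ~> name ~ parameters ~ "=" ~ expression ^^ {
      case name ~ params ~ _ ~ expr => Function(name, params, expr)
    }

  val `class`: Parser[Class] =
    "class " ~> name ~ parameters ~ "{" ~ expression ~ "}" ^^ {
      case name ~ params ~ _ ~ expr ~ _ => Class(name, params, expr)
    }

  // The value and ifelse parsers are still to be written; add them here as further alternatives.
  val topLevelParsers: Parser[Element] = function | `class`

  // Parses a single statement (without its trailing ";").
  def parse(s: String): Element = parseAll(topLevelParsers, s.trim) getOrElse
    (throw new IllegalArgumentException("Could not parse the given string: " + s.trim))

  // Renamed from parseAll so that it does not overload RegexParsers.parseAll.
  def parseProgram(s: String): Seq[Element] =
    s.split(";").map(_.trim).filter(_.nonEmpty).map(stmt => parse(stmt)).toList
}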
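For the tokenizing step on its own, a rough sketch with RegexParsers might look like the following. It assumes the scala-parser-combinators module is on the classpath, and the names Tokenizer, word, symbol and tokenize are hypothetical, chosen only for this illustration. A token is taken to be either a run of word characters or a single non-word, non-space character:

```scala
import scala.util.parsing.combinator.RegexParsers

object Tokenizer extends RegexParsers {
  // Assumption: a word is any run of letters, digits or underscores.
  val word: Parser[String] = """\w+""".r
  // Assumption: every other non-whitespace character is a token by itself.
  val symbol: Parser[String] = """[^\w\s]""".r
  // RegexParsers skips whitespace between tokens by default.
  val tokens: Parser[List[String]] = rep(word | symbol)

  def tokenize(input: String): List[String] =
    parseAll(tokens, input) match {
      case Success(result, _) => result
      case failure: NoSuccess =>
        throw new IllegalArgumentException("Could not tokenize: " + failure.msg)
    }
}
```

With this sketch, Tokenizer.tokenize("func add(Num x, Num y) = x+y;") should yield the fourteen tokens from the question, and adjacent symbols such as () come out as separate tokens. Multi-character operators would need their own alternative placed before symbol.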
Cheers