{Supplemental Symbols and Pictographs} not recognized by Scala
I am trying to identify the emoji in a sentence:
def extractEmojiFromSentence (sentence: Any) : Seq[String] = {
return raw"[\p{block=Emoticons}\p{block=Miscellaneous Symbols and Pictographs}\p{block=Supplemental Symbols and Pictographs}]".r.findAllIn(sentence.toString).toSeq
}
This gives the following error:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Unknown character block name {Supplemental Symbols and Pictographs}
near index 112 [\p{block=Emoticons}\p{block=Miscellaneous Symbols and
Pictographs}\p{block=Supplemental Symbols and Pictographs}]
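Do I have to import some library in my build.sbt? If not, what is causing the error above?
Update
I am now using the following code, as suggested in the comments: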
val x = raw"\p{block=Supplemental Symbols and Pictographs}".r.findAllIn(mySentence.toString).toSeq
But I get the following error:
Exception in thread "main" java.util.regex.PatternSyntaxException: Unknown character block name {Supplemental Symbols and Pictographs} near index 45
\p{block=Supplemental Symbols and Pictographs}
^
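It looks like the regex engine in your JVM version doesn't recognize that block name. (Mine doesn't either.) The block names that java.util.regex accepts come from the Unicode tables bundled with the JDK, not from any library you could add to build.sbt. The Supplemental Symbols and Pictographs block was introduced in Unicode 8.0, which, as far as I know, the JDK only supports from Java 9 onward.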
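To confirm that the JVM is the cause, you can ask it directly whether it knows a block name. The following is only a minimal sketch: the object name `EmojiBlocks` and the helper `blockSupported` are made up for illustration, and it assumes that a block name rejected by `Character.UnicodeBlock.forName` is also rejected in `\p{block=...}`.

```scala
import scala.util.Try

object EmojiBlocks {
  // True if the running JVM's Unicode tables contain a block with this name.
  // Character.UnicodeBlock.forName throws IllegalArgumentException otherwise.
  def blockSupported(name: String): Boolean =
    Try(Character.UnicodeBlock.forName(name)).isSuccess

  def main(args: Array[String]): Unit = {
    println(s"java.version = ${System.getProperty("java.version")}")
    Seq(
      "Emoticons",
      "Miscellaneous Symbols and Pictographs",
      "Supplemental Symbols and Pictographs"
    ).foreach(n => println(s"$n -> ${blockSupported(n)}"))
  }
}
```

If the last block is reported as unsupported, the options are a newer JDK or an explicit character range.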
You can supply the equivalent character range instead:
def extractEmojiFromSentence(sentence: String): Seq[String] =
  // Backslashes are doubled because these are ordinary (non-raw) string literals.
  ("[\\p{block=Emoticons}" +
   "\\p{block=Miscellaneous Symbols and Pictographs}" +
   "\uD83E\uDD00-\uD83E\uDDFF]") // Supplemental Symbols & Pictographs, U+1F900-U+1F9FF
    .r.findAllIn(sentence).toSeq
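If you would rather not depend on block names or surrogate pairs at all, the same idea can be written with code point escapes. This is a sketch rather than a complete emoji matcher: the object name `EmojiExtract` is made up for illustration, it covers only the three blocks from the question, and it relies on the `\x{h...h}` escape, which java.util.regex has accepted since Java 7.

```scala
object EmojiExtract {
  // Explicit code point ranges, so no block names are needed:
  //   U+1F600-U+1F64F  Emoticons
  //   U+1F300-U+1F5FF  Miscellaneous Symbols and Pictographs
  //   U+1F900-U+1F9FF  Supplemental Symbols and Pictographs
  private val emoji =
    """[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F900}-\x{1F9FF}]""".r

  // Each match is one code point, returned as a String holding a surrogate pair.
  def extractEmojiFromSentence(sentence: String): Seq[String] =
    emoji.findAllIn(sentence).toList
}
```

For example, `EmojiExtract.extractEmojiFromSentence("hi \uD83D\uDE00")` should return a one-element sequence holding U+1F600. Emoji outside these three blocks (for example in Dingbats or Transport and Map Symbols) are not matched.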