How do I get a group from a partially failing regex?
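I want to pass some text to a class and have it return a number and a grade extracted from that text.
If no grade is given, the class should return the number and a grade of 0.0.
My problem is that I am extracting this data with a regex (using groups).
As soon as the pattern does not match the whole text, I can no longer retrieve the number.
An example input without a grade would be: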
2456272 Max Mustermann 20.02.1968
2456272 would be the number
A grade would be at the very end of the input
My code so far:
static final String REGEX = "^(?<studentnumber>\\d{7})" // start with the student number
        + ".*" // anything in between
        + "\\s+" // separated by at least one whitespace character
        + "(?<grade>10(\\.0)?|\\d([.,]\\d)?)" // 10 or 10.0, or one digit optionally followed by a comma or period and one more digit
        + "$"; // and nothing else
private static final Pattern PATTERN = Pattern.compile(REGEX);
private final Matcher matcher;
/**
 * Construct a GradeCapture from a string (line).
 *
 * @param line to read
 */
public GradeCapture(String line) {
    matcher = PATTERN.matcher(line);
    matcher.matches();
}
/**
 * Create a tuple. Use AbstractMap.SimpleEntry as implementing class.
 * @return the tuple.
 */
public AbstractMap.SimpleEntry<Integer, Double> getResult() {
    if (hasResult()) {
        return new AbstractMap.SimpleEntry<>(studentId(), grade());
    }
    return new AbstractMap.SimpleEntry<>(studentId(), 0D);
}
/**
 * Does the line contain the required data?
 * @return whether there is a match
 */
public boolean hasResult() {
    return studentId() != null && grade() != null;
}
/**
 * Get the grade, if any.
 *
 * @return the grade or null
 */
public Double grade() {
    try {
        String grade = matcher.group("grade");
        grade = grade.replaceAll(",", ".");
        return Double.parseDouble(grade);
    } catch (Exception e) {
        return null;
    }
}
/**
 * Get the student id, if any.
 *
 * @return the student id or null when no match.
 */
public Integer studentId() {
    try {
        String studentnumber = matcher.group("studentnumber");
        return Integer.parseInt(studentnumber);
    } catch (Exception e) {
        return null;
    }
}
I just want to retrieve the matcher group "studentnumber" even when the rest of the match fails.
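You need to know what pattern of text ends the string when the grade is absent, and that pattern evidently cannot be \b\d+\.\d+$. In the example, it is a date in a specific format. If every string contains such a date, followed either by the end of the line or by a space and then a grade at the end of the line, you can match the regex
^(\d+).*\d{2}\.\d{2}\.\d{4}\s*(\d+\.\d+)?$
The regex has two capture groups. The first holds the string of digits at the start of the line. If a grade is present, the second capture group holds it; if not, the second group does not participate in the match (in Java, group(2) then returns null). Your Java code only needs to set the grade to the content of capture group 2 when it is present, and to 0.0 otherwise.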
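As a minimal sketch of how this could look in Java, assuming every line contains a dd.mm.yyyy date as described above: the class name GradeParser, the method name parse, and the choice to return null for lines that do not match at all are hypothetical and not part of the original code. Note that the grade group as written here only accepts the form digits.digits, so a plain "10" or a comma decimal such as "7,5" from the question's pattern would need the grade alternation from the question instead.

```java
import java.util.AbstractMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class GradeParser {
    // Leading digits, anything, a dd.mm.yyyy date, then an optional grade at the end.
    private static final Pattern PATTERN = Pattern.compile(
            "^(\\d+).*\\d{2}\\.\\d{2}\\.\\d{4}\\s*(\\d+\\.\\d+)?$");

    /**
     * Returns (student number, grade), with grade 0.0 when none is present.
     * Returns null when the line does not match at all (an assumption of this sketch).
     */
    static AbstractMap.SimpleEntry<Integer, Double> parse(String line) {
        Matcher m = PATTERN.matcher(line);
        if (!m.matches()) {
            return null;
        }
        int studentNumber = Integer.parseInt(m.group(1));
        // group(2) is null when the optional grade group did not participate.
        String gradeText = m.group(2);
        double grade = (gradeText == null) ? 0.0 : Double.parseDouble(gradeText);
        return new AbstractMap.SimpleEntry<>(studentNumber, grade);
    }

    public static void main(String[] args) {
        System.out.println(parse("2456272 Max Mustermann 20.02.1968"));     // 2456272=0.0
        System.out.println(parse("2456272 Max Mustermann 20.02.1968 7.5")); // 2456272=7.5
    }
}
```

Because the grade group is optional, a single call to matches() succeeds in both cases, so the student number is always available from group 1 and no exception handling is needed to detect a missing grade.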