How to extract 3 numbers from a String

I have a String array containing the following:

"1 years, 2 months, 22 days",
"1 years, 1 months, 14 days",
"4 years, 24 days",
"13 years, 21 days",
"9 months, 1 day";

I need to extract the number of years, months and days from each item in the list.

This is what I tried, and it failed:

String[] split = duracao.split(",");

if (split.length >= 3) {

    anos = Integer.parseInt(split[0].replaceAll("[^-?0-9]+", ""));
    meses = Integer.parseInt(split[1].replaceAll("[^-?0-9]+", ""));
    dias = Integer.parseInt(split[2].replaceAll("[^-?0-9]+", ""));
} else if (split.length >= 2) {

    meses = Integer.parseInt(split[0].replaceAll("[^-?0-9]+", ""));
    dias = Integer.parseInt(split[1].replaceAll("[^-?0-9]+", ""));
} else if (split.length >= 1) {

    dias = Integer.parseInt(split[0].replaceAll("[^-?0-9]+", ""));
}

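It doesn't work because the first item in the string is sometimes the years and sometimes the months.

Is it possible to do what I want with a regular expression? To deal with the plurals, I could do this: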

duration = duration.replace("months", "month");
duration = duration.replace("days", "day");
duration = duration.replace("years", "year");

But how do I extract the data I need from there?

I suggest using a regex approach that looks for each part individually, because the order may vary:

static void parse(String value) {
    int year = 0, month = 0, day = 0;
    Matcher m;
    if ((m = year_ptn.matcher(value)).find())
        year = Integer.parseInt(m.group(1));
    if ((m = month_ptn.matcher(value)).find())
        month = Integer.parseInt(m.group(1));
    if ((m = day_ptn.matcher(value)).find())
        day = Integer.parseInt(m.group(1));

    System.out.format("y=%2s  m=%2s  d=%2s\n", year, month, day);
}

// Backslashes must be doubled inside Java string literals.
static Pattern year_ptn = Pattern.compile("(\\d+)\\s+year");
static Pattern month_ptn = Pattern.compile("(\\d+)\\s+month");
static Pattern day_ptn = Pattern.compile("(\\d+)\\s+day");

public static void main(String[] args) {
    List<String> values = Arrays.asList("1 years, 2 months, 22 days", "1 years, 1 months, 14 days",
            "1 months, 1 years, 14 days", "4 years, 24 days", "13 years, 21 days", "9 months, 1 day");

    for (String s : values) {
        parse(s);
    }
}

Output:

y= 1  m= 2  d=22
y= 1  m= 1  d=14
y= 1  m= 1  d=14
y= 4  m= 0  d=24
y=13  m= 0  d=21
y= 0  m= 9  d= 1

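I'm not really a Java programmer, but you could walk through the string and do the following, starting with int iterator = 0 and looping while (iterator != str.length()):

  1. Create a new empty string.

  2. If the current character is a digit, append it to the string, advance the iterator, and repeat step 2.

  3. Otherwise, if the current character is not a digit, do the following:
     3.1. Convert the string you built into a number.
     3.2. Check the first letter of the unit that follows: 'y' means years, 'm' months, 'd' days.

  4. Move the iterator to the position of the next digit.

That way every number is associated with a year, month or day.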
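
The steps above could be sketched in Java roughly as follows. This is only one possible reading of the description: the class name DurationScanner, the method name scan and the {years, months, days} array layout are assumptions made for illustration, and the sketch skips the whitespace between a number and its unit before looking at the unit's first letter.

```java
import java.util.Arrays;

class DurationScanner {

    // Returns {years, months, days}; units that do not appear stay 0.
    static int[] scan(String text) {
        int[] result = new int[3];
        int i = 0;
        while (i < text.length()) {
            // Move forward until the next digit is found.
            if (!Character.isDigit(text.charAt(i))) {
                i++;
                continue;
            }
            // Collect the run of consecutive digits.
            int start = i;
            while (i < text.length() && Character.isDigit(text.charAt(i))) {
                i++;
            }
            int value = Integer.parseInt(text.substring(start, i));
            // Skip the whitespace between the number and its unit.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= text.length()) {
                throw new IllegalArgumentException("Number without a unit in: " + text);
            }
            // The first letter of the unit decides where the number goes.
            switch (text.charAt(i)) {
                case 'y': result[0] = value; break;
                case 'm': result[1] = value; break;
                case 'd': result[2] = value; break;
                default:
                    throw new IllegalArgumentException("Unknown unit at index " + i + " in: " + text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scan("1 years, 2 months, 22 days"))); // [1, 2, 22]
        System.out.println(Arrays.toString(scan("9 months, 1 day")));            // [0, 9, 1]
    }
}
```

Because each number is classified by the unit that follows it, the order of the parts in the input does not matter.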

I would simply use a switch block for this.

int years = 0, months = 0, days = 0;

String[] fields = s1.split(", +");
for (String field : fields) {
    String[] parts = field.split(" ");
    int value = Integer.parseInt(parts[0]);

    switch (parts[1]) {
        case "year":
        case "years":
            years = value;
            break;
        case "month":
        case "months":
            months = value;
            break;
        case "day":
        case "days":
            days = value;
            break;
        default:
            throw new IllegalArgumentException("Unknown time unit: " + parts[1]);
    }
}

Solution using the java.time API:

I recommend using java.time.Period, which is modelled on ISO-8601 standards and was introduced with Java 8 as part of the JSR-310 implementation.
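
For orientation, here is a minimal sketch of the ISO-8601 text form that Period.parse accepts: the letter P followed by optional year, month and day parts. The class name PeriodFormatDemo is just an illustrative choice.

```java
import java.time.Period;

class PeriodFormatDemo {
    public static void main(String[] args) {
        // "P1Y2M22D" reads as: period of 1 year, 2 months, 22 days.
        Period parsed = Period.parse("P1Y2M22D");
        Period built = Period.of(1, 2, 22);
        System.out.println(parsed.equals(built)); // true

        // Parts that are left out default to zero.
        Period partial = Period.parse("P4Y24D");
        System.out.println(partial.getMonths()); // 0
    }
}
```

So the task reduces to rewriting each input string into this notation before parsing it.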

Demo:

import java.time.Period;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String[] arr = { "1 years, 2 months, 22 days", "1 years, 1 months, 14 days", "4 years, 24 days",
                "13 years, 21 days", "9 months, 1 day" };

        List<Period> periodList = 
                Arrays.stream(arr)
                    .map(s -> Period.parse( 
                                "P" + s.replaceAll("[\\s,]", "") // strip whitespace and commas
                                        .replaceAll("years?","Y")
                                        .replaceAll("months?", "M")
                                        .replaceAll("days?", "D")
                            )
                    )
                    .collect(Collectors.toList());
        
        System.out.println(periodList);
        
        // Now you can retrieve year,  month and day from the Period e.g.
        periodList.forEach(p -> 
            System.out.println(
                    p + " => " + 
                    p.getYears() + " years " + 
                    p.getMonths() + " months "+ 
                    p.getDays() +" days"
            )
        );
    }
}

Output:

[P1Y2M22D, P1Y1M14D, P4Y24D, P13Y21D, P9M1D]
P1Y2M22D => 1 years 2 months 22 days
P1Y1M14D => 1 years 1 months 14 days
P4Y24D => 4 years 0 months 24 days
P13Y21D => 13 years 0 months 21 days
P9M1D => 0 years 9 months 1 days

Learn more about the modern Date-Time API* from Trail: Date Time.

Solution using the Java RegEx API:

Another approach is to use Matcher#find.

Demo:

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String[] arr = { "1 years, 2 months, 22 days", "1 years, 1 months, 14 days", "4 years, 24 days",
                "13 years, 21 days", "9 months, 1 day" };

        int[] years = new int[arr.length];
        int[] months = new int[arr.length];
        int[] days = new int[arr.length];

        // Backslashes must be doubled inside Java string literals.
        Pattern yearPattern = Pattern.compile("\\d+(?= year(?:s)?)");
        Pattern monthPattern = Pattern.compile("\\d+(?= month(?:s)?)");
        Pattern dayPattern = Pattern.compile("\\d+(?= day(?:s)?)");

        for (int i = 0; i < arr.length; i++) {
            Matcher yearMatcher = yearPattern.matcher(arr[i]);
            Matcher monthMatcher = monthPattern.matcher(arr[i]);
            Matcher dayMatcher = dayPattern.matcher(arr[i]);

            years[i] = yearMatcher.find() ? Integer.parseInt(yearMatcher.group()) : 0;
            months[i] = monthMatcher.find() ? Integer.parseInt(monthMatcher.group()) : 0;
            days[i] = dayMatcher.find() ? Integer.parseInt(dayMatcher.group()) : 0;
        }

        // Display
        System.out.println(Arrays.toString(years));
        System.out.println(Arrays.toString(months));
        System.out.println(Arrays.toString(days));
    }
}

Output:

[1, 1, 4, 13, 0]
[2, 1, 0, 0, 9]
[22, 14, 24, 21, 1]

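Explanation of the regex:

  • \d+: one or more digits
  • (?= : start of the lookahead pattern
    • " year": a space character followed by year
    • (?:s)? : an optional character, s
  • ): end of the lookahead pattern

Check this regex demo for a deeper understanding of the regular expression.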
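
To make the effect of the lookahead concrete, the following minimal sketch compares the pattern with and without it. The helper firstMatch and the class name LookaheadDemo are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {

    // Returns the text of the first match, or null if the regex never matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "1 years, 2 months, 22 days";

        // With the lookahead, the unit is checked but not consumed.
        System.out.println(firstMatch("\\d+(?= month(?:s)?)", input)); // 2

        // Without it, the unit becomes part of the match.
        System.out.println(firstMatch("\\d+ month(?:s)?", input)); // 2 months
    }
}
```

This is why the demo can pass Matcher#group() straight to Integer.parseInt without a capturing group.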


* If you are working on an Android project and your Android API level is still not compliant with Java 8, check Java 8+ APIs available through desugaring. Note that Android 8.0 Oreo already provides support for java.time.

If the fields must be given in the order years, months, days, I would use a single regex:

// Each unit group is optional, so a missing unit yields a null group.
var pattern = Pattern.compile("(?:(\\d+) years?)?,? ?(?:(\\d+) months?)?,? ?(?:(\\d+) days?)?");
var matcher = pattern.matcher(duracao);
if (matcher.matches()) {
    var anos = Integer.parseInt(Objects.requireNonNullElse(matcher.group(1), "0"));
    var meses = Integer.parseInt(Objects.requireNonNullElse(matcher.group(2), "0"));
    var dias = Integer.parseInt(Objects.requireNonNullElse(matcher.group(3), "0"));
    ...
} else {
    // error message
}

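Here is another approach to extract the years, months and days from an item in your list of strings.

I suggest using a map of regexes and a function that matches them, returning the result as a map of ChronoUnit to amount.

Below is some sample code to illustrate my suggestion.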

private static final Map<ChronoUnit,String> durationRegexMap = Map.ofEntries(
        Map.entry(ChronoUnit.YEARS,"\\d+ (years|year)"),
        Map.entry(ChronoUnit.MONTHS,"\\d+ (months|month)"),
        Map.entry(ChronoUnit.DAYS, "\\d+ (days|day)")
);

private Map<ChronoUnit, Integer> parseDuration(String durationString) {
    return new MapStringToChronoUnitsFunction(durationRegexMap)
            .apply(durationString);
}

class MapStringToChronoUnitsFunction implements Function<String, Map<ChronoUnit, Integer>> {

    Map<ChronoUnit,String> durationRegexMap = new HashMap<>();

    public MapStringToChronoUnitsFunction(Map<ChronoUnit,String> durationByRegex) {
        durationRegexMap.putAll(durationByRegex);
    }

    @Override
    public Map<ChronoUnit, Integer> apply(String textWithDurations) {
        String[] splittedTextWithDurations = textWithDurations.split(",");

        return this.durationRegexMap.entrySet().stream()
                .flatMap(regex -> Arrays.stream(splittedTextWithDurations)
                        .map(String::trim)
                        .filter(trimmedDurationString -> trimmedDurationString.matches(regex.getValue()))
                        // Strip the unit word so that only the number remains.
                        .map(matchingTrimmedDurationString -> matchingTrimmedDurationString.replaceAll("\\w+[a-zA-Z]", " "))
                        .map(String::trim)
                        .map(t -> Map.entry(regex.getKey(),Integer.valueOf(t))))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}

The function is called like this:

Map<ChronoUnit, Integer> durationList = chronoUnitsMapper.parseDuration("1 years, 2 months, 22 days");

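The function MapStringToChronoUnitsFunction runs the regexes registered in durationRegexMap. It matches each comma-separated part of the input string against those regexes and returns the matches as pairs of ChronoUnit and value.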
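
For comparison, a more compact variant of the same idea might look like the sketch below. It is not the code from this answer: it swaps the map of regexes for a single pattern with two capturing groups and collects the result in an EnumMap. The class name ChronoUnitParser and the method name parse are assumptions made for this example.

```java
import java.time.temporal.ChronoUnit;
import java.util.EnumMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChronoUnitParser {

    // Group 1 is the amount, group 2 the singular unit name; a trailing "s" is optional.
    private static final Pattern PART = Pattern.compile("(\\d+)\\s+(year|month|day)s?");

    static Map<ChronoUnit, Integer> parse(String text) {
        Map<ChronoUnit, Integer> result = new EnumMap<>(ChronoUnit.class);
        Matcher m = PART.matcher(text);
        while (m.find()) {
            ChronoUnit unit;
            switch (m.group(2)) {
                case "year":  unit = ChronoUnit.YEARS;  break;
                case "month": unit = ChronoUnit.MONTHS; break;
                default:      unit = ChronoUnit.DAYS;   break; // only "day" is left
            }
            result.put(unit, Integer.parseInt(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<ChronoUnit, Integer> parts = parse("9 months, 1 day");
        // Units missing from the input are missing from the map, so supply a default.
        System.out.println(parts.getOrDefault(ChronoUnit.YEARS, 0));  // 0
        System.out.println(parts.getOrDefault(ChronoUnit.MONTHS, 0)); // 9
        System.out.println(parts.getOrDefault(ChronoUnit.DAYS, 0));   // 1
    }
}
```

With either version, a unit that does not occur in the input is simply absent from the returned map, so callers may want getOrDefault rather than get.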