Split and append a string based on length

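I have a paragraph as an input string. I am trying to split the paragraph into an array of sentences, where each element contains only whole sentences and is no longer than 250 characters.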

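I tried splitting the string on a delimiter (such as `.`) and converting all the resulting strings into a list. Then, using a StringBuilder, I tried to append the strings based on length (250 characters).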

    List<String> list = new ArrayList<String>();

    String text = "Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance. Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant. Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle. Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows. ";

    // Backslashes must be doubled inside a Java string literal: "\\s", not "\s"
    Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
            Pattern.MULTILINE | Pattern.COMMENTS);

    Matcher reMatcher = re.matcher(text);
    while (reMatcher.find()) {
        list.add(reMatcher.group());
    }
    String textDelimted[] = new String[list.size()];
    textDelimted = list.toArray(textDelimted);

    StringBuilder stringB = new StringBuilder(100);

    for (int i = 0; i < textDelimted.length; i++) {
        while (stringB.length() + textDelimted[i].length() < 250)
            stringB.append(textDelimted[i]);

        System.out.println("!#@#$%" +stringB.toString());
    }
}

Expected result:

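[0] : Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance.

[1] : Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant.

[2] : Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle.

[3] : Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows.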

Your question is unclear; please try rephrasing it to make clear where your problem lies.

That said, I assume "I tried split the string based on deliminator (as .) . Converted all the string into list" means you want to split the String wherever a "." occurs and convert the result to a List<String>. That can be done as follows:

String input = "hello.world.with.delimiters";
// split() takes a regex, so the dot must be escaped as "\\."
String[] words = input.split("\\.");  // String[] with contents {"hello", "world", "with", "delimiters"}
List<String> list = Arrays.asList(words);  // Identical contents, just in a List<String>


// if you want to append to a StringBuilder based on length
StringBuilder sb = new StringBuilder();
for (String s : list) {
    // append the current element, not the whole list
    if (someLengthCondition(s.length())) sb.append(s);
}

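Of course, your implementation of someLengthCondition() will depend on your needs. I can't provide one, because it is hard to understand what you are trying to do.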
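One thing to watch for: `split("\\.")` discards the periods, so the pieces are no longer complete sentences. If you want each piece to keep its punctuation, splitting on the whitespace that follows it is one option. The sketch below is only an illustration under that assumption; the class name `SentenceSplit` and the method `splitSentences` are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

class SentenceSplit {

    // Hypothetical helper: splits on whitespace that follows '.', '!' or '?'.
    // The lookbehind (?<=[.!?]) checks for the punctuation without consuming it,
    // so every piece keeps its terminating character.
    static List<String> splitSentences(String input) {
        return Arrays.asList(input.trim().split("(?<=[.!?])\\s+"));
    }

    public static void main(String[] args) {
        List<String> sentences = splitSentences("Hello world. How are you? Fine!");
        for (String s : sentences) {
            System.out.println(s);
        }
    }
}
```

This is simpler than a full sentence-matching regex, but it will also split after abbreviations such as "Mr.", so it only suits text where every period ends a sentence.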

I think you need to modify your loop slightly. With the change below, my output matches your expected result.

import java.util.List;
import java.util.ArrayList;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MyClass {
    public static void main(String args[]) {

        List<String> list = new ArrayList<String>();

        String text = "Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance. Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant. Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle. Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows. ";

        // Backslashes must be doubled inside a Java string literal: "\\s", not "\s"
        Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
                Pattern.MULTILINE | Pattern.COMMENTS);

        Matcher reMatcher = re.matcher(text);
        while (reMatcher.find()) {
            list.add(reMatcher.group());
        }
        String textDelimted[] = new String[list.size()];
        textDelimted = list.toArray(textDelimted);

        StringBuilder stringB = new StringBuilder(300);

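        for (int i = 0; i < textDelimted.length; i++) {
            // The regex drops the whitespace between sentences, so budget one
            // character for the space that rejoins them (none for the first sentence)
            int extra = stringB.length() == 0 ? 0 : 1;
            if (stringB.length() + extra + textDelimted[i].length() <= 250) {
                if (extra == 1) stringB.append(' ');
                stringB.append(textDelimted[i]);
            } else {
                // Chunk is full: print it and start a new one with this sentence
                System.out.println("!#@#$%" + stringB.toString());
                stringB = new StringBuilder(300);
                stringB.append(textDelimted[i]);
            }
        }
        // Print the last, partially filled chunk
        System.out.println("!#@#$%" + stringB.toString());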
    }
}

Replace the println calls with this code to collect the results in a list:

ArrayList<String> arrlist = new ArrayList<String>(5);
..
arrlist.add(stringB.toString());
..
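
Putting the pieces together, the grouping step could also be wrapped in a method that returns the list directly. This is a minimal sketch rather than a definitive implementation: `SentenceChunker`, `chunk` and `maxLen` are names invented here, it assumes sentences are rejoined with a single space, and it assumes a sentence longer than the limit simply gets a chunk of its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SentenceChunker {

    // Groups whole sentences into chunks of at most maxLen characters.
    static List<String> chunk(List<String> sentences, int maxLen) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String sentence : sentences) {
            // One extra character for the joining space, unless the chunk is empty
            int needed = current.length() == 0
                    ? sentence.length()
                    : current.length() + 1 + sentence.length();
            if (needed > maxLen && current.length() > 0) {
                // Current chunk is full: store it and start a new one
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(sentence);
        }
        // Store the last, partially filled chunk
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("One two.", "Three four.", "Five.");
        // With a limit of 20, "One two. Three four." fits exactly
        for (String c : chunk(sentences, 20)) {
            System.out.println(c);
        }
    }
}
```

For the question's input, you would pass the sentence list produced by the regex and call `chunk(list, 250)`.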