Java: Cannot replace a string followed by \n
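I am a college student working on a semester-long project, and I have hit a roadblock. Before I go on, please know that I looked at similar threads on Stack Overflow and none of them seemed to fit my situation.
I have a string input generated from a PDF that contains a lot of data from a table. The problem is that, because of the formatting, some table entries in the department column wrap from line 1 onto line 2, and I can't work around it. For example:
PS 253 (my algorithm handles this fine)
MA
243HON (breaks everything)
I need to end up with these on the same line, removing the "\n" after MA, so I can pass the text to the rest of the program. I tried checking for \n one or two index positions after the department code (MA) and changing the index from which I read 243HON, but that didn't work.
I also tried string = string.replaceAll("MA \n", "MA "), as shown in the code. Removing the space between MA and \n made no difference. Here is the relevant part of my code. Thanks!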
// Department codes as they appear at the start of a line.
public static String[] departments = {"\nAS","\nSF","\nAE","\nAF","\nAT","\nLAR","\nAMS","\nBIO","\nBA","\nCHM","\nLCH","\nCIV","\nCSO",
        "\nCOM","\nCEC","\nCS","\nCYB","\nEC","\nEE","\nEGR","\nEP","\nES","\nFA","\nGCS","\nHS","\nHON","\nHF","\nHU","\nMA","\nME","\nWX",
        "\nMSL","\nNSC","\nPE","\nPS","\nPSY","\nSIM","\nSS","\nSE","\nSP","\nSYS","\nUNIV","\nUA"};
// The same codes followed by a space and a line break (the wrapped entries to repair).
public static String[] departmentsFix = {"\nAS \n","\nSF \n","\nAE \n","\nAF \n","\nAT \n","\nLAR \n","\nAMS \n","\nBIO \n","\nBA \n","\nCHM \n","\nLCH \n","\nCIV \n","\nCSO \n",
        "\nCOM \n","\nCEC \n","\nCS \n","\nCYB \n","\nEC \n","\nEE \n","\nEGR \n","\nEP \n","\nES \n","\nFA \n","\nGCS \n","\nHS \n","\nHON \n","\nHF \n","\nHU \n","\nMA \n","\nME \n","\nWX \n",
        "\nMSL \n","\nNSC \n","\nPE \n","\nPS \n","\nPSY \n","\nSIM \n","\nSS \n","\nSE \n","\nSP \n","\nSYS \n","\nUNIV \n","\nUA \n"};

public static void main(String[] args) {
    // TODO Auto-generated method stub
    Loader loader = new Loader();
    try {
        // Backslashes in a Java string literal must be escaped.
        File file = new File("C:\\Users\\User\\Desktop\\EclipseWorkspace\\SE 300\\ER_SCHED_PRT.pdf");
        PDDocument document = PDDocument.load(file);
        PDFTextStripper s = new PDFTextStripper();
        loader.content = s.getText(document);
        // Keep only the text after the first "Instructor" column header.
        String[] splitString = loader.content.split("Instructor", 2);
        loader.content = splitString[1];
        int index = 0;
        for (String y : departmentsFix) {
            // find any departments with a \n after them and replace it with a space
            loader.content = loader.content.replaceAll(y, departments[index] + " ");
            index++;
        }
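I just fixed it. Using the find function, I discovered that the format was not \nMA \n but \nMA \r\n. Changing this solved most of the problem; what remains is a minor, inconsequential bug that can be compensated for with an extra space. Thanks for your help all the same.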
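For anyone hitting the same issue, here is a minimal sketch of one way to make the replacement independent of the line-ending style. It is not the code from the post: the class name DepartmentLineFix, the helpers joinSplitDepartments and showControlChars, and the shortened department list are all hypothetical, chosen for illustration. It assumes Java 8 or later, where the regex token \R matches \n, \r\n, and \r alike.

```java
import java.util.regex.Pattern;

public class DepartmentLineFix {
    // Shortened, illustrative list; the post above contains the full set of codes.
    static final String[] DEPARTMENTS = {"MA", "PS", "HON", "CS"};

    // A line break, a department code, optional spaces or tabs, then another line break.
    // \R matches any line break sequence, so \n and \r\n are both covered.
    static final Pattern SPLIT_DEPARTMENT = Pattern.compile(
            "(\\R(?:" + String.join("|", DEPARTMENTS) + "))[ \\t]*\\R");

    // Replaces the line break after a lone department code with a single space.
    static String joinSplitDepartments(String text) {
        return SPLIT_DEPARTMENT.matcher(text).replaceAll("$1 ");
    }

    // Makes carriage returns and line feeds visible, which helps when diagnosing
    // why a literal "\n" pattern does not match.
    static String showControlChars(String text) {
        return text.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Hypothetical sample mimicking the extracted text with Windows line endings.
        String sample = "\nPS 253\r\nMA \r\n243HON\r\n";
        System.out.println(showControlChars(sample));
        // prints: \nPS 253\r\nMA \r\n243HON\r\n
        System.out.println(showControlChars(joinSplitDepartments(sample)));
        // prints: \nPS 253\r\nMA 243HON\r\n
    }
}
```

Printing the text with visible control characters first is a quick way to confirm which line endings the PDF extraction actually produced before choosing a pattern.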