How to locate a string and then get the characters that follow it, up to a certain character
Here is a sample input:
<div><a class="document-subtitle category" href="/store/apps/category/GAME_ADVENTURE"> <span itemprop="genre">Adventure</span> </a> </div> <div> </div>
The string I am looking for is this:
document-subtitle category" href="/store/apps/category/
I want to extract the characters that follow this string, up to the end of the href attribute (">).
In this case, my output should be:
GAME_ADVENTURE
My input file is guaranteed to contain exactly one occurrence of this string:
document-subtitle category" href="/store/apps/category/
What is the simplest way to achieve this?
For this particular case, this is how I would do it in Java:
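private static final String _control = "document-subtitle category";
private static final String _href = "href";
private String getCategoryFromInput(String input) {
    // Find the class marker first; without it there is nothing to extract.
    int controlStart = input.indexOf(_control);
    if (controlStart < 0) {
        return "";
    }
    // Search for href only after the marker, so an earlier href elsewhere in the input is not picked up.
    int hrefStart = input.indexOf(_href, controlStart + _control.length());
    int openQuote = input.indexOf('"', hrefStart + 1);
    int endQuote = input.indexOf('"', openQuote + 1);
    if (hrefStart < 0 || openQuote < 0 || endQuote < 0) {
        return "";
    }
    // Skip the opening quote so chunk holds only the attribute value.
    String chunk = input.substring(openQuote + 1, endQuote);
    int finalDelimiter = chunk.lastIndexOf('/');
    // Skip the slash itself so the result is GAME_ADVENTURE, not /GAME_ADVENTURE.
    return chunk.substring(finalDelimiter + 1);
}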
This works for me:
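import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
public class ExtractData {
    public static final String matcher = "document-subtitle category\" href=\"/store/apps/category/";
    public static void main(String[] args) throws IOException {
        String filePath = args[0];
        // Decode explicitly instead of relying on the platform default; UTF-8 is an assumption about the file.
        String content = new String(Files.readAllBytes(Paths.get(filePath)), StandardCharsets.UTF_8);
        int startIndex = content.indexOf(matcher);
        // The category starts right after the matched prefix and ends at the "> that closes the tag.
        int categoryStart = startIndex + matcher.length();
        int endIndex = startIndex < 0 ? -1 : content.indexOf("\">", categoryStart);
        if (endIndex < 0) {
            System.out.println("category not found");
            return;
        }
        String category = content.substring(categoryStart, endIndex);
        System.out.println("category is " + category);
    }
}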
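Another option, offered as a sketch rather than a definitive implementation, is a regular expression that treats the search string as a literal and captures everything up to the next double quote. The class name CategoryExtractor and the method extractAfter are hypothetical names chosen for this example. It also assumes the value never contains an escaped quote.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CategoryExtractor {
    // Hypothetical helper: returns the text that follows `prefix`,
    // up to (but not including) the next double quote.
    static Optional<String> extractAfter(String input, String prefix) {
        // Pattern.quote makes the prefix a literal, so '/', '"' and '.' need no manual escaping.
        Pattern pattern = Pattern.compile(Pattern.quote(prefix) + "([^\"]*)\"");
        Matcher m = pattern.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<div><a class=\"document-subtitle category\" "
                + "href=\"/store/apps/category/GAME_ADVENTURE\"> "
                + "<span itemprop=\"genre\">Adventure</span> </a> </div>";
        String prefix = "document-subtitle category\" href=\"/store/apps/category/";
        // Prints GAME_ADVENTURE for the sample input from the question.
        System.out.println(extractAfter(html, prefix).orElse(""));
    }
}
```

Returning an Optional lets the caller tell "no match" apart from an empty category. If the markup can vary more than the question describes (attribute order, single quotes, extra whitespace), an HTML parser is likely a safer choice than string matching.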