How to get the text count from a String
I have the following strings:
Salary and Benefits <span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span>
Job Security <span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span>
Career Growth <span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barnone"></span>
Work Environment <span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span>
CEO Rating <span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span><span class="read-barfull"></span>
I need to display the count (the number of "read-barfull" spans) in the format below:
Salary and Benefits 5
Job Security 5
Career Growth 4
Work Environment 5
CEO Rating 5
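Please help me get this format.
Thanks in advance.
If the "token" string you want to count is static (or at least predefined), you can do something like the following, which uses Apache commons-lang: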
String str = "Salary and Benefits <span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span>";
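String spanText = "<span class=\"read-barfull\"></span>";
// StringUtils is org.apache.commons.lang3.StringUtils (org.apache.commons.lang.StringUtils in commons-lang 2.x)
int count = StringUtils.countMatches(str, spanText); // 5 for this input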
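If adding commons-lang is not an option, the same count can be sketched with the plain JDK. This is only an illustration, not the library's implementation: the class name ReadBarCounter and the helpers countOccurrences and format are hypothetical, and the label is assumed to be everything before the first '<' on the line.

```java
class ReadBarCounter {
    // Counts non-overlapping occurrences of token in text (hypothetical helper).
    static int countOccurrences(String text, String token) {
        if (text == null || token == null || token.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(token, from)) != -1) {
            count++;
            from += token.length(); // continue searching after this match
        }
        return count;
    }

    // Builds "<label> <count>"; assumes the label is the text before the first '<'.
    static String format(String line, String token) {
        int tagStart = line.indexOf('<');
        String label = (tagStart == -1 ? line : line.substring(0, tagStart)).trim();
        return label + " " + countOccurrences(line, token);
    }

    public static void main(String[] args) {
        String full = "<span class=\"read-barfull\"></span>";
        String none = "<span class=\"read-barnone\"></span>";
        String line = "Career Growth " + full + full + full + full + none;
        System.out.println(format(line, full)); // prints "Career Growth 4"
    }
}
```

Matching the whole tag text is strict: an extra attribute or different whitespace inside the span would not be counted, which is the same limitation the countMatches version has.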
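Split your logic into two parts:
- Create a List<String> holding the input lines.
- Iterate over the list and, for each line, use a StringBuffer or split to search for the word and increment a counter. A bit of simple logic separates "read-barfull" from the string key (e.g. Salary and Benefits).
- Take the count value from that.
- Create a Map<String,Integer> and store each key with its count.
That's all.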
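The steps above could be sketched roughly as follows, using split for the counting. The class name RatingMapBuilder and the method countFullBars are made up for illustration, and the key is assumed to be the text before the first '<' on each line; a LinkedHashMap is used only so the entries keep their input order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class RatingMapBuilder {
    static final String FULL_BAR = "<span class=\"read-barfull\"></span>";

    // Maps each line's key (text before the first '<') to its number of full bars.
    static Map<String, Integer> countFullBars(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tagStart = line.indexOf('<');
            String key = (tagStart == -1 ? line : line.substring(0, tagStart)).trim();
            // Limit -1 keeps trailing empty pieces, so pieces = occurrences + 1.
            int count = line.split(Pattern.quote(FULL_BAR), -1).length - 1;
            counts.put(key, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        String none = "<span class=\"read-barnone\"></span>";
        List<String> lines = Arrays.asList(
                "Job Security " + FULL_BAR + FULL_BAR + FULL_BAR + FULL_BAR + FULL_BAR,
                "Career Growth " + FULL_BAR + FULL_BAR + FULL_BAR + FULL_BAR + none);
        for (Map.Entry<String, Integer> e : countFullBars(lines).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
        // prints "Job Security 5" then "Career Growth 4"
    }
}
```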
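Here is a way to do it with Jsoup (since your question is tagged with it). The general idea is:
- read the HTML line by line,
- get the text that this line of HTML represents,
- select all <span class="read-barfull"></span> elements (whether or not they are empty, though you can change that if needed); a simple select("span.read-barfull") will do this for us,
- print the count of the selected span elements (size() is useful here).
Code: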
String html = "Salary and Benefits <span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span>\r\n" +
"Job Security <span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span>\r\n" +
"Career Growth <span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barnone\"></span>\r\n" +
"Work Environment <span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span>\r\n" +
"CEO Rating <span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span><span class=\"read-barfull\"></span>";
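// Requires java.util.Scanner, org.jsoup.Jsoup and org.jsoup.nodes.Document
Scanner sc = new Scanner(html);
while (sc.hasNextLine()) {
    // Parse each line as its own small HTML document
    Document doc = Jsoup.parse(sc.nextLine());
    // doc.text() is the visible label; select(...).size() is the number of matching spans
    System.out.println(doc.text() + " " + doc.select("span.read-barfull").size());
}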
Output:
Salary and Benefits 5
Job Security 5
Career Growth 4
Work Environment 5
CEO Rating 5