Word in a Java string

Question

I am new to Java, and as a beginner I was asked to try this at home.

Write a program that finds the number of occurrences of a smaller string in a bigger string, both as a part of the string and as an individual word. For example,

Bigger string = "I AM IN AMSTERDAM", smaller string = "AM".

Output: As part of string: 3, as a part of word: 1.

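I did get the second part working (as a part of word), and I even had a go at the first one (searching for the word as a part of the string), but I cannot figure out how to crack the first part. It keeps displaying 1 for the example input, where it should be 3.

I have definitely made a mistake somewhere. I would be really grateful if you could point it out and correct it. I am also a curious learner, so if possible (and only if you wish), please explain why it happens.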

import java.util.Scanner;
public class Program {
static Scanner sc = new Scanner(System.in);
static String search,searchstring;
static int n;
void input(){
    System.out.println("What do you want to do?");
    System.out.println("1. Search as part of string?");
    System.out.println("2. Search as part of word?");
    int n = sc.nextInt();
    System.out.println("Enter the main string");
    searchstring = sc.nextLine();
    sc.nextLine(); //Clear buffer
    System.out.println("Enter the search string"); search = sc.nextLine();
}
static int asPartOfWord(String main,String search){
    int count = 0; 
    char c; String w = "";
    for (int i = 0; i<main.length();i++){
        c = main.charAt(i);
        if (!(c==' ')){
            w += c;
        }
        else {
            if (w.equals(search)){
                count++;
            }
            w = ""; // Flush old value of w
        }
    }
    return count;
}
static int asPartOfString(String main,String search){
    int count = 0;
    char c; String w = ""; //Stores the word 
    for (int i = 0; i<main.length();i++){
        c = main.charAt(i);
        if (!(c==' ')){
            w += c;
        }
        else {
            if (w.length()==search.length()){
                if (w.equals(search)){
                    count++;
                }
            }
            w = ""; // Replace with new value, no string
        }
    }
    return count;
}
public static void main(String[] args){
    Program a = new Program();
    a.input();
    switch(n){
        case 1: System.out.println("Total occurences: " + asPartOfString(searchstring,search));
        case 2: System.out.println("Total occurences: " + asPartOfWord(searchstring,search));
        default: System.out.println("ERROR: No valid number entered");
    }
  }
}

Edit: I will be using looping constructs.

Answer 1

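You can use a regular expression. Use the string you are searching for as the pattern. Wrapping it as ".*<target string>.*" would match the whole input in one go, so find() would report only a single match.

See the Java documentation on "Patterns & Regular Expressions".

To count the matches in a string, this might help:

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
Matcher matcher = Pattern.compile("AM").matcher("I AM IN AMSTERDAM");
int count = 0;

// Each successful find() moves on to the next non-overlapping match.
while (matcher.find()) {
    count++;
}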

Answer 2

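A simpler way is to use regular expressions. This probably defeats the idea of writing it yourself, although learning regular expressions is a good idea because they are very powerful: as you can see, the core of my code is 4 lines long, in the countMatches method.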

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String... args) {
  String bigger = "I AM IN AMSTERDAM";
  String smaller = "AM";

  // "\\b" is the regex word boundary; a single "\b" in a Java literal is a backspace character
  System.out.println("Output: As part of string: " + countMatches(bigger, smaller) +
          ", as a part of word: " + countMatches(bigger, "\\b" + smaller + "\\b"));
}

private static int countMatches(String in, String regex) {
  Matcher m = Pattern.compile(regex).matcher(in);
  int count = 0;
  while (m.find()) count++; // one increment per match found
  return count;
}

How does it work?

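We create a Matcher that looks for a specific pattern in your string, then iterate to find the next match until none are left, incrementing a counter each time.
The patterns themselves: "AM" finds every occurrence of AM anywhere in the string. The regex \bAM\b (written "\\bAM\\b" in a Java string literal) matches whole words only (\b is a word boundary).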
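
One thing the two patterns above do not cover is a search string that contains regex metacharacters such as "." or "*". The following is a minimal sketch, not part of the original answer, that wraps the search string in Pattern.quote so it is treated as plain text. The class name QuotedCount and the helper names countLiteral and countWholeWord are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedCount {
    // Counts non-overlapping matches of a regex in the input.
    static int countMatches(String in, String regex) {
        Matcher m = Pattern.compile(regex).matcher(in);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Hypothetical helper: treats 'literal' as plain text, not as a regex.
    static int countLiteral(String in, String literal) {
        return countMatches(in, Pattern.quote(literal));
    }

    // Hypothetical helper: same, but only between word boundaries.
    static int countWholeWord(String in, String literal) {
        return countMatches(in, "\\b" + Pattern.quote(literal) + "\\b");
    }

    public static void main(String[] args) {
        System.out.println(countLiteral("I AM IN AMSTERDAM", "AM"));   // 3
        System.out.println(countWholeWord("I AM IN AMSTERDAM", "AM")); // 1
        System.out.println(countMatches("a.b.c", "."));                // 5: '.' matches any character
        System.out.println(countLiteral("a.b.c", "."));                // 2: only the literal dots
    }
}
```

For a search string made only of letters, as in the example input, quoting makes no difference.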

This may not be what you are after, but I thought it would be interesting to see another approach. And technically, I am using a loop :-)

Answer 3

For the first part of the exercise, this should work:

// Counts occurrences anywhere in the string, so it replaces the question's asPartOfString.
static int asPartOfString(String main, String search) {
    if (search.isEmpty()) {
        return 0; // nothing to search for; also avoids running past the end of main
    }
    int count = 0;
    while (main.length() >= search.length()) { // while main is at least as long as search
        if (main.substring(0, search.length()).equals(search)) { // main starts with search
            count++;
        }
        main = main.substring(1); // shorten main by cutting off its first character
    }
    return count;
}
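
As for why the question's code prints 1: its asPartOfString compares only whole space-delimited words with the search string, so it never sees the AM inside AMSTERDAM, and the last word is never checked because no space follows it. Below is a loop-only sketch of both counts, offered as one possible approach rather than the definitive fix. It uses String.indexOf(String, int), and the class name LoopCount, the choice to count overlapping occurrences, and the handling of an empty search string are all assumptions.

```java
class LoopCount {
    // Counts every position at which 'search' occurs in 'main' (overlaps included).
    static int asPartOfString(String main, String search) {
        if (search.isEmpty()) {
            return 0; // assumption: an empty search string counts as no match
        }
        int count = 0;
        int from = main.indexOf(search);
        while (from >= 0) {
            count++;
            from = main.indexOf(search, from + 1); // continue one character later
        }
        return count;
    }

    // Counts space-separated words equal to 'search', including the last word.
    static int asPartOfWord(String main, String search) {
        int count = 0;
        StringBuilder word = new StringBuilder();
        for (int i = 0; i <= main.length(); i++) {
            // Treat the end of the string like a trailing space.
            char c = (i < main.length()) ? main.charAt(i) : ' ';
            if (c != ' ') {
                word.append(c);
            } else {
                if (word.toString().equals(search)) {
                    count++;
                }
                word.setLength(0); // start collecting the next word
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(asPartOfString("I AM IN AMSTERDAM", "AM")); // 3
        System.out.println(asPartOfWord("I AM IN AMSTERDAM", "AM"));   // 1
        System.out.println(asPartOfWord("I AM", "AM"));                // 1
        System.out.println(asPartOfString("AAAA", "AA"));              // 3 (overlapping)
    }
}
```

Note that Matcher.find() would report 2 for the last case, because it resumes after the end of each match, while this loop and the substring loop above both count overlapping occurrences.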

You might also reconsider how your variables are named:

   static String search,searchstring;
   static int n;

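While search and searchstring do tell us what they mean, the convention is to write the first word in lowercase and to capitalize the first letter of every following word. This improves readability.

static int n does not tell you what it is used for. If you read your code again in a few days you will have forgotten, so use something more meaningful here: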

  static String search, searchString;
  static int command;

Answer 4

Here is another (shorter) way to make it work using Pattern and Matcher,
or, as they are more commonly known, regular expressions.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CountOccurances {

    public static void main(String[] args) {

        String main = "I AM IN AMSTERDAM";
        String search = "AM";

        System.out.printf("As part of string: %d%n",
                asPartOfString(main, search));

        System.out.printf("As part of word: %d%n",
                asPartOfWord(main, search));        
    }

    private static int asPartOfString(String main, String search) {

        Matcher m = Pattern.compile(search).matcher(main);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    private static int asPartOfWord(String main, String search) {

        // \b is a regex word boundary; the backslash must be doubled in a Java string literal
        return asPartOfString(main, "\\b" + search + "\\b");
    }
}

Output:

As part of string: 3
As part of word: 1

Answer 5

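While writing your own code with lots of loops may execute faster (debatable), it is better to use the JDK if you can: there is less code to write, less to debug, and you can focus on the high-level problem rather than on the low-level implementation of character iteration and comparison.

As it happens, the tools you need to solve this already exist. Using them requires knowledge you do not have yet, but they are elegant, to the point that each method is a single line of code.

Here is how I would solve it: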

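static int asPartOfString(String main,String search){
    // n matches cut the string into n + 1 pieces
    return main.split(search, -1).length - 1;
}

static int asPartOfWord(String main,String search){
    // "\\b" is the regex word boundary (the backslash is doubled in a Java literal)
    return main.split("\\b" + search + "\\b", -1).length - 1;
}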

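See a live demo of this code running your example input, which (perhaps deliberately) contains an edge case (see below).

Performance? Probably a few microseconds, which is fast enough. The real benefit is that there is so little code that it is completely clear what is going on, and there is almost nothing to get wrong or to debug.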

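What you need to know to use this solution:

the regex for "word boundary" is \b (written "\\b" in a Java string literal)
split() takes a regex as its search term
the second parameter of split() controls the behavior at the end of the string: a negative number means "retain blanks at end of split", which handles the edge case of the main string ending with the smaller string. Without the -1, a call to split would discard the trailing blank in this edge case.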
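
The effect of the second parameter can be seen on the example input, which ends with the smaller string. This is a small sketch for illustration only. The class name SplitLimitDemo and the two helper method names are hypothetical.

```java
class SplitLimitDemo {
    // Count via split; a negative limit keeps trailing empty strings.
    static int countKeepingTrailing(String main, String regex) {
        return main.split(regex, -1).length - 1;
    }

    // Same without the limit: trailing empty strings are dropped.
    static int countDroppingTrailing(String main, String regex) {
        return main.split(regex).length - 1;
    }

    public static void main(String[] args) {
        String main = "I AM IN AMSTERDAM";
        // Pieces with -1: "I ", " IN ", "STERD", "" (4 pieces, 3 matches)
        System.out.println(countKeepingTrailing(main, "AM"));  // 3
        // Without -1 the final "" is discarded (3 pieces, so one match is lost)
        System.out.println(countDroppingTrailing(main, "AM")); // 2
    }
}
```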
