词频统计——链表

Counting frequency of words - linked list

我正在尝试将单词插入链表,因为我想计算一个单词在列表中出现的次数,然后 return 按频率从高到低的顺序排列这些单词。但是,我不断收到断言错误。这是我的 insertgetCountgetWords 方法。 insertgetCount 似乎给我带来了问题。

public class Frequency<E extends Comparable<E>> implements Iterable<E>{
    private Node first; //starting node
    private Node parent;    //parent of currently processed node
    private int N;  //number of words


    /**
     * Linked List Node
     */
    private class Node{
        private E key;
        private int count;
        private Node next;

        Node(E e){
           key = e;
           count = 1;
           next = null;
        }

        Node(E e, Node r){
            key = e;
            count = 1;
            next = r;
         }

        @Override 
        public String toString(){
            return "("+key +","+count+")";
        }
    }

   /**
    * Inserts a word into linked list
    * @param key to be inserted 
    * @return true if the key is inserted successfully.
    */
    public boolean insert(E key){
        if (first == null || first.key != key) {
            first = new Node(key, first);
        } else {
            Node curr = first;
            while (curr.next != null) {
                curr.next = new Node(key, first.next);
            }
            curr = curr.next;
            N++;
        }
        return true;
    }

/**
     * 
     * @param key is the key to be searched for
     * @return frequency of the key. Returns -1 if key does not exist
     * 
     */
    public int getCount(E key){
        // go through the linked list and count the number of times each word appears
        // return the count of the word that is being called.
        if (key == null) {
            return -1;
        }
        int N = 0;
        Node curr = first;
        while (curr != null) {
            if (curr.key.equals(key)) {
                N++;
            }
            curr = curr.next;
        }
        return N;   
    }
    /**
     * Returns the first n words and count
     * @param n number of words to be returned
     * @return first n words in (word, count) format
     */
    public String getWords(int n){
        Node curr = first;
        for (int i = 1; i < n; i++) {
            curr = curr.next;
        }
        return curr.toString();
    }


    /**
     * Frequency List iterator
     */
    @Override
    public Iterator<E> iterator() {
        return new FreqIterator();
    }
    /**
     * 
     * Frequency List iterator class
     *
     */
    private class FreqIterator implements Iterator<E>{

        @Override
        public boolean hasNext() {
            Node curr = first;
            if(curr != null) {
                return true;
            }
            return false;
        }

        @Override
        public E next() {
            Node curr = first;
            if(hasNext() == false) {
                return null;
            }
            E item = curr.key;
            curr = curr.next;
            return item;
        }
    }
}

编辑

这是我将同一个词两次插入链表的测试。我希望结果为 2,但实际上我得到的是 1。我假设它与我的 insert 方法有关。

    @Test
    public void testInsert() {
        Frequency<String> freq = new Frequency<>();
        freq.insert("dog");
        freq.insert("dog");
        String answer = freq.getWords(1);
        assertEquals("(dog,2)", answer);
    }

它不起作用,因为在您的 getCount(E key) 中,您在遍历它时从不检查传入的 key 是否等于当前节点。

对方法做这个小改动:

public int getCount(E key) {
    if (key == null) {
        return -1;
    }
    int N = 0;
    Node curr = first;
    while (curr != null) {
        if (curr.key.equals(key)) {  // change made here 
            N++;
        }
        curr = curr.next;
    }
    return N;
}

根据编辑,insert(E key) 中存在逻辑错误。您需要迭代以找到 list 中的最后一个元素,然后将下一个引用分配给 new 创建的节点。

public boolean insert(E key) {
    if (first == null || first.key != key) {
        first = new Node(key, first);
    } else {
        Node curr = first;
        while (curr.next != null) {  // iterate till the end of the list
            curr = curr.next; 
        }
        curr.next = new Node(key);   // point last node's next ref to new node
        N++;
    }
    return true;
}