Reading in all docs (doc id only if possible) from Solr without searching

Question

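I know that Solr is meant for searching.

However, I am doing some benchmarking, and I was wondering whether there is a way to retrieve the document ID of every indexed document.

The best option would be to retrieve them without searching (if there is a way).

I guess the alternative would be to query for all documents but ask for only the document ID.

I will be using SolrJ, so SolrJ operations would be useful.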

Answer 1

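Use the /export endpoint (see "Exporting Result Sets" in the Solr Reference Guide).

It supports the same fl parameter as a regular search (although simply searching for *:* may perform very similarly when you're using SolrJ).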
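To make the fl point concrete, here is a minimal sketch of building a /export request that asks only for the document ID of every document. It uses plain Java and no SolrJ. The base URL, the collection name `mycollection`, the helper name `buildExportUrl`, and the assumption that the unique key field is called `id` are all illustrative and not taken from the question. As far as I know, /export requires both sort and fl, and the fields involved need docValues enabled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ExportUrlBuilder {

    // Builds a /export URL that matches all documents and returns one field.
    // solrBaseUrl, collection and idField are hypothetical example inputs.
    public static String buildExportUrl(String solrBaseUrl, String collection, String idField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");               // match every document
        params.put("sort", idField + " asc"); // sort on the ID field
        params.put("fl", idField);            // return only the ID field

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return solrBaseUrl + "/" + collection + "/export?" + query;
    }

    // URL-encodes a single query-string component as UTF-8.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildExportUrl("http://localhost:8983/solr", "mycollection", "id"));
    }
}
```

Requesting the resulting URL with any HTTP client (or curl) is a quick way to check what the handler returns before wiring it into SolrJ.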

In SolrJ, you have to use the CloudSolrStream class to stream the results properly (as opposed to the regular behavior when searching for *:*).

From Joel Bernstein's example when introducing the feature:

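import java.io.IOException;
import java.util.*;
// Package as in the original example; newer Solr releases keep
// CloudSolrStream in org.apache.solr.client.solrj.io.stream.
import org.apache.solr.client.solrj.io.*;

public class StreamingClient {

   public static void main(String[] args) throws IOException {
      String zkHost = args[0];      // ZooKeeper connection string
      String collection = args[1];  // collection to export from

      Map<String, String> props = new HashMap<>();
      props.put("q", "*:*");                    // match every document
      props.put("qt", "/export");               // use the /export handler
      props.put("sort", "fieldA asc");          // sort order for the export
      props.put("fl", "fieldA,fieldB,fieldC");  // fields to return

      CloudSolrStream cstream = new CloudSolrStream(zkHost, 
                                                    collection, 
                                                    props);
      try {

        cstream.open();
        while(true) {

          // Read tuples one at a time until the end-of-stream marker.
          Tuple tuple = cstream.read();
          if(tuple.EOF) {
             break;
          }

          String fieldA = tuple.getString("fieldA");
          String fieldB = tuple.getString("fieldB");
          String fieldC = tuple.getString("fieldC");
          System.out.println(fieldA + ", " + fieldB + ", " + fieldC);
        }

      } finally {
        // Always release the stream, even if reading fails.
        cstream.close();
      }
   }
}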


solr

solrj