How to have sorted results when using Data Movement API?
I am using MarkLogic 9 and the Data Movement API to export search results. I built a query:
StructuredQueryDefinition query = ...
// Sort clause that goes into the search options.
String sortOptionsXml = "<sort-order direction=\"" + sortDirection + "\">" + "<path-index>" + sortIndex + "</path-index>" + "</sort-order>";
// Combine the structured query and its options into one <search> payload.
String queryAsXml = "<search xmlns=\"http://marklogic.com/appservices/search\">" + query.serialize()
+ "<options><search-option>filtered</search-option>\n" + sortOptionsXml + "</options>" + "</search>";
RawCombinedQueryDefinition combinedQueryDefinition = queryManager
.newRawCombinedQueryDefinition(new StringHandle().with(queryAsXml).withFormat(Format.XML));
I then pass this query to a QueryBatcher, created like this:
DataMovementManager dataMovementManager = databaseClient.newDataMovementManager();
QueryBatcher batcher = dataMovementManager.newQueryBatcher(combinedQueryDefinition);
and start the batcher:
batcher.withBatchSize(500)
.withThreadCount(8)
.onUrisReady(
new QueryBatchListener() {
@Override
public void processEvent(QueryBatch queryBatch) {
LOGGER.info(String.join(",", queryBatch.getItems()));
}
}
)
.onQueryFailure(Exception::printStackTrace);
JobTicket jobTicket = dataMovementManager.startJob(batcher);
batcher.awaitCompletion();
dataMovementManager.stopJob(jobTicket);
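However, when I look at the returned items, they are not sorted according to the sort options I supplied, not even within a single batch. Is it possible to get the URIs sorted?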
Answering my own question: it is not possible. I opened a ticket on the MarkLogic GitHub project (https://github.com/marklogic/java-client-api/issues/916), and the final conclusion was (https://github.com/marklogic/java-client-api/issues/916#issuecomment-385773027):
DMSDK splits the huge tasks into small chunks by directing the query
to each forest and getting the URIs from each forest in batches. Even
if we pass in the sort options, we would get an ordering per forest
and not on the whole. Since we can't get a global ordering, we chose
to drop the options and it doesn't make sense to have an ordering at
the forest level.
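If an ordering is still needed, one possible workaround is to sort on the client after the job finishes. The following is only a minimal sketch under stated assumptions, not something the ticket recommends: `SortedUriCollector` is a hypothetical helper of my own, it orders by the URI string rather than by the path index value, and it keeps every URI in memory, so it only suits result sets that fit in the heap. In the listener you would call `collector.addBatch(queryBatch.getItems())` and read `sortedUris()` after `awaitCompletion()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper: gathers URIs from batches that arrive on several
// threads in arbitrary order, then sorts them once at the end.
final class SortedUriCollector {

    // Thread-safe queue, because batch listeners run on multiple threads.
    private final ConcurrentLinkedQueue<String> uris = new ConcurrentLinkedQueue<>();

    // Call this from the batch listener with the items of one batch.
    public void addBatch(String[] items) {
        uris.addAll(Arrays.asList(items));
    }

    // Call this after the job has completed; sorts by URI string only.
    public List<String> sortedUris() {
        List<String> copy = new ArrayList<>(uris);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    public static void main(String[] args) throws InterruptedException {
        SortedUriCollector collector = new SortedUriCollector();

        // Two threads stand in for per-forest batches arriving concurrently.
        Thread forestA = new Thread(() ->
                collector.addBatch(new String[] {"/docs/b.xml", "/docs/d.xml"}));
        Thread forestB = new Thread(() ->
                collector.addBatch(new String[] {"/docs/c.xml", "/docs/a.xml"}));
        forestA.start();
        forestB.start();
        forestA.join();
        forestB.join();

        System.out.println(collector.sortedUris());
    }
}
```

Sorting by the indexed value instead of the URI would require fetching that value for each document and supplying a matching comparator, which this sketch leaves out.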