PagingState for Statement in CQL

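I am trying to understand how PagingState works with a Statement in Cassandra. I tried an example that inserts a few thousand records into the database and then reads the same data back, with the fetch size set to 10 and using the paging state. This works perfectly well. Here is my sample JUnit code: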

@Before
public void setup() {
    cassandraTemplate.executeQuery("create table if not exists pagesample(a int, b int, c int, primary key(a,b))");
    String insertQuery = "insert into pagesample(a,b,c) values(?,?,?)";
    PreparedStatement insertStmt = cassandraTemplate.getConnection().prepareStatement(insertQuery);
    for(int i=0; i < 5; i++){
        for(int j=100; j<1000; j++){
            cassandraTemplate.executeQuery(insertStmt, new Object[]{i, j, RandomUtils.nextInt()});
        }
    }
}

@Test
public void testPagination() {
    String selectQuery = "select * from pagesample where a=?";
    String pagingStateStr = null;
    for(int run=0; run<90; run++){
        ResultSet resultSet = selectRows(selectQuery, 10, pagingStateStr, 1);
        int fetchedCount = resultSet.getAvailableWithoutFetching();
        System.out.println(run+". Fetched size: "+fetchedCount);
        for(Row row : resultSet){
            System.out.print(row.getInt("b")+", ");
            if(--fetchedCount == 0){
                break;
            }
        }
        System.out.println();

        PagingState pagingState = resultSet.getExecutionInfo().getPagingState();
        if (pagingState == null) {
            break; // a null paging state means there are no more pages
        }
        pagingStateStr = pagingState.toString();
    }
}

public ResultSet selectRows(String cql, int fetchSize, String pagingState, Object... bindings){
    SimpleStatement simpleStatement = new SimpleStatement(cql, bindings);
    simpleStatement.setFetchSize(fetchSize);
    if(StringUtils.isNotEmpty(pagingState)){
        // resume from where the previous page ended
        simpleStatement.setPagingState(PagingState.fromString(pagingState));
    }
    return getSession().execute(simpleStatement);
}

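When I run this program, every iteration in testPagination prints exactly 10 records. However, the documentation says that the number of rows returned is not guaranteed to match the fetch size exactly.

I really cannot understand why Cassandra would return a number of rows that differs from the specified fetch size. Does this happen when the query has no where clause? Does it return the exact number of records when the query is restricted to a partition key? Please clarify.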

From the CQL protocol specification:

Clients should also not assert that no result will have more than result_page_size results. While the current implementation always respect the exact value of result_page_size, we reserve ourselves the right to return slightly smaller or bigger pages in the future for performance reasons

So it is best to always rely on getAvailableWithoutFetching rather than on the page size, in case Cassandra changes its implementation in the future.
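
To illustrate that advice, here is a minimal, self-contained sketch with no driver dependency. Page, FakeServer and PagingSketch are hypothetical names invented for this example, and FakeServer deliberately returns pages slightly larger or smaller than requested, which the spec allows in principle. In real driver code, the count would come from resultSet.getAvailableWithoutFetching() and the token from resultSet.getExecutionInfo().getPagingState(). The point is only that the client loop consumes whatever the page actually contains and stops when the paging state is null.

```java
import java.util.ArrayList;
import java.util.List;

/** One page of rows plus an opaque token for the next page (null when exhausted). */
final class Page {
    final List<Integer> rows;
    final String nextPagingState;

    Page(List<Integer> rows, String nextPagingState) {
        this.rows = rows;
        this.nextPagingState = nextPagingState;
    }
}

/** Hypothetical stand-in for a server that treats the fetch size only as a hint. */
final class FakeServer {
    private final List<Integer> data;

    FakeServer(List<Integer> data) {
        this.data = data;
    }

    Page fetch(int fetchSize, String pagingState) {
        // The token is just the start offset here; a real paging state is opaque.
        int start = (pagingState == null) ? 0 : Integer.parseInt(pagingState);
        // Alternate between slightly bigger and slightly smaller pages than requested.
        int size = (start % 2 == 0) ? fetchSize + 1 : Math.max(1, fetchSize - 1);
        int end = Math.min(data.size(), start + size);
        String next = (end < data.size()) ? Integer.toString(end) : null;
        return new Page(new ArrayList<>(data.subList(start, end)), next);
    }
}

final class PagingSketch {
    /** Reads all rows page by page without assuming a page holds exactly fetchSize rows. */
    static List<Integer> readAll(FakeServer server, int fetchSize) {
        List<Integer> all = new ArrayList<>();
        String pagingState = null;
        do {
            Page page = server.fetch(fetchSize, pagingState);
            // Plays the role of getAvailableWithoutFetching(): trust the page, not fetchSize.
            int available = page.rows.size();
            for (int i = 0; i < available; i++) {
                all.add(page.rows.get(i));
            }
            pagingState = page.nextPagingState;
        } while (pagingState != null); // a null token means no more pages
        return all;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int b = 100; b < 1000; b++) {
            data.add(b);
        }
        List<Integer> read = readAll(new FakeServer(data), 10);
        System.out.println("Rows read: " + read.size());
    }
}
```

A loop written this way keeps working whether pages come back with exactly 10 rows, as they do today, or with a few more or fewer, because it never uses the fetch size as a row counter.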