Azure Tables async query filter with a 'take' is not giving top #n entities

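I want the top 100 (most recent) entities, out of several thousand, that match my TableQuery filter. I have tried two approaches:

  1. The first attempt used an index counter in a foreach loop and broke out once it reached 100. This gave me a strange, random subset of the data with most of it missing, and not 100 entities either; more like a couple of hundred, and not an even number.

  2. The second attempt is pasted below. It basically ignores the continuation token and sets .Take to 100. This gives me a number of entities matching the Take value, but many entities are missing.

Each attempt returns different results. I think I know why, but I don't know how to fix it to get back what I need. I realize that filtering the query on Timestamp is not great for performance (it isn't indexed... right?). So should I populate another field with a date/time value to filter on?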

    public async Task<List<ActivityModel>> GetActivitiesAsync(string DomainName, string NodeId, string ComputerName)
    {
        List<ActivityModel> activities = new List<ActivityModel>();
        CloudTable cloudTable = TableConnection("NodeEvents");
        string domainFilter = TableQuery.GenerateFilterCondition("DomainName", QueryComparisons.Equal, DomainName);
        string nodeIdFilter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, NodeId);
        string computerNameFilter = TableQuery.GenerateFilterCondition("ComputerName", QueryComparisons.Equal, ComputerName);
        string filter1 = TableQuery.CombineFilters(domainFilter, TableOperators.And, nodeIdFilter);
        string filter2 = TableQuery.CombineFilters(filter1, TableOperators.And, computerNameFilter);
        TableContinuationToken continuationToken = null;

        var result = await cloudTable.ExecuteQuerySegmentedAsync(new TableQuery<ActivityModel>().Where(filter2).Take(100), continuationToken);

        if (result.Results != null)
        {
            foreach (ActivityModel entity in result.Results)
            {
                activities.Add(entity);
            }
        }

        return activities;
    }

You can refer to the Log Tail Pattern in the documentation:

Retrieve the n entities most recently added to a partition by using a RowKey value that sorts in reverse date and time order.

Context and problem

A common requirement is to be able to retrieve the most recently created entities, for example the ten most recent expense claims submitted by an employee. Table queries support a $top query operation to return the first n entities from a set: there is no equivalent query operation to return the last n entities in a set.

Solution

Store the entities using a RowKey that naturally sorts in reverse date/time order, so that the most recent entry is always the first one in the table.

For example, to be able to retrieve the ten most recent expense claims submitted by an employee, you can use a reverse tick value derived from the current date/time. The following C# code sample shows one way to create a suitable "inverted ticks" value for a RowKey that sorts from the most recent to the oldest:

    string invertedTicks = string.Format("{0:D19}", DateTime.MaxValue.Ticks - DateTime.UtcNow.Ticks);

You can get back to the date time value using the following code:

    DateTime dt = new DateTime(DateTime.MaxValue.Ticks - Int64.Parse(invertedTicks));
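
To see the two conversions working together, here is a minimal sketch that wraps them in a helper and checks that the newer of two timestamps sorts first. The class and method names (InvertedTicksKey, FromUtc, ToUtc) are made up for illustration and are not part of any SDK; the sketch also assumes RowKey values are compared as plain ordinal strings.

```csharp
using System;
using System.Collections.Generic;

public static class InvertedTicksKey
{
    // Hypothetical helper: builds a zero-padded key that sorts newest-first
    // under ordinal string comparison.
    public static string FromUtc(DateTime utc) =>
        string.Format("{0:D19}", DateTime.MaxValue.Ticks - utc.Ticks);

    // Hypothetical helper: recovers the UTC timestamp from a key made by FromUtc.
    public static DateTime ToUtc(string rowKey) =>
        new DateTime(DateTime.MaxValue.Ticks - long.Parse(rowKey), DateTimeKind.Utc);
}

public static class InvertedTicksDemo
{
    public static void Main()
    {
        var older = new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var newer = new DateTime(2021, 1, 1, 0, 0, 0, DateTimeKind.Utc);

        // Insert in "wrong" order on purpose, then sort the keys as strings.
        var keys = new List<string>
        {
            InvertedTicksKey.FromUtc(older),
            InvertedTicksKey.FromUtc(newer),
        };
        keys.Sort(StringComparer.Ordinal);

        // The first key after sorting should map back to the newer timestamp.
        Console.WriteLine(InvertedTicksKey.ToUtc(keys[0]) == newer);
    }
}
```

In the question's scenario, the value from FromUtc would be assigned to each entity's RowKey at insert time, so no filter on Timestamp is needed.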

The table query looks like this:

    https://myaccount.table.core.windows.net/EmployeeExpense(PartitionKey='empid')?$top=10
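
When the same kind of query is issued through the SDK with extra non-key filters, as in the question, a single segment may, as far as I know, come back with fewer entities than requested plus a continuation token. The sketch below shows one way to keep following tokens until n entities are collected. It is deliberately SDK-independent: TakeFirstAsync and the fetchSegment delegate are hypothetical names, and the demo uses an in-memory two-page source instead of a real table.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SegmentedQuery
{
    // fetchSegment (hypothetical): given a continuation token (null on the
    // first call), returns one page of results and the token for the next
    // page, or a null token when there are no more pages.
    public static async Task<List<T>> TakeFirstAsync<T, TToken>(
        int count,
        Func<TToken, Task<(IReadOnlyList<T> Results, TToken Next)>> fetchSegment)
        where TToken : class
    {
        var collected = new List<T>();
        TToken token = null;
        do
        {
            var (results, next) = await fetchSegment(token);
            foreach (var item in results)
            {
                if (collected.Count == count) break; // never exceed the requested count
                collected.Add(item);
            }
            token = next;
        } while (token != null && collected.Count < count);
        return collected;
    }
}

public static class SegmentedQueryDemo
{
    public static async Task Main()
    {
        // In-memory stand-in for a table: two short pages linked by a string token.
        Func<string, Task<(IReadOnlyList<int> Results, string Next)>> fetch = token =>
            Task.FromResult(token == null
                ? ((IReadOnlyList<int>)new[] { 1, 2 }, "page2")
                : ((IReadOnlyList<int>)new[] { 3, 4 }, (string)null));

        List<int> firstThree = await SegmentedQuery.TakeFirstAsync<int, string>(3, fetch);
        Console.WriteLine(string.Join(",", firstThree));
    }
}
```

With the SDK used in the question, the delegate would presumably wrap cloudTable.ExecuteQuerySegmentedAsync(query, token) and return the segment's Results and ContinuationToken. Because entities come back ordered by PartitionKey and then RowKey, the first n collected this way are the most recent only if the RowKey is an inverted-ticks value.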

Issues and considerations

Consider the following points when deciding how to implement this pattern:

  • You must pad the reverse tick value with leading zeroes to ensure the string value sorts as expected.

  • You must be aware of the scalability targets at the level of a partition. Be careful not to create hot spot partitions.
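
The padding point in the first bullet is easy to check in isolation. This is a small sketch with toy numbers rather than real tick values; PaddingDemo and SortsBefore are illustrative names, and ordinal string comparison is assumed to match how RowKey values are ordered.

```csharp
using System;

public static class PaddingDemo
{
    // True when a sorts strictly before b under ordinal string comparison.
    public static bool SortsBefore(string a, string b) =>
        string.CompareOrdinal(a, b) < 0;

    public static void Main()
    {
        long smaller = 999;   // toy stand-in for a newer entity's inverted ticks
        long larger = 1000;   // toy stand-in for an older entity's inverted ticks

        // Unpadded: "1000" sorts before "999" because '1' < '9'.
        bool unpadded = SortsBefore(smaller.ToString(), larger.ToString());

        // Padded to 19 digits: string order matches numeric order.
        bool padded = SortsBefore(smaller.ToString("D19"), larger.ToString("D19"));

        Console.WriteLine($"unpadded: {unpadded}, padded: {padded}");
        // prints "unpadded: False, padded: True"
    }
}
```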