Index by category with sorting by column in R sqldf package

How can I add an index by category, sorted by a column, in R using the sqldf package? I am looking for the equivalent of this SQL:

ROW_NUMBER() over(partition by [Category] order by [Date] desc)

Suppose we have a table:

+----------+-------+------------+
| Category | Value |    Date    |
+----------+-------+------------+
| apples   |     3 | 2018-07-01 |
| apples   |     2 | 2018-07-02 |
| apples   |     1 | 2018-07-03 |
| bananas  |     9 | 2018-07-01 |
| bananas  |     8 | 2018-07-02 |
| bananas  |     7 | 2018-07-03 |
+----------+-------+------------+

The desired result is:

+----------+-------+------------+-------------------+
| Category | Value |    Date    | Index by category |
+----------+-------+------------+-------------------+
| apples   |     3 | 2018-07-01 |                 3 |
| apples   |     2 | 2018-07-02 |                 2 |
| apples   |     1 | 2018-07-03 |                 1 |
| bananas  |     9 | 2018-07-01 |                 3 |
| bananas  |     8 | 2018-07-02 |                 2 |
| bananas  |     7 | 2018-07-03 |                 1 |
+----------+-------+------------+-------------------+

Thanks for the hints in the comments on how to do this with many packages other than sqldf: Numbering rows within groups in a data frame

1) PostgreSQL This can be done using the PostgreSQL backend to sqldf:

library(RPostgreSQL)
library(sqldf)

sqldf('select *, 
       ROW_NUMBER() over (partition by "Category" order by "Date" desc) as seq
       from "DF"
       order by "Category", "Date" ')

giving:

  Category Value       Date seq
1   apples     3 2018-07-01   3
2   apples     2 2018-07-02   2
3   apples     1 2018-07-03   1
4  bananas     9 2018-07-01   3
5  bananas     8 2018-07-02   2
6  bananas     7 2018-07-03   1
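
The PostgreSQL backend needs a running PostgreSQL server that sqldf can connect to. Below is a minimal sketch of the connection setup, assuming the `sqldf.RPostgreSQL.*` options described in the sqldf documentation. The user, password, database name, host and port shown are placeholders, not values from the question, so check them against your own server and your installed sqldf version.

```r
# Placeholder connection settings (assumptions); replace with your server's values.
options(sqldf.RPostgreSQL.user = "postgres",
        sqldf.RPostgreSQL.password = "postgres",
        sqldf.RPostgreSQL.dbname = "test",
        sqldf.RPostgreSQL.host = "localhost",
        sqldf.RPostgreSQL.port = 5432)

# Loading RPostgreSQL before calling sqldf() makes sqldf pick the PostgreSQL backend.
library(RPostgreSQL)
library(sqldf)
```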

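2) SQLite To use the SQLite backend (the default backend) we need to revise the SQL statement appropriately. Be sure that RPostgreSQL is not loaded before doing this. We have assumed that the data is already sorted by Date within each Category, as it is in the data shown in the question; if that were not the case, it would be easy enough to extend the SQL to sort it first.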

library(sqldf)

sqldf("select a.*, count(*) seq 
       from DF a left join DF b on a.Category = b.Category and b.rowid >= a.rowid 
       group by a.rowid 
       order by a.Category, a.Date")
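
If the rows are not already in date order within each category, one possible variant is to compare dates in the join rather than rowids. The following is a sketch only: `DF2` is a hypothetical shuffled copy of the input, and it assumes that each (Category, Date) pair is unique and that Date is stored as an ISO-formatted string (or a Date), so that `>=` orders it correctly. Passing `drv = "SQLite"` asks sqldf for the SQLite backend explicitly, which may help if RPostgreSQL happens to be loaded.

```r
library(sqldf)

# Same data as in the question, built inline so this block runs on its own
DF <- data.frame(
  Category = rep(c("apples", "bananas"), each = 3),
  Value = c(3, 2, 1, 9, 8, 7),
  Date = rep(c("2018-07-01", "2018-07-02", "2018-07-03"), times = 2),
  stringsAsFactors = FALSE
)

# Hypothetical shuffled copy, to show the result no longer depends on row order
DF2 <- DF[c(2, 6, 1, 4, 3, 5), ]

# For each row, count the rows in the same category with the same or a later date;
# the latest date in each category therefore gets seq = 1
res <- sqldf("select a.*, count(*) seq
       from DF2 a left join DF2 b
         on a.Category = b.Category and b.Date >= a.Date
       group by a.Category, a.Date
       order by a.Category, a.Date", drv = "SQLite")
print(res)
```

With tied dates inside a category this would give the tied rows the same number, which differs from ROW_NUMBER(), so an extra tie-breaking column would be needed in that case.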

Note

The input DF in reproducible form is:

Lines <- "
Category  Value  Date    
apples        3  2018-07-01 
apples        2  2018-07-02 
apples        1  2018-07-03 
bananas       9  2018-07-01 
bananas       8  2018-07-02 
bananas       7  2018-07-03 
"
DF <- read.table(text = Lines, header = TRUE, as.is = TRUE)
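
SQLite itself gained window functions in version 3.25.0, so on a sufficiently recent RSQLite the ROW_NUMBER() query from (1) may also run on the default backend without PostgreSQL. The sketch below assumes this and checks the bundled SQLite version first; if the version is older, it simply skips the query. The data frame is rebuilt inline so the block is self-contained.

```r
library(sqldf)

# Same data as the reproducible input, built inline
DF <- data.frame(
  Category = rep(c("apples", "bananas"), each = 3),
  Value = c(3, 2, 1, 9, 8, 7),
  Date = rep(c("2018-07-01", "2018-07-02", "2018-07-03"), times = 2),
  stringsAsFactors = FALSE
)

# Ask the SQLite library bundled with RSQLite for its version
ver <- sqldf("select sqlite_version() as version", drv = "SQLite")$version

# Window functions require SQLite 3.25.0 or later
if (numeric_version(ver) >= "3.25.0") {
  res <- sqldf("select *,
         row_number() over (partition by Category order by Date desc) as seq
         from DF
         order by Category, Date", drv = "SQLite")
  print(res)
} else {
  message("SQLite ", ver, " has no window functions; use the join-based query instead")
}
```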