Suggested table scheme for saving visitors by week days and hours statistics

Question

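I want to save in the database the number of times each client enters the website, broken down by weekday and hour. That means that for each client I will have 24 * 7 values that will be constantly updating to reflect the peak hour with the most visits for the client. I saw the obvious suggestion in "Database structure for holding statistics by day, week, month, year" to create a new line for each entrance and then use the data, but it won't work: we have millions of lines, and I need the peak hours to be available for each client. Also, creating 168 columns for each client looks a little extreme. Any suggestions?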

Answer 1

A reality check may be in order.

I will have 24 * 7 values that will be constantly updating to reflect the peak hour with the most visits for the client.

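That assumes you store 24 entries per day, including entries for hours in which the client did not show up. Unless your clients are companies, people sleep.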

to create a new line for each entrance and then use the data, it won't work, we have millions of lines

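So what? Millions of rows in a table were not a problem 30 years ago. They are definitely not a problem today. Yes, it may not run on a 20-year-old desktop, but on a decent mid-range server with half a terabyte of memory and a proper disk layout you can store hundreds of gigabytes of data and process them quickly.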

Also, creating 168 columns for each client looks a little extreme.

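Extreme, and also a bad idea. Look, the problem is that while you (and your application) can work with it, you will quickly find that if you try to load the data into, say, a reporting tool and find all rows for a specific hour, you are in a world of pain. One row per entry is what the relational data model suggests. Anything else is either SMART (for a very limited use case), or it shows very quickly that tools follow relational theory, and you end up living in a world of painful code.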

Not that it does not happen. I have seen people create a new table for every invoice they write (so that the invoice details table does not get too long)...

With one entry per entrance you can group by day of week and aggregate by hour. With 168 fields per row, that is not easy.
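As a rough sketch of that grouping, assuming a hypothetical raw log with one row per entrance (the table variable `@visits` and its columns are illustrative names, not from the question), the 24 * 7 profile per client might be derived like this:

```sql
-- Hypothetical raw log: one row per entrance (names are illustrative only).
declare @visits table(
    client_id int          not null,
    visit_dt  datetime2(0) not null);

insert @visits(client_id, visit_dt) values
(1, '2020-06-01T09:15:00'),  -- Monday
(1, '2020-06-08T09:40:00'),  -- Monday
(1, '2020-06-02T14:05:00'),  -- Tuesday
(2, '2020-06-03T22:30:00');  -- Wednesday

-- At most 168 rows per client; weekday/hour buckets without visits produce no row.
select
    client_id,
    datename(weekday, visit_dt) as visit_weekday,
    datepart(hour, visit_dt)    as visit_hour,
    count(*)                    as visits
from @visits
group by client_id, datename(weekday, visit_dt), datepart(hour, visit_dt)
order by client_id, visits desc;
```

Note that `datename(weekday, ...)` returns names in the session language, so a reporting layer may prefer a numeric weekday instead.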

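Generally: this is 2020. Mid-range desktops have 64 GB of memory. Mid-range servers have 1 TB or 2 TB. SSD storage lets databases handle hundreds of GB of data easily and quickly, which was really painful in the hard disk era. Millions of rows were a joke 26 years ago in my first commercial-grade database project. Today, billions are small change.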

Answer 2

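Here is a table structure (similar to implementations I have seen) that splits the summary statistics into week, date (or day), and hour tables, linked by primary key to foreign key relationships. Rather than storing the different hours of the day as columns (which is not recommended in an RDBMS), it stores them as rows. Millions of visits per day (or per hour) can be handled with appropriate indexing and partitioning as needed.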

Something like this

DDL

create table dbo.visitor_events(
    v_id       int identity(1,1) primary key not null,
    client_id   int not null references clients(client_id),
    visit_dt    datetime2(7) not null default sysutcdatetime());

create table dbo.visitor_event_weeks(
    vsw_id       int identity(1,1) primary key not null,
    client_id   int not null references clients(client_id),
    visit_wk    int not null,
    visits      int not null);

create table dbo.visitor_event_dates(
    vsd_id      int identity(1,1) primary key not null,
    client_id   int not null references clients(client_id),
    vsw_id      int not null references visitor_event_weeks(vsw_id),
    visit_wk    int not null,
    visit_dt    datetime not null,
    visits      int not null);

create table dbo.visitor_event_hours(
    vsh_id       int identity(1,1) primary key not null,
    client_id   int not null references clients(client_id),
    vsd_id      int not null references visitor_event_dates(vsd_id),
    visit_hr    datetime not null,
    visits      int not null);
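
The indexing mentioned above is not spelled out in the answer. One possible sketch, assuming there should be exactly one summary row per client and period and that visit_hr holds the visit time truncated to the hour (the index names are made up for illustration):

```sql
-- Assumed, not part of the original answer: enforce one summary row per key.
-- The same indexes serve the lookups done by the insert/update statements.
create unique index ux_visitor_event_weeks
    on dbo.visitor_event_weeks(client_id, visit_wk);

create unique index ux_visitor_event_dates
    on dbo.visitor_event_dates(client_id, visit_dt);

create unique index ux_visitor_event_hours
    on dbo.visitor_event_hours(client_id, visit_hr);

-- Non-unique: a client can have many raw events at the same instant.
create index ix_visitor_events_client_dt
    on dbo.visitor_events(client_id, visit_dt);
```

Whether partitioning is also worthwhile depends on data volume and retention, which the question does not specify.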

Variables

Variables and insert/update statements (these will vary depending on what works best for the OP)

declare
  @client_id          int=123,
  @visit_dt           datetime2(7)=sysutcdatetime();
declare
  @v_id               int,
  /* week number and date of the visit ('19000101' is the same base date as 0) */
  @visit_wk           int=datediff(wk, '19000101', @visit_dt),
  @visit_date         date=cast(@visit_dt as date);
declare
  /* visit time truncated to the hour */
  @visit_hr           datetime=dateadd(hour, datepart(hour, @visit_dt), cast(@visit_date as datetime));
declare
  @vsw                table(vsw_id      int unique not null);
declare
  @vsd                table(vsd_id      int unique not null);

/* Insert a visit */
insert dbo.visitor_events(client_id, visit_dt) values
(@client_id, @visit_dt);
select @v_id=scope_identity();

/* Insert/update a visit week: insert only when the update touched no row */
update dbo.visitor_event_weeks
set visits=visits+1
output inserted.vsw_id into @vsw
where client_id=@client_id
      and visit_wk=@visit_wk;
if @@rowcount=0
    begin
        insert dbo.visitor_event_weeks(client_id, visit_wk, visits)
        output inserted.vsw_id into @vsw
        values (@client_id, @visit_wk, 1);
    end

/* Insert/update a visit date */
update dbo.visitor_event_dates
set visits=visits+1
output inserted.vsd_id into @vsd
where client_id=@client_id
      and vsw_id=(select top 1 vsw_id from @vsw)
      and visit_dt=@visit_date;
if @@rowcount=0
    begin
        /* visit_wk is not null in the DDL, so it must be supplied here */
        insert dbo.visitor_event_dates(client_id, vsw_id, visit_wk, visit_dt, visits)
        output inserted.vsd_id into @vsd
        values (@client_id, (select top 1 vsw_id from @vsw), @visit_wk, @visit_date, 1);
    end

/* Insert/update a visit date hour */
update dbo.visitor_event_hours
set visits=visits+1
where client_id=@client_id
      and vsd_id=(select top 1 vsd_id from @vsd)
      and visit_hr=@visit_hr;
if @@rowcount=0
    begin
        insert dbo.visitor_event_hours(client_id, vsd_id, visit_hr, visits)
        values (@client_id, (select top 1 vsd_id from @vsd), @visit_hr, 1);
    end
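
Reading the peak hour back out is not covered in the answer. A minimal sketch, assuming visit_hr stores the visit time truncated to the hour (in UTC, since visit_dt defaults to sysutcdatetime()), could fold the hourly counters into weekday/hour buckets and keep the busiest one per client:

```sql
-- Sum the hourly counters into weekday/hour buckets per client.
with buckets as (
    select
        client_id,
        datename(weekday, visit_hr) as visit_weekday,
        datepart(hour, visit_hr)    as visit_hour,
        sum(visits)                 as visits
    from dbo.visitor_event_hours
    group by client_id, datename(weekday, visit_hr), datepart(hour, visit_hr)
),
-- Rank the buckets of each client, busiest first.
ranked as (
    select
        client_id, visit_weekday, visit_hour, visits,
        row_number() over (partition by client_id order by visits desc) as rn
    from buckets
)
select client_id, visit_weekday, visit_hour, visits
from ranked
where rn = 1;
```

When two buckets tie, row_number() picks one of them arbitrarily; rank() would return all tied buckets instead.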
